  1. Jun 2024
    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This study examines the spatial and temporal patterns of occurrence and the interspecific associations within a terrestrial mammalian community along human disturbance gradients. They conclude that human activity leads to a higher incidence of positive associations.

      Strengths:

      The theoretical framework of the study is brilliantly introduced. Solid data and sound methodology. This study is based on an extensive series of camera trap data. Good review of the literature on this topic.

      Weaknesses:

      The authors use the terms associations and interactions interchangeably.

      This is not the case. In fact, we state specifically that "... interspecific associations should not be directly interpreted as a signal of biotic interactions between pairs of species…" However, co-occurrence can be an important predictor of likely interactions, such as competition and predation. We stand by our original text.

      It is not clear what the authors mean by "associations". A brief clarification would be helpful.

      Our specific definition of what is meant here by spatial association can be found in the Methods section. To clarify, the index of association is calculated from the covariance between the two species' residuals (epsilon) that remain after accounting for each species' response to the known environmental covariates. These covariances are modelled so that they can vary with the level of human disturbance, measured as human presence and human modification. After normalization, the final index of association is a correlation value that varies between -1 (complete disassociation) and +1 (complete positive association).
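
      The normalization step described above can be sketched as follows. This is a minimal illustration only, not the authors' model code: the function name `association_index` and the numbers are hypothetical, and in the actual JSDM the residual covariances are estimated within the model and vary with human disturbance.

```python
import math


def association_index(cov_ij: float, var_i: float, var_j: float) -> float:
    """Normalize a residual covariance between two species into a correlation.

    cov_ij is the covariance of the residuals of species i and j;
    var_i and var_j are the residual variances of each species.
    The result lies between -1 and +1 for a valid covariance matrix.
    """
    if var_i <= 0 or var_j <= 0:
        raise ValueError("residual variances must be positive")
    return cov_ij / math.sqrt(var_i * var_j)


# Hypothetical values for one species pair at one disturbance level.
index = association_index(cov_ij=1.0, var_i=4.0, var_j=1.0)
print(index)  # 1.0 / sqrt(4.0 * 1.0) = 0.5
```

      Evaluating the same function with covariances estimated at low, moderate and high disturbance would give one index per disturbance level for each species pair.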

      Also, the authors do not delve into the different types of association found in the study. A more ecological perspective explaining why certain species tend to exhibit negative associations and why others show the opposite pattern (and thus, can be used as indicator species) is missing.

      Suggesting the ecological underpinnings of the associations observed here would mainly be speculation at this point, but the associations demonstrated in this analysis do suggest promising areas for the more detailed research suggested.

      Also, the authors do not distinguish between significant (true) non-random associations and random associations. In my opinion, associations are those in which two species co-occur more or less than expected by chance. This is not well addressed in the present version of the manuscript.

      Results were considered to be non-random if correlation coefficients (for spatial association) or overlap (for temporal association) fell outside of 95% Confidence Intervals. This is now stated clearly in the Methods section.  In Figure 3—figure supplement 1-3 and Figure 4—figure supplement 1-3, p<0.01 levels are also presented.

      The obtained results support the conclusions of the study.

      Anthropogenic pressures can shape species associations by increasing spatial and temporal co-occurrence, but above a certain threshold, the positive influence of human activity in terms of species associations could be reverted. This study can stimulate further work in this direction.

      Reviewer #2 (Public Review):

      Summary:

      This study analyses camera trapping information on the occurrence of forest mammals along a gradient of human modification of the environment. The key hypotheses are that human disturbance squeezes wildlife into a smaller area or their activity into only part of the day, leading to increased co-occurrence under modification. The method used is joint species distribution modelling (JSDM).

      Strengths:

      The data source seems to be very nice, although since very little information is presented, this is hard to be sure of. Also, the JSDM approach is, in principle, a nice way of simultaneously analysing the data.

      Weaknesses:

      The manuscript suffers from a mismatch of hypotheses and methods at two different levels.

      (1) At the lower level, we first need to understand what the individual species do and "like" (their environmental niche). That information is not presented, and the methods suggest that the representation of each species in the JSDM is likely to be extremely poor.

      The response of each species to the environmental covariates provides a window into their environmental niche, encapsulated in the beta coefficients for each environmental covariate. This information is presented in Figure 2.

      (2) The hypothesis clearly asks for an analysis of the statistical interaction between human disturbance and co-occurrence. Yet, the model is not set up this way, and the authors thus do a lot of indirect exploration, rather than direct hypothesis testing.

      Our JSDM model is set up specifically to examine the effect of human disturbance on co-occurrence, after controlling for shared responses to environmental variables.  It directly tests the first hypothesis, since, if increase in indices of human disturbance had not tended to increase the measured spatial correlations between species as detected by the model, we would have rejected our stated hypothesis that human modification of habitats results in increased positive spatial associations between species.

      Even when the focus is not the individual species, but rather their association, we need to formulate what the expectation is. The hypotheses point towards presenting the spatial and the temporal niche, and how it changes, species for species, under human disturbance. To this, one can then add the layer of interspecific associations.

      Examining each species one by one and how each one responds to human disturbance would miss the effects of any meaningful interactions between species.  The analysis presented provides a means to highlight associations that would have been overlooked.  Future research could go on to analyze the strongest associations in the community and the strongest effects of human disturbance so as to uncover the underlying interactions that give rise to them and the mechanisms of human impact.  We believe that this will prove to be a much more productive approach than trying to tackle this problem species by species and pair by pair.

      The change in activity and space use can be analysed much simpler, by looking at the activity times and spatial distribution directly. It remains unclear what the contribution of the JSDM is, unless it is able to represent this activity and spatial information, and put it in a testable interaction with human disturbance.

      The topic is actually rather complicated. If biotic interactions change along the disturbance gradient, then observed data are already the outcome of such changed interactions. We thus cannot use the data to infer them! But we can show, for each species, that the habitat preferences change along the disturbance gradient - or not, as the case may be.

      Then, in the next step, one would have to formulate specific hypotheses about which species are likely to change their associations more, and which less (based e.g. on predator-prey or competitive interactions). The data and analyses presented do not answer any of these issues.

      We suggest that the so-called “simpler” approach described above is anything but simple, and this is precisely what the Joint Species Distribution Model improves upon.  As pointed out in the Introduction, simply examining spatial overlap is not enough to detect a signal of meaningful biotic interaction, since overlap could be the result of similar responses to environmental variables.  With the JSDM approach, this would not be considered a positive association and would then not imply the possible existence of meaningful interaction.

      Another more substantial point is that, according to my understanding of the methods, the per-species models are very inappropriate: the predictors are only linear, and there are no statistical interactions (L374). There is no conceivable species in the world whose niche would be described by such an oversimplified model.

      While interaction terms can be included in the JSDM, this would considerably increase the complexity of the models.  In previous work, we have found no strong evidence for the importance of interaction terms and they do not improve the performance of the models.

      We have no idea of even the most basic characteristics of the per-species models: prevalences, coefficient estimates, D2 of the model, and analysis of the temporal and spatial autocorrelation of the residuals, although they form the basis for the association analysis!

      The coefficient estimates for response to environmental variables used in the JSDM are provided in Figure 2 and Figure 2—source data 1.

      Why are times of day and day of the year not included as predictors IN INTERACTION with niche predictors and human disturbance, since they represent the temporal dimension on which niches are hypothesised to change?

      Also, all correlations among species should be shown for the raw data and for the model residuals: how much does that actually change and can thus be explained by the niche models?

      The discussion has little to add to the results. The complexity of the challenge (understanding a community-level response after accounting for species-level responses) is not met, and instead substantial room is given to general statements of how important this line of research is. I failed to see any advance in ecological understanding at the community level.

      We agree that the community-level response to human disturbance is a complex topic, and we believe it is also a very important one.  This research and its support of the spatial compression hypothesis, while not providing definitive answers to detailed mechanisms, opens up new lines of inquiry that make it an important advance.  For example, the strong effects of human disturbance on certain associations that were detected here could now be examined with the kind of detailed species by species and pair by pair analysis that this reviewer appears to demand.

      Reviewer #1 (Recommendations For The Authors):

      L27 indicates instead of "idicates".

      We thank the reviewer for catching that error.

      L64 I would refer to potential interactions or just associations. It is always hard to provide evidence for the existence of true interactions.

      We have revised to “potential interactions” to qualify this statement.

      L69 Suggestion: distort instead of upset.

      We thank the reviewer for catching that error.

      L70-71 Here, authors use the term associations. Please, be consistent with the terminology throughout the manuscript.

      We thank the reviewer for raising this important point.  The term “co-occurrence” appears to be used inconsistently in the literature, so we have tried to refer to it only when referencing the work of us. For us, co-occurrence means “spatial overlap” without qualification as to whether it is caused by interaction or simply by similar responses to environmental factors (see Blanchet et al. 2020, Argument 1). In our view, interactions refer to biotic effects like predation, competition, commensalism, etc., while associations are the statistical footprint of these processes.   In keeping with this understanding, in Line 73, we changed "association" to the stronger word "interaction," but in Line 76, we keep the words "spatiotemporal association", which is presumed to be the result of those interactions. In Line 91, we have changed “interactions” to “associations,” as we do not believe interactions were demonstrated in that study. 

      L76 "Species associations are not necessarily fixed as positive or negative..." This sentence is misleading. I would say that species associations can vary across time and space, for instance along an environmental gradient.

      We thank the reviewer for pointing out the potential for confusion.  In Line 79, we have changed as suggested.

      L78 "Associations between free-ranging species are especially context-dependent" Loose sentence. Please, explain a bit further.

      We have changed the sentence to be more specific; ”Interactions are known to be context-dependent; for example, gradients in stress are associated with variation in the outcomes of pairwise species interactions.”

      L83-85 This would be a good place to introduce the 'stress gradient' hypothesis, which has also been applied to faunal communities in a few studies. According to this hypothesis, the incidence of positive associations should increase as environmental conditions harden.

      In our review of the literature, we find that the stress gradient hypothesis is somewhat controversial and does not receive strong support in vertebrates.  We have added the phrase “…the controversial stress-gradient hypothesis predicts that positive associations should increase as environmental conditions become more severe…”

      L86-88 Well, overall, the number of studies examining spatiotemporal associations in vertebrates is relatively small. That is, bird associations have not received much more attention than those of mammals. I find this introductory/appealing paragraph a bit rough. I think the authors can do better and find a better justification for their work.

      We thank the reviewer for the comments.  We have rewritten the paragraph extensively to make it clearer and to provide a stronger justification for the study.

      L106 "[...] resulting in increased positive spatial associations between species" I'd say that habitat shrinking would increase the level of species clustering or co-occurrence, but in my opinion, not necessarily the incidence of positive associations. It is not clear to me if the authors use positive associations as a term analogous to co-occurrence.

      We thank the reviewer for raising this very important distinction.  Habitat shrinking would increase levels of species co-occurrence, but this is not particularly interesting.  We wanted to test whether there were effects on species interactions, as revealed by associations.  We find that the terms association and co-occurrence are used somewhat loosely in the literature and so have made some new effort to clarify and systematize this in the manuscript.  For example, there appear to be differences in the way “co-occurrence” is used in Boron 2023 and in Blanchet 2020. We do not use the term "positive spatial association" as analogous to "spatial co-occurrence". Spatial co-occurrence, which for us has the meaning of spatial overlap, could simply be the result of similar reactions to environmental covariates, not reflecting any biotic interaction. Joint Species Distribution Models enable the partitioning of spatial overlap and segregation into that which can be explained by responses to known environmental factors, and that which cannot be explained and thus might be the result of biotic interactions.  It is only the latter that we are calling spatial association, which can be positive or negative.   These associations may be the statistical footprint of biotic interactions.

      Results:

      Difference between random and non-random association patterns. It is not clear to me if the reported associations are significant or not. The authors only report the sign of the association (either positive or negative) but do not clarify if these associations indicate that two species coexist more or less than expected by chance. In my opinion, that is the difference between true ecological associations (e.g., via facilitation or competition effects) and random co-existence patterns. This is paramount and should be addressed in a new version of the manuscript.

      This information is provided in Figure 3—figure supplement 1,2,3 and Figure 4—figure supplement 1,2,3.  This is referenced in the text as follows, “… correlation coefficients for 18 species pairs were positive and had a 95% CI that did not overlap zero, and the number increased to 65 in moderate modifications but dropped to 29 at higher modifications" and so on. This criterion for significance (i.e., greater than expected by chance) is now stated at the end of the Materials and methods.  In Figure 3—figure supplement 1,2,3 and Figure 4—figure supplement 1,2,3, those correlations that were significant at p<0.01 are also shown.
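
      The criterion that a 95% interval does not overlap zero can be illustrated with a short sketch. It assumes that a set of sampled values of the correlation coefficient is available for a species pair (for example, draws from a fitted model); the function names and the example samples are hypothetical and are not taken from the authors' R scripts.

```python
import math


def central_interval(samples, level=0.95):
    """Return the (lower, upper) bounds of the central interval of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(math.floor(tail * (n - 1)))]
    upper = ordered[int(math.ceil((1.0 - tail) * (n - 1)))]
    return lower, upper


def is_nonrandom(samples, level=0.95):
    """True if the interval lies entirely above or entirely below zero."""
    lower, upper = central_interval(samples, level)
    return lower > 0 or upper < 0


# Hypothetical samples: one pair clearly positive, one centred on zero.
positive_pair = [0.2 + 0.001 * i for i in range(1000)]
null_pair = [-0.5 + 0.001 * i for i in range(1001)]
print(is_nonrandom(positive_pair), is_nonrandom(null_pair))
```

      Counting the species pairs for which `is_nonrandom` is true and the interval is positive, at each disturbance level, would give tallies of the kind quoted above.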

      I am also missing a more ecological explanation for the observed findings. For instance, the top-ranked species in terms of negative associations is the red fox, whereas the muntjac seems to be the species whose presence can be used as an indicator for that of other species. What are the mechanisms underlying these patterns? Do red foxes compete for food with other species? Do the species that show positive associations (red goral, muntjac) have traits or a diet that are more different from those of other species? More discussion on these aspects (role of traits and the trophic niche) would be necessary to better understand the obtained results.

      The purpose of this paper was to test the compression hypotheses, and we have tried to keep that as the focus.  However, the analysis does open up interesting lines of inquiry for future research to decipher the details of the interactions between species and the mechanisms by which human disturbance facilitates or disrupts these interactions. The reviewer raises some interesting possibilities, but at this point, any discussion along these lines would be largely speculation and could lengthen the paper without great benefit. 

      Reviewer #2 (Recommendations For The Authors):

      The manuscript should be accompanied by all data and code of analysis.

      All data and R scripts have been made available in Science Data Bank: https://doi.org/10.57760/sciencedb.11804.

      The sentence "not much is known" is weak: it suggests the authors did not bother to quantify what IS known, and simply waved any previous knowledge aside. Surely we have some ideas about who preys on whom, and which species have overlapping resource requirements (e.g., due to jaw width). For those, we would expect a particularly strong signal, if the association is indeed indicative of interactions.

      We believe that the reviewer is referring to the statement in Line 90-92 about the lack of understanding of the resilience of terrestrial mammal associations to human disturbance.  We have added a reference to one very recent publication that addresses the issue (Boron et al., 2023), but otherwise we stand by our statement. We have, however, added a qualifier to make it clear that we did indeed look for previous knowledge; "However, a review of the literature indicates that ...."

      Figures:

      Fig. 1. This reviewer considers that this is too trivial and should be deleted.

      This is a graphical statement of the hypotheses and may be helpful to some readers.

      Fig. 2. Using points with error bars hides any potential information.

      Done as suggested.

      That only 4 predictors are presented is unacceptably oversimplified.

      Only 4 predictors are included because, in previous work, we found that adding additional predictors or interactions did little to improve the model’s performance (Li et al. 2018, 2021 and 2022) and could lead to over-fitting.

      Fig. 5. and 6. aggregate extremely strongly over species; it remains unclear which species contribute to the signal, and I guess most do not.

      The number of detection events presented in Table 1 should help to clarify the relative contribution of each species to the data presented in Figures 5 and 6.

      This reviewer considers that the introduction 'oversells' the paper.

      L55: can you give any such "unique ecological information"

      L60: Lyons et al. (Kathleen is the first name) has been challenged by Telford et al. (2016 Nature) as methodologically flawed.

      The first name has been deleted.  The methodological flaw has to do with interpretation of the fossil record and choice of samples, not with the need to partition shared environmental preferences and interactions.

      L61 contradicts line 64: Blanchet et al. (2022, specifying some arguments from Dormann et al. 2018 GEB) correctly point out that logically one cannot infer the existence or strength from co-occurrence data. It is thus wrong to then claim (citing Boron et al.) that such data "convey key information about interactions". The latter statement is incorrect. A tree and a beetle can have extremely high association and nothing to do with each other. Association does not mean anything in itself. When two species are spatially and temporally non-overlapping, they can exhibit perfect "anti-association", yet, by the authors' own definition, cannot interact.

      We believe that the reviewer’s concerns arise from a misunderstanding of how we use the term association.  In our usage, an association is not the same as co-occurrence or overlap, which may simply be the result of shared responses to environmental variables.  The co-occurring tree and beetle would not be found to have any association in our analysis, only shared environmental sensitivities.  In contrast, associations can be the statistical footprint of interactions, and would be overlaid onto any overlap due to similar responses to the environment.  In the case of negative associations, such as might be the result of competitive exclusion or avoidance of predators, the two species would share environmental responses but show lower than expected spatial overlap.  Even though they might be only rarely found in the same vicinity, they would indeed be interacting when they were together.

      Joint Species Distribution Models "allow the partitioning of the observed correlation into that which can be explained by species responses to environmental factors... and that which remains unexplained after controlling for environmental effects and which may reflect biotic interactions." (Garcia Navas et al. 2021). It is the latter that we are calling “associations.”
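
      The distinction between overlap and association can be made concrete with a toy simulation of the tree-and-beetle example. This is a deliberately simplified sketch, not a JSDM: it assumes Gaussian responses, a single environmental covariate and ordinary least squares, and every name and coefficient in it is hypothetical. Two species that respond to the same covariate but do not interact show a high raw correlation, while the correlation of their residuals is close to zero.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ols_residuals(y, x):
    """Residuals of y after a least-squares fit on one covariate x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def simulate(n_sites=2000, seed=1):
    """Return (raw correlation, residual correlation) for two non-interacting species."""
    rng = random.Random(seed)
    env = [rng.gauss(0, 1) for _ in range(n_sites)]
    # Both species track the same covariate; their noise terms are independent.
    tree = [2.0 * e + rng.gauss(0, 1) for e in env]
    beetle = [1.5 * e + rng.gauss(0, 1) for e in env]
    raw = pearson(tree, beetle)
    residual = pearson(ols_residuals(tree, env), ols_residuals(beetle, env))
    return raw, residual


raw, residual = simulate()
print(f"raw correlation: {raw:.2f}, residual correlation: {residual:.2f}")
```

      Adding a shared term to both noise components would produce a non-zero residual correlation, which is the kind of signal referred to here as an association.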

      L63: Gilbert reference: Good to have a reference for this statement.

      This point is important, but the reviewer’s comments below have made it clear that it is even more important to point out that strong interactions should be expected to lead to significant associations.  We have added a statement to clarify this.

      L70-72: Incorrect, interactions play a role, not associations (which are merely statistical).

      In this, we agree, and we have revised the statement to refer to interactions, not associations. In our view, an interaction is a biological phenomenon, while an association is the resulting statistical signal that we can detect.

      L75: Associations tell us nothing, only interactions do. Since these can not be reliably inferred, this statement and this claim are wrong.

      We thank the reviewer for raising this point, but we beg to disagree. Strong interactions should be expected to lead to significant associations that can be detected in the data. Associations, which can be measured reliably, are the evidence of potential interactions, and hence associations can tell us a great deal.  We have added a note to this effect after the Gilbert reference above to clarify this point.

      However, we do accept that associations must be interpreted with caution. As Blanchet et al. 2020 explain, " …the co-occurrence signals (e.g. a significant positive or negative correlation value) estimated from these models could originate from any abiotic factors that impact species differently. Therefore, this correlation cannot be systematically interpreted as a signal of biotic interactions, as it could instead express potential non-measured environmental drivers (or combinations of them) that influence species distribution and co-distribution.”  Or alternatively an association could be the result of interaction with a 3rd species. 

      L87: Regarding your claim, how would you know you DO understand? For that, you need to formulate an expectation before looking at the data and then show you cannot show what you actually measure. (Jaynes called this the "mind-projection fallacy".)

      We are not sure if the reviewer is criticizing our paper or the entire field of community ecology.  Perhaps it is the statement that “….resilience of interspecific spatiotemporal associations of terrestrial mammals to human activity remains poorly understood….”  Since we are confident that the reviewer believes that mammals do interact, we guess that it is the term “association” that is questioned.  We have revised this to “…the impacts of human activity on interspecific interactions of terrestrial mammals remains poorly understood…” 

      In this particular case, we did formulate an expectation before looking at the data, in the form of the two formal hypotheses that are clearly stated in the Introduction and illustrated in Figure 1. If the hypotheses had not been supported, then we would have accepted that we do not understand. But as the data are consistent with the hypotheses, we submit that we do understand a bit more now.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We would like to thank all reviewers for their detailed and constructive feedback, which substantially helped improve the manuscript. We apologise for the time taken for the revisions, which was partially due to the first author (successfully) writing and defending her PhD thesis in the same time frame. We would like to point out already here that, based on reviewers' feedback, main figure 6 is completely redone and the conclusions of this figure have changed substantially. We no longer suggest RNA chaperoning activity (it was identified as being due to the high concentration of TEV protease, in a control suggested by the reviewers). Instead, our refined assay conditions with lower TEV protease concentration identified ribonuclease activity of membrane-bound full-length 2C, which is consistent with a publication from 2022 (PMID: 35947700).


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Evidence, reproducibility, and clarity

      Summary:

      In this study by Shankar and colleagues, the authors aim to understand the structure and function of the enterovirus 2C protein, a putative viral helicase with AAA+ ATPase activity. Using poliovirus (as a model enterovirus) 2C, the authors propose the protein contains two amphipathic helices (AH1 and AH2) at the N-terminus that are divided by a conserved glycine. Using purified MBP-tagged 2C and N-terminal 2C truncations, their data suggest AH1 is primarily responsible for clustering at membranes, whilst AH2 is the main mediator of 2C oligomerisation and membrane binding. Furthermore, 2C was suggested to be able to recruit RNA to membranes, with a preference for dsRNA, and the authors' data imply that the helicase activity of 2C is ATP-independent. Instead, the ATP activity appears to be required for 2C hexamer formation or chaperone activity. The manuscript is generally well written/presented and the authors present very interesting data which raise several questions, some of which require additional experimentation to help support the authors' conclusions. Specific comments are as follows.

      We thank the reviewer for the overall positive assessment, as well as the specific comments below.

      Major Comments:

      1. The authors use four main constructs throughout the paper: full-length 2C, 2C with deletion of AH1 (ΔAH1), 2C with both AH1 and AH2 deleted (ΔMBD) and 2C with an extended N-terminal deletion. From this, the authors draw conclusions on the function of both AH1 and AH2. One of the authors' main conclusions is that AH2 is the main mediator of 2C membrane association (e.g., in line 169). However, is it possible to conclude the relative importance of AH1 vs AH2 without testing a construct containing the deletion of AH2 only (ΔAH2)? This should be generated and used alongside this data to fully define the relative importance of AH1 and AH2 in these assays and remove the possibility that the deletion of AH1 changes the structure and/or function of AH2, which could also result in the observed differences.

      This was a very good suggestion. We expressed and purified the ΔAH2 protein requested by the reviewer and characterized its oligomeric state as well as its membrane binding. It turns out, as suspected, that the ΔAH2 protein behaves very similarly to the ΔMBD protein (i.e. it does not form higher order oligomers and does not bind membranes). The changes in the manuscript due to this addition are many but can primarily be found in main figures 2-3 and their associated supplementary figures.

      Previous structural predictions of 2C do not appear to have two separate AHs at the N-terminus. Are the AH1 and AH2 structures predicted to be formed in the context of the entire 2C protein, 2BC precursors and polyprotein? Are there structural approaches that could provide experimental evidence for two separate AH at the N-terminus?

      This is a good point. Previous predictions were not that detailed, partially since they were done in the pre-AlphaFold era. Unfortunately, we cannot think of a tractable experimental method that could verify the split nature of the amphipathic helix in the only context that would matter: the protein bound to a membrane. A long-term goal would be in situ structures of full-length 2C on membranes using cryo-electron tomography, but our current sample and data sets are not sufficient for this. We added a mention of the long-term need for experimental structures of full-length 2C on lines 315-318 in the discussion.

      Why are the 2C dimers (lines 137-138) not apparent on the mass photometry data presented (figure 2)?

      Different constructs were measured by mass photometry and SEC-MALS. Also, the required concentration is 100-1000x lower for mass photometry, which would affect a dynamic equilibrium even if the same construct were measured by the two methods.

      It appeared that binding of ΔMBD-2C was better when POPS is in the membrane (line 174). What is the explanation for this and was this finding significant?

      Well spotted. It may mean that 2C has a second, lower affinity membrane-binding site which is charge-dependent somewhere outside the MBD. We now added a mention of this in the discussion, lines 321-323.

      From the authors' data on lipid drop clustering they conclude ΔAH1 is more effective for clustering, however, the ΔAH1 construct produces pentamers not hexamers (from Figure 2). Is formation of hexamers related to or required for membrane clustering?

      ΔAH1 is LESS effective at clustering, not more. As for the mention of pentamers in the original submission: we now think this was an unfortunate choice of words. The mass photometry data for 2C(ΔAH1) could more parsimoniously be interpreted as a mix of hexamers and other (unknown to us) smaller oligomers such as trimers. We have removed all mentions of pentamers.

      The replicon data presented in Figure 7 should include a replication-defective control (e.g., polymerase mutant), in order to compare how defective in replication the ΔAH1 and ΔMBD deletions are compared to a fully-defective construct. Likewise, deletion of AH1 in this construct is likely to affect processing of the viral polyprotein, where several previous studies with picornaviruses have demonstrated that the residues in the P2'-P4' positions can change cleavage efficiency (e.g., PMID: 2542331), or the structure of 2C, leading to the reduction of replication.

      Thanks for these good comments. We made the polymerase-dead (GDD-to-GAA) replicon and remeasured it side by side with the 2C replicons. It has a similar luciferase activity, indicating that no replication takes place in the 2C deletion replicons. This is shown in the new figure 7. As for the possibility of processing defects, we mentioned this in the original discussion and have now cited the reference suggested by the reviewer in this context (line 324).

      How does the authors' model of ATPase-independent helicase activity and an ATP-dependent required RNA chaperone activity fit with the 2-step model for RNA binding and ATPase activity suggested by Yeager et al (PMID: 36399514)?

      Acting upon comments from other reviewers, we completely redid the "helicase assay" in the revised manuscript. It turns out that the ATP-independent unwinding activity in the original submission was an artefact of the assay conditions (specifically, of the TEV protease at the higher concentration we used in the old assay). In our improved assay we neither see helicase activity nor ATP-independent RNA chaperoning activity.

      Optional major comments that would increase the significance of the work:

      All of the optional comments below are exceptionally interesting. But given the long time needed for the several major changes to this manuscript (e.g. the ΔAH2 protein characterization and reoptimisation of the helicase assay) we believe it is more sensible to address them in future studies, for which the 2C reconstitution system can be used.

      The preference for dsRNA over ssRNA appears to be quite small (Figure 5d). In the context of a viral infection where ssRNA is likely to outnumber dsRNA at different times during infection, is this preference physiologically relevant? In relation to this, what size stretch of dsRNA is required for preference, and could this correspond to cis-acting RNA structural elements, dsRNA as it escapes 3D polymerase or as part of the RF and RI forms (PMID: 9343205)? What is the proposed mechanism of how dsRNA outcompetes membrane tethering of 2C? OPTIONAL

      The authors' study has been conducted in the absence of other viral non-structural proteins. What is the physiological importance of the observations, such as membrane interaction/clustering or RNA binding, when presented in the context of the other replication machinery? OPTIONAL

      Do 2C monomers, dimers and hexamers have different functions in viral replication, perhaps at different stages of replication, and which of these forms are relevant during viral infection, or can they all be detected during infection? Can any suggested separate functional arrangements be separated by genetic complementation experiments? OPTIONAL

      Minor comments:

      1. The authors appear to interchange between naming/nomenclature of the constructs, which makes it confusing to follow (for example, ΔMBD is the same as 2C(41-329); likewise, 2C(Δ115) is sometimes called 2C(116-329)). It would be much easier to follow if the naming of constructs was consistent throughout (unless I am misunderstanding some subtlety in the difference between such constructs).

      Thanks very much for spotting this. We have fixed it.

      The authors suggest a pentamer arrangement for the ΔAH1 construct; however, in the mass photometry data (figure 2D), a hexamer is indicated with the arrow. It would be helpful to change the label to indicate the size of the pentamer where this is being generated, not the hexamer.

      As mentioned above, we think the "pentamer" designation of the original manuscript was unfortunate. It is more parsimonious to interpret this as a mix of states: hexamers and undefined smaller oligomers.

      In most figures, data for full-length 2C, ΔAH1 and ΔMBP is shown. However data for ΔMBP is missing in Figure 4. Using ΔMBP may demonstrate even lower clustering, hinting that AH2 is also involved in this process.

      Thanks for this comment. In our view, it can be derived from figure 3 (which shows lack of binding to PC/PE membranes) that the ΔMBD construct would not cluster membranes under the conditions of the assay (clustering requires concomitant binding to two membranes). We now describe our rationale for this on lines 220-222. However, we did include the ΔMBD protein in the new negative staining TEM supplementary figure where it and ΔAH2 show no signs of clustering (figure S10).

      I think it would be better to normalise the data in the flotation experiments such that the percentage of 2C in the upper fraction is presented relative to the amount of lipid in the upper fraction (presented in Figure S4).

      The change suggested by the reviewer would make it impossible to show the important no-liposome control (leftmost bar in Fig. 3C) in the same plot as the other measurements. We believe that would unnecessarily complicate the figure. Thus, we opted to keep the measurements that are normalised by lipid fluorescence in the supplementary figure. Instead, we now added another mention of this supplementary figure in the legend to main figure 3.

      At several places (e.g., lines 232 and 272) the authors refer to "realistic systems". I think the term "physiologically relevant" might be more appropriate.

      Agreed and changed throughout.

      Line 237: I think "y" is a typo and should read "by".

      Thanks. This text was reworked due to the major changes to figure 6.

      Reviewer #1 (Significance (Required)):

      Significance

      I have limited expertise with structural biology but specialise my research on positive-sense RNA virus replication, structure and function. This research is of interest to a broad audience of researchers investigating many positive-sense RNA viruses, which extends beyond the viral family studied here. The work utilises novel techniques to begin to understand the specific roles of 2C in poliovirus replication. The author's data add important incremental new insight into recent studies on viral helicase proteins as referenced in the study, however, a key limitation is understanding the importance/relevance of their observations during a viral infection.

      We thank the reviewer for this positive and nuanced appraisal of our work.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors present an alternative assay system to investigate picornavirus 2C, a protein that is tricky to analyze biochemically in its full-length form because of an amphipathic helix at the N-terminus. Poliovirus 2C is expressed with an N-terminal MBP tag, a 50kD protein that helps with solubility, as is commonly used for 2C investigations. A difference here is that liposomes are included to mimic membranes for 2C attachment. The key findings are that 2C induces clustering of liposomes, that double-stranded RNA binding by 2C impacts this clustering effect, and that a free N-terminus (after cleavage of MBP by TEV protease) is needed for RNA binding and an ATP-independent (i.e. non-helicase) RNA duplex separation activity.

      Major:

      In the flotation assays in figure 3 the authors use a system where MBP-2C is fluorophore-labeled with ATTO488 on exposed cysteines. Poliovirus and other enterovirus 2C has a very well characterized zinc finger domain that has cysteines coordinating a zinc ion. Mutation experiments previously showed that these cysteines are necessary for viral replication and 2C stability. Have the authors controlled for disruption of the zinc finger domain by the labelling of cysteines with ATTO488 and checked if the protein remains folded?

      We completely agree with the reviewer and apologise for the omission in the original submission. We have now included a Zn content measurement, which shows unchanged levels between labelled and unlabelled 2C protein (Figure S7). Also, the revised manuscript now explicitly describes our original reasoning for labelling on native cysteines: the presence of two cysteines which are not necessary for viral replication and which are more solvent-exposed (and thus more likely to be labelled) in the crystal structure of the soluble fragment of 2C (lines 176-181).

      In the analysis of the amphipathic helix, did the authors include membranes in their structural predictions or just the free helix? How does inclusion of membranes impact the predictions? In the predictions in Figure 1D, only 2 of 4 show a kink and there doesn't seem to be a correlation between those that predict a kink or not and whether the hydrophobic side is aligned in Figure S1.

      Unfortunately, predicting a protein structure with the interacting membrane is beyond what is currently doable with protein prediction methods (one would have to combine protein structure predictions with molecular dynamics simulations including a membrane). Based on general principles of protein structure, it is likely that there is some flexibility around G17. Thus there may not be a single "kink angle" for any given virus, but we believe that the presence of the kink (and offset hydrophobic surfaces) for a number of viruses lends credibility and robustness to the observation. We added some descriptions of this thinking on lines 126-127.

      Based on previous structures of 2C from different viruses, the N-terminal amphipathic helix-containing region is predicted to localize on one face of the predicted hexameric structure, tethering 2C to the membrane. How does the authors' hypothesized model explain 2C-dependent clustering? Is there evidence that 2C hexamers can oligomerize further into dodecamers, for example, maintaining separate faces to enable N-terminal interaction with different membranes? What is the distance between the liposomes in figure 4 at the points of density attributed to 2C? How does this compare to the size of 2C determined in previous structural studies? Is it consistent with one hexamer/2 hexamers sitting on top of one another?

      These are very interesting questions but we believe it is prudent to limit our speculation at this point. Eventually, we hope that larger data sets of cryo-electron tomography, coupled to subtomogram averaging, may provide a more definitive answer. What we managed to do with our current cryo-electron tomography data set is to estimate the volume of individual protein densities, and from the volume calculate an estimated molecular mass of the individual complexes seen in the tomograms. This correlates very well with 2C hexamers (new figure 4D).
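
      As an illustration only (this is not necessarily the procedure used for figure 4D), a volume-to-mass conversion of this kind can be sketched as follows. The sketch assumes a typical protein partial specific volume of 0.73 cm³/g; the function names, the example volume and the monomer mass are hypothetical and chosen purely for demonstration.

```python
# Illustrative sketch only: converts a segmented density volume into an
# approximate protein mass. All names and example numbers are assumptions,
# not values taken from the manuscript.

AVOGADRO = 6.02214076e23          # molecules per mole
CUBIC_ANGSTROM_PER_CM3 = 1e24     # 1 cm^3 = 1e24 cubic angstrom


def volume_per_dalton(partial_specific_volume=0.73):
    """Return cubic angstrom occupied per dalton for a given
    partial specific volume in cm^3/g (0.73 is a typical protein value)."""
    return partial_specific_volume * CUBIC_ANGSTROM_PER_CM3 / AVOGADRO


def mass_kda_from_volume(volume_a3, partial_specific_volume=0.73):
    """Estimate molecular mass (kDa) from a density volume in cubic angstrom."""
    return volume_a3 / volume_per_dalton(partial_specific_volume) / 1000.0


def nearest_oligomer(mass_kda, monomer_kda):
    """Return the number of subunits (at least 1) closest to the estimated mass."""
    return max(1, round(mass_kda / monomer_kda))


if __name__ == "__main__":
    example_volume_a3 = 270_000.0   # hypothetical measured volume
    monomer_kda = 37.5              # hypothetical monomer mass
    mass = mass_kda_from_volume(example_volume_a3)
    print(f"estimated mass: {mass:.0f} kDa")
    print(f"closest oligomeric state: {nearest_oligomer(mass, monomer_kda)}-mer")
```

      Such an estimate depends strongly on the density threshold used to delimit each particle, so it is best read as a consistency check on the oligomeric state rather than as a precise mass measurement.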

      In the Discussion lines 278-285 the authors suggest that having MBP attached may reflect the polyprotein condition. Can they make a construct with MBP-2B2C to examine interaction with liposomes and assess 2C function?

      This is a highly relevant question, but the biochemistry of 2BC is even more challenging than 2C, and we are unfortunately nowhere near being able to work with purified 2BC at the moment.

      Discussion lines 293-296: the possibility of two different populations of 2C, binding RNA or membranes, cannot be excluded. There is much more 2C around late in infection than is present in early infection; the model in figure 8 doesn't acknowledge/capture this.

      We have changed the model figure such that more 2C is seen later, and the clustering function is also seen late in infection. The original discussion text referred to (which is unchanged) talks about a "preferential role in RNA replication and particle assembly at later time points" specifically for this reason. We hope the new figure 8 is better at conveying this message.

      Discussion lines 313-317: the authors don't reference a study that seems pertinent here, in which a mutant of foot-and-mouth disease virus 2C lacking the N-terminal amphipathic helix, which could bind but not hydrolyze ATP, hexamerized in the presence of RNA (PMID: 20507978).

      Thanks for the suggestion. However, after the extensive changes we made to the revised figure 6 based on excellent reviewer comments (essentially: the RNA chaperoning activity turned out to be an artefact; the improved assay shows no sign of RNA unwinding but instead shows 2C-mediated ribonuclease activity), these sentences of the original discussion lost most of their context and we opted to remove them.

      Some evidence of MBP-2C cleavage by TEV in the different assays used should be presented as this is a major focus of discussion and currently no gels show TEV cleavage is happening.

      Thanks for the suggestion - we agree. We now show these in the new supplementary figures S5 and S12.

      Reviewer #2 (Significance (Required)):

      The work presents an additional methodology to investigate a protein that has previously been difficult to study. The authors acknowledge that there is still a lot of 2C biology that remains to be discovered.

      Thanks, we agree.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript provides insights into the role of the N-terminus in membrane binding and its importance in the various functions of 2C.

      Major issues

      Line 103-119. Is this novel? I thought people had done a lot of bioinformatic analysis of PV 2C (especially Wimmer) who also did mutational work to analyse the importance of various amino acids in the N-terminal helix. I feel like the paper in general, and this section in particular, underplays the large body of work that has been done on the amphipathic helix by various groups.

      We apologise if our original manuscript didn't sufficiently acknowledge previous work in the field. In the first sentence of the mentioned paragraph (now lines 112-113), we did however cite several papers that have previously addressed the amphipathic nature of the N-terminus of 2C. We have now added two more references along the same line, and changed the wording in a way that we hope better conveys that the amphipathic nature per se has been studied before. We would be happy to add more specific references if the reviewer has any suggestions. However, the rest of our analysis IS indeed novel for the following reasons: (i) we show that the amphipathic region is not a simple, single amphipathic helix, but instead has a conserved glycine (helix breaker/destabiliser residue) and two distinct amphipathic stretches before and after this region, (ii) we use AlphaFold2 (not available at the time of the earlier work) to provide the first reliable structural models of the membrane-binding domain. These models consistently, across several enterovirus 2C proteins, reveal that the hydrophobic surfaces of the first and second amphipathic regions, on either side of the conserved glycine 17, are offset from one another. This lends additional credibility to the distinct nature of these regions, which have not previously been identified as such and which we also show in the biochemical assays to be functionally distinct. We have now also added a clarification to the Discussion that the N-terminus of 2C had previously been identified as its membrane-binding domain and we cite references for this. We hope that these changes will sufficiently acknowledge earlier work in the field while clearly pointing out the advance that our paper makes.

      Line 132. Did you validate your column with known MW standards? The peak for full length and deltaAH1 look fairly standard for 2C, in that you have a mixture of species. Not sure you can say it is a hexamer when it is such a broad peak. C doesn't really help you too much since the counts at 400 (pentamer) and 480 (hexamer) are almost the same with quite large error bars. Like most people that have worked with 2C I think the best you can say is that you are making some kind of oligomerized 2C that includes hexamer, pentamer, etc. Why no dimer for MBP-2C and MBP-2C(delta AH1) when compared to the other constructs?

      We did not calibrate the gel filtration column since the outcome would in any case be a cruder estimate of molecular mass than the mass photometry and SEC-MALS measurements. But we do agree with the reviewer on the broad mass photometry peaks. To address this experimentally, we compared the existing MBP-2C spectra to new recordings on apoferritin, a highly stable homomultimeric protein complex of a similar mass to an MBP-2C hexamer. The apoferritin mass estimate is overlaid with the full-length MBP-2C in the new figure 2D and the corresponding supplementary figure S3. This indeed shows that the MBP-2C peak is broader, i.e. consistent with a mix of species which are predominantly but not only hexamers. We describe and discuss this on lines 145-149. As for the mention of pentamers in the original submission: we now think this was an unfortunate choice of words. The mass photometry data for 2C(ΔAH1) could more parsimoniously be interpreted as a mix of hexamers and other (unknown to us) smaller oligomers such as trimers. We have removed all mentions of pentamers.

      Line 143. Does your data show that there are two amphipathic helices? Bioinformatics suggests it but your experiments just show the importance of the two areas in oligomerization, not that it is forming two helices.

      We agree that the choice of words was not ideal and have now changed it to "structure predictions indicate" (line 162).

      Figure S2. Your preps are still relatively dirty, which isn't ideal for biochemical assays. Especially lane 3, where you are looking at 50-60% purity. I don't want you to re-run experiments but I think you need to comment on the purity of the protein you are working with. Also I don't like that you removed the top and bottom of the SDS-PAGE. How much protein never entered the gel. Is there a big fat band at 20 kDa? You need to have the full gel here. Did you measure 260 nm of the preps as well to see if you had bound RNA to the 2C?

      Thanks for the comment, we agree that our original submission lacked detail in the description of the protein purification. This is now addressed with the new figure S2 which shows size exclusion chromatograms of the fluorophore-labelled proteins (same chromatograms as in figure 2) and the corresponding uncropped gels imaged both in the stain-free channel (showing all proteins) and in the fluorescence channel. The A260/A280 ratio measured for all proteins shows that they are free of nucleic acids at the point of imaging. The protein preps are not 100% homogeneous but we do believe that they are more than 50-60% pure.

      Lines 170. Wasn't this done in the recent "An Amphipathic Alpha-Helix Domain from Poliovirus 2C Protein Tubulate Lipid Vesicles"? I don't see it referenced. What is novel about the current work when compared to that paper? Any differences?

      Thanks for pointing this out. The referenced study worked with a synthesized, isolated peptide corresponding to AH2 (i.e. not with full protein). An amphipathic peptide outside the context of its protein cannot be expected to recapitulate the properties of the entire protein, e.g. since it is not spatially constrained in how it interacts with membranes. As one example (relating to the title of that paper), we don't see full-length 2C protein tubulating membranes the way the isolated peptide does. As for the reviewer's question about novelty, the paper mentioned does not identify the split nature of the amphipathic region, does not consider the role of AH1, does not characterise the membrane-binding properties of full-length 2C with respect to liposome membrane composition and size, and does not identify and characterise the membrane clustering properties of 2C or its interactions with nucleic acid when bound to a membrane. However, we do agree that we should have cited the paper in our manuscript. We now cite it in the discussion, lines 320-321.

      I'm surprised by the lack of electron microscopy (negative stain mostly) of both the oligomerized 2C and the various liposomes. I know the Carlson group is a microscopy group so why the lack of validation using electron microscopy of the various DLS experiments? I know you did cryo-ET for one of the constructs but I think negative stain electron microscopy of other constructs would be useful.

      Thanks for the suggestion. As suggested, we have now expanded the analysis with negative staining EM of several more constructs studied by DLS. It can be found in the new supplementary figure S10.

      Figure 4C. What evidence is there that this is 2C apart from you added it to the liposomes? It also comes back to the relative impurity of your protein prep. Could this be E.coli contamination?

      Thanks for this comment. We have now added a new supplementary figure (S5) showing SDS-PAGE gels of the reactions used for flotation and DLS assays, which are identical to the cryo-ET samples. In addition, we estimated the molecular mass of the individual, putative 2C densities in the cryo-electron tomograms by measuring their volume. This analysis, which can be found in the new figure 4D, shows that the estimated mass of individual protein densities is consistent with a hexamer of full-length 2C. In addition, we mention in the discussion the long-term need to determine high-resolution structures of membrane-bound 2C using cryo-ET and subtomogram averaging (lines 315-318).

      Figure 8. Is this model supported by the data in this paper? Your cryo-ET says that 2C is there but that isn't supported by any other data. How is the dsRNA protected from the innate immune system in this model? Is it just sat out in the cytosol? How is the nascent ssRNA packaged into the capsid? Is there competition between the dsRNA and capsid for 2C binding (which your model suggests)? I know it sounds like I am being overly critical of the model but in my opinion there are still too many unanswered questions in the field to come up with a half decent model.

      Thanks for this comment. We are the first to agree that our understanding of the roles of 2C is far from complete! We should have been clearer that the model figure represents some of the roles of 2C identified to date, and does not claim to be complete. However, we do feel that a model figure serves the purpose of putting our findings into context and providing testable hypotheses for future research. As for the question, some of the roles of 2C shown in the model figure (in particular, particle assembly) are supported by earlier work of ourselves and others rather than by the data in this paper. We have now produced a new model figure and changed the figure legend to better reflect the incompleteness of the current understanding, and the origin of the different parts of the model figure. In addition, we extended the final paragraph of the discussion (which lists still-unknown aspects of 2C) with the reviewer's mention of dsRNA shielding from innate immunity (lines 374-375). The other aspects mentioned by the reviewer as not yet fully understood are already mentioned in that paragraph.

      Minor issues

      Lines 43-45: I feel like you underplay the success of the poliovirus vaccination program. Approximately 30 cases of WPV1 in 2022 and the full eradication of WPV2 and 3. Vaccine-derived polio is still an issue but even that is relatively low compared to where the world was in the 1950s.

      We agree that the previous wording was not ideal. We replaced it and added another recent reference - related to the type 2 vaccine switch (lines 47-49).

      Line 66. I agree there are 11 individual proteins but I feel like this leaves out the fact that some of the uncleaved precursors appear to have some functions, for example 2BC.

      Good point. We have now added a mention of 2BC and the fact that it has distinct functions to the introduction (lines 70-71). 2BC is also mentioned in the legend of the model figure (figure 8).

      Line 56: LD needs to be defined.

      Well spotted thanks. Since the abbreviation was not used anywhere else we opted to spell it out instead (line 59).

      Line 75. I think you have misrepresented Xia et al here. They clearly say that in their study that they show helicase and chaperone activity. I never managed to repeat that work but you should still report what they claim. One major thing is that they used insect expressed protein, whereas most people (including myself and in the paper under review) use E.coli expressed protein. Do post translational modifications play an important role in function?

      You are right that the reference to their paper for this statement was incorrect. We have now made this part of the introduction more explicit (lines 82-83), and in the new discussion we also mention the possibility of e.g. post-translational modifications affecting 2C helicase activity, with reference to Xia et al (lines 359-361).

      Line 103. Need to make it clear here it is poliovirus 2C.

      Thanks, we added it (line 112).

      Line 135. I assume you mean kDa instead of uM?

      It should actually be μM. It is the solution concentration at which the assay was performed. We added some words to clarify this (line 154).

      Figure 3. What do you mean by "Only 2C"? Is that MBP-2C? Maybe I am reading the data wrong but adding TEV does nothing? How do you know TEV is removing the MBP? It looks like MBP-2C binds to the liposomes just the same as cleaved MBP-2C. I see in line 165 you acknowledge this. Could an alternative conclusion for line 168 be that MBP isn't being cleaved off but that AH2 is too small to be exposed in that construct? Did you do that construct without MBP being cleaved? I think you need to confirm that MBP is being cleaved off.

      Thanks for spotting this mistake. It should indeed be MBP-2C (in the absence of liposomes). We corrected figure 3. Also, in response to this comment and similar ones, we have now added a new supplementary figure showing SDS-PAGE gels of the reaction loaded onto flotation assays and DLS (figure S5). It shows that MBP-2C is cleaved.

      Line 184. Is there a reason you use the 2019 paper as a reference instead of the far earlier Bienz et al papers? I'd suggest they are the seminal papers on 2C membrane association. Once again how is this work different from the recent "An Amphipathic Alpha-Helix Domain from Poliovirus 2C Protein Tubulate Lipid Vesicles" paper?

      See our response above regarding the paper mentioned here (which we have now cited). As for why we cite the 2019 paper here: our statement pertains specifically to the contact sites between lipid droplets and replication organelles, not to the membrane binding of 2C per se. We have now added a more general mention of membrane remodelling by non-structural proteins in the introduction, where we cite one of the Bienz papers (lines 75-77).

      Figure 5D. So only 1-3% of RNA is found in the upper fraction? Is that significant enough to say that dsRNA was recruited significantly more than ssRNA? How confident are you in your quantification of the starting amounts of RNA?

      We agree that the fraction is low; however, the fluorescence signal is very clearly above background. We are thus confident in the measurement. The low percentage at the end of the experiment likely has a simple physico-chemical explanation: in a dynamic equilibrium in a density gradient, whatever RNA dissociates during the run will migrate away from the 2C-vesicle fraction and not be able to rebind. We still tried to address this concern by a complementary experiment where we used fluorescence anisotropy to measure binding of RNA to 2C on vesicles. While the measurements showed the same tendency, the curves were not clean enough to be published, which we think is due to the complex system with 2C bound to vesicles and clusters of vesicles. Still, in view of the relatively low percentage of measured recruitment we opted to adjust the paper title and the title of figure 5 (including the subheading related to figure 5) to put less emphasis on the dsRNA recruitment.

      Line 223. Any idea why the MBP needs to be cleaved off? Clearly the MBD is accessible or it would not bind to the liposomes.

      Since we have no data directly supporting this we prefer not to speculate in the paper. But one guess would be that the NTD of 2C, as implicated by previous publications, has a dual role in membrane binding and RNA binding. It may be that it can bind membrane while conjugated to MBP, but needs MBP to be removed in order to simultaneously bind membrane and RNA.

      Line 237: missing "b" in "by"

      Thanks. This paragraph was rewritten in the light of the changes to figure 6.

      Figure 6. I don't fully understand the results here. Earlier you showed that the delta MBD didn't really bind SUV. So presumably it isn't really membrane bound. Why does it have similar activity to full-length MBP in your helicase assay if membrane is important? Did you do SUV and TEV protease only control?

      We are very grateful to this reviewer (and others) for pointing out the need for a TEV control. When performing the control, we found that the TEV protease, at the high concentrations initially used, surprisingly had an artefactual RNA chaperone-like effect on its own. We then proceeded to titrate down the TEV protease concentration to the point where it no longer interfered. At this TEV protease concentration, although 2C was substantially cleaved (see the new supplementary figure S12), we could no longer detect an RNA chaperone activity. Thus, the contents of the new figure 6, and its conclusions, have been substantially changed. We now focused our attention on the remaining effect that 2C has on RNA: single-strand ribonuclease activity. These experiments were all conducted in the presence of RNase inhibitors, and the presence of Mg2+-dependent ribonuclease activity parallels a recent publication that found this for truncated 2C from hepatitis A and several enteroviruses.

      Line 257: "staring"?

      Thanks, corrected. A staring glycine would indeed be something strange.

      Line 336. Need to change the u to mu.

      Thanks, corrected.

      Any discussion on your observation in Figure 1D that EV71 and CVB3 don't appear to have AH1 and AH2 or do you think that the domains are conserved across the different viruses?

      Thanks for bringing this up. Based on this and a comment from another reviewer, we have now clarified our thinking around this. Since the glycine will introduce some flexibility between AH1 and AH2, we cannot say from the single alphafold predictions that this is THE kink angle. The presence of the kink in the predictions of several MBDs lends more credibility to the robustness of the observation, but most importantly the hydrophobic surfaces in AH1 and AH2 are non-aligned for ALL sequences we looked at. This is now described on lines 126-128.

      Table 1 (and possibly elsewhere): an apostrophe is not the prime symbol. 5' compared to 5′.

      Thanks, we corrected this throughout.

      Line 702 "and" should be "an".

      Thanks, corrected.

      I couldn't open one of the movies (140844_0_supp_2820374_a2g272.avi).

      Sorry to hear this, we will check the movie again.

      Reviewer #3 (Significance (Required)):

      Overall I liked the paper and is worth publishing. One of the issues in the 2C field is the difficulty in making pure 2C and carrying out in vitro assays that correlate with what is observed in the natural infection. I think this paper suffers from similar struggles with a 2C preparation that doesn't appear that pure. I think it also suffers from not having 2C from a wild-type infection. I don't think that it is feasible to get that kind of 2C but by once again using a recombinant protein from E.coli we are left with another manuscript that provides conflicting evidence of the functions of 2C without a definitive answer. The experiments are well done, although are missing some controls and the manuscript is laid out in a logical manner and is relatively easy to follow.

      We thank the reviewer for these comments. We believe that we have now provided better information regarding the purification of the recombinant 2C protein, and we do think that the controls present in the original manuscript and the revised manuscript alleviate the concerns about lack of specificity. Of course, isolating 2C vesicles from wild-type infection would be another interesting way of approaching its function, but such an approach would come with its own set of challenges related, e.g., to the presence of confounding host factors.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      This is an interesting manuscript that reports the development of an in vitro membrane assay for probing the biochemical functions of the enterovirus 2C protein. The technique is interesting because it can be applied to 2C proteins from other members of the picornavirus family, an important group of mammalian pathogens. It has the capacity to probe different functions (e.g. membrane clustering, ATPase activity, RNA-binding and manipulation activities).

      Overall, the manuscript is well written and gives a clear account of the work undertaken. It adds insight to previous studies of enteroviral (and picornaviral) 2C proteins, providing confirmation of some earlier work in a more physiological context and some new insights, particularly into the membrane and RNA binding aspects of 2C.

      That said, there are a number of places where some amendment of the claims made is required to provide a more precise statement of the findings of this work. These are listed below.

      We thank the reviewer for this positive feedback on our work, as well as for the specific comments below.

      Line 21 (Abstract) - The authors claim to have shown that a conserved glycine divides the N-terminal membrane-binding domain into 2 helices. I would suggest instead what they have produced are computational predictions that this is the case - some way short of an experimental demonstration. Sequence analysis predicts helical secondary structure in the N-terminus and indeed Alphafold2 also predicts a helical structure, but these predictions require experimental verification. The authors should therefore rewrite sections that claim to have shown the presence of 2 helices. In doing so, they should perhaps also comment on the fact that Alphafold2 does not predict 2 helices in this region for all enteroviruses (see Fig 1D). Moreover, the sequence analysis in Fig. S1 shows the presence of two Lys residues in the segment 17-38; it would be interesting for the reader to have these indicated in the figures showing the Alphafold2 prediction - do they in any way interrupt the hydrophobic face of the predicted helix?

      Thanks very much for this comment, which is in line with what other reviewers also wrote. We agree, and changed the abstract sentence. We have also rewritten the manuscript in several places to address the limits of structure predictions and the eventual need for an experimental structure of full-length membrane-bound 2C (lines 126-128 and 315-318).

      Line 82 (Introduction) - The authors write that the membrane binding domain (MBD) of poliovirus has been shown to mediate hexamerisation, citing Adams et al (2009) - reference 43. However, that is not what this paper shows. Rather it provides evidence of aggregation of an MBP-2C fusion protein into forms that ranged from tetramer to octamer, but no evidence that these aggregates assume functional forms (e.g. the presumed hexameric ring structure characteristic of the AAA+ ATPase family to which 2C belongs). As far as I am aware the first demonstration of hexameric ring formation by a picornaviral 2C protein was for the 2C of foot-and-mouth disease virus (see Sweeney et al, JBC, 2010). Although this is not an enterovirus, this finding was later confirmed for Echovirus 30 (ref 51). I should declare an interest here: the Sweeney paper is from my lab. I will leave it to the editor and the authors to determine how to write a more precise account of the early observations of hexamerisation in picornaviral and enteroviral 2C proteins.

      Thanks very much for this insightful comment. As a response to this and other similar comments, we are much more cautious about our wording in the revised manuscript (see also response to comment below). In the part of the introduction discussed here (now lines 89-91) we now use the original wording of the Adams paper ("oligomerization"). In the context of that new text we didn't feel that the Sweeney et al. paper was a suitable reference, but we now cite it in the later mention of 2C's oligomeric/hexameric state in the first part of the Results (lines 137-138).

      Line 132 - the authors used mass photometry to investigate oligomeric forms of their MBP-2C constructs and state that for the full length 2C protein "the high-mass peak closely corresponds to a hexamer". While it is true that the peak shown in Fig 2C aligns with the expected MW for an MBP-2C hexamer, the peak is very broad, indicative of the presence of other oligomeric states with lower and higher numbers of monomers. This should be commented on. Indeed, the finding seems to echo the early findings of Adams et al (ref 43) with poliovirus MBP-2C.

      Thanks for this comment, which was also made by another reviewer. We cite here what we replied to that reviewer:

      ...we do agree with the reviewer on the broad mass photometry peaks. To address this experimentally, we compared the existing MBP-2C spectra to new recordings on apoferritin, a highly stable homomultimeric protein complex of a similar mass to an MBP-2C hexamer. The apoferritin mass estimate is overlaid with the full-length MBP-2C in the new figure 2D and the corresponding supplementary figure S3. This indeed shows that the MBP-2C peak is broader, i.e. consistent with a mix of species which are predominantly but not only hexamers. We describe and discuss this on lines 145-149.

      Line 143 - for the reasons given above, this summary paragraph represents too strong a statement of what has been observed.

      We agree, and changed the paragraph. It now only refers to "oligomerization" (lines 162-164).

      Line 197 - I note that the authors did not test the membrane clustering capabilities of the 2C(41-329) construct. Although the 2C(deltaAH1) construct had already shown a significant loss of activity, the shorter construct could still have been a useful control. I don't think it is necessary for this experiment to be done, but if the authors have a rationale for not performing the experiment, perhaps they could include it in a revised manuscript.

      Thanks for the suggestion. The rationale is that a protein that doesn't bind a membrane in the first place will also not cluster them (an action that requires binding TWO membranes). We now describe our reasoning on lines 220-222. Nevertheless, we did test these constructs in the new supplementary figure showing negative staining TEM (figure S10).

      Line 223 - typo. I think you mean MBD.

      Thanks! Corrected (now line 257).

      Line 215 - the authors observed that the presence of ssDNA reduced membrane clustering and conclude that "nucleic acid binding partially outcompetes membrane tethering activity". Two things: (1) although I agree it is likely that this effect is due to binding of DNA to 2C, binding has not been demonstrated experimentally so the authors should be more careful in how they describe their result; (2) there is no data presented to show that RNA binding reduces membrane tethering so at best I think the conclusion has to be that the data are consistent with the notion that DNA binding reduces membrane tethering. It would of course be interesting to see the effects of RNA and I'm curious to know why the assay was not performed.

      Thanks for the comment. The honest answer is that previous publications (primarily Yeager et al, NAR 2022) convinced us that the outcome should be near-identical with DNA, so we chose DNA oligos because they are cheaper and easier to work with. But we agree with the reviewer that RNA is of course more relevant. We now present a comparison at 5 μM of ssDNA and ssRNA, which in fact shows a slightly stronger effect on membrane clustering by RNA (figure 5C). In the light of this additional experiment, we feel that some of the text changes suggested by the reviewer may no longer be necessary.

      Line 237 - typo: by, not y

      Thanks. In the light of the extensive changes to figure 6 this text was removed.

      Line 284 - the authors claim that 2C may only bind RNA after the N-terminus is liberated from 2B in infected cells, since cleavage of the MBP tag from their construct was needed for 2C to bind RNA in their in vitro assay. However, this does not automatically follow given the large structural differences between MBP and 2B and the fact that the authors have not tested the RNA binding capacity of a 2BC fusion protein. Their claim here is too strong and should be re-written.

      We agree, and have added a discussion along the lines suggested by the reviewer (lines 330-332).

      Line 293 - The authors speculate that RNA binding might cause a shift between the membrane clustering activities and the role of the protein in RNA replication. However, since they have not shown that RNA binding reduces membrane clustering, this is too speculative.

      In our revised manuscript we have studied the effect of RNA on membrane binding, thus we feel that this text is relevant in the context of the extended experiments.

      Line 299-317 - within this discussion is the assumption that in their assay system enterovirus 2C adopts the ring-like hexameric structure typical of AAA+ ATPases. While I agree this may well be the case, it has not been demonstrated in this study so the authors should make clear they are making this assumption. The same applies to the legend of Fig 8.

      This part of the discussion was extensively rewritten after our changes to figure 6. We now only refer to "hexamer" once in the corresponding part of the discussion, where we talk about structural models of hexamers produced by other groups who have crystallised fragments of 2C. There we believe we should refer to hexamers to accurately cite their work.

      We are not sure what the reviewer is referring to when it comes to the legend for figure 8: the original legend had no reference to the oligomeric state of 2C. We have substantially changed figure 8 and its legend and the new figure and legend make no references to hexamers/oligomers.

      Line 302 - the authors claim to have shown that 2C is 'selective' for dsRNA. I think at best they have shown a preference for binding dsRNA over ssRNA.

      We changed the wording (line 349). We have also changed the title of the paper where we removed "double-stranded".

      Line 313 - The sentence starting "A recent study..." needs a reference.

      The revised discussion no longer contains this sentence.

      Line 332 - the full sequence of the synthetic gene used in this study should be made available (e.g. as supplementary information or a deposited sequence with an accession number). This is a critical point before the paper can be published.

      We will of course submit the sequences as supplementary data. Thanks for the reminder.

      Line 362 - the authors should describe the likely points of attachment of fluorophores and comment on how this labelling might affect 2C function.

      Thanks for the comment. In response to this and a similar comment from another reviewer, we discuss the likely conjugation site of the fluorophore (lines 175-181), and also (due to the proximity to the Zn finger) provide a new measurement showing that equal amounts of Zn can be detected in the labelled and unlabelled protein (figure S7).

      Line 372 - Is a single protein standard (BSA) sufficient to calibrate the SEC-MALS system?

      Yes, it is the recommended procedure (note that SEC-MALS is only dependent on scattering, not elution volumes etc).

      Reviewer #4 (Significance (Required)):

      As stated above this is an interesting study that presents findings from a novel assay. It will be of interest to picornavirologists and the wider community interested in the mechanisms of AAA+ ATPases.

      We thank the reviewer for this positive appraisal of our work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      We thank the reviewer for their careful reading of our manuscript and have taken all of their grammatical corrections into account.

      Reviewer #2 (Public Review):

      Weaknesses: 

      The paper contains multiple instances of non-scientific language, as indicated below. It would also benefit from additional details on the cryo-EM structure determination in the Methods and inclusion of commonly accepted requirements for cryo-EM structures, like examples of 2D class averages, raw micrographs, and FSC curves (between half-maps as well as between rigid-body fitted (or refined) atomic models of the different polymorphs and their corresponding maps). In addition, cryo-EM maps for the control experiments F1 and F2 should be presented in Figure 9.

      We tried to correct the non-scientific language and have included the suggested data on the cryo-EM analyses, including new Figures 11-17. We did not collect data on the sample used for the seeds in the cross-seeding experiments because we had already confirmed in multiple datasets that the conditions in F1 and F2 reproducibly produce fibrils of Type 1 and Type 3, respectively. We have now analyzed cryo-EM data for 6 more samples at pH 7.0 and found that several kinds of polymorphs (Types 1A, 1M, 2A, 2B and 5) are accessible at this pH; however, the Type 3 polymorphs are not formed at pH 7.0 under the conditions that we used for aggregation.

      Reviewer #2 (Recommendations For The Authors):

      Remove unscientific language: "it seems that there are about as many unique atomic-resolution structures of these aggregates as there are publications describing them"

      We have rephrased this sentence.

      For same reason, remove "Obviously, " 

      Done

      What does this mean? “polymorph-unspecific” 

      Rephrased as non-polymorph-specific

      What does this mean? "shallow amyloid energy hypersurface"  

      By “shallow hypersurface” we mean that the minimum of the multi-dimensional function that describes the energy of the amyloid is not so deep that subtle changes to the environment will not favor another fold/energy minimum. We have left the sentence because while it may not be perfect, it is concise and seems to get the point across.

      "The results also confirm the possibility of producing disease-relevant structure in vitro." -> This is incorrect as no disease-relevant structure was replicated in this work. Use another word like “suggest”.

      We have changed to “suggest” as suggested.

      Remove "historically" 

      Done

      Rephrase “It has long been understood that all amyloids contain a common structural scaffold” 

      Changed to “It has long been established that all amyloids contain a common structural scaffold..” 

      "Amyloid polymorphs whose differences lie in both their tertiary structure (the arrangement of the beta-strands) and the quaternary structure (protofilament-protofilament assembly) have been found to display distinct biological activities [8]" -> I don't think this is true, different biological activities of amyloids have never been linked to their distinct structures.

      We have added 5 new references (8-12) to support this sentence.

      Reference 10 is a comment on reference 9; it should be removed. Instead, as for alpha-synuclein, all papers describing the tau structures should be included.

      We have removed the reference, but feel that the addition of all Tau structure references is not merited in this manuscript since we are not comparing them.

      Rephrase: "is not always 100% faithful"

      Removed “100%”

      What is pseudo-C2 symmetry? Do the authors mean pseudo 2_1 symmetry (ie a 2-start helical symmetry)?

      Thank you for pointing this out. We did indeed mean pseudo 2_1 helical symmetry.

      Re-phrase: "alpha-Syn's chameleon-like behavior" 

      We have removed this phrase.

      "In the case of alpha-Syn, the secondary nucleation mechanism is based on the interaction of the positively charged N-terminal region of monomeric alpha-Syn and the disordered, negatively charged C-terminal region of the alpha-Syn amyloid fibrils [54]" -> I would say the mechanisms of secondary nucleation are not that well understood yet, so one may want to tune this down a bit. 

      We have changed this to “mechanism has been proposed to be”

      The paragraphs describing experiments by others are better suited for a Discussion rather than a Results section. Perhaps re-organize this part? 

      We have left the text intact as we are using a Results and Discussion format.

      A lot of information about Image processing seems to be missing: what steps were performed after initial model generation? 

      We have added more details in the methods section on the EM data processing and model analysis.

      Figure 1: Where is Type 4 on the pH scale?

      We have adjusted the Fig 1 legend to clarify that pH scale is only applicable to the structures presented in this manuscript. 

      Figure 2: This might be better incorporated as a subpanel of Figure 1.

      We agree that this figure is somewhat of a loner on its own and we only added it in order to avoid confusion with the somewhat inconsistent naming scheme used for the Type 1B structure. However, we prefer to leave it as a separate figure so that it does not dilute the impact of figure 1.

      Figure 3: What is the extra density at the bottom of Type 3B from pH 5.8 samples 1 and 2. pH 5.8 + 50mM NaCl (but not pH 5.8 + 100 mM NaCl)? Could this be an indication of a local minimum and the pH 5.8 + 100 mM NaCl structure is correct? Or is this a real difference between 0/50mM NaCl and 100 mM NaCl? 

      We did not see the extra density to which the reviewer is referring; however, the images used in this panel are based on the output of 3D classification, which is more likely to produce artifacts than a 3D refinement. With this in mind, we did not see any significant differences in the refined structures and therefore only deposited the better-quality map and model for each of the polymorph types.

      Figure 3: To what extent is Type 3B of pH 6.5 still a mixture of different types? The density looks poor. In general, in the absence of more details about the cryo-EM maps, it is hard to assess the quality of the structures presented.

      In order to improve the quality of the images in this panel, a more complete separation of the particles from each polymorph was achieved via the filament subset selection tool in RELION 5. In each case, an unbiased initial model could be created from the 2D classes via the relion_helix_inimodel2D program, further supporting the coexistence of 4 polymorphs in the pH 6.5 sample. The particles were individually refined to produce the respective maps that are now used in this figure.

      Many references are incorrect, containing "Preprint at (20xx)" statements.  

      This has been corrected.

      Reviewer #3 (Public Review):

      Weaknesses: 

      (1) The authors reveal that both the Type 1 monofilament fibril polymorph (reminiscent of the JOS-like polymorph) and the Type 5 polymorph (akin to the tissue-amplified-like polymorph) can form under the same condition. Additionally, this condition also fosters the formation of flat ribbon-like fibrils across different batches. Notably, at pH 5.8, variations in experimental groups yield disparate abundance ratios between polymorphs 3B and 3C, indicating a degree of instability in fibrillar formation. The variability would potentially pose challenges for replicability in subsequent research. In light of these situations, I propose the following recommendations:

      (a) An explicit elucidation of the factors contributing to these divergent outcomes under similar experimental conditions is warranted. This should include an exploration of whether variations in purified protein batches are contributing factors to the observed heterogeneity.

      We are in complete agreement that understanding the factors that lead to polymorph variability is of utmost importance (and was the impetus for the manuscript itself). However, the number of variables to explore is overwhelming and we will continue to investigate this in our future research. Regarding the variability between batches of purified protein, we also think that this could be a factor in the polymorph variability observed for otherwise “identical” aggregation conditions, particularly at pH 7, where the largest variety of polymorphs has been observed. However, even variation between identical replicates (samples created from the same protein solution and simply aggregated simultaneously in separate tubes) can lead to different outcomes (see datasets 15 and 16 in the revised Table 1), suggesting that there are stochastic processes that can determine the outcome of an individual aggregation experiment. While our data still indicate that Type 1, 2 and 3 polymorphs are strongly selected by pH, the selection between interface variants 3B vs. 3C and 2A vs. 2B might also be affected by protein purity. Our standard purification protocol produces a single band by Coomassie-stained SDS-PAGE; however, minor truncations and other impurities below a few percent would go undetected and, given the proposed roles of the N- and C-termini in secondary nucleation, could have a large effect on polymorph selection and seeding. In line with the reviewer’s comments we now include a batch number for each EM dataset. While no new conclusions can be drawn from the inclusion of this additional data, we feel that it is important to acknowledge the possible role of batch-to-batch variability.

      (b) To enhance the robustness of the conclusions, additional replicates of the experiments under the same condition should be conducted, ideally a minimum of three times.  

      The pH 5.8 conditions that yield Type 3 fibrils have already been repeated several times in the original manuscript. Since the pH 7.4 conditions produce the most common a-Syn polymorph (Type 1A) and were produced twice in this manuscript (once as an unseeded and once as a cross-seeded fibrilization) we decided to focus on the intermediate condition where the most variability had been seen (pH 7.0). The revised Table 1 now has 6 new datasets (11-16) representing 6 independent aggregations at pH 7.0 starting from two different protein purification batches. The result is that we now produce the Type 2A/B polymorphs in three samples, and in two of these samples we once again observed the Type 1M polymorph. The other samples produced Type 1A or non-twisted fibrils.

      (c) Further investigation into whether different polymorphs formed under the same buffer condition could lead to distinct toxicological and pathology effects would be a valuable addition to the study.  

      The correlation of toxicity with structure would in principle be interesting. However the Type 1 and Type 3 polymorphs formed at pH 5.8 and 7.4 are not likely to be biologically relevant. The pH 7 polymorphs (Type 5 and 1M) would be more interesting because they form under the same conditions and might be related to some disease relevant structures. Still, it is rare that a single polymorph appears at 7.0 (the Type 5 represented only 10-20% of the fibrils in the sample and the Type 1M also had unidentified double-filament fibrils in the sample). We plan to pursue this line of research and hope to include it in a future publication.

      (2) The cross-seeding study presented in the manuscript demonstrates the pivotal role of pH conditions in dictating conformation. However, an intriguing aspect that emerges is the potential role of seed concentration in determining the resultant product structure. This raises a critical question: at what specific seed concentration does the determining factor for polymorph selection shift from pH condition to seed concentration? A methodologically robust approach to address this would be a series of experiments across a range of seed concentrations. Such an approach could delineate a clear boundary at which seed concentration begins to predominantly dictate the conformation, as opposed to pH conditions. Incorporating this aspect into the study would not only clarify the interplay between seed concentration and pH conditions, but also add a fascinating dimension to the understanding of polymorph selection mechanisms.

      A more complete analysis of the mechanisms of aggregation, including the effect of seed concentration and the resulting polymorph specificity of the process, is very important for our understanding of the aggregation pathways of alpha-synuclein and is currently the topic of ongoing investigations in our lab.

      Furthermore, the study prompts additional queries regarding the behavior of cross-seeding production under the same pH conditions when employing seeds of distinct conformation. Evidence from various studies, such as those involving E46K and G51D cross-seeding, suggests that seed structure plays a crucial role in dictating polymorph selection. A key question is whether these products consistently mirror the structure of their respective seeds. 

      We thank the reviewer for reminding us to cite these studies as a clear example of polymorph selection by cross-seeding. Unfortunately, it is not 100% clear from the G51D cross-seeding manuscript (https://doi.org/10.1038/s41467-021-26433-2) what conditions were used in the cross-seeding, since different conditions were used for the seedless wild-type and mutant aggregations. However, it appears that the wild-type without seeds was in Tris pH 7.5 (although at 37 °C the pH could have dropped to around 7) and the cross-seeded wild-type was in phosphate buffer at pH 7.0. In the E46K cross-seeding manuscript, it appears that pH 7.5 Tris was used for all fibrilizations (https://doi.org/10.1073/pnas.2012435118). In any event, both results point to the fact that at pH 7.0-7.5 under low-seed conditions (0.5%) the Type 4 polymorph can propagate in a seed-specific manner.
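
      The remark about the Tris buffer drifting toward pH 7 at 37 °C can be checked with a rough calculation. The sketch below is illustrative only: it assumes the commonly quoted temperature coefficient for Tris of about -0.028 pH units per °C, and it assumes the buffer was adjusted to pH 7.5 at 25 °C, which neither cited manuscript is reported here to state. The function and constant names are hypothetical.

```python
# Rough estimate of how the pH of a Tris buffer shifts with temperature.
# Assumption: Tris pKa changes by about -0.028 pH units per degree Celsius
# (a commonly quoted approximate value; published figures vary slightly).
TRIS_DPKA_PER_DEG_C = -0.028


def tris_ph_at_temperature(ph_at_reference, reference_temp_c, target_temp_c,
                           dpka_per_deg_c=TRIS_DPKA_PER_DEG_C):
    """Linear estimate of buffer pH after a temperature change."""
    return ph_at_reference + dpka_per_deg_c * (target_temp_c - reference_temp_c)


# Assumption: the buffer was adjusted to pH 7.5 at 25 degrees Celsius.
estimated_ph = tris_ph_at_temperature(7.5, 25.0, 37.0)
print(f"Estimated pH at 37 C: {estimated_ph:.2f}")
```

      Under these assumptions the estimate comes out a little above pH 7.1, which fits the suggestion that the unseeded and cross-seeded aggregations may have run at similar effective pH values.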

      (3) In the Results section of "The buffer environment can dictate polymorph during seeded nucleation", the authors reference previous cell biological and biochemical assays to support the polymorph-specific seeding of MSA and PD patients under the same buffer conditions. This discussion is juxtaposed with recent research that compares the in vivo biological activities of hPFF, ampLB as well as LB, particularly in terms of seeding activity and pathology. Notably, this research suggests that ampLB, rather than hPFF, can accurately model the key aspects of Lewy Body Diseases (LBD) (refer to: https://doi.org/10.1038/s41467-023-42705-5). The critical issue here is the need to reconcile the phenomena observed in vitro with those in in-vivo or in-cell models. Given the low seed concentration reported in these studies, it is imperative for the authors to provide a more detailed explanation as to why the possible similar conformation could lead to divergent pathologies, including differences in cell-type preference and seeding capability.  

      We thank the reviewer for bringing this recent report to our attention. The findings that ampLB and hPFF have different PK digestion patterns and that only the former is able to model key aspects of Lewy Body disease are in support of the seed-specific nature of some types of alpha-synuclein aggregation. We have added this to the discussion regarding the significant role that seed type and seed conditions likely play in polymorph selection.

      (4) In the Method section of "Image processing", the authors describe the helical reconstruction procedure, without mentioning much detail about the 3D reconstruction and refinement process. For the benefit of reproducibility and to facilitate a deeper understanding among readers, the authors should enrich this part to include more comprehensive information, akin to the level of detail found in similar studies (refer to: https://doi.org/10.1038/nature23002).

      As also suggested by reviewer #2, we have now added more comprehensive information on the 3D reconstruction and refinement process.

      (5) The abbreviation of amino acids should be unified. In the Results section "On the structural heterogeneity of Type 1 polymorphs", the amino acids are denoted using three-letter abbreviation. Conversely, in the same section under "On the structural heterogeneity of Type 2 and 3 structures", amino acids are abbreviated using the one-letter format. For clarity and consistency, it is essential that a standardized format for amino acid abbreviations be adopted throughout the manuscript.

      That makes perfect sense and has been corrected.

      Reviewing Editor:

      After discussion among the reviewers, it was decided that point 2 in Reviewer #3's Public Review (about the experiments with different concentrations of seeds) would probably lie outside the scope of a reasonable revision for this work. 

      We agree as stated above and will continue to work on this important point.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript provides a detailed analysis of RNA and protein dynamics during transmission of the rodent malaria model P. yoelii from the mouse host to an in vitro ookinete culture setting (mimicking the mosquito midgut environment). This group and others have shown experimentally that a substantial number of mRNAs is stored in the female Plasmodium gametocyte, ready to be translated following initiation of ookinete development. The process is akin to maternal deposition of mRNA in oocytes of metazoans. With this manuscript the authors provide a significant contribution to the field of translational control in Plasmodium parasites as they explore the translational activation during the early hours of zygote-to-ookinete development. The paper presents RNAseq and mass-spec analyses of female gametocytes and for the first time for 6-hour zygotes (ie a fertilized female gamete); the zygote datasets are much improved and more comprehensive than the only other performed in 2008 in P. gallinaceum. Using comparative analyses of transcriptome and proteome data (including published datasets) the authors arrive at a list of 198 transcripts that are translationally repressed in the gametocyte and translated within 6 hours of fertilization in the zygote. Many of these mRNAs are known to be involved in zygote to ookinete transformation. BioID is finally used to explore changes in mRNP protein composition between the female gametocyte and the zygote.

      The paper is generally well written. The authors present a lot of data (also in comparison with published data). Sometimes perhaps the main message could be simplified / streamlined in section titles (Quantitative Proteomics by DIA-MS is not very informative. The outcome of the analysis would be more telling).

      Response: We have revised section headers to clarify the content.

      A considerable proportion of the DIA mass-spec proteomics results section is very technical. The paper describes a biological phenomenon rather than a technical mass-spec advance. Can these technical details be moved to the methods section?

      Response: As this is one of the first published instances of applying DIA-MS to Plasmodium, we want to keep this information in the main text to help our community adopt these approaches. While these details are highly technical, they are also some of the major advances of this project.

      On the other hand, a bit more detail could be provided in the main text. For example, the age of the zygotes is never mentioned. This is important, please add this. The main manuscript text has 16 mentions of the word "many". As the authors are in possession of the data, please provide, if missing, (in parentheses) the absolute numbers, maybe in an "x out of y" format. Please clearly state the number of biological and/or technical replicates used for transcriptome and proteome analyses in the main text, figures and/or figure legends. How many protein coding genes are encoded in the P. yoelii genome?

      Response: Several of these requested details are noted in the materials and methods. We have added this information to the main manuscript now as well. We have also revised the manuscript to replace some instances of “many” with specific numbers unless it adversely impacted the flow of the sentence to do so.

      The authors claim that only zygotes (fertilized females) have surface-exposed Pys25 (a surface protein they use to affinity-purify zygotes) but not gametocytes. I could not find the experimental data for this in the paper. The cited reference #22 also does not appear to show this. In Figure 2C Pys25 is shown to be translated in gametocytes. In this context it may be important to note that in the related P. berghei the related protein P28 is expressed even in the absence of fertilization (Billker 2004; DOI: 10.1016/s0092-8674(04)00449-0). It may not be relevant whether translation requires fertilization, but the authors claim it affects trafficking of the Pys25 protein to the surface, so it needs to be shown. A reference to an infertile P. yoelii line would be great.

      Response: We have corrected the reference supporting the surface exposure of p25 on zygotes. The observation by Billker and colleagues about Pbs28 is also of interest, but outside of the scope of this study as we did not investigate the fertilization event itself here.

      It is highly commendable that all data is provided throughout the manuscript. For readability, may I suggest that the authors add labels to individual sheets within an excel file from A to Z, and do so also within the manuscript. That would really help; the most relevant data sets could then be identified quickly. For example, line 184 refers to 276 zygote proteins in which sheet of which table?

      Response: While this labeling system would also be effective, we have provided a README tab for our files that quickly directs the reader to the relevant tab (as we do for our previous publications).

      Section 176 onwards: here the authors combine P. falciparum and P. yoelii proteomics data. Please explain why you excluded any of the available P. berghei proteome data such as the male and female gametocyte proteome? The same question applies to 294 onwards.

      Response: We compared our datasets with those of Lasonder et al. NAR 2016 because that study was also focused on translational repression of mRNAs and provided both RNA-seq and proteomic datasets of female gametocytes (although not of zygotes).

      The comparative transcriptome-proteome analysis arrives at 198 translationally repressed mRNAs. Could the authors provide one or two alternatives using less stringent parameters? The list in P. falciparum and P. berghei is considerably larger (500+ and 700+).

      Response: We could have reduced the stringency of our thresholds to arrive at a far larger number, but prefer to retain higher confidence in those we are scoring as translationally repressed and then released for translation. We provide all of the pertinent data in the supplemental files if readers would like to adjust these thresholds to see which additional mRNAs may also be regulated.

      The TurboID data is informative but somewhat speculative in regard to spatial rearrangements within these mRNPs. Figure 6 presents the RNA helicase to bind the 5' end of mRNAs that are associated with polyribosomes and I assume being translated. Is this association realistic? The RNA helicase DOZI homolog of yeast (Dhh1) is also involved in decapping.

      Response: We provide Figure 6 as our working model of how the reorganization of the DOZI/CITH/ALBA complex could occur based on available data from this study and others. Future studies are warranted to determine if DOZI remains associated with monosomes vs. polysomes, but current data indicate that DOZI can bind to eIF4E when translational repression is not imposed.

      Specific comments:

      title Is global the appropriate word? Some transcripts appear to be translated later.

      Response: We believe it does apply appropriately to these data.

      Line 30/32 Please re-phrase the sentence. There is: Cell Host Microbe 2012 Jul 19;12(1):9-19. doi: 10.1016/j.chom.2012.05.014.

      Response: We conclude that the sentence is correct as written, even in considering Sebastian et al. Cell Host & Microbe 2012.

      30 Perhaps add ookinete that establishes infection rather than the zygote. For a general readership, a brief description of the sexual life cycle might be useful

      Response: It is not possible to get into these nuances in the Abstract. This information is covered in the main text and the works that are cited.

      32 DOZI/CITH/ALBA complex would require some explanation for a more general reader

      Response: It is not possible to get into these nuances in the Abstract. This information is covered in the main text and the works that are cited.

      36-37 I believe zygotes were collected 6 hours after fertilization. Does that qualify as soon after fertilization? Motile ookinetes are generated within 20 hours and motility can be seen before that.

      Response: Yes, we think this qualifies as the process is not synchronous, but relies on when male gametes encounter and fuse with female gametes.

      37 Essential functions for what?

      Response: It is not possible to get into these nuances in the Abstract. This information is covered in the main text and the works that are cited.

      39 Is the spatial arrangement of this mRNP known?

      Response: Some interactions of members of this complex were known (DOZI with eIF4E, ALBA4 with PABP1), but not the overall spatial arrangement. These findings are novel to this study.

      40 Can you briefly allude to the "recent, paradigm-shifting models of translational control"

      Response: It is not possible to get into these nuances in the Abstract. This information is covered in the main text and the works that are cited.

      44 Products = mRNA

      Response: We have stated it as products because the maternal cell provides more than just mRNAs that are essential to further development post-fertilization.

      45 Oocyte in metazoans ?

      Response: Yes, this is the correct term. The context here is in higher eukaryotes.

      60/62 Please re-phrase the sentence. There is: Cell Host Microbe 2012 Jul 19;12(1):9-19. doi: 10.1016/j.chom.2012.05.014.

      Response: We conclude that the sentence is correct as written, even in considering Sebastian et al. Cell Host & Microbe 2012.

      81 PbDozi Plasmodium berghei DOZI

      Response: We have added this clarifying text here as suggested.

      84/85 Please rephrase and cite Nucleic Acids Res. 2008 Mar;36(4):1176-86. doi: 10.1093/nar/gkm1142. Epub 2007 Dec 23. and Cell Host Microbe 2012 Jul 19;12(1):9-19. doi: 10.1016/j.chom.2012.05.014.

      Response: As noted above for other comments, we hold that the current phrasing is accurate even when considering these important publications.

      88 Please define the timepoints throughout this manuscript. What age are the zygotes? How many hours post-induction? Please define the time for ookinete development somewhere in the introduction

      Response: The timepoint used for zygote collection is now included in the main text in addition to its previous inclusion in the Materials and Methods section. As we have not studied the ookinete stage here, we have opted to keep the introduction focused on the key details for this study.

      104 Please add the age (in hours) of these zygotes from the time of starting the in vitro cultures. From the methods section it looks like 6 hours.

      Response: The timepoint used for zygote collection is now included in the main text in addition to its previous inclusion in the Materials and Methods section.

      103/105 I can find no evidence for P25 (Pys25) expression relying on fertilization in the cited paper (22). The SOM has no reference to Pys25 either. Please show data or reference published data that there is no translation and trafficking of Pys25 in unfertilized female gametes, ie those that are placed in ookinete medium. In this respect it may be important to note that unfertilized Plasmodium berghei females placed in ookinete medium translate P28, the P25 paralog (https://www.sciencedirect.com/science/article/pii/S0092867404004490?via%3Dihub)

      Response: We have corrected the reference supporting the surface exposure of p25 on zygotes. The observation by Billker and colleagues about Pbs28 is also of interest, but outside of the scope of this study as we did not investigate the fertilization event itself here.

      104 What cell line was used for the zygotes?

      Response: The PyApiAP2-O::GFP transgenic parasite line was used here. These details are included in the manuscript and supporting information.

      114 The number of transcripts detected in gametocytes is quite small compared to the twice as large proteomics dataset. See for example also Lasonder 2016 for P. falciparum detected transcripts: 4477 different sense transcripts were identified, 98% of which were shared between MG and FG.

      Response: Yes, the number of mRNAs or proteins scored as detected differs based on thresholds applied. We prefer to err on the side of higher stringency as noted above.

      117 Does the 194 up-in-gametocytes dataset include the 81 not found in zygotes?

      Response: No, these 194 are detected in both datasets, but are more abundant in gametocytes than zygotes.

      117 Could you indicate some of the genes in the plot?

      Response: Several hits of special note are described in the text. We have opted to keep the figure clear and streamlined.

      Fig1 How were the upregulated transcripts identified? 1647 are shown to be specific to zygotes in 1B, yet only 685 are shown in 1C to be upregulated. Do the transcripts found exclusively in zygotes not count? Are these transcripts likely the result of de novo transcription? How old are these zygotes when the libraries are made?

      Response: The details of the RNA-seq processing are provided in the MakeFile, the supplementary tables, and the manuscript. The README tab provides descriptions of what processing occurred between sequential tabs. As noted above, zygotes were collected at 6 hours.

      132 Many? How many? Please provide a precise number.

      Response: These details are now in the revised manuscript.

      134 Please explain why p28 would be differentially abundant in the zygote rather than the female gametocyte. That would require de novo transcription of this gene. If there is experimental evidence for the de novo transcription of p28 and other translationally repressed transcripts in the zygote please cite the references. Can you name a few more examples here? P25 for example, ap2-o, or anything published and experimentally validated. What about AP2-o and AP2-Z? Both are known to be translationally repressed.

      Response: We state in the original manuscript that there is not a significantly different mRNA abundance of pys28.

      139 Please define how many members of the IMC?

      Response: We have now replaced “many” with the number of IMC members we have detected, which is also shown in supporting tables.

      156 Can you provide a number of how many parasites were used in total or per run. And how many biological and technical replicates were analysed?

      Response: These details are provided in the Materials and Methods.

      169 The number of proteins detected in the gametocyte sample is twice the size of transcripts. Is this to be expected?

      Response: This reflects the sensitivity of the assays run for transcriptomics and proteomics.

      170 How many samples were analyzed? One gametocyte and one zygote sample?

      Response: Yes, for the creation of the DIA-MS spectral library, a single biological replicate was used in addition to in silico library approaches. This information is provided in the next sentence.

      176 Why did you not include P. berghei in the meta-analysis?

      Response: We compared these results to all of the published Plasmodium proteomes in PlasmoDB.

      184 Please refer to an excel table here.

      Response: We have pointed to the relevant supporting files in this section.

      184 145 proteins: do you mean orthologs in general or orthologs with a gene/protein annotation other than unknown function?

      Response: We use the standard form of ortholog throughout the manuscript.

      190 142 proteins: do they all have orthologs in P. falciparum?

      Response: No, not all proteins in our dataset have unambiguous orthologues in P. falciparum, and this is accounted for in our data processing approaches.

      Figure 2C P25 is not exclusive to zygotes here and also found in the gametocyte sample.

      Response: That is correct. It is known that p25 is expressed in female gametocytes, but that the localization changes in the zygote.

      190 shortlist

      Response: The spelling of “short list” as two words is an appropriate American spelling of this term.

      219 onwards Does the list of 198 transcripts exclusively arise from your RNAseq and proteomics comparison? Or does it include falciparum data as outlined in section 176 onwards, ie the list of 276 proteins that only are detected in zygotes?

      Response: Yes, this list of 198 mRNAs is derived from our datasets only using our defined thresholds. The details of this are provided in the manuscript.

      224 Early zygote? At 6 hours do the parasites not start to transform, elongate?

      Response: This process is not synchronous, as it is affected by the timing of gamete fusion.

      225 >5-fold. Is this an arbitrary decision?

      Response: This threshold has been used by our group and others in prior studies, and was partially informed by the behavior of previously characterized transcripts.

      227 1417 mRNAs: they are from which dataset?

      Response: These are from our datasets with P. yoelii, as described in the manuscript.

      228/229 Please explain why DOZI and CITH are in the list of 198 repressed transcripts? They are present in the gametocyte. Are they upregulated>5 fold?

      Response: Yes, they meet our criteria for this regulation, and in the manuscript we note that we believe that they are self-regulated and likely have continuing roles in early mosquito stage development.

      259 ... as they are already translated in the gametocyte?

      Response: Yes. Translational repression allows for the existence of some of the protein in the initial timepoint. This differs from translational silencing which does not.

      295 Is this from the 198 TR list S4?

      Response: No. Transcripts that remain repressed would not be in the list of 198, as the protein was not detected in zygotes.

      294 onwards How many putatively falciparum transcripts are there? How many were identified in P. berghei? How many are common to all? A Venn diagram perhaps to compare the different studies

      Response: There is substantial overlap between the species with respect to the presence of syntenic orthologues in this dataset. However, because we did not conduct experiments with P. falciparum or P. berghei here, we do not want to make claims that they are similarly regulated or potentially have a reader misinterpret a figure to that effect.

      301 How many transcripts were found associated with Plasmodium berghei DOZI and/or CITH in female gametocytes? How many of those were abundantly detected as protein in zygotes, or had no difference in protein abundance between gametocytes and zygotes, or even greater abundance in female gametocytes?

      Response: These details are now provided in the revised manuscript.

      303/305 Please indicate the numbers of translationally repressed transcripts identified for P. falciparum and berghei.

      Response: These data are provided in Supporting Information Table 4.

      317/319 Please add the promoter used for tid-GFP

      Response: We have now added this information to the Materials and Methods.

      320 Please elaborate on the spatial organization of the DCA complex.

      Response: This has not been previously characterized, and this entire section is dedicated to the experimental data and interpretations of how the DOZI/CITH/ALBA complex may be organized.

      321/322 Have precise binding sites of DOZI and ALBA4 really been shown experimentally in the cited papers? In relation to 5' and 3' ends of the mRNA? Please cite Braks et al. paper.

      Response: Yes. The association of DOZI with eIF4E and ALBA4 with PABP1 are established in the literature, in some cases by multiple independent laboratories. The Braks publication does not address the binding of these proteins, and thus is not cited.

      323 What is the first generation BioID enzyme? BirA*

      Response: Yes. The first generation enzyme is called BirA*

      323 Please cite relevant Kyle Roux and Alice Ting for the original enzymes

      Response: We have now added these citations to this sentence.

      327 Could you show images of ALBA4::TurboID::GFP, DOZI::TurboID::GFP and cytosolic (free) TurboID? Perhaps stained with fluorescently labelled streptavidin and / or against GFP? In the gametocyte and zygote samples?

      Response: We attempted to stain with monoclonal antibodies that are reactive against biotin and there was insufficient specificity, hence why such data is not included. We conclude that all of the other data that supports this approach suffices to demonstrate its rigor.

      331 What is the age of these zygotes? Were they affinity purified?

      Response: As throughout the manuscript, zygotes were collected at 6 hours. Details of experimental purifications are provided in the materials and methods.

      Fig S4 Please indicate whether ALBA4 and DOZI were tagged endogenously

      Response: Yes. The endogenous loci for both ALBA4 and DOZI were modified to include the C-terminal TurboID and GFP tags.

      421/430 Please add a few references here

      Response: We do not believe that specific references are warranted for these general statements.

      429 translational repression?

      Response: Yes. These statements set the stage for the use of translational repression.

      445 966 proteins in gallinaceum? The zygote cultures in that study were 2-3 hours. How old were the cultures in your study?

      Response: As throughout the manuscript, zygotes were collected at 6 hours.

      481 Please explain / cite why repression is energetically costly.

      Response: These details are provided in both the introduction and discussion sections. The energetic cost of translational repression includes both the cost of producing the transcripts without immediately or fully using them for translation and the cost of imposing the regulation.

      501 Please add the time-point of RNA and protein sampling. How many hours into ookinete development? What is the time from cardiac puncture through FACS sampling of gametocytes.

      Response: We have provided all of these details in the materials and methods for female gametocytes and zygotes. We did not look at ookinetes in this study.

      711/713 Do you have any images that show the successful purification of zygotes away from gametocytes? Secondly, please provide a reference for the statement that unfertilized female gametocytes do not express surface exposed Pys25.

      Response: We do not have captured images of these zygotes, but confirmed them during collection using microscopy. The reference for surface exposure of Pbs25 is now provided earlier in the manuscript as well.

      711/716 Were parasites lysed and mechanically homogenised?

      Response: We have provided all of these details in the materials and methods for female gametocytes and zygotes.

      Figure 6 What is the evidence that DOZI stays associated with mRNA that is being translated? Rather than mRNA that is being decapped. Please add the references that unequivocally show that DOZI and ALBA4 bind to opposite ends of repressed mRNAs.

      Response: This is our working model of these data. It is feasible that these complexes could form off of mRNA as well. Publications describing the interactions of DOZI with eIF4E and ALBA4 with PABP1 are provided in the manuscript. It is well established that eIF4E binds to the m7G cap of the 5’ end of mRNAs, and PABP1 binds to the poly(A) tail at the 3’ end of mRNAs.

      Reviewer #1 (Significance (Required)):

      The experiments in the manuscript are carefully conducted. Apart from a P. gallinaceum study from 2009 this is the first comprehensive analysis of the transcriptome and proteome of a Plasmodium zygote (developing ookinete) at 6 hours post-fertilization. The data are used to explore the temporal aspect of activation of translation during the first quarter of the 20-24 hour ookinete developmental period. The study will be of interest to the field, specifically those scientists working to understand translational control, ookinete development, and those developing intervention strategies to prevent mosquito infection and thus malaria transmission.

      Response: We appreciate Reviewer 1’s extensive feedback and positive remarks about the significance of our study. We have revised our manuscript to reflect this constructive feedback.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Main findings

      Taking a multi-omic approach, the authors provide quantitative evidence for translation repression of ~200 mRNAs in Plasmodium yoelii female gametocytes. These mRNAs are then translated, and proteins detected by 6 hours after activating gametocytes. They accomplish this by performing a comparative global analysis of the transcriptome and proteome between female gametocytes and early zygotes that provides an interesting resource. The authors also use proximity labelling of the DOZI/CITH/ALBA4 repression complex, and these data suggest the complex may disassemble in the zygote or change its composition.

      Major points

      Line 181-184: The authors state that there is no evidence of how the DCA complex selects specific mRNAs for translation repression. While the exact mechanisms have not been fully elucidated, Braks et al (2008, doi:10.1093/nar/gkm1142) suggested a role of the untranslated regions (UTRs) in translation repression of transcripts in Plasmodium berghei female gametocytes. They identified a uridine-rich 47-base element in the 5'UTR and/or 3'UTR that was associated with translationally repressed transcripts and validated it experimentally. Considering this finding, I would recommend an amendment of the statement and to include the earlier work. I would also like to see additional analysis to check if this U-rich motif or other motifs are associated with the translationally repressed transcripts identified in the current study. The current study should be better powered to conduct such an analysis.

      Response: We have now added a comment and citation in the revised text about this study in Lines 86-88. Understanding the full importance of this element is challenging, as the Plasmodium transcriptome is highly enriched in A’s and U’s due to the highly skewed A/T content of its genome. Perhaps for this reason, we did not see an association of this motif with the identified mRNAs.

      The authors used zygotes that expressed GFP tagged AP2-O, however, there is no explanation of the significance of using this line.

      Response: This line is described in the Materials and Methods and supporting information. It was used to provide further validation of the production of zygotes.

      Minor points

      In line 106-107, the authors refer to figure SI, this figure is about genomic locus and genotyping PCR for the PyApiAP2-O::GFP parasites but there is no intext description of why this specific line was used.

      Response: We have provided this information in the revised manuscript.

      Statement in line 122-124 "It is likely that....." should go into the discussion not results.

      Response: We have placed this single sentence immediately after presenting these data here to aid reader comprehension.

      Statement in line 171-175: "In addition to providing confirmatory...." Should be in the discussion not on the results.

      Response: We view this sentence as a concluding remark of this section of data that also places this information in context for the reader.

      In Fig. 4 A and B, could the colour scheme be changed so that the proteins that are not in both samples (and probably contain many unspecifically detected proteins) appear less prominent?

      Response: We appreciate this suggestion and have adjusted these plots accordingly in the revised manuscript.

      Reviewer #3 (Significance (Required)):

      Why is the paper interesting. Translation repression of mRNA at a global level in the female gametocytes has been studied previously in rodent malaria parasites, but prior to the current study, the release of mRNA from translation repression in the mosquito stages has only been demonstrated for specific transcripts. By characterizing and quantitating changes in protein abundance between macrogamete and zygote, coupled with transcriptomic analysis, the current work broadens our understanding of zygotic translation activation that is key to successful malaria parasite transmission to the mosquito.

      This dataset provides a useful resource for the Plasmodium research community as it provides a more comprehensive view of how transcripts behave during the transitions from the mammalian host to the vector. It is one step in a broader endeavour towards finding genes crucial for parasite transmission that could be targeted for interventions.

      How translational repression and derepression is regulated remains unknown, although some of the molecular players have been identified. This paper shows proximity labelling and expansion microscopy data of the ribonuclear protein complex thought to mediate repression. Although the specific mechanistic insights provided by the experiments shown here remain relatively limited, the work demonstrates interesting new avenues for how translational derepression in Plasmodium can be studied.

      Response: We also appreciate Reviewer 3’s excellent feedback and positive remarks about the significance of our study. The revised manuscript addresses these comments, and we believe it is further strengthened because of it.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements [optional]

      We thank all the reviewers for their constructive and critical comments. We provide a point-by-point response to the reviewers' comments, as detailed below. We believe that addressing them will significantly improve our revised manuscript, making it of interest to researchers in the fields of cell biology, signaling pathways, physiology and nutrition.

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary: The manuscript by Yusuke Toyoda and co-workers describes that the phosphorylation of the α-arrestin Aly3 downstream of TORC2 and GAD8 (AKT) negatively regulates endocytosis of the hexose transporter Ght5 in S. pombe under glucose-limiting growth conditions.

      To arrive at these conclusions, the researchers define a set of redundant C-terminal phosphorylation sites in Aly3 that are downstream of GAD8. Phosphorylation of these sites reduces Ght5 ubiquitination and endocytosis. For ubiquitination, Aly3 interacts with the ubiquitin ligases Pub1/3.

      We thank the reviewer for his/her time and reporting advantages and issues of this study.

      Major points:

      Figure 3B: it would be interesting to compare Aly3 migration pattern (and hence potential phosphorylation) under glucose replete or limiting growth conditions. Can the authors provide direct evidence that Aly3 phosphorylation changes in response to glucose availability? Also please explain the 'smear' in lanes aly3(4th Ala), aly3(4th Ala, A584S), aly3(4th Ala, A586T).

      While it is an interesting possibility that the Aly3 migration pattern changes in response to glucose concentrations in medium, we think that this is unlikely and that examining this possibility is beyond the scope of this study. Because a phospho-proteomics study reported by Dr. Paul Nurse's lab showed Tor1-dependent phosphorylation of Aly3 at S584 under high glucose (2%) conditions (Mak et al, EMBO J, 2021), the Aly3 phosphorylation (migration) pattern is likely to be constant regardless of glucose conditions. Glucose conditions affect the mRNA and protein levels of Ght5, but supposedly not its endocytosis to vacuoles (Saitoh et al, Mol Biol Cell, 2015; Toyoda et al, J Cell Sci, 2021).

      As for the smear in Aly3(4th A), Aly3(4th A;A584S), Aly3(4th A; A586T), we suspect that some posttranslational modification occurs on these mutant Aly3 proteins, but the identity of the modification is unclear. We did not mention the smear signals in the original manuscript, because the presence or absence of the smear did not necessarily correlate with cell proliferation in low glucose and thus vacuolar localization of Ght5, which is the main topic of this study. In the revised manuscript, we will mention this point more clearly.

      Figure 4: Ght5 localization should be analyzed + / - thiamine and in media with different glucose levels. Also, a co-localization with a vacuolar marker (FM4-64) would be nice (but not necessary). Ideally, the authors should add WB analysis of Ght5 turnover to complement the imaging data. Also, would it be possible to measure directly the effects on glucose uptake (using eg: 2-NBDG).

      In this revision, we plan to observe Ght5 localization under the conditions indicated by the reviewer (+/- thiamine and high/low glucose levels) to unambiguously show that the vacuolar localization of Ght5 occurs in a manner dependent solely on expression of the mutant Aly3 protein.

      We thank the reviewer for the suggestion of co-staining with FM4-64. Indeed, because we previously reported that the cytoplasmic Ght5 signals were surrounded by FM4-64 signals in the TORC2-deficient tor1Δ mutant cells (Toyoda et al, J Cell Sci, 2021), the cytoplasmic Ght5-GFP signals in Figure 4 are very likely to co-localize with vacuoles. We will modify the text to clarify this point.

      As suggested, we plan to add Western blot analysis of Ght5 turnover in Aly3-expressing cells, to complement the imaging data (Figure 4) in the revised manuscript. Persistent appearance of GFP in Western blot would be a good support for vacuolar transport of Ght5-GFP.

      While regulation of glucose uptake is an important issue, measurement of Ght5-dependent glucose uptake using 2-NBDG was very difficult in our hands. Another reviewer (Reviewer #2) also mentioned the difficulty of this measurement in the Referees cross-commenting section.

      Figure 5: Given the localization of Ght5 shown in Figure 4, I'm surprised that it is possible to detect full-length Ght5 and its ubiquitination in the phospho-mutants of Aly3. I expected that the majority of Ght5 would be constitutively degraded, and that one would need to prevent endocytosis and/or vacuolar degradation to detect full-length Ght5 and ubiquitination. Please explain the discrepancy. Also it seems that the quantification in B was performed on a single experiment.

      As the aim of Figure 5 is to compare the ubiquitinated species of Ght5 among samples expressing different species of Aly3, the loading amount of each sample was adjusted so that the abundance of immunoprecipitated Ght5 is the same across them. Therefore, as the reviewer points out, the abundance of full-length Ght5 might have differed among these samples before the adjustment. In the revised manuscript, we will explain this point, namely why the anti-GFP blot of Figure 5A shows similar intensities across those samples.

      In the revised manuscript, we will add two additional replicates of the same experiment as Figure 5 in Supplementary material to show reproducibility of the result.

      Figure 6: Which PPxY motif of Aly3 is used for interaction with Pub1/3 and does their interaction depend on (de)phosphorylation?

      In the revised manuscript, we will discuss that "both PY motifs of Aly3 might be required for full interaction with Pub1/3," by citing the following published knowledge:

      (a) Mutation of both PPxY motifs of budding yeast Rod1 and Rog3 (Aly3 homologs) diminished their interaction with the ubiquitin ligase Rsp5 (Andoh et al, FEBS Lett, 2002).

      (b) Mutating either one of two PPxY motifs of budding yeast Cvs7/Art1 greatly decreased interaction with WW domain, and mutating both abolished the interaction (Lin et al, Cell, 2008).

      Our preliminary results indicated that Pub3 interacted with Aly3, Aly3(4th A) and phospho-mimetic Aly3(4th D), and thus suggested that the Aly3-Pub1/3 interaction does not depend on the phosphorylation status of Aly3. Consistently, budding yeast Rod1 reportedly interacts with Rsp5 regardless of its phosphorylation status (e.g. Becuwe et al, J Cell Biol, 2012). While we have partially mentioned this point in the original manuscript (L499-503), we will discuss this point more clearly in the revised manuscript.

      Reviewer #1 (Significance):

      The results are well presented and clear cut (with few exceptions, please see major points). They provide further evidence that metabolic cues instruct the phosphorylation of a-arrestins. Phosphorylation then negatively regulates a-arrestin function in selective endocytosis and is essential to adjust nutrient uptake across the plasma membrane to the given biological context.

      We thank the reviewer for finding significance of our study. We believe that adding new results of the requested experiments and responding to the raised comments will clarify the significance of our revised manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity):

      **Summary / background. This paper focuses on the regulation of endocytosis of the hexose transporter, Ght5, in S. pombe by nutrient limitation through the arrestin-like protein Aly3. Ght5 is induced when glucose is limiting and is required for growth and proliferation in these conditions. ght5+ encodes the only high-affinity glc transporter from fission yeast. ght5+ is induced in low glucose conditions at the transcriptional level and is translocated to the plasma membrane to allow glc import. Ght5 is targeted to the vacuole in conditions of N limitation. Mutations in the TORC2 pathway lead to the same process, thus preventing growth on low glucose medium, as shown in the gad8ts mutant, mutated for the Gad8 kinase acting downstream of TORC2. Previously, the authors demonstrated that the vacuolar delivery of Ght5 in the gad8ts mutant is suppressed by mutation of the arrestin-like protein Aly3. Arrestin-like proteins are in charge of recognising and ubiquitinating plasma membrane proteins to direct their vacuolar targeting by the endocytosis pathway. This suggested that Aly3 is hyperactive in TORC2 mutants, and accordingly, Ght5 ubiquitination was increased in gad8ts.

      **Overall statement This study aims at deepening our understanding of the regulation of endocytosis by signalling pathways through arrestin-like proteins. Ght5 is a nice model to study a physiological regulation, and the authors have a great set of tools at hand. However, I think the conclusions are not always rigorous and the conclusions are sometimes far-reaching. The main problem is that much of the conclusions concern a potential phosphorylation of Aly3 which is not experimentally addressed. An additional issue is the fact that they look at Ght5 ubiquitination by co-immunoprecipitation in native conditions (or at least, it seems to me) which cannot be conclusive. Overall, I think some experiments should be performed to address (at least) these 2 points before the manuscript can be published, see detailed comments below.

      We thank the reviewer for pointing out both the strengths and the issues of our manuscript.

      We admit that phosphorylation of Aly3 was not experimentally shown in our manuscript, although its phosphorylation has already been shown in phospho-proteomic studies by other groups. For this issue, we plan to add an experiment and modify the text, as explained below.

      The other major issue raised by this reviewer is that detection of Ght5 ubiquitination by immunoprecipitation in a native condition cannot be conclusive. Although we noticed that many studies perform affinity purification after denaturing and precipitating proteins with TCA or acetone to detect ubiquitination of the affinity-purified protein (e.g. Lin et al, Cell, 2008), we disagree with this opinion of Reviewer #2. In a review article describing methods to study ubiquitination by immunoblotting (Emmerich and Cohen, Biochem Biophys Res Comm, 2015), affinity purification of the protein of interest in a native condition is mentioned as one major choice. Moreover, a denaturing condition was not applicable to the detection of ubiquitinated Ght5, because Ght5 protein that has been denatured and precipitated with TCA cannot be re-solubilized for immunopurification and immunoblotting. As the reviewer points out, a pitfall of detecting ubiquitinated Ght5 in a native condition is the presence of co-immunoprecipitated proteins. In our previous study (Toyoda et al, J Cell Sci, 2021), we purified GFP-tagged Ght5 and showed that a 110 kDa band detected in an anti-Ub immunoblot was also recognized by an anti-GFP antibody, confirming that the detected 110 kDa band corresponded to a ubiquitinated species of Ght5 and not to a co-immunoprecipitated protein. Similarly, in the revised manuscript, we will add to Figure 5A a panel showing a high-contrast (over-exposed) anti-GFP immunoblot, in which the indicated 110 kDa band is clearly detected by the anti-GFP antibody.

      We appreciate these issues raised by Reviewer #2. By responding to them, we believe that the conclusions of our study will be more rigorous and convincing in the revised manuscript.

      **Major statements and criticism.

      *Fig 1. Based on the hypothesis that TORC2-mediated phosphorylation regulate Ght5 endocytosis, the authors first considered a possible phosphorylation of Ght5. They mutagenised 11 **possible** phosphorylation sites on the Ct of Ght5, but none affected the growth on low glucose in the absence of thiamine, suggesting that they don't contribute to the observed TORC2-mediated regulation. However, I disagree with the statement that "phosphorylation of Ght5 is dispensable for cell proliferation in low glucose", given that the authors do not show 1- that Ght5 is phosphorylated and 2-that this is abolished by these mutations. They should either provide data on this or tone down and say that these residues are not involved in the regulation, without implying phosphorylation which is not proven.

      Although we did not experimentally test in our hands whether these 11 residues of Ght5 were phosphorylated, these residues have been shown to be phosphorylated in phospho-proteomics studies by other groups (Kettenbach et al, Mol Cell Proteomics, 2015; Swaffer et al, Cell Rep, 2018; Tay et al, Cell Rep, 2019; Halova et al, Open Biol, 2021; Mak et al, EMBO J, 2021). In the revised manuscript, we plan to be more precise by replacing this conclusion with the following statement: "11 Ser/Thr residues of Ght5, which are reportedly phosphorylated, are not essential for cell proliferation in low glucose."

      In the presence of thiamine (Supp fig 1), it seems that the ST/A mutant grows better in low glucose, and this is neither explained nor commented on. Since the transporter is not expressed, could the authors provide an explanation for this? If the promoter is leaky and some ght5-ST/A is expressed, it may be more stable and allow better growth than the WT, which would tend to indicate that impairing phosphorylation prevents endocytosis (which is classical for many transporters, see the body of work on CK1-mediated phosphorylation of transporters). Have the authors tried to decrease the glc concentration below 0.14% in the absence of thiamine to see if this is also true when the transporter is strongly expressed? (OPTIONAL)

      Improved growth of Ght5(ST11A)-expressing cells in the presence of thiamine was mentioned in the legend of Supplementary Figure 1A. In the revised manuscript, we will mention this observation also in the main text for better description of the results.

      Adding thiamine to medium does not completely shut off transcription from the nmt1 promoter but allows some transcription, as previously reported (Maundrell, J Biol Chem, 1990; Forsburg, Nuc Acid Res, 1993). In the revised manuscript, we will mention this "leakiness" of the nmt1 promoter and, by citing the suggested studies, will discuss a possibility that the ST11A mutations might prevent endocytosis of Ght5 and consequently promote cell proliferation in low glucose conditions.

      We found that, in the absence of thiamine, cells expressing ght5+ and ght5(ST11A) proliferated to a comparable extent on medium containing 0.08% glucose. This result will be added to the revised manuscript.

      *Fig 2. The authors then follow the hypothesis that TORC2 exerts its Ght5-dependent regulation through the phosphorylation of Aly3. They mutagenised 18 **possible** phosphorylation sites on Aly3. This led to a strong defect in growth in low-glc medium. Mutation of the possible Gad8 site (S460) did not recapitulate this phenotype, suggesting that it is not sufficient, however, mutations of 4 ST residues in a CT cluster (582-586) mimicked the full 18ST/A mutation, suggesting these are the important residues for Ght5 endocytosis.

      We thank the reviewer for appreciating the results in Fig. 2. As we explain below, we plan to perform an additional experiment to show that the Aly3 C-terminus is phosphorylated. With this result, our model will gain another experimental support.

      *Fig 3A. Further dissection did not allow to pinpoint this regulation to a specific residue, beyond the dispensability of the T586 residue. Fig 3B. The authors look at the effects of mutation of Aly3 on these sites at the protein level. They had to develop an antibody because HA-epitope tagging did not lead to a functional protein (Supp fig 2). Whereas I agree that the mutations causing a phenotype lead to a change in the migration pattern, I disagree with the statement that "This observation indicated that slower migrating bands were phosphorylated species of Aly3" (p.9 l.271). First, lack of phosphorylation usually causes a slower mobility on gel, which is not clear to spot here. Second, a smear appears on top of the mutated proteins (eg. 4th Ala) which is possibly caused by another modification. There are many precedents in the literature about arrestins being ubiquitinated when they are not phosphorylated (see the work on Bul1, Rod1, Csr2 in baker's yeast from various labs). My gut feeling is that lack of phosphorylation unleashes Aly3 ubiquitination leading to change in pattern. All in all, it is impossible to state about the phosphorylation of a protein without addressing its phosphorylation properly by phosphatase treatment + change in migration, or MS/MS. Thus, whereas the data looks promising, this hypothesis that Aly3 is phosphorylated at the indicated sites is not properly demonstrated.

      We disagree with the reviewer's opinion that a lack of phosphorylation usually causes slower mobility on gel. There are many examples in which phosphorylation causes slower mobility on gel, including budding yeast Rod1 (Alvaro et al, Genetics, 2016), and mammalian TXNIP (Wu et al, Mol Cell, 2013). In the revised manuscript, we will cite these reports to support our interpretation that the slower migrating bands are likely phosphorylated species of Aly3 (L270-271).

      Smear-like signals in Aly3(4th Ala), Aly3(4th A;A584S) and Aly3(4th A;A586T) might result from some modification, but the identity of the modification is unknown. As Reviewer #2 mentioned, phosphorylation of Aly3 might negatively regulate another modification. Previous studies revealed that budding yeast Rod1 and Rog3 arrestins tend to be ubiquitinated in snf1/AMPK-deficient cells (Becuwe et al, J Cell Biol, 2012; O'Donnell et al, Mol Cell Biol, 2015), and that Bul1 arrestin is dephosphorylated and ubiquitinated in budding yeast cells deficient in Npr1 kinase (Merhi and Andre, Mol Cell Biol, 2012). Also, budding yeast Csr2 arrestin is deubiquitinated and phosphorylated upon glucose replenishment, while non-phosphorylated Csr2 is ubiquitinated and activated by Rsp5 (Hovsepian et al, J Cell Biol, 2012). While the smear-like signals are interesting, we noticed that they did not necessarily correlate with cell proliferation defects in low glucose. We therefore think that clarifying the identity of the smear-like signals is beyond the scope of this study. We will discuss the smear-like signals only briefly in the revised manuscript, and hope to address this issue in future work.

      While the 4 S/T residues at the C-terminus of Aly3 as well as the other 14 S/T residues have been already shown to be phosphorylated in the precedent studies (Kettenbach et al, Mol Cell Proteomics, 2015; Tay et al, Cell Rep, 2019; Halova et al, Open Biol, 2021), we will confirm that the slower migrating Aly3 is indeed phosphorylated by phosphatase treatment in the revised manuscript. This planned experiment will further strengthen our study and support our conclusion and model.

      *Fig 4. The authors now look at the functional consequences of these mutations on Aly3 on Ght5 localisation. The data clearly shows that mutation of the 4 identified S/T residues (Aly3-4th A) causes aberrant localisation of the transporter to the vacuole, likely to cause the observed growth defect on low glucose. There is a nice correlation between the vacuolar localisation and growth in low-glucose for the various aly3 mutants. (A final proof could be to express this in the context of an endocytic mutant, which should restore membrane localisation and suppress the aly3-4thA phenotype - OPTIONAL). However, I still disagree with the statement that "These results indicate that phosphorylation of Aly3 at the C-terminal 582nd, 584th, and/or 585th serine residues is required for cell-surface localization of Ght5." given that phosphorylation was not properly demonstrated.

      While phosphorylation of the 582nd, 584th and/or 585th serine residues of Aly3 is not experimentally demonstrated in our hands, they have been shown to be phosphorylated in phospho-proteomics studies by other groups (Kettenbach et al, Mol Cell Proteomics, 2015; Tay et al, Cell Rep, 2019; Halova et al, Open Biol, 2021; Mak et al, EMBO J, 2021). Among them, the 584th serine residue (S584) was reported to be phosphorylated in a TORC2-dependent manner (Mak et al, EMBO J, 2021), consistent with our model. To explicitly demonstrate that S584 is phosphorylated, we plan to make a strain expressing a mutant Aly3 protein in which all the possible phosphorylation sites except S584 are replaced with alanine, namely Aly3(ST17A;S584). Hopefully, we can properly show the phosphorylation of S584 by measuring the mobility of the Aly3(ST17A;S584) on gel with/without phosphatase treatment or gad8 mutation.

      We thank the reviewer for suggestion of the experiment using an endocytic mutant. Previously we reported that vacuolar localization of Ght5 in gad8 mutant cells was suppressed by mutations in not only aly3 but also genes encoding ESCRT complexes (Toyoda et al, J Cell Sci, 2021). We therefore think that in cells expressing Aly3(ST18A) or Aly3(4th Ala), Ght5 is subject to endocytosis and ensuing selective transport to vacuoles via endosome-localized ESCRT complexes. We will discuss this point in the revised manuscript.

      *Fig 5. Here, the authors question the role of Aly3 mutations on Ght5 ubiquitination. They immunoprecipitate Ght5 and address its ubiquitination status in various Aly3 mutants. The data is encouraging for a role in Aly3 phosphorylation (?) in the negative control of Ght5 ubiquitination. My main problem with this experiment is that it seems that Ght5 immunoprecipitations were made in non-denaturing conditions, which leads to the question of what is the anti-ubiquitin revealing here (Ght5 or a co-immunoprecipitated protein, for example Aly3 itself, or the Pub ligases, or an unknown protein). It seems that this protocol was previously used in their previous paper, but I stand by my conclusion that ubiquitination of a given protein can only be looked in denaturing conditions. The experiments should be repeated in buffers classical for the study of protein ubiquitination to be able to conclude unambiguously that we are looking at Ght5 ubiquitination itself, especially in the absence of a non-ubiquitinable form of Ght5 as a negative control. Could the authors comment on the fact that S-A or S-D mutations display the same phenotype regarding the possible Ght5 ubiquitination?

      As mentioned above, immunoprecipitation of Ght5 in denaturing conditions is not feasible. Ght5 can be affinity-purified only in a non-denaturing condition. In addition, affinity purification in a native condition is considered a major choice for examining ubiquitination according to a review by Emmerich and Cohen (Emmerich and Cohen, Biochem Biophys Res Comm, 2015). A drawback of the native condition is, as the reviewer points out, that the affinity-purified fraction might include non-bait (non-Ght5) proteins. The 110 kDa band indicated by an arrow in Fig. 5A was confirmed to be Ght5, not a non-bait protein, as a band at the identical position was detected in the immunoblot with anti-GFP antibody. Because this band in the anti-GFP immunoblot was too faint to be visible in Fig. 5A of the original manuscript, we will add a panel showing the contrast-enhanced anti-GFP immunoblot in which the 110 kDa band is clearly visible.

      As for the result that "S-A or S-D mutations display the same phenotype regarding the possible Ght5 ubiquitination," we are afraid that Reviewer #2 misunderstood the labels of the samples. We apologize for the confusing notation of the sample names. The full description of the samples is as follows: in Aly3(4th A), all of S582, S584, S585 and T586 are replaced with A; in Aly3(4th A;A584S), S582, S585 and T586 are replaced with A, whereas S584 remains intact; in Aly3(4th A;A584D), S582, S585 and T586 are replaced with A, and S584 is replaced with phospho-mimetic D. Because cells expressing Aly3(4th A;A584S) and Aly3(4th A;A584D) exhibited similarly low levels of Ght5 ubiquitination, we speculated that phosphorylation at S584 of Aly3 negatively regulates ubiquitination of Ght5.

      In the revised manuscript, we plan to add a table showing amino acid sequence of each species of Aly3 (just like Figure 3A) to avoid confusion.

      *Fig 6. The authors want to document the model whereby Aly3 may interact with some of the Nedd4 ligases (Pub1/2/3) to mediate its Ght5-ubiquitination function. They actually use the Aly3-4thA mutant, it should have been better with the WT protein. But the results indicate a clear interaction with at least Pub1 and Pub3. By the way, are the Pub1/2/3 fusions functional? Nedd4 proteins are notoriously affected in their function by C-terminal tagging and are usually tagged at their N-terminus (See Dunn et al. J Cell Biol 2004).

      We plan to test whether Pub1-myc is functional by comparing proliferation of the Pub1-myc-expressing strain and pub1Δ strain, as pub1Δ cells reportedly show proliferation defects at a high temperature (Tamai and Shimoda, J Cell Sci, 2002). As deletion of pub2 or pub3 reportedly exhibited no obvious defects (Tamai and Shimoda, J Cell Sci, 2002; Hayles et al, Open Biol, 2013), it is not easy to assess functionality of the myc-tagged genes.

      Please note that C-terminally tagged Pub1/2/3 proteins have been widely used in studies with fission yeast. Both Pub1-HA and non-tagged Pub1 were reported to be ubiquitinated (Nefsky and Beach, EMBO J, 1996; Strachan et al, J Cell Sci, 2023). Pub1-GFP, which complemented the high temperature sensitivity of pub1Δ, localized to cell surface and cytoplasmic bodies (Tamai and Shimoda, J Cell Sci, 2002). Pub2-GFP, overexpression of which arrested cell growth just like overexpression of non-tagged Pub2, localized to cell surface, and consistently Pub2-HA was detected in membrane-enriched pellet fractions after ultracentrifugation (Tamai and Shimoda, J Cell Sci, 2002). They also reported ubiquitin conjugation of the HECT domain of Pub2 fused with myc epitope at its C-terminus. Pub3-GFP localized to cell surface (Matsuyama et al, Nat Biotech, 2006).

      Regardless of functionality of the myc-tagged Pub1/2/3, we believe that results of this experiment (Figure 6) support our model, because the aim of this experiment, which is to identify the HECT-type and WW-domain containing ubiquitin ligase(s) that interact with Aly3, is irrelevant to functionality of the myc-tagged Pub proteins.

      *Fig 7. The authors want to provide genetic interaction between the Pub ligases and the growth defects in low glc due to alterations in Ght5 trafficking. It is unclear how the gad8ts pub1∆ mutant was generated since it doesn't seem to grow on regular glc concentration (Supp fig 5), could the authors provide some information about this? It is also not clear whether it can be stated that this mutant is "more sensitive" to glc depletion because of the low level of growth to begin with (even at 3%). Altogether, the data show that deletion of pub3+ is able to suppress the growth defect of the gad8ts mutant on low glc medium, suggesting it is the relevant ligase for Ght5 endocytosis. This is confirmed by microscopy observations of Ght5 localisation. However, I would again tone down the main conclusion, which I feel is far-reaching: "Combined with physical interaction data, these results strongly suggest that Aly3 recruits Pub3, but not Pub2, for ubiquitination of Ght5." Work on Rsp5 in baker's yeast has shown that Rsp5 function goes beyond cargo ubiquitination, including ubiquitination of arrestins (which is often required for their function as mentioned in the introduction) or other endocytic proteins (epsins, amphiphysin etc). I agree that the data are compatible with this model but there are other possible explanations. Anything that would block endocytosis would supposedly suppress the gad8ts phenotype.

      gad8ts pub1Δ was produced at 26 °C, a permissive temperature of the gad8ts mutant. While this is described in the Methods section of the original manuscript, we will mention this more clearly in the Results section of the revised manuscript.

      We did not draw a conclusion about the low-glucose sensitivity of gad8ts pub1Δ cells in the indicated part (L376-377). Rather, we compared proliferation of gad8ts single mutant and pub1Δ single mutant cells in low glucose, and found that the pub1Δ single mutant exhibited higher sensitivity. In the revised manuscript we will correct the text to clarify that we compared proliferation of the two single mutants (not the gad8ts pub1Δ double mutant).

      We agree with the opinion that the recruited Pub3 may ubiquitinate proteins other than Ght5. In the revised manuscript, we will correct our conclusion of the Figure 7 experiment (L388-390), not to limit the possible ubiquitination target(s) to Ght5.

      In a genetic screen, we found that mutations in aly3+ and genes encoding ESCRT complexes suppressed low-glucose sensitivity and vacuolar transport of Ght5 of gad8ts mutant cells (Toyoda et al, J Cell Sci, 2021). This finding appears consistent with the reviewer's opinion that blocking endocytosis would supposedly suppress the gad8ts phenotype. We will mention this point in the revised manuscript.

      *Discussion Some analogy with the regulation of the Bul arrestins by TORC1/Npr1 and PP2A/Sit4 could be mentioned (Merhi et al. 2012), at the discretion of the authors. The possibility that phosphorylation may neutralise a basic patch on Aly3 Ct, possibly involved in electrostatic interactions with Ght5 is very interesting. Regarding the effect of the mutations on Aly3 localisation (p.15 l.498), did the authors tag Aly3 with GFP? There are examples where proteins tagged with HA are not functional whereas tagging with GFP does not alter their function (eg. Rod1, Laussel et al. 2022) - and here Supp Fig 2 only relates to HA-tagging. Proof of a change in Aly3 localisation upon mutation would definitely be a plus (OPTIONAL).

      We thank the reviewer for the suggestion of a reference. In the revised manuscript, we will cite the indicated report in the corresponding part for an additional support of TORC1-mediated control of Aly3 (de)phosphorylation.

      While examining localization of Aly3 by GFP-tagging is interesting, we do not believe that it is necessary in this study. We would like to produce Aly3-GFP and examine its functionality and localization in a future study. We thank the reviewer for the insightful suggestion.

      **Minor comments.

      *Introduction: - I believe the text corresponding to the work on TXNIP is incorrect (p.5 l.127). TXNIP is degraded after its phosphorylation, not "retracted" from the surface.

      In the revised manuscript, we will correct the text accordingly.

      • For the sake of completion, the authors could add other references concerning the regulation of Rod1 in budding yeast such as Becuwe et al. 2012 J Cell Biol and O'Donnell et al. 2015 Mol Cell Biol, in addition to Llopis-Torregrosa et al. 2016.

      In the revised manuscript, we will add the suggested references and correct the text in the corresponding part of the Introduction (L123-138).

      • Other examples of the requirement for arrestin ubiquitination beyond Art1 (p.5 l.136-137) are listed in the ref cited: Kahlhofer et al. 2021.

      We will cite the indicated review to direct readers to more examples of arrestin ubiquitination (and transporter ubiquitination).

      *Figures: In general, I think it would be clearer if the authors showed on the figures that the background strain in which the XXX gene is added (or its mutant forms) is a xxx∆ strain.

      We will modify the figures to clearly show the genetic background of the strains used.

      **Referees cross-commenting**

      Cross review of Reviewer 1 - *I don't believe that the authors "define a set of redundant c-terminal phosphorylation sites in Aly3", because phosphorylation is not proven. *I think the points raised for Fig 3B are valid but the authors should focus on making their story conclusive before expanding to other data (except for the explanation of the smear, see my review). Also, I don't think 2NBDG actually works to measure Glc uptake. * same for Fig 6 - not sure the interaction site mapping between Aly3 and Pubs would bring much value since there are more urgent things to do to make the story solid.

      As mentioned above, we will experimentally show phosphorylation of the Aly3 C-terminus in the revised manuscript. Such experiments would make our story more solid and conclusive. We truly appreciate the comments and suggestions.

      We agree with the comments on the difficulty of measuring glucose uptake using 2-NBDG. In fact, we tried and failed to measure Ght5-mediated glucose uptake using 2-NBDG.


      Cross review of Reviewer 3 - we have many overlaps, so briefly : *I agree that the bibliography is incomplete (mentioned in my review) *I agree that there is no demonstration of the phospho-status of Aly3, and it is a problem *I agree that the results can be better quantified, esp. in the light of the points raised by this referee concerning the variability of expression of ST18A Other specific comments : *I agree that the statement that dephosphorylation activates alpha-arrestins should be toned down - this was observed in several instances but there are examples of arrestin-mediated endocytosis which does not require their prior dephosphorylation. *I fully agree that efforts could be made regarding the classification/nomenclature of arrestins in S. pombe, this had escaped my attention

      As detailed in the individual point raised by the reviewers, we will add the suggested references and accordingly correct the text in the revised manuscript.

      In addition to experimentally showing Aly3 phosphorylation, we will quantify the immunoblot result.

      Our statement that dephosphorylation activates alpha-arrestins might be too generalized. We will mention reports in which arrestin-mediated endocytosis does not require prior dephosphorylation (e.g. O'Donnell et al, Mol Biol Cell, 2010; Gournas et al, Mol Biol Cell, 2017; Savocco et al, PLoS Biol, 2019), and modify the text accordingly.

      Reviewer #2 (Significance):

      *strengths and limitations This study aims at deepening our understanding of the regulation of endocytosis by signalling pathways through arrestin-like proteins in S. pombe. Ght5 is a nice model to study a physiological regulation, and the authors have a great set of tools at hand, including the discovery of Aly3 as the main arrestin for this regulation, and a signalling pathway (TORC2/Gad8) acting upstream. The main question is now to understand at the mechanistic level how TORC2 signaling impinges on the regulation of this arrestin.

      Overall, the authors nicely demonstrate that C-terminal Ser/Thr residues are crucial for the function of Aly3 in Ght5 endocytosis. They propose a model whereby Aly3 phosphorylation by an unknown kinase inhibits its function on Ght5 ubiquitination, which would favour its endocytosis. However, I think the conclusions are not always rigorous and the conclusions are sometimes far-reaching. The main problem is that much of the conclusions concern a potential phosphorylation of Aly3 which is not experimentally addressed. An additional issue is the fact that they look at Ght5 ubiquitination by co-immunoprecipitation in native conditions (or at least, it seems to me) which cannot be conclusive. Overall, I think some experiments should be performed to address (at least) these 2 points before the manuscript can be published, see detailed comments above.

      *Advance

      This study, if completed carefully, would provide among the first examples of mapping of phosphorylation sites on arrestins, which are usually phosphorylated at many sites and are thus difficult to study. Few studies went down to this level in this respect (see Ivashov et al. eLife 2020). There are no changes in paradigms or new conceptual insights, but this work is a nice example of the conservation of these regulatory mechanisms.

      We appreciate the reviewer's positive evaluation of this study. We understand the main problems raised by the reviewer and, as detailed above, we plan to perform an experiment and to provide explanations that address them. With these issues resolved, we believe that the conclusions of the revised manuscript will be more rigorous.

      Our study reveals mechanisms regulating vacuolar transport of the Ght5 hexose transporter via the TORC2 pathway in fission yeast. The serine residues at the Aly3 C-terminus (the 582nd, 584th and 585th serine residues), which are probably phosphorylated in a manner dependent on the TORC2 pathway, are required for sustained Ght5 localization to the cell surface and for cellular adaptation to low glucose. To our knowledge, no comparable study has been reported, and we therefore consider this study novel. By responding to the reviewers' comments and adding new data as explained above, the revised manuscript will present the novelty of our study more clearly. Comparison of our study in fission yeast with related studies in other model organisms may reveal the conservation and diversity of these regulatory mechanisms.

      *Audience

      Should be of interest for people studying basic research in the field of cell biology, signalling pathways, transporter regulation by physiology. Reviewer background is on the regulation of transporter endocytosis by signalling pathways and arrestin-like proteins.

      Reviewer #3 (Evidence, reproducibility and clarity): (Authors' response in blue)

      In this manuscript, the authors work to address how phospho-regulation of a-arrestin Aly3 in S. pombe regulates the glucose transporter Ght5. The authors use a series of phospho-mutants in Aly3 and assess function of these mutants using growth assays and localization of Ght5. My main concerns with the manuscript are that 1) there is a lack of appreciation for the similar work that has been done in S. cerevisiae to define a-arrestin phospho-regulation, which is evidenced by the severe lack of referencing throughout the document, 2) the sites mutated on Aly3 are not demonstrated to change phospho-status of Aly3 and so all interpretations of these mutants need to be better contextualized and 3) almost none of the findings are quantified (imaging or immunoblots) making it difficult to assess the rigor of the outcomes. More detailed comments are provided below.

      We thank the reviewer for thorough reading of the manuscript and the detailed comments. As explained below, we will respond to the points raised by the reviewer and accordingly modify the manuscript.

      Minor Comments

      Immunoblotting or immunostaining to define the levels and localization of phospho-mutants - In Figure 1, an immunoblot or immunostaining to define the abundance/localization of WT Ght5 vs its ST11A mutant would be appreciated. It is very difficult to know if ST11A is as functional as WT or not without an assessment of the levels and localization of the WT and mutant proteins to accompany the spot assays. Perhaps a version of Ght5 that is a phospho-mimetic would be more useful here as well since that version should not be dephosphorylated and then presumably would be internalized and not allow for growth on low glucose medium.

      We plan to add fluorescence microscopy data of WT Ght5 and Ght5(ST11A) in the revised manuscript, to compare the localization and abundance of these two Ght5 species. In our preliminary observations, the localization and abundance of the two Ght5 species appeared indistinguishable.

      We'd like to emphasize that the primary aim of this study is to reveal mechanisms that regulate Ght5 localization and consequently ensure cell proliferation in low glucose. While analyzing a phospho-mimetic Ght5 mutant (e.g. Ght5(ST11D)) would be interesting for understanding the nature of Ght5, we believe that such an analysis is beyond the scope of this study. As Ght5(ST11A)-expressing cells proliferated comparably to Ght5(WT)-expressing cells, and WT and ST11A Ght5 localize indistinguishably on the cell surface, phosphorylation of the ST residues of Ght5 is not likely to be the primary mechanism regulating Ght5 localization and function. We would like to assess a phospho-mimetic Ght5 mutant protein in our future studies.

      For the Aly3 mutants where the abundance of Aly3 appears lower via immunoblotting (i.e., 4thA-A582S or S582A) how is the near perfect functional readout explained when the levels of the protein are much lower than WT? For the ST18A mutant, this is a particularly important point since the authors indicate on lines 194-197 that based on the functional data for ST18A, some of these ST residues are needed for phospho-regulation of Aly3. However, in Figure 3B the authors clearly show that there is very little ST18A protein in cells, and so these mutations have impacted Aly3 stability, which may or may not be linked to its phospho-status. The authors should be upfront about this finding on lines 194-197 and should not present this phospho-model as the only reason for why ST18A may not be functional. On lines 265-276 the authors indicate that ST18A is expressed equivalently to WT Aly3, which is just not the case in Figure 3B. Perhaps quantification of replicate data would help clarify this issue. Further, if the authors wish to conclude that the upper MW bands in Figure 3B are due to phosphorylation, perhaps they should perform phosphatase treatments of their extracts to collapse these bands. However, most certainly the overall abundance of the single band for ST18A is reduced compared to the total bands of WT Aly3.

      We disagree with the opinion that the levels of the mutant Aly3 proteins are much lower than that of WT. For semi-quantitative measurement of protein abundance, a 2-fold dilution series of the WT Aly3 sample was loaded in the leftmost 3 lanes of Figure 3B. Although the levels of Aly3(4th A;A582S), Aly3(S582A) and Aly3(ST18A) were lower than that of WT Aly3, they are 50% or more of the WT level, judging from the intensities of the serially diluted WT samples. To clearly show that the expression of these Aly3 proteins is within comparable levels, we plan to add a column chart of the quantified expression levels and to describe the abundances of the Aly3 proteins more quantitatively in the revised text. We do not think that replicate data (of Western blots as in Figure 3B) would help clarify this issue, because nmt1 promoter-driven gene transcription is induced with only small variation (Forsburg, Nuc Acid Res, 1993). We will cite this report and mention this point in the revised text.

      The reviewer appears to consider that Aly3(ST18A) is not functional, but this is not the case, and we do not intend to claim so. While deletion of aly3 did not interfere with cell proliferation in low glucose (see vector controls in Figures 2B, 2C and 3A, -Thiamine), expression of the ST18A mutant clearly hinders cell proliferation in low glucose, indicating that ST18A exerts a dominant-negative function that inhibits cell proliferation. That is, even though the expression level and/or stability of ST18A is reduced, it is still sufficiently abundant to exert the dominant-negative function. We propose the phospho-model not because ST18A is non-functional, but because of its dominant-negative activity. The 18 S/T residues of Aly3, which were shown to be phosphorylated in previous phospho-proteomics studies, seem to be required to down-regulate Aly3's function of inhibiting cell proliferation in low glucose. We apologize for this confusion, and we will modify the text and figures to clarify these points in the revised manuscript.

      To obtain an experimental support for our description that the slower migrating bands in Figure 3B are due to phosphorylation, we plan to perform a phosphatase treatment experiment as suggested.

      Figure 2A - how do the phosphorylation sites identified in Aly3 compare to those identified in Rod1 from S. cerevisiae? See PMID 26920760 or SGD for more information. I am confused as to why the Aly3 protein has an arrowhead at the C-terminus. What does this denote?

      We will mention the reported phosphorylation sites of Aly3 and budding yeast Rod1/Art4 in the revised manuscript, referring to the indicated report and database. It should be noted that the similarity between the amino acid sequences of Aly3 and S. cerevisiae Rod1 is not high and is limited to the Arrestin-N and -C domains. The C-terminal half of Aly3, in which most of the potential phosphorylation sites are found, is not similar to Rod1. Thus, these sites are unlikely to be conserved between them.

      An arrowhead indicates the direction of transcription (from N to C-terminus). We will describe it explicitly in the revised figure legend.

      Figure 2 - The WT and Aly3-ST18A are expressed in S. pombe from a non-endogenous locus under the control of the Nmt1 promoter. However, are these mutants present in cells that contain WT copies of Aly3 at other genomic loci? If so, this would surely muddy the interpretations of these data as a- and b-arrestins are capable of multimerizing and the effect of multimerization on their activities can vary.

      As mentioned in L188, an aly3 deletion mutant strain (aly3Δ) was used as a host, and thus all strains harboring an nmt1-driven aly3 gene lack the endogenous aly3 gene. We will add an illustration clearly showing that the host strain lacks the endogenous aly3+ gene and modify the legend of Figure 2.

      Functional readouts for Aly3 using Ght5 localization - The reduced surface levels of Ght5 does correspond to the spot assay growth in low glucose for the various Aly3 mutants used. However, it would be useful if these assays incorporated an endocytosis inhibitor to help prevent the activities of these Aly3 plasmids to see if the transporter is retained at the PM. At the end of these mutational analyses, the authors conclude that phosphorylation of Aly3 at any of 3 sites is required for Ght5 trafficking to the vacuole in low glucose, however no experiment is done to demonstrate that these sites are phosphorylated residues. A phosphatase assay would be useful to help demonstrate that the modifications in 3B really are phosphorylation and a quantification of the phosphorylated bands in these WBs would also be useful to solidify the statement made on lines 306-309.

      We thank the reviewer for suggesting the experiment using an endocytosis inhibitor. Previously we reported that vacuolar localization of Ght5 in gad8ts mutant cells was suppressed by mutations not only in aly3 but also in genes encoding ESCRT complexes (Toyoda et al, J Cell Sci, 2021). We therefore think that, in cells expressing Aly3(ST18A) or Aly3(4th Ala), Ght5 is subject to endocytosis and subsequent selective transport to vacuoles via ESCRT complexes. We will mention these previous findings in the revised manuscript.

      As mentioned in our responses to the comments above and to the other reviewers' comments, we will perform a phosphatase treatment experiment and include its quantification in the revised manuscript. Here, we'd like to emphasize that these 3 sites have been shown to be phosphorylated in phospho-proteomic studies by other researchers (Kettenbach et al, Mol Cell Proteomics, 2015; Tay et al, Cell Rep, 2019; Halova et al, Open Biol, 2021; Mak et al, EMBO J, 2021), although we do not show it directly in this study.

      Phosphorylation assessments - in general, it would be good to not only build the non-phosphorylatable versions of Aly3 but also the phospho-mimetic forms.

      We produced a phospho-mimetic mutant Aly3 (i.e. Aly3(4th A;A584D)), and showed the result in Figure 5A; cells expressing Aly3(4th A;A584D) exhibited low ubiquitination of Ght5, similarly to Aly3(WT)- and Aly3(4th A;A584S)-expressing cells. In our experience, replacing S/T with D/E does not necessarily mimic phosphorylation. Thus, we do not believe that systematic production of phospho-mimetic Aly3 mutants would help achieve the aim of this study.

      Pub1, 2, and 3 - It would be helpful if the authors indicated what genes Pubs 1-3 correspond to in S. cerevisiae, where Rsp5 is the predominant Ub ligase interacting with a-arrestins. Is there no ortholog of Rsp5 in S. pombe?

      Pub1, Pub2 and Pub3 are regarded as orthologs of budding yeast Rsp5, according to the fission yeast database PomBase. We will perform a homology search for these E3 proteins, and based on the result, we will add a description in the revised manuscript.

      Pub-Aly3 interactions - could the authors please comment on the reason why so very little Aly3 is copurified with Pub1 or Pub2? Can any clear conclusion be drawn about pub2 given how very little Pub2 is present in the IPs? Based on my understanding of these data I do not think that this can be cleanly interpreted. What is the identity of the ~50kDa MW band in Figure 6 in the upper MYC detection panel?

      We do not have a definitive explanation for why only a small amount of Aly3 is copurified with Pub1 or Pub3. The Pub1/3-Aly3 interaction may be weak or transient. We will discuss this point in the revised manuscript.

      Regarding whether Aly3 interacts with Pub2, we agree with the reviewer. As described in the Results (L360-362), we could not conclude anything about the Aly3-Pub2 interaction from this immunoprecipitation experiment alone. On the other hand, the genetic interaction experiment (Figure 7A) suggests that pub2+ is not involved in the defects caused by the gad8ts mutation (while pub3+ and aly3+ are). Based on this experiment, we think that Pub2 is not a partner of Aly3.

      In the revised manuscript, we will describe that Pub2 is not a partner of Aly3 in a paragraph describing the Figure 7A experiment.

      Because the 50 kDa band found in the IP fraction of all the samples appears even in "beads only" (Figure 6), it is presumably derived from mouse IgG dissociated from the beads used for immunoprecipitation. We will mention this in the legend of Figure 6.

      Phosphorylation and ubiquitination of a-arrestins - The paragraph from lines 123-138 is very superficial in addressing what is known about phosphorylation and ubiquitination of a-arrestins. The way this section is written, it feels misleading to the reader as it omits many of the details for regulation that would help place the current study in context. The discussion of Rod1 phosphorylation by AMPK for example, which is directly relevant to this study, is underdeveloped. I would recommend splitting this into two paragraphs and providing a more in depth, and accurate, view of the literature on this topic, with a focus on the regulation that is relevant for the ortholog of Aly3 in S. cerevisiae. For example, Rod1 phosphorylation by AMPK is greatly expanded upon in the following papers (PMID 22249293 and 25547292) and AMPK regulation of C-tail phosphorylation of a-arrestins is defined further in PMID 26920760. These references are each particularly important to compare with the current findings presented in this manuscript. Torc2 regulation of a-arrestins is also reviewed in PMID 36149412 and references therein should be considered.

      Because the primary aim of this study is to reveal mechanisms regulating Ght5 localization in fission yeast, and not to dissect the modification and regulation of α-arrestins, we decided not to go into the details of phosphorylation and ubiquitination of α-arrestins. Furthermore, although budding yeast Rod1 and Rog3 are found downstream of TORC2-Ypk1 signaling in the context of internalization of the Ste2 pheromone receptor, it is not clear whether TORC2-Ypk1 signaling also regulates α-arrestin-mediated internalization of hexose transporters in budding yeast. For these reasons, we focused on a limited set of references essential for interpreting the results and omitted many references describing the details of α-arrestin regulation. However, as this reviewer commented, we realize that this decision made the discussion superficial and misleading to the reader. We sincerely apologize for this.

      In the revised manuscript, we will reorganize the paragraphs in the discussion and include the suggested references. Regarding budding yeast Rod1, we will cite the study reporting Ypk1-mediated phosphorylation on Rod1 in mating pheromone response via regulation of Ste2 endocytosis (Alvaro et al, Genetics, 2016). We will also mention other reports (Becuwe et al, J Cell Biol, 2012; O'Donnell et al, Mol Cell Biol, 2015) about AMPK-dependent phosphorylation of Rod1 in the corresponding part (e.g. L129-130). In addition, we will mention that Aly2, Rod1 and Rog3 α-arrestins were found downstream of the TORC2-Ypk1 signaling (Muir et al, eLife, 2014; Thorner, Biochem J, 2022).

      As a further detailed example, there is far more work done on ubiquitination of a-arrestins in S. cerevisiae than the single citation provided by the authors on line 137. The way this section is written it feels misleading. Considerable effort has been spent on defining how mono- and poly-ubiquitination regulate a-arrestins and the authors should consider the data provided in the following citations and revise the two sentences they provide in this introduction to better reflect the breadth of our understanding rather than simply indicate that the 'mechanisms that regulate functions of a-arrestins are not fully understood'. (PMIDs 23824189; 22249293; 17028178; 28298493)

      Ubiquitination of α-arrestin itself is not the topic of this study, and the physiological consequences of ubiquitination of Aly3 remain unknown. For these reasons, we did not describe the details of ubiquitination of α-arrestins in the original manuscript. However, we never intended to mislead the reader, and to avoid doing so, we will revise the indicated sentences and cite the suggested literature (O'Donnell et al, J Biol Chem, 2013; Becuwe et al, J Cell Biol, 2012; Kee et al, J Biol Chem, 2006; Ho et al, Mol Biol Cell, 2017) in the revised manuscript.

      Context of the findings and lack of citations - The referencing in this manuscript is very poor as many of the key papers that report analogous findings in the budding yeast Saccharomyces cerevisiae are not cited. This oversight in citing the appropriate literature must be remedied before this manuscript can be considered further for publication. Examples of these omissions occur at the following places:

      We will modify the text and carefully cite more of the literature describing analogous findings in budding yeast and other organisms in the revised manuscript. We appreciate the insightful suggestions by this reviewer. It should be noted, however, that it is not evident whether budding yeast Rod1 and Rog3 are orthologous to fission yeast Aly3. Although Rod1 and Aly3 share overlapping roles, their amino acid sequence similarity is not high and is limited to domains that are generally conserved among α-arrestin-family proteins.

      Line 90 - The Puca and Brou citation is one example of this but the first examples come from Daniela Rotin's work looking at Rsp5 interactions in budding yeast, which is where the association between HECT-domain Ub ligases and a-arrestins is also documented by Scott Emr and Hugh Pelham's labs. Here are some PMID numbers to improve the citations of this section (PMID 17551511; 18976803; 19912579) and each of these references long predates the Puca and Brou publication.

      In the revised manuscript, we will improve the citations by including the suggested studies (Gupta et al, Mol Syst Biol, 2007; Lin et al, Cell, 2008; Nikko and Pelham, Traffic, 2009).

      Lines 123-126 - Phosphorylation can also increase vacuole-dependent degradation of alpha-arrestins as demonstrated in PMID 35454122. The interaction with 14-3-3 proteins that is driven by phosphorylation of a-arrestins was first demonstrated by the Leon group (PMID 22249293). Lines 129-132 - Here again the Leon reference that helps demonstrate the 14-3-3 inhibition of Rod1 is lacking (PMID 22249293).

      We will cite the suggested studies in description of these topics (Bowman et al, Biomolecules, 2022; Becuwe et al, J Cell Biol, 2012).

      Lines 130-132 - Please include references for the statement that dephosphorylation activates a-arrestin activity. There are no citations on this statement and there are many to choose from and I would urge the authors to cite the primary literature on these points.

      We will cite studies for the statement "Conversely, dephosphorylation is thought to activate α-arrestins and to promote selective endocytosis of transporter proteins" (L130-132).

      These are just a few examples from the Introduction, but the Discussion is similarly fraught with issues in referencing and framing the experimental results within the context of the larger field, including what is known about Rod1/Rog3 regulation in S. cerevisiae. For example, the Llopis-Torregrosa et al reference and statement on lines 508-510 is incorrect. There are other phosphorylation sites defined in the C-terminus of Rod1, as described in Alvaro et al. PMID: 26920760.

      We will carefully revise the Discussion by citing the suggested references (e.g. Alvaro et al, Genetics, 2016) and framing the obtained results within the context of the larger field.

      Of note, a combination of α-arrestin, upstream kinase(s) and distinct phosphorylation sites appears to determine the target transporter (Kahlhofer et al, Biol Cell, 2021; Thorner, Biochem J, 2022), and it has not been explicitly proven that TORC2-Ypk1 signaling also regulates α-arrestin-mediated internalization of hexose transporters in budding yeast. For these reasons, we stated "S. cerevisiae Rod1 and Rog3 are phosphorylated solely by Snf1p/AMPK" in the context of internalization of hexose transporters. We will also discuss this point in the revised manuscript.

      Minor Comments

      Clarification needed - Lines 107-121 - The relationship between the S. pombe arrestins and those in other organisms is somewhat unclear. First, all the arrestins in humans and S. cerevisiae can be sorted into the alpha, beta and Vps26 classes. However, the authors indicate that the S. pombe genome has 11 arrestin-like proteins but only 4 of these are a-arrestins. What classes do the other 7 arrestins belong to? It would be appreciated if this point was clarified.

      To our knowledge, fission yeast arrestins are not well classified yet. We will perform a phylogenetic tree analysis to classify them, and modify the description of the indicated part accordingly. We will also cite our previous report (Toyoda et al, J Cell Sci, 2021), in which the overall protein structure and domains of 11 fission yeast arrestin-like proteins were reported.

      Next, for the 4 a-arrestins identified in S. pombe the authors indicate that Aly3 is the homolog of Rod1/Art4 and Rog3/Art7 from S. cerevisiae. What is the relationship of Rod1 in S. pombe to Rod1 in S. cerevisiae? Are these also homologs? You can see how the nomenclature is confusing and, given the functional overlap of S. cerevisiae Rod1/Rog3 proteins it is important to know if Aly3 is the only version of these a-arrestins or if there is an additional counterpart in S. pombe. This point becomes somewhat more confusing when on lines 134-136 the authors talk about Arn1/Any1 as an arrestin related protein in S. pombe yet this protein was not included on the list of a-arrestins in the preceding section. What class of arrestin is this protein?

      According to PomBase, both Aly3 and Rod1 are assigned as orthologues of budding yeast Rod1 and Rog3. However, as mentioned in the responses above, it is unclear whether Aly3 is really orthologous to budding yeast Rod1/Rog3. In the revised manuscript, we will perform a homology search for these 4 proteins, and add information on how much homology these arrestins share.

      Arn1/Any1 is regarded as a β-arrestin (Nakase et al, J Cell Sci, 2013). We will also mention this in the revised manuscript.

      Alpha-arrestin homology - On lines 127-129 the authors indicate that TXNIP is the mammalian homolog of Aly3. To my knowledge, there are no evolutionary analyses that can draw these lines of homology between the a-arrestins in humans and those in yeasts. It would be appreciated if the authors could cite the work that leads to this conclusion or revise the sentence to more accurately reflect what is known on this topic. It certainly appears that, given their functional overlap in regulating glucose transporters, Txnip and Rod1/Rog3 in humans and S. cerevisiae are functionally connected. I urge the authors to use more caution when describing this protein family.

      Among human α-arrestins, ARRDC2 (22%), not TXNIP (20%), has the highest amino acid identity to Aly3 (Toyoda et al, J Cell Sci, 2021). However, as TXNIP has been reported to regulate endocytosis of the hexose transporters GLUT1 and GLUT4 (Wu et al, Mol Cell, 2013; Waldhart et al, Cell Rep, 2017), we think that TXNIP and Aly3 share physiological roles. We will revise the sentence (L127-129) to be more accurate.

      Text editing - The text could use editing as there are awkward and grammatically incorrect sentences in several places. Here are a few examples to help the authors:

      Please note that the original manuscript was edited before initial submission by a professional editor, who is a native English (American) speaker and has edited thousands of research papers. We will ask an editor to check the revised draft again before submission.

      Lines 57-60 - the protein is not expressed over the entire cell surface, but is localized to the entire cell surface.

      We will correct this wording.

      Lines 80-83 - this sentence is very confusing

      We will correct this part by changing the phrase "Unlike TORC1," into a clause.

      Line 86 - Is there more than one gene encoding Aly3 in S. pombe?

      No, there is only one gene encoding Aly3. We will correct this part so as to avoid being misunderstood.

      Line 88, 109, - these sentences need to start with a capital so either capitalize the A in arrestin or write out Alpha with a capital A.

      We will correct the sentence as suggested.

      Lines 145-148 - unclear as written

      We will clarify the meaning of the sentence by changing the voice.

      Line 224 - why are these amino acids being referred to as hydroxylated? Perhaps hydroxyl-containing amino acids or 18 amino acids with hydroxyl side chains would be better choices?

      We will correct the word as suggested.

      Line 300 - very confusing sentence structure

      We will correct this part by simplifying the structure of the sentence.

      And elsewhere....

      We will carefully check the revised text before submission.

      Reviewer #3 (Significance):

      The authors provide some information as to the residues needed in the Aly3 C-tail for Ght5 trafficking in S. pombe. These results are not placed in the context of similar phospho-regulatory work done for a-arrestins in S. cerevisiae, and this is needed for appreciation of the significance of the study.

      Overall, it appears that the model put forth is very similar to the one already proposed in S. cerevisiae where phosphorylation impedes a-arrestin-mediated trafficking of glucose transporters. It is interesting to see this similarity hold in S. pombe, but it does not dramatically alter our appreciation of a-arrestin biology.

      The significance of the findings is somewhat undercut by the fact that very little quantification of data is presented, making the rigor of the work difficult to assess.

      We thank the reviewer for careful reading and evaluation of our study. As the reviewer states, the results are not placed in the context of similar phospho-regulatory works done for α-arrestins in S. cerevisiae. This may partly come from the fact that it remains unclear whether internalization of hexose transporters is regulated by TORC2-dependent phosphorylation in S. cerevisiae. We believe that our study is novel and significant for this reason. By performing the additional experiments/quantification and revising the text as suggested by the reviewers, the manuscript will be further strengthened, and we will be able to clearly conclude that TORC2-dependent phosphorylation of Aly3 regulates localization of the Ght5 hexose transporter and cellular responses to glucose shortage stress.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary/background.

      This paper focuses on the regulation of endocytosis of the hexose transporter, Ght5, in S. pombe by nutrient limitation through the arrestin-like protein Aly3. Ght5 is induced when glucose is limiting and is required for growth and proliferation in these conditions. ght5+ encodes the only high-affinity glc transporter from fission yeast. ght5+ is induced in low glucose conditions at the transcriptional level and is translocated to the plasma membrane to allow glc import. Ght5 is targeted to the vacuole in conditions of N limitation. Mutations in the TORC2 pathway lead to the same process, thus preventing growth on low glucose medium, as shown in the gad8ts mutant, mutated for the Gad8 kinase acting downstream of TORC2. Previously, the authors demonstrated that the vacuolar delivery of Ght5 in the gad8ts mutant is suppressed by mutation of the arrestin-like protein Aly3. Arrestin-like proteins are in charge of recognising and ubiquitinating plasma membrane proteins to direct their vacuolar targeting by the endocytosis pathway. This suggested that Aly3 is hyperactive in TORC2 mutants, and accordingly, Ght5 ubiquitination was increased in gad8ts.

      Overall statement

      This study aims at deepening our understanding of the regulation of endocytosis by signalling pathways through arrestin-like proteins. Ght5 is a nice model to study a physiological regulation, and the authors have a great set of tools at hand. However, I think the conclusions are not always rigorous and the conclusions are sometimes far-reaching. The main problem is that much of the conclusions concern a potential phosphorylation of Aly3 which is not experimentally addressed. An additional issue is the fact that they look at Ght5 ubiquitination by co-immunoprecipitation in native conditions (or at least, it seems to me) which cannot be conclusive. Overall, I think some experiments should be performed to address (at least) these 2 points before the manuscript can be published, see detailed comments below.

      Major statements and criticism.

      • Fig 1. Based on the hypothesis that TORC2-mediated phosphorylation regulates Ght5 endocytosis, the authors first considered a possible phosphorylation of Ght5. They mutagenised 11 possible phosphorylation sites on the Ct of Ght5, but none affected the growth on low glucose in the absence of thiamine, suggesting that they don't contribute to the observed TORC2-mediated regulation. However, I disagree with the statement that "phosphorylation of Ght5 is dispensable for cell proliferation in low glucose", given that the authors do not show 1- that Ght5 is phosphorylated and 2- that this is abolished by these mutations. They should either provide data on this or tone down and say that these residues are not involved in the regulation, without implying phosphorylation which is not proven. In the presence of Thiamine (Supp fig 1), it seems that the ST/A mutant grows better in low glucose, and this is not explained nor commented. Since the transporter is not expressed, could the authors provide an explanation to this? If the promoter is leaky and some ght5-ST/A is expressed, it may be more stable and allow better growth than the WT, which would tend to indicate that impairing phosphorylation prevents endocytosis (which is classical for many transporters, see the body of work on CK1-mediated phosphorylation of transporters). Have the authors tried to decrease glc concentration lower than 0.14% in the absence of thiamine to see if this is also true when the transporter is strongly expressed? (OPTIONAL)
      • Fig 2. The authors then follow the hypothesis that TORC2 exerts its Ght5-dependent regulation through the phosphorylation of Aly3. They mutagenised 18 possible phosphorylation sites on Aly3. This led to a strong defect in growth in low-glc medium. Mutation of the possible Gad8 site (S460) did not recapitulate this phenotype, suggesting that it is not sufficient, however, mutations of 4 ST residues in a CT cluster (582-586) mimicked the full 18ST/A mutation, suggesting these are the important residues for Ght5 endocytosis.
      • Fig 3A. Further dissection did not allow to pinpoint this regulation to a specific residue, beyond the dispensability of the T586 residue. Fig 3B. The authors look at the effects of mutation of Aly3 on these sites at the protein level. They had to develop an antibody because HA-epitope tagging did not lead to a functional protein (Supp fig 2). Whereas I agree that the mutations causing a phenotype lead to a change in the migration pattern, I disagree with the statement that "This observation indicated that slower migrating bands were phosphorylated species of Aly3" (p.9 l.271). First, lack of phosphorylation usually causes a slower mobility on gel, which is not clear to spot here. Second, a smear appears on top of the mutated proteins (eg. 4th Ala) which is possibly caused by another modification. There are many precedents in the literature about arrestins being ubiquitinated when they are not phosphorylated (see the work on Bul1, Rod1, Csr2 in baker's yeast from various labs). My gut feeling is that lack of phosphorylation unleashes Aly3 ubiquitination leading to change in pattern. All in all, it is impossible to state about the phosphorylation of a protein without addressing its phosphorylation properly by phosphatase treatment + change in migration, or MS/MS. Thus, whereas the data looks promising, this hypothesis that Aly3 is phosphorylated at the indicated sites is not properly demonstrated.
      • Fig 4. The authors now look at the functional consequences of these Aly3 mutations on Ght5 localisation. The data clearly show that mutation of the 4 identified S/T residues (Aly3-4th A) causes aberrant localisation of the transporter to the vacuole, which likely causes the observed growth defect on low glucose. There is a nice correlation between the vacuolar localisation and growth in low-glucose for the various aly3 mutants. (A final proof could be to express this in the context of an endocytic mutant, which should restore membrane localisation and suppress the aly3-4thA phenotype - OPTIONAL). However, I still disagree with the statement that "These results indicate that phosphorylation of Aly3 at the C-terminal 582nd, 584th, and/or 585th serine residues is required for cell-surface localization of Ght5." given that phosphorylation was not properly demonstrated.
      • Fig 5. Here, the authors question the role of Aly3 mutations on Ght5 ubiquitination. They immunoprecipitate Ght5 and address its ubiquitination status in various Aly3 mutants. The data is encouraging for a role in Aly3 phosphorylation (?) in the negative control of Ght5 ubiquitination. My main problem with this experiment is that it seems that Ght5 immunoprecipitations were made in non-denaturing conditions, which leads to the question of what is the anti-ubiquitin revealing here (Ght5 or a co-immunoprecipitated protein, for example Aly3 itself, or the Pub ligases, or an unknown protein). It seems that this protocol was previously used in their previous paper, but I stand by my conclusion that ubiquitination of a given protein can only be looked in denaturing conditions. The experiments should be repeated in buffers classical for the study of protein ubiquitination to be able to conclude unambiguously that we are looking at Ght5 ubiquitination itself, especially in the absence of a non-ubiquitinable form of Ght5 as a negative control. Could the authors comment on the fact that S-A or S-D mutations display the same phenotype regarding the possible Ght5 ubiquitination?
      • Fig 6. The authors want to document the model whereby Aly3 may interact with some of the Nedd4 ligases (Pub1/2/3) to mediate its Ght5-ubiquitination function. They actually use the Aly3-4thA mutant; it would have been better to use the WT protein. But the results indicate a clear interaction with at least Pub1 and Pub3. By the way, are the Pub1/2/3 fusions functional? Nedd4 proteins are notoriously affected in their function by C-terminal tagging and are usually tagged at their N-terminus (See Dunn et al. J Cell Biol 2004).
      • Fig 7. The authors want to provide genetic interaction between the Pub ligases and the growth defects in low glc due to alterations in Ght5 trafficking. It is unclear how the gad8ts pub1∆ mutant was generated since it doesn't seem to grow on regular glc concentration (Supp fig 5), could the authors provide some information about this? It is also not clear whether it can be stated that this mutant is "more sensitive" to glc depletion because of the low level of growth to begin with (even at 3%). Altogether, the data show that deletion of pub3+ is able to suppress the growth defect of the gad8ts mutant on low glc medium, suggesting it is the relevant ligase for Ght5 endocytosis. This is confirmed by microscopy observations of Ght5 localisation. However, I would again tone down the main conclusion, which I feel is far-reaching: "Combined with physical interaction data, these results strongly suggest that Aly3 recruits Pub3, but not Pub2, for ubiquitination of Ght5." Work on Rsp5 in baker's yeast has shown that Rsp5 function goes beyond cargo ubiquitination, including ubiquitination of arrestins (which is often required for their function as mentioned in the introduction) or other endocytic proteins (epsins, amphiphysin etc). I agree that the data are compatible with this model but there are other possible explanations. Anything that would block endocytosis would supposedly suppress the gad8ts phenotype.

      Discussion

      Some analogy with the regulation of the Bul arrestins by TORC1/Npr1 and PP2A/Sit4 could be mentioned (Mehri et al. 2012), at the discretion of the authors. The possibility that phosphorylation may neutralise a basic patch on Aly3 Ct, possibly involved in electrostatic interactions with Ght5 is very interesting. Regarding the effect of the mutations on Aly3 localisation (p.15 l.498), did the authors tag Aly3 with GFP? There are examples where proteins tagged with HA are not functional whereas tagging with GFP does not alter their function (eg. Rod1, Laussel et al. 2022) - and here Supp Fig 2 only relates to HA-tagging. Proof of a change in Aly3 localisation upon mutation would definitely be a plus (OPTIONAL).

      Minor comments.

      Introduction:

      • I believe the text corresponding to the work on TXNIP is incorrect (p.5 l.127). TXNIP is degraded after its phosphorylation, not "retracted" from the surface.
      • For the sake of completion, the authors could add other references concerning the regulation of Rod1 in budding yeast such as Becuwe et al. 2012 J Cell Biol and O'Donnell et al. 2015 Mol Cell Biol, in addition to Llopis-Torregrosa et al. 2016.
      • Other examples of the requirement for arrestin ubiquitination beyond Art1 (p.5 l.136-137) are listed in the ref cited: Kahlhofer et al. 2021.

      Figures: In general, I think it would be clearer if the authors showed on the figures that the background strain in which the XXX gene is added (or its mutant forms) is a xxx∆ strain.

      Referees cross-commenting

      Cross review of Reviewer 1

      • I don't believe that the authors "define a set of redundant c-terminal phosphorylation sites in Aly3", because phosphorylation is not proven.
      • I think the points raised for Fig 3B are valid but the authors should focus on making their story conclusive before expanding to other data (except for the explanation of the smear, see my review). Also, I don't think 2NBDG actually works to measure Glc uptake.
      • Same for Fig 6 - not sure the interaction site mapping between Aly3 and Pubs would bring much value since there are more urgent things to do to make the story solid.

      Cross review of Reviewer 3 - we have many overlaps, so briefly:

      • I agree that the bibliography is incomplete (mentioned in my review)
      • I agree that there is no demonstration of the phospho-status of Aly3, and it is a problem
      • I agree that the results can be better quantified, esp. in the light of the points raised by this referee concerning the variability of expression of ST18A

      Other specific comments :

      • I agree that the statement that dephosphorylation activates alpha-arrestins should be toned down - this was observed in several instances but there are examples of arrestin-mediated endocytosis which do not require their prior dephosphorylation.
      • I fully agree that efforts could be made regarding the classification/nomenclature of arrestins in S. pombe, this had escaped my attention

      Significance

      strengths and limitations

      This study aims at deepening our understanding of the regulation of endocytosis by signalling pathways through arrestin-like proteins in S. pombe. Ght5 is a nice model to study a physiological regulation, and the authors have a great set of tools at hand, including the discovery of Aly3 as the main arrestin for this regulation, and a signalling pathway (TORC2/Gad8) acting upstream. The main question is now to understand at the mechanistic level how TORC2 signaling impinges on the regulation of this arrestin.

      Overall, the authors nicely demonstrate that C-terminal Ser/Thr residues are crucial for the function of Aly3 in Ght5 endocytosis. They propose a model whereby Aly3 phosphorylation by an unknown kinase inhibits its function on Ght5 ubiquitination, which would favour its endocytosis. However, I think the conclusions are not always rigorous and are sometimes far-reaching. The main problem is that many of the conclusions concern a potential phosphorylation of Aly3 which is not experimentally addressed. An additional issue is the fact that they look at Ght5 ubiquitination by co-immunoprecipitation in native conditions (or at least, it seems to me) which cannot be conclusive. Overall, I think some experiments should be performed to address (at least) these 2 points before the manuscript can be published, see detailed comments above.

      Advance

      This study, if completed carefully, would provide one of the first examples of mapping of phosphorylation sites on arrestins, which are usually phosphorylated at many sites and are thus difficult to study. Few studies have gone down to this level in this respect (see Ivashov et al. eLife 2020). There are no changes in paradigms or new conceptual insights, but this work is a nice example of the conservation of these regulatory mechanisms.

      Audience

      Should be of interest for people studying basic research in the field of cell biology, signalling pathways, transporter regulation by physiology. Reviewer background is on the regulation of transporter endocytosis by signalling pathways and arrestin-like proteins.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary and Strengths:

      The ability of Wolbachia to be transmitted horizontally during parasitoid wasp infections is supported by phylogenetic data here and elsewhere. Experimental analyses have shown evidence of wasp-to-wasp transmission during coinfection (eg Huigens et al), host-to-wasp transmission (eg Heath et al), and mechanical ('dirty needle') transmission from host to host (Ahmed et al). To my knowledge this manuscript provides the first experimental evidence of wasp-to-host transmission. Given the strong phylogenetic pattern of host-parasitoid Wolbachia sharing, this may be of general importance in explaining the distribution of Wolbachia across arthropods. This is of interest as Wolbachia is extremely common in the natural world and influences many aspects of host biology.

      Weaknesses:

      The first observation of the manuscript is that the Wolbachia strains in hosts are more closely related to those in their parasitoids. This has been reported on multiple occasions before, dating back to the late 1990s. The introduction cites five such papers (the observation is made in other studies too that could be cited) but then dismisses them by stating "However, without quantitative tests, this observation could simply reflect a bias in research focus." As these studies include carefully collected datasets that were analysed appropriately, I felt this claim of novelty was rather strong. It is unclear why downloading every sequence in GenBank avoids any perceived biases, when presumably the authors are reanalysing the data in these papers.

      Thank you for bringing this to our attention, and we will make the necessary amendments in our revised manuscript.

      I do not doubt the observation that host-parasitoid pairs tend to share related Wolbachia, as it is corroborated by other studies, the effect size is large, and the case study of whitefly is clearcut. It is also novel to do this analysis on such a large dataset. However, the statistical analysis used is incorrect as the observations are pseudo-replicated due to phylogenetic non-independence. When analysing comparative data like this it is essential to correct for the confounding effects of related species tending to be similar due to common ancestry. In this case, it is well-known that this is an issue as it is a repeated observation that related hosts are infected by related Wolbachia. However, the authors treat every pairwise combination of species (nearly a million pairs) as an independent observation. Addressing this issue is made more complex because there are both the host and symbiont trees to consider. The additional analysis in lines 123-124 (including shuffling species pairs) does not explicitly address this issue.

      We concur with your observation regarding the non-independence of the data due to phylogenetic relationships. While common phylogenetic correction methods are indeed not directly applicable to wsp distances between species pairs, we are investigating the potential of phylogenetic mixed models to address this issue. We hope to include a revised analysis using this approach in our revised manuscript.

      The sharing of Wolbachia between whitefly and their parasitoids is very striking, although this has been reported before (eg the authors recently published a paper entitled "Diversity and Phylogenetic Analyses Reveal Horizontal Transmission of Endosymbionts Between Whiteflies and Their Parasitoids"). In Lines 154-164 it is suggested that from the tree the direction of transfer between host and parasitoid can be inferred from the data. This is not obvious to me given the poor resolution of the tree due to low sequence divergence. There are established statistical approaches to test the direction of trait changes on a tree that could have been used (a common approach is to use the software BEAST).

      Thank you for your insightful comments regarding the transfer direction of Wolbachia between whiteflies and their parasitoids. We acknowledge the concern about the resolution of the phylogenetic tree and the inference of the direction of Wolbachia transmission based on the available data. We considered the high infection frequency and obligate nature of Wolbachia in En. formosa, which exhibits a 100% infection rate, as a strong indicator that recent transmission of Wolbachia in this clade likely occurred from En. formosa to B. tabaci. We appreciate your recommendation and will ensure that our conclusions are supported by a more statistically sound approach. As you suggested, we will employ the software BEAST to rigorously test the direction of transmission, and we will revise our statements accordingly.

      Reviewer #2 (Public Review):

      The paper by Yan et al. aims to provide evidence for horizontal transmission of the intracellular bacterial symbiont Wolbachia from parasitoid wasps to their whitefly hosts. In my opinion, the paper in its current form contains major flaws.

      Weaknesses:

      The dogma in the field is that although horizontal transmission events of Wolbachia occur, in most systems they are so rare that the chances of observing them in the lab are very slim.

      For the idea of bacteria moving from a parasitoid to its host, the authors have rightfully cited the paper by Hughes, et al. (2001), which presents the main arguments against the possibility of documenting such transmissions. Thus, if the authors want to provide data that contradict the large volume of evidence showing the opposite, they should present a very strong case.

      In my opinion, the paper fails to provide such concrete evidence. Moreover, it seems the work presented does not meet the basic scientific standards.

      We are grateful for your critical perspective on our work. Nonetheless, we are confident in the credibility of our findings regarding the horizontal transmission of Wolbachia from En. formosa to B. tabaci. Our study has documented this phenomenon through phylogenetic tree analyses, and we have further substantiated our observations with rigorous experiments in both cages and petri dishes. The horizontal transfer of Wolbachia was confirmed via PCR, with the wsp sequences in B. tabaci showing complete concordance with those in En. formosa. Additionally, we utilized FISH, vertical transmission experiments, and phenotypic assays to demonstrate that the transferred Wolbachia could be vertically transmitted and induce significant fitness cost in B. tabaci. All experiments were conducted with strict negative controls and a sufficient number of replicates to ensure reliability, thereby meeting basic scientific standards. The collective evidence we present points to a definitive case of Wolbachia transmission from the parasitoid En. formosa to the whitefly B. tabaci.

      My main reservations are:

      • I think the distribution pattern of bacteria stained by the probes in the FISH pictures presented in Figure 4 looks very much like Portiera, the primary symbiont found in the bacteriome of all whitefly species. In order to make a strong case, the authors need to include Portiera probes along with the Wolbachia ones.

      We are very grateful for your critical evaluation regarding the specificity of FISH in our study. We are confident in the reliability of our FISH results for several reasons.

      1) We implemented rigorous negative controls which exhibited no detectable signal, thereby affirming the specificity of our hybridization. 2) The central region of the whitefly nymphs is a typical oviposition site for En. formosa. Post-parasitism, we observed FISH signals around the introduced parasitoid eggs, distinct from bacteriocyte cells which are rich in endosymbionts including Portiera (FIG 3e-f). This observation supports the high specificity of our FISH method. 3) In the G3 whiteflies, we detected the presence of Wolbachia in bacteriocytes in nymphs and at the posterior end of eggs in adult females (FIG 4). This distribution pattern aligns with previously reported localizations of Wolbachia in B. tabaci (Shi et al., 2016; Skaljac et al., 2013). Furthermore, the distribution of Wolbachia in the whiteflies does indeed exhibit some overlap with that of Portiera (Skaljac et al., 2013; Bing et al., 2014). 4) The primers used in our FISH assays have been widely cited (Heddi et al., 1999) and validated in studies on B. tabaci and other systems (Guo et al., 2018; Hegde et al., 2024; Krafsur et al., 2020; Rasgon et al., 2006; Uribe-Alvarez et al., 2019; Zhao et al., 2013). Taking all these points into consideration, we stand by the reliability of our FISH results.

      References:

      Bing XL, Xia WQ, Gui JD, Yan GH, Wang XW, Liu SS. 2014. Diversity and evolution of the Wolbachia endosymbionts of Bemisia (Hemiptera: Aleyrodidae) whiteflies. Ecol Evol, 4(13): 2714-37.

      Guo, Y, Hoffmann, AA, Xu, XQ, Zhang X, Huang HJ, Ju JF, Gong JT, Hong XY. 2018. Wolbachia-induced apoptosis associated with increased fecundity in Laodelphax striatellus (Hemiptera: Delphacidae). Insect Mol Biol, 27: 796-807.

      Heddi A, Grenier AM, Khatchadourian C, Charles H, Nardon P. 1999. Four intracellular genomes direct weevil biology: Nuclear, mitochondrial, principal endosymbiont, and Wolbachia. Proc Natl Acad Sci USA, 96: 6814-6819.

      Hegde S, Marriott AE, Pionnier N, Steven A, Bulman C, Gunderson E, et al. 2024. Combinations of the azaquinazoline anti-Wolbachia agent, AWZ1066S, with benzimidazole anthelmintics synergise to mediate sub-seven-day sterilising and curative efficacies in experimental models of filariasis. Front Microbiol, 15: 1346068.

      Krafsur AM, Ghosh A, Brelsfoard CL. 2020. Phenotypic response of Wolbachia pipientis in a cell-free medium. Microorganisms, 8: 1060.

      Rasgon JL, Gamston, CE, Ren X. 2006. Survival of Wolbachia pipientis in cell-free medium. Appl Environ Microbiol, 72: 6934-6937.

      Shi P, He Z, Li S, An X, Lv N, Ghanim M, Cuthbertson AGS, Ren SX, Qiu BL. 2016. Wolbachia has two different localization patterns in whitefly Bemisia tabaci AsiaII7 species. PLoS One, 11: e0162558.

      Skaljac M, Zanić K, Hrnčić S, Radonjić S, Perović T, Ghanim M. 2013. Diversity and localization of bacterial symbionts in three whitefly species (Hemiptera: Aleyrodidae) from the east coast of the Adriatic Sea. Bull Entomol Res, 103(1): 48-59.

      Uribe-Alvarez C, Chiquete-Félix N, Morales-García L, Bohórquez-Hernández A, Delgado-Buenrostro N L, Vaca L, et al. 2019. Wolbachia pipientis grows in Saccharomyces cerevisiae evoking early death of the host and deregulation of mitochondrial metabolism. MicrobiologyOpen, 8: e00675.

      Zhao DX, Zhang XF, Chen DS, Zhang YK, Hong XY, 2013. Wolbachia-host interactions: Host mating patterns affect Wolbachia density dynamics. PLoS One, 8: e66373.

      • If I understand the methods correctly, the phylogeny presented in Figure 2a is supposed to be based on a wide search for Wolbachia wsp gene done on the NCBI dataset (p. 348). However, when I checked the origin of some of the sequences used in the tree to show the similarity of Wolbachia between Bemisia tabaci and its parasitoids, I found that most of them were deposited by the authors themselves in the course of the current study (I could not find this mentioned in the text), or originated in a couple of papers that in my opinion should not have been published to begin with.

      We appreciate your meticulous examination of the sources for our sequence data. All the sequences included in our phylogenetic analysis were indeed downloaded from the NCBI database as of July 2023. The sequences used to illustrate the similarity of Wolbachia between B. tabaci and its parasitoids include those from our previously published study (Qi et al., 2019), which were sequenced from field samples. Additionally, some sequences were also obtained from other laboratories (Ahmed et al., 2009; Baldo et al., 2006; Van Meer et al., 1999). We acknowledge that in our prior research (Qi et al., 2019), the sequences were directly submitted to NCBI and, regrettably, we did not update the corresponding publication information after the article was published. It is not uncommon for sequences on NCBI to lack publication details: some are never followed by a published paper (e.g., FJ710487-FJ710511 and JF426137-JF426149), and others do not have their associated publication details updated post-publication (for instance, sequences MH918776-MH918794 from Qi et al., 2019, and KF017873-KF017878 from Fattah-Hosseini et al., 2018). We recognize that this practice can lead to confusion and apologize for the oversight in our work.

      References:

      Ahmed MZ, Shatters RG, Ren, SX, Jin GH, Mandour NS, Qiu BL. 2009. Genetic distinctions among the Mediterranean and Chinese populations of Bemisia tabaci Q biotype and their endosymbiont Wolbachia populations. J Appl Entomol, 133: 733-741.

      Baldo L, Hotopp JCD, Jolley KA, Bordenstein SR, Biber SA, Choudhury RR, et al. 2006. Multilocus sequence typing system for the endosymbiont Wolbachia pipientis. Appl Environ Microbiol, 72: 7098-110.

      Fattah-Hosseini S, Karimi J, Allahyari H. 2018. Molecular characterization of Iranian Encarsia formosa Gahan populations with natural incidence of Wolbachia infection. J Entomol Res Soc, 20: 85–100.

      Qi LD, Sun JT, Hong XY, Li YX. 2019. Diversity and phylogenetic analyses reveal horizontal transmission of endosymbionts between whiteflies and their parasitoids. J Econ Entomol, 112(2): 894-905.

      Van Meer MM, Witteveldt J, Stouthamer R. 1999. Phylogeny of the arthropod endosymbiont Wolbachia based on the wsp gene. Insect Mol Biol, 8: 399-408.

      • The authors fail to discuss or even acknowledge a number of published studies that specifically show no horizontal transmission, such as the one claimed to be detected in the study presented.

      Thank you for bringing this to our attention. We will address and discuss the published studies that report no evidence of horizontal transmission, as you've highlighted, in the revised version of our manuscript.

      Reviewer #3 (Public Review):

      This is a very ordinary research paper. The horizontal transmission of endosymbionts, including Wolbachia, Rickettsia etc., has been reported in detail in the last 10 years, and parasitoid-vectored as well as plant-vectored horizontal transmission is the mainstream of research. For example, Ahmed et al. 2013 PLoS One, 2015 PLoS Pathogens, Chiel et al. 2014 Environmental Entomology, Ahmed et al. 2016 BMC Evolutionary Biology, Qi et al. 2019 JEE, Liu et al. 2023 Frontiers in Cellular and Infection Microbiology, all of these reported the parasitoid-vectored horizontal transmission of endosymbionts. Meanwhile, Caspi-Fluger et al. 2012 Proc Roy Soc B, Chrostek et al. 2017 Frontiers in Microbiology, Li et al. 2017 ISME Journal, Li et al. 2017 FEMS, Shi et al. 2024 mBio, all of these reported the plant-vectored horizontal transmission of endosymbionts. For the effects of endosymbionts on the biology of the host, Ahmed et al. 2015 PLoS Pathogens explained the effects in detail.

      Thank you very much for your insightful comments and for highlighting the relevant literature in the field of horizontal transmission of endosymbionts, including Wolbachia and Rickettsia. After careful consideration of the studies you have mentioned, we believe that our work presents significant novel contributions to the field. 1) Regarding the parasitoid-mediated horizontal transmission of Wolbachia, most of the cited articles, such as Ahmed et al. 2013 in PLoS One and Ahmed et al. 2016 in BMC Evolutionary Biology, propose hypotheses but do not provide definitive evidence. The transmission of Wolbachia within the whitefly cryptic species complex (Ahmed et al. 2013) or between moths and butterflies (Ahmed et al. 2016) could be mediated by parasitoids, plants, or other unknown pathways. 2) Chiel et al. (2014) in Environmental Entomology reported “no evidence for horizontal transmission of Wolbachia between and within trophic levels” in their study system. 3) The literature you mentioned about Rickettsia, rather than Wolbachia, indirectly reflects the relative scarcity of evidence for Wolbachia horizontal transmission. For example, the evidence for plant-mediated transmission of Wolbachia remains isolated, with Li et al. 2017 in The ISME Journal being one of the few reports supporting this mode of transmission. 4) While the effects of endosymbionts on their hosts are not the central focus of our study, the effects of transgenerational Wolbachia on whiteflies are primarily demonstrated to confirm the infection of Wolbachia into whiteflies. Furthermore, the effects we report of Wolbachia on whiteflies are notably different from those reported by Ahmed et al. 2015 in PLoS Pathogens, likely due to different whitefly species and Wolbachia strains. 5) More importantly, our study reveals a mechanism of parasitoid-mediated horizontal transmission of Wolbachia that is distinct from the mechanical transmission suggested by Ahmed et al. 2015 in PLoS Pathogens. Their study implies transmission primarily through host-feeding contamination, without the need for Wolbachia to infect the parasitoid, suggesting host-to-host transmission at the same trophic level. In contrast, our findings demonstrate transmission from parasitoids to hosts through unsuccessful parasitism, which represents cross-trophic level transmission. To our knowledge, this is the first experimental evidence that Wolbachia can be transmitted from parasitoids to hosts. We believe these clarifications and the novel insights provided by our research contribute valuable knowledge to the field.

      References:

      Ahmed MZ, De Barro PJ, Ren SX, Greeff JM, Qiu BL. 2013. Evidence for horizontal transmission of secondary endosymbionts in the Bemisia tabaci cryptic species complex. PLoS One, 8: e53084.

      Ahmed MZ, Li SJ, Xue X, Yin XJ, Ren SX, Jiggins FM, Greeff JM, Qiu BL. 2015. The intracellular bacterium Wolbachia uses parasitoid wasps as phoretic vectors for efficient horizontal transmission. PLoS Pathog, 10: e1004672.

      Ahmed MZ, Breinholt JW, Kawahara AY. 2016. Evidence for common horizontal transmission of Wolbachia among butterflies and moths. BMC Evol Biol, 16: 118. doi.org/10.1186/s12862-016-0660-x.

      Caspi-Fluger A, Inbar M, Mozes-Daube N, Katzir N, Portnoy V, Belausov E, Hunter MS, Zchori-Fein E. 2012. Horizontal transmission of the insect symbiont Rickettsia is plant-mediated. Proc Biol Sci, 279(1734): 1791-6.

      Chiel E, Kelly SE, Harris AM, Gebiola M, Li X, Zchori-Fein E, Hunter MS. 2014. Characteristics, phenotype, and transmission of Wolbachia in the sweet potato whitefly, Bemisia tabaci (Hemiptera: Aleyrodidae), and its parasitoid Eretmocerus sp. nr. emiratus (Hymenoptera: Aphelinidae). Environ Entomol, 43(2): 353-62.

      Chrostek E, Pelz-Stelinski K, Hurst GDD, Hughes GL. 2017. Horizontal transmission of intracellular insect symbionts via plants. Front Microbiol, 8: 2237.

      Li SJ, Ahmed MZ, Lv N, Shi PQ, Wang XM, Huang JL, Qiu BL. 2017. Plant-mediated horizontal transmission of Wolbachia between whiteflies. ISME J, 11: 1019-1028.

      Li YH, Ahmed MZ, Li SJ, Lv N, Shi PQ, Chen XS, Qiu BL. 2017. Plant-mediated horizontal transmission of Rickettsia endosymbiont between different whitefly species. FEMS Microbiol Ecol, 93(12). doi: 10.1093/femsec/fix138.

      Liu Y, He ZQ, Wen Q, Peng J, Zhou YT, Mandour N, McKenzie CL, Ahmed MZ, Qiu BL. 2023. Parasitoid-mediated horizontal transmission of Rickettsia between whiteflies. Front Cell Infect Microbiol, 12: 1077494. DOI: 10.3389/fcimb.2022.1077494

      Qi LD, Sun JT, Hong XY, Li YX. 2019. Diversity and phylogenetic analyses reveal horizontal transmission of endosymbionts between whiteflies and their parasitoids. J Econ Entomol, 112: 894-905.

      Shi PQ, Wang L, Chen XY, Wang K, Wu QJ, Turlings TCJ, Zhang PJ, Qiu BL. 2024. Rickettsia transmission from whitefly to plants benefits herbivore insects but is detrimental to fungal and viral pathogens. mBio, 15(3): e0244823.

      Weaknesses:

      In the current study, the authors downloaded the MLST or wsp genes from a public database and analyzed the data using other methods. I think the authors may not be familiar with the research progress in the field of insect symbiont transmission, and the manuscript at its current stage lacks sufficient novelty.

      We appreciate your critical perspective on our study. However, we respectfully disagree with the viewpoint that our manuscript lacks sufficient novelty.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      i) "Enhancers dependent on TPR during senescence are enriched for binding sites of inflammatory transcription factors". Proximity to genes does not confirm an enhancer role for that gene, although Tasdemir et al., 2016 suggested this. At that time, HI-C and Hi-CHiP techniques were not well-established. Nowadays, without combining HI-C and H3K27ac ChIP, Hi-ChIP alone cannot definitively identify actual enhancer regions. If we repeatedly use the Tasdemir et al., 2016 map, we risk incorrect mapping of enhancers of SASP. The authors should either use other public Hi-C databases to map the enhancer of SASP or temper their conclusions about enhancers. Otherwise, this could set a precedent for the SASP enhancer region that might not be entirely accurate.

      The enhancer mapping for SASP is outdated, as advancements in Hi-C have significantly developed this area. Therefore, the claimed enhancers of SASP may not be accurate.

      Response: We agree with the reviewer that enhancers are not easy to define, or to pair with their target gene(s). Indeed, we would argue that even combined HI-C and H3K27ac does not define enhancers or enhancer-gene pairs and that the gold-standard evidence for an enhancer is genetics – does its deletion/mutation abrogate gene activation. We would also point out that we did not actually use the Tasdemir data to call enhancers. In response to the reviewer’s comment, we will temper our terminology and now refer to our inter- and intra-genic ATAC-seq peaks only as “putative enhancers”.

      ii) “Many of these include putative enhancers located close to key SASP genes, such as IL1B and IL8 (Figure 1D).” I have the same concern as mentioned above (i). However, I am interested in knowing the other key SASP genes where DNA is accessible near the genes. A supplementary table listing key SASP genes along with their distances to the TSS and affected by TPR knock-down would be helpful.

      Response: We thank the reviewer for this suggestion. We will provide tables listing the TPR-dependent, senescence-specific ATAC-seq peaks that are close to genes associated with the ‘positive regulation of inflammatory response’, ‘cytokine activity’ and ‘cytokine receptor binding’ gene ontology terms, which were significant in the GREAT analysis and which include many SASP genes. We will also provide the distances of these regions from the associated genes.

      iii) "As we previously reported, knockdown of TPR (siTPR) in RAS cells blocks SAHF formation, but it also results in reduced nuclear localisation (decreased nucleocytoplasmic ratio) of NF-κB, consistent with decreased NF-κB activation (Figure 2A and B, Figure S2A)." TPR is required for CCF, SASP, and SAHF. The relationship between CCF and SASP is well established, but the relationship between SAHF and CCF/SASP remains elusive. Both SAHF and CCF are enriched with heterochromatin markers, suggesting that CCF might originate from SAHF. However, this has not been confirmed. Do the authors think that SAHF is a prerequisite for CCF in the OIS model, or is it an independent event?

      Response: We agree with the reviewer that CCFs likely originate from SAHF. Whilst we cannot definitively prove this in our ER-Ras OIS model, in the revised manuscript we intend to further investigate the relationship between SAHF and CCF by knocking down HMGA1 during RAS-induced senescence. Like TPR, HMGA1 depletion is known to lead to loss of SAHF (Narita et al., Cell, 2006) but, unlike TPR, HMGA1 is a chromatin protein enriched on heterochromatin itself. We will assess whether loss of HMGA1 also abrogates CCF formation.

      iv) The authors suggested that "it is plausible that the decrease in CCFs produced during the early phases of OIS upon TPR knockdown may be caused by an increase in the stability of the nuclear periphery due to the heterochromatin that remains there when SAHF are not formed." I do not completely agree with this explanation because CCF starts forming at day 3-4 but culminates at later time points. According to Figure 5A, only 5-6% of cells are positive for CCFs on day 5. What happens on day 8? By day 8, the percentage of CCF-positive cells could be 20-25%, or the number of CCFs per cell might be 0.2-0.3. If TPR is not required for CCF formation at this stage, then linking CCF to SASP at day 8 becomes critical. This suggests that another mechanism might be driving SASP expression and that TPR could be regulating downstream signaling of CCF. It is possible that changes in nuclear pore density affect the localization of cGAS from the nucleus to the cytoplasm.

      Response: In our hands and using this IMR90 ER-RAS system, CCF formation decreases later in senescence (d8 - only 2% of cells) hence our focus on early timepoints after oncogenic RAS activation. At later timepoints, cGAS activation is also mediated by retrotransposons (de Cecco et al., Nature, 2019; Liu et al., Cell, 2023), as well as leakage of mitochondrial DNA (Victorelli et al., Nature, 2023; Chen et al., Nat. Comms, 2024), and so it is difficult to disentangle the net contribution of these three inputs.

      v) Additionally, the authors did not address what happens in the later stages of CCF formation in the absence of TPR. If TPR is not required for CCF formation at later stages, it fails to explain the downstream processes at these time points adequately. This suggests that TPR may also have another mechanism of SASP regulation independent of CCF formation.

      Response: In our cellular system CCFs precede the SASP - CCFs are already present at day 3 but SASP factors are not secreted until day 5. However, CCFs are not necessarily required for maintenance of the SASP. Once initiated the SASP is maintained by cytokine feedback loops.

      …………

      Reviewer #2:

      1. The claim that TPR knockdown does not affect NFkappaB nuclear translocation indeed stands, but it would be nice if the authors also compared data across conditions in Fig. 2F, i.e. siCTRL+Ras CM versus siTPR+Ras CM in RAS cells and provided a p-value as it seems to me that there is some dampening of translocation intensity, which is clearly not the case for STOP cells. The authors focus on this for d3 and d5, but it seems to be also the case for later time points.

      Response: As basal NF-κB translocation is lower in RAS cells on TPR knockdown, we would expect a dampening in NF-κB translocation between siCTRL+RAS CM and siTPR+Ras CM regardless of whether there is a transportation defect. Consistent with this, the p-value for this comparison is significant, but we did not show it because it is not important in considering whether NF-κB nuclear translocation is impeded by TPR knockdown, which is the focus here. We will add a table with median nuclear:cytoplasmic NF-κB ratios and 95% confidence intervals to make the changes in basal level (treatment with STOP CM) clearer.
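      One way such a table of medians and 95% confidence intervals could be produced is sketched here. This is only an illustration: the response does not state how the intervals will be computed, so the percentile bootstrap, the function name `bootstrap_median_ci`, and the example ratios are all assumptions rather than the authors' method or data.

```python
import random
import statistics


def bootstrap_median_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Return (median, lower, upper) using a percentile bootstrap.

    Assumption: a percentile bootstrap is one reasonable choice for a
    confidence interval on a median; the source does not name a method.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = medians[int((alpha / 2) * n_boot)]
    upper = medians[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.median(values), lower, upper


# Hypothetical per-cell nuclear:cytoplasmic ratios (made-up numbers).
example_ratios = [1.8, 2.1, 2.4, 1.9, 2.6, 2.2, 2.0, 2.3, 2.5, 1.7, 2.15, 2.35]
median, lower, upper = bootstrap_median_ci(example_ratios)
print(f"median={median:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

      Reporting the interval alongside the median makes shifts in the basal level visible without relying on a single p-value.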

      Also, a comment based on literature or from the authors previous work on TPR, on the extent to which the structural integrity of the nuclear basket is at all affected upon TPR depletion would be helpful for data interpretation.

      Response: In the revised manuscript we will refer to the literature showing that TPR is the final component added to the nuclear pore and that its absence does not affect localisation of NUP153 to the nuclear basket (Hase and Cordes., Mol. Biol. Cell 2003; Aksenova et al., Nat Comms, 2020).

      Magnification of representative cells per each condition in Fig. 2E would be welcome.

      Response: We will provide a revised figure 2E with the magnifications as requested.

      Regarding the data in Figs 3 and S3: I am a bit confused about how the obviously decreased NFkappaB nuclear signal (e.g., in Fig. 3D) does not translate into a skewed N/C ratio (e.g., in Fig. 3C)? The western blots indicate that overall NFkappaB levels remain essentially unchanged? Am I missing something?

      Response: As stated in the Methods section, we used a 50-pixel expansion of the detected nuclear area as our cytoplasmic area in the analysis (see image below). This was because we found detecting and segmenting the whole cytoplasmic area in the NF-κB channel to be unreliable. At day 3 and 5, the decrease in NF-κB nuclear signal in RAS cells on TPR knockdown was accompanied by a decrease in signal in the portion of the cytoplasm closest to the nucleus. This led to no change in the nuclear:cytoplasmic ratio. We believe the redistribution of NF-κB closer to the nucleus in the RAS siCTRL sample indicates early activation and will make this clearer in the revised text. We will also quantify the NF-κB western blots (see point 5) to help clarify this issue.
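      The ring-based measurement described in this response can be illustrated with a minimal sketch. It assumes a boolean nuclear mask and a single-channel intensity image as NumPy arrays; the helper names `dilate` and `nuclear_cytoplasmic_ratio`, the 4-connected dilation, and the synthetic image are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np


def dilate(mask: np.ndarray, pixels: int) -> np.ndarray:
    """Grow a boolean mask by `pixels` steps (4-connected neighbours)."""
    out = np.asarray(mask, dtype=bool).copy()
    for _ in range(pixels):
        grown = out.copy()
        grown[1:, :] |= out[:-1, :]   # spread down
        grown[:-1, :] |= out[1:, :]   # spread up
        grown[:, 1:] |= out[:, :-1]   # spread right
        grown[:, :-1] |= out[:, 1:]   # spread left
        out = grown
    return out


def nuclear_cytoplasmic_ratio(intensity, nuclear_mask, ring_width=50):
    """Mean nuclear signal divided by mean signal in a perinuclear ring.

    The ring is the expanded nuclear mask minus the nucleus itself and
    stands in for the cytoplasm, as in the response above.
    """
    nuclear_mask = np.asarray(nuclear_mask, dtype=bool)
    ring = dilate(nuclear_mask, ring_width) & ~nuclear_mask
    return intensity[nuclear_mask].mean() / intensity[ring].mean()


# Synthetic example (made-up values): a bright square "nucleus" on a
# dimmer uniform background.
image = np.full((200, 200), 2.0)
nucleus = np.zeros((200, 200), dtype=bool)
nucleus[90:110, 90:110] = True
image[nucleus] = 4.0

# If nuclear and perinuclear signal drop by the same factor, the ratio
# is unchanged even though the nuclear signal is visibly lower.
dimmed = image * 0.5
print(nuclear_cytoplasmic_ratio(image, nucleus))
print(nuclear_cytoplasmic_ratio(dimmed, nucleus))
```

      Because the denominator samples only the cytoplasm nearest the nucleus, a parallel drop in both compartments cancels out in the ratio, which is the situation the response describes for days 3 and 5.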

      Also, along these lines, d8 western blots seem to portray an overall drop in NFkappaB levels. Is this indeed so? Can the authors maybe quantify their blots' replicates and provide a box plot and statistical testing?

      Response: We will provide quantification for the NF-κB western blots, though box plots would not be appropriate as we only have two replicates.

      Regarding the ATAC-seq data from d3, I think it could be mined a bit more. For example, compare to d8 (which the authors have apparently done, but don't present in detail) and discuss which are these early regions that also become accessible by d3 and what kind of genes and motifs are associated with them. Moreover, the focus in Fig. S3E is on ATAC sites shared with d8; how about d3-specific ones? How many of these are there (if any) and how might they be affected?

      Response: As shown in Table S2, TPR knockdown did not cause any changes in chromatin accessibility at day 3, so there are no day 3-specific, TPR-dependent peaks. We will edit the text to make this clearer. We will carry out motif analysis and GREAT analysis on the day 3 peaks that become accessible in RAS cells but are not accessible in STOP (RAS-specific peaks).

      I trust that the authors quantified their STING blots for the conclusions they present, but since it is difficult to assess these confidently by eye, again, some quantification plots would be welcome in Figs 4C,D and S4D,E.

      Response: We will provide quantification for the STING western blots.

      As controls for Fig. 5, it would be interesting to see if active histone readouts also mark CCFs in this system.

      Response: Ivanov et al., J. Cell Biol., 2013 showed the absence of H3K9 acetylation from chromatin in CCFs. Further exploration of the types of chromatin/sequences in CCFs is outside the scope of our current manuscript.

      The POM121 channel in Fig. 5C appears to have some small signal foci in the cytoplasm; could these be small CCFs? More generally, the authors focus on these large blobs that only appear in

      Response: The small signal foci the reviewer is highlighting are background from the POM121 antibody staining rather than CCFs – they do not show DAPI staining, and similar foci are evident in non-senescent cells where CCFs are generally not present. Our unpublished data (see response to Reviewer 1, point iv) from day 8 cells shows that only ~2% of senescent cells are positive for CCF regardless of TPR knockdown, which is a similar number to that observed in non-senescent cells at earlier timepoints. Thus, in our hands CCF formation occurs earlier, triggering the SASP, rather than at day 8 when the SASP is already established and reinforced through positive feedback cytokine signalling.

      I wonder if there is a simple experiment the authors could do to test if this mechanism is only linked to senescence, specifically oncogene-induced senescence? I don't think this is needed to support the conclusions drawn here, but it could significantly broaden the scope of their discovery if, for example, this was true in other senescence models or during proinflammatory activation in general.

      Response: These are interesting suggestions, but setting up, characterising and quantifying other senescence models will take a substantial amount of time that would be outside the scope of our current manuscript.

      ………….

      Reviewer #3

      1. The study uses a single cell strain IMR90 undergoing a single form of senescence, induced by activated Ras. To show the generalizability of the finding, the authors are advised to inhibit TPR in other forms of senescence in addition to IMR90. For example, IR or etoposide induces greater amount of CCF than in OIS of IMR90. BJ, MEFs, and ARPE-19 senescence also show prominent CCF.

      Response: These are interesting suggestions, but as we responded to reviewer 2, setting up, characterising and quantifying other senescence models will take a substantial amount of time that would be outside the scope of our current manuscript.

      To convincingly show the CCF pathway is involved, the authors need to measure the activity of the cGAS-STING pathway. Including cGAMP ELISA will be informative.

      Response: We thank the reviewer for this suggestion, and we will try to include this assay in our revised manuscript.

      The authors used conditioned media to show that TPR KD does not directly affect NFkB nuclear translocation. While this is helpful, conditions other than senescence will be more direct. For example, TNFa treatment or poly I:C transfection induces efficient NFkB nuclear translocation in IMR90 cells.

      Response: This experiment (Fig. 2E, F) was designed simply to show that knocking down TPR does not impair the ability of activated NFkB to enter the nucleus; it is not about senescence per se. Indeed, this is why we included the addition of SASP (RAS) conditioned media to non-senescent STOP cells in Fig. 2. We do not think investigating other methods of activating NFkB would add more to the question of whether TPR loss abrogates NFkB nuclear import.

      Fig. 4C and Fig. S4D are identical.

      Response: Though these STING immunoblots look similar, they are in fact not identical. Below we attach the raw original image in which both biological replicates (Fig 4C and S4D) for Day 3 were run on the same gel as proof of this claim.

      Figure legend for Fig. S4F is mislabeled.

      Response: We will correct this.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      DNA damage triggers senescence, inducing chromatin reorganization and SASP activation. The authors previously demonstrated that the TPR nucleoprotein at nuclear pores is crucial for both SAHF formation and SASP activation during senescence. Here they also showed that TPR is required for the formation of cytoplasmic chromatin fragments (CCF), which activate cGAS-STING-TBK1-NF-kB signaling to express SASP. While the mechanistic regulation of CCF formation by TPR remains unclear, their study provides compelling evidence of downstream processes involving CCF. This study offers new insights into CCF formation, suggesting a promising direction for further research. I endorse the manuscript; however, there are several concerns that need addressing before acceptance.

      i) "Enhancers dependent on TPR during senescence are enriched for binding sites of inflammatory transcription factors".

      Proximity to genes does not confirm an enhancer role for that gene, although Tasdemir et al., 2016 suggested this. At that time, HI-C and Hi-CHiP techniques were not well-established. Nowadays, without combining HI-C and H3K27ac ChIP, Hi-ChIP alone cannot definitively identify actual enhancer regions. If we repeatedly use the Tasdemir et al., 2016 map, we risk incorrect mapping of enhancers of SASP. The authors should either use other public Hi-C databases to map the enhancer of SASP or temper their conclusions about enhancers. Otherwise, this could set a precedent for the SASP enhancer region that might not be entirely accurate.

      ii) Many of these include putative enhancers located close to key SASP genes, such as IL1B and IL8 (Figure 1D).

      I have the same concern as mentioned earlier about enhancers. However, I am interested in knowing the other key SASP genes where DNA is accessible near the genes. A supplementary table listing key SASP genes along with their distances to the TSS and affected by TPR knock-down would be helpful.

      iii) "As we previously reported, knockdown of TPR (siTPR) in RAS cells blocks SAHF formation, but it also results in reduced nuclear localisation (decreased nucleocytoplasmic ratio) of NF-κB, consistent with decreased NF-κB activation (Figure 2A and B, Figure S2A)." TPR is required for CCF, SASP, and SAHF. The relationship between CCF and SASP is well established, but the relationship between SAHF and CCF/SASP remains elusive. Both SAHF and CCF are enriched with heterochromatin markers, suggesting that CCF might originate from SAHF. However, this has not been confirmed. Do the authors think that SAHF is a prerequisite for CCF in the OIS model, or is it an independent event?

      iv) The authors suggested that "it is plausible that the decrease in CCFs produced during the early phases of OIS upon TPR knockdown may be caused by an increase in the stability of the nuclear periphery due to the heterochromatin that remains there when SAHF are not formed." I do not completely agree with this explanation because CCF starts forming at day 3-4 but culminates at later time points. According to Figure 5A, only 5-6% of cells are positive for CCFs on day 5. What happens on day 8? By day 8, the percentage of CCF-positive cells could be 20-25%, or the number of CCFs per cell might be 0.2-0.3. If TPR is not required for CCF formation at this stage, then linking CCF to SASP at day 8 becomes critical. This suggests that another mechanism might be driving SASP expression and that TPR could be regulating downstream signaling of CCF. It is possible that changes in nuclear pore density affect the localization of cGAS from the nucleus to the cytoplasm.

      Significance

      The authors previously demonstrated that the TPR nucleoprotein at nuclear pores is crucial for both SAHF formation and SASP activation during senescence. Here they also showed that TPR is required for the formation of cytoplasmic chromatin fragments (CCF), which activate cGAS-STING-TBK1-NF-kB signaling to express SASP. While the mechanistic regulation of CCF formation by TPR remains unclear, their study provides compelling evidence of downstream processes involving CCF. This study offers new insights into CCF formation, suggesting a promising direction for further research.

      However, there are some limitations to this study. The enhancer mapping for SASP is outdated, as advancements in Hi-C have significantly developed this area. Therefore, the claimed enhancers of SASP may not be accurate. Additionally, the authors did not address what happens in the later stages of CCF formation in the absence of TPR. If TPR is not required for CCF formation at later stages, it fails to explain the downstream processes at these time points adequately. This suggests that TPR may also have another mechanism of SASP regulation independent of CCF formation.

    1. we used our words we used what words we had to weld, what words we had we wielded, kneeled, we knelt.

      I think the opening lines of the poem are the first of many examples of Choi employing parallelism in this poem. I think the repetitive nature of this parallelism may be a commentary on how society pushes us to fit into a mold, both in our daily routines and in our identities.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      (1) The manuscript by Lu et al aims to study the effects of tubulin post-translational modification in C. elegans touch receptor neurons. Authors use gene editing to engineer various predicted PTM mutations in a-tubulin MEC-12 and b-tubulin MEC-7. Authors generate and analyze an impressive battery of mutants in predicted phosphorylation site and acetylation site of b-tubulin MEC-7, K40 acetylation site in a-tubulin MEC-12, enzymatic site of the a-tubulin acetyltransferase MEC-17, and PTM sites in the MEC-12 and MEC-7 C-tails (glutamylation, detyrosination, delta-tubulin). This represents a lot of work, and will appeal to a readership interested in C. elegans touch receptor neurons. The major concern/criticism of this manuscript is whether the introduced mutation(s) directly affects a specific PTM or whether the mutation affects gene expression, protein expression/stability/localization, etc. As such, this work does convincingly demonstrate, as stated in the title, that "Editing of endogenous tubulins reveals varying effects of tubulin posttranslational modifications on axonal growth and regeneration." 

      We thank the reviewer for the constructive comments. With regard to the major concern or criticism, we would like to point out that we have previously characterized ~100 missense mutations in mec-7 and mec-12 (Zheng et al., 2017, PMID: 28835377; Lee et al., 2021, PMID: 33378215). So, we are familiar with the phenotypes associated with mutations that affect gene expression or protein stability, which mostly result in a null phenotype. When analyzing the PTM site mutants, we compared their phenotypes with the previously categorized phenotypes of null alleles, neomorphic mutations that increase microtubule stability, and antimorphic mutations that prevent polymerization or disrupt microtubule stability. For example, in the case of mec-7 S172 mutations, we found that S172P mutants had the same phenotype as the mec-7 knockout (mild neurite growth defects), suggesting that S172P likely affects protein folding or stability, resulting in the loss of MEC-7. In contrast, S172A and S172E mutations showed phenotypes similar to neomorphic alleles (the emergence of ectopic ALM posterior neurite) and antimorphic alleles (the severe shortening of all neurites in the TRNs), respectively. These phenotypic differences suggested to us that the effects of S172A and S172E mutations cannot be simply attributed to the loss of protein expression and stability. Similar logic was applied to the studies of other PTM-inactivating or -mimicking mutations.

      (2) For example, the authors manipulate the C-terminal tail of MEC-12 and MEC-7, to test the idea that polyglutamylation may be an important PTM. These mutants displayed subtle phenotypes. The authors show that branch point GT335 and polyglutamylation polyE recognizing antibodies stain cultured embryonic touch receptor neurons (TRNs), but did not examine staining in C. elegans TRNs in situ. To my knowledge, these antibodies have not been shown to stain the TRNs in any published papers, raising the question of how these "glutamylation" mutations are affecting mec-12 and -7. The rationale for using cultured embryonic TRNs and the relevance of the data and its interpretation are not clear.

      The GT335 and polyE antibodies were used by previous studies (O’Hagan et al., 2011, PMID: 21982591; and O’Hagan et al., 2017, PMID: 29129530) to detect the polyglutamylation signals in the sensory cilia of C. elegans. We initially tried to stain the whole animals using these antibodies but could not get clear and distinct signals in the TRNs. We reason that the tubulin polyglutamylation signals in the TRNs may be weak, and the in situ staining method, which requires the antibodies to penetrate multiple layers of tissues (e.g., cuticles and epidermis) to reach the TRN axons, may not be sensitive enough to detect the signal. In fact, the TRN axons are located deeper in the worm body compared to the sensory cilia that are mostly exposed to the environment. Another reason could be that the tissues (mostly epidermis) surrounding the TRN axons also have polyglutamylation staining, which makes it difficult to recognize TRN axons. This is a situation different from the anti-K40 acetylation staining, which only occurs in the TRNs because MEC-12 is the only a-tubulin isotype that carries K40. Due to these technical difficulties, we decided to use the in vitro cultured TRNs for the staining experiment, which allows both easy access of the antibodies (thus higher sensitivity) and the dissociation of the TRNs from other tissues. The fact that we were able to observe reduced staining in the ttll mutants and the tubulin mutants that lost the glutamate residues suggests that these antibodies indeed detected glutamylation signals in the cells.

      (3) The final paragraph of the discussion is factually incorrect. The C. elegans homologs of the CCP carboxypeptidases are called CCPP-1 and CCPP-6. There are several publications on their functions in C. elegans.

      We thank the reviewer for pointing out the mistake in the text. We intended to say that “there is no C. elegans homolog of the known tubulin carboxypeptidases that catalyze detyrosination”, which is true given that the detyrosinase vasohibins (VASH1/VASH2) homologs cannot be found in C. elegans. We are aware of the publications on CCPP-1 and CCPP-6; CCPP-1 is known to regulate tubulin deglutamylation in the cilia of C. elegans (O’Hagan et al., 2011 and 2017), while CCPP-6 may function in the PLM to regulate axonal regeneration (Ghosh-Roy et al., 2012). In the revised manuscript, we have corrected the error.

      Reviewer #2 (Public Review):

      Summary:

      The tubulin subunits that make up microtubules can be posttranslationally modified and these PTMs are proposed to regulate microtubule dynamics and the proteins that can interact with microtubules in many contexts. However, most studies investigating the roles of tubulin PTMs have been conducted in vitro either with purified components or in cultured cells. Lu et al. use CRISPR/Cas9 genome editing to mutate tubulin genes in C. elegans, testing the role of specific tubulin residues on neuronal development. This study is a real tour de force, tackling multiple proposed tubulin modifications and following the resulting phenotypes with respect to neurite outgrowth in vivo. There is a ton of data that experts in the field will likely reference for years to come as this is one of the most comprehensive in vivo analyses of tubulin PTMs.

      This paper will be very important to the field, however would be strengthened if: 1) the authors demonstrated that the mutations they introduced had the intended consequences on microtubule PTMs, 2) the authors explored how the various tubulin mutations directly affect microtubules, and 3) the findings are made generally more accessible to non C. elegans neurobiologists.

      (1) The authors introduce several mutations to perturb tubulin PTMs. However, it is unclear to what extent the engineered mutations affect tubulin in the intended way, i.e., are the authors sure that the PTMs they want to perturb are actually present in C. elegans? Many of the antibodies used did not appear to be specific and antibody staining was not always impacted in the mutant cases as expected. For example, is there any evidence that S172 is phosphorylated in C. elegans, e.g. from available phosphoproteomic data? Given the significant amount of staining left in the S172A mutant, the antibody seems non-specific in this context and therefore not a reliable readout of whether MTs are actually phosphorylated at this residue. As another example, there is no evidence presented that K252 is acetylated in C. elegans. At the very least, the authors should consider demonstrating the conservation of these residues and the surrounding residues with other organisms where studies have demonstrated PTMs exist.

      We thank the reviewer for the comments. To our knowledge, there are very few phosphoproteome data available for C. elegans. We searched a previously published dataset (Zielinska et al., 2009; PMID: 19530675) and did not find the S172 phosphorylation signal in MEC-7. This is not surprising, given that only six touch receptor neurons express MEC-7 and the abundance of MEC-7 in the whole animal lysate may be below the detection limit. However, this phosphorylation site S172 is highly conserved across species and tubulin isotypes (Figure 1-figure supplement 1 in the revised manuscript), suggesting that this site is likely phosphorylated in MEC-7.

      In the case of K252, the potential acetylation site and the flanking sequences are extremely conserved across species and isotypes. In fact, the 20 amino acids from 241-260 a.a. are identical among the tubulin genes of C. elegans, fruit flies, Xenopus, and humans (Figure 4-figure supplement 1B). Thus, although K252 acetylation was found in HeLa cells, this site can possibly be acetylated in C. elegans as well.

      In the case of K40, we observed sequence divergence at the PTM site and adjacent sequences among the tubulin isotypes in C. elegans. MEC-12 is the only C. elegans a-tubulin isotype that has the K40 residue, and the 40-50 a.a. region of MEC-12 appears to be more conserved than other isotypes when compared to Drosophila, frog, and human a-tubulins (Figure 4-figure supplement 1A).

      (2) Given that the authors have the mutants in hand, it would be incredibly valuable to assess the impact of these mutations on microtubules directly in all cases. MT phenotypes are inferred from neurite outgrowth phenotypes in several cases, the authors should look directly at microtubules and/or microtubule dynamics via EBP-2 when possible OR show evidence that the only way to derive the neurite phenotypes shown is through the inferred microtubule phenotypes. For example, the effect of the acetylation or detyrosination mutants on MTs was not assessed. 

      We thank the reviewer for the suggestions. In this study, we created >20 tubulin mutants. Due to limited time and resources, we were not able to examine microtubule dynamics in every mutant strain using EBP-2 kymographs. We assessed the effects of the tubulin mutations mostly based on the changes in neurite growth patterns. From our previous experience of analyzing ~100 mec-7 and mec-12 missense mutations (Zheng et al., 2017, MBoC; Lee et al., 2021, MBoC), we found that the changes in microtubule dynamics are correlated with the changes in neuronal morphologies. For example, the growth of ectopic ALM-PN is correlated with fewer EBP-2 comets and potentially reduced microtubule dynamics; this correlation holds true for several mec-7 neomorphic missense alleles we examined before (Lee et al., 2021, MBoC) and the PTM site mutants [e.g., mec-7(S172A) and mec-12(4Es-A)] analyzed in this study. Similarly, the shortening of TRN neurites is correlated with more EBP-2 comets and increased microtubule dynamics. For the mutants that don’t show neurite growth defects, our previous experience is that they are not likely to show altered microtubule dynamics in EBP-2 tracking experiments. So, we did not analyze the acetylation mutants (which had no defects in neurite growth) and the detyrosination mutants (which had a weak ALM-PN phenotype). Nevertheless, we agree with the reviewer that we could not rule out the possibility that there may be some slight changes to microtubule dynamics in these mutants.

      Using tannic acid staining and electron microscopy (EM), we previously examined the microtubule structure in several tubulin missense mutants (Zheng et al., 2017, MBoC) and found that the loss-of-function and antimorphic mutations significantly reduced the number of microtubules and altered microtubule organization by reducing protofilament numbers. These structural changes are consistent with highly unstable microtubules and defects in neurite growth. On the other hand, neomorphic mutants had only a slight decrease in microtubule abundance, maintained the 15-protofilament structure, and had more tightly packed microtubule bundles that filled up most of the space in the TRN neurite (Zheng et al., 2017, MBoC). These structural features are consistent with increased microtubule stability and ectopic neurite growth. Although we did not directly examine the microtubule abundance and structure using EM in this study, we would expect similar changes that are correlated with the neurite growth phenotypes in the PTM mutants. We agree with the reviewer that it would be informative to conduct a more comprehensive analysis of these mutants using EM and other structural biology methods.

      (3) There is a ton of data here that will be important for experts working in this field to dig into, however, for the more general cell biologist, some of the data are quite inaccessible. More cartoons and better labeling will be helpful as will consistent comparisons to control worms in each experiment.

      Response: We thank the reviewer for the comment. In the revised manuscript, we added some cartoons to Figure 2G to show the location of the synaptic vesicles. The neurite growth phenotype should be quite straightforward. Nevertheless, we added one more Figure (Figure 8) to summarize all the results in the study with cartoons that depicted the changes to neuronal morphologies.

      (4) In addition, I am left unconvinced of the negative data demonstrating that MBK does not phosphorylate tubulin. First, the data described in lines 207-211 does not appear to be presented anywhere. Second, RNAi is notoriously finicky in neurons, thus necessitating tissue-specific degradation using either the ZF/ZIF-1 or AID/TIR1 systems which both work extremely well in C. elegans. Third, there appears to be increasing S172 phosphorylation in Figure 3 Supplement 2 with added MBK-2, but there is no anti-tubulin blot to show equal loading, so this experiment is hard to interpret.

      We added the results of mbk-1, mbk-2, and hpk-1 mutants and cell-specific knockdown of MBK-2 into Figure 3-figure supplement 1D. Considering the reviewer’s suggestion, we attempted to use a ZIF-1 system to remove the MBK-2 proteins specifically in the TRNs using a previously published method (PMID: 28619826). We fused endogenous MBK-2 with GFP by gene editing and then expressed an anti-GFP nanobody fused with ZIF-1 in the TRNs to induce the degradation of MBK-2::GFP. To our surprise, unlike the mbk-2p::GFP transcriptional reporter, the MBK-2::GFP did not show detectable expression in the TRNs, although expression can be seen in early embryos, which is consistent with the “embryonic lethal” phenotype of the mbk-2(-) mutants (Figure 3-figure supplement 2A-B in the revised manuscript). We reason that endogenous MBK-2 is either not expressed in the TRNs or is expressed at a very low level. We then crossed mbk-2::GFP with ItSi953 [mec-18p::vhhGFP4::Zif-1] to trigger the degradation of any potential MBK-2 proteins and did not observe the ectopic growth of ALM-PN (Figure 3-figure supplement 2C). These results suggest that MBK-2 is not likely to regulate tubulin phosphorylation in the TRNs, which is consistent with the results of other genetic mutants and the RNAi experiments.

      For Figure 3 Supplement 2 (Figure 3-figure supplement 3 in the revised manuscript), because we added the same amount of purified MEC-12/MEC-7 to all reactions and had established equal loading in Figure 3E, we did not do the anti-tubulin staining in this experiment. Since a higher concentration (1742 nM) of MBK-2 did not produce a stronger signal than the condition with 1268 nM, we don’t think the 1268 nM band represents true phosphorylation. Moreover, the signal is not significantly stronger than the control without MBK-2 and is much lower than the signal generated by CDK1 in Figure 3E. Based on these results, we concluded that MBK-2 is not likely to phosphorylate MEC-7.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      General:

      A summary table would help the reader digest the vast amount of phenotypic data.

      Cartoons to help a non-C. elegans reader understand the figures. 

      We added Figure 8 to summarize and illustrate the effects of the various mutants analyzed in this study.

      Specific:

      The authors engineered mutations into the predicted phosphorylation site of b-tubulin mec-7. These CRISPR-alleles mutations phenocopied previously identified loss-of-function, gain-of-function, and neomorphic mec-7 alleles identified in genetic screens by the Chalfie lab. Next, the authors sought to identify the responsible kinase, taking a candidate gene approach. The most likely family - minibrain - had no effect when knocked down/out. The authors showed that cdk-1 mutants displayed ectopic ALM-PN outgrowth. Whether cdk-1 specifically acts in the TRNs was not demonstrated, calling into question whether CDK-1 phosphorylates S172 in vivo. In their introduction (lines 45-59), the authors built a case for engineering PTM mutations directly into tubulins, because the PTM enzymes may have multiple substrates. This logic applies to the cdk-1 experiment and its interpretation. 

      The reviewer is right. Since CDK1 and minibrain kinase are the only known kinases that catalyze S172 phosphorylation, our results suggest that CDK-1 is more likely to catalyze S172 phosphorylation in the TRNs compared to MBK-1/2. Genetic studies found that cdk-1(-); mec-7(S172A) double mutants did not show a stronger phenotype than the two single mutants, suggesting that they function in the same pathway. Nevertheless, we could not rule out the possibility that other kinases may also control S172 phosphorylation and that the effect of CDK-1 is indirect. We mentioned this possibility in the revised manuscript.

      For a-tubulin MEC-12, acetyl-mimicking K40Q and unmodifiable K40R mutants failed to stain with the anti-acetyl-a-tubulin (K40) antibody and displayed subtle TRN phenotypes. The enzymatically dead MEC-17 had phenotypes similar to those described by Topalidou (2012), confirming the Chalfie lab finding that MEC-17 has functions in addition and independent of its acetyltransferase activity. The authors moved onto a predicted acetylation site in MEC-7 and observed TRN developmental defects, and acknowledged that this may be due to tubulin instability and not a PTM. This is a concern for all mutants, as there is no way to measure whether the protein is expressed, stable, or localized properly. 

      We acknowledge that this is a caveat of mutational studies. An amino acid substitution at the PTM site may have multiple effects, including the change of the PTM state and potential alteration of protein conformation. Without direct evidence for enzymatic modification of the PTM site in the neurons, we could not rule out the possibility that the phenotype we observed is not related to PTM and instead is the result of abnormal protein conformation and function caused by the mutation.

      Nevertheless, as stated in our above response to the first point in the public review, we can phenotypically differentiate loss-of-function and gain-of-function mutants. If the mutation reduces expression or general protein stability, it is more likely to cause a loss-of-function phenotype. For most PTM site mutants, this is not the case. We observed mostly gain-of-function phenotypes, suggesting that the missense mutations did not simply inactivate the tubulin protein and instead affected the functional properties of the protein.

      From here, the authors manipulate the C-terminal tail of MEC-12 and MEC-7, testing the idea that polyglutamylation may be an important PTM. These mutants displayed subtle phenotypes. The authors show that branch point GT335 and polyglutamylation polyE recognizing antibodies stain cultured embryonic TRNs, but did not examine staining in TRNs. To my knowledge, these antibodies have not been shown to stain the TRNs in any published papers (see next point). The rationale for using cultured embryonic TRNs is not clear.

      See our response to the second point in the public review.

      Lines 548-553 There are several publications on CCPP-1 and CCPP-6 functions in TRNs and ciliated sensory neurons. See

      PMID: 20519502

      PMID: 21982591

      PMID: 21943602

      PMID: 23000142

      PMID: 29129530

      PMID: 33064774

      PMID: 36285326

      PMID: 37287505 

      We thank the reviewer for pointing out these references, some of which were cited in the revised manuscript. We made a mistake in the Discussion by saying that there are no C. elegans homologs of tubulin carboxypeptidases while we intended to state that there is no homolog of tubulin detyrosinase in C. elegans. We are aware of the studies of CCPP-1 and CCPP-6 and have corrected the mistake in the revised manuscript (also see our response to the third point in the public review).

      Reviewer #2 (Recommendations For The Authors):

      Figures: 

      As stated in the public review, more cartoons and better labeling will be helpful as will consistent comparisons to control worms in each experiment. A good example of this issue is demonstrated in Figure 2 and Figure 4: 

      (1) Figure 2: Please label images with what is being probed in each panel. 

      We added labels to the panels.

      (2) Figure 2G is very hard to interpret - cartoon diagramming what is being observed would be helpful. 

      We added cartoons to help illustrate the images.

      (3) Line 182-185: is this referring to your data or to Wu et al? It is not clear in this paragraph when the authors are describing published work versus their own data presented here. 

      It is from our data. We have made it clear in the revised manuscript.

      (4) Figure 2 - 2K is not well described. What experiment is being done here? What is dlk-1 and why did you look at this mutant? 

      Figure 2K showed that both wild-type animals and S172A mutants could reconnect the severed axons after laser axotomy. Previous studies have found that dlk-1(-) mutants were not able to regenerate axons due to altered microtubule dynamics (PMID: 19737525; PMID: 23000142). We used dlk-1(-) mutants as a negative control, because DLK-1 promotes microtubule growth following axotomy, and the DLK-1 pathway is essential for regeneration (PMID: 23000142). We want to highlight the phenotypic difference between dlk-1(-) mutants and the S172E mutants. Although both mutants showed similar regrowth length, dlk-1(-) mutants showed unbranched regrowth probably due to the lack of microtubule polymerization, whereas the S172E mutants showed a mesh-like regrowth pattern likely due to highly dynamic and unstable microtubules. We explained the different phenotypes in the revised manuscript.

      (5) Figure 4C: this phenotype is hard to interpret. Where is the wt control? Where is the quantification? 

      In the Figure legend, we have referred the readers to Figure 1G for the wild-type image. Quantification is provided in the text (~20% of the animals showed the branching defects).

      (6) There are no WT comparison images in Figure 4I, making the quantification difficult to interpret 

      In the Figure legend, we have referred the readers to Figure 1A for the wild-type control. Moreover, we included a new Figure 8 to summarize the phenotypes of all mutants.

      Experimental:

      (1) Is it clear that only MEC-7/MEC-12 are the only a- and b-tubulin present in the TRNs? The presence of other tubulins not mutated would complicate the interpretation of the results. 

      According to the mRNA levels, the expression of MEC-7 and MEC-12 is >100-fold higher than that of other tubulin isotypes. For example, single-cell transcriptomic data (Taylor et al., 2021) showed that mec-7 mRNA is at 135,940 TPM in ALM neurons, whereas two other tubulin isotypes, tbb-1 and tbb-2, have expression values of 54 and 554 TPM, respectively, in the ALM. So, even if there are some other tubulin isotypes, their abundance is much lower than that of mec-7 and mec-12, and they are not likely to interfere with the effects of the mec-7 and mec-12 mutants.
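
      The fold differences implied by the TPM values quoted above can be checked with simple arithmetic. The short sketch below uses only the three numbers given in this response; the dictionary and the helper name `fold_over` are illustrative and not part of the original analysis.

```python
# TPM values quoted in the response (Taylor et al., 2021; ALM neurons).
tpm = {"mec-7": 135_940, "tbb-1": 54, "tbb-2": 554}


def fold_over(reference: str, other: str) -> float:
    """Return how many times higher `reference` is expressed than `other`."""
    return tpm[reference] / tpm[other]


for isotype in ("tbb-1", "tbb-2"):
    print(f"mec-7 / {isotype}: {fold_over('mec-7', isotype):.0f}-fold")
```

      Both ratios are well above 100, which is consistent with the ">100-fold" statement for these two isotypes.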

      (2) The in vitro kinase assays should be quantified. 

      We have added the quantification.

      (3) The idea that Cdk1 phosphorylates tubulin in interphase is surprising and I am left wondering how the authors propose that Cdk1 is activated in interphase. Is cyclin B (or another cyclin) present in interphase in this cell type? Expression but not activation of Cdk1 is not discussed. 

      CDK1 can work with cyclin A and cyclin B. C. elegans has one cyclin A gene (cya-1) and four cyclin B genes (cyb-1, cyb-2.1, cyb-2.2, and cyb-3). According to single-cell transcriptomic data of L4 animals, cya-1 and cyb-1 showed weak expression in many postmitotic neurons (including the ALM neurons), while cyb-2.1, cyb-2.2, and cyb-3 had no expression in neurons. So, it is possible that cya-1/cyclin A and cyb-1/cyclin B have a low level of expression in the TRNs. A previous study also found the expression of cell cycle regulators (including cyclins) in postmitotic neurons in mouse brain (Akagawa et al., 2021; PMID: 34746147).

      (4) What is the significance of neurite swelling and looping in Figure 4H? The underlying cause of this phenotype is not described. 

      The neurite swelling and looping phenotype of mec-17(-) mutants were described by Topalidou et al., (2012; PMID: 22658602) and were caused by the bending of the microtubules. It appears that the loss of the a-tubulin acetyltransferase altered the organization of microtubules in the TRNs. These defects were partially rescued by the enzymatically dead MEC-17, suggesting that MEC-17 may play a non-enzymatic (and likely structural) role in regulating microtubule organization. We added more explanation in the revised manuscript.

      (5) It is quite surprising that polyglutamylation is not affected in the quintuple ttll mutant. Since the authors made the sextuple ttll mutant, could they demonstrate whether polyglutamylation is further reduced in this mutant via GT335 staining? 

      We did not compare the quintuple and sextuple ttll mutants because they were crossed with TRN markers with different colors for technical reasons. The quintuple mutants CGZ1475 carried uIs115 [mec-17p::TagRFP] IV, whereas the sextuple mutants CGZ1474 carried zdIs5 [mec-4p::GFP] I. As a result, we needed to use different secondary antibodies for the antibody staining, which makes the results not directly comparable.

      The polyglutamylation signal in the cell body was strongly affected by the ttll mutations. In fact, in the ttll-4(-); ttll-5(-); ttll-12(-) triple mutants, the signal is significantly reduced in the cell body of the TRNs, as well as the cell body of other cells. What’s surprising is that the signal in the axons persisted in the ttll triple and quintuple mutants. As the reviewers suggested, we also stained the sextuple mutants and found a similar pattern to that of the triple and quintuple mutants (new Figure 6-figure supplement 1C in the revised manuscript), although the results are not quantitatively comparable due to the use of secondary antibodies with different fluorophores.

      Writing:

      (1) The beginning of the results section is quite jarring. The information in lines 96-104 should be in the Introduction. 

      Due to the nature of this paper, each section deals with a particular PTM. We think it is helpful to discuss some background information before describing our results on each PTM rather than giving all in the introduction. Nevertheless, we modified the beginning of the results to make it more coherent and more connected with the preceding paragraphs.

      (2) Line 122-126: conclusions are not supported by the data: it is suggested from previous experiments, but authors do not look at MTs directly. 

      We have rephrased the statement to acknowledge that we drew this conclusion based on phenotypic similarity with mutants we previously examined.

      (3) I am confused by the usage of both mec-12(4EtoA) and mec-12(4Es-A). Are these the same mutations? If so, there needs to be consistency. If not, each case needs to be defined. 

      They are the same. We have corrected the mistake and are now using mec-12(4Es-A) to refer to the mutants.

      Line 105: phosphor --> phospho 

      Line 187: were --> was 

      Line 298: is --> are

      The above typos are corrected.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Recommendations For The Authors):

      I still find it really impressive that the Purkinje cell stimulation so closely mimics the pathogenic phenotypes - in my opinion, the strongest part of the paper. I would like just a little clarification on some of my previous questions.

      Major points:

      (1) Can the authors clarify where the new units came from? Are these units that were recorded before the initial submission and excluded, but are now included? If so, why were they excluded before? Or are these units that were recorded since the original submission?

      The number of units increased in Figure 1 for three reasons: 1) We have now plotted the classifier results in Figure 1 instead of the validation results, which have been moved to Figure 1 Supplement 3. 2) In response to reviewer comments, we no longer include units that had >60 s of recording in both our model creation and validation. We had previously used 30 s for creating the model and a different 30 s for validating the model, if an additional 30 s were available. 3) We changed our model creation and validation strategy based on previous reviewer comments. The new units in Figures 2-4 were taken from our pool of previously collected but unanalyzed data (we collect neural data on a rolling basis and thus these data were not initially available). We were fortunate to have these data to analyze in order to address the concerns about the number of cells included in the manuscript. The number of units increased in Figure 5 because new units were recorded in response to reviewer comments.

      (2) Why did some of the neuron counts go down? For example, in Pdx1Cre;Vglut2fl/fl mice, the fraction of units with the control signature went from 11/21 to 7/23. Is this because the classifier changed between the original submission and the revision?

      Yes, the proportion of cells matching each classification changed due to the different parameters and thresholds used in the updated classifier model.

      Minor points:

      In the Discussion: "We find some overlap and shared spike features between the different disease phenotypes and show that healthy cerebellar neurons can adapt multiple disease-associated spike train signatures." I think "adapt" should be "adopt"

      In the Discussion: "compare" is misspelled as "compared"

      Thank you for bringing these typos to our attention. We will upload a new version of the text with the typos corrected.


      The following is the authors’ response to the original reviews.

      We would like to thank the Reviewers for providing excellent and constructive suggestions that have enabled us to strengthen our overall presentation of our data. We have addressed each of the comments by altering the text, providing additional data, and revising the figures, as requested.

      Below are our explanations for how we have altered the manuscript in this revised version.

      Recommendations for the authors:

      I think you will have seen from the comments that there was great enthusiasm for the importance of this study. There were also shared concerns about how the classifier may be inadequate in its current format, as well as specific suggestions to consider to improve. I hope that you will consider a revision to really amplify the impact of the importance of this study.

      Reviewer #1 (Recommendations For The Authors):

      Distinct motor phenotypes are reflected in different neuronal firing patterns at different loci in motor circuits. However, it is difficult to determine if these altered firing patterns: 1) reflect the underlying neuropathology or phenotype, 2) whether these changes are intrinsic to the local cell population or caused by larger network changes, and 3) whether abnormal firing patterns cause or reflect abnormal movement patterns. This manuscript attempts to address these questions by recording neural firing patterns in deep cerebellar nucleus neurons in several models of cerebellar dysfunction with distinct phenotypes. They develop a classifier based on parameters of single unit spike trains that seems to do an inconsistent job of predicting phenotype (though it does fairly well for tremor). The major limitation of the recording/classifier experiments is the low number of single units recorded in each model, greatly limiting statistical power. However, the authors go on to show that specific patterns of Purkinje cell stimulation cause consistent changes in interposed nucleus activity that map remarkably well onto behavioral phenotypes. Overall, I did not find the recording/classifier results to be very convincing, while the stimulation results strongly indicate that interposed nucleus firing patterns are sufficient to drive distinct behavioral phenotypes.

      We thank the reviewer for their comments. We describe below how we have addressed the major concerns.

      Major concerns:

      (1) I don't think it's legitimate to use two 30-second samples from the same recording to train and validate the classifier. I would expect recordings from the same mouse, let alone the same unit, to be highly correlated with each other and therefore overestimate the accuracy of the classifier. How many of the recordings in the training and validation sets were the same unit recorded at two different times?

      We previously published a paper wherein we measured the correlation (or variability) between units recorded from the same mouse versus units recorded from different mice (see: Van der Heijden et al., 2022 – iScience, PMID: 36388953). In this paper we did not find that nuclei neuron recordings from the same mouse were more correlated or similar to each other than recordings from different mice. 

      Following this reviewer comment, however, we did observe strong correlations between the two 30-second samples from the same recording units. We therefore decided to no longer validate our classifier based on training and validation sets that had overlapping units. Instead, we generated 12 training sets and 12 non-overlapping validation sets based on our entire database. We then trained 12 classifier models and ranked these based on their classification ability on the validation sets (Figure 1 – supplemental Figure 3). We found that the top two performing classifier models were the same, and used this model for the remainder of the paper.
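
      The splitting strategy described above can be sketched as follows. This is a minimal illustration, assuming that "non-overlapping" means no unit appears in both the training and the validation set of a given split; the unit IDs, the 30% validation fraction, and the function name `make_splits` are hypothetical and are not taken from the manuscript.

```python
import random


def make_splits(unit_ids, n_splits=12, val_fraction=0.3, seed=0):
    """Return n_splits (train, validation) pairs of unit IDs.

    Splitting is done at the level of whole units, so two samples from
    the same unit can never land on opposite sides of a split.
    """
    rng = random.Random(seed)
    splits = []
    for _ in range(n_splits):
        shuffled = list(unit_ids)
        rng.shuffle(shuffled)
        n_val = max(1, int(len(shuffled) * val_fraction))
        validation, train = shuffled[:n_val], shuffled[n_val:]
        splits.append((train, validation))
    return splits


# Hypothetical unit IDs, for illustration only.
units = [f"unit_{i:02d}" for i in range(40)]
splits = make_splits(units)
```

      Each of the 12 models would then be trained on one `train` list and scored on the matching `validation` list before ranking.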

      (2) The n's are not convincing for the spike signature analyses in different phenotypic models. For example, the claim is that Pdx1Cre;Vglut2fl/fl mice have more "control" neurons than ouabain infusion mice (more severe phenotype). However, the numbers are 11/21 and 7/20, respectively. The next claim is that 9/21 dystonic neurons are less than 11/20 dystonic neurons. A z-test for proportions gives a p-value of 0.26 for the first comparison and a p-value of 0.44 for the second. I do not think any conclusions can be drawn based on these data.

      We included more cells in our analyses and found that the z-test for the proportion of cells with the “control” and “dystonia” signature is indeed statistically significant.
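
      For reference, the reviewer's two comparisons can be reproduced with a pooled two-proportion z-test. The sketch below assumes a two-sided test with a pooled standard error, which is the standard form but is not stated explicitly in the review; it uses the counts from the original submission (11/21 vs 7/20 and 9/21 vs 11/20), since the updated counts are not listed in this response.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z_control, p_control = two_proportion_z_test(11, 21, 7, 20)
z_dystonia, p_dystonia = two_proportion_z_test(9, 21, 11, 20)
print(f"control signature:  z = {z_control:.2f}, p = {p_control:.2f}")
print(f"dystonia signature: z = {z_dystonia:.2f}, p = {p_dystonia:.2f}")
```

      Under these assumptions the p-values round to 0.26 and 0.44, matching the values quoted by the reviewer.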

      (3) Since the spiking pattern does not appear to predict an ataxic phenotype and the n's are too small to draw a conclusion for the dystonic mice, I think the title is very misleading - it does not appear to be true that "Neural spiking patterns predict behavioral phenotypes...", at least in these models.

      We have changed the title to: “Cerebellar nuclei cells produce distinct pathogenic spike signatures in mouse models of ataxia, dystonia, and tremor.” We feel that this new title captures the idea that we find differences between spike signatures associated with ataxia, dystonia, and tremor and that these signatures induce pathological movements.

      (4) I don't think it can be concluded from the optogenetic experiments that the spike train signatures do not depend on "developmental changes, ...the effect of transgene expression, ... or drug effects outside the cerebellum." The optogenetic experiments demonstrate that modulating Purkinje cell activity is sufficient to cause changes in DCN firing patterns and phenotypes (i.e., proof-of-principle). However, they do not prove that this is why DCN firing is abnormal in each model individually.

      Thank you for highlighting this section of the text. We agree that the optogenetic experiments cannot explain why the DCN is firing abnormally in each model. We have edited this section of the text to prevent this conclusion from being drawn by the readers.

      Minor points:

      (1) It would be nice to see neural recordings in the interposed nucleus during Purkinje terminal stimulation to verify that the firing patterns observed during direct Purkinje neuron illumination are reproduced with terminal activation. This should be the case, but I'm not 100% certain it is.

      We have edited the text to clarify that representative traces and analysis of interposed nucleus neurons in response to Purkinje terminal stimulation are the data in Figure 5.

      (2) How does the classifier validation (Fig. 1E) compare to chance? If I understand correctly, 24/30 neurons recorded in control mice are predicted to have come from control mice (for example). This seems fairly high, but it is hard to know how impressive this is. One approach would be to repeat the analysis many (1000s) of times with each recording randomly assigned to one of the four groups and see what the distribution of "correct" predictions is for each category, which can be compared against the actual outcome.

      We have now also included the proportion of spike signatures in the entire population of neurons and show that the spike signatures are enriched in each of the four groups (control, ataxia, dystonia, tremor) relative to the presence of these signatures in the population (Figure 1E). 
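
      The label-shuffling comparison proposed by the reviewer could be sketched as below. This is an illustration of the reviewer's suggestion, not the analysis reported in Figure 1E. The toy labels are invented: only the 24/30 figure for control neurons comes from the review, and the ataxia counts, the two-group setup, and the function name `permutation_null` are assumptions.

```python
import random
from collections import Counter


def permutation_null(true_labels, predicted_labels, n_perm=1000, seed=0):
    """Compare per-group correct predictions with a shuffled-label null.

    Returns the observed number of correct predictions per group and a
    one-sided permutation p-value per group (with the usual +1 correction).
    """
    rng = random.Random(seed)
    groups = sorted(set(true_labels))
    observed = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    exceed = {g: 0 for g in groups}
    shuffled = list(true_labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly reassign recordings to groups
        hits = Counter(s for s, p in zip(shuffled, predicted_labels) if s == p)
        for g in groups:
            if hits[g] >= observed[g]:
                exceed[g] += 1
    p_values = {g: (1 + exceed[g]) / (n_perm + 1) for g in groups}
    return observed, p_values


# Toy data: 24 of 30 control neurons classified as control (from the review);
# the ataxia numbers are invented for illustration.
true_labels = ["control"] * 30 + ["ataxia"] * 20
predicted = ["control"] * 24 + ["ataxia"] * 6 + ["ataxia"] * 14 + ["control"] * 6
observed, p_values = permutation_null(true_labels, predicted)
```

      A small p-value for a group indicates that its number of correct predictions is higher than expected if recordings were assigned to groups at random.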

      (3) I don't think this is absolutely necessary, but do the authors have ideas about how their identified firing patterns might lead to each of these phenotypes? Are there testable hypotheses for how different phenotypes caused by their stimulation paradigms arise at a network level?

      We have added some ideas about how these spike signatures might lead to their associated phenotypes to the discussion.

      Reviewer #2 (Recommendations For The Authors):

      (1) As mentioned earlier, my main concern pertains to the overall architecture and training of the classifier. Based on my reading of the methods and the documentation for the classifier model, I believe that the classifier boundaries may be biased by the unequal distribution of neurons across cerebellar disease groups (e.g., n=29 neurons in control versus n=19 in ataxics). As the classifier is trained to minimize the classification error across the entire sample, the actual thresholds on the parameters of interest may be influenced by the overrepresentation of neurons from control mice. To address this issue, one possible solution would be to reweight each group so that the overall weight across classes is equal. However, I suggest a better strategy might be to revise the classifier architecture altogether (as detailed below).

      We have retrained the classifier model based on equal numbers of ataxic, dystonic, and tremor cells (n=20) but we intentionally included more control cells (n=25). We included more control cells because we assume this is the baseline status for all cerebellar neurons and wanted to avoid assigning disease signatures to healthy neurons too easily. 
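
      For comparison, the reweighting option raised by the reviewer can be illustrated with inverse-frequency class weights. The sketch below is not what was done in the revision (we kept more control cells on purpose); it only shows what equalizing the total weight per class would look like for the group sizes quoted above. The function name `balanced_weights` is illustrative.

```python
# Group sizes used for retraining, as quoted in the response.
counts = {"control": 25, "ataxia": 20, "dystonia": 20, "tremor": 20}


def balanced_weights(class_counts):
    """Per-sample weight for each class: total / (n_classes * class_size)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * n) for label, n in class_counts.items()}


weights = balanced_weights(counts)
for label, w in weights.items():
    print(f"{label}: weight {w:.4f}, total class weight {w * counts[label]:.2f}")
```

      With these weights every class contributes the same total weight (21.25), so the slight overrepresentation of control cells would no longer influence the decision boundaries.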

      (2) As the authors make abundantly clear, one mouse model of disease could potentially exhibit multiple phenotypes (e.g., a mouse with both ataxia and tremor). To address this complexity, it might be more valuable to predict the probability of a certain CN recording producing specific behavioral phenotypes. In this revised approach, the output of the classifier wouldn't be a single classification (e.g., "this is an ataxic mouse") but rather the probability of a certain neural recording corresponding to ataxia-like symptoms (e.g., "the classifier suggests that this mouse has a 76% likelihood of exhibiting ataxic symptoms given this CN recording"). This modification wouldn't require additional data collection, and the exemplar disease models could still be used to train such a revised network/classifier, with each mouse model corresponding to 0% probability of observing all other behavioral phenotypes except for the specific output corresponding to the disease state (e.g., L7CreVgat-fl/fl would be 0% for all categories except ataxia, which would be trained to produce a score of 100%). This approach could enhance the validation results across other mouse models by allowing flexibility in a particular spike train parameter to produce a diverse set of phenotypes.

      This is a great comment. Unfortunately, our current dataset is too limited to fully address this comment, for the following reasons:

      - We have a limited number of neurons on which we can train our classifier. Further dividing up the groups of neurons or complicating the model limited the power of our analyses and resulted in overfitting of the model on too few neurons.

      - The recording durations (30 seconds) used to train our model are likely too short to find multiple disease signatures within a single recording. We feel that the complex phenotypes are likely resulting from cells within one mouse exhibiting a mix of disease signatures (as in the Car8wdl/wdl mice).

      We think this question would be great for a follow-up study that uses a large number of recordings from single mice to fully predict the mouse phenotype based on the population spike signatures. 

      To limit confusion about our classifier model, we have also altered the language of our manuscript and refer to the cells exhibiting a spike signature instead of predicting a phenotype. 

      However, the paper falls short in terms of the classifier model itself. The current implementation of this classifier appears to be rather weak. For instance, the cross-validated performance on the same disease line mouse model for tremor is only 56%. While I understand that the classifier aims to simplify a high-dimensional dataset into a more manageable decision tree, its rather poor performance undermines the authors' main objectives. In a similar vein, although focusing on three primary features of spiking statistics identified by the decision tree model (CV, CV2, and median ISI) is useful for understanding the primary differences between the firing statistics of different mouse models, it results in an overly simplistic view of this complex data. The classifier and its reliance on the reduced feature set are the weakest points of the paper and could benefit from further analysis and a different classification architecture. Nevertheless, it is commendable that the authors have collected high-quality data to validate their classifier. Particularly impressive is their inclusion of data from multiple mouse models of ataxia, dystonia, and tremor, enabling a true test of the classifier's generalizability.

      We intentionally simplified our parameter space from a high-dimensional dataset into a more manageable decision tree. We did this for the following reasons:

      - The parameters, even though all measuring different features, are highly correlated (see Figure 1 – supplemental Figure 2). Further, we were training our dataset on a limited number of recordings. We found that including all parameters (for example using a linear model) caused overfitting of the data and poor model performance.

      - Describing the spike signatures using a lower number of parameters allowed us to design optogenetic parameters that would mimic this parameter space. This would be infinitely more complex with a bigger parameter space. 

      We agree with the reviewer that the inclusion of multiple mouse models, in addition to the optogenetics experiments, supports the classifier’s generalizability.
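
      For readers less familiar with the three spike train features discussed above (CV, CV2, and median ISI), the sketch below shows how they can be computed from a list of spike times. It assumes the commonly used definitions: CV is the standard deviation of the inter-spike intervals (ISIs) divided by their mean, and CV2 is the mean of 2|ISI(n+1) - ISI(n)| / (ISI(n+1) + ISI(n)) over adjacent ISI pairs. The exact implementation used in the manuscript may differ (for example, in the choice of population versus sample standard deviation), and the function name is illustrative.

```python
import statistics


def spike_train_features(spike_times):
    """Compute CV, CV2, and median ISI from sorted spike times (seconds)."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    # CV: global irregularity of the inter-spike intervals.
    cv = statistics.pstdev(isis) / statistics.mean(isis)
    # CV2: local irregularity, comparing each ISI with the next one.
    cv2_terms = [2 * abs(b - a) / (a + b) for a, b in zip(isis, isis[1:])]
    cv2 = statistics.mean(cv2_terms)
    return {"cv": cv, "cv2": cv2, "median_isi": statistics.median(isis)}


# A perfectly regular train and a train alternating short and long intervals.
regular = spike_train_features([0, 1, 2, 3, 4])
alternating = spike_train_features([0, 1, 4, 5, 8])
print(regular)
print(alternating)
```

      The regular train gives CV = 0 and CV2 = 0, whereas the alternating train gives CV = 0.5 and CV2 = 1.0, showing how CV2 is more sensitive to interval-to-interval changes than CV.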

      Minor Comments:

      (1) The blown-up CN voltage traces in Figures 5C and Supplementary Figure 2B appear more like bar plots than voltage traces on my machine.

      Thank you for bringing this to our attention. We have improved the rendering of the traces.

      (2) The logic in lines 224-228 is somewhat confusing. The spike train signatures are undoubtedly affected by all the factors mentioned by the authors. What, I believe, the authors intend to convey is that because changes in CN firing rates can be driven by multiple factors, it is the CN firing properties themselves that likely drive disease-specific phenotypes.

      We agree that our discussion of the CN firing needs clarification. We have made the appropriate edits in the text.

      Reviewer #3 (Recommendations For The Authors):

      It's quite astounding that this can be done from single spike trains from what are almost certainly mixed populations of neurons. Could you add something to the discussion about this? Some questions that could be addressed would be would multiple simultaneous recordings additionally help classify these diseases, or would non-simultaneous recordings from the same animal be useful? Also more discussion about which cells you are likely recording from would be useful.

      Thank you for this suggestion. We have added discussion about multiple recordings, simultaneous vs non-simultaneous recordings, and our thoughts on the cell population recorded in this work.

      Data in figure 2 is difficult to understand - it appears that the majority of dysregulated cells in 2 ataxic models are classified as dystonia cells, not ataxic cells. This appears surprising as it seems to be at odds with earlier data from Fig 1. In my opinion, it is not discussed adequately in the Results or Discussion section.

      We have added further discussion of the ataxia models represented in Figures 1 and 2.

      Minor comment:

      The colours of the subdivisions of the bars in 2C and 3C, and the rest of the paper appear to be related to the groups in the middle (under "predicted"), but the colours are much paler in the figure than in the legend, although the colours in the bars and the legends match in the first figure (1E). Does this signify something?

      These figures were remade with the same colors across the board.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study by Prieto et al. faces the increasingly serious problem of bacterial resistance to antimicrobial agents. This work has an important element of novelty, proposing a new approach to control antibiotic resistance spread by plasmids. Instead of targeting the resistance determinant, plasmid-borne proteins are used as antigens to be bound by specific nanobodies (Nbs). Once bound, plasmid transfer was inhibited and Salmonella infection blocked. This in-depth study is quite detailed and complex, with many experiments (9 figures with multiple panels), rigorously carried out. Results fully support the authors' conclusions. Specifically, the authors investigated the role of two large molecular weight proteins (RSP and RSP2) encoded by the IncHI1 derivative-plasmid R27 of Salmonella. These proteins have bacterial Ig-like (Big) domains and are expressed on the cell surface, creating the opportunity for them to serve as immunostimulatory antigens. Using a mouse infection model, the authors showed that RSP proteins can properly function as antigens in Salmonella strains harboring the IncHI1 plasmid. The authors clearly showed increased levels of specific IgG and IgA antibodies against these RSP proteins in different tissues of immunized animals. In addition, non-immunized mice exhibited Salmonella colonization in the spleen and much more severe disease than immunized ones.

      However, the strength of this work is the selection and production of nanobodies (Nbs) that specifically interact with the extracellular domain of RSP proteins. The procedure to obtain Nbs is lengthy and complicated and includes the immunization of dromedaries with purified RSP and the construction of a VHH (H-chain antibody variable region) library in E. coli. As RSP is expressed on the surface of E. coli, specific Nbs were able to agglutinate Salmonella strains harboring the p27 plasmid encoding the RSP proteins.

      The authors demonstrated that Nbs-RSP reduced the conjugation frequency of p27, thus limiting the diffusion of the amp resistance harbored by the plasmid. This represents an innovative and promising strategy to fight antibiotic resistance, as it is not blocked by the mechanism that determines, in the specific case, the amp resistance of p27 but it targets an antigen associated with IncHI-derivative plasmids. Thus, RSP vaccination could be effective not only against Salmonella but also against other enteric bacteria. A possible criticism could be that Nbs against RSP proteins reduce the severity of the disease but do not completely prevent the infection by Salmonella.

      It is true that vaccination of mice with purified RSP protein did not provide complete protection against infection with a Salmonella strain harboring an IncHI plasmid. As this finding is based on an animal model, further investigation is required to evaluate its clinical efficacy. In any case, even partial protection provided by nanobodies or by a vaccine could potentially improve survival rates among critically ill patients infected with a pathogenic bacterium harboring an IncHI plasmid. An additional beneficial aspect of our approach is that it will reduce dissemination of IncHI plasmids among pathogenic bacteria, which would reduce the presence of antibiotic resistance plasmids in the environment and in the bacteria infecting patients.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript aims to tackle antimicrobial resistance through the development of vaccines. Specifically, the authors test the potential of the RSP protein as a vaccine candidate. The RSP protein contains bacterial Ig-like domains that are typically carried in IncHI1 plasmids like R27. The extracellular location of the RSP protein and its role in the conjugation process make it a good candidate for a vaccine. The authors then use Salmonella carrying an IncHI plasmid to test the efficacy of the RSP protein as a vaccine antigen in providing protection against infection by antibiotic-resistant bacteria carrying the IncHI plasmid. The authors found no differences in total IgG or IgA levels, nor in pro-inflammatory cytokines between immunized and non-immunized mice. They however found differences in specific IgG and IgA, attenuated disease symptoms, and restricted systemic infection.

      The manuscript also evaluates the potential use of nanobodies specifically targeting the RSP protein by expressing them in E. coli and evaluating their interference in the conjugation of IncHI plasmids. The authors found that E. coli strains expressing RSP-specific nanobodies bind to Salmonella cells carrying the R27 plasmid, thereby reducing the conjugation efficacy of Salmonella.

      Strengths:

      The main strength of this manuscript is that it targets the mechanism of transmission of resistance genes carried by any bacterial species, thus making it broad.

      The experimental setup is sound and with proper replication.

      Weaknesses:

      The two main experiments, evaluating the potential of the RSP protein and the effects of nanobodies on conjugation, seem as parts of two different and unrelated strategies.

      In preparing our manuscript, we were aware that we included two different strategies to combat antimicrobial resistance. However, we deemed it valuable to include both in the paper. The development of new vaccines and the inhibition of the transfer of antibiotic resistance determinants are currently considered relevant approaches to combat antimicrobial resistance. Our intention in the article is to integrate these two strategies.

      The survival rates shown in Figure 1A and Figure 3A for Salmonella pHCM1 and non-immunized mice challenged with Salmonella, respectively, are substantially different. In the same figures, the challenge of immunized mice and Salmonella pHCM1 and mice challenged with Salmonella pHCM1 with and without ampicillin are virtually the same. While this is not the only measure of the effect of immunization, the inconsistencies in the resulting survival curves should be addressed by the authors more thoroughly as they can confound the effects found in other parameters, including total and specific IgG and IgA, and pro-inflammatory cytokines.

      Overall the results are inconsistent and provide only partial evidence of the effectiveness of the RSP protein as a vaccine target.

      To address the concerns regarding the disparities in survival rates depicted in Figures 1A and 3A, it is important to refer to several factors that contribute to these variations. Firstly, it should be noted that the data depicted in these figures stem from distinct experimental sets conducted at different times employing different batches of mice. Despite the use of the same strain and supplier, individual animals and their batches can exhibit variability in susceptibility to infection due to inherent biological differences.

      Unlike in vitro cell culture experiments, which can achieve high replicability due to the homogeneity of cell lines, in vivo animal studies often exhibit greater variability. This variability is influenced not only by genetic variations within animal populations, even if originating from the same supplier, but also by environmental factors within the animal facility. These factors include temperature variations, the concentration of non-pathogenic microorganisms in the facility, which can modify the immune responses, or the density of animals in the environment, consequently affecting human traffic and generating potential disturbances.

      When designing experiments with animals, it is desirable for the results to be consistent across different animal batches. If one bacterial strain exhibits higher mortality rates than another across multiple experimental series, this pattern should be reproducible despite the inherent variability in in vivo studies. It is more important to demonstrate consistency in trends than to focus on absolute figures when validating experimental results. 

      It is also important to clarify that when we refer to survival rates, it doesn’t necessarily mean that the animals were found deceased. The animal procedures were approved by the Ethics Committee of Animal Experimentation of the Universitat de Barcelona, and include an animal monitoring protocol. Our protocol requires close daily monitoring of several health and behavioral parameters, each evaluated according to specific criteria. When an animal reaches a predetermined score threshold indicating severe distress or suffering, euthanasia is administered to alleviate further suffering. At this point, biological samples are collected for subsequent analysis.

      The conjugative experiments use very long conjugation times, making it harder to assess if the resulting transconjugants are the direct result of conjugation or just the growth of transconjugants obtained at earlier points in time. While this could be assessed from the obtained results, it is not a direct or precise measure.

      In the conjugation experiments we utilized a reduced number of donor cells expressing the RSP protein and of recipient cells, as well as long conjugation times, to reflect more accurately a situation that may occur naturally in the environment. Short conjugation times are efficient in controlled laboratory conditions using high densities of donor and recipient cells, but these conditions are not commonly found in the environment. For the interference of the conjugative transfer of the IncHI plasmid we used an E. coli strain displaying the nanobody binding RSP to simulate a process that could also be scaled up in a natural environment (e.g., a probiotic strain in a livestock farm) and that could be cost-effective. See the discussion section, lines 326-328.

      While the potential outcomes of these experiments could be applied to any bacterial species carrying this type of plasmids, it is unclear why the authors use Salmonella strains to evaluate it. The introduction does a great job of explaining the importance of these plasmids but falls short in introducing their relevance in Salmonella.

      The prevalence of IncHI plasmids in Salmonella was indicated in the introduction section, lines 65-67. Nevertheless, we understand the reviewer’s criticisms and have modified both these sentences in the introduction section and also added comments in the results section (lines 118-128).

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      I understand working with mice can be challenging in terms of repeating experiments to further support the study's claims. For this reason, I think the authors need to discuss more thoroughly the following things:

      Can the authors comment on why the presence of Ampicillin leads to a lower upregulation of proinflammatory cytokines in the spleen despite harboring resistance against ampicillin?

      At the intestinal level, physiological inflammatory responses play a crucial role in enabling the host to identify foreign and commensal bacterial antigens and initiate a highly regulated and "controlled" immune response (Fiocchi, 2008. Inflamm Bowel Dis. 2008, 14 Suppl 2:S77-8). The administration of antibiotics such as ampicillin, reduces the load of intestinal resident microbiota, thereby lowering the extent of intestinal immune activation. This decline in immune activation extends to systemic levels, potentially accounting for the reduced expression of proinflammatory cytokines observed in the spleen.

      There are inconsistent results in the survival rates in Figures 1A and 3A, please discuss how this could alter the observed differences in total and specific IgG and IgA, and pro-inflammatory cytokines.

      To address the reviewer's concerns regarding the discrepancies in survival rates shown in Figures 1A and 3A, and how these differences might influence the observed variations in total and specific IgG and IgA, as well as pro-inflammatory cytokines, it is important to clarify the terminology used in our study. In our context, "survival" does not solely refer to mortality per se, but encompasses the endpoints defined by our animal welfare protocols, which are rigorously supervised by the Animal Experimentation Ethics Committee of the University of Barcelona. Our protocol mandates close daily monitoring of several health and behavioral parameters, each scored according to specific criteria. When an animal reaches a predefined score threshold indicating severe distress or suffering, euthanasia is conducted to prevent further distress, at which point we collect biological samples for analysis.

      In contrast to in vitro cell culture experiments, which often achieve high replicability thanks to the homogeneity of cell lines, in vivo animal studies frequently display greater variability. This variability stems not only from genetic differences within animal populations, even if originating from the same supplier, but also from environmental factors within the animal facility. These factors encompass variations in temperature, the presence of non-pathogenic microorganisms in the facility (capable of altering immune responses) and the density of animals, which can impact human traffic and potentially lead to disturbances. 

      The experiments depicted in Figs. 1A and 3A were separated in time, and hence may be influenced by environmental factors within the animal facility. Nevertheless, in the comparative analysis performed between immunized and non-immunized animals, experiments were performed simultaneously and hence under similar environmental conditions in the animal facility. For several parameters (i.e., immunoglobulins and proinflammatory cytokines) statistically significant differences were observed. 

      Regarding the conjugation assays, it is not entirely clear to me why the conjugation times are so long. It would be beneficial to have more data about the conjugation efficacy between the donor and recipient without any E. coli expressing the nanobodies at different time intervals. This would help to differentiate between transconjugants and transconjugants obtained from early conjugation events.

      This comment is partially answered in a previous response, regarding the numbers of donor and recipient cells and duration of conjugation. We note here that in Fig. 9, the requested experiment with donor and recipient cells without E. coli interferent cells is already present, corresponding to the label “none”. To avoid confusion, we have modified the legend in Fig. 9.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1

      1. The authors conclude that RFP-Ac expression is restricted to emerging SOPs and surroundings cells at 18h APF, indicating that Ac is activated later than Sc. Can the authors provide images for RFP-Ac expression at 10h and 16h APF similar to GFP-Sc as shown in their figures. Do the SOPs that contain high levels of both Ac and Sc (as some SOPs have Sc expression but not Ac) undergo fate divergence and SB faster than the SOPs containing higher levels of only Sc?

      We now show the expression pattern of GFP-Sc and RFP-Ac/GFP-Ac in fixed samples also stained for E-cad at 13h, 16h and 18h APF (Fig 1I-K' and Fig S1E-G'). Ac and Sc were found to be activated around the same time. However, Ac appeared to accumulate at lower levels than Sc prior to SOP selection in the central domain of the ADHN (Fig 1J-K'). We also confirmed that Ac was more strongly expressed in SOPs. Additionally, SOPs appeared to accumulate both Ac and Sc, i.e. SOPs with high levels of GFP-Sc also showed a strong RFP-Ac signal (Fig S1H-H'). Finally, since RFP-Ac was not detectable in living pupae, possibly due to the rapid turnover of Ac and the slow maturation of RFP, we could not study more precisely the relative dynamics of Ac and Sc. For the same reason, we could not address whether the rate of fate divergence (measured using GFP-Sc) varied with the level of Ac.

      2. It would be interesting to see the spatial and temporal dynamics of Ac and Sc in Notch mutants or even Notch dynamics in Sc and Ac mutants to better understand the progression fate divergence and its effect on lateral inhibition in real time.

      Following the reviewer's suggestion, we examined the expression pattern of NRE-deGFP, a Notch activity reporter, in ac sc double mutant pupae at 16h and 24h APF (Fig S3A-D). This showed that the initial pattern of NRE-deGFP at 16h APF (signal detected in posterior ADHN cells as well as in the ADHN) did not depend on Ac and Sc. By contrast, the second phase of NRE-deGFP expression (in cells of the proneural ADHN domain, around emerging SOPs) was found to depend on the activity of Ac and Sc. Thus, strong Notch activation observed in cells surrounding emerging SOPs was found to depend on the activity of Ac-Sc, presumably because Ac and Sc are required for SOP specification and SOPs produce Delta, serving as the local source to activate Notch (see also our response to reviewer 3, point #6). Thus, since NRE-deGFP was not up-regulated in the proneural ADHN domain of sc10-1 ac3 mutant pupae, a quantitative analysis of the dynamics of NRE-deGFP may not be informative.

      The reviewer also suggested that we study the dynamics of GFP-Sc in Notch mutants. One can easily predict that most Notch mutant cells would accumulate GFP-Sc, as observed in the notum (PMID: 28386027). Therefore, analysis of fate symmetry breaking is unlikely to be useful in that context. Likewise, an FDI analysis would not be relevant. From a technical point of view, live imaging of GFP-Sc would have to be performed in Notch mutant clones. This is because RNAi against Notch (strong 10xUAS-Notch hp2 construct, PMID: 19487563) driven by escargot-Gal4 to knock down Notch in larval histoblasts only led to a partial loss of Notch function (our unpublished data). Generation of Notch mutant clones in the abdomen would require constructing an appropriate GFP-Sc Notch FRT recombinant chromosome as well as generating a new FRT GFP-Sc chromosome with an infrared marker (not currently available) to compare the relative dynamics of GFP-Sc in wild-type and mutant cells. In sum, the proposed experiment would take a significant amount of time and is unlikely to shed new light. Given that this experiment is not essential to support the claims of the paper and that it is not clear to us what would be learnt from it, we opted not to perform it.

      Minor comments

      1. In figure 1F and F', the authors mention GFP-Sc is not expressed prior to 14h, however, there is still GFP signal detected in their imaging. Can the authors comment what would be the cause of this GFP signal or was it due to non-specific background signal during their imaging analysis?

      We thank the referee for raising this issue. Yes, a strong autofluorescence signal was detected prior to the onset of GFP-Sc expression. We provide below the results of our analysis of the autofluorescence signal (Fig R1) relative to the nuclear signal (Fig R2), and how normalization of the signal was used to measure the specific GFP-Sc signal.

      Analysis of the autofluorescence signal over time

      To estimate the autofluorescence signal, we measured the average intensity of the signal acquired in the GFP channel for each frame and plotted these values over time. The results are shown in Fig R1 below:

      Fig R1: temporal profile of the autofluorescence signal

      Each measurement corresponds to the average intensity measured in the GFP channel over the entire field at each z-section and for each time point. Mean and SD values are shown over time in black and grey, respectively. Time is in frame number (dt is 2.5 min). The data shown above correspond to movie 1 (see also Fig 2).

      This plot indicates that the autofluorescence signal was progressively bleached. We therefore excluded from our analysis the first 50 time points when the autofluorescence signal was initially strong. No nuclear GFP-Sc signal was detectable in these first 50 frames in the cells of the central area of the ADHN which are studied here (see Fig 2A', t=1:12, time frame #29).

      While revising the manuscript, we realized that t=0 corresponded to two distinct time points in the first version of our manuscript: it corresponded to the onset of imaging in Fig 2A-D', and to t=2:08 (time frame #51) in all other figures showing data following removal of the first 50 time points. We have now fixed this issue and are presenting all data with t=0 corresponding to the onset of imaging.

      Analysis of the nuclear fluorescence signal over time

      To detect the nuclear GFP-Sc signal, we measured the average intensity of the signal acquired in the GFP channel (raw intensity values corresponding to the sum of the GFP-Sc and autofluorescence signals) in segmented nuclei (in 3D, within the entire z-stack). These values were plotted over time (pink curve in Fig R2 below; the autofluorescence is plotted in black, as in Fig R1, for the sake of comparison). This showed that the intensity of the signal measured in nuclei was initially identical to the mean intensity measured across the entire field of view, indicative of autofluorescence only. A specific increase in signal intensity in nuclei (relative to the entire field of view) was detectable after 2h of imaging (time frame 48 in Fig R1; dt is 2.5 min). Importantly, mean intensity values of the autofluorescence signal appeared to be approximately 10-fold stronger than the mean intensity associated with the nuclear GFP-Sc signal.

      Fig R2: temporal profile of the GFP-Sc signal

      The plot in pink corresponds to the average intensity in the GFP channel (raw intensity values corresponding to the GFP-Sc and/or autofluorescence signals) per nucleus (within the entire z-stack) for each time point. Mean and SD values measured in each nucleus are shown over time (in pink; these data correspond to movie 1; shown also in Fig 3). This plot (pink) should be compared with the plot shown in Fig R1 (also in black in Fig R2). The intensity difference between the pink and black curves was attributed to the specific GFP-Sc signal.

      Signal normalization and analysis of the GFP-Sc signal

      In our study, we normalized the GFP-Sc signal by dividing the averaged value measured in each single nucleus (data corresponding to the pink curve in Fig R2) by the mean value of the signal measured at the same time point in the same channel in the entire image stack (data corresponding to the black curve in Fig R1/R2). Given the low intensity of the GFP-Sc signal, and the small number of pixels corresponding to Scute-expressing nuclei over the entire field of view, this mean value should closely reflect the autofluorescence noise. Thus, the normalized background autofluorescence signal should be close to 1. This was experimentally verified by measuring the normalized intensity values of the PDHN nuclei that did not express Scute: a mean intensity value of 0.96 +/- 0.10 was measured (at time frame #51; see Fig R1 below). In contrast, the normalized GFP-Sc values measured several hours before SB were found to be close to 1.1 (see Fig 3D). Whether these values reflect very low levels of nuclear GFP-Sc that cannot be detected visually or result from imperfect normalization of the signal remains unclear. Given the intensity and non-uniformity of the autofluorescence signal, we cannot exclude the latter. For this reason, we chose not to over-interpret the initial low intensity values of GFP-Sc.
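
      The normalization described above can be sketched in a few lines. This is an illustrative sketch only, not the analysis code used for the manuscript: the function name `normalize_nuclear_signal` and the intensity values in the example are hypothetical, and the inputs are assumed to be the mean GFP-channel intensity of each segmented nucleus and the mean intensity of the entire image stack at the same time point.

```python
def normalize_nuclear_signal(nuclear_means, field_mean):
    """Divide per-nucleus mean intensities by the field-wide mean.

    nuclear_means: mean GFP-channel intensity of each segmented nucleus
                   at one time point (hypothetical input format).
    field_mean:    mean GFP-channel intensity of the entire image stack
                   at the same time point, dominated by autofluorescence.
    A nucleus showing only autofluorescence is expected to give ~1.
    """
    if field_mean <= 0:
        raise ValueError("field_mean must be positive")
    return [value / field_mean for value in nuclear_means]


if __name__ == "__main__":
    # Made-up intensities: one nucleus slightly below and one slightly
    # above the field-wide mean of the same frame.
    print(normalize_nuclear_signal([960.0, 1100.0], 1000.0))
```

      Because the field-wide mean is recomputed at every time point, dividing by it also compensates for the progressive bleaching of the autofluorescence, so normalized values from early and late frames remain comparable.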

      In the materials and methods, the authors mention that prior to imaging the larvae and pupae are grown at 18, 21 or 25°C. Is there a reason why the larvae and pupae are grown at different temperatures for different experiments? Can the authors specify (i.e. in the figure legends) in which experiments different temperatures were used?

      Larvae and pupae were grown at different temperatures for convenience, i.e. to adapt the time interval between staging at 0h APF and mounting for live imaging. Indeed, it is much easier to obtain 10-14h APF pupae by collecting staged pupae at 0h APF the day before and incubating them overnight at a lower temperature to slow down development. However, all live imaging experiments were performed at 23-25°C, and we have no reason to think that this prior incubation would affect the process studied here.

      The citations need to have a better format as they show up as each citation within a single bracket which makes it a little hard to read when multiple references are cited in a single sentence.

      Fixed.

      In the abstract, the sentence 'Unexpectedly, we observed at low frequency (10%) pairs of cells that are in direct contact at the time of SB'. SB should be replaced with "Symmetry breaking", as it appeared for the first time in the manuscript and should be written out in full.

      Fixed.

      Throughout the manuscript there are instances where the abbreviations are written in full with the abbreviation in brackets after they have already been introduced in the introduction which can be changed to just the abbreviation itself.

      Fixed.

      In the discussion on page 11, 'our observation...', our needs to be changed to Our.

      Fixed.

      7. It would be nice to have arrowheads or dotted lines around the cells or areas of interest in all the figures and movies, so that it will be easier to follow the results. The videos have a lot of background due to fragmented apoptotic nuclei, etc. as mentioned by the authors, hence arrowheads or dotted lines would bring viewers' focus to the areas of interest.

      fixed (see for instance Fig 1D, Fig 2A, Fig 5B, Fig 7A, Fig S3D, etc...)

      8. It would be helpful to have anterior - posterior axis (i.e. with an arrow) shown on top of all the figures.

      In our earlier version, we indicated that 'In this and all other figures, dorsal is up and anterior is left' in the legend of Fig 1B. We have now moved this sentence to the end of the legend of Fig 1 to make it more apparent. Additionally, the AP axis is now clearly indicated in Fig 1C. We believe that it is not necessary to repeat this orientation in all figures.

      Scale bars are missing in all figures, videos, and figure legends.

      Added.

      Only movies 1 and 3 are referenced in the text.

      All movies are now referenced in the text.

      Keeping the colors in the movies and figures consistent and same would be helpful. For example, Movie 2 Histone3.3-mIFP marker is in blue but in figure 3 it is in magenta.

      Fixed (H3.3-mIFP is in magenta in this movie, now numbered 3).

      As mentioned above, it would be helpful if the authors have arrow heads or dotted lines around the cells or areas of interest in both the figures and movies for better representation of their data. For example, movie 1 shows a larger area of imaging than shown in figure 2A, which makes it hard to follow the cells of interest in the movie.

      An additional movie corresponding to the SOP shown in Fig 2A is now provided (new movie 2).

      --

      Reviewer #2

      1. Despite "symmetry breaking" being the main focus of the paper, in the Introduction, the authors do not explain what this term means and do not provide any description of this process. This is a critical point that makes understanding of the goals of the paper difficult. Therefore, the authors are encouraged to provide more information and a clear description of this term/phenomenon.

      We thank the reviewer for this suggestion. We now state in the introduction what symmetry breaking means in the context of lateral inhibition: 'To describe and study the process of SOP selection, we studied fate SB. The latter refers to the transition point when one cell, the future SOP, starts to stably accumulate a higher level of GFP-Sc relative to its immediate neighbors.'

      The role of Achaete in the story is not clear. Even though both factors are required for SOP determination, the authors mainly focus on Scute, so it is not very clear what the role of Achaete in this process is, if there is any. As shown in the paper, Achaete is expressed later when heterogeneity is promoting cell fate divergence. Is Achaete maybe contributing to cell heterogeneity/ cell fate divergence?

      We thank the reviewer for raising this point. We now show in Fig S1A-D that abdominal bristles develop in a protein null allele of sc (scM6) as well as in an ac mutant corresponding to a 45 kb deletion that removes ac but not sc (PMID: 16216235). Together with our analysis of sc10-1 ac3 mutant flies, we can now conclude that Sc and Ac act redundantly for SOP specification in the pupal abdomen. We have also further studied the expression of Ac relative to Sc and E(spl)HLH-m3 (see our response above to point #1 of reviewer 1). We fully agree with the reviewer that cell-to-cell variations in Ac expression might contribute to proneural heterogeneity and SB. This is now briefly discussed.

      Minor points:

      1. Symmetry Breaking (SB) should be abbreviated in the Abstract. The authors initially use the full term without abbreviation, and only on page 5, the abbreviation is finally defined; however, it should be introduced much earlier.

      fixed

      The second-to-last sentence in the abstract, "These lateral inhibition defects were correlated via cellular rearrangements," is unclear regarding what defects the authors are referring to.

      This sentence was rewritten: 'Live imaging showed that these patterning defects were corrected via cellular rearrangements associated with global tissue fluidity, not via cell fate change.'

      For clarity, being more specific in the text with regard to the description of the figure panels would be beneficial (e.g. page 3 Fig 1C-E); referring to C-E together makes it hard to understand what each panel shows.

      fixed

      In many instances, the movies are not properly referenced (e.g. on page 5, third row simply states "movies"), making it difficult to discern which movie should be checked. On page 8, when authors refer to movie 3, they likely meant movie 5.

      fixed

      Figure S1 requires some corrections.

      We thank the reviewer for helping us improve the presentation of our results.

      The authors use the short name "scute" initially and then switch to the shortened version "sc".

      fixed

      Additionally, the nlsRFP (blue) is difficult to see; adjusting the levels or changing colors/showing separate channels may improve visibility.

      The authors mention clone borders, but none are shown. It would greatly help to outline the borders in all figures.

      The ubiquitous nlsRFP marker is now shown in magenta in Fig S1I that now shows only 2 channels to outline the ADHN (white dotted line) and the clones (yellow dotted lines).

      We also outlined the clone borders in Fig 4C,C'.

      Genotypes of the samples should be indicated, and clarification is needed regarding what "n" represents (number of cells, clones, or flies).

      The genotype studied in Fig S1 and Fig 4 (which is the only complex genotype studied here) is now indicated in the Methods section. We have clarified what the different 'n' meant, in Fig 4 (see text) and elsewhere (see legend of Fig S2 for instance).

      What do the arrows in the panel B show?

      Thanks for pointing this out. The arrows in Fig S1I' indicate Cut/Hnt-positive cells (SOPs) within the clones (as now explained in the legend).

      It is also recommended to display important channels as separate black and white images.

      Separate channels are now shown in Fig S1 and S3.

      Additionally, the use of RNAi against GFP instead of RNAi against scute should be justified; using RNAi GFP as the genotype on the graph could be interpreted as a control genotype rather than downregulation of scute.

      An RNAi construct against GFP was used because this construct was known to be very efficient and specific. Indeed, a strong knock-down of GFP-Sc was obtained by this approach (see Fig 4C'). We did not test sc RNAi constructs in the context of GFP-Sc. To avoid confusion, we are now indicating Sc downregulation (gfp RNAi) in Fig 4C'.

      In the Figure 2 Legend, the authors use "std" as an abbreviation to define standard deviation. Typically, this is abbreviated as SD.

      fixed

      In Figure 4E, the authors do not explain on why there are points on the x-axis that correspond to a decimal number of cells.

      Since heterogeneity was calculated over a 20 min interval, we likewise calculated the number of neighbors over the same time interval. Thus, the number of neighbors for each SOP corresponds to an averaged value calculated over this time interval. This is now explained in the legend.

      --

      Reviewer #3

      1. First and foremost, the authors should state in the first paragraph of the Results that scGFP is a CRISPR knockin and thus it's the only source of Sc protein in the animals imaged (this is stated only in the Methods section).

      Thanks for this comment. We agree that this is one of the strengths of our work that we should emphasize. We now state in the results section: 'GFP-Sc is produced from the endogenous locus such that all Sc molecules produced in these pupae are GFP-tagged.'

      The magnitude of the Sc increase should be commented on. Based on the intensity and FDI plots in Fig. 3B-E, an increase of 15-17% in the amount of Sc is suggested (the FDI plateaus at 0.08, which gives 1.08/0.92 = 1.17x the level of Sc in the SOP vs the surrounding cells). However, in the stills shown in Fig. 2BCD and in Fig. 3A, the intensity differential between SOPs and neighbors seems at least 100% (ie at least double the intensity, which would yield an FDI of >1/3 =0.33). Why is this high contrast never seen in the quantitative measurements?

      Thanks for asking about the fold change of GFP-Sc levels in SOPs, from SB to its plateau. This fold change can be seen in Fig 3D: the normalized value of GFP-Sc is 1.12 at SB, and 1.26 three hours after SB (when the FDI plateaus), indicative of a 2.2-fold increase of GFP-Sc in SOPs (0.26/0.12 = 2.2, following background subtraction; see our detailed response to reviewer #1, minor point 1, about background signal analysis and normalization of the signal). This fold-change value is now indicated in the legend of Fig 3D. Obviously, this fold-change value is highly sensitive to signal normalization. Since the autofluorescence signal was stronger than the GFP-Sc signal (see Fig R2 above) and varied over time (due to bleaching; see Figs R1 and R2 above), we feel that this fold-change value should be taken with a grain of salt.
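
      The two ways of expressing the SOP/neighbor difference discussed in this exchange can be compared with a short calculation. The sketch below assumes that the FDI is defined as (SOP - neighbor) / (SOP + neighbor), which is inferred from the reviewer's arithmetic (1.08/0.92) rather than quoted from the Methods, and that the background corresponds to a normalized value of 1, as implied by the 0.26/0.12 ratio. The function names are illustrative only.

```python
def fdi_to_ratio(fdi):
    """Convert an FDI value into an SOP/neighbor intensity ratio.

    Assumes FDI = (sop - nbr) / (sop + nbr); this definition is inferred
    from the reviewer's arithmetic, not taken from the Methods.
    """
    return (1 + fdi) / (1 - fdi)


def ratio_to_fdi(ratio):
    """Inverse of fdi_to_ratio under the same assumed definition."""
    return (ratio - 1) / (ratio + 1)


def fold_change(later, earlier, background=1.0):
    """Fold change between two normalized intensities after background subtraction."""
    return (later - background) / (earlier - background)


print(round(fdi_to_ratio(0.08), 2))       # 1.17, the reviewer's estimate
print(round(ratio_to_fdi(2.0), 2))        # 0.33, the FDI expected for a doubling
print(round(fold_change(1.26, 1.12), 1))  # 2.2, the authors' fold change in SOPs
```

      Under these assumptions the two numbers are not in conflict: the FDI compares an SOP with its neighbors at one time point, whereas the 2.2-fold value compares the SOP with itself at two time points.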

      From Fig. 2A-D it appears that the ScGFP fluorescence intensity is at the same level or weaker than nearby autofluorescence. Please state (1) how you confirmed that the histoblast nest has lower autofluorescence than the larval epidermis and (2) how you corrected for histoblast nest autofluorescence in your quantifications.

      As detailed above (our response to reviewer #1, minor point 1), the specific GFP-Sc signal is ten times lower than the autofluorescence signal. We did not compare the autofluorescence signal produced by larval and imaginal cells (but note that larval epidermal cells had a stronger autofluorescence signal; see the yellow dots in Fig 2A). Normalization of the signal to correct for autofluorescence was explained in the Methods (and is also detailed above in our response to reviewer #1, minor point 1).

      The paradoxical result of Fig. S1B should be discussed. On the one hand it is stated that "Ac and Sc specify the fate of the Sensory Organ Precursor cells (SOPs)" (p.2) and on the other S1B shows SOP specification in the absence of Sc. Are the SOPs shown in Fig S1B rare exceptions? Do the authors believe that these rare exceptions are there because of inefficient RNAi (since in comparison with S1A, in the null condition almost no SOPs should be formed)? Or they are the SOPs in RNAi clones as rare as the occasional bristles in S1A?

      We do not see the result of Fig S1B as paradoxical; we interpreted this result assuming that Ac and Sc were redundant for SOP determination. We now provide clear genetic evidence in support of this view (see our response above to reviewer #2, point 2). Otherwise, we found that RNAi is efficient (see loss of the GFP signal in clones in Fig. 4C'). In adult males, the density of bristles appeared to be quite normal over clonal patches of gfp RNAi cells (not shown), consistent with Ac being redundant with Sc.

      One figure that is not straightforward to interpret is Fig. 4E. It plots ScGFP heterogeneity vs. number of RNAi neighbors. Each point in the plot must be an individual SOP (165 total). Therefore, its neighbors (the x-axis) should take integral (not decimal) values. How can a single SOP have a decimal number of RNAi neighbors, especially since heterogeneity was sampled over a 10min time-window, when not much cell rearrangement can take place? Please explain.

      Since heterogeneity was calculated over a 20 min interval, we likewise calculated the number of neighbors over the same time interval. Thus, the number of neighbors for each SOP corresponds to an averaged value calculated over this time interval. This is now explained in the legend: 'Note that the number of neighbors was likewise calculated over this time interval, and the resulting number of neighbors may not take an integral value.'
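
      A minimal illustration of why a time-averaged neighbor count need not be an integer is given below. The per-frame counts and the number of frames are made-up values, not data from the study.

```python
def mean_neighbor_count(counts_per_frame):
    """Average the number of RNAi neighbors of one SOP over the frames of a time window."""
    return sum(counts_per_frame) / len(counts_per_frame)


# Hypothetical counts for one SOP in five consecutive frames of the window:
# one neighbor is gained partway through because of a cell rearrangement.
counts = [2, 2, 3, 3, 3]
print(mean_neighbor_count(counts))  # 2.6
```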

      I found the discussion of the Notch reporter dynamics (Fig. 7) confusing in several places.

      (6a) Whereas it's clear that there is plenty of Notch signaling going on before SBN, the authors repeatedly imply that Notch signaling starts after SBN. For example, in the Results (p.9) they state "Thus, this quantitative approach failed to detect a phase of reciprocal Notch signaling during which proneural cluster cells would both send and receive a Delta-Notch signal prior to SOP emergence." The fact that the NRE-deGFP gave a robust signal before the start of the movies clearly means that mutual inhibition was going on for quite some time before SB. In fact, an FDI of 0 for >4h prior to SBN (Fig. 7G) means exactly this: that the level of Notch response among the cluster cells is equivalent ("mutual inhibition" lasts for at least 4h before SBN).

      (6b) In the first paragraph of this section (p.8) they comment that the pre-existence of Notch signaling is unexpected - why? I interpret it to simply be mutual inhibition (see above). Then they go on to quantitate the average Notch response intensity over the entire posterior ADHN (please define the borders of the "posterior" ADHN). I question the informational value of this analysis (averaging over a large region), when Notch signaling is known to have intense local cell-to-cell variability (also evident in the stills shown in Fig. 7A,B,C).

      We apologize for not describing well enough the data shown in Fig 7E, and for not explaining clearly our interpretation of the NRE-deGFP signal.

      While the observation of a strong NRE-deGFP signal indeed indicates that Notch signaling had been active prior to the time of observation (in this sense, Notch is indeed active long before SBN), this does not necessarily imply that Notch is still active at that time. This is because the deGFP protein produced by the NRE-deGFP reporter is stable relative to the time scale of the studied process. Its measured half-life in S2 cells cultured at 25°C is 2h (PMID: 31140975). Based on these data, the NRE-deGFP signal is likely to remain detectable several hours after Notch signaling has been switched off. If the rate of production of deGFP is lower than its rate of degradation, then the NRE-deGFP signal is expected to progressively decay over time. We believe that this is what we observed in our movies: while a strong signal was detected over the posterior half of the ADHN at 14-15h APF, this signal decreased over time (Fig 7D). To interpret the temporal dynamics of NRE-deGFP signal in terms of instantaneous Notch activity, we examined the Rate of Change (ROC): an increase of the NRE-deGFP signal over time (positive ROC) would indicate that Notch activity is increasing (more precisely, the production rate of deGFP is higher than its rate of degradation), whereas a decrease (negative ROC) indicates that Notch becomes less active (or inactive if the rate of decrease approximates the decay rate of the deGFP protein). Our data shown in Fig 7D showed that the NRE-deGFP signal (measured in the area indicated with a dotted line in Fig 7A,B; this area was defined by the initial pattern of NRE-deGFP) decreased over time (negative ROC) between t=1 and t=6.5h. We therefore conclude that Notch signaling is decreasing to reach a minimum at t=~3.5h, indicating that the level of Notch activity is at its lowest around the time of SB. At this minimum, the decay rate corresponds to a protein half-life of 4.4h, which is not so different from the measured half-life of deGFP in S2 cells (particularly if one assumes a 1.4x difference between the decay rates measured at 22 and 25°C, based on the known temperature-dependent speed of development). This is why we conclude that Notch signaling is very low at this stage. Additionally, no NRE-deGFP signal was detected before t=4:30h (movie 7) in the initially NRE-deGFP negative cells (located anterior to the area indicated with a dotted line in Fig 7A). This indicated that Notch was activated late in this area. Together, our observations are not consistent with the view that Notch mediates a strong mutual inhibition signal over a prolonged time interval prior to SB.
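
      The reasoning above, which relates half-lives, decay rates and the sign of the ROC, can be sketched numerically. The example assumes a simple first-order model, dS/dt = production - k * S with k = ln(2) / half-life; this model and the function names are assumptions made for illustration and are not taken from the manuscript.

```python
import math


def decay_constant(half_life_h):
    """First-order decay constant (per hour) for a half-life given in hours."""
    return math.log(2) / half_life_h


def rate_of_change(signal, production, half_life_h):
    """ROC of the reporter signal under the assumed model dS/dt = production - k * S."""
    return production - decay_constant(half_life_h) * signal


# Half-life of deGFP measured in S2 cells at 25 C, as quoted in the response.
half_life_25c = 2.0
# Assumed 1.4x slower decay at 22 C, as suggested in the response.
half_life_22c = half_life_25c * 1.4

print(round(half_life_22c, 1))         # 2.8 h expected at 22 C
print(round(decay_constant(4.4), 3))   # 0.158 per hour, apparent decay at the minimum
print(rate_of_change(1.0, 0.0, half_life_22c) < 0)  # True: without production the signal decays
```

      In this picture, a negative ROC whose magnitude approaches the decay constant of deGFP indicates little or no new reporter production, which is the basis of the argument that Notch activity is low around the time of SB.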

      To further study the pattern of Notch activity, we have monitored over time the accumulation pattern of GFP-tagged E(spl)m3-HLH (GFP-m3) (PMID: 31375669) in fixed samples (Fig S3F-G'). This confirmed that Notch was active in posterior ADHN cells and in the PDHN prior to 14h APF, i.e. prior to the onset of Ac and Sc, and that Notch activation extended to the central ADHN domain at 17-18h APF (Fig S3E-E' and G-G', and Fig 7I-I''), coinciding with SOP emergence.

      Otherwise, the reviewer is correct when stating that a FDI value close to 0 indicates that the level of measured fluorescence among the different cells of the considered cluster is similar. Such a FDI value would be measured if cells did not express NRE-deGFP or had decreasing but similar levels of NRE-deGFP. This FDI value does not, per se, imply that Notch is active.

      And then they move on to a (much more informative) cell-by-cell analysis, without even changing paragraphs, making it hard for the reader to follow. (6c) The conclusion at the end of the second paragraph (p. 9) "It also showed that SB was detected soon after the onset of Notch-mediated inhibitory signaling." is nowhere supported by data. If I understand well, SB refers to Sc and "the onset of Notch-mediated inhibitory signaling" refers to SBN (which is the onset of ASYMMETRY in Notch signaling, not the onset of Notch signaling, which has been going on for hours earlier). I don't see any data comparing SB with SBN. In fact, this is an important question to address (see below - comment 10).

      We apologize for the lack of clarity in our writing, we meant: "It also showed that SBN was detected soon after the onset of Notch-mediated inhibitory signaling."

      Yes, SBN refers to the onset of asymmetry in Notch signaling, as measured using NRE-deGFP. As explained above (but see also our response to point #7 below), our data do not provide evidence for a detectable Notch signal prior to SBN.

      We agree that comparing SB and SBN would be nice. Unfortunately, our current tools do not permit a detailed comparison (see our detailed response below, point #10).

      Mutual inhibition amongst neighboring cells has been proposed to involve (besides mutual Notch signaling) an increase in Sc levels in 2-3 cells in a cluster before the singularization of a single SOP. The authors seem rather biased against such a transient Sc hike based on their results in Fig. 2D, where the neighboring cells stay at rather constant basal Sc levels for several hours, while the Sc SB event happens. However, looking at an individual SOP in Fig. 2B, I do detect a mild hike in the pink curve right around SB in the blue curve. Could the average result from 160 SOPs (in Fig 2D) simply blur such transient Sc hikes, if they happen with different kinetics for different SOPs? Couldn't the 10% of SOP twins (shown in Fig. 6) represent a special case of this transient "subcluster" Sc hike? I would appreciate some discussion on this point. [Whether Sc is transiently upregulated or not, however, does not change my firm conclusion - from the data presented - that Notch-mediated mutual inhibition has been going on long before SBN.]

      First, our data are consistent with the notion that a few proneural cells progressively accumulate higher levels of Scute prior to SB (as proposed above). Indeed, the moderate increase in both GFP-Sc levels and coefficient of variation values (GFP-Sc heterogeneity) seen prior to SB corresponds to what the reviewer has in mind (higher levels of GFP-Sc in a few proneural cluster cells). We also appreciate the reviewer's comment about the plot shown in Fig 2D. However, we strongly feel that our quantitative analysis of a large dataset is a strength. Thus, we do not find it useful to discretize a continuous process by introducing the notion of 'subclusters' of 2-3 cells. Likewise, we believe that it is more informative to focus our analysis on the entire dataset using average and SD values, and we do not wish to base our interpretation of the process on selected tracks (the one shown in Fig 2B only served as an illustration of how we performed our analysis and has no interpretative value).

      The reviewer also states that "mutual inhibition amongst neighboring cells has been proposed to involve an increase in Sc levels in 2-3 cells in a cluster before the singularization of a single SOP". Since there is no published description of the pattern of accumulation of Scute in abdominal histoblasts (to the best of our knowledge), we hypothesize that this statement applies to the proneural clusters in the developing wing disc. This is because the accumulation pattern of Sc has been studied in detail in that context by the Modolell and Carroll labs (PMID: 2044965, PMID: 2044964). However, their description of the accumulation pattern of Scute (in fixed samples, using anti-Sc antibodies) did not refer to sub-clusters of 2-3 cells. We would appreciate it if the reviewer could direct us to the relevant published observation.

      Finally, we are not sure we follow the reviewer's firm conclusion, drawn from our data, that Notch-mediated mutual inhibition has been going on long before SBN. Instead, our data clearly showed that the ADHN region that produced SOPs exhibited two distinct NRE-deGFP patterns, with Notch signaling being active prior to imaging (i.e. prior to 14h APF) and decreasing to reach a minimum of Notch activation around 17h APF (i.e. around the time of SB, as determined by GFP-Sc imaging) in the posterior area of the ADHN.

      Thus, our data do not show that mutual inhibition does not take place in this tissue but rather imply that the phase of mutual inhibition (or competition) must be relatively short, or transient, and that competition amongst proneural cluster cells operates at low Notch and Sc levels (probably contrary to what many people have in mind).

      Some minor points:

      8. Please change Cad-GFP to Ecad-GFP or shg-GFP, as Cad misdirects to caudal.

      Thanks, changed into Ecad-GFP and Ecad-mKate

      What is c in "(x,y,z,c,t) movies"? (a fifth independent variable?)

      c stands for channel. This is relatively standard nomenclature.

      The authors show that Sc displays a SB event leading to FDI of 0.08 and the Notch reporter displays another SB (SBN) leading to a much more pronounced FDI of -0.2. Are these two events (the hike of Sc levels and the plummeting of Notch signal) contemporaneous or does one precede the other? Having both tagged with GFP makes it impossible to image simultaneously, but the authors could register each reporter's dynamics relative to the time of SOP division (as done in Fig. 5C) to get a sense of their relative order.

      We do agree with the reviewer that it would be nice to be able to align in time these two data sets. Unfortunately, the temporal correlation between SB and the SOP division is too variable (4.7 +/- 1.1) to confidently align these two datasets using this event as a time reference. New tools are needed (see our response to point #11 below).

      Where in the above timeline is the SOP fate definitively adopted? neur-nlsGFP, Ac-RFP, m3Cherry and Sens detection in Figs. 1 and 7 give us a rough idea that these other markers appear around the time of Sc FDI peaking, around 3h after the initial SB. But this is not presented in an organized fashion - the reader collects this information sporadically. A reanalysis of the already existing data attempting to place these various markers in an integrated timeline would be of great importance in understanding the details of this cell fate specification process. Which is the earliest SB event? sc, neur or Notch? How long does it take from that early SB until definitive SOP markers (Sens) first appear?

      We agree with the reviewer that it would be interesting to extend the approach reported here for Scute to characterize SB and rate of FDI for other key factors governing the selection of SOPs. As pointed out by the reviewer (point #10 above), it would also be important to register in time these various events. Unfortunately, the maturation of RFP, mCherry, FP670, etc. appeared to be too slow relative to the rapid turnover of the Ac, Sc and E(spl)-HLH factors, which prevented us from performing two-color imaging. Hence, current tools do not permit us to determine which is the earliest SB event.

      More genetic perturbations could be performed to solidify the model of cell-cell communication during lateral inhibition. Two obvious ones come to mind: (a) How would the Sc-GFP dynamics change in a Notch-RNAi background? (b) How would the NRE-deGFP dynamics change in a sc-RNAi background?

      See our detailed response to reviewer #1, major point #2.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Your editorial guidance, reviews, and suggestions have led us to make substantial changes to our manuscript. While we detail point-by-point responses in typical fashion below, I wanted to outline, at a high level, what we’ve done.

      (1) Methods. Your suggestions led us to rethink our presentation of our methods, which are now described more cohesively in a new methods section in the main text.

      (2) Model Validation & Robustness. Reviewers suggested various validations and checks to ensure that our findings were not, for instance, the consequence of a particular choice of parameter. These can be found in the supplementary materials.

      (3) Data Cleaning & Inclusion/Exclusion. Finally, based on feedback, our new methods section fully describes the process by which we cleaned our original data, and on what grounds we included/excluded individual faculty records from analysis.

      eLife assessment

      Efforts to increase the representation of women in academia have focussed on efforts to recruit more women and to reduce the attrition of women. This study - which is based on analyses of data on more than 250,000 tenured and tenure-track faculty from the period 2011-2020, and the predictions of counterfactual models - shows that hiring more women has a bigger impact than reducing attrition. The study is an important contribution to work on gender representation in academia, and while the evidence in support of the findings is solid, the description of the methods used is in need of improvement.

      Reviewer #1 (Public Review):

      Summary and strengths

      This is an interesting paper that concludes that hiring more women will do more to improve the gender balance of (US) academia than improving the attrition rates of women (which are usually higher than men's). Other groups have reported similar findings but this study uses a larger than usual dataset that spans many fields and institutions, so it is a good contribution to the field.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Weaknesses

      The paper uses a mixture of mathematical models (basically Leslie matrices, though that term isn't mentioned here) parameterised using statistical models fitted to data. However, the description of the methods needs to be improved significantly. The author should consider citing Matrix Population Models by Caswell (Second Edition; 2006; OUP) as a general introduction to these methods, and consider citing some or all of the following as examples of similar studies performed with these models:

      Shaw and Stanton. 2012. Proc Roy Soc B 279:3736-3741

      Brower and James. 2020. PLOS One 15:e0226392

      James and Brower. 2022. Royal Society Open Science 9:220785

      Lawrence and Chen. 2015. [http://128.97.186.17/index.php/pwp/article/view/PWP-CCPR-2015-008]

      Danell and Hjerm. 2013. Scientometrics 94:999-1006

      We have expanded the description of methods in a new methods section of the paper which we hope will address the reviewer’s concerns.

      We agree that our model of faculty hiring and attrition resembles Leslie matrices. In results section B, we now mention Leslie matrices and cite Matrix Population Models by Caswell, noting a few key differences between Leslie matrices and the model of hiring and attrition presented in this work. Most notably, in the hiring and attrition model presented, the number of new hires is not based on per-capita fertility constants. Instead, population sizes are predetermined fixed values for each year, precluding exponential population growth or decay towards 0 that is commonly observed in the asymptotic behavior of linear Leslie Matrix models.
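
      The key difference from a Leslie matrix noted above (new hires fill a fixed headcount instead of being generated by per-capita fertility) can be sketched as follows. This is a deliberately simplified illustration, not the model used in the paper: it tracks expected counts only, ignores career-age structure, and uses made-up attrition and hiring values; all function and parameter names are hypothetical.

```python
def step(women, men, attrition_w, attrition_m, hire_share_women):
    """Advance one year of a hypothetical fixed-size hiring/attrition model."""
    total = women + men
    stay_w = women * (1 - attrition_w)   # women who remain on the faculty
    stay_m = men * (1 - attrition_m)     # men who remain on the faculty
    # Hires are set by the number of vacancies, not by per-capita fertility,
    # so the total headcount is held at its predetermined value.
    vacancies = total - (stay_w + stay_m)
    new_w = stay_w + vacancies * hire_share_women
    new_m = stay_m + vacancies * (1 - hire_share_women)
    return new_w, new_m


def project(women, men, years, **rates):
    """Iterate the one-year step for a number of years."""
    for _ in range(years):
        women, men = step(women, men, **rates)
    return women, men


# Made-up example: 30% women initially, slightly higher attrition for women,
# and 45% women among new hires.
w, m = project(300.0, 700.0, years=40,
               attrition_w=0.05, attrition_m=0.04, hire_share_women=0.45)
print(round(w + m, 6))        # total headcount is unchanged
print(round(w / (w + m), 3))  # share of women after 40 years
```

      Because the total is fixed, the population cannot grow exponentially or decay towards 0, which is the asymptotic behavior a linear Leslie matrix model would typically show.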

      We have additionally revised the main text to cite the listed examples of similar studies (we had already cited James and Brower, 2022). We thank the reviewer for bringing these relevant works to our attention.

      The analysis also runs the risk of conflating the fraction of women in a field with gender diversity! In female-dominated fields (e.g. Nursing, Education) increasing the proportion of women in the field will lead to reduced gender diversity. This does not seem to be accounted for in the analysis. It would also be helpful to state the number of men and women in each of the 111 fields in the study.

      We have carefully examined the manuscript and revised the text to correctly differentiate between gender diversity and women’s representation.

      We have additionally added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Reviewer #2 (Public Review):

      Summary:

      This important study by LaBerge and co-authors seeks to understand the causal drivers of faculty gender demographics by quantifying the relative importance of faculty hiring and attrition across fields. They leverage historical data to describe past trends and develop models that project future scenarios that test the efficacy of targeted interventions. Overall, I found this study to be a compelling and important analysis of gendered hiring and attrition in US institutions, and one that has wide-reaching policy implications for the academy. The authors have also suggested a number of fruitful future avenues for research that will allow for additional clarity in understanding the gendered, racial, and socioeconomic disparities present in US hiring and attrition, and potential strategies for mitigating or eliminating these disparities.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Strengths:

      In this study, LaBerge et al use data from over 268,000 tenured and tenure-track faculty from over 100 fields at more than 12,000 PhD-granting institutions in the US. The period they examine covers 2011-2020. Their analysis provides a large-scale overview of demographics across fields, a unique strength that allows the authors to find statistically significant effects for gendered attrition and hiring across broad areas (STEM, non-STEM, and topical domains).

      LaBerge et al. find gendered disparities in attrition - using both empirical data and their counterfactual model - that account for the loss of 1378 women faculty across all fields between 2011 and 2020. It is true that "this number is both a small portion of academia... and a staggering number of individual careers," as this loss of women faculty is comparable to losing more than 70 entire departments. I appreciate the authors' discussion about these losses - they note that each of these is likely unnecessary, as women often report feeling that they were pushed out of academic jobs.

      LaBerge et al. also find - by developing a number of model scenarios testing the impacts of hiring, attrition, or both - that hiring has a greater impact on women's representation in the majority of academic fields in spite of higher attrition rates for women faculty relative to men at every career stage. Unlike many other studies of historical trends in gender diversity, which have often been limited to institution-specific analyses, they provide an analysis that spans over 100 fields and includes nearly all US PhD-granting institutions. They are able to project the impacts of strategies focusing on hiring or retention using models that project the impact of altering attrition risk or hiring success for women. With this approach, they show that even relatively modest annual changes in hiring accumulate over time to help improve the diversity of a given field. They also demonstrate that, across the model scenarios they employ, changes to hiring drive the largest improvement in the long-term gender diversity of a field.

      Future work will hopefully - as the authors point out - include intersectional analyses to determine whether a disproportionate share of lost gender diversity is due to the loss of women of color from the professoriate. I appreciate the authors' discussion of the racial demographics of women in the professoriate, and their note that "the majority of women faculty in the US are white" and thus that the patterns observed in this study are predominantly driven by this demographic. I also highly appreciate their final note that "equal representation is not equivalent to equal or fair treatment," and that diversifying hiring without mitigating the underlying cause of inequity will continue to contribute to higher losses of women faculty.

      Weaknesses

      First, and perhaps most importantly, it would be beneficial to include a distinct methods section. While the authors have woven the methods into the results section, I found that I needed to dig to find the answers to my questions about methods. I would also have appreciated additional information within the main text on the source of the data, specifics about its collection, inclusion and exclusion criteria for the present study, and other information on how the final dataset was produced. This - and additional information as the authors and editor see fit - would be helpful to readers hoping to understand some of the nuance behind the collection, curation, and analysis of this important dataset.

      We have expanded upon the description of methods in a new methods section of the paper.

      We have also added a detailed description of the data cleaning steps taken to produce the dataset used in these analyses, including the inclusion/exclusion criteria applied. This detailed description is at the beginning of the methods section. This addition has substantially enhanced the transparency of our data cleaning methods, so we thank the reviewer for this suggestion.

      I would also encourage the authors to include a note about binary gender classifications in the discussion section. In particular, I encourage them to include an explicit acknowledgement that the trends assessed in the present study are focused solely on two binary genders - and do not include an analysis of nonbinary, genderqueer, or other "third gender" individuals. While this is likely because of the limitations of the dataset utilized, the focus of this study on binary genders means that it does not reflect the true diversity of gender identities represented within the professoriate.

      In a similar vein, additional context on how gender was assigned on the basis of names should be added to the methods section.

      We use a free, open-source, and open-data Python package called nomquamgender (Van Buskirk et al, 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.
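
      The thresholding step described above can be illustrated without relying on the package itself. The sketch below does not call nomquamgender; it assumes that each name comes with a score between 0 and 1 measuring how strongly it is associated with women, and it uses a made-up cutoff. The scores, the threshold value and the function name are all hypothetical.

```python
def label_name(p_gendered_female, threshold=0.1):
    """Return 'woman', 'man', or None for a name, given an association score.

    p_gendered_female is a hypothetical score in [0, 1]; names whose
    association is not strong enough in either direction stay unlabeled.
    """
    uncertainty = min(p_gendered_female, 1 - p_gendered_female)
    if uncertainty > threshold:
        return None  # association too weak: no binary label is applied
    return "woman" if p_gendered_female >= 0.5 else "man"


# Made-up scores for three placeholder names.
scores = {"name_a": 0.98, "name_b": 0.03, "name_c": 0.55}
labels = {name: label_name(p) for name, p in scores.items()}
print(labels)  # {'name_a': 'woman', 'name_b': 'man', 'name_c': None}
```

      A stricter threshold leaves more names unlabeled, trading coverage for fewer incorrect labels.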

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.
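
      As a rough illustration of what "sufficiently strong association" can mean in practice, the sketch below applies a cutoff to an estimated name-gender probability. It does not call nomquamgender; the function name, the cutoff of 0.1, and the example scores are all hypothetical.

```python
def assign_label(p_woman, cutoff=0.1):
    """Label a name only when its estimated association is strong.

    p_woman is an estimated probability (0 to 1) that a name is associated
    with women. The cutoff is an assumed value, not one taken from the paper.
    """
    if p_woman >= 1 - cutoff:
        return "woman"
    if p_woman <= cutoff:
        return "man"
    return None  # association too weak: leave the name unlabeled


# Made-up scores for three placeholder names.
example_scores = {"name_a": 0.98, "name_b": 0.03, "name_c": 0.55}
labels = {name: assign_label(p) for name, p in example_scores.items()}
```

      Names that fall between the two cutoffs stay unlabeled, which is one way a binary labeling scheme avoids forcing a guess on ambiguous names.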

      I do think that some care might be warranted regarding the statement that "eliminating gendered attrition leads to only modest changes in field-level diversity" (Page 6). While I do not think that this is untrue, I do think that the model scenarios where hiring is "radical" and attrition is unchanged from present (equal representation of women and men among hires (ER) + observed attrition (OA)) show that a sole focus on hiring dampens the gains that can otherwise be addressed via even modest interventions (see, e.g., gender-neutral attrition (GNA) + increasing representation of women among hires (IR)). I am curious as to why the authors did not include an additional scenario where hiring rates are equal and attrition is equalized (i.e., GNA + ER). The importance of including this additional model is highlighted in the discussion, where, on Page 7, the authors write: "In our forecasting analysis, we find that eliminating the gendered attrition gap, in isolation, would not substantially increase representation of women faculty in academia. Rather, progress towards gender parity depends far more heavily on increasing women's representation among new faculty hires, with the greatest change occurring if hiring is close to gender parity." I believe that this statement would be greatly strengthened if the authors can also include a comparison to a scenario where both hiring and attrition are addressed with "radical" interventions.

      Our rationale for omitting the GNA + ER scenario in the presented analysis is that we can reason about the outcomes of this scenario without the need for computation; if a field has equal inputs of women and men faculty (on average) and equal retention rates between women and men (on average), then, no matter the field’s initial age and gender distribution of faculty, the expected value for the percentage of women faculty after all of the prior faculty have retired (which may take 40+ years) is exactly 50%. We have updated the main text to discuss this point.
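
      The argument can be checked with a small deterministic sketch of expected head counts. This is not the model used in the paper: the function, the five-year career length, the retention probabilities, and the starting roster are all made up for illustration, and hiring is assumed to refill the field to its initial size each year.

```python
def project_share_of_women(women, men, stay_prob, women_hire_share,
                           hire_age_dist, years):
    """Expected share of women after `years` of gender-neutral attrition.

    women, men: expected head counts indexed by career age.
    stay_prob[a]: probability of staying one more year at career age a,
        applied identically to women and men (gender-neutral attrition).
    Faculty in the last career-age slot retire at the next step.
    """
    women, men = list(women), list(men)
    total = sum(women) + sum(men)
    for _ in range(years):
        # Everyone who stays ages by one year; the oldest cohort retires.
        women = [0.0] + [w * stay_prob[a] for a, w in enumerate(women[:-1])]
        men = [0.0] + [m * stay_prob[a] for a, m in enumerate(men[:-1])]
        # Hire exactly enough people to restore the initial field size.
        hires = total - sum(women) - sum(men)
        for a, age_share in enumerate(hire_age_dist):
            women[a] += hires * age_share * women_hire_share
            men[a] += hires * age_share * (1 - women_hire_share)
    return sum(women) / (sum(women) + sum(men))


# Hypothetical field: 20% women, careers last at most five years.
start_women = [2.0] * 5
start_men = [8.0] * 5
stay = [0.9, 0.9, 0.8, 0.7]           # retention at career ages 0..3
hire_ages = [0.7, 0.2, 0.1, 0.0, 0.0]  # share of hires at each career age

after_one_year = project_share_of_women(
    start_women, start_men, stay, 0.5, hire_ages, years=1)
after_full_turnover = project_share_of_women(
    start_women, start_men, stay, 0.5, hire_ages, years=5)
```

      Once every member of the starting roster has retired, the women and men cohorts in the sketch are identical, so the share is 50% regardless of the starting composition; before that point the share sits between the initial value and 50%.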

      Reviewer #3 (Public Review):

      This manuscript investigates the roles of faculty hiring and attrition in influencing gender representation in US academia. It uses a comprehensive dataset covering tenured and tenure-track faculty across various fields from 2011 to 2020. The study employs a counterfactual model to assess the impact of hypothetical gender-neutral attrition and projects future gender representation under different policy scenarios. The analysis reveals that hiring has a more significant impact on women's representation than attrition in most fields and highlights the need for sustained changes in hiring practices to achieve gender parity.

      Strengths:

      Overall, the manuscript offers significant contributions to understanding gender diversity in academia through its rigorous data analysis and innovative methodology.

      The methodology is robust, employing extensive data covering a wide range of academic fields and institutions.

      Weaknesses:

      The primary weakness of the study lies in its focus on US academia, which may limit the generalizability of its findings to other cultural and academic contexts.

      We agree that the U.S. focus of this study limits the generalizability of our findings. The findings that we present in this work will only generalize to other populations–whether it be to an alternate industry, e.g., tech workers, or to faculty in different countries–to the extent that these other populations share similar hiring patterns, retention patterns, and current demographic representation. We have added a discussion of this limitation to the manuscript.

      Additionally, the counterfactual model's reliance on specific assumptions about gender-neutral attrition could affect the accuracy of its projections.

      Our projection analysis is intended to illustrate the potential gender representation outcomes of several possible counterfactual scenarios, with each projection being conditioned on transparent and simple assumptions. In this way, the projection analysis is not intended to predict or forecast the future.

      To resolve this point for our readers, we now introduce our projections in the context of the related terms of prediction and forecast, noting that they have distinct meanings as terms of art. On the one hand, prediction and forecasting involve anticipating a specific outcome based on available information and analysis, and typically rely on patterns, trends, or historical data to make educated guesses about what will happen. On the other hand, projections are based on assumptions and are often presented as a panel of possible future scenarios. While predictions and forecasts aim for precision, projections (which we make in our analysis) are more generalized and may involve a range of potential outcomes.

      Additionally, the study assumes that anyone who disappears from the dataset has left academia. In reality, some of these apparent attritions could be researchers who moved to another country or to another institution that is not included in the AARC (Academic Analytics Research Center) dataset.

      In our revision, we have elevated this important point, and clarified it in the context of the various ways in which we count hires and attritions. We now explicitly state that “We define faculty hiring and faculty attrition to include all cases in which faculty join or leave a field or domain within our dataset.” Then, we enumerate the number of situations that could be counted as hires and attritions, including the reviewer’s example of faculty who move to another country.

      Reviewer #1 (Recommendations For The Authors):

      Section B: The authors use an age structured Leslie matrix model (see Caswell for a good reference to these) to test the effect of making the attrition rates or hiring rates equal for men and women. My main concern here is the fitting techniques for the parameters. These are described (a little too!) briefly in section S1B. Some specific questions that are left hanging include:

      A 5th order polynomial is an interesting choice. Some statistical evidence as to why it was the best fit would be useful. What other candidate models were compared? What was the "best fit" judgement made with: AIC, r^2? What are the estimates for how good this fit is? How many data points were fitted to? Was it the best fit choice for all of the 111 fields for men and women?

      We use a logistic regression model for each field to infer faculty attrition probabilities across career ages and time, and we include the career age predictor up to its fifth power to capture the career-age correlations observed in Spoon et al., Science Advances, 2023. For ease of reference, we reproduce the attrition risk curves in Fig. S4.

      We note that faculty attrition rates start low and then reach a peak around 5-7 years after earning PhD, and then decline until around 15-20 years post-PhD, after which, attrition rates increase as faculty approach retirement.

      This function shape starts low and ends high, and includes at least one local minimum, which indicates that career age should be odd-ordered in the model and at least order-3. However, including career age only up to its 3rd-order term tended to miss some of the observed career-age/attrition correlations. We evaluated the fit using 5-fold cross-validation with a Brier score loss metric, and among polynomials of degree 1, 3, 5, or 7, we found that the 5th-order model performed well on average over all fields (even if it was not the best for every field), without overfitting in fields with less data. Example fits, reminiscent of the figure from Spoon et al., are now provided in Figs. S4 and S5.

      While the model fit with fifth-order terms may not be the best fit for all 111 fields (e.g., 7th order fits better in some cases), we wanted to avoid field-specific curves that might be overfitted to the field-specific data, especially due to low sample size (and thus larger fluctuations) on the high career age side of the function. Our main text and supplement now include justifications for our choice to include career age up to its fifth-order terms.
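
      For readers unfamiliar with the evaluation metric, the following sketch shows how a held-out Brier score can be averaged over 5 folds. It is only a schematic: the constant-rate "model" stands in for the fitted logistic regressions, and the outcome data and all function names are hypothetical.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(observed)


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists; fold i holds out every k-th record."""
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test


def cross_validated_brier(outcomes, fit, k=5):
    """Average held-out Brier score of the model returned by fit(train_outcomes)."""
    scores = []
    for train, test in k_fold_indices(len(outcomes), k):
        predict = fit([outcomes[j] for j in train])
        held_out = [outcomes[j] for j in test]
        scores.append(brier_score([predict(j) for j in test], held_out))
    return sum(scores) / len(scores)


def fit_constant_rate(train_outcomes):
    """Placeholder model: predicts the training-set attrition rate for everyone."""
    rate = sum(train_outcomes) / len(train_outcomes)
    return lambda record_index: rate


# Made-up outcomes: 1 = left the field that year, 0 = stayed.
left_field = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
baseline_score = cross_validated_brier(left_field, fit_constant_rate)
```

      Lower scores are better. Comparing candidate polynomial degrees would amount to passing a different `fit` function for each degree and comparing the resulting averages.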

      You used the 5th order logistic regression (bottom of page 11) to model attrition at different ages. The data in [24] shows that attrition increases sharply, then drops, then increases again with career age. A fifth order polynomial on its own could plausibly do this, but I associate logistic regression models like this with being monotonically increasing (or decreasing!). Again, more details as to how this worked would be useful.

      Our first submission did not explain this point well, but we hope that Supplementary Figures S4 and S5 provide clarity. In short, we agree of course that typical logistic regression assumes a linear relationship between the predictor variables and the log odds of the outcome variable. This means that the relationship between the predictor variables and the probability of the outcome variable follows a sigmoidal (S-shaped) curve. However, the relationship between the predictor variables and the outcome variable may not be linear.

      To capture more complex relationships, like the increasing, decreasing and then increasing attrition rates as a function of career age, higher-order terms can be added to the logistic regression model. These higher-order terms allow the model to capture nonlinear relationships between the predictor variables and the outcome variable — namely the non-monotonic relationship between rates of attrition and career age — while staying within a logistic regression framework.
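
      A minimal numerical sketch of this point follows. For transparency it uses a cubic rather than a fifth-order polynomial, and the coefficients are invented so that the log odds peak at career age 6 and dip at career age 18; they are not fitted values from the paper.

```python
import math


def attrition_probability(career_age, k=0.002, intercept=-4.0):
    """Logistic model whose log odds are a cubic polynomial in career age.

    The cubic is chosen so that its derivative is k * (a - 6) * (a - 18):
    the log odds rise until age 6, fall until age 18, then rise again.
    Both k and intercept are assumed values for illustration only.
    """
    log_odds = intercept + k * (
        career_age ** 3 / 3 - 12 * career_age ** 2 + 108 * career_age
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Predicted attrition probability at career ages 0 through 30.
curve = [attrition_probability(age) for age in range(31)]
```

      Because the logistic function is monotonic in the log odds, any peaks and dips in the polynomial carry over directly to the predicted probabilities, which is why the fitted curve need not be S-shaped in career age.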

      "The career age of new hires follows the average career age distribution of hires" did you use the empirical distribution here or did you fit a standard statistical distribution e.g. Gamma?

      We used the empirical distribution. This information has been added to the updated methods section in the main text.

      How did you account for institution (presumably available)? Your own work has shown that institution type plays a role, which could be contributing to these results.

      See below.

      What other confounding variables could be at play here, what is available as part of the data and what happens if you do/don't account for them?

      A number of variables included in our data have been shown to correlate with faculty attrition, including PhD prestige, current institution prestige, PhD country, and whether or not an individual is a “self-hire,” i.e., trained and hired at the same institution (Wapman et al., Nature, 2022). Additional factors that faculty self-report as reasons for leaving academia include issues of work-life balance, workplace climate, and professional reasons, and in some cases to varying degrees between men and women faculty (Spoon et al., Sci. Adv., 2023).

      Our counterfactual analysis aims to address a specific question: how would women’s representation among faculty be different today if men and women were subjected to the same attrition patterns over the past decade? To answer this question, it is important to account for faculty career age, which we accept as a variable that will always correlate strongly with faculty attrition rates, as long as the tenure filter remains in place and faculty continue to naturally progress towards retirement age. On the other hand, it is less clear why PhD country, self-hire status, or any of the other mentioned variables should necessarily correlate with attrition rates and with gendered differences in attrition rates more specifically. While some or all of these variables may underlie the causal roots of gendered attrition rates, our analysis does not seek to answer causal questions about why faculty leave their jobs (e.g., by testing the impact of accounting for these variables in simulations per the reviewer’s suggestion). This is because we do not believe the data used in this analysis is sufficient to answer such questions, lacking comprehensive data on faculty stress (Spoon et al., Sci. Adv., 2023), parenthood status, etc.

      What career age range did the model use?

      The career age range observed in model outcomes is a function of the empirically derived attrition rates for faculty across academic fields. The highest career age observed in the AARC data was 80, and the faculty career ages that result from our model simulations and projections do not exceed 80.

      We have also added the distribution of faculty across career ages for the projection scenario model outputs in the supplemental materials Fig. S3 (see response to your later comment regarding career age for further details). Looking at these distributions, it is observed that very few faculty have career age > 60, both in observation and in our simulations.

      What was the initial condition for the model?

      Empirical 2011 faculty rosters are used as the initial conditions for the counterfactual analysis, and 2020 faculty rosters are used as the initial conditions for the projection analysis. This information has been added to the descriptions of methods in the main text.

      Starting the model in 2011 how well does it fit the available data up to 2020?

      Thank you for this suggestion. We ran this analysis for each field starting in 2011, and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields. This finding is not surprising, because the model is fit to the observed data, but it serves to validate the methods that we used to extract the model's parameters. We have added these results to the supplement (Fig. S2).

      What are the sensitivity analysis results for the model? If you have made different fitting decisions how much would the results change? All this applied to both the hiring and attrition parameters estimates.

      We model attrition and hiring using logistic regression, with career age included as an exogenous variable up to its fifth power. A natural question follows: what if we used a model with career age only to its first or third power? Or to higher powers? We performed this sensitivity analysis, and added three new figures to the supplement to present these findings:

      First, we show the observed attrition probabilities at each career age, and four model fits to attrition data (Supplementary Figs S4 and S5). The first model includes career age only to its first power, and this model clearly does not capture the full career age / attrition correlation structure. The second model includes career age to its third power, which does a better job of fitting to the observed patterns. The third model includes career age up to its fifth power, which appears to very modestly improve upon the former model. The fourth model includes career age up to its seventh power, and the patterns captured by this model are largely the same as the 5th-power model up to career age 50, beyond which there are some notable differences in the inferred attrition probabilities. These differences would have relatively little impact on model outcomes because the vast majority of faculty have a career age below 50.

      Second, we show the observed probability that hires are women, conditional on the career age of the hire. Once again, we fit four models to the data, and find that career age should be included at least up to its fifth order in order to capture the correlation structures between career age and the gender of new hires. However, limited differences result from including career age up to the 7th degree in the model (relative to the 5th degree).

      As a final sensitivity analysis, we reproduce Fig. 2, but rather than including career age as an exogenous variable up to its fifth power in our models for hiring and attrition, we include career age up to its third power. Findings under this parameterization are qualitatively very similar to those presented in Fig. 2, indicating that the results are robust to modest changes to model parameterization (shown in supplement Fig. S6).

      Far more detail on this, and some interim results from each stage of the analysis, would make the paper far more convincing. Too much of the analysis currently has an air of a "black box", which would easily allow an unconvinced reader to discard the results.

      We have added more detailed descriptions of the methods to the main text. We hope that the changes made will address these concerns.

      Section C: You use the Leslie model to predict the future population. As the model is linear, the population will either grow exponentially (most likely) or dwindle to zero. You mention you dealt with this by scaling the average value of H to keep the population at 2020 levels? This would change the ratio of hiring to attrition. How did this affect the timescale of the results? If a field had very minimal attrition (and hence grew massively over the time period of the dataset), the hiring rate would have to be very small too, so there would be very little change in the gender balance. Did you consider running the model to steady state instead?

      We chose the 40 year window (2020-2060) for this projection analysis because 40 years is roughly the timespan of a full-length faculty career. In other words, it will take around 40 years for most of the pre-existing faculty from 2020 to retire, such that the new, simulated faculty will have almost entirely replaced all former faculty by 2060.

      For three out of five of our projection scenarios (OA, GNA, OA+ER), the point at which observed faculty are replaced by simulated faculty represents steady state. One way to check this intuition is to observe the asymptotic behavior of the trajectories in Fig. 3B; the slopes for these 3 scenarios nearly level out within 40 years.

      The other two scenarios (OA + IR, GNA+IR) represent situations where women’s representation among new hires is increasing each year. These scenarios will not reach steady state until women represent 100% of faculty. Accordingly, the steady state outcomes for these scenarios would yield uninteresting results; instead, we argue that it is the relative timescales that are interesting.

      What did you do to check that your predictions at least felt realistic under the fitted parameters? (see above for presenting the goodness of fit over the 10 years of the data).

      We ran the analysis suggested in a prior comment (Starting the model in 2011 how well does it fit the available data up to 2020?) and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields, plus the “All STEM” and “All non-STEM” aggregations.

      You only present the final proportion of women for each scenario. As mentioned earlier, models of this type have a tendency to lead to strange population distributions with wild age predictions and huge (or zero populations). Presenting more results here would assuage any worries the reader had about these problems. What is the predicted age distribution of men and women in the long term scenarios? Would a different method of keeping the total population in check have yielded different results? Interim results, especially from a model as complex as this one, rather than just presenting a final single number answer are a convincing validation that your model is a good one! Again, presenting this result will go a long way to convincing readers that your results are sound and rigorous.

      Thank you for this suggestion. We now include a figure that presents faculty age distributions for each projection scenario at 2060 against the observed faculty age distribution in 2020 (pictured below, and as Fig. S3 in the supplementary materials). We find that the projected age distributions are very similar to the observed distributions for natural sciences (shown) and for the additional academic domains. We hope this additional validation will inspire confidence in our model of faculty hiring and attrition for the reviewer, and for future readers.

      In Fig S3, line widths for the simulated scenarios span the central 95% of simulations.
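
      For reference, a central 95% band of this kind can be read off the sorted simulation outcomes. The helper below is a generic nearest-rank sketch with a hypothetical name and made-up inputs; it is not taken from the analysis code.

```python
def central_interval(values, coverage_percent=95):
    """Return (lower, upper) bounds that trim equal tails from sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    tail_percent = (100 - coverage_percent) / 2  # 2.5 for a 95% band
    offset = int(tail_percent * (n - 1) / 100)   # ranks trimmed from each end
    return ordered[offset], ordered[(n - 1) - offset]


# Made-up "simulation outcomes": the integers 0..1000.
lower, upper = central_interval(range(1001))
```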

      Other people have reached almost identical conclusions (albeit with smaller data sets) that hiring is more important than attrition. It would be good to compare your conclusions with their work in the Discussion.

      We have revised the main text to cite the listed examples of similar studies. We thank the reviewer for bringing these relevant works to our attention.

      General comments:

      What thoughts have you given to non-binary individuals?

      Be careful how you use the term "gender diversity"! In many countries "Gender diverse" is a term used in data collection for non-binary individuals, i.e. Male, female, gender diverse. The phrase "hiring more gender diverse faculty" can be read in different ways! If you are only considering men and women then gender balance may be a better framework to use.

      We have added language to the main text which explicitly acknowledges that our analysis focuses on men and women due to limitations in our name-based gender tool, which only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      We have also taken additional care with referring to “gender diversity,” per reviewer 1’s point in their public review.

      Reviewer #2 (Recommendations For The Authors):

      Data availability: I did not see an indication that the dataset used here is publicly available, either in its raw format or as a summary dataset. Perhaps this is due to the sensitive nature of the data, but regardless of the underlying reason, the authors should include a note on data availability in the paper.

      The dataset used for these analyses was obtained under a data use agreement with the Academic Analytics Research Center (AARC). While these data are not publicly available, researchers may apply for data access here: https://aarcresearch.com/access-our-data.

      We also added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Additionally, a variety of summary statistics based on this dataset are available online, here: https://github.com/LarremoreLab/us-faculty-hiring-networks/tree/main

      Gender classification: Was an existing package used to classify gender from names in the dataset, or did the authors develop custom code to do so? Either way, this code should be cited. I would also be curious to know what the error rate of these classifications is, and suggest that additional information on potential biases that might result from automated classifications be included in the discussion, under the section describing data limitations. The reliability of name-based gender classification is particularly of interest, as external gender classifications, such as those applied on the basis of an individual's name, may not reflect the gender with which an individual self-identifies. In other words, while for many people their names may reflect their true genders, for others those names may only reflect their gender assigned at birth and not their self-perceived or lived gender identity. Nonbinary faculty are in particular invisibilized here (and through any analysis that assigns binary gender on the basis of name). While these considerations do not detract from the main focus of the study, which was to utilize an existing dataset classified only on the basis of binary gender to assess trends for women faculty, these limitations should be addressed as they provide additional context for the interpretation of the results and suggest avenues for future research.

      We use a free, open-source, and open-data python package called nomquamgender (Van Buskirk et al, 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      As we mentioned in response to the public review, we use a free and open-source python package called nomquamgender to estimate the strengths of name-gender associations, and we apply gender labels to the names with sufficiently strong associations with a binary gender. This package is based on a paper by Van Buskirk et al. 2023, “An open-source cultural consensus approach to name-based gender classification,” which documents error rates and potential biases.

      Page 1: The sentence beginning "A trend towards greater women's representation could be caused..." is missing a conjunction. It should likely read: "A trend towards greater women's representation could be caused entirely by attrition, e.g., if relatively more men than women leave a field, OR entirely by hiring..."

      We have edited the paragraph to remove the sentence in question.

      Pages 1-2: The sentence beginning "Although both types of strategy..." and ending with "may ultimately achieve gender parity" is a bit of a run-on; perhaps it would be best to split this into multiple sentences for ease of reading.

      We have revised this run-on sentence.

      Page 2: See comments in the public review about a methods section, the addition of which may help to improve clarity for the readers. Within the existing descriptions of what I consider to be methods (i.e., the first three paragraphs currently under "results"), some minor corrections could be added here. First, consider citing the source of the dataset in the line where it is first described (in the sentence "For these analyses, we exploit a census-level dataset of employment and education records for tenured and tenure-track faculty in 12,112 PhD-granting departments in the United States from 2011-2020.") It also may be helpful to include context here (or above, in the discussion about institutional analyses) about how "departments" can be interpreted. For example, how many institutions are represented across these departments? More information on how the authors eliminated the gendered aspect of patterns in their counterfactual model would be helpful as well; this is currently hinted at on page 4, but could instead be included in the methods section with a call-out to the relevant supplemental information section (S2B).

      We have added a citation to Academic Analytics Research Center’s (AARC) list of available data elements to the data’s introduction sentence. We hope this will allow readers to familiarize themselves with the data used in our analysis.

      Faculty department membership was determined by AARC based on online faculty rosters. 392 institutions are represented across the 12,112 departments present in our dataset. We have updated the main text to include this information.

      Finally, we have added a methods section to the main text, which includes information on how the gendered aspect of attrition patterns was eliminated in the counterfactual model.

      Page 2: Perhaps some indication of how many transitions from an out-of-sample institution might be helpful to readers hoping to understand "edge cases."

      In our analysis, we consider all transitions from out-of-sample institutions to in-sample institutions as hires, and all transitions away from in-sample institutions (whether to an out-of-sample institution or out of academia entirely) as attritions. We choose to restrict our analysis of hiring and attrition to PhD-granting institutions in the U.S. in this way because our data do not support an analysis of other, out-of-sample institutions.
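
      This counting rule can be sketched as a comparison of consecutive annual rosters. The function name, the roster format (a set of person identifiers per year), and the example identifiers are assumptions made for illustration, not the structure of the AARC data.

```python
def hires_and_attritions(roster_by_year):
    """Compare consecutive annual rosters of in-sample faculty.

    roster_by_year maps a year to the set of person identifiers on the
    in-sample roster that year. Anyone who appears counts as a hire and
    anyone who disappears counts as an attrition, whatever the reason.
    """
    years = sorted(roster_by_year)
    changes = {}
    for previous, current in zip(years, years[1:]):
        before, after = roster_by_year[previous], roster_by_year[current]
        changes[current] = {
            "hires": after - before,       # newly in sample
            "attritions": before - after,  # no longer in sample
        }
    return changes


# Made-up rosters for a single field.
rosters = {
    2011: {"p1", "p2", "p3"},
    2012: {"p2", "p3", "p4"},  # p1 left the sample, p4 joined
}
changes = hires_and_attritions(rosters)
```

      Under this rule, a move to an out-of-sample institution and a departure from academia are indistinguishable, since both simply remove the person from the next year's roster.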

      I also would have liked additional information on how many faculty switched institutions but remained "in-sample and in the same field" - and the gender breakdowns of these institutional changes, as this might be an interesting future direction for studies of gender parity. (For example, readers may be spurred to ask: if the majority of those who move institutions are women, what are the implications for tenure and promotion for these individuals?)

      While these mid-career moves are not counted as attritions in the present analysis, a study of faculty who switch institutions but remain (in-sample) as faculty could shed light on issues of gendered faculty retention at the level of institutions. We share the reviewer’s interest in a more in-depth study of mid-career moves and how these moves impact faculty careers, and we now discuss the potential value of such a study towards the end of the paper. In fact, this subject is the topic of a current investigation by the authors!

      Page 3: I was confused by the statement that "of the three types of stable points, only the first point represents an equitable steady-state, in which men and women faculty have equal average career lengths and are hired in unchanging proportions." Here, for example, computer science appears to be close to the origin on Figure 1, suggesting that hiring has occurred in "unchanging proportions" over the study interval. However, upon analysis of Table S2, it appears that changes in hiring in Computer Science (+2.26 pp) are relatively large over the study interval compared to other fields. Perhaps I am reading too literally into the phrase that "men and women faculty are hired in unchanging proportions" - but I (and likely others) would benefit from additional clarity here.

      We had created an arrow along with the computer science label in Fig. 1, but it was difficult to see, which is likely the source of this confusion. This was our fault, and we have moved the “Comp. Sci.” label and its corresponding arrow to be more visible in Figure 1.

      The change in women’s representation in Computer Science due to hiring over 2011-2020 was +2.26 pp, as the reviewer points out, but, consulting Fig. 1 and the corresponding table in the supplement, we observe that this is a relatively small change compared to most fields.

      Page 3: If possible it may be helpful to cite a study (or multiple) that shows that "changes in women's representation across academic fields have been mostly positive." What does "positive" mean here, particularly when the changes the authors observe are modest? Perhaps by "positive" you mean "perceived as positive"?

      We used the term positive in the mathematical sense, to mean greater than zero. We have reworded the sentence to read “women's representation across academic fields has been mostly increasing…” We hope this change clarifies our meaning to future readers.

      Page 3: The sentence that ends with "even though men are more likely to be at or near retirement age than women faculty due to historical demographic trends" may benefit from a citation (of either Figure S3 or another source).

      We now cite the corresponding figure in this sentence.

      Page 4: The two sentences that begin with "The empirical probability that a person leaves their academic career" would benefit from an added citation.

      We have added a citation to the sentences.

      Figure 3: Which 10 academic domains are represented in Panel 3B? The colors appear to correspond to the legend in Panel 3A, but no indication of which fields are represented is provided. If possible, please do so - it would be interesting and informative to be able to make these comparisons.

      This was not clear in the initial version of Fig. 3B, so we now label each domain. For reference, the domains represented in 3B are (from top to bottom):

      ● Health

      ● Education

      ● Journalism, Media, Communication

      ● Humanities

      ● Social Sciences

      ● Public Administration and Policy

      ● Medicine

      ● Business

      ● Natural Sciences

      ● Mathematics and Computing

      ● Engineering

      Page 6: Consider citing relevant figure(s) earlier up in paragraph 2 of the discussion. For example, the first sentence could refer to Figure 1 (rather than waiting until the bottom of the paragraph to cite it).

      Thank you for this suggestion, we now cite Fig. 1 earlier in this discussion paragraph.

      Page 10: A minor comment on the fraction of women faculty in any given year-the authors assume that the proportion of women in a field can be calculated from knowing the number of women in a field and the number of men. This is, again, true if assuming binary genders but not true if additional gender diversity is included. It is likely that the number of nonbinary faculty is quite low, and as such would not cause a large change in the overall proportions calculated here, but additional context within the first paragraph of S1 might be helpful for readers.

      We have added additional context in the first paragraph of S1, explaining that an additional term could be added to the equation to account for nonbinary faculty representation if our data included nonbinary gender annotations. Thank you for making this point.
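
      The point about the extra term can be illustrated with a minimal sketch. It assumes the fraction of women is computed as women divided by all faculty, and that the additional term simply adds a nonbinary count to the denominator; the function name and the counts below are hypothetical and are not taken from the study's data.

```python
def fraction_women(women: int, men: int, nonbinary: int = 0) -> float:
    """Share of women among all faculty in a field.

    `nonbinary` is the hypothetical extra term; with the default of 0
    this reduces to the binary calculation women / (women + men).
    """
    total = women + men + nonbinary
    if total == 0:
        raise ValueError("field has no faculty")
    return women / total


# Illustrative counts only (assumed, not from the AARC dataset).
binary_only = fraction_women(women=300, men=700)
with_extra_term = fraction_women(women=300, men=700, nonbinary=5)

# Difference between the two calculations, in percentage points.
shift_pp = 100 * (binary_only - with_extra_term)
print(f"binary: {binary_only:.4f}, with extra term: {with_extra_term:.4f}, "
      f"shift: {shift_pp:.3f} pp")
```

      With a small nonbinary count relative to the field size, the shift stays well below one percentage point, which is consistent with the reviewer's expectation that the overall proportions would change little.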

      Page 10: Please include a range of values for the residual terms of the decomposition of hiring and attrition in the sentence that reads "In Figure S1 we show that the residual terms are small, and thus the decomposition is a good approximation of the total change in women's representation."

      These residual terms range from -0.51pp to 1.14pp (median = 0.2pp). We have added this information to the sentence in question.

      Page 12: It may be helpful to readers to include a description of the information contained in Table S2 in the supplemental text under section S3.

      We refer to table S2 twice in the main text (once in the observational findings, and once for the counterfactual analysis), and the contents of table S2 are described thoroughly in the table caption.

      Reviewer #3 (Recommendations For The Authors):

      (1) There is a potential limitation in the generalizability of the findings, as the study focuses exclusively on US academia. Including international perspectives could have provided a more global understanding of the issues at hand.

      The U.S. focus of this study limits the generalizability of our findings, as faculty outside the U.S. may exhibit differences in hiring patterns, retention patterns, and current demographic representations. We have added a discussion of this limitation to the manuscript. Unfortunately, our data do not support international analyses of hiring and attrition.

      (2) I am not sure that everyone who disappeared from the AARC dataset could be counted as "attrition" from academia. Indeed, some who disappeared might have completely left academia once they disappeared from the AARC dataset. Yet, there's also the possibility that some professors left for academic positions in countries outside of the US, or US institutions that are not included in the AARC dataset. These individuals didn't leave academia. Furthermore, it is also possible that moves to an institution outside of the US or not indexed by AARC are gender specific. Therefore, the analyses that this study conducts should find a way to test whether the assumption that anyone who disappeared from AARC counts as attrition is indeed valid. If not, how will this potentially challenge the current conclusions?

      The reviewer makes an important point: faculty who move to faculty positions in other countries and faculty who move to non-PhD granting institutions, or to institutions that are otherwise not included in the AARC data are all counted as attritions in our analysis. We intentionally define hiring and attrition broadly to include all cases in which faculty join or leave a field or domain within our dataset.

      The types of transitions that faculty make out of the tenure track system at PhD granting institutions in the U.S. may correlate with faculty attributes, like gender. For example, women or men may be more likely to transition to tenure track positions at non-U.S. institutions. Nevertheless, these types of career transition represent an attrition for the system of study, and a hire for another system. Following this same logic, faculty who transition from one field to another field in our analysis are treated as an attrition from the first field and a hire into the new field.

      By focusing on “all-cause” attrition in this way, we are able to make robust insights for the specific systems we consider (e.g., STEM and non-STEM faculty at U.S. PhD granting institutions), without being roadblocked by the task of annotating faculty departures and arbitrating which should constitute “valid” attritions.

      (3) It would be very interesting to know how much of the attrition was due to tenure failure. Previous studies have suggested that women are less likely to be granted tenure, which makes me wonder about the role that tenure plays in the gendered patterns of attrition in academia.

      We note that faculty attrition rates start low, reach a peak around 5-7 years after earning a PhD, and then decline until around 15-20 years post-PhD, after which attrition rates increase as faculty approach retirement. The first local maximum appears to coincide roughly with the tenure clock timing, but we can only speculate that these attritions are tenure related. Our dataset is unfortunately not equipped to determine the causal mechanisms driving attrition.

      We reproduce the attrition risk curve in the supplementary materials, Fig. S4:

      (4) The dataset used doesn't fully capture the complexities of academic environments, particularly smaller or less research-intensive institutions (regional universities, historically black colleges and universities, and minority-serving institutions). This could be potentially added to the manuscript for discussions.

      We have added this point to the description of this study’s limitations in the discussion.

    1. Reviewer #1 (Public Review):

      Summary:

      In "Changes in wing morphology..." Roy et al investigate the potential allometric scaling in wing morphology and wing kinematics in 8 different hoverfly species. Their study nicely combines different new and classic techniques, investigating flight in an important, yet understudied alternative pollinator. I want to emphasize that I have been asked to review this from a hoverfly biology perspective, as I do not work on flight kinematics. I will thus not review that part of the work.

      Strengths:

      The paper is well-written and the figures are well laid out. The methods, and the rationale and logic for each experiment, are easy to follow. The introduction sets the scene well, and the discussion is appropriate. The summary sentences throughout the text help the reader.

      Weaknesses:

      The ability to hover is described as useful for either feeding or mating. However, several of the North European species studied here would not use hovering for feeding, as they tend to land on the flowers that they feed from. I would therefore argue that the main selection pressure for hovering ability could be courtship and mating. If the authors disagree with this, they could back up their claims with the literature. On that note, a weakness of this paper is that the data for both sexes are merged. If we agree that hovering may be a sexually dimorphic behaviour, then merging flight dynamics from males and females could be an issue in the interpretation. I understand that separating males from females in the movies is difficult, but this could be addressed in the Discussion, to explain why you do not (or do) think that this could cause an issue in the interpretation.

      The flight arena is not very big. In my experience, it is very difficult to get hoverflies to fly properly in smaller spaces, and definitely almost impossible to get proper hovering. Do you have evidence that they were flying "normally" and not just bouncing between the walls? How long was each 'flight sequence'? You selected the parts with the slowest flight speed, presumably to get as close to hovering as possible, but how sure are you that this represented proper hovering and not a brief slowdown of thrust?

      Your 8 species are evolutionarily well-spaced, but as they were all selected from a similar habitat (your campus), their ecology is presumably very similar. Can this affect your interpretation of your data? I don't think all 6000 species of hoverflies could be said to have similar ecology - they live across too many different habitats. For example, on line 541 you say that wingbeat kinematics were stable across hoverfly species. Could this be caused by their similar habitat?

    1. Author response:

      Reviewer 1:

      Summary:

      In this manuscript by Bimbard et al., a new method to perform stable recordings over long periods of time with Neuropixels, as well as the technical details on how the electrodes can be explanted for follow-up reuse, is provided. I think the description of all parts of the method is very clear, and the validation analyses (n of units per day over time, RMS over recording days...) are very convincing. I however missed a stronger emphasis on why this could provide a big impact on the ephys community, by enabling new analyses, new behavior correlation studies, or neurophysiological mechanisms across temporal scales.

      Strengths:

      Open source method. Validation across laboratories. Across species (mice and rats) demonstration of its use and in different behavioral conditions (head-fixed and freely moving).

      Weaknesses:

      Weak emphasis on what can be enabled with this new method that didn't exist before.

      We thank the reviewer for highlighting the limited discussion around scientific impact. Our implant has several advantages which combine to make it much more accessible than previous solutions. This enables a variety of recording configurations that would not have been possible with previous designs, facilitating recordings from a wider range of brain regions, animals, and experimental setups. In short, there are three key advances:

      (1) Adaptability: The CAD files can be readily adapted to a wide range of configurations (implantation depth, angle, position of headstage, etc.). Labs have already modified the design to optimise for their needs, and re-shared with the community.

      (2) Weight: Because of the lightweight design, experimenters can i) perform complex and demanding freely moving tasks as we exemplify in the manuscript, and ii) implant female and water restricted mice while respecting animal welfare weight limitations.

      (3) Cost: At ~$10, our implant is significantly cheaper than published alternatives, which makes it affordable to more labs and means that testing modifications is cost-effective.

      We will make these features clearer in the manuscript.

      Reviewer 2:

      Summary:

      This work by Bimbard et al. introduces a new implant for Neuropixels probes. While Neuropixels probes have critically improved and extended our ability to record the activity of a large number of neurons with high temporal resolution, the use of these expensive devices in chronic experiments has so far been hampered by the difficulty of safely implanting them and, importantly, of explanting and reusing them after conclusion of the experiment. The authors present a newly designed two-part implant, consisting of a docking and a payload module, that allows for secure implantation and straightforward recovery of the probes. The implant is lightweight, making it amenable for use in mice and rats, and customizable. The authors provide schematics and files for printing of the implants, which can be easily modified and adapted to custom experiments by researchers with little to no design experience. Importantly, the authors demonstrate the successful use of this implant across multiple use cases, in head-fixed and freely moving experiments, in mice and rats, with different versions of Neuropixels probes, and across 8 different labs. Taken together, the presented implants promise to make chronic Neuropixels recordings and long-term studies of neuronal activity significantly easier and attainable for both current and future Neuropixels users.

      Strengths:

      - The implants have been successfully tested across 8 different laboratories, in mice and rats, in head-fixed and freely moving conditions, and have been adapted in multiple ways for a number of distinct experiments.

      - Implants are easily customizable and the authors provide a straightforward approach for customization across multiple design dimensions even for researchers not experienced in design.

      - The authors provide clear and straightforward descriptions of the construction, implantation, and explant of the described implants.

      - The split of the implant into a docking and payload module makes reuse even in different experiments (using different docking modules) easy.

      - The authors demonstrate that implants can be re-used multiple times and still allow for high-quality recordings.

      - The authors show that the chronic implantations allow for the tracking of individual neurons across days and weeks (using additional software tracking solutions), which is critical for a large number of experiments requiring the description of neuronal activity, e.g. throughout learning processes.

      - The authors show that implanted animals can even perform complex behavioral tasks, with no apparent reduction in their performance.

      Weaknesses:

      - While implanted animals can still perform complex behavioral tasks, the authors describe that the implants may reduce the animals' mobility, as measured by prolonged reaction times. However, the presented data does not allow us to judge whether this effect is specifically due to the presented implant or whether any implant or just tethering of the animals per se would have the same effects.

      The reviewer is correct: some of the differences in mouse reaction time could be due to the tether rather than the implant. As these experiments were also performed in water-restricted female mice with the heavier Neuropixels 1.0 implant, our data represent the maximal impact of the implant, and we will highlight this in the revision.

      - While the authors make certain comparisons to other, previously published approaches for chronic implantation and re-use of Neuropixels probes, it is hard to make conclusive comparisons and judge the advantages of the current implant. For example, while the authors emphasize that the lower weight of their implant allows them to perform recordings in mice (and is surely advantageous), the previously described, heavier implants they mention (Steinmetz et al., 2021; van Daal et al., 2021), have also been used in mice. Whether the weight difference makes a difference in practice therefore remains somewhat unclear.

      The reviewer is correct: without a direct comparison, we cannot be certain that our smaller, lighter implant improves behavioural results (although this is supported by the literature, e.g. Newman et al, 2023). However, the reduced weight of our implant is critical for several laboratories represented in this manuscript due to animal welfare requirements. Indeed, in van Daal et al. the authors “recommend a [mouse] weight of >25 g for implanting Neuropixels 1.0 probes.” This limit precludes using (the vast majority of) female mice, or water-restricted animals. Conversely, our implant can be routinely used with lighter, water-restricted male and female mice. We will emphasise this point in the revision.

      - The non-permanent integration of the headstages into the implant, while allowing for the use of the same headstage for multiple animals in parallel, requires repeated connections and does not provide strong protection for the implant. This may especially be an issue for the use in rats, requiring additional protective components as in the presented rat experiments.

      We apologise for not clarifying the various headstage options in the manuscript and we will address this in the revision. Our repository has headplate holder designs (in the XtraModifications/Mouse_FreelyMoving folder). These allow the headstage to be left on the implant, and thus minimise the number of connections (albeit increasing the weight for the mouse). Indeed, mice recorded while performing the task described in our manuscript had the headstage semi-permanently integrated into the implant, and we will highlight this in the revision.

      Reviewer 3:

      Summary:

      In this manuscript, Bimbard and colleagues describe a new implant apparatus called "Apollo Implant", which should facilitate recording in freely moving rodents (mice and rats) using Neuropixels probes. The authors collected data from both mice and rats and used 3 different versions of Neuropixels; multiple labs have already adopted this method, which is impressive. They openly share their CAD designs and surgery protocol to further facilitate the adaptation of their method.

      Strengths:

      Overall, the "Apollo Implant" is easy to use and adapt, as it has been used in other laboratories successfully and custom modifications are already available. The device is reproducible using common 3D printing services and can be easily modified thanks to its CAD design (the video explaining this is extremely helpful). The weight and price are amazing compared to other systems for rigid silicon probes allowing a wide range of use of the "Apollo Implant".

      Weaknesses:

      The "Apollo Implant" can only handle Neuropixels probes. It cannot hold other widely used and commercially available silicon probes. Certain angles and distances are not possible in their current form (distance between probes 1.8 to 4mm, implantation depth 2-6.5 mm, or angle of insertion up to 20 degrees).

      We appreciate the reviewer’s points, but as we will discuss in the revised manuscript, one implant accommodating the diversity of the existing probes is beyond the scope of this project. However, because the design is adaptable, groups should be able to modify the current version of the implant to adapt to their electrodes’ size and format (and can highlight any issues in the Github “Discussions” area).

      With Neuropixels, the current range of depths covers practically all trajectories in the mouse brain. In rats, where deeper penetrations may be useful, the experimenter can attach the probe at a lower point in the payload module to increase the length of exposed shank. We now specify this in the Github repository.

      We have now extended the range of inter-probe distances from a maximum of 4 mm to 6.5 mm, and this will be reflected in the revised manuscript. Distances beyond this may be better served by 2 implants, and smaller distances could be achieved by attaching two probes on the same side of the docking module. In the next revision, we will add these points to the discussion.

    1. Reviewer #1 (Public Review):

      Summary:

      This paper presents a mechanistic study of rDNA origin regulation in yeast by SIR2. Each of the ~180 tandemly repeated rDNA gene copies contains a potential replication origin. Early-efficient initiation of these origins is suppressed by Sir2, reducing competition with origins distributed throughout the genome for rate-limiting initiation factors. Previous studies by these authors showed that SIR2 deletion advances replication timing of rDNA origins by a complex mechanism of transcriptional de-repression of a local PolII promoter causing licensed origin proteins (MCM complexes) to re-localize (slide along the DNA) to a different (and altered) chromatin environment. In this study, they identify a chromatin remodeler, FUN30, that suppresses the sir2∆ effect, and remarkably, results in a contraction of the rDNA to about one-quarter its normal length/number of repeats, implicating replication defects of the rDNA. Through examination of replication timing, MCM occupancy and nucleosome occupancy on the chromatin in sir2, fun30, and double mutants, they propose a model where nucleosome position relative to the licensed origin (MCM complexes) intrinsically determines origin timing/efficiency. While their interpretations of the data are largely reasonable and can be interpreted to support their model, a key weakness is the connection between Mcm ChEC signal disappearance and origin firing. While the cyclical chromatin association-dissociation of MCM proteins with potential origin sequences may be generally interpreted as licensing followed by firing, dissociation may also result from passive replication and as shown here, displacement by transcription and/or chromatin remodeling. Moreover, linking its disappearance from chromatin in the ChEC method with such precise resolution needs to be validated against an independent method to determine the initiation site(s). Differences in rDNA copy number and relative transcription levels also are not directly accounted for, obscuring a clearer interpretation of the results. Nevertheless, this paper makes a valuable advance with the finding of Fun30 involvement, which substantially reduces rDNA repeat number in sir2∆ background. The model they develop is compelling and I am inclined to agree, but I think the evidence on this specific point is purely correlative and a better method is needed to address the initiation site question. The authors deserve credit for their efforts to elucidate our obscure understanding of the intricacies of chromatin regulation. At a minimum, I suggest their conclusions on these points of concern should be softened and caveats discussed. Statistical analysis is lacking for some claims.

      Strengths are the identification of FUN30 as suppressor, examination of specific mutants of FUN30 to distinguish likely functional involvement. Use of multiple methods to analyze replication and protein occupancies on chromatin. Development of a coherent model.

      Weaknesses are failure to address copy number as a variable; insufficient validation of ChEC method relationship to exact initiation locus; lack of statistical analysis in some cases.

      Additional background and discussion for public review:

      This paper broadly addresses the mechanism(s) that regulate replication origin firing in different chromatin contexts. The rDNA origin is present in each of ~180 tandem repeats of the rDNA sequence, representing a high potential origin density per length of DNA (9.1kb repeat unit). However, the average origin efficiency of rDNA origins is relatively low (~20% in wild-type cells), which reduces the replication load on the overall genome by reducing competition with origins throughout the genome for limiting replication initiation factors. Deletion of histone deacetylase SIR2, which silences PolII transcription within the rDNA, results in increased early activation of the rDNA origins (and reduced rate of overall genome replication). Previous work by the authors showed that MCM complexes loaded onto the rDNA origins (origin licensing) were laterally displaced (sliding) along the rDNA, away from a well-positioned nucleosome on one side. The authors' major hypothesis throughout this work is that the new MCM location(s) are intrinsically more efficient configurations for origin firing. The authors identify a chromatin remodeling enzyme, FUN30, whose deletion appears to suppress the earlier activation of rDNA origins in sir2∆ cells. Indeed, it appears that the reduction of rDNA origin activity in sir2∆ fun30∆ cells is severe enough to result in a substantial reduction in the rDNA array repeat length (number of repeats); the reduced rDNA length presumably facilitates its more stable replication and maintenance.

      Analysis of replication by 2D gels is marginally convincing; using 2D gels for this purpose is very challenging and tricky to quantify. The more quantitative analysis by EdU incorporation is more convincing of the suppression of the earlier replication caused by SIR2 deletion.

      To address the mechanism of suppression, they analyze MCM positioning using ChEC, which in G1 cells shows partial displacement of MCM from normal position A to positions B and C in sir2∆ cells and similar but more complete displacement away from A to positions B and C in sir2∆ fun30∆ cells. During S-phase in the presence of hydroxyurea, which slows replication progression considerably (and blocks later origin firing), MCM signals redistribute, which is interpreted to represent origin firing and bidirectional movement of MCMs (only one direction is shown), some of which accumulate near the replication fork barrier, consistent with their interpretation. They observe that MCMs displaced (in G1) to sites B or C in sir2∆ cells disappear more rapidly during S-phase, whereas a similar dynamic is not observed in sir2∆ fun30∆. This is the main basis for their conclusion that the B and C sites are more permissive than A. While this may be the simplest interpretation, there are limitations with this assay that undermine a rigorous conclusion (additional points below). The main problem is that we know the MCM complexes are mobile, so disappearance may reflect displacement by other means, including transcription, which is high in the sir2∆ background. Indeed, the double mutant has a greater level of transcription per repeat unit, which might explain more displacement from A in G1. Thus, displacement might not always represent origin firing. Because the sir2 background profoundly changes transcription, and the double mutant has a much smaller array length associated with higher transcription, how can we rule out greater accessibility at site A, for example in sir2∆, leading to more firing, which is suppressed in sir2 fun30 due to greater MCM displacement away from A?

      I think the critical missing data to solidly support their conclusions is a definitive determination of the site(s) of initiation using a more direct method, such as strand specific sequencing of EdU or nascent strand analysis. More direct comparisons of the strains with lower copy number are also needed to rule out this facet. As discussed in detail below, copy number reduction is known to suppress at least part of the sir2∆ effect, so this looms over the interpretations. I think they are probably correct in their overall model based on the simplest interpretation of the data, but I think it remains to be rigorously established. I think they should soften their conclusions in this respect.

    1. On BBT, all traditional and metacognitive accounts of the human are the product of extreme informatic poverty. Ironically enough, many have sought intentional asylum within that poverty in the form of apriori or pragmatic formalisms, confusing the lack of information for the lack of substantial commitment, and thus for immunity against whatever the sciences of the brain may have to say. But this just amounts to a different way of taking refuge in obscurity. What are ‘rules’? What are ‘inferences’? Unable to imagine how science could answer these questions, they presume either that science will never be able to answer them, or that it will answer them in a manner friendly to their metacognitive intuitions. Taking the history of science as its cue, BBT entertains no such hopes. It sees these arguments for what they happen to be: attempts to secure the sufficiency of low-dimensional, metacognitive information, to find gospel in a peephole glimpse.

      This describes the approach of Sellars, Brandom, and Brassier, all of whom Bakker has criticized in the blog.

      They admit that science has priority in the scientific realm, but hold that what we think we are is not something that can be true or false; it is a matter of games, rules, things we play, a game of "pretend as if we are persons".

      This is a much better position. It does not attempt to tell science that science is a building founded upon the ground of philosophy (unlike Kant, or Heidegger), and does not try to make scientifically testable predictions and get embarrassed in the process (unlike those who sought to study the "quantum of consciousness" because they thought free will is real, thus something quantum-mechanical must be true of the brain, or that philosopher who argued that Anton syndrome is impossible because it is philosophically impossible, or those psychoanalysts that try to interpret Cotard's syndrome as some manifestation of childhood trauma).

      The problem with this position is as follows:

      1. Is science really based on a game of giving and taking reasons? If not, then there's no guarantee that science would protect the game of "let's pretend we are persons who make decisions, have plans, hope for love, etc". The juggernaut of science may eventually crush the "manifest image of man" under its wheels, migrate to a society of unconscious biorobots, and run even faster as a result!

      2. Philosophers are unable to figure out what rules, games, normativity, etc, are! They can't agree, after centuries of disputation. Any working consensus will have to come from science, and what if science finally shows that rules and games are nothing like what Sellars, Brandom, etc, thought they are? If not, then not only is the manifest image not the scientific image, not only is it unnecessary for working scientists, it is even not what the philosophers say it is. It is as if the philosophers have been stuck in Plato's Cave, mistaking the shadow-play for optical-science.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      …I find the concept and execution of the study very interesting and elegant. The paper is also commendably clear and readable. The differences between primary and higher cortex are compelling and I am largely convinced by the authors' claim that they have found evidence that broadly supports a mixed selectivity model of neural disentanglement along the lines of Rigotti et al (2013). I think that the increasing body of evidence for these kinds of representations is a significant development in our understanding of higher sensory representations. I also think that the dDR method is likely to be useful to researchers in a variety of fields who are looking to perform similar types of neural decoding analysis.

      Thanks! We agree that questions around population coding and high-level representations are critical in the field of sensory systems.

      Reviewer #2 (Public Review):

      ... This is a well-carried out study with thoughtful analyses which in large part achieves its aims to evaluate how task-engagement changes neural activity across multiple auditory regions. As with all work, there are several caveats or areas for future study/analysis. First, the sounds used here (tones, and narrow-band noise) are relatively simple sounds; previous work suggests that exactly what activity is observed within each region (e.g., sensory only, decision-related, etc) may depend in part upon what stimuli are used. Therefore, while the current study adds importantly to the literature, future work may consider the use of more varied stimuli. Second, the animals here were engaged in a behavioral task; but apart from an initial calculation of behavioral d', the task performance (and its effect on neural activity) is largely unaddressed.

      The reviewer makes several important points that we hope we addressed in the specific changes detailed below. Indeed, it is important to recognize the possibility that the specific stimuli involved in a task may interact with the effects of behavioral state and that variability in task performance should be considered as an important aspect of behavioral state.

      Reviewer #1 (Recommendations For The Authors):

      I have a few minor comments and criticisms:

      (1) Figure 1c. The choice of low-contrast grey text (e.g. "Target vs. target") is unfortunate, especially when printed, and should be replaced (e.g. with dark grey).

      We have edited the figure to use a higher contrast (dark grey). Thanks for catching this.

      (2) Figure 2 and Supplementary Figure 3. I think some indication of error or significance is required in all panels. Without this, it's hard to interpret any of these panels.

      Thank you for this feedback. Including significance here was clarifying and helps to strengthen our claim that state-dependent changes in neural activity were smaller and more diverse for single neurons than at the population level. We modified Figure 2b-c to indicate whether each neuron’s response to the target stimulus was significantly different than its response to the catch stimulus. The same test was performed in Supplementary Figure 3. Additionally, we added a statistical test in Figure 2d-e to indicate, for each pair of target/catch stimuli, whether discrimination (d-prime) changed significantly between active and passive conditions. Furthermore, we modified the text of the second paragraph under the results heading: “Diverse effects of task engagement on single neurons in primary and non-primary auditory cortex” to reference and interpret the results of these significance tests. The new text reads as follows (L. 121):

      “Sound-evoked spiking activity was compared between active and passive states to study the impact of task engagement on sound representation. In both A1 and dPEG, responses to target and catch stimuli were significantly discriminable for a subset of single neurons (about 25% in both areas, Figure 2A-C, Supplemental Figures 3-5, bootstrap test). This supports the idea that stimulus identity can be decoded in both brain regions, regardless of task performance. However, the fact that the responses of most neurons in both brain areas could not significantly discriminate target vs. catch stimuli also highlights the diversity of sound encoding observed at the level of single neurons. The accuracy of catch vs. target discrimination for each neuron was quantified using neural d-prime, the z-scored difference in target minus catch spiking response for each neuron (Methods: Single neuron PSTHs and d-prime (Niwa et al., 2012a)). Task engagement was associated with significant changes in catch vs. target d-prime for roughly 10% of neurons in both A1 (40 / 481 neurons, bootstrap test) and dPEG (33 / 377 neurons, bootstrap test). This included neurons that both increased their discriminability and decreased their discriminability (Figure 2D-E). Thus, the effects of task engagement at the level of single neurons were relatively mild and inconsistent across the population; many neurons showed no significant change and of those that did, effects were bidirectional (Figure 2D-E).”

      We also included an additional methods paragraph in the “Statistical tests” section to describe the bootstrapping procedure used for these significance tests (L. 644):

      “The one exception to this general approach is in Figure 2, where we analyzed the sound discrimination abilities of single neurons. In this case, we computed p-values for each neuron and stimulus independently. First, for each neuron and catch vs. target stimulus pair, we measured d-prime (see Methods: Single neuron evoked activity and d-prime). We generated a null distribution of d-prime values for each neuron-stimulus pair, under each experimental condition by shuffling stimulus identity across trials before computing d-prime (100 resamples). A neuron was determined to have a significant d-prime for a given target vs. catch pair if its actual measured d-prime was greater than the 95th percentile of the null d-prime distribution. Second, for each neuron and catch vs. target stimulus pair, we tested if d-prime was significantly different between active and passive conditions. To test this, we followed a similar procedure as above, however, rather than shuffle stimulus identity, we shuffled active vs. passive trial labels. This allowed us to generate a null distribution of active vs. passive d-prime difference for each neuron and stimulus pair. A neuron was determined to have a significant change in d-prime between conditions if the actual Δ d-prime lay outside the 95% confidence interval of the null Δ d-prime distribution.”
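
      The two shuffle tests described above can be sketched as follows. This is a minimal illustration, not the authors' code: the d-prime formula (mean difference over pooled standard deviation) and the function names `dprime`, `stimulus_shuffle_test`, and `state_shuffle_test` are assumptions made for the example, and inputs are taken to be per-trial spike counts for one neuron.

```python
import numpy as np

def dprime(target, catch):
    """Assumed d-prime: target-minus-catch mean response over the pooled SD."""
    pooled_sd = np.sqrt(0.5 * (np.var(target, ddof=1) + np.var(catch, ddof=1)))
    return (np.mean(target) - np.mean(catch)) / pooled_sd

def stimulus_shuffle_test(target, catch, n_resamples=100, seed=0):
    """Test 1: shuffle stimulus identity across trials to build the null."""
    rng = np.random.default_rng(seed)
    observed = dprime(target, catch)
    pooled = np.concatenate([target, catch])
    null = np.empty(n_resamples)
    for i in range(n_resamples):
        perm = rng.permutation(pooled)
        null[i] = dprime(perm[:len(target)], perm[len(target):])
    # Significant if the measured d-prime exceeds the 95th percentile of the null.
    threshold = np.percentile(null, 95)
    return observed, threshold, bool(observed > threshold)

def state_shuffle_test(t_act, c_act, t_pas, c_pas, n_resamples=100, seed=0):
    """Test 2: shuffle active vs. passive labels, keeping stimulus identity."""
    rng = np.random.default_rng(seed)
    observed = dprime(t_act, c_act) - dprime(t_pas, c_pas)
    t_all = np.concatenate([t_act, t_pas])
    c_all = np.concatenate([c_act, c_pas])
    null = np.empty(n_resamples)
    for i in range(n_resamples):
        t = rng.permutation(t_all)
        c = rng.permutation(c_all)
        null[i] = (dprime(t[:len(t_act)], c[:len(c_act)])
                   - dprime(t[len(t_act):], c[len(c_act):]))
    # Significant if the measured change lies outside the central 95% of the null.
    lo, hi = np.percentile(null, [2.5, 97.5])
    return observed, (lo, hi), bool(observed < lo or observed > hi)
```

      The first test is one-sided and the second two-sided, following the wording of the methods text. With only 100 resamples, as stated there, the percentile thresholds are coarse estimates.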

      For Figure 2a, we chose not to indicate significance on the figure to avoid clutter, since the significance for all neurons in the population is shown in panels b-c anyway. Additionally, the difference plot shown in panel a is in units of z-scores, which we believe already gives a raw sense of the significance of the target vs. catch response change per neuron in this example dataset.

      (3) Figure 2 and Supplementary Figure 3. I would consider including some more examples as a Supplementary Figure (and perhaps combining Supp Fig 3 with Fig 2 as a main figure).

      We found no significant or apparent difference in single-neuron properties between A1 and dPEG. Therefore, we decided it is not helpful to plot both A1 and PEG examples in the main text. However, we agree that the ability to see more examples of the raw data could be useful. Therefore, we compiled two supplementary figures (Supplementary Figures 4 and 5) that replicate Figure 2a for all datasets, encompassing A1 and PEG.

      (4) Figure 2a and Supp Fig 3a. I was initially confused that the "delta-spk/sec (z-score)" values had themselves been z-scored, but now I think that they are simply the differences of the two left hand sub-panels. This could be made clear in the figure legend.

      The figure legends have been modified to state the procedure for computing “delta-spk/sec” more clearly. Specifically, we added the following information to the legend (L. 141):

      “Difference is computed as the z-scored response to the target minus the z-scored catch response (resulting in a difference shown in units of z-score).”

      (5) Figure 2b-e and Supp Fig 3b-e. Indicate the time window over which the responses were measured, and the number of neurons.

      Figure legends have been modified to include a sentence clearly stating the time window over which responses were measured. The number of neurons is also now included in the legend and on the figure itself. Furthermore, a brief description of the new statistical testing procedure has been added here (L. 144).

      “Responses were defined as the total number of spikes recorded during the 300 ms of sound presentation (area between dashed lines in panel A). Neurons with a significantly different response to the catch vs. target stimulus are indicated in black and quantified on the respective figure panel.”

      (6) Figure 2. "singe" should read "single"

      Typo in figure label has been fixed.

      (7) Line 144. Figure number is missing (Figure 3B-C).

      The missing figure number has been added to the text.

      (8) Figure 3. Again, the low-contrast grey should be replaced.

      The low-contrast grey has been replaced with dark grey.

      Reviewer #2 (Recommendations For The Authors):

      This study really nicely compares the activity and effects on activity in two areas of the auditory cortex in respect to task-engagement; I think it is, for the most part, very well done.

      A couple of specific recommendations:

      (1) Although I understand 'inf dB' as the SNR, including the actual dB level used in the experiments, would be useful, especially in the case of the inf dB.

      Thank you for this feedback. We agree that clarification about the overall sound level used here would be helpful. We have modified the methods section “Behavioral paradigm” to include the following sentence (L. 450):

      “That is, the masking noise (and distractor stimuli) were always presented with an overall sound level of 60 dB SPL. Infinite (inf) dB trials corresponded to trials where the target tone was presented at 60 dB SPL without any masking noise present, 0 dB to trials where the target was 60 dB SPL, -5 dB to trials where the target was presented at 55 dB SPL etc.”

      In addition, we have modified the main text (L. 82):

      “Animals reported the occurrence of a target tone in a sequence of narrowband noise distractors by licking a piezo spout (Figure 1A, Methods: Behavioral paradigm, distractor stimulus sound level: 60 dB SPL). … We describe SNR as the overall SPL of the target relative to distractor noise level. Thus, an SNR of –5 dB corresponds to a target level of 55 dB SPL while an Inf dB SNR corresponds to a target tone presented without any masking noise.”

      And Figure legend 1 now explicitly states the sound level used in the experiments (L. 104):

      “Variable SNR was achieved by varying overall SPL of the target relative to the fixed (60 dB SPL) distractor noise, e.g., -5 dB SNR corresponds to a 55 dB SPL target with 60 dB SPL masking noise. Infinite (inf) dB SNR corresponds to a target tone presented in isolation (60 dB SPL).”
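
      The SNR convention described above amounts to a simple mapping, sketched here for illustration. The function name `target_level` and the use of `math.inf` to encode the "inf dB" condition are assumptions of this example; only the 60 dB SPL distractor level and the worked cases come from the text.

```python
import math

DISTRACTOR_SPL_DB = 60.0  # fixed overall level of the masking noise / distractors

def target_level(snr_db):
    """Return (target level in dB SPL, whether masking noise is present)."""
    if math.isinf(snr_db):
        # "inf dB": target presented in isolation at 60 dB SPL, no masking noise.
        return DISTRACTOR_SPL_DB, False
    # Finite SNR: target level is set relative to the fixed 60 dB SPL noise.
    return DISTRACTOR_SPL_DB + snr_db, True
```

      For example, `target_level(-5)` gives a 55 dB SPL target with noise present, matching the case given in the legend.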

      (2) I very much appreciate the attempt to disentangle task engagement from generalized arousal state, and specifically, addressing this through the use of pupillometry. However, by focusing the discussion of pupil dynamics solely on the arousal-state aspects of pupil size, the paper doesn't address the increasing evidence suggesting that pupil size may fluctuate based upon a lot of other things, including perceptual events (see Kronemer et al, 2022 for a recent human paper; for auditory: Zekveld et al 2018 (review) and Montes-Lourido et al, 2021; but many many others, too). It would be nice to see either a bit more nuanced discussion of what pupil size may be indicating (easier), or analyzing the behavior in the context of pupil dynamics (a heavier lift).

      This is a good point. We agree that it is worth mentioning these more nuanced aspects of cognition that may be reflected by pupil size. Therefore, we also analyzed pupil size in the context of behavioral performance (see Supplemental Figure 6) and added the following text to the results (L. 193).

      “In addition to reflecting overall arousal level, pupil size has also been reported to reflect more nuanced cognitive variables such as, for example, listening effort (Zekveld et al., 2014). Furthermore, rodent data suggests that optimal sensory detection is associated with intermediate pupil size (McGinley et al., 2015), consistent with the hypothesis of an inverted-U relationship between arousal and behavioral performance (Zekveld et al., 2014). To determine if this pattern was true for the animals in our task, we measured the dynamics of pupil size in the context of behavioral performance. Across animals, task stimuli evoked robust pupil dilation that varied with trial outcome (Supplemental Figure 6b-c). Notably, pre-trial pupil size was significantly different between correct (hit and correct reject), hit, and miss trials (Supplemental Figure 6b-c), recapitulating the finding of an inverted-U relationship to performance in rodents (McGinley et al., 2015).  Since we focused only on correct trials in our decoding analysis, these outcome-dependent differences in pupil size are unlikely to contribute to the emergent decoding selectivity in dPEG.”

      (3) I think it would make this paper shine that much more if behavioral performance were not subsumed into the overall label of task engagement. You've already established you have performance that varies as a function of SNR; I would love to see the neural d' and covariability related to the behavioral d' (in the comparisons where this is possible). I would also love to see a more direct measure of choice for those stimuli that show variable behavior (e.g., a choice probability analysis or something of the like would seem to be easily applied to the target SNRs of -5 and 0 dB); and compare task engaged activity of hits vs misses vs passive listening to those same stimuli. You discuss previous studies looking at choice-related/decision-related activity and draw parallels to this work. Given that there is the opportunity with this data set to *directly* assess choice-related activity, the absence of such an analysis seems like a missed opportunity.

      Thank you for this feedback. We agree that “task engagement” is not a unimodal state and that a more fine-grained analysis of task-engaged neural activity, according to behavioral choice, could be informative.

      First, we would like to point out that in Figure 4 we did already compare behavioral d’ to delta neural d’. We found that the two were significantly correlated in dPEG, but not in A1. This suggests that task-dependent changes in stimulus decoding in dPEG, but not A1, are predictive of behavioral performance. This is consistent with the finding that task-relevant stimulus representations were selectively enhanced in dPEG, but not in A1.

      Second, we added a choice decoding analysis to address whether auditory cortex represents the animal’s choice in our task. The results of this analysis are summarized in Supplemental Figure 8 and are discussed under the results section: “Behavioral performance is correlated with neural coding changes in non-primary auditory cortex only.” (L. 226):

      “The previous analysis suggests that the task-dependent increase in stimulus information present in dPEG population activity is predictive of overall task performance. Next, we asked whether the population activity in either brain region was directly predictive of behavioral choice on single hit vs. miss trials. To do this, we conducted a choice probability analysis (Methods). We found that in both brain regions choice could be decoded well above chance level (Supplemental Figure 8). Choice information was present throughout the entire trial and did not increase during the target stimulus presentation. This suggests that the difference in population activity primarily reflects a cognitive state associated with the probability of licking on a given trial, or “impulsivity” rather than “choice.” This interpretation is consistent with our finding that baseline pupil size on each trial is predictive of trial outcome (Supplemental Figure 6b).”

      To keep our decoding approach consistent throughout the manuscript, we followed the same approach for choice decoding as we did for stimulus decoding (perform dDR then calculate neural d-prime in the dimensionality reduced space). To make the results more interpretable, we converted choice d-prime to a choice probability (percent correctly decoded choices) using leave-one-out cross validation. (We note that d-prime and percent correct are very highly correlated statistics.) This is described in the methods as follows (L. 550):

      “We performed a choice decoding analysis on hit vs. miss trials. We followed the same procedure as described above for stimulus decoding, where instead of a pair of stimuli our two classes to be decoded were “hit trial” vs. “miss trial”. That is, for each target stimulus we computed the optimal linear discrimination axis separating hit vs. miss trials (Abbott and Dayan, 1999) in the reduced dimensionality space identified with dDR (Heller and David, 2022). For the sake of interpretability with respect to previous work we reported choice probability as the percentage of correctly decoded trial outcomes rather than d-prime. Percent correct was calculated by projecting the population activity onto the optimal discrimination axis and using leave-one-out cross validation to measure the number of correct classifications.”
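
      The leave-one-out choice decoding described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the input `X` is assumed to be trials by dimensions and already projected into the reduced (dDR) space, which is not reproduced here; the axis is taken as the inverse pooled covariance times the difference of class means; and the decision criterion is placed midway between the projected class means. The function names are hypothetical.

```python
import numpy as np

def discrimination_axis(hit, miss):
    """Linear discrimination axis: inverse pooled covariance times mean difference."""
    delta_mu = hit.mean(axis=0) - miss.mean(axis=0)
    cov = 0.5 * (np.cov(hit, rowvar=False) + np.cov(miss, rowvar=False))
    w = np.linalg.solve(cov, delta_mu)
    return w / np.linalg.norm(w)

def loo_choice_probability(X, is_hit):
    """Percent of trials whose outcome (hit vs. miss) is decoded correctly.

    X: (n_trials, n_dims) activity in the reduced space (assumed precomputed).
    is_hit: boolean array, True for hit trials and False for miss trials.
    """
    n_correct = 0
    for i in range(len(X)):
        train = np.ones(len(X), dtype=bool)
        train[i] = False  # hold out trial i
        hit = X[train & is_hit]
        miss = X[train & ~is_hit]
        w = discrimination_axis(hit, miss)
        # Assumed criterion: midpoint between the projected class means.
        criterion = 0.5 * (hit.mean(axis=0) @ w + miss.mean(axis=0) @ w)
        predicted_hit = X[i] @ w > criterion
        n_correct += int(predicted_hit == is_hit[i])
    return 100.0 * n_correct / len(X)
```

      Refitting the axis with the test trial held out keeps the percent-correct estimate from being inflated by fitting and evaluating on the same trial.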

      (4) It would also be interesting to look at population coding across sessions (although the point is taken that within a session allows the opportunity to assess covariability). Minorly self-servingly but very much related to the above point, Christison-Lagay et al, 2017 employed a similar detect-in-noise task, analyzed single neurons and population level activity, and looked at putative choice-related activity. The current study has the opportunity to expand on that kind of analysis that much more by looking across multiple sites vs within a given recording site; and compare across regions.

      Thank you for highlighting this point; we agree that it is important. When studying population coding it is critical to consider the impact of covariability between neurons. Therefore, it is worthwhile to revisit our interpretations of prior results, e.g., Christison-Lagay et al, 2017, which studied population coding by combining neurons across different sessions, given that we now have access to simultaneously recorded population data.

      First, we would like to point out that this was the primary motivation for our simulation analyses presented in Figure 5. Using simulations, we found that task-dependent gain modulation (which can be observed across sessions) was sufficient to explain our primary finding – selective enhancement in decoding of behaviorally relevant sound stimuli in dPEG.

      Second, to address the question about how covariability affects choice-related information in auditory cortex and compare our findings with prior studies, we performed the same set of simulations for choice probability analysis. We found that, again, choice-dependent gain modulation was sufficient to explain our findings. That is, simulations with hit- vs. miss-dependent gain changes, but fixed covariability, closely mirrored the choice probability we observed in the raw data. An additional simulation where covariability between all neurons was set to zero also recapitulated our findings in the raw data. Collectively, this suggests that covariability does not play a significant role in shaping the choice information present in A1 and dPEG during this task. We have added the following text to the manuscript to summarize this finding (L. 293):

      “Finally, we used the same simulation approach to determine what aspects of population activity carry the “choice” related information we observed in A1 and dPEG (Figure 4 – figure supplement 1). Similar to our findings for stimulus decoding, we found that gain modulation alone was sufficient to recapitulate the choice information present in the raw data for this task. This helps frame prior work that pooled neurons across sessions to study population coding of choice in similar auditory discrimination tasks (Christison-Lagay et al, 2017).”

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This manuscript presents a solid and generally convincing set of experiments to address the question of whether the lateral parafacial area (pFL) is active in controlling active expiration, which is particularly important in patient populations that rely on active exhalation to maintain breathing (eg, COPD, ALS, muscular dystrophy). This study presents a valuable finding by pharmacologically mapping the core medullary region that contributes to active expiration and addresses the question of where these regions lie anatomically. Results from these experiments will be of value to those interested in the neural control of breathing and other neuroscientists as a framework for how to perform pharmacological mapping experiments in the future.

      Thanks for the positive feedback on our study, as well as the assessment of the novelty of our investigation and the advancements to the field that these results will bring in the future.

      We have addressed the specific comments and made changes to the manuscript as indicated below.

      Public Reviews:

      Reviewer #1 (Public Review):

      The main focus of the current study is to identify the anatomical core of an expiratory oscillator in the medulla using pharmacological disinhibition. Although expiration is passive in normal eupneic conditions, activation of the parafacial (pFL) region is believed to evoke active expiration in conditions of elevated ventilatory demands. The authors and others in the field have previously attempted to map this region using pharmacological, optogenetic, and chemogenetic approaches, which present their own challenges.

      In the present study, the authors take a systematic approach to determine the precise anatomical location within the ventral medulla's rostrocaudal axis where the expiratory oscillator is located. The authors used a bicuculline (a GABA-A receptor antagonist) and fluorobeads solution at 5 distinct anatomical locations to study the effects on neuronal excitability and functional circuitry in the pFL. The effects of bicuculline on different phases of the respiratory cycle were characterized using a multidimensional cycle-by-cycle analysis. This analysis involved measuring the differences in airflow, diaphragm electromyography (EMG), and abdominal EMG signals, as well as using a phase-plane analysis to analyze the combined differences of these respiratory signals. Anatomical immunostaining techniques were also used to complement the functional mapping of the pFL.

      Major strengths of this work include a robust study design, complementary neurophysiological and immunohistochemical methods, and the use of a novel phase-plane analysis. The authors construct a comprehensive functional map revealing functional nuances in respiratory responses to bicuculline along the rostrocaudal axis of the parafacial region. They convincingly show that although bicuculline injections at all coordinates of the pFL generated an expiratory response, the most rostral locations in the lateral parafacial region play the strongest role in generating active expiration. These were characterized by a strong impact on the duration and strength of ABD activation and a robust change in tidal volume and minute ventilation. The authors also confirmed histologically that none of the injection sites overlapped grossly with PHOX2B+ neurons, thus confirming the specificity of the injections in the pFL and not the neighboring RTN.

      Collectively, these findings advance our understanding of the presumed expiratory oscillator, the pFL, and highlight the functional heterogeneity in the functional response of this anatomical structure.

      Thanks for the positive feedback on the results presented in the current manuscript.

      Reviewer #2 (Public Review):

      Summary:

      Pisanski and colleagues map regions of the brainstem that produce the rhythm for active expiratory breathing movements and influence their motor patterns. While the neural origins of inspiration are very well understood, the neural bases for expiration lag considerably. The problem is important and new knowledge pertaining to the neural origins of expiration is welcome.

      The authors perturb the parafacial lateral (pFL) respiratory group of the brainstem with microinjection of bicuculline, to elucidate how disinhibition in specific locations of the pFL influences active expiration (and breathing in general) in anesthetized rats. They provide valuable, if not definitive, evidence that the borders of the pFL appear to extend more rostrally than previously appreciated. Prior research suggests that the expiratory pFL exists at the caudal pole of the facial cranial nucleus (VIIc). Here, the authors show that its borders probably extend as much as 1 mm rostral to VIIc. The evidence is convincing albeit with caveats.

      Strengths:

      The authors achieve their aim in terms of showing that the borders of the expiratory pFL are not well understood at present and that it (the pFL) extends more rostrally. The results support that point. The data are strong enough to cause many respiratory neurobiologists to look at the sites rostral to the VIIc for expiratory rhythmogenic neurons and characterize their properties and mechanisms. At present my view is that most respiratory neurobiologists overlook the regions rostral to VIIc in their studies of expiratory rhythm and pattern.

      Weaknesses:

      The injection of bicuculline has indiscriminate effects on excitatory and inhibitory neurons, and the parafacial region is populated by excitatory neurons that are expiratory rhythmogenic and GABA and glycinergic neurons whose roles in producing active expiration are contradictory (Flor et al. J Physiol, 2020, DOI: 10.1113/JP280243). It remains unclear how the microinjections of bicuculline differentially affect all three populations. A more selective approach would be able to disinhibit the populations separately. Nevertheless, for the main point at hand, the data do suggest that we should reconsider the borders of the expiratory pFL nucleus and begin to examine its physiology up to 1 mm rostral to VIIc.

      The control experiment showed that bicuculline microinjections induced cFos expression in the pFL, which is good, but again we don't know which neurons were disinhibited: glutamatergic, GABAergic, or glycinergic.

      Thanks for sharing your excitement about the results of our study and for appreciating the thorough investigation performed with bicuculline, an approach that was originally used in Pagliardini et al. (2011, PMID: 21414911) and then adopted by many other groups to generate and study active expiration in vivo.

      In the current study we used the well-known effect of bicuculline to systematically identify the area that is most sensitive to this pharmacological manipulation and hence may be the core for generating active expiration. While GABA receptor antagonists may act indiscriminately on GABA receptor-expressing neurons of various phenotypes, anatomical assessment of inhibitory cells has shown that very few GABAergic and glycinergic cells are present in the parafacial area (Tanaka et al., 2003; PMID: 14512139), and it has been inferred in multiple publications (Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al., 2016, PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151; Flor et al., 2020, PMID: 32621515; Britto & Moraes, 2017; PMID: 28004411; Silva et al., 2016; PMID: 26900003) and demonstrated recently (Magalhaes et al., 2021; PMID: 34510468) that late-E neurons in the parafacial region are excitatory and have a glutamatergic phenotype. We cannot exclude that a small fraction of neurons in the pFL area are inhibitory and that they could influence recruitment of adjacent late-E expiratory neurons. A more selective activation of neuronal populations with different phenotypes would indeed be interesting. Nonetheless, if local inhibitory neurons have a role in the generation of active expiration, then their disinhibition could either have an inhibitory effect on late-E activity or stimulate expiration in a more indirect fashion.

      While the effect of bicuculline on active expiration has been reported and replicated in multiple manuscripts, the source of inhibition across different phases of the respiratory cycle is still under investigation. Some studies suggest that GABAergic and glycinergic inhibition does not originate in pFL but rather in the BötC and preBötC areas (Flor et al., 2020, PMID: 32621515; Magalhaes et al., 2021; PMID: 34510468), and the effects of this inhibition across the respiratory cycle are debated. Future studies will be key to identify the source of pFL inhibition.

      The manuscript characterizes how bicuculline microinjections affect breathing parameters such as tidal volume, frequency, ventilation, inspiratory and expiratory time, as well as oxygen consumption. Those aspects of the manuscript are a bit tedious and sometimes overanalyzed. Plus, there was no predictive framework established at the outset for how one should expect disinhibition to affect breathing parameters. In other words, if the authors are seeking to map the pFL borders, then why analyze the breathing patterns so much? Does doing so provide more insight into the borders of pFL? I did not think it was compellingly argued.

      We have edited the introduction to address this comment and emphasize the rationale for the study. We also edited the results section to summarize our findings.

      We continue to report our in-depth analysis of the perturbations induced by bicuculline injection over the various respiratory characteristics as this will be fundamental to determine the effects of our experiment not only on the activation of pFL and active expiration, but also on the respiratory network in general. In order to be fair and open about our findings we have reported the results of our analysis in detail. Of note, all sites generated active expiration, but since the objective of the study was to determine the sites with the most significant changes, a finer and multilevel analysis has been used.

      Further, lines 382-386 make a point about decreasing inspiratory time even though the data do not meet the statistical threshold. In lines 386-395, the reporting appears to reach significance (line 388) but not reach significance (line 389). I had trouble making sense of that disparity.

      The statistics were confirmed, and the lines edited as follows: “Interestingly, the duration of inspiration during the response was found to decrease in all groups relative to baseline respiration (Ti response = 0.279 ± 0.034s, Ti baseline = 0.318 ± 0.043s, Wilcoxon rank sum: Z = 3.24, p = 0.001). Contrary to this decrease in inspiratory duration, the total expiratory time was observed to increase in all groups and remained elevated compared to baseline (TE response = 1.313 ± 0.188s, TE baseline = 1.029 ± 0.161s, Wilcoxon rank sum: Z = 4.49, p = 0.001).”

      The other statistical hiccups include "tended towards significance" (line 454), "were found to only reach significance for a short portion of the response" (line 486-7), "did not reach the level of significance" (line 506), which gives one the sense of cherry picking or over-analysis. Frankly, this reviewer finds the paper much more compelling when just asking whether the microinjections evoke active expiration. If yes, then the site is probably part of the pFL.

      Statistical “tendencies” have been eliminated throughout the manuscript.

      We have analyzed our results in detail in order to determine changes and differential effects on respiration when comparing the 5 sites of injection. Although the presentation of the results may seem tedious, it has allowed us to highlight some interesting effects. The first concerns respiratory frequency. It has been shown in the past that optogenetic stimulation of this area causes an increase in respiratory frequency (Pagliardini et al., 2011, PMID: 21414911), whereas disinhibition with this same approach or stimulation of AMPA receptors in pFL has shown a reduction in frequency or no significant change in the response (Pagliardini et al., 2011, PMID: 21414911; Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al., 2016, PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151). Here, we suggest that the reduction in respiratory frequency is observed only at the caudal sites and could be attributed to BötC effects rather than to stimulation of the core of the pFL, since no such change was observed where the effect was most potent (rostral sites). Another interesting point concerns the effects on O2 consumption: although they are difficult to interpret at this point, we found it very interesting that hyperventilation occurred only at the most rostral injection sites.

      I encourage the authors to consider the fickleness of p-values in general and urge them to consider not just p but also effect size.

      Thank you for the feedback on our description of the statistical results and the suggestion of incorporating effect size. We have now included measurements of effect size in the results section. Specifically, we calculated the effect size within each ANOVA using the value of eta squared for all data shown in Figures 3 and 4. Please note that in our phase-plane analysis (Fig. 5-6) the Mahalanobis distance is itself an effect size measure for multidimensional data. We also note that statistical evaluation using non-parametric analyses does not involve effect sizes.
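
      As an illustration of the eta squared measure mentioned above, the following minimal sketch computes it for a simple one-way layout as the ratio of between-group to total sum of squares. The function name `eta_squared` and the example values are hypothetical and are not taken from the manuscript; the ANOVA designs actually used for Figures 3 and 4 may differ (for example, repeated measures), in which case the partitioning of variance would need to be adapted.

```python
def eta_squared(groups):
    """Return eta squared (SS_between / SS_total) for a one-way layout.

    `groups` is a list of lists, one list of observations per group.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Total variability of every observation around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Variability of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Hypothetical data: three groups of three observations each.
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(eta_squared(example_groups))
```

      A value close to 1 indicates that most of the variance is explained by group membership, while a value close to 0 indicates little difference between groups.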

      Reviewer #3 (Public Review):

      Summary:

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in controlling active expiration. Stereotactic injections of bicuculline were utilized to map various pFL sites and their impact on respiration. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study indicates that the rostrocaudal organization of the pFL and its influence on breathing is not simple and uniform.

      Strengths:

      The data provide novel insights into the importance of rostral locations in controlling active expiration. The authors use innovative analytic methods to characterize the respiratory effects of bicuculline injections into various areas of the pFL.

      Weaknesses:

      Bicuculline injections increase the excitability of neurons. Aside from blocking GABA receptors, bicuculline also inhibits calcium-activated potassium currents and potentiates NMDA current, thus insights into the role of GABAergic inhibition are limited.

      Increasing the excitability of neurons provides little insights into the activity pattern and function of the activated neurons. Without recording from the activated neurons, it is impossible to know whether an effect on active expiration or any other respiratory phase is caused by bicuculline acting on rhythmogenic neurons or tonic neurons that modulate respiration. While this approach is inappropriate to study the functional extent of the conditional "oscillator" for active expiration, it provides valuable insights into this region's complex role in controlling breathing.

      We have included a reflection on the weaknesses of our study in the technical considerations section to address the possibility that bicuculline may induce active expiration through other mechanisms. Please note that bicuculline was not used to gain further insight into GABAergic inhibition of the pFL but as a tool to generate active expiration that has been extensively validated by our group and others.

      Multiple studies have shown recruitment of excitatory late expiratory neurons with bicuculline injections. Although we did not record from late-E neurons in this study, we infer from the body of literature that disinhibition of neurons in this area will activate late-E neurons (as previously demonstrated) and generate active expiration. Although we see value in recording activity of single neurons (especially to study mechanisms of rhythmogenesis), we opted to measure the physiological response from respiratory muscles as an indication of active expiration recruitment in vivo. Recording from single neurons after bicuculline injections in each site would confirm the presence of expiratory neurons along the parafacial area, which is probably not surprising, since every site tested promoted active expiration. The focus of the study though was to determine the site with the strongest physiological response to disinhibition. Future studies will be key to determine whether all neurons along this column have similar electrophysiological rhythmic properties to the ones recently reported (Magalhaes et al., 2021; PMID: 34510468), or some of them simply provide tonic drive to late-E neurons located elsewhere.

      We have discussed the issue as follows:

      “Our experiments focused on determining the area in the pFL that is most effective in generating active expiration as measured by ABD EMG activity and expiratory flow. We did not attempt to record single cell neuronal activity at various locations as previously shown in other studies (Pagliardini et al 2011; Magalhaes et al., 2021), as this approach would most likely find some late-E neurons across the pFL and thus not effectively discriminate between areas of the pFL. Future studies involving multi-unit recordings or imaging of cell population activities will help to determine the firing pattern and population density of bicuculline-activated cells and further determine differences in distribution and function of late-E neurons across the region of the pFL.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Overall, the manuscript addresses an important question in the field, the anatomical location of the expiratory oscillator. I commend the authors for a well-thought-out and clearly presented study. However, a few small concerns deserve attention to improve the clarity of the report.

      (1) The figures would benefit from a rostral-to-caudal representation of results instead of a caudal-to-rostral orientation. Example, Figure 2.

      We opted for a caudal to rostral representation to progressively move away from the inspiratory oscillator (preBötC) and the anatomical reference point (the caudal tip of the facial nucleus) with our series of injections. 

      (2) A discussion about how expiratory responses generated by these pharmacological approaches would compare to endogenous baseline conditions. The authors mention that bicuculline injections elicited a late-E downward inflection that was absent in baseline conditions. Thus, this raises the point of how these findings compare to awake freely moving animals or during different conditions of increased ventilatory demand.

      This is an interesting question that has not yet been addressed in the field. As far as we know, there are no recordings of pFL neurons in freely behaving animals, although recordings of pFL late-E neurons under elevated PaCO2 have shown late-E activity in in situ preparations (Britto & Moraes, 2017; PMID: 28004411; Magalhaes et al., 2021; PMID: 34510468).

      We have clarified this in the discussion as follows:

      “At rest, respiratory activity does not present with active expiration (i.e., expiratory flow below its functional residual capacity in conjunction with expiratory-related ABD muscle recruitment) and expiratory flow occurs due to passive recoil of chest wall with no contribution of abdominal activity. Active expiration and abdominal recruitment can be spontaneously observed during sleep (in particular REM sleep, Andrews and Pagliardini, 2015; Pisanski et al., 2019) and can be triggered during increased respiratory drive (e.g. Hypercapnia, RTN stimulation, Abbott et al., 2011). Although never assessed in freely moving, unanesthetized rodents, bicuculline has been extensively used to generate active expiration and late-E neuron activity in both juvenile and adult anesthetized rats (Pagliardini et al., 2011; Huckstepp et al., 2015; Huckstepp et al., 2016; Huckstepp et al., 2018; De Britto and Moraes, 2017; Magalhaes et al., 2021).”

      (3) In Figure 2A, there appears to be an injection site in the top right quadrant of the image, very distant from the intended site. Could the authors confirm if this is an artifact?

      Yes, it is an artifact of image acquisition; we should have marked that in the figure. To avoid confusion and follow the other reviewers’ suggestions, we have edited the figure.

      (4) A stylistic suggestion would be to include the subpanel of Figure 2C saline control injection as a graph of its own and also include the control anatomical location in 2B.

      Thanks for the suggestion. Because of the complex organization of the figure we opted to leave it as a subpanel in order to not distract the reader from the 5 injection sites, but still provide information about vehicle injection and their lack of changes in respiratory response.

      (5) The authors note that DIAm Area (norm.) during the inspiratory phase is increased in the +6 and +8mm groups. However, Figure 5E shows that the +8mm group is significantly reduced as compared to the +6mm group. Please clarify.

      During the inspiratory phase we did not observe any significant change in the DIA Area (norm.). We realize that the description of this part of the results was confusing and therefore we have eliminated that section.

      Reviewer #2 (Recommendations For The Authors):

      I encourage the authors to consider the fickleness of p-values in general and urge them to consider not just p but also effect size. There is a valuable editorial in this week's J Physiology (https://doi.org/10.1113/JP285575) that may provide helpful guidance.

      Thanks for these comments and the general assessment. We realized that the results section was dense and contained a lot of information. We significantly slimmed the description of the results in order to facilitate their appreciation and avoid confounding statements about significant vs non-significant results.

      We have now included measurements of effect size in the results section. Specifically, we calculated the effect size within each ANOVA using the value of eta squared for all data shown in Figures 3 and 4. Please note that in our phase-plane analysis (Fig. 5-6) the Mahalanobis distance is itself an effect size measure for multidimensional data. We also note that statistical evaluation using non-parametric analyses does not involve effect sizes.
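
      To make the role of the Mahalanobis distance as a multidimensional effect size more concrete, the sketch below computes the distance of a single point from a cloud of reference points, scaled by the covariance of that cloud. It assumes two-dimensional phase-plane data and uses only the Python standard library; the function name `mahalanobis_2d` and the example coordinates are hypothetical, and the sketch is not the Matlab routine used in the study.

```python
import math


def mahalanobis_2d(point, cloud):
    """Mahalanobis distance of a 2D `point` from the 2D points in `cloud`.

    The covariance matrix is estimated from `cloud` (sample covariance,
    n - 1 denominator) and inverted in closed form.
    """
    n = len(cloud)
    mean_x = sum(p[0] for p in cloud) / n
    mean_y = sum(p[1] for p in cloud) / n
    s_xx = sum((p[0] - mean_x) ** 2 for p in cloud) / (n - 1)
    s_yy = sum((p[1] - mean_y) ** 2 for p in cloud) / (n - 1)
    s_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in cloud) / (n - 1)
    det = s_xx * s_yy - s_xy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = point[0] - mean_x
    dy = point[1] - mean_y
    # Quadratic form d^T * inverse(covariance) * d for a 2x2 matrix.
    d_squared = (s_yy * dx * dx - 2 * s_xy * dx * dy + s_xx * dy * dy) / det
    return math.sqrt(d_squared)


# Hypothetical baseline cloud and one response point.
baseline_cloud = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(mahalanobis_2d((3.0, 1.0), baseline_cloud))
```

      Because the distance is expressed in units of the reference cloud's variability, it can be compared across signals recorded on different scales.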

      The equipment and resources should be clearly identified and use RRIDs whenever possible. Resources like antibodies and other reagents (e.g., cryoprotectants) should be identified, not just by manufacturer, but also by specific part or product numbers or identifiers.

      Manuscript has been edited to add these details.

      The manuscript makes reference to ImageJ and Matlab routines, which must be public through GitHub or another stable repository.

      Thanks for pointing this out. ImageJ analysis has been performed following scripts already available to users (no custom scripts). The Matlab scripts used for the multivariate analysis are now available at: https://github.com/mprosteb/Pisanski2024

      The way that ABD-DIA coupling was assessed was unclear from the Methods.

      The following text has been added to the methods: “The coupling between ABD and DIA signals was measured as a ratio and analyzed by quantifying the number of bursts of activity observed for the ABD and DIA EMG signals during the first 10 minutes of the response, excluding time bins at end of the response (due to fading and waning of the ABD response in those instances).”
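
      The burst-count ratio described in the added methods text can be sketched as follows. The methods text does not state how bursts were detected, so the simple upward threshold-crossing rule, the function names `count_bursts` and `coupling_ratio`, and the example traces are all assumptions made for illustration only.

```python
def count_bursts(signal, threshold):
    """Count bursts as upward crossings of `threshold` in an EMG envelope.

    Assumes `signal` is an already rectified and integrated trace.
    """
    bursts = 0
    above = False
    for value in signal:
        if value > threshold and not above:
            bursts += 1  # A new burst starts when the trace rises above threshold.
        above = value > threshold
    return bursts


def coupling_ratio(abd, dia, abd_threshold, dia_threshold):
    """Ratio of ABD bursts to DIA bursts over the same analysis window."""
    dia_bursts = count_bursts(dia, dia_threshold)
    if dia_bursts == 0:
        raise ValueError("no DIA bursts detected in the window")
    return count_bursts(abd, abd_threshold) / dia_bursts


# Hypothetical envelopes: ABD is recruited on every second DIA cycle.
dia_trace = [0, 1, 0, 1, 0, 1, 0, 1, 0]
abd_trace = [0, 0, 1, 0, 0, 0, 1, 0, 0]
print(coupling_ratio(abd_trace, dia_trace, 0.5, 0.5))
```

      Under this convention, a ratio of 1 corresponds to one ABD burst per DIA burst, and lower values indicate that ABD recruitment skips some respiratory cycles.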

      Fig. 1A was never cited in the text.

      It has been cited now.

      Fig. 1A-C appears to be exactly the same as Fig. 5A-C.

      The reviewer is correct. We have used figure 1 to describe and explain our analytical methods with sample data and Figure 5 describes our results. We have clarified that in: “Figure 5: Rostral injections elicit more prominent changes to respiration in each signal and sub-period. A-C: Is the same as Method Figure 1, has been included here for further clarity when analyzing the results.”

      Late Expiratory airflow is given in units of volts (V) in lines 358-363 (Fig. 4C) but then in units of volts-seconds (V•s) in lines 363-367. Both units are problematic because the voltage is neither an air volume nor an air volume per unit time. Is there some conversion factor left out?

      In this section of the results we describe the changes in expiratory peak amplitude (V) and expiratory peak flow (V•s). Since calibration of airflow was performed on the positive flow and for larger volumes, we prefer to use the original units to guarantee precise assessment of the change and avoid introducing potential errors. Since the analysis considers changes from baseline readings, converting to ml or ml*s would not affect our analysis.

      Reviewer #3 (Recommendations For The Authors):

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in respiratory control, specifically in modulating active expiration. The precise location of this expiratory oscillator within the ventral medulla remains uncertain, with some studies indicating that the caudal tip of the facial nucleus (VIIc) forms the core while others propose more rostral areas. Bicuculline injections were utilized at various pFL sites to explore the impact of these injections on respiration. The authors use innovative and impressive analytic methods to characterize the effect on respiratory activity. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study will contribute to an enhanced understanding of the neural mechanisms controlling active expiration. The main message of the study is that the rostro-caudal organization of the pFL is not simple and uniform. The data provides novel insights into the importance of rostral locations in controlling active expiration (see e.g. lines 738-740).

      The data and results of the paper are intriguing, and it appears that the experiments are well-managed and executed. However, there are several major and minor comments and suggestions that should be addressed by the authors:

      (1) The study relies heavily on local injections into specific areas that are confirmed histologically. One potential concern is the injection volume of 200 nL in such a tiny area. The authors suggest that the drug did not spread to rostral/caudal areas outside the specified coordinate partly based on their cFOS staining. For example, the lack of cFOS activation in TH+ cells and Phox2B cells is interpreted as proof that bicuculline did not spread to these somas (Figure 2). The authors seem to use a similar argument as evidence that the pFL does not include Phox2B neurons in the RTN as discussed in the Discussion section (lines 830-847). However, it is very surprising that bicuculline injections into an area that is known to contain Phox2B and Th+ neurons do not activate these neurons as assessed by the cFOS staining. It seems puzzling to me that none of their injections shown in Figure 2 activated Phox2B or Th neurons. I assume that in targeting the pFL the authors must have sometimes hit areas that included neurons that define the RTN, which would have activated Phox2B or Th+ neurons. Did the authors find that these activations did not activate active expiration? Such negative "controls" would strengthen their argument that pFL is a separate and distinct region that selectively controls active expiration.

      Thanks for the positive feedback on the manuscript. As has been demonstrated and discussed in several previous publications, PHOX2B-expressing neurons in this area of the brain include the RTN Neuromedin B-positive neurons (more densely located in the ventral parafacial rather than the lateral parafacial, our site of injection), the TH+ C1 neurons (located in a somewhat more caudal and medial position compared to our sites of injection, around the BötC/preBötC area) and the large facial MNs (easily identifiable by their large size and compact location). Given this differential spatial distribution, and the controls described below, we believe we have reduced the possibility of the direct activation of these neurons, although we can’t exclude it in full.

      There is now strong evidence for the lack of PHOX2B expression in late-E neurons in juvenile and adult rats (Magalhaes et al., 2021; PMID: 34510468). We realize that the microinjected solution could potentially diffuse in the brain and reach other areas, but we combined two strategies to verify that our injections were focal and activated only a restricted area of the brain (i.e., the pFL): i) localization of fluorobeads that were diluted in the bicuculline solution; ii) expression of cFos combined with anatomical markers, to identify activated cells. Fluorobeads have a very limited spread in the brain and therefore informed us of the site of the injection, allowing us to differentiate between the five injection locations. Although we can’t assume that bicuculline will have a similar spread (and it will also be quickly degraded in the tissue), the combination of this analysis with the localized expression of cFos has helped us to differentiate between injection sites. Because of the proximity of PHOX2B cells in the RTN and C1 neurons, we also combined cFos expression with immunohistochemistry to determine whether bicuculline activation was also visible in these two neuronal populations. Our results indicate that there is baseline cFos activity in RTN neurons (see vehicle injection) but the fraction of activated PHOX2B cells did not increase with bicuculline injections, suggesting that these neurons were not the target of our injections. Please note that cFos expression has been extensively used to determine RTN neuron activation, especially following chemoreflex responses.

      (2) The authors refer to "the expiratory oscillator" throughout the manuscript (e.g. lines 58, 62, 65) as if there is only one expiratory oscillator i.e. "the expiratory oscillator". For some reason, the authors avoided citing and mentioning PiCo (Anderson et al. 2016), which is considered the oscillator for postinspiration. Since the present study focuses on the role of expiration, and since the authors describe convincing effects on postinspiration, considering this oscillator which is located dorsomedial to the VRC seems relevant for the present study.

      Due to the limited and controversial literature currently describing PiCo as a third oscillator, and the fact that our studies do not directly assess post-inspiratory activity (as measured by the V nerve or laryngeal muscles) or PiCo activity and location (which would be even more distant than the RTN, for example), we prefer to avoid commenting on the effects of this injection on PiCo or the connectivity between PiCo and pFL.

      We have added this to the discussion:

      “Therefore, although it has previously been described, it is currently unknown the exact mechanism by which this post-I activity in the ABD muscles is generated. For example the interplay between the rostral pFL and brainstem structures generating post-inspiratory activity, such as the proposed post-inspiratory oscillator (PiCo; Anderson et al., 2016) or pontine respiratory networks, could be reasonably involved in this process.”

      (3) The authors do not specify what type of bicuculline they injected. Bicuculline is known to have significant effects on potassium channels. Thus, the effects reported here could be due to a non-specific change in excitability, rather than caused by a specific GABAergic blockade.

      The authors also do not know what effects these injections cause in the neurons in vivo, since the injections are not accompanied by recordings from the respiratory neurons that they activate. This together with the non-specific bicuculline effects will affect the interpretation of the results. Thus, the authors need to be more careful when interpreting their effects as "GABAergic". The use of more specific blockers like gabazine could partly address this concern. The authors have to discuss this in a "limitation section".

      Thanks for pointing that out; we have now clarified in the methods section that we used bicuculline methochloride. We can’t exclude that some side effects could be present due to the use of this drug. For the purpose of this study, though, we focused on using bicuculline as a tool to consistently generate active expiration, since it has been extensively used by multiple laboratories to induce abdominal muscle recruitment and active expiration, as well as to directly record late-E neurons in this same area.

      We have included in the discussion the following statement:

      “Technical considerations

      Bicuculline methiodide has previously been observed to exhibit inhibitory effects on Ca2+ activated K+ currents inducing non-specific potentiation of NMDA currents (Johnson and Seutin, 1997). Consequently, caution is warranted in attributing our findings solely to the GABAA antagonist properties of bicuculline. Previous work has demonstrated a temporal correlation between the onset of late-E neuron activity in the caudal parafacial region and ABD activity in response to bicuculline (Pagliardini et al., 2011; de Britto and Moraes, 2017; Magalhaes et al., 2021) as well as GABAergic sIPSCs in late-E neurons (Magalhaes et al., 2012). However, it is essential to note that the current study lacks single unit recording, preventing us from definitively confirming whether the observed activity stems from late-E neuronal GABAergic disinhibition or excitation through non-GABAergic mechanisms.”

      (4) I also caution the authors when stating that the bicuculline injections will reveal the precise location and functional boundaries of "the" expiratory oscillation within the pFL. Increasing the excitability with bicuculline is inappropriate to study the functional boundaries of an oscillator. It is particularly inappropriate to identify the boundaries of the pFL, a network that is normally inactive and activated only under certain behavioral and metabolic conditions. Because the injections are increasing the neuronal excitability unspecifically, and because the authors are not recording the activity of the neurons in the pFL region it is unclear what kind of neurons are activated. The cFOS staining may help to define whether these neurons are Phox2B or Th positive or negative, but they will not provide insights into the activity patterns of the activated neurons. Thus, it is fair to assume that these injections will likely include also tonic neurons that might indirectly control the activity of pFL neurons under certain metabolic or behavioral conditions without actually being involved in the rhythmogenesis of active expiration. Many of the effects peak after several minutes, and different regions cause differential effects with different time courses, which is difficult to interpret functionally. Thus, the "core" identified in the present study could consist of tonic neurons as opposed to rhythmic neurons generating active expiration.

      We agree with the reviewer that our local injections may have activated a heterogeneous population of neurons. We do not claim that we only activated late-E rhythmogenic neurons, but rather that our multiple sites of injection revealed the area that generates the strongest excitation of ABD muscles and active expiration.

      While the use of GABA receptor antagonists may have an indiscriminate effect on GABA receptor-expressing neurons with various phenotypes, anatomical assessment of inhibitory cells has shown very little distribution of GABAergic and glycinergic cells in the parafacial area (Tanaka et al., 2003; PMID: 14512139), and it has been inferred in multiple publications (Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al., 2016, PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151; Flor et al., 2020, PMID: 32621515; Britto & Moraes, 2017; PMID: 28004411; Silva et al., 2016; PMID: 26900003) and demonstrated recently (Magalhaes et al., 2021; PMID: 34510468) that late-E neurons in the parafacial region are excitatory and have a glutamatergic phenotype.

      As suggested by the reviewer, it is possible that the bicuculline injection may have activated some tonic, non-rhythmogenic neurons which could activate an expiratory oscillator located elsewhere.

      We have edited the introduction as follows:

      “By strategically administering localized volumes of bicuculline at multiple rostrocaudal levels of the ventral brainstem, we aimed to selectively enhance the excitability of neurons driving active expiration, thereby revealing the extension of the pharmacological response and the most efficient site in generating active expiration.”

      We have edited the results as follows:

      “Importantly, the group with injection sites at +0.6 mm from VIIc exhibited the swiftest response onset, suggesting that this area is the most critical for the generation of active expiration, either through direct activation of the expiratory oscillator or, alternatively, for providing a strong tonic drive to late-E neurons located elsewhere.”

      In the introduction, it should also be emphasized that the pharmacological approach used in the present study complements the existing elegant chemogenetic studies, rather than emphasizing primarily the limitations of the chemogenetic inhibitions. The conclusion should be that these studies together provide different, yet complementary insights: The chemogenetic approach by inhibiting neurons, the present study by exciting neurons, and all studies come with their own limitations.

      Thanks for the suggestion, we have updated the manuscript as follows:

      “Although both of these elegant chemogenetic studies have contributed extensively to our understanding of the pFL, the existing evidence suggests that the expiratory oscillator may expand beyond the limits of the viral expression achieved in said studies, as proposed by Huckstepp et al., (2015).”

      Throughout the manuscript, the authors have to be cautious when implying that an excitatory effect relates to the activity of rhythmogenic pFL neurons. For example, on line 710 the authors state that "it is conceivable to infer that the rostral pFL is in the closest proximity to the cells responsible for the generation of active expiration". While it may indeed be "conceivable", the bicuculline injections themselves provide no insights into the location of neurons responsible for rhythmogenesis. It is equally "conceivable" that the excited neurons provide a tonic drive to the neurons without being involved in the generation of active expiration. These tonic neurons could be located at a distance from the presumed rhythmogenic core.

      We have included the possibility of tonic excitation in the technical considerations section:

      “However, our study did not include recording from late-E neurons following bicuculline injections, preventing us from definitively confirming whether the observed activity stems from late-E neuronal excitation or the potentiation of a tonic drive, particularly in the rostral areas.”

      (5) It is intriguing that some of their injections (Fig.2D) evoked postinspiratory activity. This interesting finding should be discussed as it could provide important insights into the coordination of the different phases of expiration.

      Thanks for the suggestion. We have included the following to the discussion:

      “Therefore, although it has previously been described, the exact mechanism by which this post-I ABD activity is generated is unclear. This late-E/post-I pattern of activity is similar to what has been observed in in vitro preparations and in vivo recordings in juvenile rats (Janczewski et al., 2002; Janczewski et al., 2006).

      “Therefore, although it has previously been described, it is currently unknown the exact mechanism by which this post-I activity in the ABD muscles is generated. For example the interplay between the rostral pFL and brainstem structures generating post-inspiratory activity, such as the proposed post-inspiratory oscillator (PiCo; Anderson et al., 2016) or pontine respiratory networks, could be reasonably involved in this process.”

      (6) The authors conducted bilateral disinhibition of the pFL, but only a unilateral photomicrograph was shown. Figure 2 should include a representative bilateral photomicrograph along with a scatter plot for clarity and completeness.

      We have edited figure 2 to include representative images of bilateral injections.

      (7) Regarding the Bicuculline injections in the Methods section: Aside from specifying exactly what type of bicuculline was used, the authors should provide more information about the pFL location and landmarks used, including the missing medial-lateral coordinate. The fluorobead spread of approximately ~300 µm, as observed in Figure 2C, is crucial for the interpretation of the results and should be detailed. An alternative approach could involve e.g. calculating the area covered by fluorobeads in each group.

      We have included the following in the text:

      “Each rat was injected at 2.8 mm lateral from the midline and at a specific RC coordinate based on the following groups: -0.2 mm from the caudal tip of the facial nucleus (VIIc) (n=5), +0.1 mm from VIIc (n=7), +0.4 mm from VIIc (n=5), +0.6 mm from VIIc (n=6), +0.8 mm from VIIc (n=5)”

      “These findings strongly suggest that bicuculline specifically activated cells within the vicinity of the injection sites which spread ~300 µm (Figure 2C, horizontal lines) and did not activate PHOX2B+ cells in the RTN area, beyond their baseline level of activity.”

      (8) In the Experimental Protocol, the authors should provide more details on how the parameters were determined. For example, specify the number of cycles included for Dia frequency/amplitude, Abd frequency/amplitude, and with regards to the averaging process, the authors should specify over how many cycles they obtained an average for Dia/Abd activity time and AUC. The authors should also provide information on the number of bicuculline injections that they repeated to average these values and they should report the coefficient of variation for repeated injections. Please clarify the method used to calculate AUC, considering the non-linear nature of the activity.

      Only one bicuculline injection per rat was performed and the number of rats used for each injection site is indicated in the methods as follows:

      “Each rat was injected at 2.8 mm lateral from the midline and at a specific RC coordinate based on the following groups: -0.2 mm from the caudal tip of the facial nucleus (VIIc) (n=5), +0.1 mm from VIIc (n=7), +0.4 mm from VIIc (n=5), +0.6 mm from VIIc (n=6), +0.8 mm from VIIc (n=5), and CTRL (n=7). We recorded the physiological responses to the injection for 20-25 min.”

      We have clarified in the methods section the following:

      “Respiratory data was tracked in time bins of 2-minute duration from the baseline period prior to injections and spanned 20 min of recording post-injection. Mean-cycle measurements for each signal were computed by averaging values across all cycles within a given time bin.”

      Additional clarifications have been added:

      “We then used the average calculations of respiratory rate (RR), tidal volume (VT), Minute Ventilation (Ve), expiratory ABD amplitude, expiratory ABD area, VO2, VE/VO2 to obtain values relative to the baseline period. Peak responses were identified as the time bin that produced the strongest changes relative to baseline.”

      “Mean-cycle measurements for each signal were computed by averaging across all cycles within a given time bin (~300 cycles in baseline, ~100 cycles per response time bin). We then used the average calculations of respiratory rate (RR), tidal volume (VT), Minute Ventilation (Ve), expiratory ABD amplitude, expiratory ABD area, VO2, VE/VO2 to obtain values relative to the baseline period. Peak responses were identified as the time bin that produced the strongest changes relative to baseline.”

      “The Area under the curve (AUC) was measured during baseline and was subtracted from the corresponding AUC of the response for each time bin (Figure 1C). This AUC measure was computed as the sum of the signal in a given respiratory phase as all signals were sampled at the same rate. Note that areas calculated below the zero (0) line, as would be expected from a negative airflow during expiration, yield negative AUC values.”
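
      The binning, peak-response and AUC steps quoted above can be sketched as follows. The 2-minute bin width and the sum-of-samples AUC follow the quoted methods text; the function names, the use of absolute change to pick the peak bin, and the example values are assumptions made for illustration and do not reproduce the authors' Matlab scripts.

```python
def bin_means(cycle_times, cycle_values, bin_seconds=120.0):
    """Average per-cycle measurements within consecutive time bins.

    Returns a dict mapping bin index (0 = first 2 minutes) to the mean value.
    """
    sums, counts = {}, {}
    for t, v in zip(cycle_times, cycle_values):
        b = int(t // bin_seconds)
        sums[b] = sums.get(b, 0.0) + v
        counts[b] = counts.get(b, 0) + 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


def peak_bin(binned, baseline):
    """Index of the bin whose mean differs most (in absolute terms) from baseline."""
    return max(binned, key=lambda b: abs(binned[b] - baseline))


def phase_auc(samples):
    """AUC of one respiratory phase as the plain sum of equally spaced samples."""
    return sum(samples)


def auc_change(response_samples, baseline_samples):
    """Baseline-subtracted AUC; negative airflow gives negative values."""
    return phase_auc(response_samples) - phase_auc(baseline_samples)


# Hypothetical per-cycle values (seconds after injection, arbitrary units).
binned = bin_means([10, 50, 130, 200], [1.0, 3.0, 5.0, 7.0])
print(binned, peak_bin(binned, baseline=2.5))
print(auc_change([-1.0, -2.0, -1.0], [0.0, -1.0, 0.0]))
```

      Summing samples is equivalent to integrating only up to a constant factor (the sampling interval), which is why the quoted text stresses that all signals were sampled at the same rate.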

      (9) The authors should explain how oxygen consumption was calculated-did it involve the Depocas & Hart (1957) formula? Please provide information on expiratory CO2, whether ventilation was adjusted to achieve consistent CO2 levels across animals, and ideally specify the end-tidal CO2 range for the experiments. Discuss the rationale behind the chosen CO2 levels and whether CO2-dependent pFL activity could have influenced results.

      We have clarified the measurement in the methods as follows:

      “The gas analyzer measured fractional concentration of O2. Based on this and the flow rate at the level of the trachea (minute ventilation), we calculated O2 consumption according to Depocas and Hart (1957).”

      We have also added to the methods section:

      “During the entire experimental procedure, rats breathed spontaneously and end tidal CO2 was not adjusted through the experimental protocol.”

      Regarding the possibility that CO2-dependent pFL activity influenced the results: inducing active expiration in conditions in which there is no physiological demand for it (i.e. no hypoxia or hypercapnia) likely reduces pCO2, which in turn decreases the drive for ABD activity. Our results are therefore likely an underestimate of the response that would have been produced had CO2 levels been held constant.

      (10) The authors should address the discrepancy in fos-activated neurons between the control (44 neurons) and experimental animals (90-120 neurons per hemisection). Please explain the activation in the control group. Please also provide insights into how the authors interpret this difference in cfos-activated neurons between control and experimental groups.

      The following paragraph has been added to the discussion:

      “The assessment of cellular activity, quantified through cFos staining, unveiled the existence of basal activity in control rats. This observed baseline activity is likely emanating from subthreshold physiological processes within the parafacial area which do not culminate in ABD activity. Analysis of the cFos staining confirmed focal activation of neurons in the pFL of rats injected with bicuculline and minimal cFos expression in the PHOX2B+ cells in all groups as compared to the control group. These results confirm the very limited mediolateral spread of the drug from the core site of injection and back previous findings supporting the hypothesis that the majority of PHOX2B+ cells are more ventrally located in the parafacial area (pFV, Huckstepp et al., 2015) and PHOX2B+ cell recruitment is not necessary for active expiration (de Britto & Moraes, 2017; Magalhães et al., 2021).”

      (11) In Figure 8, the authors plotted the relationship of each cycle correlated to the normalized area. Have you also calculated the same late-E, inspiratory, and post-I to fR or VT separately?

      No. We performed the phase-separated analysis (late-E, I, post-I) only for the DIA, airflow and ABD areas, and for the Euclidean and Mahalanobis distances.

      Minor comments:

      Is there any specific reason for conducting these experiments exclusively in males?

      No, we usually use male rats for this type of experiment. We use both male and female rats for other studies that concern the effects of sex hormones, but in this case we performed experiments only in male rats.

      Page 13, Line 320: What is the duration of the bicuculline-induced effects?

      This information is included in the results section as follows:

      “Similarly, the ABD response duration was longer at the two most rostral locations (+0.6 mm = 17.6 ± 2.7 min; +0.8 = 17.1 ± 3.3 min) compared to the most caudal group (-0.2 mm = 2.4 ± 1.1 min; One-Way ANOVA p = 0.043; Tukey -0.2 mm vs +0.6 mm: p = 0.048; -0.2 mm vs +0.8 mm: p = 0.041; Figure 3E).”

      Page 16, Line 400: Is there a rationale for the high tidal volume (VT) observed in these animals? A baseline VT of 7 ml/kg appears notably elevated.

      Please note that rats were vagotomised and spontaneously breathing, hence the tidal volume is increased compared to non-vagotomised rats as seen in previous studies (Ouahchi et al., 2011).

      Figure 2D: Could you provide longer recordings? Additionally, incorporating diaphragm (Dia) recordings would enhance the interpretation of abdominal (Abd) recordings.

      Figure 3A shows a representative example of the 20-minute recordings for each location.

      Page 18, Line 458: Please rectify "Dunn: p , 0.001" to the appropriate format, perhaps "Dunn: p < 0.001."

      Thank you, edited.

    1. Author response:

      eLife assessment

      “…The evidence however is incomplete, since the tai loss-of-clone phenotype is based on one allele and the mechanism involved in cell competition through Dlp and Wg lacks adequate supporting data.”

      We agree with the need for a second allele and are adding supporting data from a new tai lof allele we have generated by Crispr.

      We also agree that additional functional data would help demonstrate that differences in Dlp levels are required for the mechanism of Tai cell competition. Experiments are ongoing to test whether normalizing Dlp levels across clonal boundaries rescues elimination of Tai-low clones.

      Reviewer #1:

      Overall Statements:

      “There is some data in the supplementary materials suggesting that Tai promotes dlp mRNA expression, but this was not compelling.”

      We are currently testing effects of Tai on dlp and dally transcription using qPCR and reporter transgenes. As noted below, the effects of Tai on Dlp trafficking are ‘strong’, so resolving effects on Dlp transcription will complement this localization data.

      “The authors don't further examine Dlp protein in tai clones.”

      As noted by the Reviewer, we do examine Dlp levels and localization in tai-low clones (see Figure 9), but these experiments are challenging due to their very small size and the hypomorphic nature of the tai allele (tai[k15101]) that was used. Experiments are in progress to examine the effect of our Crispr null allele of tai on Dlp levels and localization in wing clones.

      “In sum, the authors have uncovered some interesting results, but the story has some unresolved issues that, if addressed, could boost its impact. Additionally, the preprint seems to have 2 stories, one about tai and cell competition and the other about tai and Wg distribution. It would be helpful to reorder the figures and improve the narrative so that these are better integrated with each other.”

      We agree. The results of our modifier screen required that we first understand how Tai regulates the Wg pathway before we could apply this to understanding the competitive mechanism. Thus, the paper is composed of three sections: 1. the screen, 2. the Tai-Dlp-Wg connection in the absence of competition, and 3. the contribution of Dlp-Wg to the tai[low] ‘loser’ phenotype. These sections use different techniques (e.g., clonal mosaics with genomic alleles, Gal4/UAS and RNAi to define the effect of Tai loss on Wg and Dlp). Ongoing experiments return to clonal mosaics to test whether elevating Dlp can rescue tai lof clones in the same manner as Apc/Apc2 alleles (see Figs. 2-3), which elevate Wg pathway activity.

      Specifics:

      “It would be good to know whether the authors can rescue tai-low clones by over-expression UAS-Dlp.”

      As noted above, experiments are ongoing to test whether normalizing Dlp levels across clonal boundaries rescues elimination of Tai-low clones.

      “The data on Wg distribution seems disjointed from the data about cell competition. The authors could refocus the paper to emphasize the cell competition story. The role of Dlp in Wg distribution is well established, so the authors could remove or condense these results. The story really could be Figs 1, 2, 3 and 7 and keep the paper focused on cell competition. The authors could then discuss Dlp as needed for Wg signaling transduction, which is already established in the literature.”

      We appreciate the suggestion to reorganize the figures to focus the first part of the story on competition, and then follow with the role of Tai in controlling Dlp. We will consider this approach pending the results of ongoing experiments.  

      “The model of tai controlling dlp mRNA and Dlp protein distribution is confusing. In fact, the data for the former is weak, while the data for the latter is strong. I suggest that the authors focus on the altered Dlp protein distribution on tai-low clones. It would also be helpful to prove the Wg signaling is impeded in tai clones (see #5 below).”

      We agree but are currently testing how dlp reporters and mRNA respond to Tai in order to rigorously test a Dlp transcriptional mechanism. To complement the ‘strong’ evidence that Tai regulates Dlp distribution, we are testing Dlp in clones of our Tai Crispr null. Since submission, we have also assessed the effect of blocking the endocytic factor shibire/dynamin on Dlp distribution in Tai-deficient cells to complement the data on Pentagone that is already in the paper (see Fig. S3).

      “I don't know if the Fz3-RFP reported for Wg signaling works in imaginal discs, but if it does then the authors could make clones in this background to prove that cell-autonomous Wg signaling is reduced in tai-low clones.”

      We thank the reviewer for this suggestion, which we are now testing.

      Reviewer #2

      Overall Comments:

      “While the authors present good evidence in support of most of their conclusions, there are alternative explanations in many cases that have not been excluded.”

      We appreciate this point and are conducting experiments for a revised submission that will help test alternative mechanisms and clarify our conclusions.

      Specifics:

      “However, the experiments have been done with a single allele, and these experiments do not exclude the possibility that there is another mutation on the same chromosome arm that is responsible for the observed phenotype. Since the authors have a UAS-tai stock, they could strengthen their results using a MARCM experiment where they could test whether the expression of UAS-tai rescues the elimination of tai mutant clones. Alternatively, they could use a second (independent) allele to demonstrate that the phenotype can be attributed to a reduction in tai activity.”

      As noted above, we agree with the need for a second allele and are adding supporting data from a new tai lof allele we have generated by Crispr.

      The tai[k15101] allele acts as a tai hypomorph and has been shown to produce weaker phenotypes than the 61G1 strong lof in a number of papers (Bai et al, 2000; König et al, 2011; Luo et al, 2019; and Zhang et al, 2015). We agree that rescue of tai[k15101] with a UAS-Tai transgene would help rule out effects of second site mutations. We are currently pursuing the reviewer’s second suggestion of phenocopy with a different allele, our new tai Crispr lof.

      “The authors have screened a total of 21 chromosomes for modification and have not really explained which alleles are nulls and which are hypomorphs. The nature of each of the alleles screened needs to be explained better.”

      We will update the text to better reflect what type of alleles were chosen. In most cases we preferred amorphs or null alleles over hypomorphs, however when the amorph option was not available, we used hypomorphs.

      “Also, the absence of a dominant modification does not necessarily exclude a function of that gene or pathway in the process. This is especially relevant for the Spz/Toll pathway which the authors have previously implicated in the ability of tai-overexpressing cells to kill wild-type cells.”

      We thank the reviewer for this completely accurate point. The dominant screen does not rule out effects of other pathways such as Spz/Toll. Indeed, we were surprised by the lack of dominant effects by Spz/Toll alleles on tai[low] competition given our prior work. The reciprocally clear dominant effect of Apc/Apc2 led us to consider that Wg signaling plays a role in this phenomenon, which then became the starting point of this study.

      “The most important discovery from this screen is the modification by the Apc alleles. This part of the paper would be strengthened by testing for modification by other components of the Wingless pathway. The authors show modification by Apc[MI01007] and the double mutant Apc[Q8] Apc2[N175A]. Without showing the Apc[Q8] and Apc2[N175A] alleles separately, it is hard to know if the effect of the double mutant is due to Apc, Apc2, or the combination.”

      We agree that testing for modification with other components of the Wg pathway would be helpful to strengthen the connection between Tai low clonal elimination and Wg pathway biology. We also agree that separating Apc [Q8] and Apc2 [N175A] would be a good idea to check if both Apc proteins are equally important for rescuing Tai low cell death, and future experiments for the lab could investigate this distinction.

      “RNAi of tai seems to block the formation of the Wg gradient. If so, one might expect a reduction in wing size. Indeed, this could explain why the wings of tai/Df flies are smaller. The authors mention briefly that the posterior compartment size is reduced when tai-RNAi is expressed in that compartment. However, this observation merits more emphasis since it could explain why tai/Df flies are smaller (Are their wings smaller?).”

      We agree that this is an exciting possibility. Growth effects of Tai linked to interactions with Yorkie and EcR could be due to a distinct role in promoting Wg activity. Alternatively, Tai may cooperate with Yorkie or EcR to control the Wg pathway. These are exciting possibilities that we are pursuing in future work.

      With regard to the “small size” effect of reducing Tai, we have previously shown that RNAi of Tai using engrailed-Gal4 causes the posterior compartment to shrink (Zhang et al. 2015, Figure 1C-F, H). In that paper, we also showed that tai[k15101]/Df animals are proportionally smaller than wildtype animals and quantified this by measuring 2D wing size (Zhang et al. 2015, Figure 1A and 1B).

      “In Figure 7, the authors show the effect of manipulating Tai levels alone or in combination with increasing Dlp levels. However, they do not include images of Wg protein distribution upon increasing Dlp levels alone.”

      We thank the reviewer for this reminder and have already generated these control images to include in a revised submission.

      “In Figure 8, there is more Wg protein both at the DV boundary and spreading when tai is overexpressed in the source cells using bbg-Gal4. However, in an earlier experiment (Figure 5C) they show that the wg-lacZ reporter is downregulated at the DV boundary when tai is overexpressed using en-Gal4. They therefore conclude that wg is not transcriptionally upregulated but is, instead secreted at higher levels when tai is expressed in the source cells. Wg protein is reduced in the DV stripe with tai is overexpressed using the en-Gal4 driver (Figure 6B') and is increased at the same location when tai is overexpressed with the bbg-Gal4 driver. (Figure 8) I don't know how to reconcile these observations.”

      We thank the reviewer for pressing us to develop an overall model explaining our results and how we envision Tai regulating Dlp and Wg. We are preparing a graphical abstract that illustrates this model, which will be included in our revision.

      Briefly, we favor a model in which Tai controls the rate of Wg spread via Dlp, without a significant effect on wg transcription. For example, the induction of Dlp across the ‘engrailed’ domain of en>Tai discs (Fig 7B-B”) allows Wg to spread rapidly across the flanks and moderately depletes it from the DV margin (Fig 6B-B”) as noted by the reviewer. Adding a UAS-Dlp transgene in the en>Tai background dramatically accelerates Wg spread and causes it to be depleted from the DV margin and build up at the far end of the gradient adjacent to the dorsal and ventral hinge. Significantly, blocking endocytosis of Wg in en>Tai discs with a dominant negative shibire transgene also causes Wg to build up in the same location (new data to be added in a revision), consistent with enhanced spreading. The difference in the bbg-Gal4 experiment is that Tai is only overexpressed in DV margin cells, which constrains and concentrates Wg within this restricted domain; we are in the process of testing whether this effect on Wg is blocked by RNAi of Dlp in bbg>Tai discs.

      “In Figure 9, the tai-low clones have elevated levels of Dlp. How can this be reconciled with the tai-RNAi knockdown shown in Figure 7C' where reducing tai levels causes a strong reduction in Dlp levels?”

      We apologize for not explaining this data well enough. First, the tai[k15101] allele is a weak, viable hypomorph (as shown in our Zhang et al, 2015 paper) whereas the Tai RNAi line is lethal with most drivers (including en-Gal4) and thus a stronger lof. Second, Tai RNAi lowers Dlp levels (Fig 7C) while tai[k15101] causes Dlp to accumulate intracellularly (see Fig. 9A-C). These data indicate that reduced Tai leads to a defect in Dlp intracellular trafficking while its loss reduces Dlp overall levels; these data can be explained by a single role for Tai in Dlp traffic to or from the cell membrane, or two roles, one in trafficking and one in Dlp expression. As noted, we are investigating both possibilities using dlp reporter lines and our new tai null Crispr allele.

      Reviewer #3:

      Overall Weaknesses:

      “The study has relatively weak evidence for the mechanism of cell competition mediated by Dlp and Wg.”

      The screen and middle section of the paper provide genetic evidence that elevating Wg pathway activity rescues Tai[low] loser cells and that Tai controls levels/localization of Dlp and distribution of Wg in the developing wing disc. Our current work is focused on linking these two findings together in Tai “loser” clones.

      “More evidence is required to support the claim that dlp transcription or endocytosis is affected in tai clones.”

      As noted above, we are testing whether normalizing Dlp levels across clonal boundaries rescues tai[low] loser clones and assessing effects of Tai on dlp transcription and Dlp trafficking.

      Specifics:

      “Most of the rest of the study is not in the clonal context, and mainly relies on RNAi KD of tai in the posterior compartment, which is a relatively large group of cells. I understand why the authors chose a different approach to investigate the role of tai in cell competition. However because ubiquitous loss of tai results in smaller organs, it is important to determine to what extent reducing levels of tai in the entire posterior compartment compares with clonal elimination i.e. cell competition. This is important in order to determine to what extent the paradigm of Tai-mediated regulation of Dlp levels and by extension, Wg availability, can be extended as a general mechanism underlying competitive elimination of tai-low clones. If the authors want to make a case for mechanisms involved in the competitive elimination of tai clones, then they need to show that the KD of tai in the posterior compartment shows hallmarks of cell competition. Is there cell death along the A/P boundary? Or is the compartment smaller because those cells are growing slower?”

      Based on data that cell competition does not occur over compartment boundaries (e.g., see review by L.A. Johnston, Science, 2009), we chose not to use UAS-Gal4 to assess competition, but rather to investigate underlying biology occurring between Tai, Wg, and Dlp.

      “Are the levels of Myc/DIAP1, proteins required for fitness, affected in en>tai RNAi cells?”

      This is, of course, an interesting question given that Myc is a well-studied competition factor and is proposed to be downstream of the Tai-interacting protein Yki. We are not currently focused on Myc, but plan to test its role in the Tai-Dlp-Wg pathway in future work.

      “The authors do not have direct/strong evidence of changes in dlp mRNA levels or intracellular trafficking. To back these claims, the authors should look for dlp mRNA levels and provide more evidence for Dlp endocytosis like an antibody uptake assay or at the very least, a higher resolution image analysis showing a change in the number of intracellular Dlp positive punctae. Also, do the authors think that loss of tai increases Dlp endocytosis, making it less available on the cell surface for maintaining adequate extracellular Wg levels?”

      As noted above, we have added experiments using a dominant-negative shibire/dynamin allele to test whether Tai controls Dlp endocytosis. These data will be added to a revised manuscript. We have also gathered reagents to test effects of Tai gain/loss on Dlp secretion.

      “The data shown in the last figure is at odds with the model (I think) the authors are trying to establish: When cells have lower Tai levels, this reduces Dlp levels (S2) presumably either by reducing dlp transcription and/or increasing (?) Dlp endocytosis. This in turn reduces Wg (availability) in cells away from source cells (Figure 6). The reduced Wg availability makes them less fit, targeting them for competitive elimination. But in tai clones, I do not see any change in cell-surface Dlp (9B) (I would have expected them to be down based on the proposed model). The authors also see more total Dlp (9A) (which is at odds with S2 assuming data in S2 were done under permeabilizing conditions.).”

      As noted above (under Rev #2 comments), we apologize for not explaining this data well enough. First, the tai[k15101] allele is a weak, viable hypomorph (as shown in our Zhang et al, 2015 paper) whereas the Tai RNAi line is lethal with most drivers (including en-Gal4) and thus a stronger lof. Second, Tai RNAi lowers Dlp levels (Fig 7C) while tai[k15101] causes Dlp to accumulate intracellularly (see Fig. 9A-C). These data indicate that reduced Tai leads to a defect in Dlp intracellular trafficking while its loss reduces Dlp overall levels; these data can be explained by a single role for Tai in Dlp traffic to or from the cell membrane, or two roles, one in trafficking and one in Dlp expression. We are investigating both possibilities using dlp reporter lines and our new tai null Crispr allele.

      “As a side note, because Dlp is GPI-anchored, the authors should consider the possibility that the 'total' Dlp staining observed in 9A may not be actually total Dlp (and possibly mostly intracellular Dlp, since the permeabilizing membranes with detergent will cause some (most?) Dlp molecules to be lost), and how this might be affecting the interpretation of the data. I think one way to address this would be to process the permeabilized and non-permeabilized samples simultaneously and then image them at the same settings and compare what membrane staining in these two conditions looks like. If membrane staining in the permeabilized condition is decreased compared to non-permeabilized conditions, and the signal intensity of Dlp in permeabilized conditions remains high, then the authors will have evidence to support increased endocytosis in tai clones. Of course, these data will still need to be reconciled with what is shown in S2.”

      We thank the reviewer for this excellent suggestion and are generating mosaic discs to test the proposed approach of synchronous analysis of total vs. intracellular Dlp.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Authors performed a metatranscriptomic analysis from publicly-available datasets of whole blood from 3 places in Indonesia. Their goal was to explore which pathogens were present on the blood of those 117 healthy individuals. It was interesting that reads from Flaviviridae and Plasmodium were detected in asymptomatic subjects.

      Major comments: 1) How did the authors assess and correct batch-effects between different datasets?

      Our response: We have sequencing batch information for the Indonesian dataset and saw no clear clustering by batch in the first 8 PCs. We recognize that sampling variation may exist between islands, but the taxa matrix we obtained from the unmapped reads is so sparse that such variation did not have a strong enough effect to introduce batch effects in our microbiome analyses, and the signals were driven by pathogenic reads. For our comparative analyses between datasets, we made sure that all three datasets shared similar processing (collected using Tempus Blood RNA Tubes and subjected to globin depletion) and trimmed both Indonesian and Malian reads to match the length of the UK reads (75 bp).

      2) Did the RNA-seq capture poly-A mRNAs? If so... these reads that did not map the human genome were captured because of internal priming. Can they find internal poly A sequences in the genome of Flaviviridae and Plasmodium pathogens? I would like to know that to understand the source of the reads and which other pathogens may be missing (due to the lack of internal priming).

      Our response: No, our dataset did not capture poly-A mRNAs. We performed ribosomal RNA (rRNA) and globin mRNA depletion.

      3) Principal coordinates analysis (PCoA) is often utilized in metagenomics analysis. Although they are equivalent, is there a reason for using PCA?

      Our response: Since we used the CLR transformation, the resulting matrix lies in Euclidean space. PCA is just a form of PCoA in Euclidean space.
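
      The equivalence can be checked numerically with a short sketch. The code below is illustrative only and is not part of the study's pipeline: the function names, the toy count matrix and the pseudocount of 0.5 used to avoid log(0) are all assumptions. It shows that the Gram matrix of the column-centred CLR data (whose eigenvectors give the PCA sample scores) is identical to the double-centred squared Euclidean distance matrix that classical PCoA decomposes.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of each sample (row).

    The pseudocount is an assumption made here to handle zero counts.
    """
    out = []
    for row in counts:
        logs = [math.log(c + pseudocount) for c in row]
        m = sum(logs) / len(logs)
        out.append([v - m for v in logs])
    return out

def centre_columns(X):
    """Subtract each feature's mean across samples."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - mu for v, mu in zip(row, means)] for row in X]

def pca_gram(X):
    """Gram matrix Xc Xc^T of the column-centred data (sample space of PCA)."""
    Xc = centre_columns(X)
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in Xc] for r1 in Xc]

def pcoa_gram(X):
    """Double-centred matrix -1/2 * J D^2 J built from squared Euclidean distances."""
    n = len(X)
    D2 = [[sum((a - b) ** 2 for a, b in zip(r1, r2)) for r2 in X] for r1 in X]
    row_mean = [sum(r) / n for r in D2]  # D2 is symmetric: row means equal column means
    total = sum(row_mean) / n
    return [[-0.5 * (D2[i][j] - row_mean[i] - row_mean[j] + total)
             for j in range(n)] for i in range(n)]

# Toy taxa-count matrix (3 samples x 4 taxa), for illustration only.
counts = [[10, 0, 5, 3], [2, 8, 0, 1], [7, 7, 7, 0]]
Z = clr(counts)
A = pca_gram(Z)
B = pcoa_gram(Z)
max_diff = max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
```

      Because the two matrices agree to floating-point precision, their eigendecompositions, and hence the sample ordinations, are the same up to the sign of each axis.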

      Minor comments: 1) "Indonesia is a country with large numbers of endemic and emerging infectious diseases [16], making it a crucially important location to monitor and understand the effects of pathogens on human hosts." Is there any epidemiological data that shows differences in infectious diseases across these 3 places? Can the authors provide a map and better explanation about the importance in comparing these 3 areas?

      Our response: We have added references to malaria infection being more prevalent in the eastern side of Indonesia in the discussion section.

      2) Why is it so hard to try to identify (only for Flaviviridae reads) reads that map to very relevant viruses, such as Zika, Dengue, and Yellow Fever? Why did the authors state that they "were unable to refine this assignment further" if this is one of the most interesting finding?

      Our response: Our reanalysis showed a small percentage of the Flaviviridae reads to be assigned to the Pegivirus genus. As more diverse microbial genomes are added to reference databases and identical regions become more common between them, it becomes harder for the classifier to further define reads to species level (https://link.springer.com/article/10.1186/s13059-018-1554-6). Flaviviridae has distinct species spread across six different genera (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=11050). In comparison, despite Plasmodiidae having more species recorded than Flaviviridae, an overwhelming majority of those species belong to the Plasmodium genus, hence we were able to refine them down to species level (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=1639119).

      3) Is the script available at https://gitlab.unimelb.edu.au/igr-lab/Epi_Study ? This reviewer could not access it.

      Our response: We thank Reviewer 1 for pointing this out and have amended the link, now accessible here: https://gitlab.svi.edu.au/muhamad.fachrul/indo_blood_microbiome

      Reviewer #1 (Significance (Required)):

      Interesting paper that enable to extract additional knowledge from whole blood RNA-seq data. There are already several papers that do this and I think authors could go one step forward (for instance, PCR validation of additional individuals). I don't think this can be used for surveillance if it cannot identify species, it is more expensive than running targeted assays, and that may be many false negative pathogens in the samples.

      Our response: We thank Reviewer 1 for their comments. We have updated our manuscript to reflect our updated analyses, which minimize false-positive taxa, and to present the project’s significance not as a mainline surveillance tool, but as a retrospective one.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Bobowik and colleagues perform a computational analysis of whole blood RNA-seq datasets from healthy individuals of three different regions of Indonesia. Their goal is to identify infecting pathogens and other microbes and correlate their abundances to host gene expression patterns or health characteristics in these populations. They find a broad range of bacterial, viral and microeukaryote taxa. When comparing the three Indonesian populations, they find that the Korowai population is the most diverse and different from the other two, possibly driven by the higher prevalence and abundance of Plasmodium (Apicomplexa) in this population.

      Then, the authors conduct a statistical decomposition of human gene expression in these samples in independent factors using ICA, and correlate each of these factors to the abundances of the microbial taxa detected. This analysis allows researchers to associate specific patterns of gene expression, such as immune-related pathways, to the presence of members of the Apicomplexa and Kitrinoviricota phyla.

      Lastly, the authors use previously published data from other two cohorts (from Mali and the UK) to contextualize their blood microbiome findings. They find microbial reads in all datasets. The Mali cohort is characterized by a large abundance of archaea, not found in the other two populations, while the UK cohort has the lower diversity. Altogether, the authors propose the use of RNA-seq data from human whole blood as a way to study the blood microbiome and establish potential associations between blood resident microbes and host gene expression.

      Major comments:

      1) The methodology to filter and remove reads from potential contaminants needs to be more stringent to ensure the results do not contain spurious contaminants and that the conclusions are correct. It has been described that genomic databases are heavily contaminated with human sequences (Steinegger and Salzberg, 2020), and in this manuscript, even after a two-pass alignment with STAR, reads mapping to helminths also corresponded to the human genome. Additionally, ad-hoc removal of specific taxa (Metazoa and Viridiplantae) was only performed after suspicion of contamination. However, this ad-hoc removal cannot be performed with microbial (bacterial, viral, etc.) contaminants as there is a risk of removing actual bacteria from the samples. But it has been confirmed that many microbial assemblies also suffer from human contamination. Possible actions to take are the following: a. Perform the human mapping with more lenient parameters to avoid human reads to map to other (likely contaminated) genomes in genome databases. b. Remove common contaminants that have been documented, for instance in blood (Chrisman et al., 2022). c. Run a tool to detect contaminated contigs in the database used to map reads to microbes and remove these problematic contigs from further analysis.

      Our response: We thank Reviewer 2 for the suggestions, especially those addressing contaminants. We have reanalyzed our data, which resulted in far fewer taxa while still retaining the main pathogenic findings.

      2) In line with the above, removing singletons (as I have understood these are taxa that are represented by a single read), is a way to minimize the risk of contamination. To take advantage of the functional profiling of RNA-seq, a measure to ensure that microbes found in blood are active would be to include in the analysis only taxa for which expression of more than a few genes is detected. This type of filtering has been previously applied in studies where very low microbial loads are expected (Lloréns-Rico et al., 2021). In this study, it has only been applied to the specific case of the archaeal taxon Methanocaldococcaceae. However, I would expect cleaner results if applied consistently to all taxa detected.

      Our response: We have reanalyzed the data and applied this to all taxa detected.
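
      The filtering suggested in this comment (dropping singletons and keeping only taxa with reads on more than a few genes) can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function name `filter_taxa`, the input layout (one `(taxon, gene)` pair per classified read) and both thresholds are assumptions chosen for the example.

```python
from collections import defaultdict


def filter_taxa(read_hits, min_reads=2, min_genes=3):
    """Keep taxa supported by enough reads spread over enough distinct genes.

    read_hits: iterable of (taxon, gene) pairs, one pair per classified read
               (hypothetical input layout).
    min_reads: taxa with fewer reads are dropped; 2 removes singletons.
    min_genes: taxa detected on fewer distinct genes are dropped
               (assumed threshold, not taken from the manuscript).
    Returns a dict mapping each retained taxon to its read count.
    """
    read_counts = defaultdict(int)
    genes_seen = defaultdict(set)
    for taxon, gene in read_hits:
        read_counts[taxon] += 1
        genes_seen[taxon].add(gene)
    return {
        taxon: count
        for taxon, count in read_counts.items()
        if count >= min_reads and len(genes_seen[taxon]) >= min_genes
    }
```

      With these settings, a taxon with many reads that all fall on a single gene is discarded just like a singleton, which is the behaviour one would want for spike-in or database artefacts.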

      3) The specificity of Methanocaldococcaceae in the samples from Mali is very striking. I am highly suspicious that this only occurs due to a batch effect, even though the authors were highly selective in their cohorts to avoid these. In fact, I extracted the genes spanning the regions highlighted in Supplementary Figure 9 of the Methanocaldococcus jannaschii genome. A BLAST search of these sequences returned, among Methanocaldococcus hits, hits from the ERCC synthetic spike-in sequences, used as internal controls in many RNA-seq experiments. ERCC synthetic spike-in hits appeared for all 4 regions in the genome of M. jannaschii highlighted in this figure. In the original publications of this dataset, there is no reference to the use of these ERCC controls, but given the observed matches, I suggest that the authors perform an extra step in their filtering pipeline to remove all reads mapping to these ERCC standards in all three of their cohorts to prevent this sort of batch effect.

      Our response: We thank Reviewer 2 for pointing this out. Our reanalysis, which now used proper 2-pass mapping and further downstream classification with both pairs of the reads, no longer detected any archaea.

      4) I am puzzled by the inconsistencies shown between forward and reverse reads when mapping paired-end data. I expect these inconsistencies at lower taxonomic ranks (species or genus level) due to incomplete genomes, but not at higher taxonomic ranks. I wonder if, by performing more stringent filtering of contaminants as suggested above, the consistency between forward and reverse reads increases and both mates can be used, making the mapping more reliable.

      Our response: We have reanalyzed the data using both pairs of the reads for classification, resulting in fewer detected taxa. We believe the new results are more robust as they no longer include taxa that are not typically found in humans (such as the archaeon Methanocaldococcus and other environmental bacteria).

      In summary, my main concerns regarding this manuscript involve the possibility that contaminants in the sequencing data may be the cause of some of the results presented, and I tried to propose ways of dealing with these contaminants. While some of the results may not be affected by detection of contaminants (i.e. the association between Apicomplexa and some ICs), others such as the diversity measures or the comparison across cohorts may be severely affected. I will consider these results highly preliminary until a more thorough and stringent approach for contaminant removal is applied.

      Our response: We thank Reviewer 2 for the suggestions and have updated our manuscript with results from updated analyses that are more stringent towards contaminants, as can be seen from our updated findings.

      Minor comments:

      1) I would appreciate some of the analyses done at lower taxonomic levels if the sparsity of the data allows it, after removing contaminants. Given that the CLR transformation does not allow for zeros, other alternatives such as GMPR (Chen et al., 2018) or adding a pseudocount would allow these analyses?

      Our response: After our reanalysis, we ended up with even sparser data and therefore could not perform the analyses at lower taxonomic levels.
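
      For readers unfamiliar with the pseudocount option the reviewer mentions, the sketch below shows a centred log-ratio (CLR) transformation that tolerates zero counts. It is illustrative only and does not reproduce the manuscript's analysis: the function name `clr` and the default pseudocount of 1 are assumptions.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centred log-ratio transform of one sample's taxon counts.

    A pseudocount is added to every count so that zeros do not produce
    log(0). The default of 1.0 is an assumed, commonly used choice.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]
```

      CLR values always sum to zero within a sample. The choice of pseudocount matters most when the data are sparse, which is one reason a very sparse table may not support analyses at lower taxonomic ranks.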

      2) In the PCA shown in figure 1, does the number of microbial reads detected correlate with any of the first two components?

      Our response: Yes, Plasmodiidae correlates well with PCs 1 and 2 (0.66 and 0.73) and Flaviviridae correlates very strongly with PC1 (0.917). We have added this detail in the results section.

      3) In Figure 1C, the x axis is wrongly named PC2.

      Our response: We thank Reviewer 2 for pointing this out and have amended this detail.

      4) There is a typo in the legend of Figure 1A ("showeing")

      Our response: We thank Reviewer 2 for pointing this out and have amended this detail.

      5) In the alpha diversity estimates comparison across the three different cohorts, after subsampling each population to achieve similar sample size in each cohort, it is stated that "after subsampling, each population had similar diversity estimates". However, the numbers shown afterwards corresponding to the mean values of alpha diversity, without confidence intervals or a boxplot/violin plot together with an accompanying statistical test, are not enough to assess similarity. I would appreciate a figure (similar to Figure 3E and F) or a test accompanying these mean values.

      Our response: We thank Reviewer 2 for pointing this out and have amended this detail.

      6) In the volcano plots (Figure 3A, B and others throughout the manuscript) it would help the reader to add lines for the thresholds chosen for the effect size and -log10(p-value) to separate significant results.

      Our response: We thank Reviewer 2 for pointing this out and have amended this detail.

      7) In Figure 3E and F, I would appreciate having bars for the statistically significant comparisons.

      Our response: We thank Reviewer 2 for pointing this out and have amended this detail.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The manuscript constitutes an important contribution to antimalarial drug discovery, employing diverse systems biology methodologies; with a focus on an improved M1 metalloprotease inhibitor, the study provides convincing evidence of the utility of chemoproteomics in elucidating the preferential targeting of PfA-M1. Additionally, metabolomic analysis effectively documents specific alterations in the final steps of hemoglobin breakdown. These findings underscore the potential of the developed methodology, not only in understanding PfA-M1 targeting but also in its broader applicability to diverse malarial proteins or pathways. Revisions are needed to further enhance overall clarity and detail the scope of these implications.

      We thank the editor and reviewers for recognising the contribution our work makes to understanding the selective targeting of aminopeptidase inhibitors in malaria parasites and the wider impact this multi-omic strategy can have for anti-parasitic drug discovery efforts. The reviewers have provided constructive feedback and raised important points that we have taken on-board to improve our manuscript. In particular, we have revised aspects of the text and figures to enhance clarity, performed additional analysis on the other possible MIPS2673 interacting proteins and more comprehensively analysed the effect of MIPS2673 on parasite morphology. NB: Specific responses to comments in the public reviews are provided within responses to the specific recommendations to authors.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The article "Chemoproteomics validates selective targeting of Plasmodium M1 alanyl aminopeptidase as a cross-species strategy to treat malaria" presents a series of biochemical methods based on proteomics and metabolomics, as a means to:

      (1) validate the specific targeting of biologically active molecules (MIPS2673) towards a defined (unique) protein target within a parasite and (2) explore whether, by quantifying the perturbations generated at the level of the parasite metabolome, it is possible to extrapolate which metabolic pathway has been disrupted by using this biologically active molecule and whether this may further confirm selective targeting in parasites of the expected (or in-vitro targeted) enzyme (here PfA-M1).

      The inhibitor used in this work by the authors (MIPS2673) is to my knowledge a novel one, although belonging to a chemical series previously explored by the authors, which recently enabled them to discover a specific PfA-M17 inhibitor, MIPS2571 (Edgard et al., 2022, ref 11 of this current work). Indeed, inhibitors specifically targeting either PfA-M1 or PfA-M17 (and not both, as has mostly been done in the past) are scarce today, and highly needed to functionally characterize these two zinc-aminopeptidases. MIPS2673 blocks the development of erythrocytic stages of Plasmodium falciparum with an EC50 of 324 nM, blocks the parasite development at the young trophozoite stage at 5x EC50 (but at ring stages at 10x EC50, Figure 1E), and inhibits the enzymatic activity of PfA-M1 (and its ortholog Pv-M1) but not of the related malarial metallo-aminopeptidases (M17 and M18 families) nor the human metalloenzymes from closely related enzymatic families, supporting its selective targeting of PfA-M1 (and Pv-M1).

      All experiments are carried out in vitro (e.g. biochemical studies such as enzymology, proteomics, metabolomics) and on cultured parasites (erythrocyte stages of Plasmodium falciparum and several gametocytes stages obtained in vitro); there are no in vivo manipulations. The work related to Plasmodium vivax, which justifies the "cross-species" indication in the title of the article, is restricted to using a recombinant form of the M1-family aminopeptidase in enzymatic assays. The rest of the work concerns only Plasmodium falciparum. While I found globally that this work is original and brings new data and above all proposes chemical validation approaches that could be used for other target validations under similar limiting conditions (impossibility of KO of the gene), I have some specific questions to address to the authors.

      Strengths and weaknesses:

      - The chemoproteomic approach, that explores the ability of MIPS2673 to more significantly "protect" the putative target (PfA-M1) against thermal degradation or enzymatic attack (by proteinase K), to document its selective targeting towards PfA-M1 (the inhibitor, once associated with its target, is expected to stabilize its structure or prevent the action of end proteases), uses several concentrations of MIPS2673 and provides convincing results. My main criticism is that these tests are carried out with parasite extracts enriched in 30-38 hours old forms, and restricted to the fraction of soluble proteins isolated from these parasitic forms, which still limits the scope of the analysis. It is clear that this methodological approach is a choice that can be argued both biologically (PfA-M1 is well expressed in these stages of the parasite development) and biochemically (it is difficult to do proteomic analyses on insoluble proteins) but I regret that the authors do not discuss these limitations further, notably, I would have expected (from Figure 1E) some targets to be also present at ring stages.

      - The metabolomic approach, by documenting the ability of MIPS2673 to selectively increase the number of non-hydrolyzed dipeptides in treated versus untreated parasites is another argument in favor of the selective targeting of PfA-M1 by MIPS2673, in particular by its broad-spectrum aminopeptidase action preferentially targeting peptides resulting from the degradation of hemoglobin by the parasite. The relative contribution of peptides derived from host hemoglobin versus other parasite proteins is, however, little discussed.

      The work as a whole remains highly interesting, both for the specific topic of PfA-M1's role in parasite biology and for the method, applicable to other malarial drug contexts.

      Reviewer #2 (Public Review):

      In this manuscript, the authors first developed a new small-molecule inhibitor that could target specifically the M1 metalloproteases of both important malaria parasite species Plasmodium falciparum and P. vivax. This was done by a chemical modification of a previously developed molecule that targets PfM1 as well as PfM17 and possibly other Plasmodial metalloproteases. After the successful chemical synthesis, the authors showed that the derived inhibitor, named MIPS2673, has a strong antiparasitic activity with IC50 324 nM and it is highly specific for M1. With this in mind, the authors first carried out two large-scale proteomics experiments to confirm the MIPS2673 interaction with PfM1 in the context of the total P. falciparum protein lysate. This was done first by using thermal shift profiling and subsequently limited proteolysis. While the first demonstrated overall interaction, the latter (limited proteolysis) could map more specifically the site of MIPS2673-PfM1 interaction, presumably the active site. Subsequent metabolomics analysis showed that MIPS2673 cytotoxic inhibitory effect leads to the accumulation of short peptides many of which originate from hemoglobin. Based on that the authors argue that the MIPS2673 mode of action (MOA) involves inhibition of hemoglobin digestion that in turn inhibits the parasite growth and development.

      Reviewer #3 (Public Review):

      This is a manuscript that attempts to validate Plasmodium M1 alanyl aminopeptidase as a target for antimalarial drug development. The authors provide evidence that MIPS2673 inhibits recombinant enzymes from both Pf and Pv and is selective over other proteases. There is in vitro antimalarial activity. Chemoproteomic experiments demonstrate selective targeting of the PfA-M1 protease.

      This is a continuation of previous work focused on designing inhibitors for aminopeptidases by a subset of these authors. Medicinal chemistry explorations resulted in the synthesis of MIPS2673 which has improved properties including potent inhibition of PfA-M1 and PvA-M1 with selectivity over a closely related peptidase. The compound also demonstrated selectivity over several human aminopeptidases and was not toxic to HEK293 cells at 40 µM. The activity against P. falciparum blood-stage parasites was about 300 nM.

      Thermal stability studies confirmed that PfA-M1 was a binding target, however, there were other proteins consistently identified in the thermal stability studies. This raises the question as to their potential role as additional targets of this inhibitor. The authors dismiss these because they are not metalloproteases, but further analysis is warranted. This is particularly important as the authors were not able to generate mutants using in vitro evolution of resistance strategies. This often indicates that the inhibitor has more than one target.

      The next set of experiments focused on a limited proteolysis approach. Again several proteins were identified as interacting with MIPS2673 including metalloproteases. The authors go on to analyze the LiP-MS data to identify the peptide from PfA-M1 which putatively interacts with MIPS2673. The authors are clearly focused on PfA-M1 as the target, but a further analysis of the other proteins identified by this method would be warranted and would provide evidence to either support or refute the authors' conclusions.

      The final set of experiments was an untargeted metabolomics analysis. They identified 97 peptides as significantly dysregulated after MIPS2673 treatment of infected cells and most of these peptides were derived from one of the hemoglobin chains. The accumulation of peptides was consistent with a block in hemoglobin digestion. This experiment does reveal a potential functional confirmation, but questions remain as to specificity.

      Overall, this is an interesting series of experiments that have identified a putative inhibitor of PfA-M1 and PvA-M1. The work would be significantly strengthened by structure-aided analysis. It is unclear why putative binding sites cannot be analyzed via specific mutagenesis of the recombinant enzyme.

      In the thermal stability and LiP-MS analysis, other proteins were consistently identified in addition to PfA-M1 and yet no additional analysis was undertaken to explore these as potential targets.

      The metabolomics experiments were potentially interesting, but without significant additional work including different lengths of treatment and different stages of the parasite, the conclusions drawn are overstated. Many treatments disrupt hemoglobin digestion - either directly or indirectly and from the data presented here it is premature to conclude that treatment with MIPS2673 directly inhibits hemoglobin digestion.

      Finally, the potency of this compound on parasites grown in vitro is 300 nM - this would need improvements in potency and demonstration of in vivo efficacy in the SCID mouse model to consider this a candidate for a drug.

      Summary:

      Overall, this is an interesting series of experiments that have identified a putative inhibitor of the Plasmodium M1 alanyl aminopeptidases, PfA-M1 and PvA-M1.

      Strengths:

      The main strengths include the synthesis of MIPS2673 which is selectively active against the enzymes and in whole-cell assay.

      Weaknesses:

      The weaknesses include the lack of additional analysis of additional targets identified in the chemoproteomic approaches.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Question 1. Line 737 (and elsewhere). Why are Plasmodium vivax orthologs of PfA-M1 and PfA-M17 called Pv-M1 and Pv-M17 and not PvA-M1 and PvA-M17, where A stands for Aminopeptidase? I would recommend changing the names if possible, although the mention of Pv-M1 and Pv-M17 is now current in the literature (which is kind of regrettable). See also Supplemental Table S1 where PfA-M1 is named Pf-M1.

      Supplemental Table S1 was updated to PfA-M1. Nomenclature for the Plasmodium vivax aminopeptidase orthologs was amended to PvA-M1 and PvA-M17 as suggested by the reviewer.

      Question 2. Figure 1. Observation of parasite culture slide smears in Figure 1E strongly suggests that an important target of MIPS2673 appears to be expressed at the ring stage or very young trophozoites, whereas the authors, in their proteomic and metabolomic analyses, performed studies focused on late trophozoites stages (30-38h post-invasion). This difference in the targeting of Plasmodium stages puzzles me and deserves some explanations from the authors, and is related to my question 3.

      As the reviewer indicates, ring-stage parasite growth appears to be affected at high concentrations (5x and 10x EC50) of MIPS2673. Under these conditions, parasite growth appears to stall during late rings/early trophs at ~16-22 h post invasion when haemoglobin digestion is increasing and when one presumes PfA-M1 (the primary target of MIPS2673) is increasing in both expression and activity (see references 26 and 28 of this manuscript). Thus, whilst it is unsurprising that MIPS2673 has some activity against ring-stage parasites, we focused on the trophozoite stage for our proteomics studies as we showed this to be the stage most susceptible to MIPS2673 (Fig. 1D) and reasoned that we would most likely identify the primary MIPS2673 target, and other interacting proteins, from a complex biological mixture at this stage. The same reasoning underpinned our decision to perform metabolomics on drug-treated trophozoites, as we reasoned we would see a greater functional effect on this stage. Furthermore, performing these experiments on trophozoites rather than rings minimises the interference from the host red blood cell. While we cannot rule out additional targets in rings, repeating all experiments during this parasite stage is beyond the scope of this study.

      Question 3. Figure 2. Although Figure 2 is insightful and somehow self-explanatory, I think it misses two specific pieces of information. First, it is indicated in line 618 (M&M) that parasite material for thermal stability and limited proteolysis studies correspond to synchronized parasites (30-38h post-invasion) but this information is not given in Figure 2. In addition, if I fully understand the experimental protocol of obtaining parasite extracts, they strictly correspond to the soluble protein fraction of the erythrocytic stages of plasmodium at the late trophozoite stage, and not to all parasitic proteins as the scheme of Figure 2 might suggest. I would appreciate it very much if these two points (parasite stages and soluble proteins) were clearly indicated in the scheme as indeed, not the whole parasite blood stage proteome is investigated in the study but just a part of it (~47%, as the authors indeed indicate line 406). Please, edit also the legend of the figure accordingly.

      This is correct, the soluble protein fraction from synchronised trophozoites was used in our proteomics studies. These details have been included in an updated Figure 2 and in the corresponding figure legend.

      Question 4. Thermal stabilization. Figure 3B. Could the authors explain how they calculated or measured "absolute" protein abundances, and how this refers to a number of parasites in initial assays as this is not clear to me. Notably, abundance for PfA-M1 is much higher than for PF3D7_0604300, which are interesting "absolute" values.

      Protein abundance was calculated using the mean peptide quantity of the stripped peptide sequence, with only precursors passing the Q-value threshold (0.01) considered for relative quantification. Within independent experiments, normalisation was based on total protein amount (determined by the BCA assay) rather than the initial number of parasites.

      PfA-M1 is known to be a highly abundant protein and PF3D7_0604300 (as well as the other protein hits identified by thermal stability proteomics) are likely less abundant. It is noted that abundance is also dependent on ionisation efficiency and trypsin digestion efficiency. Therefore, we avoid comparing absolute abundances across proteins and use relative differences across conditions instead.

      NB: the word “absolute” in the text (“absolute fold-change”) refers to the absolute value of the fold-change (i.e. its magnitude, irrespective of whether the change is positive or negative), and not to absolute quantification of proteins. The preceding text in each case clarifies that these are based on “relative peptide abundance”.
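
      The quantification described in this response can be illustrated with the sketch below. It is one possible reading, not the authors' code: it assumes precursor quantities are first averaged per stripped peptide sequence and then averaged across peptides, that a precursor passes when its Q-value is at or below 0.01, and that fold-change is expressed on a log2 scale. The function names and input layout are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def protein_abundance(precursors, q_threshold=0.01):
    """Mean peptide quantity for one protein.

    precursors: list of (stripped_sequence, quantity, q_value) tuples
                (hypothetical layout). Precursors above the Q-value
                threshold are ignored. Returns None if nothing passes.
    """
    per_peptide = defaultdict(list)
    for sequence, quantity, q_value in precursors:
        if q_value <= q_threshold:
            per_peptide[sequence].append(quantity)
    if not per_peptide:
        return None
    # Assumed aggregation: average within each peptide, then across peptides.
    return mean(mean(values) for values in per_peptide.values())


def abs_log2_fold_change(treated, control):
    """Magnitude of the log2 fold-change, ignoring its direction."""
    return abs(math.log2(treated / control))
```

      Under this definition a two-fold increase and a two-fold decrease both give an absolute log2 fold-change of 1, which is the sense of “absolute” clarified above.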

      Question 5. Figure 5A. How do the authors explain peptides whose abundances are decreasing instead of increasing? Figure 5C. Could the authors provide digital cues (aa numbers or positions) on the ribbon representation of the PfA-M1 sequence? It is difficult to correlate the position of the 3D domains with respect to the primary structure of the protein. Also, the "yellow" supposed to show the "drug ligand" is really not very visible.

      LiP-MS is based on the principle that ligand binding alters the local proteolytic susceptibility of a protein to a non-specific protease (in this case proteinase K, PK). In this sense, in LiP-MS we are not looking at variations in the stability of whole proteins (as is the case with thermal stability proteomics, where proteins detected with significantly higher abundance in treated relative to control samples reflects thermal stabilisation of the target due to ligand binding), but differences in peptide patterns between treated and control samples that reflect a change in the ability of PK to cleave the target. Thus, in the bound state, the ligand prevents proteolysis with PK. This results in decreased abundance of peptides with non-tryptic ends (as PK cannot access the region around where the ligand is bound) and increased abundance of the corresponding fully tryptic peptide, when compared to the free target. This concept is demonstrated in Fig. 4A and is explained in the text (lines 279-282) and Fig. 4 figure legend.

      To aid visualisation, we have not added amino acid positions on the PfA-M1 sequence in Fig. 5, but have provided amino acid positions for all peptides in Supplementary File 3. We have also changed the colour of the ligand in Fig. 5C to blue and increased transparency of the binding and centre of mass neighbourhoods.
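
      The distinction this response draws between fully tryptic peptides and peptides with non-tryptic ends can be made concrete with a small sketch. It uses a simplified rule (trypsin cleaves after K or R, with no proline exception) and a made-up sequence; the function name and coordinate convention are assumptions for illustration.

```python
def classify_peptide(protein, start, end):
    """Classify a peptide by how many of its ends match trypsin cleavage.

    protein:    full protein sequence (one-letter codes).
    start, end: 0-based, half-open coordinates of the peptide in protein.
    Simplified rule: an end is tryptic if it follows K or R, or if it
    coincides with the protein terminus. The proline exception is ignored.
    """
    n_term_ok = start == 0 or protein[start - 1] in "KR"
    c_term_ok = end == len(protein) or protein[end - 1] in "KR"
    tryptic_ends = int(n_term_ok) + int(c_term_ok)
    return {2: "fully tryptic", 1: "semi-tryptic", 0: "non-tryptic"}[tryptic_ends]
```

      In a LiP-MS comparison, ligand binding that shields a region from proteinase K would be expected to lower the abundance of semi-tryptic peptides from that region and raise the abundance of the corresponding fully tryptic peptide, as described in the response.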

      Question 6. Gametocyte assays. Line 824 states that several compounds were used as positive controls for anti-gametocyte activity (chloroquine, artesunate, pyronaridine, pyrimethamine, dihydroartemisinin, and methylene blue) and line 821 states that the biological effects are measured against puromycin. This is not very clear to me, could the authors comment on this?

      This wording has been clarified in the methods to reflect that 5 µM puromycin was used as the positive control to calculate percent viability, whereas the other antimalarials were run in parallel as reference compounds with known anti-gametocyte activity (line 862).

      Question 7. Metabolomics. Metabolomic assays were done on parasites at 28h pi, incubated for 1h with 3x EC50 of MIPS2673. You mention applying the drug on 2x10E8 infected red blood cells (line 838) but you do not explain how you isolate these infected red blood cells from non-infected red blood cells. Could you please specify this?

      Metabolomics studies were performed such that cultures at 2% haematocrit and 6% trophozoite-stage parasitaemia (representing 2 x 10^8 cells in total, rather than 2 x 10^8 infected cells) were treated with compound or vehicle and after 1 h metabolites were extracted. This methodological detail has been clarified in the methods (line 875).
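
      As a worked check of the distinction between total and infected cells, the arithmetic implied by the stated figures (2 x 10^8 cells in total at 6% parasitaemia) can be written out as follows. The helper name is hypothetical and the calculation assumes parasitaemia is the percentage of red blood cells that are infected.

```python
def infected_cell_count(total_cells, parasitaemia_percent):
    """Number of infected cells given total cells and percent parasitaemia.

    Integer arithmetic is used to avoid floating-point rounding.
    """
    return total_cells * parasitaemia_percent // 100


# 2 x 10^8 total cells at 6% parasitaemia
infected = infected_cell_count(200_000_000, 6)
```

      Under that assumption, roughly 1.2 x 10^7 of the 2 x 10^8 treated cells are infected.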

      Question 8. Figure 3B. Does this diagram come from the experimental 3D structure created by the authors (8SLO) or from molecular modeling? Please specify in the legend (line 1305).

      The diagram showing the binding mode of MIPS2673 bound to PfA-M1 comes from the experimentally determined 3D structure (PDB ID: 8SLO). This has now been stated in the figure legend. Note that the structural diagram refers to Fig. 1B (not Fig. 3B as indicated by the reviewer). The experimentally determined PfA-M1 structure with MIPS2673 bound (PDB ID: 8SLO) was also used to map LiP peptides and estimate the MIPS2673 binding site in Fig. 5, which is also now reflected in the appropriate section of the text (line 308) and Fig. 5 legend.

      Question 9. Line 745. Why not indicate the µM concentration for this H-Leu-NHMec substrate while it is indicated for the other substrates mentioned in the rest of the paragraph (H-Ala-NHMec, 20 μM, etc.)? Also, in this section (Enzyme assays), the pH at which the various enzymatic assays were done is missing.

      All enzyme assays were performed at pH 8.0. The concentration of H-Leu-NHMec varied depending on the enzyme assayed, as follows: 20 µM for PfA-M1, 40 µM for PvA-M1 and 100 µM for ERAP1 and ERAP2. This information is now clearly stated in the methods section (lines 782 and 787) and as a footnote for Supplemental Table S1.

      Question 10. Line 830, please define FBS.

      Fetal bovine serum (FBS) has been added where appropriate (line 867).

      Question 11. The authors mention in the title the targeting of several plasmodium species, but the only experimental study on the Plasmodium vivax species concerns the use of the recombinant enzyme Pv-M1. Authors also mention "multi-stage targets", but ultimately only look at erythrocyte stages and three different gametocyte stages.

      We have now removed the words “cross-species” and “multi-stage” from the manuscript title and abstract so as not to overstate these findings. We have also added the word “potential” in the manuscript text to clarify that selective M1 inhibition could offer a potential multistage and cross species strategy for malaria.

      Question 12. Supplemental Table S1. I would suggest replacing "Percent inhibition by MIPS2673 of PfA-M1 and Pv-M1 aminopeptidases compared to selected human M1 homologues" with "Percent inhibition by MIPS2673 of PfA-M1 and Pv-M1 aminopeptidase activities compared to selected human M1 homologues".

      Done.

      Question 13. Supplemental Table S3. Here you indicate IC50 while in text and Figure 1 you quote EC50. Why this difference?

      This has now been changed to EC50 in Supplemental Table S3.

      Reviewer #2 (Recommendations For The Authors):

      Amendments that I would recommend in order to improve the presentation include all four parts of the study:

      (1) In vitro antiparasitic activity of MIPS2673.

      The authors showed that MIPS2673 inhibits parasite growth with IC50 of 324nM measured by a standard drug sensitivity assay, Fig 1C. This is all well and good, but it would be helpful to include at least one if not more other compounds such as antimalaria drugs and/or their earlier inhibitors (e.g. inhibitor 1) for comparisons. This is typically done to show that the assay in this manuscript is fully compatible with previous studies. It will also give a better view of how the selective inhibition of PfM1 kills the parasite, specifically.

      Alongside MIPS2673, we also analysed the potency of the known antimalarial artesunate, which was found to have an EC50 of 4 nM. This value agrees with the expected potency of artesunate and indicates our MIPS2673 value of 324 nM is indeed compatible with previous studies. We have now reported the artesunate EC50 value for reference (lines 197-198 and Fig. S1).

      Next, the authors proceeded to investigate the stage-specific effect of MIPS2673 but this time doing a survival assay instead of proper IC50 estimations (Figure 1). I wonder why? Drug survival assays have typically very limited information content and measuring proper IC50 in stage-specific wash-off assays would be much more informative.

      We performed single concentration stage specificity assays to determine the parasite asexual stage at which MIPS2673 is most active. This involved washing off the compound after a 24 h exposure in rings or trophozoites and determining parasite viability in the next asexual lifecycle. While a full dose response curve would allow generation of an EC50 value against the respective parasite stages, this information is unlikely to change the interpretation that MIPS2673 is more active against trophozoite stages than against rings.

      Finally, in Figure 1E, the authors present the fact that the MIPS2673 arrests the parasite development. This is done by presenting a single (presumably representative) cell per time point. This is in my view highly insufficient. I recommend this figure be supplemented by parasite stage counts or other more comprehensive data representation. Also, the authors mention that while there is a growth arrest, hemoglobin is still being made. From the cell images, I can not see anything that supports this statement.

      We thank the reviewer for this constructive comment; they are correct in their assessment that these are representative parasite images at the respective time points. To address the reviewer's concerns we have now provided cell counts from each treatment condition (Fig. 1E) at selected time points, which show parasite stalling at the ring to trophozoite transition under drug treatment. On reflection, we agree that it is difficult to determine the presence of haemozoin from our images and have removed this statement.

      (2) Protein thermal shift profiling. In the next step, the authors proceed to carry out cellular thermal shift profiling to show that PfM1 indeed interacts with MIPS2673, this time in the context of the total protein lysates from P. falciparum. This section of the study is in my view quite solid and indeed it is nice to see that the inhibitor causes a thermal shift of PfM1 which further supports what was already expected: interaction.

      I have no problem with this study in terms of the technical outcome but I would urge the authors to tone down the interpretation of these results in two ways.

      Four other proteins were found to be shifted by the inhibitor which also indicates interactions. Calling it simply "off-target" interactions might not represent the truth. The authors should explore and in some way comment that interactions with these proteins could contribute to the MIPS2673 MOA. I do not suggest conducting any more studies but simply acknowledge this situation. Identifying more than one target is indeed very common in CETSA studies and it would be helpful to acknowledge this here as well.

      We agree that identifying binding proteins in addition to the “expected” target is commonplace, and is indeed one of the benefits of this unbiased and proteome-wide approach. In the results and discussion, we have now amended our language to refer to these additional hits as MIPS2673-interacting proteins. In our original manuscript we dedicate a paragraph in the discussion to these additional interacting proteins and the likelihood of them being targets that contribute to antimalarial activity. Of these four additional interacting proteins, only the putative AP2 domain transcription factor (PF3D7_1239200) is predicted to be essential for blood stage growth and is therefore the only protein from this additional four that would likely contribute to antimalarial activity. These points are explicitly stated in the discussion (lines 530-550). Notably, all of the other interacting proteins identified in our thermal stability dataset were detected in our LiP-MS experiment but were not identified as interacting proteins by this method. The remaining three proteins were two non-essential P. falciparum proteins with unknown functions (PF3D7_1026000 and PF3D7_0604300) that are poorly described in the literature and a human protein (RAB39A). Further analysis of these other thermal stability proteomics hits in our LiP-MS dataset (see responses to Reviewer #3) identified none or only 1 significant LiP peptide from these proteins across our LiP-MS datasets, indicating they are likely to be false positive hits. Caveats around identifying protein targets by different deconvolution methods are also now addressed (lines 545-550).

      At some point, the authors argue that causing shifts of only four/five proteins including PfM1 shows that MIPS2673 does not interact with other (off) targets. Here one must be careful not to present the lack of shifts in the CETSA as proof of no interaction. There are many reasons why thermal shifts are not observed, including the physical properties of the individual proteins, detection limit etc. Again I suggest adjusting these statements accordingly.

      We thank the reviewer for raising this important point and have now included additional discussion around this comment (lines 545-550).

      Finally, I am not convinced that Figure 2 presents anything more than the overall experimental scheme, with not much new information. Many such schemes were published previously in the original publication of thermal profiling. I would suggest omitting it from the main text and shifting it into supplementary methods etc.

      We agree that similar schemes have been published previously, especially for thermal proteome profiling, and acknowledge the reviewer’s suggestion of moving this figure to the supplemental material. However, we have kept Fig. 2 in the main text as this scheme also incorporates a LiP-MS workflow for malaria drug target deconvolution (the first to do so) and also to satisfy the additional details requested for this figure by Reviewer #1 (question 3).

      (3) Identification of MIPS2673 target proteins using LiP-MS. In the next step, the authors carried out the limited proteolysis analysis with the rationale that protein peptides that are near the inhibitor binding site will exhibit higher resilience to proteolysis. The authors did a very good job of showing this for the PfM1-MIPS2673 interaction. This part is very impressive from a technological perspective, and I congratulate the authors on such an achievement. I imagine these types of studies require very precise optimizations and performance.

      Here, however, I struggle with the meaning of this experiment for the overall flow of the manuscript. It seems that the binding pocket of MIPS2673 is less known since the inhibitor was designed for it. In fact, the authors mentioned that the crystal structure of PfM1 is available. From this perspective, the LiP-MS study represents more of a technical proof of concept for future drug target analysis but has limited contribution to the already quite well-established PfM1-MIPS2673 interaction. Perhaps this could be presented in this way in the text.

      We thank the reviewer for this comment and they are correct that we solved the crystal structure of PfA-M1 bound to MIPS2673. We wish to highlight that the primary reason for performing the LiP-MS study was as an independent and complementary target deconvolution method to narrow down the shortlist of targets identified with thermal stability proteomics, and validate with high confidence that PfA-M1 is indeed the primary target of MIPS2673 in parasites. The use of a complementary approach based on a different biophysical principle (proteolytic susceptibility vs thermal stability) would also allow us to identify MIPS2673 interacting proteins that may not be detectable by thermal stability proteomics, for example targets that do not alter their thermal stability upon ligand binding. The text in the results and discussion has been amended to clarify these points (lines 266-268 and 545-550).

      Furthermore, we agree that correctly predicting the MIPS2673 binding site on PfA-M1 using our LiP-MS peptide data is a technical proof of concept. Indeed, we wished to highlight the potential utility of LiP-MS for identifying both the protein targets of drugs and predicting their binding site, which is not possible with many other target deconvolution approaches. This point has been updated in the text (lines 303-304, 459-461).

      (4) Metabolomic profiling of MIPS2673 inhibition showed a massive accumulation of short peptides which clearly indicates that this inhibitor blocks some proteolytic activity of short peptides, presumably products of upstream proteolytic activities. Here the authors argue, that because many of these detected short (di-/tri-) peptides could be mapped on the hemoglobin protein sequence, this must be their origin. Although this might be the case the author could not exclude the fact that at least some of these come from other sources (e.g. Plasmodium proteins). It would be quite helpful to comment on such a possibility as well. In particular, it was mentioned that the main subcellular localization of PfM1 is in the cytoplasm while most if not all hemoglobin digestion occurs in the digestive vacuole...?

      Indeed, we agree that PfA-M1 is likely processing both Hb and non-Hb peptides and do not definitively conclude that all dysregulated peptides must be derived from haemoglobin. A subset of dysregulated peptides cannot be mapped to haemoglobin and must have an alternative source such as other host proteins or turnover of parasite proteins. We have amended the discussion to better reflect these possible alternate peptide sources (lines 480-482). Although the peptides detected in the metabolomics study (2-5 amino acids) are too short to be definitively assigned to any specific parasite or RBC protein, it is important to note that our analysis strongly indicates that the majority, but not all, of dysregulated peptides are more likely to originate from haemoglobin than other human or parasite proteins. This is based on sequence mapping, which was aided by acquiring MS/MS data for a subset of dysregulated peptides from which we derive accurate sequences (as opposed to residue composition inferred from total peptide mass) to more directly link dysregulated peptides to haemoglobin. We further quantified the sequence similarity of dysregulated peptides to all detectable proteins in the P. falciparum infected erythrocyte proteome (~4700 proteins), showing that these peptides are statistically more similar to haemoglobin than other host or parasite proteins.

      The apparent disconnect between PfA-M1 localisation (cytosol) and the predominant site of haemoglobin digestion (digestive vacuole, DV) is explained by the fact that peptides originating from digestion of haemoglobin in the DV are required to be transported into the cytoplasm for further cleavage by peptidases, including PfA-M1. This point has now been clarified in the discussion (lines 473-474).
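      The distinction drawn above between peptides with an MS/MS-derived sequence and peptides known only by residue composition can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and the toy protein strings are hypothetical placeholders, and a real analysis would load the haemoglobin chains and the infected erythrocyte proteome from a sequence database.

```python
from collections import Counter

def maps_by_sequence(peptide, protein):
    """True if the exact peptide sequence (e.g. from MS/MS) occurs in the protein."""
    return peptide in protein

def maps_by_composition(residues, protein):
    """True if any window of the protein has the same residue composition.

    This mimics a peptide whose composition is inferred from total mass only,
    so the order of residues is unknown.
    """
    k = len(residues)
    target = Counter(residues)
    return any(
        Counter(protein[i:i + k]) == target
        for i in range(len(protein) - k + 1)
    )

# Toy sequences for illustration only (assumption: not real protein data).
toy_proteins = {
    "toy_globin": "VLSPADKTNVKAAW",
    "toy_other": "MGGSCCQWERTYIP",
}

def proteins_matching(peptide, proteins, sequence_known):
    """Names of proteins that a short peptide could have come from."""
    match = maps_by_sequence if sequence_known else maps_by_composition
    return [name for name, seq in proteins.items() if match(peptide, seq)]
```

      A composition-only match is necessarily more permissive than an exact sequence match, which is one way to see why MS/MS-derived sequences link a peptide to a source protein more directly.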

      Reviewer #3 (Recommendations For The Authors):

      (1) Thermal stability studies confirmed that PfA-M1 was a binding target, however, there were other proteins consistently identified in the thermal stability studies. This raises the question as to their potential role as additional targets of this inhibitor. The authors dismiss these because they are not metalloproteases, but further analysis is warranted. This is particularly important as the authors were not able to generate mutants using in vitro evolution of resistance strategies. This often indicates that the inhibitor has more than one target.

      We thank the reviewer for this comment. The possibility of other targets contributing to MIPS2673 activity was also raised by Reviewer #2 (question 2) and is addressed above. Further to our response to Reviewer #2, we agree that the inability to generate resistant parasites in vitro could indicate that inhibition of multiple essential parasite proteins (including PfA-M1) contributes to MIPS2673 activity and do not rule out this possibility. It may also indicate the target has a very high barrier for resistance and is unable to tolerate resistance-causing mutations as they are deleterious to function. Indeed, previous attempts to mutate PfA-M1 (references 12 and 50), and our own attempts to generate MIPS2673 resistant parasites in vitro (unpublished), were unsuccessful. It is important to note that of the hits reproducibly identified using thermal stability proteomics, only PfA-M1 and a putative AP2 domain transcription factor (PF3D7_1239200) are predicted to be essential for blood stage growth. We have explicitly stated that PF3D7_1239200 could also contribute to activity (lines 533 and 537).

      As we identified multiple hits with thermal stability proteomics we employed the complementary LiP-MS method to further investigate the target landscape of MIPS2673. PfA-M1 was the only protein reproducibly identified as the target through this approach. Importantly, the five proteins identified as hits by thermal stability proteomics were also detected in our LiP-MS datasets, but only PfA-M1 was identified as a target by both target deconvolution methods, strongly indicating it is the primary target of MIPS2673 in parasites. An important caveat is that we profiled the soluble proteome (we did not include detergents necessary for extracting membrane proteins as they may interfere with these stability assays) and other factors (e.g. the biophysical properties of the protein) will impact on whether ligand induced stabilisation events are detected. We have added additional text in the discussion around the above points (lines 545-550).

      While we do not definitively rule out other MIPS2673 interacting proteins existing in parasites (that possibly also contribute to activity), our metabolomics studies indicated no functional impact by MIPS2673 outside of elevated levels of short peptides. This is indicative of aminopeptidase inhibition and the profile of peptide accumulation was distinct from a known PfA-M17 inhibitor, and other antimalarials, further pointing to selective inhibition of the PfA-M1 enzyme by MIPS2673 being responsible for antimalarial activity.

      (2) The next set of experiments focused on a limited proteolysis approach. Again several proteins were identified as interacting with MIPS2673 including metalloproteases. The authors go on to analyze the LiP-MS data to identify the peptide from PfA-M1 which putatively interacts with MIPS2673. The authors are clearly focused on PfA-M1 as the target, but a further analysis of the other proteins identified by this method would be warranted and would provide evidence to either support or refute the authors' conclusions.

      As PfA-M1 was the only protein reproducibly identified as an interacting protein across both LiP-MS experiments (and by thermal stability proteomics) we focused our analysis on this protein. However, we agree that further analysis of the other putative interacting proteins would be valuable. Additional analysis was performed (see new figure S4) on the other interacting proteins identified by thermal stability proteomics and the other interacting proteins identified in LiP-MS experiment one, as no other proteins (apart from PfA-M1) were identified as hits in the second LiP-MS experiment (lines 314-318, 495-505, 740-762 and Fig. S4). Using the common peptides detected across both LiP-MS experiments we mapped significant LiP peptides to the structures of the other putative MIPS2673-interacting proteins, where a structure was available and significant LiP-MS peptides were detected, and measured the minimum distance to expected binding sites. It is noted that when using the same criteria for a significant LiP peptide that we used for our PfA-M1 analysis, only one significant LiP peptide is identified from these other putative interacting proteins (YSPSFMSFK from PfADA). Therefore, we used less stringent criteria for defining significant LiP peptides for these other proteins (see methods and Fig. S4 legend) in order to identify significant LiP peptides to map to structures. This analysis showed that, with the exception of PfA-M17, significant LiP-MS peptides for these other proteins are not significantly closer to binding sites than all other detected peptides, supporting our assertion that these other proteins are likely to be false positives or not functionally relevant MIPS2673 interacting proteins. Although significant peptides from PfA-M17 were closer to the binding site, our thermal stability and metabolomics data, combined with our previous work on the PfA-M17 enzyme, argue against this being a functionally relevant target (see lines 362-374 and 486-529 for a more detailed discussion). Another possible explanation for this result is that peptide substrates accumulating due to primary inhibition of PfA-M1 interact with PfA-M17, leading to structural changes around the enzyme active site that are detected by LiP-MS.
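      The "minimum distance to expected binding sites" measurement mentioned above can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names and coordinates are hypothetical, atoms are assumed to be given as (x, y, z) tuples in Å, and the statistical comparison against all other detected peptides is not reproduced here because its details are given only in the manuscript's methods.

```python
import math

def min_distance(peptide_atoms, site_atoms):
    """Smallest Euclidean distance between any peptide atom and any site atom."""
    return min(
        math.dist(p, s)
        for p in peptide_atoms
        for s in site_atoms
    )

def distances_by_peptide(peptides, site_atoms):
    """Minimum distance to the binding site for each named peptide."""
    return {
        name: min_distance(atoms, site_atoms)
        for name, atoms in peptides.items()
    }

# Toy coordinates for illustration only (assumption: not taken from any structure).
toy_site = [(0.0, 0.0, 10.0), (3.0, 4.0, 5.0)]
toy_peptides = {
    "peptide_near": [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    "peptide_far": [(30.0, 40.0, 5.0)],
}
```

      In practice the atom coordinates would come from the solved or predicted structure of each protein, and the per-peptide distances for significant LiP peptides would then be compared with those of all other detected peptides.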

      (3) The final set of experiments was an untargeted metabolomics analysis. They identified 97 peptides as significantly dysregulated after MIPS2673 treatment of infected cells and most of these peptides were derived from one of the hemoglobin chains. The accumulation of peptides was consistent with a block in hemoglobin digestion. This experiment does reveal a potential functional confirmation, but questions remain as to specificity.

      As indicated, the accumulation of short peptides identified by metabolomics suggests MIPS2673 perturbs aminopeptidase function. Many of these peptides (but not all) likely map to haemoglobin and are more haemoglobin-like than other proteins in the infected red blood cell proteome. An effect on a subset of non-haemoglobin peptides is also apparent and we have added this to our discussion (also refer to our response to question 4 from Reviewer #2). A direct comparison to our previous metabolomics analysis of a specific PfA-M17 inhibitor (MIPS2571, reference 11) revealed MIPS2673 induces a unique metabolomic profile. The extent of peptide accumulation differed and a subset of short basic peptides (containing Lys or Arg) were elevated only by MIPS2673, consistent with the broad substrate preference of PfA-M1. Importantly, the metabolomics profile induced by MIPS2673 is the opposite of many other antimalarials, which cause depletion of haemoglobin peptides. Taken together, the profile of short peptide accumulation induced by MIPS2673 is consistent with specific inhibition of PfA-M1.

      (4) Overall, this is an interesting series of experiments that have identified a putative inhibitor of PfA-M1 and PvA-M1. The work would be significantly strengthened by structure-aided analysis. It is unclear why putative binding sites cannot be analyzed via specific mutagenesis of the recombinant enzyme.

      Contrary to this comment we solved the crystal structure of PfA-M1 bound to MIPS2673, determining its binding mechanism to the enzyme. This was further supported through proteomics-based structural analysis by LiP-MS. Undertaking site specific mutagenesis would be interesting to further probe the binding dynamics of MIPS2673 to the M1 protein. However, we believe it is beyond the scope of this study and would not change our conclusion that MIPS2673 binds to PfA-M1, which we have shown using multiple unbiased proteomics-based methods, enzyme assays and X-ray crystallography.

      (5) In the thermal stability and LiP -MS analysis, other proteins were consistently identified in addition to PfA-M1 and yet no additional analysis was undertaken to explore these as potential targets.

      As addressed in our previous responses, across independent thermal stability proteomics experiments we consistently identified 5 interacting proteins, including the expected target PfA-M1. In contrast, only PfA-M1 was reproducible across independent LiP-MS experiments. While several plausible putative targets (including aminopeptidases and metalloproteins) were identified in one of our LiP-MS experiments, they appear to be false discoveries and not responsible for the antiparasitic activity of MIPS2673, as peptide-level stabilisation was not consistent across independent LiP-MS experiments, and an interaction is refuted by our thermal stability, metabolomics and recombinant enzyme inhibition data. We have now performed further analysis of these other putative interacting proteins, which also argues against them being likely interacting proteins (see also response to question 2). We have also added to our existing discussion on possible MIPS2673 targets and the likelihood of these proteins contributing to antimalarial activity (lines 486-550).

      (6) The metabolomics experiments were potentially interesting, but without significant additional work including different lengths of treatment and different stages of the parasite, the conclusions drawn are overstated. Many treatments disrupt hemoglobin digestion - either directly or indirectly and from the data presented here it is premature to conclude that treatment with MIPS2673 directly inhibits hemoglobin digestion.

      Our metabolomics studies were performed using typical experimental conditions for investigating the antimalarial mechanisms of compounds by metabolomics (see references 11, 39, 40 and 55-57). We used a short 1 h incubation at 3x EC50 allowing us to profile the primary parasite pathways affected by MIPS2673 and avoid a nonspecific death phenotype associated with longer incubations. As addressed in our response to Reviewer #1 (question 2) we focused on trophozoite infected red blood cells as this is the stage most susceptible to MIPS2673 and when one presumes the greatest functional impact would be seen. It is possible that an expanded kinetic metabolomics analysis may reveal secondary mechanisms involved in MIPS2673 activity and we have now acknowledged this in the manuscript (lines 515-516). However, even though secondary mechanisms may become apparent at longer incubations it also becomes difficult to uncouple drug specific responses from nonspecific death effects. We believe any additional information provided by an expanded metabolomics analysis is unlikely to outweigh the significant extra financial cost associated with this type of experiment.

      It is correct that many antimalarial compounds appear to disrupt haemoglobin digestion when analysed by metabolomics. However, as indicated in our manuscript (lines 369-373) and previous responses, the profile of elevated haemoglobin peptides induced by MIPS2673 is substantially different to the profile caused by other antimalarials. For example, artemisinins and mefloquine cause haemoglobin peptide depletion (references 55-57) and chloroquine results in increased levels of a different subset of non-haemoglobin peptides (see Creek et al. 2016). While there is some overlap in profile with a selective M17 inhibitor (our previous work, reference 11), the level of enrichment of these peptides is different and MIPS2673 also induces accumulation of a distinct set of basic peptides consistent with the substrate preference of the PfA-M1 enzyme. As we show that MIPS2673 does not inhibit other parasite aminopeptidases, a likely explanation for the profile overlap is that the build-up of substrates that cannot be processed by PfA-M1 leads to secondary dysregulation of other aminopeptidases. Our analyses (sequence mapping, MS/MS analysis and sequence similarities to all infected red blood cell proteins) strongly indicate that the majority of elevated peptides (but not all) originate from haemoglobin. Combined with our proteomics and recombinant enzyme data indicating direct engagement of PfA-M1, and with previous literature indicating the enzyme functions to cleave amino acids from haemoglobin-derived peptides, our data indicates MIPS2673 likely directly perturbs the haemoglobin digestion pathway through PfA-M1 inhibition.

      (7) Finally, the potency of this compound on parasites grown in vitro is 300 nM - this would need improvements in potency and demonstration of in vivo efficacy in the SCID mouse model to consider this a candidate for a drug.

      We do not propose MIPS2673 as an antimalarial candidate. The experiments presented here were centred on target validation rather than identification of an antimalarial lead, which may be the focus of future studies. To avoid this confusion, we have amended the manuscript title and language throughout to clarify this point.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This study advances our understanding of the allosteric regulation of anaerobic ribonucleotide reductases (RNRs) by nucleotides, providing valuable new structural insight into class III RNRs containing ATP cones. The cryo-EM structural characterization of the system is solid, but some open questions remain about the interpretation of activity/binding assays and the newly incorporated HDX-MS results. The work will be of interest to biochemists and structural biologists working on ribonucleotide reductases and other allosterically regulated enzymes.

      Public Reviews:

      Reviewer #1 (Public Review):

      The goal of this study is to understand the allosteric mechanism of overall activity regulation in an anaerobic ribonucleotide reductase (RNR) that contains an ATP-cone domain. Through cryo-EM structural analysis of various nucleotide-bound states of the RNR, the mechanism of dATP inhibition is found to involve order-disorder transitions in the active site. These effects appear to prevent binding of substrate and a radical transfer needed to initiate the reaction.

      Strengths of the manuscript include the comprehensive nature of the work - including both numerous structures of different forms of the RNR and detailed characterization of enzyme activity to establish the parameters of dATP inhibition. The manuscript has been improved in a revision by performing additional experiments to help corroborate certain aspects of the study. But these new experiments do not address all of the open questions about the structural basis for mechanism. Additionally, some questions about the strength of biochemical data and fit of binding or kinetic curves to data that were raised by other referees still remain. Some experimental observations are not consistent with the proposed model. For example, why does dATP enhance Gly radical formation when the proposed mechanism of dATP inhibition involves disorder in the Gly radical domain?

      The work is impactful because it reports initial observations about a potentially new mode of allosteric inhibition in this enzyme class. It also sets the stage for future work to understand the molecular basis for this phenomenon in more detail.

      We express our gratitude to the reviewer for dedicating time to review our work and for the overall favorable assessment. We agree that the question of exactly how much the glycyl radical domain becomes more mobile without losing the glycyl radical entirely is an unresolved one but we also think that our work sets a solid basis for future experiments by us and others.

      Reviewer #3 (Public Review):

      The manuscript by Bimai et al describes a structural and functional characterization of an anaerobic ribonucleotide reductase (RNR) enzyme from the human microbe, P. copri. More specifically, the authors aimed to characterize the mechanism by which (d)ATP modulates nucleotide reduction in this anaerobic RNR, using a combination of enzyme kinetics, binding thermodynamics, and cryo-EM structural determination, complemented by hydrogen-deuterium exchange (HDX). One of the principal findings of this paper is the ordering of a NxN 'flap' in the presence of ATP that promotes RNR catalysis and the disordering (or increased protein dynamics) of both this flap and the glycyl radical domain (GRD) when the inhibitory effector, dATP, binds. The latter is correlated with a loss of substrate binding, which is the likely mechanism for dATP inhibition. It is important to note that the GRD is remote (>30 Å) from the binding site of the dATP molecule, suggesting long-range communication of the structural (dis)ordering. The authors also present evidence for a shift in oligomerization in the presence of dATP. The work does provide evidence for new insights/views into the subtle differences of nucleotide modulation (allostery) of RNR, in a class III system, through long-range interactions.

      The strengths of the work are the impressive, in-depth structural analysis of the various regulated forms of PcRNR by (d)ATP using cryo-EM. The authors present seven different models in total, with striking differences in oligomerization and (dis)ordering of select structural features, including the GRD that is integral to catalysis. The authors present several, complementary biochemical experiments (ITC, MST, EPR, kinetics) aimed at resolving the binding and regulatory mechanism of the enzyme by various nucleotides. The authors present a good breadth of the literature in which the focus of allosteric regulation of RNRs has been on the aerobic orthologues.

      The addition of hydrogen-deuterium exchange mass spectrometry (HDX-MS) complements the results originating from cryo-EM data. Most notable is the observation of the enhanced exchange (albeit quite subtle) of the GRD domain in the presence of dATP that matches the loss of structural information in this region in the cryo-EM data. The most pronounced and compelling HDX results are seen in the form of dATP-induced protection of peptides immediately adjacent to the β-hairpin at the s-site, where dATP is expected to bind based on cryo-EM. It is clear that the presence of dATP increases the rigidity of this region.

      We are happy that both reviewers find the HDX-MS experiments to be a valuable addition to the existing data.

      Weaknesses:

      The discussion of the change in peptide mobility in the N-terminal region is complicated by the presence of bimodal mass spectral features and this may prevent detailed interpretation of the data, especially for the select peptide region that shows opposite trends upon nucleotide association.

      Further, the HDX data in the NxN flap is unchanged upon nucleotide binding (ATP, dATP, or CTP), despite changes observed in the cryo-EM data.

      We are grateful to the reviewer for the comprehensive feedback on the HDX-MS part and for identifying areas for improvement. The HDX analysis was of course undertaken with the intention of identifying differences in disorder of the NxN flap and GRD region. From an HDX perspective both regions were found to be highly susceptible to HDX regardless of state/ligand, due to surface accessibility and/or very fast dynamics. However, this does not mean that there is no difference in the degree of order of these regions upon ligand addition, simply that with HDX-MS, in the limited time span of 30-3000 seconds, we could not conclusively support an increased disorder. We have rephrased the discussion text to reflect this fact.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      On page 5 (and throughout the manuscript) there are some inconsistencies in how dissociation constants for effectors and inhibitors are described - for example, D in KD is sometimes subscripted and sometimes not.

      Thank you for noticing these remaining errors. We hope that we have fixed all of them now.

      Reviewer #3 (Recommendations For The Authors):

      The authors addressed many of the initial concerns raised. The addition of the HDX-MS data in this revision is a welcomed contribution to the work and complements the cryo-EM data. In select cases, the data may be over-interpreted. This reviewer suggests that the authors revise the text in this section so that it is more consistent with the presented data.

      Specific points:

      (1) The bimodal mass spectral features in the N-terminal domain complicate the data interpretation. Specifically for peptides in 81-99 region, the fast exchanging feature shows protection in the presence of (d)ATP/CTP, but the opposite trend is observed for the slow exchanging species. It is therefore advisable to not make absolutes about the HDX results in this region, as the data are complicated.

      As stated by the reviewer, it is not possible from the presented HDX data to deduce if this is a result of 50% loaded dimer or the oligomerization state of the protein. We have remedied this by removing mentions of a difference in bimodality between dATP and ATP. Also, we have addressed this in the text by stating that the main reason is most likely the different oligomerization states present in solution. Nevertheless, it is clear from the HDX data that the N-terminal region and 81-99 are very interesting, and it was somewhat disappointing that due to the dynamics of the oligomerization it was not possible to SEC-purify pure dimer or tetramer samples for HDX-MS, in order to deconvolute the cause.

      (2) Related to #1, the authors assign the bimodal HDX behavior to an EX1 mechanism, but this is not necessarily true (and is unlikely) based on the limited time points. The authors also state that it originates from the heterogeneity of the sample: "a mixture of states" which could reflect the mixture of oligomerization states. The authors should be careful assigning an EX1 mechanism unless there are compelling results to support it.

      We apologize for the unfortunate phrasing. It was not our intention to imply that the bimodality is due to true EX1 kinetics. See the above answer. The mention of EX1 has been removed from the discussion text.

      (3) The deuterium uptake for peptide 118-126 is very small (~1Da) compared to the length of the peptide. The change in deuterium uptake (<0.25Da) from dATP is very small; the authors should proceed with caution when presenting interpretations of such small differences.

      We agree with the reviewer that extra caution should be taken when dealing with such a small difference. However, the 118-126 peptide has been significance tested in both HDExaminer and Deuteros 2.0, and we also observed this for more than one run. The difference in uptake is small but increases to significance at the longer labelling times. The proximity to the NxN flap makes it interesting in the context of an allosteric conformational change, i.e., the dynamics of the NxN flap might be too fast, so we can only see some secondary effects. We would like to keep the data in Figure 10 for reasons of transparency. In essence this is similar to the observed bimodality mentioned above: we cannot fully explain the observation but present the data as it was observed.

      (4) On p. 22, the authors should consider revising the following statement: "confirming dATP binding to the s-site." Even though the HDX data are most compelling for the protection of peptides 178-204 and 330-348 that are adjacent to the beta-hairpin at the s-site, these data cannot "confirm" a binding site for a small molecule, such as dATP.

      We appreciate that the reviewer has pointed out that the statement can be misleading, and we agree that the binding site of small molecules can’t be confirmed based solely on HDX data. The sentence has been reformulated to clarify that the binding site was confirmed based on the combined evidence of HDX data and the previously presented biochemical and structural data on the s-site.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This manuscript reports important in vitro biochemical and in planta experiments to study the receptor activation mechanism of plant membrane receptor kinase complexes with non-catalytic intracellular kinase domains. Several lines of evidence convincingly show that one such putative pseudokinase, the immune receptor EFR achieves an active conformation following phosphorylation by a co-receptor kinase, and then in turn activates the co-receptor kinase allosterically to enable it to phosphorylate down-stream signaling components. This manuscript will be of interest to scientists focusing on cell signalling and allosteric regulation.

      We wish to clarify that EFR itself is not a pseudokinase. We showed in previous work (Bender et al., 2021; https://doi.org/10.1073/pnas.2108242118 ) that EFR has catalytic activity in vitro. This catalytic activity is, however, not required for elf18-induced immune signaling in planta.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary

      The authors use an elegant but somewhat artificial heterodimerisation approach to activate the isolated cytoplasmic domains of different receptor kinases (RKs) including the receptor kinase BRI1 and EFR. The developmental RK BRI1 is known to be activated by the co-receptor BAK1. Active BRI1 is then able to phosphorylate downstream substrates. The immune receptor EFR is also an active protein kinase also activated by the co-receptor BAK1. EFR however appears to have little or no kinase activity but seems to use an allosteric mechanism to in turn enable BAK1 to phosphorylate the substrate kinase BIK1. EFR tyrosine phosphorylation by BAK1 appears to trigger a conformational change in EFR, activating the receptor. Likewise, kinase activating mutations can cause similar conformational transitions in EFR and also in BAK1 in vitro and in planta.

      We wish to clarify that we make no strong link between tyrosine phosphorylation and the conformational change leading to activation of the complex. Rather, the HDX-MS data demonstrate the structural importance of Tyr836 for the activation mechanism. At present, we do not know how phosphorylation of the residue would affect the activation process.

      Strengths:

      I particularly liked the HDX experiments coupled with mutational analysis (Fig. 2) and the design and testing of the kinase activating mutations (Fig. 3), as they provide novel mechanistic insights into the activation mechanisms of EFR and of BAK1. These findings are nicely extended by the large-scale identification of EFR-related RKs from different species with potentially similar activation mechanisms (Fig. 5).

      Weaknesses:

      In my opinion, there are currently two major issues with the present manuscript. (1) The authors have previously reported that the EFR kinase activity is dispensable for immune signaling (https://pubmed.ncbi.nlm.nih.gov/34531323/) but the wild-type EFR receptor still leads to a much better phosphorylation of the BIK1 substrate when compared to the kinase inactive D849N mutant protein (Fig. 1). (2) How the active-like conformation of EFR is in turn activating BAK1 is poorly characterized, but appears to be the main step in the activation of the receptor complex. Extending the HDX analyses to resting and Rap-activated receptor complexes could be a first step to address this question, but these HDX studies were not carried out due to technical limitations.

      Overall this is an interesting study that aims to advance our understanding of the activation mechanisms of different plant receptor kinases with important functions in plant immunity.

      Reviewer #2 (Public Review):

      Summary:

      Transmembrane signaling in plants is crucial for homeostasis. In this study, the authors set out to understand to what extent catalytic activity in the EFR tyrosine kinase is required in order to transmit a signal. This work was driven by mounting data that suggest many eukaryotic kinases do not rely on catalysis for signal transduction, relying instead on conformational switching to relay information. The crucial findings reported here involve the realisation that a kinase-inactive EFR can still activate (i.e., lead to downstream phosphorylation of) its partner protein BAK1. Using a convincing set of biochemical, mass spectrometric (HD-exchange) and in vivo assays, the team suggest a model in which EFR is likely phosphorylated in the canonical activation segment (where two Ser residues are present), which is sufficient to generate a conformation that can activate BAK1 through dimerisation. A model is put forward involving C-helix positioning in BAK1, and the model is extended to other 'non-RD' kinases in Arabidopsis that likely do not require kinase activity for signaling.

      We prefer not to describe EFR as a tyrosine kinase. It may be the case that EFR can function under certain conditions as a dual-specificity protein kinase, but this has never been demonstrated experimentally. We therefore describe EFR as a Ser/Thr protein kinase, since it is known that the isolated cytoplasmic domain can phosphorylate on Ser and Thr residues (Wang et al., 2014; https://doi.org/10.1016/j.jprot.2014.06.009).

      Strengths:

      The work uses logical and well-controlled approaches throughout, and is clear and convincing in most areas, linking data from IPs, kinase assays (including clear 32P-based biochemistry), HDX-MS data (from non-phosphorylated EFR), structural biology, oxidative burst data and infectivity assays. Repetitions and statistical analysis all appear appropriate.

      Overall, the work builds a convincing story and the discussion does a clear job of explaining the potential impact of these findings (and perhaps an explanation of why so many Arabidopsis kinases are 'pseudokinases', including XPS1 and XIIa6, where this is shown explicitly).

      Weaknesses:

      No major weaknesses are noted from reviewing the data and the paper follows a logical course built on solid foundations; the use of Tables to explain various experimental data pertinent to the reported studies is appreciated.

      (1) The use of a, b, c, d in Figures 2C and 3C etc. is confusing to this referee, and is now addressed in the latest version.

      (2) The debate about kinase v pseudokinases is well over a decade old. For non-experts, the kinase alignments/issues raised are in PMID: 23863165 and might prove useful if cited.

      We have cited the suggested reference in the second paragraph of the discussion.

      (3) Early on in the paper, the concept of kinases and pseudokinases related to R-spine (and extended R-spine) stability and regulation really needs to be more adequately introduced to explain what comes next; e.g. some of the key work in this area for RAF and Tyr kinases where mutual F-helix Phe amino acid changes are evaluated (conceptually similar to this study of the E-helix Tyr to Phe changes in EFR) should be cited (PMID: 17095602, 24567368 and 26925779).

      As an alternative, we have amended the text in several places to focus on conformational toggling between active/inactive states rather than R-spine stability. We think that this keeps the message of our manuscript focused. We hope that the reviewer finds this acceptable.

      (4) In my version, some of the experimental text is also currently in the wrong order (and no page numbers, so hard for me to state exactly where in the manuscript); However, I am certain that Figure 2C is mentioned in the text when the data are actually shown in Figure 3C for the EFR-SSAA protein.

      Indeed, some references to Figure 2 in the text were incorrect. We have corrected these. References in the text to Figure 3 and the data reported therein are correct.

      (5) Tyr 156 in PKA is not shown in Supplement 1, 2A as suggested in the text; for readers, it will be important to show the alignment of the Tyr residue in other kinases; this has been updated in the second version. Although it is clearly challenging to generate phosphorylated EFR (seemingly through Codon-expansion here?), it appears unlikely that a phosphorylated EFR protein, even semi-pure, couldn't have been assayed to test the idea that the phosphorylation drives/supports downstream signaling. What about a DD or EE mutation, as commonly used (perhaps over-used) in MEK-type studies?

      Our aim with codon expansion was to generate recombinant protein carrying high-stoichiometry phosphorylation at sites which we have previously documented to be required for downstream signaling (Macho et al., 2014; Bender et al., 2021). We additionally demonstrated previously that a DD mutant of the activation loop sites in EFR does not fully complement the efr-1 mutant (Bender et al., 2021), suggesting that the Asp mutations are not good phospho-mimics in this context. We therefore did not generate DD or EE mutations for in vitro studies.

      Impact:

      The work is an important new step in the huge amount of follow-up work needed to examine how kinases and pseudokinases 'talk' to each other in (especially) the plant kingdom, where significant genetic expansions have occurred. The broader impact is that we might understand better how to manipulate signaling for the benefit of plants and mankind; as the authors suggest, their study is a natural progression both of their own work, and the kingdom-wide study of the Kannan group.

      Reviewer #3 (Public Review):

      The study presents strong evidence for allosteric activation of plant receptor kinases, which enhances our understanding of the non-catalytic mechanisms employed by this large family of receptors.

      Plant receptor kinases (RKs) play a critical role in transducing extracellular signals. The activation of RKs involves homo- or heterodimerization of the RKs, and it is believed that mutual phosphorylation of their intracellular kinase domains initiates downstream signaling. However, this model faces a challenge in cases where the kinase domain exhibits pseudokinase characteristics. In their recent study, Mühlenbeck et al. reveal the non-catalytic activation mechanisms of the EFR-BAK1 complex in plant receptor kinase signaling. Specifically, they aimed to determine that the EFR kinase domain activates BAK1 not through its kinase activity, but rather by utilizing a "conformational toggle" mechanism to enter an active-like state, enabling allosteric trans-activation of BAK1. The study sought to elucidate the structural elements and mutations of EFR that affect this conformational switch, as well as explore the implications for immune signaling in plants. To investigate the activation mechanisms of the EFR-BAK1 complex, the research team employed a combination of mutational analysis, structural studies, and hydrogen-deuterium exchange mass spectrometry (HDX-MS) analysis. For instance, through HDX-MS analysis, Mühlenbeck et al. discovered that the EFR (Y836F) mutation impairs the accessibility of the active-like conformation. On the other hand, they identified the EFR (F761H) mutation as a potent intragenic suppressor capable of stabilizing the active-like conformation, highlighting the pivotal role of allosteric regulation in BAK1 kinase activation. The data obtained from this methodology strengthens their major conclusion. Moreover, the researchers propose that the allosteric activation mechanism may extend beyond the EFR-BAK1 complex, as it may also be partially conserved in the Arabidopsis LRR-RK XIIa kinases. This suggests a broader role for non-catalytic mechanisms in plant RK signaling.

      The allosteric activation mechanism was demonstrated for receptor tyrosine kinases (RTKs) many years ago. A similar mechanism has been suggested for the activation of plant RKs, but experimental evidence for this conclusion is lacking. Data in this study represent a significant advancement in our understanding of non-catalytic mechanisms in plant RK signaling. By shedding light on the allosteric regulation of BAK1, the study provides a new paradigm for future research in this area.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors have considered points 1-5 raised in my initial review and the revised manuscript contains a more balanced discussion and limitation section. No additional experiments have been performed to substantiate the envisioned allosteric activation mechanism of the co-receptor kinase BAK1 by the receptor EFR. I rewrote the public statement accordingly.

      Reviewer #2 (Recommendations For The Authors):

      Thanks for responding to my comments.

      Reviewer #3 (Recommendations For The Authors):

      The revised manuscript has fully addressed my previous concerns and is now suitable for publication in eLife.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      Using concurrent in vivo whole-cell patch clamp and dendritic calcium imaging, the authors characterized how functional synaptic inputs across dendritic arborizations of mouse primary visual cortex layer 2/3 neurons emerge during the second postnatal week. They were able to identify spatially and functionally separated domains of clustered synapses in these neurons even before eye-opening and characterize how the clustering changes from P8 to P13. 

      Strengths: 

      The work is technically challenging and the findings are novel. The results support previous EM and immunostaining studies but provide in vivo evidence on the time course and the trajectory of how functional synaptic input develops. 

      Weaknesses: 

      There are some missing details about how the experiments were performed, and I also have some questions about the analyses. 

      We have now added a more detailed description of the methods and added new supplemental figures and descriptions to clarify our analyses. Please find our responses to the specific points of this reviewer in the section “Recommendations for the authors” below.

      Reviewer #2 (Public Review):

      In this study, Leighton et al performed remarkable experiments by combining in-vivo patch-clamp recording with two-photon dendritic Ca2+ imaging. The voltage-clamp mode is a major improvement over the pioneer versions of this combinatorial experiment that has led to major breakthroughs in the neuroscience field for visualizing and understanding synaptic input activities in single cells in-vivo (sharp electrodes: Svoboda et al, Nature 1997, Helmchen et al, Nature Neurosci 1999; whole-cell current-clamp: Jia et al, Nature 2010, Chen et al, Nature 2011; I suggest that these papers be cited). This is because voltage-clamp mode, although full control of membrane voltage in-vivo is not realistic, is nevertheless most effective in preventing back-propagating action potentials, which would severely confound the measurement of individual synaptically-induced Ca2+ influx events. Furthermore, clamping the cell body at a strongly depolarized potential (here the authors did -30mV) also facilitates the detection of synaptically-induced Ca2+ influx. As a result, the authors successfully recorded high-quality Ca2+ imaging data that can be used for precise analysis. To date, even in view of the rapid progress of voltage-sensitive indicators and relevant imaging technologies in recent years, this very old 'art' of combining single-cell electrophysiology and two-photon imaging (ordinary, raster-scanned, video-rate imaging) of Ca2+ signals still enables measurements of the best level of precision.

      We thank the reviewer for reminding us of these important previous studies that we cite now in the revised manuscript. 

      On the other hand, the interpretation of data in this study is a bit narrow-minded and lacks a comprehensive picture. Some suggestions to improve the manuscript are as follows: 

      (1) The authors made a segregation of 'spine synapse' and 'shaft synapse' based solely on the two photon images in-vivo. However, caution shall be taken here, because the optical resolution under in vivo imaging conditions like this cannot reliably tell apart whether a bright spot within or partially overlapping a segment of the dendrite is a spine on top of (or below) it. Therefore, what the authors consider as a 'shaft synapse' (by detecting Ca2+ hotspots) has an unknown probability of being just a spine on top or below the dendrite. If there is other imaging data of higher axial resolution to validate or calibrate, the authors shall take some further considerations or analysis to check the consistency of their data, as the authors do need such a segregation between spine and shaft synapses to show how they evolve over the brain development stages. 

      We agree with the reviewer that the differentiation between spine and shaft synapses can be difficult for those spines that are located above or below the dendritic shaft in the z-dimension because of the lower resolution of 2-photon microscopy in the z-dimension compared to the image plane. We have now added a new paragraph to the Methods section to describe in more detail how we identify spine and shaft synapses and provide more examples in a new supplementary figure (Fig S5). We believe that we can identify spine and shaft synapses reliably in most cases, but added a cautionary note to make the reader aware of potential misidentifications.

      (2) The use of terminology 'bursts of spontaneous inputs' for describing voltage-clamp data seems improper. Conventionally, 'burst' refers to suprathreshold spike firing events, but here, the authors use 'burst' to refer to inward synaptic currents collected at the cell body. Not every excitatory synaptic input (or ensemble of inputs) activation will lead to spike firing under naturalistic conditions, therefore, these two concepts are not equivalent. It is recommended to use 'barrage of inputs' instead of 'burst of inputs'. Imagine a full picture of the entire dendritic tree, the fact that the authors could always capture spontaneous Ca2+ events here and there within a few pieces of dendrites within an arbitrary field-of-view suggests that, the whole dendritic tree must have many more such events going on as a barrage while the author's patch electrode picks up the summed current flow from the whole dendritic tree. 

      We agree with the reviewer that “barrage” is a clearer term for multiple synaptic inputs occurring simultaneously and therefore we changed the terminology throughout the manuscript.

      (3) Following the above issue, an analysis of the temporal correlation between synaptic (not segregating 'spine' or 'shaft') Ca2+ events and EPSCs is absent. Again, the authors drew arbitrary time windows to clump the events for statistical analysis. However, the demonstrated example data already shows that the onset times of individual synaptic Ca2+ events do not necessarily align with the beginning of a 'barrage' inward current event. 

      The reviewer writes that “an analysis of the temporal correlation between synaptic calcium events and EPSCs is absent”. We would like to point out that we did determine the percentage of calcium transients that occurred during barrages of synaptic inputs (~60%, page 7). This is important, since the barrages in our patch-clamp recordings most likely reflect spontaneous network events as described in the developing cortex previously by us and many other labs. The time window we chose was not “arbitrary” as the reviewer suggests, but based on the duration of the barrages of synaptic inputs as defined in the Methods section.

      The reason why we did not perform a more in-depth analysis of the temporal relationship between synaptic calcium transients and synaptic input currents is that it is essentially impossible to relate calcium transients at individual synapses to specific synaptic input events. First, during barrages of synaptic inputs many synapses are active simultaneously, both in the mapped dendrites and in the unobserved parts of the dendritic arborization, as the reviewer notes above. Thus, barrages cannot be broken down into individual synaptic transmission events. Second, since our acquisition frequency is ~10 Hz, we can identify the onset of individual synaptic calcium transients with 100-200 ms precision (1 or 2 frames). However, throughout any 100-200 ms period of recording, several synapses are active across the entire dendritic arborization such that we cannot assign a given calcium transient to a specific EPSC within a 100-200 ms epoch. Third, due to the limited clamping capacity of in vivo patch recordings, we cannot be certain that individual transmission events in distal dendrites can be resolved in the patch recording.
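
      The second point can be illustrated with a small sketch. It is not part of the analysis in the manuscript: the function name, the example EPSC times and the choice to look backwards from the detected onset are assumptions made purely for illustration. It lists every EPSC that falls inside the 1-2 frame uncertainty window of a calcium transient; whenever more than one candidate is returned, the transient cannot be assigned to a single EPSC.

```python
def candidate_epscs(onset_ms, epsc_times_ms, frame_rate_hz=10.0, n_frames=2):
    """Return the EPSC times that could underlie a calcium transient.

    onset_ms      -- time of the frame in which the transient was first seen
    epsc_times_ms -- EPSC onset times from the patch recording (hypothetical)
    frame_rate_hz -- imaging rate; ~10 Hz is the value quoted in the response
    n_frames      -- onset uncertainty in frames (1 or 2 in the response)
    """
    # One frame at 10 Hz lasts 100 ms, so two frames span 200 ms.
    window_ms = n_frames * 1000.0 / frame_rate_hz
    # Assumption: the true onset lies within the window preceding detection.
    return [t for t in epsc_times_ms if onset_ms - window_ms <= t <= onset_ms]


# Hypothetical EPSC times during a barrage (ms).
epscs = [700.0, 850.0, 950.0, 1100.0]
matches = candidate_epscs(1000.0, epscs)
print(matches)  # two candidates, so the assignment is ambiguous
```

      With these illustrative numbers, two EPSCs fall inside the 200 ms window, which is the ambiguity described above.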

      (4) The authors claim that "these observations indicate that the activity patterns investigated here are not or only slightly affected by low-level anesthesia". It would be nice to show some of the recordings in this work without any anesthesia to support this claim. 

      Indeed, the conclusion that the patterns of activity are only slightly affected by low levels of anesthesia is based on our previous recordings on the network level. Unfortunately, we are still not able to perform calcium imaging with single synapse resolution in unanesthetized developing mice (and no one else is, as far as we know), because the skull of these young animals is not yet firm. As a consequence, movements cannot be reduced sufficiently for patching and imaging with single synapse resolution. Our previously published (Siegel et al., 2012) and unpublished work on the cellular level suggests that activity patterns during light anesthesia are very similar to those during sleep in mouse pups at this age.

      Reviewer #3 (Public Review):

      Summary: 

      There is a growing body of literature on the clustering of co-active synapses in adult mice, which has important implications for understanding dendritic integration and sensory processing more broadly. However, it has been unclear when this spatial organization of co-active synapses arises during development. In this manuscript, Leighton et al. investigate the emergence of spatially organized, coactive synapses on pyramidal dendrites in the mouse visual cortex before eye-opening. They find that some dendrite segments contain highly active synapses that are co-active with their neighbors as early as postnatal day (P) 8-10, and that these domains of co-active synapses increase their coverage of the dendritic arbor by P12-13. Interestingly, Leighton et al. demonstrate that synapses co-active with their neighbors are more likely to increase their activity across a single recording session, compared to synapses that are not co-active with their neighbors, suggesting local plasticity driven by coincident activity before eye-opening.

      The current manuscript includes some replication of earlier results from the same research group (Winnubst et al., 2015), including the presence of clustered, co-active synapses in the visual cortex of mouse pups, and the finding that synapses co-active with their neighbors show an increase in transmission frequency during a recording session. The main novelty in the current study compared to Winnubst et al. (2015) is the inclusion of younger animals (P8-13 in the current study compared to P10-15 in Winnubst et al., 2015). The current manuscript is the first demonstration that active synapses are clustered on specific dendrite segments as early as P8-10 in the mouse visual cortex, and the first to show the progression in active synapse distribution along the dendrite during the 2nd postnatal week. These results from the visual cortex may help inform our understanding of sensory development more broadly. 

      Strengths: 

      The authors ask a novel question about the emergence of synaptic spatial organization, and they use well-chosen techniques that directly address their questions despite the challenging nature of these techniques. To capture both structural and functional information from dendrites simultaneously, the authors performed a whole-cell voltage clamp to record synaptic currents arriving at the soma while imaging calcium influx at individual synaptic sites on dendrites. The simultaneous voltage clamp and calcium imaging allowed the authors to isolate individual synaptic inputs without their occlusion by widespread calcium influx from back-propagating action potentials. Achieving in vivo dendrite imaging in live mice that are as young as P8 is challenging, and the resulting data provides a unique view of synaptic activity along individual dendrites in the visual cortex at an early stage in development that is otherwise difficult to assess. 

      The authors provide convincing evidence that synapses are more likely to be co-active with their neighbors compared to synapses located farther away (Fig. 6F-H), and that synapses co-active with their neighbors increase their transmission frequency during a recording session (Figure 7C). These findings are particularly interesting given that the recordings occur before eye-opening, suggesting a relationship between co-activity and local synaptic plasticity even before the onset of detailed visual input. These results replicate previously published findings from P10-15 pups (Winnubst et al., 2015), increasing confidence in the reproducibility of the data. 

      The authors also provide novel data documenting for the first time spatially organized, co-active synapses in pups as young as P8. Comparing the younger (P8-10) and older (P12-13) pups, provides insight into how clusters of co-active synapses might emerge during development. 

      Weaknesses: 

      This manuscript provides insufficient detail for assessing the rigor and reproducibility of the methods, particularly for age comparisons. The P8-10 vs P12-13 age comparisons are the primary novel finding in this manuscript, and it is, therefore, critical to avoid systematic age differences in the methods and analysis whenever possible. Specific concerns related to the age comparisons are listed below: 

      (1) Given that the same research group previously published P12-13 data (Winnubst et al., 2015), it is unclear whether both age groups in the current study were imaged/analyzed in parallel by the same researcher(s), or whether previous data was used for the P12-13 group. 

      While indeed the approach in the present study is similar to that of our previous study (Winnubst et al. 2015), the data set presented here is entirely new. The current study was made possible by a new microscope that allows combining resonant scanning with piezo-focusing to image large fractions of the dendritic arborization. In fact, we could now image almost 10 times larger dendritic segments including branch points than in our previous study. One author contributed to the experiments in both studies. Image analysis of all experiments was performed by the first author of the present study who was not involved in the Winnubst et al. work.

      (2) The authors mention that they used 2 different microscopes, and used a fairly wide range of imaging frame rates (5-15 Hz). It is unclear from the current manuscript whether the same imaging parameters were used across the two age groups. If data for the two experimental groups was collected separately, perhaps at different times, by a different person, or on a different microscope, there is a concern that some differences between the groups may not necessarily be due to age. 

      The reviewer mentions that the experimental settings are not identical across the experiments of this study. In the original manuscript we erroneously reported in the Methods section that 2 different setups were used for this study; however, all experiments were performed on the same microscope. We have corrected this in the new manuscript. We took timelapse recordings of small stacks of varying depth to cover as many dendrites as possible in each recording, therefore, we needed to adjust the rate of acquired stacks within a certain range as the reviewer points out. The data were acquired by two scientists during an overlapping period. And while the different ages were not recorded in a strictly randomized fashion, they were not acquired in sequence according to ages, but rather involved many attempts on animals of different ages from many different litters. For each litter a small percentage of animals would generate successful recordings, and the ages of these successes were random. Therefore, we believe that neither the collection of data nor the analysis (see point above) affected the differences we describe here for the two age groups.

      (3) It is unclear whether the image analysis was performed blind to age. Blinding to age during analysis is particularly important for this study, in which it was not possible to blind to age during imaging due to visible differences in size and developmental stage between younger and older pups. 

      The analysis was not set up to be performed blind to age. Not only is the age of the animal apparent at the stage (as the reviewer points out), but the number of spines and the activity levels also clearly show differences between neurons only a few days apart. However, all age-related findings reported in this study - except the increase in synapse density and activity - became apparent to us only after the full set of synaptic transmission events was determined and the analysis was performed on the entire data set, making it very unlikely that event detection was biased.

      (4) The relatively low N (where N is the number of dendrites or the number of mice) in this study is acceptable due to the challenging nature of the techniques used, but unintentional sampling bias is a concern. For example, if higher-order dendrites from the apical tuft were imaged at P12-13, while more segments of the apical trunk were imaged at P8-10, this could inadvertently create apparent age differences that were in fact due to dendrite location on the arbor or dendrite depth. 

      The reviewer points out that sampling bias with respect to synapse location along dendrites in the dataset could lead to falsely apparent age differences. In all experiments we imaged dendrites of layer 2/3 neurons that were relatively close to the cortical surface to optimize image quality. In addition, we confirmed that the mean distance of the imaged dendritic stretches from the cell body was similar between the dendrites of each age group (Young: 392 +/- 104 µm, Old: 323 +/- 118 µm; mean +/- STD). Therefore, we do not think that sampling bias affected these results.

      Additional general methodological concerns, which are not specifically related to the age comparisons, are listed below: 

      (5) The authors assert that clustered, co-active synapses emerge in the visual cortex before eye-opening, which is an important finding in that it suggests this phenomenon is driven by spontaneous activity rather than visual input. However, this finding hinges on the imaged cells being reliably located in the visual cortex, which is difficult to identify with certainty in animals that have not yet opened their eyes and therefore cannot undergo intrinsic signal imaging to demarcate the boundaries of the visual cortex. If the imaged cells were in, for example, nearby somatosensory cortex, then the observed spatial organization could be due to sensory input rather than spontaneous activity.

      The reviewer argues that if the neurons included in our analysis were located in non-visual sensory cortex, e.g. the somatosensory cortex, sensory experience might have shaped clustered inputs instead of spontaneous activity. We are, however, certain that the neurons were located inside the primary visual cortex. In previous experiments where we performed the same craniotomies, we mapped spontaneous activity across the sensory areas in the occipital neocortex and we know the exact location of V1 which is already very consistent during the second postnatal week. (See for example Supplemental Figure 4 in Leighton et al., 2021).  

      (6) It is unclear how the authors defined a synaptic transmission event in the GCaMP signal (e.g. whether there was a quantitative deltaF/F threshold). 

      In the revised manuscript, we describe the procedure of identifying synaptic calcium transients in more detail and added a new supplemental figure to clarify this aspect of the analysis. In short, we use an automated detection with a 2x standard deviation threshold and a subsequent manual control and selection step. Please, find all details in the Methods section and Figure S4 of the revised manuscript.
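
      The automated thresholding step mentioned above can be illustrated with a short sketch. This is not the authors' implementation: the function name `detect_events`, the use of the whole-trace mean and standard deviation as the baseline, and the example trace are assumptions made for illustration. Only the 2x standard deviation threshold is taken from the response, and the subsequent manual selection step is not modelled.

```python
from statistics import mean, stdev

def detect_events(trace, n_sd=2.0):
    """Return frame indices where the trace first rises above mean + n_sd * SD.

    Assumption: baseline and SD are estimated from the whole trace.
    """
    threshold = mean(trace) + n_sd * stdev(trace)
    onsets = []
    above = False
    for i, value in enumerate(trace):
        # Register an onset only on the upward crossing, not on every frame above threshold
        if value > threshold and not above:
            onsets.append(i)
        above = value > threshold
    return onsets

# Hypothetical deltaF/F trace with two transients (frames 4 and 10)
trace = [0.0, 0.1, -0.1, 0.05, 2.0, 1.5, 0.1, 0.0,
         -0.05, 0.1, 2.0, 0.2, 0.0, 0.05, -0.1, 0.0]
onsets = detect_events(trace)
print(onsets)
```

      In practice, candidate events found this way would still need the manual control step described in the response, since a fixed threshold cannot distinguish synaptic transients from other sources of fluorescence change.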

      (7) The authors' division of synapses into spine vs shaft is unconvincing due to the difficulty of identifying Z-projecting spines in images from 2-photon microscopy, where the Z resolution is insufficient to definitively identify Z-projecting spines, and the fact that spines in young animals may be thin and dim. The authors' examples of spine synapses (e.g. in Fig. 2A) are convincing, but some of the putative shaft synapses may in fact be on spines. 

      We agree with the reviewer that the differentiation between spine and shaft synapses can be difficult for those spines that are located above or below the dendritic shaft in the z-dimension because of the lower resolution of 2-photon microscopy in the z-dimension compared to the image plane (see also response to Reviewer 2, point 1). We have now added a new paragraph to the Methods section to describe in more detail how we identify spine and shaft synapses and provide more examples in a new supplementary figure (Fig S5). We believe that we can identify spine and shaft synapses reliably in most cases, but added a cautionary note to make the reader aware of potential misidentifications.

      Reviewer #1 (Recommendations For The Authors):

      I think the experiments performed were very technically challenging (probably one of the few labs that can do this in the field), and the findings provide in vivo evidence on how structured synaptic inputs are assembled during development that has never been reported. 

      I suggest improving the writing and presentation and really explaining how they conducted the experiments and how they defined shaft synapses. 

      Line 96: 12 dendritic areas from 11 mice at ages between postnatal day 8 to 13. 

      - Do the authors know how many neurons were imaged? It is unclear if the authors patch on all the imaged neurons and only imaged (or analyzed) the dendrites of those patched neurons. If yes, how sparse are the neurons labelled from IUE? From 1B, it looks like there are two cells adjacent to each other. Can the authors really distinguish whether the imaged dendrites are from the patched neuron? 

      The reviewer wonders whether we can tell apart dendrites of patched cells from those of neighboring neurons that were not patched. This is actually very straightforward: the experiment included a depolarization step (see Methods section) which leads to an immediate, but temporary, increase in fluorescence in all of the patched neuron's dendrites, but none of the neighboring dendrites. We have added this information to the Methods section of the new manuscript and now provide an example (Fig S3). Furthermore, as these cells normally fire frequently, it would immediately become clear that an unpatched cell is being imaged if backpropagating action potentials are predominantly observed rather than synaptic signals. The visualization of these synaptic signals is only possible due to the blockade of Na+ channels with QX314 in the intracellular solution (see Methods). 

      - In the methods section, it says 'dendrites were imaged in single plane or small stacks with plane...'. How do the authors do calcium imaging with small stacks of plane using Nikon MP scope? 

      Small stacks were acquired by using the piezo focusing device of our Nikon A1 microscope. Since we combined this fast focusing approach with resonant scanning, we were able to acquire z-stacks of 3-5 frames at a rate of up to 15 Hz (per stack).

      - I also assume this is not chronic imaging, and there are different mice for each postnatal day. If it's true, this is somewhat important for all the correlation analysis as there are only 2 mice for each postnatal day (other than day 12) and day 13 only has 1 animal. 

      Yes, indeed these are not chronic experiments and dendrites imaged on different days are from different neurons and different mice. We agree with the reviewer that if it had been possible to image the same neurons across these developmental stages, we would have detected even clearer correlations. Therefore, we see our results as conservative estimates of the developmental trajectory of the analyzed parameters.

      Line 104 - 109: I don't understand why the authors need to hold at -30mV to facilitate calcium influx through NMDA receptors? I assume this helps them to visualize as many synapses as possible? but wouldn't that also make the 'event frequency' not reflect the true value? 

      Indeed depolarizing the imaged neurons to -30 mV was necessary to get sufficient calcium influx to map synaptic inputs. We don’t think that this affects the frequency of inputs, because the frequency of synaptic inputs is determined by the presynaptic firing rate and the release probability of the presynaptic terminal, which are not affected by the depolarization of the dendrite.

      Figure 2A - It says in the method section that ROIs are manually selected. However, it's not explained what the criteria are. For spine synapses, it's easy to define but for shaft synapses like in Fig 2B, why are there 2 synapses on the shaft? And in Fig 4a, 5a, Fig S1 P13, some of the dendrites are packed with ROIs. What's the distance between those shaft synapses? Can the imaging resolution really separate them? 

      The reviewer asks for a better description of how we identified individual ROIs and thus synapse locations and whether this is actually feasible. We have now added a more detailed description of how we select synaptic sites based on the occurrence of synaptic calcium transients. In addition, we have added a new supplemental Figure (S4) to give the reader an impression of the image quality and the ability to locate individual synapses reliably. We find that separating shaft synapses was possible for inter-synapse distances of ~4 µm or more. The mean shaft synapse distance in our data set is 21 µm.

      - Similar issue applies to Figure 4A that I'm not sure what's the resolution of each 'hot spot'. They all seem very close together. Maybe additional raw dendrite images with fluorescence changes like 1C or 2A could be helpful (or movies in the supplementary?) 

      As the reviewer suggests, we have now added additional supplemental figures to illustrate better how we identify synaptic transmission events as well as spine and shaft synapses.

      - Also for line 164, it says that 76% of high-activity synapses were located on spines. This could also maybe support that only the spine synapses are real synapses and many shaft synapses are actually not synapses and they were just categorized as shaft synapses from manual ROI? 

      We are actually quite sure that shaft synapses are real synapses based on our analysis, since they show repeated synaptic calcium transients that co-occur with barrages of synaptic inputs as measured by patch-clamp recordings. Indeed one would expect to see a number of excitatory synapses on dendritic shafts of pyramidal neurons at these ages based on previous EM studies (Miller and Peters, 1981; Wildenberg et al., 2023).

      - While this might not impact the overall novelty of the paper, I would be curious to know if the authors can still observe the same findings if they only analyze spine synapses. 

      We repeated several analyses with a dataset that contained only spine synapses. For most analyses we observed the expected result: the effect sizes were similar compared to the entire data set, but the power was reduced. For example the effect of distance to closest high-activity neighbor and own activity (Fig 5E, F) was similar, but p-values were around 0.1 (Similar results for Figure 7B). In contrast, the co-activity with synapses within a domain was significantly higher than the co-activity with synapses in other domains also for the spine-synapse only data set. 

      Fig 6 - Does the domain co-activity also contribute to the synaptic current recorded (related to Fig 4). 

      Yes, the synaptic activity measured by calcium imaging contributes to the recorded EPSCs. However, the exact relationship between synaptic inputs measured by calcium imaging and those measured by patch-clamping is complicated by 3 factors: first, during barrages of synaptic inputs many synapses are active simultaneously, both in the mapped dendrites as well as in the un-observed parts of the dendritic arborization. Thus, barrages cannot be broken down into individual events. Second, since our acquisition frequency is ~10 Hz, we can identify the onset of individual synaptic calcium transients with 100-200 ms precision (1 or 2 frames). However, throughout any 100-200 ms period of recording several synapses are active across the entire dendritic arborization such that we cannot assign a given calcium transient to a specific EPSC within a 100-200 ms epoch. Third, due to the limited clamping capacity of in vivo patch recordings, we cannot be certain that individual transmission events in distal dendrites can be resolved in the patch recording as EPSCs.

      Reviewer #2 (Recommendations For The Authors):

      (1) I suggest the authors should provide the number of cells and mice recorded in the figure legends. 

      The number of dendrites and mice is the same across all analyses: 12 dendrites from 11 mice for all experiments, 6/6 for P8-10 and 6/5 for P12-13. All dendrites and synapses (and their ages) are shown in the supplemental figures S1 and S2. We now mention the number of imaged dendrites at the beginning of the Results section and when we split ages for the first time.

      (2) Instead of showing only cartoon illustrations of dendrites in Figures 3-6, I suggest showing the two photon images as well together with the cartoon. 

      The 2-photon images of all dendrites of the dataset are available in Figure S1. To allow the reader to compare the cartoon representations in the main figures and the 2-photon images of each neuron, we have now labeled each dendrite in the dataset (D1-D12, see figures S1 and S2). For every figure, where we show example neurons (cartoons or zoom ins) we now provide this identifier.

      Reviewer #3 (Recommendations For The Authors):

      To address the weaknesses outlined above, we recommend that the authors do the following: 

      • To address concerns about the rigor and reproducibility of the methods specifically related to age comparisons, please confirm the following: 

      - Both age groups were run in parallel by the same researcher(s). 

      Experiments were run partly overlapping and experiments from different age groups were performed in parallel by both researchers.

      - Both age groups were imaged on the same microscope, or animals from each age group were imaged on both microscopes. If it was necessary to use different microscopes for the different age groups for biological or practical reasons, please explain. 

      All experiments were run on the same microscope, a Nikon A1 2-photon microscope. In the original methods description we erroneously mentioned two microscopes (copy and paste error from a previous publication). We corrected that in the revised manuscript.

      - There was no difference in imaging frame rates or other imaging parameters between age groups. If it was necessary to use different parameters for different age groups for biological reasons, please explain. 

      We varied the frame rates somewhat to allow larger z-stacks for some experiments where dendrites traversed different depths; however the mean frame rates were similar between the experiments in P8-10 vs P12-13 dendrites, 8.5 vs 10 Hz, respectively.

      - Images were analyzed blind to age. 

      The analysis was not set up to be performed blind to age. The number of spines and the activity levels clearly show obvious differences between neurons only a few days apart. However, all findings reported in this study related to age - except the increase in synapse density and activity - became apparent to us only after the full set of synaptic transmission events was determined and the analysis was performed on the entire data set, making it unlikely that event detection was biased.

      - There was no difference in the location of analyzed dendrites (e.g. depth from the pia, branch order) between age groups. 

      In all experiments we imaged dendrites of layer 2/3 neurons that were relatively close to the cortical surface to optimize image quality. In addition, we determined the mean distance of the imaged dendritic stretches from the cell body and found that this distance was similar between the dendrites of each age group (Young: 392 +/- 104 µm, Old: 323 +/- 118 µm; mean +/- STD). Therefore, we do not think that sampling bias affected these results.

      • To address general methodological concerns, please provide additional description of the following points: 

      - Please clarify how the visual cortex was identified in P8-13 pups. If there was ambiguity about identifying the visual cortex in these pups, please discuss the implications of this ambiguity. 

      The reviewer asks how we identified V1 in these experiments. We are indeed certain that the neurons were located inside the primary visual cortex. We have ample experience with mapping V1 in these animals based on patterns of spontaneous activity as well as post-hoc stainings. V1 is quite large already at these ages (> 2 mm long and > 1 mm wide) and its extent very consistent across animals. Thus, we would argue it is actually hard to miss.

      - Please clarify how synaptic transmission events were identified in the GCaMP signal. 

      We have now added a more detailed description of how we identify synaptic calcium transients. In addition, we have added a new supplemental Figure (S3) to give the reader an impression of the image quality and the ability to locate individual synapses reliably. 

      - It is acceptable to use the spine vs shaft analysis despite the inevitable difficulty resolving Z-projecting spines, but this caveat should be mentioned in the discussion of the spine vs shaft results. 

      We added a more detailed description of spine and shaft synapse identification, a new supplemental figure (S5) and we now mention the caveat related to the limited z-resolution of 2-photon microscopy in the revised manuscript.

      • Two additional minor details should be clarified in the text of the manuscript: 

      - Please specify the volume of DNA solution injected into each embryo. 

      The injected volume was 1 µl. We added this information in the Methods section of the revised manuscript.

      - In Fig S1, please specify whether the scale bar applies to all images. 

      The scale bar applies to all images. This information was added to the figure legend.

      References

      Leighton AH, Cheyne JE, Houwen GJ, Maldonado PP, De Winter F, Levelt CN, Lohmann C. 2021. Somatostatin interneurons restrict cell recruitment to retinally driven spontaneous activity in the developing cortex. Cell Rep 36:109316. doi:10.1016/j.celrep.2021.109316

      Miller M, Peters A. 1981. Maturation of rat visual cortex. II. A combined Golgi-electron microscope study of pyramidal neurons. J Comp Neurol 203:555–573.

      Siegel F, Heimel JA, Peters J, Lohmann C. 2012. Peripheral and central inputs shape network dynamics in the developing visual cortex in vivo. Current Biology 22:253–258.

      Wildenberg G, Li H, Sampathkumar V, Sorokina A, Kasthuri N. 2023. Isochronic development of cortical synapses in primates and mice. Nat Commun 14:8018. doi:10.1038/s41467-023-43088-3

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This is an interesting and well-written paper reporting on a novel approach to studying cerebellar function based on the idea of selective recruitment using fMRI. The study is well-designed and executed. Analyses are sound and results are properly discussed. The paper makes a significant contribution to broadening our understanding of the role of the cerebellum in human behavior.

      We thank the reviewer for the positive assessment of our paper.

      (1) While the authors provide a compelling case for the link between BOLD and the cerebellar cortical input layer, there remains considerable unexplained variance. Perhaps the authors could elaborate a bit more on the assumption that BOLD signals mainly reflect the input side of the cerebellum (see for example King et al., elife. 2023 Apr 21;12:e81511).

      Our paper is based on the assumption that the cerebellar BOLD signal reflects solely the input to the cerebellum and does not reflect the changes in firing rates of Purkinje cells. This assumption relies on two lines of arguments: Studies that have directly looked at the mechanism of vasodilation in the cerebellum, and studies that try to infer the contributions of different neurophysiological mechanisms to overall cerebellar metabolism (Attwell and Iadecola, 2002).

      Vasodilatory considerations: The mechanism that causes vasodilation in the cerebellum, and hence BOLD signal increases, has been extensively studied: Electrical stimulation of mossy fibers (Gagliano et al., 2022; Mapelli et al., 2017), as well as parallel fibers (Akgören et al., 1994; Iadecola et al., 1996; Mathiesen et al., 1998; Yang and Iadecola, 1997), leads to robust increases in cerebellar blood flow. In contrast to the neocortex, the regulation of blood flow in the cerebellum depends nearly purely on the vasodilator nitric oxide (NO) (Akgören et al., 1994; Yang and Iadecola, 1997), with stellate cells playing a key role in the signaling cascade (Yang et al., 2000).

      Electrical (Mathiesen et al., 2000) and pharmacological (Yang and Iadecola, 1998) stimulation of climbing fibers also leads to robust increases in blood flow. Simultaneous parallel and climbing fiber stimulation seems to combine sub-additively to determine the blood flow changes (K. Caesar et al., 2003).

      Importantly, even dramatic changes in spiking rate of Purkinje cells do not lead to changes in vasodilation. For starters, parallel fiber stimulation leads to blood flow increases, even though the net effect on Purkinje cell firing is inhibitory (Mathiesen et al., 1998). More importantly, complete inhibition of the Purkinje cell using a GABA agonist does not change baseline cerebellar blood flow (Kirsten Caesar et al., 2003). Conversely, even a 200-300% increase in simple (and complex) spike firing rate through application of a GABA antagonist does not show any measurable consequences for blood flow, even though it clearly increases the metabolic rate of oxygen consumption in the tissue (Thomsen et al., 2009, 2004).

      In sum, this extensive set of studies clearly argues that the cerebellar blood flow response is mostly dictated by synaptic input, and that the firing rate of Purkinje cells does not influence vasodilation. Because the BOLD signal is caused by a supply of oxygen over and above the level of oxygen consumption, this would argue that increases in Purkinje cell firing would not lead to BOLD increases. What is less clear is the degree to which changes in BOLD signal during normal activity are determined by changes in mossy fiber or climbing fiber input. Disruption of either pathway leads to 60-70% reductions in the evoked blood flow response during whisker stimulation (Yang et al., 2000; Zhang et al., 2003) – but it remains unclear to what degree this reflects the distribution of contributions in the healthy animal, as these powerful disruptions may have a number of side-effects.

      Metabolic considerations: To estimate the relative contributions of climbing fiber and mossy fiber input to the variations in BOLD signal under natural conditions, it is useful to consider the contributions of different cerebellar processes to the overall metabolism of the cerebellum. Assuming an average firing rate of 40 Hz for mossy fibers, ~3 Hz for granule cells, and 1 Hz for climbing fibers, Howarth et al. (2012, 2010) estimated that the transmission from mossy fibers to granule cells dominates the energy budget with 53%. The subsequent stage, encompassing the transfer of information from granule cells to Purkinje cells, accounts for 32% of energy expenditure. In contrast, integration within Purkinje cells and the spiking (simple and complex) of these cells represent only 15% of the total energy consumption.

      More important for the BOLD signal, however, are the activity-induced variations in metabolic consumption: Purkinje cells fire relatively constantly at a very high frequency (~50 Hz) both during awake periods and during sleep (Shin et al., 2007). When providing a signal to the neocortex, firing rate decreases, actually lowering the metabolic demand. Climbing fibers normally fire at ~0.5 Hz and even during activity rarely fire much above 2 Hz (Streng et al., 2017). In contrast, granule cells show low firing rates during rest (typically <1 Hz) and can spike during activity well above 100 Hz. Combined with the sheer number of granule cells, these considerations would suggest that the vast majority of the variation in metabolic demand is due to mossy fiber input and granule cell activity.

      Overall, we therefore think it is likely that the main determinant of the cerebellar cortical BOLD signal is mossy fiber input and the transmission of information from mossy fibers to granule cells to Purkinje cells. We admit that the degree to which climbing fiber input contributes to BOLD signal changes is much less clear. We can be quite certain, however, that the firing rate of Purkinje cells does not contribute to the cerebellar BOLD signal, as even dramatic changes in the firing rate do not cause any changes in vasodilation. We have clarified our line of reasoning in the paper, and hope this more extensive response here will give the reader a better overview of the pertinent literature.

      (2) The current approach does not appear to take the non-linear relationships between BOLD and neural activity into account.

      Thank you for raising this concern. We did not stress this point in the paper, but one big advantage of our selective recruitment approach is that it is, to some degree, robust against non-linearities in the relationship between neural activity and BOLD signal. This is the case as long as the shape of the non-linearity is similar in the cerebellum and the neocortex. The results of our motor task (Figure 3) provide a clear example of this: The BOLD signal both in the neocortex and cerebellum increases non-linearly as a function of force – the increase from 2.5N to 6N (a 3.5N increase) is larger than the increase from 6N to 10N (a 4N increase). A similar non-linearity can be observed for tapping speed (6, 10 to 18 taps / s). However, within each condition, the relationship between cortical and cerebellar activity is nearly perfectly linear, reflecting the fact that the shape of the non-linearity for the cerebellum and cortex is very similar.

      Most importantly, even if the non-linearity across the two structures is different, any non-linear relationship between neural activity and BOLD signal (of vasodilatory nature) should apply to different conditions (here force and speed increases) similarly. Therefore, if two conditions show overlapping activity levels (as observed for force and speed across medium and high levels, Figure 3), an offset between conditions cannot be caused by a non-linearity in the relationship of cortical and cerebellar activity. Because all conditions are subject to the same non-linearity, all points should lie on a single (likely monotonically increasing) non-linear function. Both for the motor and working memory task, the pattern of results clearly violates this assumption.
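
      The logic of this argument can be illustrated with a toy simulation. Everything in this sketch is hypothetical: the two saturating transforms (`cortical_bold`, `cerebellar_bold`), the gain values, and the activity levels are chosen only for illustration and are not estimates from the data. The sketch shows that when transmission is fixed, two conditions with matched cortical activity yield identical cerebellar BOLD even if the two structures have different non-linearities, whereas a condition-specific change in gain produces an offset.

```python
import math

def cortical_bold(neural):
    # Hypothetical saturating neural-to-BOLD transform for the neocortex
    return 1.0 - math.exp(-neural)

def cerebellar_bold(cerebellar_input):
    # Hypothetical, differently shaped transform for the cerebellum
    return math.tanh(cerebellar_input)

def condition_points(neural_levels, gain):
    """(cortical BOLD, cerebellar BOLD) pairs; cerebellar input = gain * cortical activity."""
    return [(cortical_bold(x), cerebellar_bold(gain * x)) for x in neural_levels]

levels = [0.5, 1.0, 1.5]                      # matched cortical activity levels
force = condition_points(levels, gain=0.8)
speed_fixed = condition_points(levels, gain=0.8)   # same fixed transmission
speed_gated = condition_points(levels, gain=1.2)   # selectively increased input

# Offsets in cerebellar BOLD at matched cortical BOLD
offset_fixed = [s[1] - f[1] for s, f in zip(speed_fixed, force)]
offset_gated = [s[1] - f[1] for s, f in zip(speed_gated, force)]
print(offset_fixed)
print(offset_gated)
```

      With fixed transmission the offsets are zero, so all points fall on one monotonic curve. Only the gated condition departs from that curve, which is the signature the selective recruitment approach looks for.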

      (3) The authors may want to address a bit more the issue of closed loops as well as the underlying neuroanatomy including the deep cerebellar nuclei and pontine nuclei in the context of their current cerebello-cortical correlational approach. But also the contribution of other brain areas such as the basal ganglia and hippocampus. 

      Cortical-cerebellar communication is of course bi-directional. As discussed in King et al. (2023), however, we are restricting our model to the connections from the neocortex to the cerebellum for the following reasons: First, cerebellar BOLD activity likely reflects mostly neocortical input (see our answer to pt. 1), whereas neocortical activity is determined by a much wider array of projections, including striato-thalamo-cortical and cortico-cortical connections. Secondly, the output of the cerebellum cannot be predicted from the BOLD signal of the cerebellar cortex, as it is unlikely that the firing rate of Purkinje cells contributes to the cerebellar BOLD signal (see pt. 1). For these reasons we believe that the relationship between neocortical and cerebellar activity patterns is mostly dictated by the connectivity from cortex to cerebellum, and is therefore best modelled as such. This is now more clearly discussed in a new paragraph (lines 318-323) of the revised manuscript.

      We are also ignoring other inputs to the cerebellum, including the spinal cord, the basal ganglia (Bhuvanasundaram et al., 2022; Bostan and Strick, 2018), hippocampus (Froula et al., 2023; Watson et al., 2019), and amygdala (Farley et al., 2016; Jung et al., 2022; Terburg et al., 2024). In humans, however, the neocortex remains the primary source of input to pontine nuclei. Consequently, it stands as the main structure shaping activity within the cerebellar cortex. While it is an interesting question to what degree the consideration of subcortical structures can improve the prediction of cerebellar activity patterns, we believe that considering the neocortex provides a good first approximation.

      Reviewer #1 (Recommendations):

      (4)  A few sentences to clarify the used models as was done in the King et al. (2024) paper may improve readability.

      We have now added the sentences in the introduction (line 25ff):

      To approach this problem, we have recently developed and tested a range of cortical-cerebellar connectivity models (King et al., 2023), designed to capture fixed, or task-invariant, transmission between neocortex and cerebellum. For each cerebellar voxel, we estimated a regularized multiple regression model to predict its activity level across a range of task conditions (King et al., 2019) from the activity pattern observed in the neocortex for the same conditions. The models were then evaluated in their ability to predict cerebellar activity in novel tasks, again based only on the corresponding neocortical activity pattern. Two key results emerged from this work. First, while rs-FC studies (Buckner et al., 2011; Ji et al., 2019; Marek et al., 2018) have assumed a 1:1 mapping between neocortical and cerebellar networks, models which allowed for convergent input from multiple neocortical regions to a single cerebellar region performed better in predicting cerebellar activity patterns for novel tasks. Second, when given a cortical activation pattern, the best performing model could predict about 50% of the reliable variance in the cerebellar cortex across tasks (King et al., 2023).
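
      A regularized regression of this kind can be sketched for a single cerebellar voxel. This is a minimal illustration under stated assumptions, not the model of King et al.: it uses ridge (L2) regularization as one possible regularizer, only two hypothetical neocortical regions so that the solution can be written in closed form, and made-up activity values. The names `ridge_two_regions` and `predict` are invented for this example.

```python
def ridge_two_regions(X, y, lam):
    """Ridge weights for two cortical predictors: w = (X'X + lam*I)^-1 X'y."""
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    p = sum(r[0] * t for r, t in zip(X, y))
    q = sum(r[1] * t for r, t in zip(X, y))
    det = a * d - b * b
    # Closed-form inverse of the symmetric 2x2 matrix [[a, b], [b, d]]
    return ((d * p - b * q) / det, (a * q - b * p) / det)

def predict(weights, cortical_pattern):
    return sum(w * x for w, x in zip(weights, cortical_pattern))

# Hypothetical training data: rows are task conditions, columns are two cortical regions.
# The voxel receives convergent input from both regions (true weights 0.6 and 0.3).
X_train = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y_train = [0.6, 0.3, 0.9, 1.5]

w_unregularized = ridge_two_regions(X_train, y_train, lam=0.0)
w_ridge = ridge_two_regions(X_train, y_train, lam=1.0)

# Predict the voxel's activity for a novel task from its cortical pattern alone
novel_task = (0.5, 2.0)
print(w_unregularized, predict(w_unregularized, novel_task))
print(w_ridge, predict(w_ridge, novel_task))
```

      Without regularization the weights recover the generating values in this noise-free example. The ridge penalty shrinks them toward zero, which in real data trades some bias for more stable predictions on novel tasks.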

      (5) To what extent does this paper demonstrate the limitations of BOLD in neuroscientific research? 

      The primary objective of this study was to shed light on the problems of interpreting BOLD activation within the cerebellum. The problem that the BOLD signal mostly reflects input to a region is not unique to the cerebellum, but also applies (albeit likely to a lesser degree) to other brain structures. However, the solution we propose here critically hinges on three features of the cerebellar circuitry: a) the mossy fiber input for the cerebellar hemispheres mostly arises from the neocortex, b) the BOLD signal is likely dominated by this mossy fiber input (see pt. 1), and c) there is very little excitatory recurrent activity in the cerebellum, so output activity in the cerebellum does not cause direct activity in other parts of the cerebellum.

      These features motivate us to use a directed cortex->cerebellum connectivity model, which does not allow for any direct connectivity within the cerebellum. While the same approach can also be applied to other brain structures, it is less clear that the approach would yield valid results there. For example, due to the local excitatory recurrent connectivity within neocortical columns, the activity there will also relate to local processing.

      (6) What if the authors reversed their line of reasoning as in that cerebellum activity is matched to map changes in cerebral cortical activity? Perhaps this could provide further evidence for the assumed directional specificity of the task-dependent gating of neocortical inputs. 

      Given (a) that the cerebellar BOLD signal tells us very little about cerebellar output signals, (b) that there are many other input signals to the neocortex that are more powerful than cerebellar inputs, and (c) that there are strong cortical-cortical connections, we believe that this model would be hard to interpret (see also our answer to pt. 3).

      Therefore, while the inversion of the linear task-invariant mapping between cortical and cerebellar activity is a potentially interesting exercise, it is unclear to us at this point what strong predictions we would be able to test with this approach.

      (7) The statement that cerebellar fMRI activity may simply reflect the transmission of neocortical activity through fixed connections can be better explained. Also in the context of using the epiphenomenon (on page 11) in the paper. To what extent is the issue of epiphenomenon not a general problem of fMRI research?

      We have rephrased the introduction of this idea (line 17):

      This means that increases in the cerebellar BOLD signal could simply reflect the automatic transmission of neocortical activity through fixed anatomical connections. As such, whenever a task activates a neocortical region, the corresponding cerebellar region would also be activated, regardless of whether the cerebellum is directly involved in the task or not.

      Epiphenomenal activity: This is indeed a general problem in fMRI research (and in any research that uses neurophysiological recordings, rather than manipulations of activity). We have discussed similar issues in the context of motor activity in ipsilateral motor cortex (Diedrichsen et al., 2009). However, given that we only offer a possible approach to address this issue for the cerebellum (see pt. 5), we thought it best to keep the scope of the discussion focused on this structure.

      Reviewer #2 (Public Review):

      Summary:

      Shahshahani and colleagues used a combination of statistical modelling and whole-brain fMRI data in an attempt to separate the contributions of cortical and cerebellar regions in different cognitive contexts.

      Strengths:

      The manuscript uses a sophisticated integration of statistical methods, cognitive neuroscience, and systems neurobiology.

      The authors use multiple statistical approaches to ensure robustness in their conclusions.

      The consideration of the cerebellum as not a purely 'motor' structure is excellent and important.

      We thank the reviewer for their positive evaluation.

      Weaknesses:

      (1) Two of the foundation assumptions of the model - that cerebellar BOLD signals reflect granule cells > purkinje neurons and that corticocerebellar connections are relatively invariant - are still open topics of investigation. It might be helpful for the reader if these ideas could be presented in a more nuanced light.

      Please see our response to comment 1 of Reviewer 1 for a more extensive and detailed justification of this assumption. We have now also clarified our rationale for this assumption in the paper on lines 10-14. Finally, we now also explicitly raise the possibility that some of the violations of the task-invariant model could be caused by a selective increase of climbing fiber activity in some tasks (line 340).

      (2) The assumption that cortical BOLD responses in cognitive tasks should be matched irrespective of cerebellar involvement does not cohere with the idea of 'forcing functions' introduced by Houk and Wise. 

      We are assuming that you refer to the idea that cerebellar output is an important determinant of the dynamics (and likely also of the magnitude) of neocortical activity. We agree most certainly here. However, we also believe that in the context of our paper, it is justified to restrict the model to the connectivity between the neocortex and the cerebellum only (see reviewer 1, comment 3).

      Furthermore, if increased cerebellar output indeed occurs during the conditions for which we identified unusually high cerebellar activity, it should increase neocortical activity, and bring the relationship of the cerebellar and cortical activity again closer to the predictions of the linear model. Therefore, the identification of functions for which cerebellar regions show selective recruitment is rather conservative.

      Reviewer #2 (Recommendations):

      (3) One of the assumptions stated in the abstract -- that the inputs to the cerebellum may simply be a somewhat passive relay of the outputs of the cerebral cortex -- has been challenged recently by work from Litwin-Kumar (Muscinelli et al., 2023 Nature Neuroscience), which argues for complex computational relationships between cortical pyramidal neurons, pontine nuclei and granule cells, which in turn would have a non-linear impact on the relationship between cortical and cerebellar BOLD. The modelling is based on empirical recordings from Wagner (2019, Cell) which show that the synaptic connections between the cortex and granule cells change as a function of learning, further raising concerns about the assumption that the signals inherent within these two systems should be identical. Whether these micro-scale features are indicative of the macroscopic patterns observed in BOLD is an interesting question for future research, but I worry that the assumption of direct similarity is perhaps not reflective of the current literature. The authors do speak to these cells in their discussion, but I believe that they could also help to refine the authors' hypotheses in the manuscript writ large.

      We absolutely agree with your point. However, we want to make extremely clear here that our hypothesis (that the inputs to the cerebellum are a linear task-invariant function of the outputs of the cerebral cortex) is the Null-hypothesis that we are testing in our paper. In fact, our results show the first empirical evidence that task-dependent gating may indeed occur. In this sense, our paper is consistent with the theoretical suggestion of (Muscinelli et al., 2023).

      You may ask whether a linear task-invariant model of cortical-cerebellar connectivity is not a strawman, given that it is most likely incorrect. However, as we stress in the discussion (line 298-), a good Null-model is a useful model, even if it is (like all models) ultimately incorrect. Without it, we would not be able to determine which cerebellar activity outstrips the linear prediction. The fact that this Null-model itself can predict nearly 50% of the variance in cerebellar activity patterns across tasks at a group level means that it is actually a very powerful model, and hence a much more stringent criterion for evidence of functional involvement than just the presence of activity.
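
      The logic of testing observed cerebellar activity against a linear Null-model can be illustrated with a minimal sketch. It assumes a ridge-regularized linear mapping fitted on synthetic data; the array shapes, the function names (`fit_ridge`, `selective_recruitment`), the regularization strength, and all numbers are illustrative assumptions and do not reproduce the analysis pipeline of the paper.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X'X + alpha*I)^-1 X'Y.

    X: (conditions x cortical regions), Y: (conditions x cerebellar regions).
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def selective_recruitment(X_new, Y_new, W):
    """Residual of observed cerebellar activity relative to the linear prediction.

    Positive residuals mark conditions in which cerebellar activity
    exceeds what the task-invariant mapping predicts from cortical activity.
    """
    return Y_new - X_new @ W


# Synthetic example (all values are made up for illustration).
rng = np.random.default_rng(0)
n_train, n_test, n_cortex, n_cereb = 40, 6, 5, 3
W_true = rng.normal(size=(n_cortex, n_cereb))

# Training conditions follow the linear, task-invariant mapping plus noise.
X_train = rng.normal(size=(n_train, n_cortex))
Y_train = X_train @ W_true + 0.05 * rng.normal(size=(n_train, n_cereb))

# Test conditions follow the same mapping, except that condition 0
# drives cerebellar region 1 beyond the linear prediction.
X_test = rng.normal(size=(n_test, n_cortex))
Y_test = X_test @ W_true
Y_test[0, 1] += 2.0

W_hat = fit_ridge(X_train, Y_train, alpha=0.1)
residual = selective_recruitment(X_test, Y_test, W_hat)

# Locate the condition/region pair with the largest positive deviation.
peak = np.unravel_index(np.argmax(residual), residual.shape)
print("largest deviation at (condition, region):", peak)
```

      In this toy setting the largest residual falls on the one condition and region that were made to deviate from the mapping, which is the sense in which a well-predicting Null-model provides a stringent reference for activity that exceeds the linear prediction.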

      (4) Further to this point, I didn't follow the authors' logic that the majority of the BOLD response in the cerebellum is reflective of granule cells rather than Purkinje cells. I read through each of the papers that were cited in defense of the comment: "The cerebellar BOLD signal is dominated by mossy fiber input with very little contribution from the output of the cerebellar cortex, the activity of Purkinje cells" and found that none of these studies made this same direct conclusion. As such, I suggest that the authors soften this statement, or provide a different set of references that directly confirm this hypothesis. 

      Please see our response to comment 1 of Reviewer 1. We hope the answer provides a more comprehensive overview of the literature, which DOES show that the spiking behavior of Purkinje cells does not influence vasodilation (as opposed to mossy fiber input). We have now clarified our rationale and the exact cited literature on lines 9-14 of the paper.

      (5) Regarding the statement: "As such, whenever a task activates a neocortical region, we might observe activity in the corresponding cerebellar regions regardless of whether the cerebellum is directly involved in the task or not." -- what if this is a feature, rather than a bug? That is, the organisation of the nervous system has been shaped over phylogeny such that every action, via efference copies of motor outputs, is filtered through the complex architecture of the cerebellum in order to provide a feed-forward signal to the thalamus/cortex (and other connected structures). Houk and Wise made compelling arguments in their 1995 Cerebral Cortex paper arguing that these outputs (among other systems) could act as 'forcing functions' on the kinds of dynamics that arise in the cerebral cortex. I am inclined to agree with their hypothesis, where the implication is that there are no tasks that don't (in some way) depend on cerebellar activity, albeit to a lesser or greater extent, depending on the contexts/requirements of the task. I realise that this is a somewhat philosophical point, but I do think it is important to be clear about the assumptions that form the basis of the reasoning in the paper. 

      This is an interesting point. Our way of thinking about cerebellar function does indeed correspond quite well to the idea of forcing functions, that is, the idea that cerebellar output can “steer” cortical dynamics in a particular way. However, based on patient and lesion data, it is also clear that some cortical functions rely much more critically on cerebellar input than others. We hypothesize here that cerebellar activity is higher (as compared to the neocortical activity) when the functions require cerebellar computation.

      We also agree with the notion that cerebellar contribution is likely not an all-or-none issue, but rather a matter of gradation (line 324ff).

      (6) Regarding the logic of expecting the cortical patterns for speed vs. force to be matched -- surely if the cerebellum was involved more in speed than force production, the feedback from the cerebellum to the cortex (via thalamus) could also contribute to the observed differences? How could the authors control for this possibility? 

      Our model currently does not attempt to quantify the contributions of cerebellar output to cortical activity. However, given that cerebellar output is not visible in the BOLD signal of the cerebellum (see reviewer 1, comment 1), we believe that this is a rational approach. As argued in our response to your comment 2, increased cerebellar output in the speed compared to the force condition should bring the activity relationship closer to the linear model prediction. The fact that we find increased cerebellar (as compared to neocortical) activity in the speed conditions suggests that there is indeed task-dependent gating of cortical projections to the cerebellum.

      Akgören N, Fabricius M, Lauritzen M. 1994. Importance of nitric oxide for local increases of blood flow in rat cerebellar cortex during electrical stimulation. Proc Natl Acad Sci U S A 91:5903–5907.

      Attwell D, Iadecola C. 2002. The neural basis of functional brain imaging signals. Trends Neurosci 25:621–625.

      Bhuvanasundaram R, Krzyspiak J, Khodakhah K. 2022. Subthalamic Nucleus Modulation of the Pontine Nuclei and Its Targeting of the Cerebellar Cortex. J Neurosci 42:5538–5551.

      Bostan AC, Strick PL. 2018. The basal ganglia and the cerebellum: nodes in an integrated network. Nat Rev Neurosci 19:338–350.

      Buckner RL, Krienen FM, Castellanos A, Diaz JC, Yeo BTT. 2011. The organization of the human cerebellum estimated by intrinsic functional connectivity. J Neurophysiol 106:2322–2345.

      Caesar K, Gold L, Lauritzen M. 2003. Context sensitivity of activity-dependent increases in cerebral blood flow. Proc Natl Acad Sci U S A 100:4239–4244.

      Caesar K, Thomsen K, Lauritzen M. 2003. Dissociation of spikes, synaptic activity, and activity-dependent increments in rat cerebellar blood flow by tonic synaptic inhibition. Proc Natl Acad Sci U S A 100:16000–16005.

      Farley SJ, Radley JJ, Freeman JH. 2016. Amygdala Modulation of Cerebellar Learning. J Neurosci 36:2190–2201.

      Froula JM, Hastings SD, Krook-Magnuson E. 2023. The little brain and the seahorse: Cerebellar-hippocampal interactions. Front Syst Neurosci 17:1158492.

      Gagliano G, Monteverdi A, Casali S, Laforenza U, Gandini Wheeler-Kingshott CAM, D’Angelo E, Mapelli L. 2022. Non-linear frequency dependence of neurovascular coupling in the cerebellar cortex implies vasodilation-vasoconstriction competition. Cells 11:1047.

      Howarth C, Gleeson P, Attwell D. 2012. Updated energy budgets for neural computation in the neocortex and cerebellum. J Cereb Blood Flow Metab 32:1222–1232.

      Howarth C, Peppiatt-Wildman CM, Attwell D. 2010. The energy use associated with neural computation in the cerebellum. J Cereb Blood Flow Metab 30:403–414.

      Iadecola C, Li J, Xu S, Yang G. 1996. Neural mechanisms of blood flow regulation during synaptic activity in cerebellar cortex. J Neurophysiol 75:940–950.

      Ji JL, Spronk M, Kulkarni K, Repovš G, Anticevic A, Cole MW. 2019. Mapping the human brain’s cortical-subcortical functional network organization. Neuroimage 185:35–57.

      Jung SJ, Vlasov K, D’Ambra AF, Parigi A, Baya M, Frez EP, Villalobos J, Fernandez-Frentzel M, Anguiano M, Ideguchi Y, Antzoulatos EG, Fioravante D. 2022. Novel Cerebello-Amygdala Connections Provide Missing Link Between Cerebellum and Limbic System. Front Syst Neurosci 16:879634.

      King M, Hernandez-Castillo CR, Poldrack RA, Ivry RB, Diedrichsen J. 2019. Functional boundaries in the human cerebellum revealed by a multi-domain task battery. Nat Neurosci 22:1371–1378.

      King M, Shahshahani L, Ivry RB, Diedrichsen J. 2023. A task-general connectivity model reveals variation in convergence of cortical inputs to functional regions of the cerebellum. Elife 12:e81511.

      Mapelli L, Gagliano G, Soda T, Laforenza U, Moccia F, D’Angelo EU. 2017. Granular layer neurons control cerebellar neurovascular coupling through an NMDA receptor/NO-dependent system. J Neurosci 37:1340–1351.

      Marek S, Siegel JS, Gordon EM, Raut RV, Gratton C, Newbold DJ, Ortega M, Laumann TO, Adeyemo B, Miller DB, Zheng A, Lopez KC, Berg JJ, Coalson RS, Nguyen AL, Dierker D, Van AN, Hoyt CR, McDermott KB, Norris SA, Shimony JS, Snyder AZ, Nelson SM, Barch DM, Schlaggar BL, Raichle ME, Petersen SE, Greene DJ, Dosenbach NUF. 2018. Spatial and Temporal Organization of the Individual Human Cerebellum. Neuron 100:977-993.e7.

      Mathiesen C, Caesar K, Akgören N, Lauritzen M. 1998. Modification of activity-dependent increases of cerebral blood flow by excitatory synaptic activity and spikes in rat cerebellar cortex. J Physiol 512(Pt 2):555–566.

      Mathiesen C, Caesar K, Lauritzen M. 2000. Temporal coupling between neuronal activity and blood flow in rat cerebellar cortex as indicated by field potential analysis. J Physiol 523:235–246.

      Muscinelli SP, Wagner MJ, Litwin-Kumar A. 2023. Optimal routing to cerebellum-like structures. Nat Neurosci 26:1630–1641.

      Shin S-L, Hoebeek FE, Schonewille M, De Zeeuw CI, Aertsen A, De Schutter E. 2007. Regular patterns in cerebellar Purkinje cell simple spike trains. PLoS One 2:e485.

      Streng ML, Popa LS, Ebner TJ. 2017. Climbing Fibers Control Purkinje Cell Representations of Behavior. J Neurosci 37:1997.

      Terburg D, van Honk J, Schutter DJLG. 2024. Doubling down on dual systems: A cerebellum–amygdala route towards action- and outcome-based social and affective behavior. Cortex 173:175–186.

      Thomsen K, Offenhauser N, Lauritzen M. 2004. Principal neuron spiking: neither necessary nor sufficient for cerebral blood flow in rat cerebellum. J Physiol 560:181–189.

      Thomsen K, Piilgaard H, Gjedde A, Bonvento G, Lauritzen M. 2009. Principal cell spiking, postsynaptic excitation, and oxygen consumption in the rat cerebellar cortex. J Neurophysiol 102:1503–1512.

      Watson TC, Obiang P, Torres-Herraez A, Watilliaux A, Coulon P, Rochefort C, Rondi-Reig L. 2019. Anatomical and physiological foundations of cerebello-hippocampal interaction. Elife 8:e41896.

      Yang G, Huard JM, Beitz AJ, Ross ME, Iadecola C. 2000. Stellate neurons mediate functional hyperemia in the cerebellar molecular layer. J Neurosci 20:6968–6973.

      Yang G, Iadecola C. 1998. Activation of cerebellar climbing fibers increases cerebellar blood flow: role of glutamate receptors, nitric oxide, and cGMP. Stroke 29:499–507; discussion 507-8.

      Yang G, Iadecola C. 1997. Obligatory role of NO in glutamate-dependent hyperemia evoked from cerebellar parallel fibers. Am J Physiol 272:R1155-61.

      Zhang Y, Forster C, Milner TA, Iadecola C. 2003. Attenuation of activity-induced increases in cerebellar blood flow by lesion of the inferior olive. Am J Physiol Heart Circ Physiol 285:H1177-82.

    1. Author response:

      eLife assessment

      This valuable study reveals how a rhizobial effector protein cleaves and inhibits a key plant receptor for symbiosis signaling, while the host plant counters by phosphorylating the effector. The molecular evidence for the protein-protein interaction and modification is solid, though biological evidence directly linking effector cleavage to rhizobial infection is incomplete. With additional functional data, this work could have implications for understanding intricate plant-microbe dynamics during mutualistic interactions.

      Thank you for this helpful comment. In the revised manuscript version, we will be more prudent with directly linking cleavage of Nod factor receptors by NopT and rhizobial infection.

      We plan to modify the Title, the One-Sentence Summary, Abstract, and Discussion regarding this point.

      Public Reviews:

      Reviewer #1 (Public Review):

      Bacterial effectors that interfere with the inner molecular workings of eukaryotic host cells are of great biological significance across disciplines. On the one hand they help us to understand the molecular strategies that bacteria use to manipulate host cells. On the other hand they can be used as research tools to reveal molecular details of the intricate workings of the host machinery that is relevant for the interaction/defence/symbiosis with bacteria. The authors investigate the function and biological impact of a rhizobial effector that interacts with and modifies, and curiously is modified by, legume receptors essential for symbiosis. The molecular analysis revealed a bacterial effector that cleaves a plant symbiosis signaling receptor to inhibit signaling and the host counterplay by phosphorylation via a receptor kinase. These findings have potential implications beyond bacterial interactions with plants.

      Thank you for highlighting the broad significance of rhizobial effectors in understanding legume-rhizobium interactions. We fully agree with your assessment and will emphasize these points in the revised Introduction and Discussion sections of our manuscript. Specifically, we will expand our Discussion regarding the potential impact of the NopT interaction with symbiotic receptor kinases on plant immune signaling and regarding the general significance of our work.

      Bao and colleagues investigated how rhizobial effector proteins can regulate the legume root nodule symbiosis. A rhizobial effector is described to directly modify symbiosis-related signaling proteins, altering the outcome of the symbiosis. Overall, the paper presents findings that will have a wide appeal beyond its primary field.

      Out of 15 identified effectors from Sinorhizobium fredii, they focus on the effector NopT, which exhibits proteolytic activity and may therefore cleave specific target proteins of the host plant. They focus on two Nod factor receptors of the legume Lotus japonicus, NFR1 and NFR5, both of which were previously found to be essential for the perception of rhizobial Nod factor and for the induction of symbiotic responses such as bacterial infection thread formation in root hairs and root nodule development (Madsen et al., 2003, Nature; Tirichine et al., 2003, Nature). The authors present evidence for an interaction of NopT with NFR1 and NFR5. The paper aims to characterize the biochemical and functional consequences of these interactions and the phenotype that arises when the effector is mutated.

      Thank you for your positive feedback on our manuscript. In the revised Introduction and Discussion sections, we plan to better emphasize the interdisciplinary significance of our work. We will show how the knowledge gained from our study can contribute to a better understanding of microbial interactions with eukaryotic hosts in general, which may have a stimulating effect on future research in various research areas such as pathogenesis and immunity.

      To ensure that the readers can easily follow the rationale behind our experiments, we will improve the Results section and provide more detailed explanations of how NopT among 15 examined effectors was selected. Additionally, we will provide more background information on NopT and the roles of NFR1 and NFR5 in symbiotic signaling in the Introduction section. As suggested, we will include the references Madsen et al. (2003) and Tirichine et al. (2003) as well as additional references on rhizobial NopT proteins into our revised manuscript version.

      Evidence is presented that in vitro NopT can cleave NFR5 at its juxtamembrane region. NFR5 also appears to be cleaved in vivo, and NFR1 appears to inhibit the proteolytic activity of NopT by phosphorylating NopT. When NFR5 and NFR1 are ectopically over-expressed in leaves of the non-legume Nicotiana benthamiana, they induce cell death (Madsen et al., 2011, Plant Journal). Bao et al. found that this cell death response is inhibited by the coexpression of nopT. Mutation of nopT alters the outcome of rhizobial infection in L. japonicus. These conclusions are well supported by the data.

      We appreciate that you recognize the value of our data.

      The authors present evidence supporting the interaction of NopT with NFR1 and NFR5. In particular, there is solid support for cleavage of NFR5 by NopT (Figure 3) and the identification of NopT phosphorylation sites that inhibit its proteolytic activity (Figure 4C). Cleavage of NFR5 upon expression in N. benthamiana (Figure 3A) requires appropriate controls (inactive mutant versions), which have been provided, since Agrobacterium, as a bacterium closely related to rhizobia, might increase defense-related proteolytic activity in the plant host cells.

      Thank you for recognizing the use of an inactive NopT variant in Figure 3A. In fact, increased activity of plant proteases induced by Agrobacterium is an important point that should not be neglected. We plan to mention this aspect in our revised Discussion.

      In the context of your comments, we are planning to make the following improvements to the manuscript:

      (1) We will add a more detailed description of the experimental conditions under which the cleavage of NFR5 by NopT was observed in vitro and in vivo.

      (2) We plan to provide more comprehensive data on the phosphorylation of NopT by NFR1, including phosphorylation assays and mass spectrometry results. These additional data support the proposed mechanism by which NFR1 inhibits the proteolytic activity of NopT.

      (3) We will expand the Discussion on the cell death response induced by ectopic expression of NFR1 and NFR5 in Nicotiana benthamiana. We will include more details from Madsen et al. (2011) to contextualize our findings with published literature.

      We believe these additions and clarifications will enhance the clarity and impact of our findings.

      Key results from N. benthamiana appear consistent with data from recombinant protein expression in bacteria. For the analysis in the host legume L. japonicus transgenic hairy roots were included. To demonstrate that the cleavage of NFR5 occurs during the interaction in plant cells the authors build largely on western blots. Regardless of whether Nicotiana leaf cells or Lotus root cells are used as the test platform, the Western blots indicate that only a small proportion of NFR5 is cleaved when co-expressed with nopT, and most of the NFR5 persists in its full-length form (Figures 3A-D). It is not quite clear how the authors explain the loss of NFR5 function (loss of cell death, impact on symbiosis), as a vast excess of the tested target remains intact. It is also not clear why a large proportion of NFR5 is unaffected by the proteolytic activity of NopT. This is particularly interesting in Nicotiana in the absence of Nod factor that could trigger NFR1 kinase activity.

      Thank you for your comments regarding the cleavage of NFR5 and its functional implications. In the revised version, we will change our manuscript taking into account the following considerations:

      (1) We acknowledge that the Western blots indicate that only a small proportion of NFR5 is cleaved when co-expressed with NopT. It is worth noting in this context that the proteins were expressed at high levels, which likely do not reflect the natural situation in L. japonicus. The low amount of cleaved NFR5 in our Western blots with transformed N. benthamiana or L. japonicus cells may thus simply reflect an experimental effect due to high NFR5 protein synthesis. We suggest that the presence of high amounts of intact NFR5 does not have a significant functional impact on plant responses (cell death in N. benthamiana, rhizobial infection of L. japonicus), whereas NFR5 cleavage (or formation of NFR5 cleavage products) may be crucial for the observed phenotypic changes. The fraction of cleaved NFR5, although small, may be sufficient to disrupt crucial signaling pathways. We will address possible differences between experimental and natural protein levels in our revised Discussion.

      (2) In our work, we studied three biochemical aspects of NopT: (i) physical binding of NopT to NFR1 and NFR5, (ii) proteolytic cleavage of NFR5 by NopT, and (iii) phosphorylation of NopT by NFR1. These three biochemical properties appear to influence each other. Phosphorylation of NopT by NFR1 appears to reduce its proteolytic activity, thereby counteracting NFR5 degradation by NopT (NFR5 homeostasis). Moreover, as NopT is a phosphorylation substrate for NFR1, NopT probably interferes with kinase-mediated downstream responses of NFR1. Thus, NFR5 cleavage activity of NopT appears to be only one feature of NopT. We plan to mention these considerations in our revised Discussion.

      It is also difficult to evaluate how the ratios of cleaved and full-length protein change when different versions of NopT are present without a quantification of band strengths normalized to loading controls (Figure 3C, 3D, 3F). The same is true for the blots supporting NFR1 phosphorylation of NopT (Figure 4A).

      Thank you for pointing out this aspect. Following your recommendation, we will quantify the band intensities for cleaved and full-length NFR5 in the experiments with different versions of NopT. These values will be normalized to loading controls. Similarly, the Western blots supporting NFR1 phosphorylation of NopT will be quantified. The data for normalized band intensities will be included in the revised figures. The quantifications will provide a clearer understanding of how the ratios of cleaved to full-length proteins change with different NopT variants and will also indicate the extent to which NopT is phosphorylated by NFR1.
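
      As an illustration of the kind of quantification described here, the following minimal sketch normalizes band intensities to a loading control and computes the cleaved fraction per lane. The lane labels, the intensity values, and the function names are hypothetical placeholders chosen for illustration; they are not data or methods from the manuscript.

```python
def normalized_intensity(band, loading_control):
    """Band intensity divided by the loading-control intensity of the same lane."""
    if loading_control <= 0:
        raise ValueError("loading control intensity must be positive")
    return band / loading_control


def cleaved_fraction(cleaved, full_length):
    """Share of the total signal (cleaved + full-length) that is cleaved."""
    total = cleaved + full_length
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return cleaved / total


# Hypothetical densitometry readings (arbitrary units), for illustration only.
lanes = {
    "active NopT": {"full": 900.0, "cleaved": 100.0, "loading": 1000.0},
    "inactive NopT": {"full": 1140.0, "cleaved": 0.0, "loading": 1200.0},
}

summary = {}
for name, lane in lanes.items():
    summary[name] = {
        "full_norm": normalized_intensity(lane["full"], lane["loading"]),
        "cleaved_norm": normalized_intensity(lane["cleaved"], lane["loading"]),
        "cleaved_fraction": cleaved_fraction(lane["cleaved"], lane["full"]),
    }

for name, values in summary.items():
    print(name, values)
```

      Normalizing each band to the loading control of its own lane makes lanes comparable despite unequal loading, while the cleaved fraction is independent of loading because both bands come from the same lane.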

      It is clear that mutation of nopT results in a quantitative infection phenotype. Nodule primordia and infection threads are still formed when L. japonicus plants are inoculated with ∆nopT mutant bacteria, but it is not clear if these primordia are infected or develop into fully functional nodules (Figure 5). A quantification of the ratio of infected and non-infected nodules and primordia would reveal whether NopT is only active at the transition from infection focus to thread or perhaps also later in the bacterial infection process of the developing root nodule.

      Thank you for pointing this out. In the revised version of our manuscript, we will provide data showing that there are no obvious differences in nodule formation in plants inoculated with ∆nopT and wild-type NGR234, respectively. However, quantification of infection threads containing our GFP-labeled rhizobia in primordia and nodules would be difficult to perform due to strong autofluorescence signals in these tissues. The main goal of our study was to identify and characterize the interaction between NopT and Nod factor receptors. We therefore believe that an in-depth analysis of the bacterial infection process at later symbiotic stages is out of the scope of the present work.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript presents data demonstrating NopT's interaction with Nod Factor Receptors NFR1 and NFR5 and its impact on cell death inhibition and rhizobial infection. The identification of a truncated NopT variant in certain Sinorhizobium species adds an interesting dimension to the study. These data try to bridge the gaps between classical Nod-factor-dependent nodulation and T3SS NopT effector-dependent nodulation in legume-rhizobium symbiosis. Overall, the research provides interesting insights into the molecular mechanisms underlying symbiotic interactions between rhizobia and legumes.

      Strengths:

      The manuscript nicely demonstrates NopT's proteolytic cleavage of NFR5, regulated by NFR1 phosphorylation, promoting rhizobial infection in L. japonicus. Intriguingly, authors also identify a truncated NopT variant in certain Sinorhizobium species, maintaining NFR5 cleavage but lacking NFR1 interaction. These findings bridge the T3SS effector with the classical Nod-factor-dependent nodulation pathway, offering novel insights into symbiotic interactions.

      We appreciate that you recognize the value of our manuscript.

      Weaknesses:

      (1) In the previous study, when transiently expressed NopT alone in Nicotiana tobacco plants, proteolytically active NopT elicited a rapid hypersensitive reaction. However, this phenotype was not observed when expressing the same NopT in Nicotiana benthamiana (Figure 1A). Conversely, cell death and a hypersensitive reaction were observed in Figure S8. This raises questions about the suitability of the exogenous expression system for studying NopT proteolysis specificity.

      We appreciate your attention to these plant-specific differences. In view of your comments, we plan to revise the Discussion and explain the different expression systems used for studying NopT effects in planta. Previous studies showed that NopT expressed in tobacco (N. tabacum) or in specific Arabidopsis thaliana ecotypes (with PBS1/RPS5 genes) causes rapid cell death (Dai et al. 2008; Khan et al. 2022). Our data shown in Fig. S8 confirm these findings. As cell death (effector-triggered immunity) is usually associated with induction of protease activities, we considered N. tabacum and A. thaliana plants unsuitable for testing NFR5 cleavage by NopT. In fact, no NopT/NFR5 experiments were performed with these plants in our study. In contrast, the expression of NopT in Nicotiana benthamiana did not lead to cell death in our experiments. Khan et al. (2022) also reported that cell death does not occur in N. benthamiana unless the cells were transformed with PBS1/RPS5 constructs. Thus, N. benthamiana is a suitable expression system to analyze NopT protease activity on co-expressed substrates. Our revision aims to better explain the advantages of the N. benthamiana expression system for studying NopT-mediated proteolysis of NFR5.

      (2) NFR5 Loss-of-function mutants do not produce nodules in the presence of rhizobia in lotus roots, and overexpression of NFR1 and NFR5 produces spontaneous nodules. In this regard, if the direct proteolysis target of NopT is NFR5, one could expect the NGR234's infection will not be very successful because of the Native NopT's specific proteolysis function of NFR5 and NFR1. Conversely, in Figure 5, authors observed the different results.

      Our inoculation experiments clearly show that NopT of NGR234 has a negative effect on the formation of infection foci (Fig. 5A) and nodule primordia (Fig. 5E). Our biochemical analysis indicates that NopT targets the NFR1/NFR5 complex, which most likely impairs activation of downstream responses such as NIN gene expression. Accordingly, NIN promoter activity was found to be higher in roots inoculated with the ΔnopT mutant than in roots inoculated with wild-type NGR234 (Fig. 5B and 5D). It is therefore plausible that NopT impairs rhizobial infection of L. japonicus due to inhibition of NFR1/NFR5 functions. We agree with this Reviewer that it can be expected that “NGR234's infection will not be very successful”. Fig. 5 confirms that the ΔnopT mutant is indeed a better symbiont, and we do not think that we obtained “unexpectedly different results”. In the revised version, we will try to formulate our Discussion text better in order to avoid any misunderstandings. Furthermore, we will use the figure title “NopT dampens rhizobial infection…” instead of “NopT regulates rhizobial infection…”. We are also considering changing the title of our manuscript.

      (3) In Figure 6E, the model illustrates how NopT digests NFR5 to regulate rhizobia infection. However, it raises the question of whether it is reasonable for NGR234 to produce an effector that restricts its own colonization in host plants.

      We acknowledge the potential paradox of NGR234 producing an effector that appears to restrict its own colonization in host plants. In fact, depending on the host plant, most rhizobial effectors are “double-edged swords” that play either a positive or negative role in the symbiosis. In response to your comment, we will discuss the possibility that NopT may confer selective advantages in interactions between NGR234 and host plants where NopT plays a positive symbiotic role (Dai et al. 2008; Kambara et al. 2009). Inhibition of NFR1/NFR5 functions by NopT in these host plants could be a feedback response in cells in which symbiotic signaling has already started. It is tempting to speculate that the interaction between NopT and Nod factor receptors reduces Nod factor perception and downstream signaling to avoid a possible overreaction of symbiotic signaling, which may result in hypernodulation or formation of empty nodules without bacteria. Furthermore, it is tempting to speculate that NopT targets not only Nod factor receptors but also other host proteins to promote symbiosis, e.g. by suppressing excessive immune responses triggered by hyperinfection of rhizobia. In our revised manuscript, we will highlight the need for further investigations to elucidate the precise mechanisms underlying the observed infection phenotype and the role of NopT in modulating symbiotic signaling pathways.

      (4) The failure to generate stable transgenic plants expressing NopT in Lotus japonicus is surprising, considering the manuscript's claim that NopT specifically proteolyzes NFR5, a major player in the response to nodule symbiosis, without being essential for plant development.

      Thank you for your comments. The failure to obtain L. japonicus plants constitutively expressing NopT was indeed surprising and suggests that NopT targets not only NFR5 but also other proteins in L. japonicus. The number of NopT substrates in plants could be greater than assumed. For example, we show in our work that NopT can cleave AtLYK5 and LjLYS11. In our manuscript, we don’t provide protocols and data on our efforts to construct L. japonicus plants stably expressing NopT. Indeed, it cannot be completely ruled out that the observed failure is not due to NopT expression, but rather to other factors that influence the transformation and regeneration of explants into whole plants. Our results should therefore not be over-interpreted. We consider a discussion of our failed transformation experiments to be somewhat preliminary and not central to this manuscript. Therefore, we plan to modify our Discussion and delete the sentence reporting that stable transgenic plants expressing NopT have not been successfully generated.

    1. Reviewer #2 (Public Review):

      Summary:

      The goal of the paper was to trace the transitions hippocampal microglia undergo along aging. ScRNA-seq analysis allowed the authors to predict a trajectory and hypothesize about possible molecular checkpoints, which keep the pace of microglial aging. E.g. TGFb1 was predicted as a molecule slowing down the microglial aging path and indeed, loss of TGFb1 in microglia led to premature microglia aging, which was associated with premature loss of cognitive ability. The authors also used the parabiosis model to show how peripheral, blood-derived signals from the old organism can "push" microglia forward on the aging path.

      Strengths:

      A major strength and uniqueness of this work is the in-depth single-cell dataset, which may be a useful resource for the community, as well as the data showing what happens to young microglia in heterochronic parabiosis setting and upon loss of TGFb in their environment.

      Weaknesses:

      That said, given what we recently learned about microglia isolation for RNA-seq analysis, there is a danger that some of the observations are a result of not age, but cell stress from sample preparation (enzymatic digestion 10 min at 37°C; e.g. PMID: 35260865). Changes in cell state distribution along aging were made based on scRNA-seq and were not corroborated by any other method, such as imaging of cluster-specific marker expression in microglia at different ages. This analysis would allow confirming the scRNA-seq data and would also give us an idea of where the subsets are present within the hippocampus, and whether there is any interesting distribution of cell states (e.g. some are present closer to stem cells?). Since TGFb is thought to be crucial to microglia biology, it would be valuable to include more analysis of the mice with microglia-specific Tgfb deletion e.g. what was the efficiency of recombination in microglia? Did their numbers change after induction of Tgfb deletion in Cx3cr1-creERT2::Tgfb-flox mice?

      Overall:

      In general, I think the authors did a good job following the initial observations and devised clever ways to test the emerging hypotheses. The resulting data are an important addition to what we know about microglial aging and can be fruitfully used by other researchers, e.g. those working on microglia in a disease context.

    1. There's a lot of really great content here. But, for readers like me (the technical/design/engineering/research side of the visualization community), I think the writing isn't landing with quite the impact that it could for a few reasons:

      (1) In my interdisciplinary collaborations, I've noticed a difference in writing styles/norms between the humanities and the design/engineering disciplines. The latter tend to favor a top-down argument structure (e.g., a crisply articulated thesis that is then unpacked via clearly signposted topic sentences). I think that's because readers like me are trying to figure out how to operationalize the things we're reading/learning. So, right from the get-go, we need a clearly articulated conceptual model so that, over the course of the rest of the writing, we can figure out how to integrate it with our existing mental models of practice/research.

      In contrast, this piece takes a very bottom-up approach to the argument. For me, the experience of reading bottom-up writing is of assembling a mental model that feels more like a wobbly house of cards: ad hoc, duct taped together, and needing to constantly swap/rearrange it as more pieces of the conceptual contribution reveal themselves to me.

      As a concrete example, for the first third of this chapter, I wasn't actually sure what I was supposed to be taking away. I almost wondered whether I should suggest titling the chapter "preface" instead of "introduction" because it opens by being focused inwardly (i.e., on the presentation of the homepage) in a way I'm more accustomed to with prefaces than introductions. Although that whole chunk of writing was very pleasant to read (which may also be a function of the fact that I had the pleasure of meeting y'all and learning about how the project came together!), I wasn't entirely sure what this chunk was hoping to do/communicate—or how it was hoping to influence my thinking.

      (2) Related to the first point, while I personally find the exploration of a visualization counterhistory exciting and thought-provoking, I wonder if the writing could motivate the goals of the counterhistory a bit more explicitly and clearly? That is, if someone isn't already bought into valuing the history of the field (or doesn't know how a counterhistory may/should affect their current practice today), how might the writing persuade them to care? Or, put another way, how can the writing speak and evangelize to an audience who is open-minded, but not yet "on side"? To me, this feels like a particularly important thing for an introductory chapter to do, and it seems missing in the current iteration.

      (3) I'm on the fence about how central a role Tufte is given here. I think this depends on the audience you are trying to reach—I'm not sure that many (most?) visualization researchers/designers/practitioners (i.e., visualization "thought leaders") consider Tufte to play as influential a role as this chapter purports him to do. If this was the core audience, then I think the focus on Tufte could be watered down without losing much of the overall framing of "counterhistory"—because I think what this chapter describes is very much the history the field tells itself (relatively independently of Tufte, I think?).

      On the other hand, if the intended audience are the folks one hop removed (i.e., people who produce/consume visualizations in their daily lives/jobs, but aren't necessarily plugged into conversations on the bleeding edge), I think Tufte serves as a useful foil. But, something about his treatment in this chapter feels a little caricatured to me (and I say this as someone relatively ambivalent about his role). I'm not quite able to put my finger on what specifically about the writing left me with that feeling, though.

      (4) Starting with the "Two Stories of Data Visualization" (and particularly the subsequent chapter on "Every Datapoint is a Person"), I wondered whether the target of the book's critique is indeed visualization (i.e., the graphic representation of data) or whether it's more fundamental and broader practices of data (i.e., definition, collection, etc. more similar to the set of issues y'all discussed in Data Feminism). I really enjoyed all of the detail and discussion here—and I was convinced about the role that data played. But, I was perhaps less convinced about visualization's central/facilitating/empowering role in it. It's likely impossible to fully disentangle data from its representation (as the data table examples do a great job) but, if the book wants to maintain visualization as its target, I wonder if the writing could be refined a little to make its focus clearer/crisper?

      (5) I wonder if the writing can be more explicit about its positionality? I think some of the early sections (and occasional passages throughout) set up an incorrect expectation for me of a much broader (i.e., more global) counterhistory. So, I was then surprised that this chapter maintains a relatively fixed focus on Western history. In fact, I might go further to say that the writing seems to be particularly fixated on an American point of view (e.g., I raised an eyebrow at the description of the United States as "the exemplary" colonial state; as a non-American and citizen of a former colonized nation, I would consider the British Empire to be the ultimate colonial power...). I think this focus is fine if the writing is explicit that it is primarily concerned with developing a counterhistory rooted in the West (and, at that, the United States).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study identifies differential Orsay virus infection of C. elegans when animals are fed on different bacteria. The evidence for this is, however, incomplete, as experiments to control for feeding rate and bacterial pathogenicity are needed as well as direct quantification of viral load.

      We appreciate that the editors and reviewers felt that our manuscript addressed an important problem. We appreciate the constructive critiques provided by the reviewers and have worked to address all of the concerns, including a number of additional experiments as indicated below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      This manuscript explores the importance of food type on virus infection dynamics using a nematode virus as a model system. The authors demonstrate that susceptibility to viral infection can change by several orders of magnitude based on the type of bacterial food that potential hosts consume. They go on to show that, for the bacterial food source that reduces susceptibility, the effect is modulated by quorum sensing molecules that the bacteria produce. 

      Strengths: 

      This manuscript shows convincingly that nematode susceptibility to viral infection changes by several orders of magnitude (i.e. doses must be increased by several orders of magnitude to infect the same fraction of the population) depending on the bacterial food source on which hosts are reared. The authors then focus on the bacteria that reduce host susceptibility to viral infection and demonstrate that certain bacterial quorum-sensing compounds are required to see this effect of reduced susceptibility. Overall, sample sizes are large, methods are generally rigorous, experiments are repeated, and patterns are clear. 

      Weaknesses: 

      Although the molecular correlate of reduced susceptibility is identified (i.e. quorum sensing compounds) the mechanisms underlying this effect are missing. For example, there are changes in susceptibility due to altered nutrition, host condition, the microbiome, feeding rate, mortality of infected hosts, etc. In addition, the authors focus almost entirely on the reduction in susceptibility even though I personally find the increased susceptibility generated when reared on Ochrobactrum to be much more exciting. 

      I was a bit surprised that there was no data on basic factors that could have led to reductions in susceptibility. In particular, data on feeding rates and mortality rates seem really important. I would expect that feeding rates are reduced in the presence of Pseudomonas. Reduced feeding rates would translate to lower consumed doses, and so even though the same concentration of virus is on a plate, it doesn't mean that the same quantity of virus is consumed. Likewise, if Pseudomonas is causing mortality of virus-infected hosts, it could give the impression of lower infection rates. Perhaps mortality rates are too small in the experimental setup to explain this pattern, but that isn't clear in the current version of the manuscript. Is mortality greatly impacted by knocking out quorum-sensing genes? Also, the authors explored susceptibility to infection, but completely ignored variation in virus shedding. 

      We have added data on feeding rates (Line numbers 141-148 and 176-182, Supplementary Figure 4). After six hours of exposure no differences in feeding rate were observed. After 24 hours minor differences emerged between O. vermis MYb71 and each Pseudomonas species, however feeding rate inversely correlated with susceptibility to Orsay virus in that O. vermis MYb71 displayed the lowest feeding rate while P. aeruginosa PA14 displayed the highest feeding rate.

      We have also added data on mortality rates (Line numbers 183-200, Supplementary Figure 6). No significant mortality was observed within the 24-hour exposure period used for our Orsay infection and transmission assays. P. aeruginosa virulence is dependent upon temperature and as our assays are done at 20°C rather than 25°C this may account for reduced mortality compared to other published results. Regardless, we noted that O. vermis MYb71 killed C. elegans as quickly as P. aeruginosa PA14 under these conditions and these two bacteria led to the shortest lifespan compared to the other tested bacteria. Interestingly, P. lurida MYb11 was observed to be more virulent than P. aeruginosa PA01 under these conditions. These results suggest that there is no direct correlation between mortality and susceptibility to Orsay virus, although it does not rule out that virulence effects unique to each bacterium could contribute to alterations in host susceptibility.  

      The reviewer is correct to assert that differences in viral shedding could exist. However, our susceptibility assays using exogenous Orsay virus remove this source of variation and yet we still observe the same trends such that O. vermis MYb71 promotes infection while P. lurida MYb11, P. aeruginosa PA01, and P. aeruginosa PA14 attenuate infection. Further we measured the amount of virus shed into the lawns in the presence of different bacteria and did not observe differences in shed virus that could account for the differences we observe in incidence proportion (Line numbers 241-254, Fig. 3 F). Viral stability could be an issue in both the transmission and susceptibility assays. We therefore tested viral stability in the presence of E. coli, P. lurida MYb11, P. aeruginosa PA01, and P. aeruginosa PA14 and successfully recovered virus from all lawns, suggesting virus is not rapidly degraded in the presence of any bacterium (Fig. 3D and 3E). However, we noted that the recovery of Orsay virus from lawns of E. coli OP50 and P. lurida MYb11 within 30 minutes was decreased compared to a spike-in control suggesting recovery from each lawn is not equivalent. This complicates a comparison of viral stability and shedding rates between different bacteria, but our ability to recover substantial amounts of virus in the shedding assay from the three Pseudomonas strains we examined precludes a substantial decrease in shedding rates as an explanation for the robust attenuation of Orsay virus observed in transmission assays.  

      I was also curious why the authors did not further explore the mechanism behind the quorum-sensing effect. Not sure whether this is possible, but would it be possible to add spent media to the infection plates where the spent media was from Pseudomonas that produce the quorum sensing compound but the plates contain OP50, Pseudomonas, or the quorum sensing knockout of Pseudomonas? That would reveal whether it is the compound itself vs. something that the compound does.

      We observed that quorum sensing mutants suppressed the attenuation of Orsay virus infection and we agree that this could be a consequence of the compounds themselves, or more likely an effect of the downstream consequences of quorum signaling. We added culture supernatant from each bacterium to lawns of E. coli OP50 to assess the effect on host susceptibility and did not observe any potent effect (Line numbers 311-318, Supplementary Figure 9). This supports an interpretation that it is not the compound itself that is responsible, however we cannot rule out that the compounds themselves may be responsible if provided at a higher concentration.

      In addition, I was surprised by how much focus there was on the attenuation of infection and how little there was on the enhancement of infection. To me, enhancement seems like the more obvious thing to find a mechanism for -- is the bacteria suppressing immunity, preventing entry to gut cells, etc? 

      We are also intrigued by the enhancement of infection by Ochrobactrum spp.; however, we chose to focus on attenuation given the availability of Pseudomonas aeruginosa genetic mutants for study. We have added data (Line numbers 371-402, Figure 7, and Supplemental Figure 12) that inform our current hypothesis regarding Ochrobactrum-mediated enhancement of Orsay virus infection.

      I was a bit concerned about the "arbitrary units", which were used without any effort to normalize them. David Wang and Hongbing Jiang have developed a method based on tissue culture infectious dose 50 (TCID50) that can be used to measure infectious doses in a somewhat repeatable way. Without some type of normalization, it is hard to imagine how this study could be repeated. The 24-hour time period between exposure and glowing suggests very high doses, but it is still unclear precisely how high. Also, it is clear that multiple batches of virus were used in this study, but it is entirely unclear how variable these batches were. 

      We have clarified that we also measured the (TC)ID50 for every batch of virus used similar to the methods suggested by the Wang laboratory (Line numbers 107-119 and 499-506). We have added a figure showing the virus batch variability for all batches used in this study (Supp. Fig. 2). We have further clarified that the arbitrary units correspond to the actual microliters of viral filtrate used during infection and provided clear methods to replicate our viral batch production to assist with issues of reproducibility (Line numbers 107-119 and 499-506).

      The authors in several places discuss high variability or low variability in incidence as though it is a feature of the virus or a feature of the host. It isn't. For infection data (or any type of binomial data) results are highly variable in the middle (close to 50% infection) and lowly variable at the ends (close to 0% or 100% infection). This is a result that is derived from a binomial distribution and it should not be taken as evidence that the bacteria or the host affect randomness. If you were to conduct dose-response experiments, on any of your bacterial food source treatments, you would find that variability is lowest at the extremely high and extremely low doses and it is most variable in the middle when you are at doses where about 50% of hosts are infected. 
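
      The reviewer's point follows directly from the variance of a binomial proportion, p(1 - p)/n, which peaks at p = 0.5 and shrinks towards 0 at either extreme. A minimal sketch of that relationship is given here; the plate size `n_hosts` and the list of infection probabilities are hypothetical values chosen only for illustration, not numbers from the manuscript.

```python
import math


def binomial_sd_of_proportion(p: float, n: int) -> float:
    """Standard deviation of the observed infected fraction when each of
    n hosts is infected independently with probability p."""
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical number of animals scored per plate (illustration only).
n_hosts = 100

# Expected plate-to-plate spread at low, intermediate, and high infection levels.
for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    sd = binomial_sd_of_proportion(p, n_hosts)
    print(f"p = {p:.2f}  SD of infected fraction = {sd:.3f}")
```

      Under these assumptions the spread is largest near 50% infection and smallest near 0% or 100%, regardless of which bacterium or host is involved.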

      Thank you for pointing this out, we have removed all reference to this throughout the manuscript.

      Reviewer #2 (Public Review):

      Summary and Major Findings/Strengths:

      Across diverse hosts, microbiota can influence viral infection and transmission. C. elegans is naturally infected by the Orsay virus, which infects intestinal cells and is transmitted via the fecal-oral route. Previous work has demonstrated that host immune defense pathways, such as antiviral RNAi and the intracellular pathogen response (IPR), can influence host susceptibility to virus infection. However, little is known about how bacteria modulate viral transmission and host susceptibility. 

      In this study, the authors investigate how diverse bacterial species influence Orsay virus transmission and host susceptibility in C. elegans. When C. elegans is grown in the presence of two Ochrobactrum species, the authors find that animals exhibit increased viral transmission, as measured by the increased proportion of newly infected worms (relative to growth on E. coli OP50). The presence of the two Ochrobactrum species also resulted in increased host susceptibility to the virus, which is reflected by the increased fraction of infected animals following exposure to the exogenous Orsay virus. In contrast, the presence of Pseudomonas lurida MYb11, as well as Pseudomonas PA01 or PA14, attenuates viral transmission and host susceptibility relative to E. coli OP50. For growth in the presence of P. aeruginosa PA01 and PA14, the attenuated transmission and susceptibility are suppressed by mutations in regulators of quorum sensing and the gacA two-component system. The authors also identify six virulence genes in P. aeruginosa PA14 that modulate host susceptibility to virus and viral transmission, albeit to a lesser extent. Based on the findings in P. aeruginosa, the authors further demonstrate that deletion of the gacA ortholog in P. lurida results in loss of the attenuation of viral transmission and host susceptibility. 

      Taken together, these findings provide important insights into the species-specific effects that bacteria can have on viral infection in C. elegans. The authors also describe a role for Pseudomonas quorum sensing and virulence genes in influencing viral transmission and host susceptibility. 

      Major weaknesses: 

      The manuscript has several issues that need to be addressed, such as insufficient rigor of the experiments performed and questions about the reproducibility of the data presented in some places. In addition, confounding variables complicate the interpretations that can be made from the authors' findings and weaken some of the conclusions that are stated in the manuscript. 

      (1) The authors sometimes use pals-5p::GFP expression to indicate infection, however, this is not necessarily an accurate measure of the infection rate. Specifically, in Figures 4-6, the authors should include measurements of viral RNA, either by FISH staining or qRT-PCR, to support the claims related to differences in infection rate. 

      Following the reviewer's comment, we have corroborated our pals-5p::GFP data using FISH staining (Line numbers 291-292 and 357-359, Figure 4D & 4E, and Figure 6C).

      (2) In several instances, the experimental setup and presentation of data lack sufficient rigor. For example, Fig 1D and Fig 2B only display data from one experimental replicate. The authors should include information from all 3 experimental replicates for more transparency. In Fig 3B, the authors should include a control that demonstrates how RNA1 levels change in the presence of E. coli OP50 for comparison with the results showing replication in the presence of PA14. In order to support the claim that "P. aeruginosa and P. lurida MYb11 do not eliminate Orsay virus infection", the authors should also measure RNA1 fold change in the presence of PA01 and P. lurida in the context of exogenous Orsay virus. Additionally, the authors should standardize the amount of bacteria added to the plate and specify how this was done in the Methods, as differing concentrations of bacteria could be the reason for species-specific effects on infection. 

      All experimental replicates are now included within the supplementary information. 

      We have also measured RNA1 fold change following infection in the presence of P. aeruginosa PA01 and P. lurida MYb11 (Line numbers Fig 3B and 3C) and found that these bacteria also do not eliminate Orsay virus replication. 

      We thank the reviewer for their comment on controlling the amount of bacteria and have clarified our methods section to more clearly explain that we seed our plates with equivalent amounts (based on volume) of overnight bacterial culture before allowing the bacteria to grow on the plates for 48 hours.  

      (3) The authors should be more careful about conclusions that are made from experiments involving PA14, which is a P. aeruginosa strain (isolated from humans), that can rapidly kill C. elegans. To eliminate confounding factors that are introduced by the pathogenicity of PA14, the authors should address how PA14 affects the health of the worms in their assays. For example, the authors should perform bead-feeding assays to demonstrate that feeding rates are unaffected when worms are grown in the presence of PA14. Because Orsay virus infection occurs through feeding, a decrease in C. elegans feeding rates can influence the outcome of viral infection. The authors should also address whether or not the presence of PA14 affects the stability of viral particles because that could be another trivial reason for the attenuation of viral infection that occurs in the presence of PA14. 

      We have added data on feeding rates (Line numbers 141-148 and 176-182, Supplementary Figure 4). After six hours of exposure no differences in feeding rate were observed. After 24 hours minor differences emerged between O. vermis MYb71 and each Pseudomonas species, however feeding rate inversely correlated with susceptibility to Orsay virus in that O. vermis MYb71 displayed the lowest feeding rate while P. aeruginosa PA14 displayed the highest feeding rate.

      We have also added data on mortality rates (Line numbers 183-200, Supplementary Figure 6). No significant mortality was observed within the 24-hour exposure period used for our Orsay infection and transmission assays. P. aeruginosa virulence is dependent upon temperature and as our assays are done at 20°C rather than 25°C this may account for reduced mortality compared to other published results. Regardless, we noted that O. vermis MYb71 killed C. elegans as quickly as P. aeruginosa PA14 under these conditions and these two bacteria led to the shortest lifespan compared to the other tested bacteria. Interestingly, P. lurida MYb11 was observed to be more virulent than P. aeruginosa PA01 under these conditions. These results suggest that there is no direct correlation between mortality and susceptibility to Orsay virus, although it does not rule out that virulence effects unique to each bacterium could contribute to alterations in host susceptibility.  

      We tested viral stability in the presence of E. coli OP50 and Pseudomonas spp. and successfully recovered virus from all lawns, suggesting virus is not rapidly degraded in the presence of P. lurida MYb11, P. aeruginosa PA01, and P. aeruginosa PA14 (Line numbers 241-249, Fig 3D and Fig 3E). However, we noted that the recovery of Orsay virus from lawns of E. coli OP50 and P. lurida MYb11 within 30 minutes was decreased compared to a spike-in control suggesting recovery from each lawn is not equivalent. This complicates a comparison of viral stability and shedding rates between different bacteria, but our ability to recover substantial amounts of virus in the shedding assay from each Pseudomonas species precludes a substantial decrease in shedding rates as an explanation for the robust attenuation of Orsay virus observed in transmission assays.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Overall, I really liked this manuscript, I do think there are areas for improvement though. 

      Some smaller things: 

      Line 84: "can be observed spreading from a single animal" -- this isn't really great wording because the virus itself can't be observed (at least not very easily) -- even infection is hard to see. 

      The wording in lines 84-85 has now been adjusted to read “can spread from a single animal”.

      Fig 1C: which groups are statistically significantly different from each other? 

      Statistics have now been added to Figure 1C. 

      Line 154: not necessary to do for this paper, but this sentence made me curious whether the effect would have been seen with mixtures of bacteria (i.e. what if 50% were OP50 and 50% were Pseudomonas?) 

      This data has now been added in Line numbers 372-378, Figure 7A, and Supp. Fig. 12A and 12B.

      Line 262-264: I don't find this interesting at all for the reasons mentioned earlier about binomial data being the most variable in the middle. 

      These lines have been removed.

      Figure 4 B: The labels for the first two tick marks on the x-axis are switched I suspect. Otherwise, the controls did not behave as expected. 

      Figure 4B has been corrected.

      Line 288, 297 and several other places: "Orsay Virus" should be "Orsay virus". 

      We have corrected these instances.

      Supplemental Figure 2: Labels in the figure legend are B and C instead of A and B. 

      These labels have been adjusted for their placement within Figure 6.

      Line 411: I suspect this was supposed to be 13,200 xg rather than 13.2 xg. 

      This error has been corrected.

      Line 416-417: This sentence is very hard to interpret. More details are needed. This is the ID50 in which host strain? Is this averaged over all batches of virus? How variable are the batches? 

      This sentence (line number 114) has been amended to clarify that all ID50 values referred to here were calculated for ZD2611 populations in the presence of E. coli OP50. Further, Supplementary Figure 2 now shows all the ID50 values measured for each batch of virus used in this manuscript resulting in an average ID50 of 3.6.

      Lines 467-469: Why exclude these instead of counting them as zeros in the analysis? How many plates fit this description -- were there lots or only a few over the course of all experiments? 

      We have chosen to exclude these plates as these samples lost spreaders at some point during the course of the assay potentially skewing the eventual number of new infections counted depending on when the infected spreader animal crawled off the plate.  We have detailed the number of plates that fit this description in lines 559-562. 

      Line 476: A critical detail that is missing here is what number of worms were counted to score infection. Please say here or in the figure legends. 

      We have added the total number of worms counted and the minimum number counted per plate for each assay in the figure legends.

      Line 546: Why was only a single representative experiment shown? I'm asking for a justification, not necessarily for you to show all the data. 

      We chose to show a single representative experiment for two reasons. First, we noted variability between susceptibility assays even when using the same batch of virus such that we could not combine experiments into a single plot as we did for transmission assays. Second, while we could normalize to a control within each experiment and expect to see similar relative differences across experiments, we believe this makes it more difficult to interpret the underlying data. For example, an increase in the infection rate of 80% compared to 10% within a population has only a single interpretation while a relative increase in the infection rate by 8x within a population could have several underlying meanings (e.g. 80% vs 10%, 64% vs 8%, 24% vs 3%). We have now included all experimental replicates in the supplementary material.
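
      The arithmetic behind this argument can be checked with a short sketch. The three pairs of infected fractions are the ones quoted in the response; the helper names `fold_change` and `absolute_difference` are hypothetical and introduced only for illustration.

```python
# (treated, control) infected fractions quoted in the response above.
pairs = [(0.80, 0.10), (0.64, 0.08), (0.24, 0.03)]


def fold_change(treated: float, control: float) -> float:
    """Relative increase in infection rate (treated divided by control)."""
    return treated / control


def absolute_difference(treated: float, control: float) -> float:
    """Difference in infected fraction between treated and control."""
    return treated - control


for treated, control in pairs:
    print(
        f"{treated:.0%} vs {control:.0%}: "
        f"fold change = {fold_change(treated, control):.1f}x, "
        f"absolute difference = {absolute_difference(treated, control):.2f}"
    )
```

      All three pairs share the same fold change while their absolute differences are quite different, which is the ambiguity the authors describe when only normalized values are reported.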

      Reviewer #2 (Recommendations For The Authors):

      Minor concerns: 

      (1) Lines 86-87: "utilized a collection of bacteria isolated from the environment with wild C. elegans". The authors should provide more context on the source of these bacterial strains. 

      More references for the sources of these bacteria have been added to Supplementary Table 2.

      (2) The presentation of data in Fig 1 could be improved. The authors should include the text "pals-5p::GFP" on the images shown in Fig 1B. The red dashed line in Fig. 1D should intersect the dose-response curve at y = 0.5. The column heading for Fig 1E states "ID50 +/- SD (a.u.)", but should read "ID50 ratio" and should not have units. It also might be more intuitive to normalize the ID50 value for O. vermis to E. coli OP50. This way, having an ID50 ratio >1 indicates decreased transmission relative to E. coli, and ID50 ratio <1 indicates increased transmission relative to E. coli. To increase the transparency and rigor of 1E, the authors should plot the ratios from all 3 experimental replicates. The authors should also briefly explain why different viral doses were used in Fig 1D and 1F. 

      The text “pals-5p::GFP” has now been added to Figure 1B and throughout the text. The red dashed line in figure 1D has been corrected. Figure 1E has been adjusted to an actual figure as suggested and the y-axis label is “ID50 Ratio Compared to E. coli OP50”. The ID50 replicates have been plotted in Supplementary Figure 2. We have clarified that the doses used are the same. Briefly, the technical replicates of individual doses from Figure 1D and Supplementary Figure 3A and 3B were pooled and processed for FISH staining to provide each experimental replicate of Figure 1F. 
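      For readers unfamiliar with the ID50 ratio, the following Python sketch shows one simple way such a ratio could be obtained. It is an illustration under stated assumptions, not the fitting procedure used in the manuscript: the function `id50_by_interpolation`, the log-linear interpolation between neighbouring doses, and all dose-response values are hypothetical.

```python
import math

def id50_by_interpolation(doses, fractions):
    """Dose at which the infected fraction crosses 0.5.

    Assumes doses are increasing and interpolates linearly on log10(dose)
    between the two doses that bracket a fraction of 0.5.
    """
    points = list(zip(doses, fractions))
    for (d0, f0), (d1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            t = (0.5 - f0) / (f1 - f0)
            log_dose = math.log10(d0) + t * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("curve does not cross 0.5 within the tested doses")

# Hypothetical dose-response data in arbitrary units (not from the manuscript).
doses = [1, 10, 100, 1000]
fraction_on_op50 = [0.05, 0.25, 0.75, 0.95]
fraction_on_test = [0.00, 0.05, 0.25, 0.75]

# Ratio normalized to E. coli OP50, as suggested by the reviewer.
id50_ratio = (id50_by_interpolation(doses, fraction_on_test)
              / id50_by_interpolation(doses, fraction_on_op50))
print(f"ID50 ratio relative to OP50: {id50_ratio:.1f}")
```

      With this normalization, a ratio above 1 means that more virus is needed to infect half the population on the test bacterium than on E. coli OP50, and a ratio below 1 means less is needed.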

      (3) Line 110: The claim is that Ochrobactrum and P. lurida MYb11 reduce the variability of infection levels. However, another possibility is that there's simply less dynamic range in the assay because the infection levels have been compressed to 100% and 0% under these conditions. 

      This line has been removed.

      (4) There are discrepancies between what is shown in Fig 2C and what is described in the text. Lines 163-164: "P. aeruginosa PA01 and P. lurida MYb11 attenuated average infection to 33% and 62% of the population respectively". In Fig 2C, the mean for PA01 is ~25% whereas the mean for P. lurida appears to be less than 62%. 

      These values have been corrected.

      (5) Line 196: Provide more context for why rde-1 mutants were tested. This is the first time rde-1 is mentioned in the text (i.e. why show results in rde-1 mutants when the results are in Fig 2). 

      More context has been provided for why rde-1 mutants were tested (Line numbers 228-232). Briefly, using the rde-1 mutant, which has defective antiviral immunity and therefore supports higher viral replication levels than the wild-type (Félix et al. 2011), allows us to potentiate our infection assay in Figure 3B and 3C such that we maximize our chances of detecting viral replication in the presence of the Pseudomonas species, and especially P. aeruginosa PA14, where fewer animals might be expected to get infected based upon Figure 2B and Supplementary Figure 5. 

      (6) Lines 228-229: "Mutations of any the regulators of the las, rhl, or pqs quorum sensing systems suppressed the attenuation of Orsay virus infection caused by the presence of wild-type P. aeruginosa PA01". Based on this description, PA01 should have a lower fraction of GFP positive relative to the quorum sensing mutants in Fig 4B. It seems that the x-axis labels OP50 and PA01 are swapped. 

      The x-axis labels of Figure 4B have been corrected. 

      (7) To improve clarity, for any figures that have data showing the "fraction of individuals GFP positive", the authors should include "pals-5p::GFP" in the y-axis title and legend. 

      The y-axis labels, legends, and text have been corrected throughout.  

      (8) To improve overall clarity and flow, the order in which the data is presented could be reordered. In particular, Fig. 6 could be better positioned instead of being the last figure, as no further characterization is performed on the mutants, and the findings are not conserved in strains that are more relevant to the C. elegans microbiota, such as P. lurida. The overall story could be strengthened if the authors ended the manuscript with more details related to the mechanism by which regulators of quorum sensing modulate the outcome of viral infection. 

      Figure 5 and Figure 6 have now been swapped.

      (9) Fig 5A: Make arrow sizes consistent across diagrams (i.e. the diagram for gacA deletion). 

      This figure (now Figure 6A) has been adjusted to make arrow sizes consistent across diagrams.  

      (10) Lines 280-282: "These data suggest that gacA has a conserved role across distant Pseudomonas species..." Here, the authors can provide more context on how well-conserved gacA is across Pseudomonas species (i.e. phylogenetic analysis of gacA sequences across different Pseudomonas species/strains). Furthermore, the data in Fig 5 does not provide strong enough support for the conclusion that gacA has a conserved role broadly across Pseudomonas species, as the authors only assess the effects of a gacA deletion in two species, P. aeruginosa and P. lurida. 

      We have adjusted lines 361-362 to “These data suggest that gacA has a conserved role between P. aeruginosa and P. lurida MYb11 in the attenuation of Orsay virus transmission and infection of C. elegans.” to reflect that we only assessed the effects of the gacA deletion in P. aeruginosa and P. lurida MYb11.

      (11) The manuscript can be strengthened by performing additional experiments to elucidate the mechanism by which Pseudomonas modulates viral infection. Does the attenuation of viral transmission and host susceptibility by P. lurida and P. aeruginosa require C. elegans to be in the presence of live bacteria? For example, the authors could measure viral transmission and susceptibility of C. elegans grown on heat-killed Pseudomonas. Additionally, it would be interesting to determine if modulation of viral infection is dependent on a secreted molecule. To assess this, the authors could perform viral infections in the context of Pseudomonas culture supernatant. 

      We added bacterial culture supernatant from each bacterium to lawns of E. coli OP50 to assess the effect on host susceptibility and did not observe any potent effect (Line numbers 311-318, Supplementary Figure 9). This supports an interpretation that attenuation is not mediated by a secreted molecule; however, we cannot rule out that attenuation activity would become apparent if supernatant were provided at a higher concentration.

      We have found substantial challenges in appropriately controlling live vs. heat-killed experiments, particularly with the specifics of our susceptibility experiments. With regard to the underlying question of mechanism, we believe that the genetic mutants (e.g. rhlR/gacA) are equally informative and that further comparison of these mutants’ interaction with the C. elegans host as compared to wild-type may be informative. 

      (12) The authors should include a discussion on the relative virulence potential of PA01, PA14, and P. lurida and the relationship between bacterial virulence potential and the outcome of viral infection. 

      We have also added data on mortality rates (Line numbers 183-200, Supplementary Figure 6). No significant mortality was observed within the 24-hour exposure period used for our Orsay infection and transmission assays. P. aeruginosa virulence is dependent upon temperature and as our assays are done at 20°C rather than 25°C this may account for reduced mortality compared to other published results. Regardless, we noted that O. vermis MYb71 killed C. elegans as quickly as P. aeruginosa PA14 under these conditions and these two bacteria led to the shortest lifespan compared to the other tested bacteria. Interestingly, P. lurida MYb11 was observed to be more virulent than P. aeruginosa PA01 under these conditions. These results suggest that there is no direct correlation between mortality and susceptibility to Orsay virus, although it does not rule out that virulence effects unique to each bacterium could contribute to alterations in host susceptibility.  

      (13) More information is needed on strains listed in Supplementary Table 2, particularly when there is no reference listed and the strain is "Gift of XXX lab". For example, the Troemel lab previously published about an Ochrobactrum strain in Troemel et al PLOS Biology 2008 PMID: 19071962 - is this the same strain? Please ensure that there is adequate information about each strain with as many published references as possible so that the work can be more easily reproduced. 

      We have added additional information and references to the strain table in Supplementary Table 2. The strain listed as Ochrobactrum sp. has been amended to Ochrobactrum BH3 as it is the strain described in Troemel et al. 2008.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript uses C. elegans as a model to interrogate the effects of autism-associated variants of previously unknown function in the RNA-binding protein RBM-26/RBM27.

      Despite its potential impact, there are several concerns related to the technical rigor and specificity of the observed effects.

      Major concerns: 1. The effects on PLM are interesting, but why was this neuron selected for study? Was this a lucky guess or are other axons also affected? It is important to clarify whether the effects of RBM-26 are specific to this neuron or act pleiotropically across many or all neurons. According to CeNGEN, rbm-26 is strongly expressed in the well-characterized neurons ASE, PVD, and HSN. Are there morphological defects in these neurons, or others? As a note, there are also functional assays for these neurons (salt sensing, touch response, and egg laying, respectively).

      We have added new data to the supplemental materials showing that loss of rbm-26 function also causes the beading phenotype in the axons and dendrites of the PVD neuron (Figure S4 and lines 196-199). We have focused on the PLM neuron because our preliminary studies indicated that it had a higher penetrance of axon defects relative to the PVD neuron. Moreover, we observed expression of endogenously tagged RBM-26 in the PLM neuron (Figure 3A-C and lines 210-215).

      Similarly, the choice of the MALSU homolog seemed like a shot in the dark. It is ranked 46th (out of 63 genes) for fold-enrichment following RBM-26 pull-down, and 9th for p-value. Were any of the mRNAs with greater fold-enrichment or smaller p-values examined further? It is important to determine whether many or all of these interacting genes are overexpressed in the absence of RBM-26 and whether they are also required for the phenotypic effects of RBM-26 mutants, or if the MALSU homolog is special.

      We have clarified our reasoning for selecting the MALS-1 ortholog of MALSU1 for further study (see lines 283-284 and Table S2). Amongst binding partners with human orthologs, MALS-1 was by far the top ranked candidate. The adjusted p-value for MALS-1 was 0.0008. The next smallest adjusted p-value was two orders of magnitude larger (0.028 for dpy-4). Moreover, the log2 fold enrichment for MALS-1 was 1.98, about the same as the largest (ACADS with 2.13). Nonetheless, we agree that some of the other interactors may also be of interest and have thus included them in the supplemental table S2. Although these other potential binding partners are outside the scope of this study, we expect that future studies by ourselves or others may focus on the roles of these other binding partners.
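      As a rough illustration of how these log2 values translate into linear fold enrichment, the short Python sketch below converts the two values quoted above. Only the two log2 values come from the response; the variable names are illustrative.

```python
# log2 fold-enrichment values quoted in the response above.
log2_enrichment = {"MALS-1": 1.98, "ACADS": 2.13}

# A log2 value x corresponds to a linear fold enrichment of 2 ** x.
linear_enrichment = {gene: 2 ** value for gene, value in log2_enrichment.items()}

for gene, fold in linear_enrichment.items():
    print(f"{gene}: {fold:.2f}-fold enrichment")
```

      Both values correspond to roughly a four-fold enrichment, which is why the two candidates are described as having about the same enrichment.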

      In addition to the specificity controls mentioned above, positive and negative controls are needed throughout the results. While each of these may be relatively minor by itself, as a group they raise questions about the technical rigor of the study. Briefly these include: Fig 1C. Missing loading controls and negative control (rbm-26 null allele). Additional exposures should be included to show whether RBM-26(P80L) protein or the lower band for RBM-26(L13V) are present at all, relative to the null allele.

      We have added no-stain loading controls to figure 1C. We have also switched to using ECL detection, which is much more sensitive and reveals faint bands for RBM-26(P80L) and additional faint bands for RBM-26(L13V). In addition, we have included a longer exposure for the blot (Figure S1). We are unable to test the null, as we can only produce a limited number of small maternally rescued progeny, thereby precluding western blot analysis.

      Fig 2. Controls to distinguish overextension of PLM axon from posterior mispositioning of ALM cell body are needed. Quantification of PLM axon lengths in microns (or normalized to body size) with standard deviation, not error of proportion, should be shown. Measurement of "beading phenotype" should be more rigorous, see for example the approach in Rawson et al. Curr. Biol. 2017 https://doi.org/10.1016/j.cub.2014.02.025 . The developmental stage examined, and the reason for choosing that stage, should be described for this and all figures.

      We have added new data that show PLM axon length relative to body length for each of the RBM-26 mutants (Figure S2 and lines 183-185). These results indicate that the PLM axon has a larger axon length to body length ratio, suggesting that the PLM/ALM overlap phenotype is a result of PLM axon overextension. For most experiments, we retain penetrance, as this has been standard practice in the field and allows for a much larger sample size (see examples listed below). We have also added examples of how the beading phenotype was measured (Figure S3). Moreover, we have now analyzed this phenotype and others at multiple developmental stages (Figures 2D-H and Table S1). In general, we have conducted experiments at the L3 stage because the rbm-26(null) mutants don't survive past this stage. However, for many of our experiments we have included additional stages as well. We have added this explanation to the methods section of phenotype analysis and also at various locations throughout the text. We have also labeled all graphs to clearly indicate the developmental stages included.

      10.1038/s41467-019-12804-3 Article by laboratory of Brock Grill

      10.1371/journal.pgen.1002513 Article by laboratory of Ian Chin-Sang

      doi.org/10.1073/pnas.1410263111 Article by laboratory of Chun-Liang Pan

      10.1016/j.neuron.2007.07.009 Article by laboratory of Yishi Jin

      doi.org/10.1523/JNEUROSCI.5536-07.2008 Article by laboratory of William Wadsworth
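      To make the penetrance measure and its "error of proportion" explicit, the following Python sketch computes both from animal counts. The helper `penetrance_with_se` and the counts are hypothetical and are shown only to illustrate the standard formula for the standard error of a proportion; they are not data from the manuscript.

```python
import math

def penetrance_with_se(n_affected, n_scored):
    """Return the penetrance (proportion affected) and its standard error."""
    p = n_affected / n_scored
    # Standard error of a proportion: sqrt(p * (1 - p) / n).
    se = math.sqrt(p * (1 - p) / n_scored)
    return p, se

# Hypothetical counts for illustration only.
p, se = penetrance_with_se(n_affected=40, n_scored=200)
print(f"penetrance = {p:.2f} +/- {se:.3f}")
```

      Because the standard error shrinks with the square root of the number of animals scored, penetrance-based scoring benefits directly from the larger sample sizes mentioned above.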

      Fig 3. Controls without auxin and with neuronal TIR1 expression alone should be included. Controls demonstrating successful RBM-26 depletion, in larvae as well as in embryos at the time of PLM extension, should be included (weak embryonic depletion might explain why the overextension phenotype is only 14% instead of 40% as in the null). According to CeNGEN, rbm-26 expression in PLM is barely detected, thus depletion with a PLM-specific TIR1 should also be tested. To confirm the authors' identification of the cell marked "N" as the PLM cell body, co-expression of rbm-26 and a PLM-specific marker should be added. Rescue of the rbm-26 mutants with neuronal (and PLM-only) expression should be included to test sufficiency in PLM, and as a further control for potential artifacts of the AID system.

      We have added new data showing that an endogenously tagged RBM-26::Scarlet protein is expressed in the PLM neuron (Figure 3A-C). Moreover, we have added rescue experiments, showing that a Pmec-7::rbm-26::scarlet transgene can rescue the beading phenotype and the PLM/ALM overlap phenotype (Figure 3 F-G). We have also added controls without auxin (Figure S7) and without the rbm-26::scarlet::aid gene (Figure S8). We have added a new figure showing auxin-mediated depletion of RBM-26::Scarlet::AID in the PLM neuron (Figure S10). We examined auxin-mediated depletion at the L3 stage for consistency with our auxin-mediated phenotypic experiments. Moreover, these were done at the L3 stage for consistency with other experiments that included the rbm-26(null) mutants, which don't survive past this stage.

      In general, auxin-mediated knockdown tends to be hypomorphic in neurons. This is likely due to the fact that the neuronal TIR1 driver is expressed at much lower levels relative to the other drivers. In addition, the lower penetrance observed in auxin-mediated PLM/ALM overlap phenotype could reflect the fact that this phenotype resolves by the L4 stage in the hypomorphic mutants. For example, in P80L mutants at the L3 stage we see only about a 20% penetrance of the PLM/ALM overlap phenotype (relative to about 15% in auxin-mediated knockdown).

      Fig 4. More rigorous quantification of the distribution of mitochondria along the axon should be included, not only total number, and it should be clarified what region of the axon the images are taken from. Including the AID-depletion strain with and without auxin would further add to the sense of rigor. For the mitoTimer experiments, why is RBM-26(L13V) not included and why do wild-type values differ ~5-fold between experiments (despite error bars being almost non-existent)? A more rigorous approach to standardizing imaging conditions may be needed. Positive controls using compounds that affect oxidation should be included. Measurements of individual mitochondria with standard deviations should be shown, rather than aggregate averages with error of proportion.

      We have changed our methodology for measuring mitochondria, so that we now report the density of mitochondria in the axon (number per 100 µm) (Figure 4E-F). We agree that this method is much better than counting the total number of mitochondria per axon, as it corrects for differences in body length and axon length. We also now include data for the whole axon (Figure 4E), proximal axon (Figure 4G), and distal axon (Figure 4H). These data suggest that the mitochondrial density defects occur in the proximal axon but not in the distal axon. Using the null allele, we have also examined the timing of mitochondria defects in the axon and report that the defects begin in the L1 stage and continue throughout larval development (Figure 4F). Individual datapoints have been added for all graphs in Figure 4.
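      The density measure described above can be sketched in a few lines of Python. The function name and the counts and lengths are hypothetical and serve only to show why normalizing to axon length removes the dependence on animal size.

```python
def mitochondrial_density(n_mitochondria, axon_length_um):
    """Number of mitochondria per 100 µm of axon."""
    if axon_length_um <= 0:
        raise ValueError("axon length must be positive")
    return 100.0 * n_mitochondria / axon_length_um

# Hypothetical axons: different total counts and lengths, same density.
shorter_axon = mitochondrial_density(n_mitochondria=12, axon_length_um=300.0)
longer_axon = mitochondrial_density(n_mitochondria=16, axon_length_um=400.0)
print(shorter_axon, longer_axon)
```

      In this example the longer axon contains more mitochondria in total, yet both axons have the same density, so a comparison of total counts alone would overstate the difference.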

      For the mitoTimer experiments (Figure 5), we have added data for L13V and have added the individual datapoints to the graph. In the prior version, the values did not differ 5-fold between experiments with the same stage, rather the different graphs were from different stages (as noted in the figure legends/main text) and the L4 stage has much more oxidation than the L2 stage. To clear this up, we have added labels to the graphs to indicate the stages for each experiment. We have also added new data, so that we now show results for the L2, L3, and L4 stages for all three rbm-26 mutants (see Figure 5C-E). We didn't test the L1 stage because the signal was not sufficient for accurate quantitation.

      Fig 5. Additional positive and negative controls should be added, including additional rbm-26 alleles, the AID-tagged strain with and without auxin, and a rescued mutant.

      The old Figure 5 has become Figure 6 in the new version. We have added the rbm-26(L13V) allele to each experiment (Figure 6B-D). We have also added the loading controls for the western blot along with quantification for 3 biological replicates of the western blot analysis (Figure 6D). We agree that these additions significantly strengthen the data because they show that two independent alleles of rbm-26 cause a very substantial increase in the expression of mals-1 at both the mRNA and protein levels. We did not do these experiments with the rescuing transgene or with the AID-tagged strain because these experiments are done on whole worm lysates, whereas the AID-tagged and rescuing transgene are neuron-specific.

      Fig 6. Controls showing whether the Scarlet-tagged protein is functional are needed, to rule out dominant negative or toxicity-related effects.

      This is Figure 7 in the new version. For this experiment, we are showing that overexpression of MALS-1 does cause defects. The idea is that excessive amounts of MALS-1 cause deleterious effects to the mitochondria. In fact, these defects could be considered as dominant negative or toxic. We considered the possibility of crossing the Pmec-7::mals-1::scarlet transgene with rbm-26; mals-1 double mutants. However, this does not seem workable, because the single copy Pmec-7::mals-1::scarlet transgene produces the phenotypes at penetrances that are similar to what we observe in rbm-26; mals-1 double mutants. We concede that the results of the overexpression experiments in Figure 7 are limited when considered in isolation. However, we think that they are meaningful when considered in combination with the results on the mals-1;rbm-26 double mutants in Figure 8.

      Fig 8. Controls for other mitochondrial components need to be included. It is important to determine if the decrease in ribosomes is specific or reflects a general decrease in mitochondria. If there are fewer mitochondria as suggested in Fig. 4, then of course mitochondrial ribosomal protein levels are also reduced. Additional rbm-26 alleles should be included here as well. Is this effect dependent on the MALSU homolog?

      This is Figure 8D-E in the new version. We have added new data showing that the decrease in MRPL-58 expression that is caused by the rbm-26(P80L) mutation is dependent on MALS-1. We concede that these experiments cannot be used to determine anything about the mitoribosomes per se, but rather serve as an alternative way of testing the effect of rbm-26 on mitochondria. We have revised the text accordingly (lines 355-357). Given these limitations we have elected not to try additional mitochondrial markers and have also not included additional rbm-26 alleles for this experiment.

      Finally the authors should address concerns about image manipulation, which amplify the concerns about technical rigor outlined above. The image in Fig. 2A appears to have a black box placed over the lower-right portion of the field to hide some features. Black boxes also appear to have been placed over the tops of images in Fig. 4B and 4D and at the left of Fig. 6A, 6B, and 6C. While these manipulations probably do not affect the conclusions, they further undermine confidence in data integrity and experimental rigor.

      We have corrected all of these image processing errors. The box in 2A was for the purpose of squaring off a corner that was clipped during image rotation. The boxes in Figures 4 and 6 (of the prior version) were added to give space for labels (without obscuring image features). We have now used alternative methods to accomplish the same goals. For example, in Figures 4-D we have placed the labels outside of the images.

      Minor points. 1. C. elegans nomenclature conventions should be followed: - C. elegans gene names have three or four letters, thus the MALSU homolog cannot be named "malsu-1". Please have new gene names approved by WormBase BEFORE submitting for publication http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/gene_name.cgi

      We have changed malsu-1 to mals-1. In addition, both mals-1 and mrpl-58 have now been approved by WormBase and will be listed on the website upon its next update.

      • If two sequential CRISPR edits are made on the same gene then they should be listed as a compound allele, such as rbm-26(cue22cue25)

      We have updated our gene names to reflect this convention.

      • Genes on the same chromosome should not be separated with a semicolon, for example rbm-26(cue40) K12H4.2(syb6330)

      We have updated our gene names to reflect this convention.

      Describing the defects as "neurodevelopmental" is misleading in the case of axon beading or degeneration. Similarly, there is no evidence for an "axon targeting" defect as stated in the abstract.

      We have revised the text such that instead of referring to degeneration phenotypes as neurodevelopmental, we now refer to axon degeneration phenotypes that occur during development. For example, in the abstract we now say, "These observations reveal a mechanism that regulates expression of a mitoribosomal assembly factor to protect against axon degeneration during neurodevelopment."

      Regarding targeting defects, this was meant to refer to the misplacement of the PLM axon tip (which contains electrical synapses). However, our subsequent analysis has revealed that these defects are transient in P80L and L13V mutants, as they resolve by the L4 stage. The rbm-26 null axon development defects do not resolve, though these mutants die prior to the L4 stage. Given these findings, we have decided not to use the term "targeting defects". Instead, we now refer to this as an axon tiling defect or PLM/ALM overlap phenotype.

      In Fig. 5A, the symbol that appears to correspond to F59C6.15 (lowest p-value) is a different size than the others and is colored as ncRNA, whereas WormBase annotates this gene as snoRNA.

      This error has been corrected.

      In the Introduction, the last sentences of the first two paragraphs should be varied ("However, little is known about the [...] mechanisms that protect [...] during neurodevelopment.")

      This has been done.

      Why is RBM-26 protein running as a doublet at both sizes?

      We have improved our western blotting methodology by using a 12% gel, allowing for better resolution. We have also switched from colorimetric detection to ECL detection, allowing for greater sensitivity. In our new blots, we identify 6 different RBM-26 protein bands. We don't know the reason for these bands, but speculate that they are the result of post-translational processing (lines 148-150).

      When showing the RBM-26 expression pattern (Fig. 3) please include a lower-magnification image of the entire animal.

      This has been done (Figure S6).

      It is confusing to refer to the RNA IP experiments as an "unbiased screen", which in C. elegans typically refers to a genetic screen.

      We now refer to this as a "biochemical screen".

      The relationship between axon overextension, beading, and mitochondrial localization is not clear. What causal connection between these is being proposed? The causal connections between these phenotypes, if any, should be clarified experimentally. For example, if the axon extension defects develop before mitochondrial localization defects, then it is unlikely that mitochondrial defects cause axon overextension.

      We have added new data showing that the reduction in mitochondrial density within the axon begins during the L1 stage and increases throughout larval development (Figure 4F). We have also added additional data showing that the increase in mitochondrial oxidation is weak in the L2 stage and surges in the L3 stage (Figure 5C-E), coincident with the beginning of the axon degeneration phenotypes. We propose (lines 383-391) that a low level of mitochondrial defects is present in L1 larvae, giving rise to the axon tiling defects. In the L3 stage there is a surge in excessive mitochondrial oxidation, giving rise to the axon degeneration phenotypes. We have added a new section to the discussion that addresses the relationship between defects in axon development and axon degeneration (lines 375-405).

      Please explain how to interpret the difference in axon beading in the two deletion alleles of the MALSU homolog (axon beading defects in tm12122 but not in syb6330). Is syb6330 not a null allele? Or are the defects in tm12122 due to other mutations in this strain background?

      One likely reason for this difference is that tm12122 is predicted to cause a partial deletion of the mals-1 coding sequence, whereas syb6330 is a full deletion. Thus, tm12122 could be acting as a dominant negative. In fact, prior work on the MALSU1 ortholog has indicated that this protein is subject to interference by a dominant negative construct (see Rorbach et al, Nucleic Acids Res 2012). Nonetheless, we cannot rule out the possibility of a linked second mutation in tm12122. However, since we have found similar phenotypes and genetic interactions with both alleles, we can conclude that these phenotypes and interactions are due to loss of MALS-1, rather than a second mutation.

      Are mitochondria reduced in number or mislocalized? If they are reduced in number, is this due to altered balance of fission/fusion?

      We have adjusted our methods for quantifying mitochondria and have also analyzed the proximal vs distal axon (Figure 4). We find that the density of mitochondria is decreased in the proximal axon, but not in the distal axon. We speculate that this might reflect a higher demand on mitochondria in the proximal axon, due to a higher amount of trafficking activity in the proximal axon (lines 255-257). We propose that the loss of RBM-26 causes dysfunction in mitochondria. Since fission and fusion are mechanisms that can help to repair damaged mitochondria, it is likely that they would be involved in the phenotypes that we observe.

      In Fig. 3A-D, please keep the labels in the same position in all panels and do not alter brightness settings between single-color and merged panels.

      These images have been moved to the supplemental data section (Figure S5). We have adjusted the labels as suggested. We have not changed the brightness settings, as they were already the same in all panels. However, the blue signal in the merged panel does obscure some of the red signal, giving an appearance of an alteration in color balance.

      The claim that rbm-26 acts cell-autonomously requires PLM-specific depletion and rescue experiments.

      We have added new data indicating that a Pmec-7::rbm-26::scarlet transgene can rescue the beading phenotype (Figure 3F-G).

      **Referees cross-commenting** I appreciate the use of the consultation session to resolve differences between reviewers, but in this case I fully agree with the content and tone of all the comments from the other reviewer -- I think our remarks are very well aligned!

      Reviewer #1 (Significance (Required)):

      The study engineers autism-associated variants in conserved residues of RBM27 into the C. elegans homolog RBM-26 and identifies neuronal phenotypes potentially relevant to autism and a potential molecular mechanism involving regulation of mitochondrial ribosome assembly.

      The key claims of the study are (1) that autism-associated variants in RBM-26 decrease its protein expression; (2) that impaired RBM-26 function leads to a variety of defects in development and maintenance of a single neuron called PLM, including altered axonal localization of mitochondria; (3) that RBM-26 normally binds the mRNA for the C. elegans homolog of MALSU, a mitochondrial ribosomal assembly factor; (4) that loss of RBM-26 leads to overexpression of the MALSU homolog; and (5) that MALSU is required for some of the deleterious effects on the PLM neuron seen in RBM-26 mutants.

      This study will be of interest to the autism research community because it bolsters the idea that variants in RBM27 are likely to disrupt gene function and to affect neuronal health. It will also be of interest to the broader cell biology community because it suggests an interesting potential nucleus-to-mitochondria signaling mechanism, in which a nuclear RNA-binding protein might regulate assembly of mitochondrial ribosomes.

      My field of expertise is developmental biology in C. elegans.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary In this manuscript, the authors studied an ASD-associated gene, rbm-26 in neuronal morphology using the touch receptor neuron PLM in C. elegans, and found that loss-of-function rbp-27 causes overextension and the formation of bulb-like structures in the axon. Using UV-crosslinking RNA immunoprecipitation and RNA-Seq, they identify malsu-1 as a target of rbm-26. Genetic analyses suggest malsu-1 likely functions downstream of rbm-26 in controlling the PLM morphology. Major comments:

      • The authors describe RBM27 is associated with ASD and ID while they only cite SFARI paper that describes a weak association of RBM27 to ASD. The appropriate referenced that show link between RBM27 and ID should be provided. The link with ID was an error. We had meant to say "ASD or other neurodevelopmental disorders." This has been corrected.

      • SFARI database only has three (P79L, R190Q, G348D) mutations listed as ASD-associated. Where are other mutations L13V and R455H, particularly L13V that the authors used to generate the C. elegans mutant come from? Are they associated with intellectual disabilities? The others came from denovo-db. We have added a reference for this database and have also added the primary source references for each of the five de novo variants (see line 121).

      • The authors should be very careful when describing 'gene X causes Y diseases'. Many (if not all) of the examples described in this manuscript are disease-associated genes without validation to be causal genes. We have revised accordingly. For example on lines 433-435, we now say," For example, mutations in the EXOSC3, EXOSC8 and EXOSC9 are thought to cause syndromes that include defects in brain development such as hypoplasia of the cerebellum and the corpus callosum". We have decided to use the phrase "thought to cause" because three of the five referenced articles on these genes use titles that indicate causation.

      • The authors refer PLM axon beading and overextension phenotypes to 'axon degeneration and targeting defects'. The authors must provide additional evidence of axon degeneration (see below). Also the term 'targeting defects' is misleading as the authors did not examine if overextension of the PLM axon causes targeting defects. At least they should examine some synaptic markers. To provide more evidence of degeneration we have analyzed several additional phenotypes at multiple developmental stages (Figure 2 and Table S1). Regarding targeting defects, this was meant to refer to the misplacement of the PLM axon tip (which contains electrical synapses). However, our subsequent analysis has revealed that these defects are transient in P80L and L13V mutants, as they resolve by the L4 stage. The rbm-26 null axon development defects do not resolve, though these mutants die prior to the L4 stage. Given these findings, we have decided not to use the term 'targeting defects'. Instead, we now refer to this as an axon tiling defect or PLM/ALM overlap phenotype.

      • Neuronal phenotypes (axon overextension and beading) should be examined at different developmental timepoints (larval, young adult, and aged animals) to test if these phenotypes are indeed degenerative instead of developmental defects. We have included new data to observe all of these phenotypes at multiple developmental time points (Figure 2 and Table S1).

      • The authors use the blebbing (beading) phenotype in the axon as the sole evidence of neurodegenerative properties of the PLM neuron. A more thorough analysis of this phenotype as done by others (Pan PNAS 2006) must be provided to support the authors' claim that this phenotype represents neurodegeneration. We have included new data on multiple degenerative phenotypes in axons including: blebbing, beading, waviness and breaks (Table S1).

      • The number of beads per axon should be quantified to better represent the severity of rbm-26 mutant. Individual samples should be plotted in the quantification instead of showing the percentage of animals. We have added data on the density of beads in rbm-26(null), rbm-26(P80L), and rbm-26(L13V) mutants (Figure S3). For most experiments we have decided to use penetrance to measure axon degeneration because this is a standard in the field and allows for a larger sample size. For examples please see:

      10.1523/JNEUROSCI.1494-11.2012 (Toth et al, 2012)

      https://doi.org/10.1016/j.cub.2014.02.025 (Rawson et al, 2014)

      10.1073/pnas.1011711108 (Pan et al, 2012)

      https://doi.org/10.7554/eLife.80856 (Czech et al, 2023)

      https://doi.org/10.1016/j.celrep.2016.01.050 (Nichols et al, 2016)

      • Based on the single gel image in Fig. 1C with no loading control, the P80L mutant appears to have no protein expression. How is the P80L viable while the null mutant is lethal? The authors should quantify the protein expression levels from multiple blots with proper loading controls. If P80L mutation is introduced into RBM-26::mScarlet strain can it cause depletion of the signal in vivo? We have added new data showing that the RBM-26::Scarlet signal is diminished by the P80L mutation in vivo (Figure 1E-F). We have also added quantification from 3 biological replicate blots (Figure 1D). Finally, we have improved the sensitivity of our blots by using ECL detection and also show various exposures to highlight the fainter bands (Figures 1C and S1). Therefore, we are now able to detect low level expression of RBM-26(P80L) mutant protein. It is likely that the low level of RBM-26(P80L) and RBM-26(L13V) seen on western blots is sufficient to prevent the lethal phenotype.

      • 'Moreover, loss of either the SPTBN1 or ADD1 genes causes a neurodevelopmental syndrome that includes autism and ADHD' References are missing, and as described above, be extra careful when indicating causality. Very few genes are known to cause ASD and ADHD. We have added the citations for this work (line 81). We also note that the titles for both of the cited articles indicate causation. To be on the safe side we have revised this line to say, "Moreover, loss of either the SPTBN1 or ADD1 genes are thought to cause a neurodevelopmental syndrome that includes autism and ADHD"

      • Fig. 3E F, the authors should use the strains that express TIR1 specifically in the touch receptor neurons to argue cell autonomous function of RBM-26. Alternatively, the authors may conduct PLM neuron-specific rescue experiments to test the sufficiency. We have added new data indicating that a Pmec-7::rbm-26::scarlet transgene can rescue the beading phenotype and the PLM/ALM overlap phenotype (see Figure 3F-G).

      • 'Loss of RBM-26 causes mitochondria dysfunction in axons.' The authors did not examine mitochondria function in axons. They only examined the number of mitochondria, and ROS production in the soma. The authors should provide additional evidence to support the idea that elevated ROS production in the soma is due to mitochondrial dysfunction in axons. Also, the authors should use both P80L and L13V for this experiment, and indicate individual datapoint as dots. Here, they quantified at the L4 stage, which the authors should justify. We have added the L13V data to this experiment and now show the individual data points. In addition, we have now conducted this analysis at the L2, L3 and L4 stages (Figure 5C-E). We have also revised the text to indicate that loss of rbm-26 function causes mitochondrial dysfunction in the cell body which could potentially cause a reduction of mitochondria in the axon (see lines 100-101 and 268-270). We speculate that mitochondria in the axon are also dysfunctional. However, the mitoTimer signal is not bright enough in axons to allow for quantification.

      • Figure 5B and C: the authors should also use L13V to quantify malsu-1 mRNA and protein level, and include quantifications in panel C (from multiple blots). This is Figure 6 in the new version. We have added new data for expression of mals-1 mRNA and protein in rbm-26(L13V) mutants (Figure 6B-D). We have also included quantifications from 3 biological replicates (Figure 6D).

      • In the rbm-26 mutant, the number of mitochondria is reduced, while the amount of MALSU-1 protein is increased. If MALSU-1 is specifically localized at mitochondria in wild type, where does the excessive MALSU-1 go in the rbm-26 mutants? Quantification of MALSU-1 signal intensity should be provided. Our Pmec-7::mals-1::scarlet transgene uses the tbb-2 3'UTR and causes an overexpression phenotype. To address the question posed by the reviewer, we would need to express MALS-1 at endogenous levels. Given that endogenous levels of MALS-1 are very low, it is unlikely that we would be able to visualize its expression. Nonetheless, as a way to address this question we have attempted to create a single copy Pmec-7::mals-1::scarlet transgene that utilizes the mals-1 endogenous 3'UTR. We have tried multiple approaches for generating this construct, but all have failed, likely due to sequence complexities within the mals-1 3'UTR. While we cannot say where the extra MALS-1 protein goes, we think that it is likely overloaded into the remaining mitochondria and could also be in the cytosol as well.

      • Figure 7C: malsu-1 knockout mutants exhibit PLM overextension phenotype, which is not consistent with their model. The authors should discuss this in detail. We have added a paragraph to the discussion explaining that mitochondria function could be disrupted by either MALS-1 overexpression or by MALS-1 loss of function (lines 471-480).

      • 'To validate these findings, we also repeated these experiments with an independent allele of malsu-1, malsu-1(tm12122) and found similar results (Fig. 7A-C).' The malsu-1(tm12122) exhibits beading phenotype and more severe overextension phenotype which the authors must describe and discuss more carefully. One likely reason for this difference is that tm12122 is predicted to cause a partial deletion of the mals-1 coding sequence, whereas the syb6330 is a full deletion. Thus, the tm12122 could be acting as a dominant negative. In fact, prior work on the MALSU1 ortholog has indicated that this protein is subject to interference by a dominant negative construct (see Rorbach et al, Nucleic Acids Res 2012). Nonetheless, we cannot rule out the possibility of a linked second mutation in tm12122. However, since we have found similar phenotypes and genetic interactions with both alleles, we can conclude that these phenotypes and interactions are due to loss of MALS-1, rather than a second mutation (albeit at a slightly different penetrance). We have added these considerations to the results section (lines 342-244).

      • Figure 8: The authors should include data from L13V, malsu-1 and rbm-26; malsu-1 mutants. Quantification from multiple blots should be provided. This is Figure 8D in the new version. We have added the malsu-1 and rbm-26;malsu-1 double mutants to this experiment. We have also added quantification from multiple biological replicate blots. As pointed out by the other reviewer, we think that this experiment does not give specific information about mitoribosomes, but is an alternative approach to looking at the reduction in mitochondria. Given this limitation and considering that we have added L13V data to the mitochondria experiment in Figure 8B, we have elected not to add additional data on L13V to the western blot experiment in Figure 8D.

      Minor comments: • 'Consistent with a role for mitochondria in neurodevelopmental disorders, some of these disorders include a neurodegenerative phenotype.' Why is it consistent to have neurodegenerative phenotypes if mitochondria is associated with neurodevelopmental disorders? A better explanation would help.

      We have changed this sentence to, "Some neurodevelopmental syndromes feature neurodegenerative phenotypes that occur during neuronal development."

      • L13V is generally more severe in axon overextension phenotype than P80L while protein level is more abundant. The authors should discuss about this. We have also added a time course for the PLM/ALM overlap phenotype mutants (Figure 2D). This new data shows that the PLM/ALM overlap is quite similar overall between the P80L and L13V mutants. Both of these mutations cause an increase in PLM/ALM overlap in early larval development that is resolved by the L4 stage. The P80L phenotype resolves slightly sooner for reasons that are unknown. This could reflect differences in expression within the PLM that are not reflected in the whole worm lysate. This could also be due to a slight difference in the genetic background or other stochastic factors. The key point is that these two independent alleles cause a similar phenotype overall, indicating that this phenotype is the result of loss of RBM-26 function.

      • Fig. 2E, F: 'Beading refers to focal enlargement or bubble-like lesions which were at least twice the diameter of the axon in size.' How are the diameters of axons measured? A more detailed quantification method, and examples of measurement should be provided. We have added example measurements to the supplemental section (Figure S3). Additional details on the measurements are in the Methods section (lines 517-518).

      • Figure 3: The authors should also include low-magnification images to show where RBM-26 is expressed. The current image does not allow identifying cells. The transgene that labels the nuclei of hypodermis should be indicated in the manuscript. Specifically, the expression of the RBM-26 in the PLM should be shown. We have added a low magnification image (Figure S6) and have also added images of endogenously tagged RBM-26::Scarlet in the PLM (Figure 3A-C). The transgenic label for the hypodermis has been added to the legend of Figure S5.

      • Figure 3: 'Tissue specific degradation of RBM-26::SCARLET::AID was achieved due to cell-type specific TIR-1 driver lines (see methods for details).' This information is not provided in the method section. This information has been added to the Methods section, "Auxin protein degradation".

      • Fig. 4 E. Values from individual samples should be indicated as dots. Representative images of P80L and L13V should be included. Conduct quantifications at adult stage as the authors use in other quantifications, or justify use of specific developmental stage (L3) they used. Figure 4 has become Figures 4 and 5 in the revised version. We have updated the graphs to include dots for individual data points. We have added quantifications of the mitoTimer experiments for the L2, L3 and L4 stages (Figure 5C-E). We note that our other experiments were done at the L1, L2, L3, L4 and adult stages. The mitoTimer signal is not sufficient at the L1 stage for quantification. At the adult stage, the red signal becomes saturated. We have added representative images for mitoTimer in P80L and L13V mutants (Figure S9).

      • The genes malsu-1 and mrpl-58 are not listed on wormbase. If the authors would like to designate names to these gene, they should clearly indicate that along with the sequence name. We have changed malsu-1 to mals-1. In addition, both mals-1 and mrpl-58 have now been approved by wormbase and will be listed on the website upon its next update.

      • The authors found that MRPL-58 amount is reduced in rbm-26 mutants (which require additional verifications). This can be explained by the fact that axonal mitochondria number is reduced in the rbm-26 mutants. How did the authors confirm that the reduction in MRPL-58 level is due to the disruption of mitoribosome assembly? This is Figure 8D-E in the new version. We have added new data showing that the decrease in MRPL-58 expression that is caused by the rbm-26(P80L) mutation is dependent on MALS-1. We concede that these experiments cannot be used to determine anything about the mitoribosomes per se, but rather serve as an alternative way of testing the effect of rbm-26 on mitochondria. We have revised the text accordingly (lines 355-357).

      • 'MALSU-1 is a mitoribosomal assembly factor that functions as part of the MALSU1:LOR8F8:mtACP anti-association module [37-39].' I don't think these are known for C. elegans MALSU-1. We have revised to, "MALS-1 is an ortholog of the MALSU1 mitoribosomal assembly factor that functions as part of the MALSU1:LOR8F8:mtACP anti-association module"

      • 'Moreover, our results also suggest that disruption of this process can give rise to neurodevelopmental disorders.' I feel this is a quite a bit of stretch.

      This has been replaced with, "Therefore, we speculate that human RBM26/27 could function with the RNA exosome complex to protect against neurodevelopmental defects and axon degeneration in infants." (lines 371-373)

      **Referees cross-commenting** Yes, many of our comments overlap, and I fully agree with all comments from the other reviewer too.

      Reviewer #2 (Significance (Required)):

      I found the manuscript interesting, particularly the use of innovative techniques in identifying the target of RBM-26. The genetic analyses of rbm-26 and malsu-1 generally support the authors' main conclusions that rbm-26 inhibits malsu-1 and will be of potential interest to basic neuroscientists and cell biologists. However, the current manuscript looked premature which made my reading experience less pleasant. The phenotypic analyses are superficial compared to similar works and are insufficient to support the authors' claim of 'axon degeneration and targeting defects'. A number of issues listed above should be addressed before this manuscript is published.

      The reviewer's expertise: neurodevelopment in model organisms.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this manuscript, the authors studied an ASD-associated gene, rbm-26 in neuronal morphology using the touch receptor neuron PLM in C. elegans, and found that loss-of-function rbp-27 causes overextension and the formation of bulb-like structures in the axon. Using UV-crosslinking RNA immunoprecipitation and RNA-Seq, they identify malsu-1 as a target of rbm-26. Genetic analyses suggest malsu-1 likely functions downstream of rbm-26 in controlling the PLM morphology.

      Major comments:

      • The authors describe RBM27 is associated with ASD and ID while they only cite SFARI paper that describes a weak association of RBM27 to ASD. The appropriate referenced that show link between RBM27 and ID should be provided.
      • SFARI database only has three (P79L, R190Q, G348D) mutations listed as ASD-associated. Where are other mutations L13V and R455H, particularly L13V that the authors used to generate the C. elegans mutant come from? Are they associated with intellectual disabilities?
      • The authors should be very careful when describing 'gene X causes Y diseases'. Many (if not all) of the examples described in this manuscript are disease-associated genes without validation to be causal genes.
      • The authors refer PLM axon beading and overextension phenotypes to 'axon degeneration and targeting defects'. The authors must provide additional evidence of axon degeneration (see below). Also the term 'targeting defects' is misleading as the authors did not examine if overextension of the PLM axon causes targeting defects. At least they should examine some synaptic markers.
      • Neuronal phenotypes (axon overextension and beading) should be examined at different developmental timepoints (larval, young adult, and aged animals) to test if these phenotypes are indeed degenerative instead of developmental defects.
      • The authors use the blebbing (beading) phenotype in the axon as the sole evidence of neurodegenerative properties of the PLM neuron. A more thorough analysis of this phenotype as done by others (Pan PNAS 2006) must be provided to support the authors' claim that this phenotype represents neurodegeneration.
      • The number of beads per axon should be quantified to better represent the severity of rbm-26 mutant. Individual samples should be plotted in the quantification instead of showing the percentage of animals.
      • Based on the single gel image in Fig. 1C with no loading control, the P80L mutant appears to have no protein expression. How is the P80L viable while the null mutant is lethal? The authors should quantify the protein expression levels from multiple blots with proper loading controls. If P80L mutation is introduced into RBM-26::mScarlet strain can it cause depletion of the signal in vivo?
      • 'Moreover, loss of either the SPTBN1 or ADD1 genes causes a neurodevelopmental syndrome that includes autism and ADHD' References are missing, and as described above, be extra careful when indicating causality. Very few genes are known to cause ASD and ADHD.
      • Fig. 3E F, the authors should use the strains that express TIR1 specifically in the touch receptor neurons to argue cell autonomous function of RBM-26. Alternatively, the authors may conduct PLM neuron-specific rescue experiments to test the sufficiency.
      • 'Loss of RBM-26 causes mitochondria dysfunction in axons.' The authors did not examine mitochondria function in axons. They only examined the number of mitochondria, and ROS production in the soma. The authors should provide additional evidence to support the idea that elevated ROS production in the soma is due to mitochondrial dysfunction in axons. Also, the authors should use both P80L and L13V for this experiment, and indicate individual datapoint as dots. Here, they quantified at the L4 stage, which the authors should justify.
      • Figure 5B and C: the authors should also use L13V to quantify malsu-1 mRNA and protein level, and include quantifications in panel C (from multiple blots).
      • In the rbm-26 mutant, the number of mitochondria is reduced, while the amount of MALSU-1 protein is increased. If MALSU-1 is specifically localized at mitochondria in wild type, where does the excessive MALSU-1 go in the rbm-26 mutants? Quantification of MALSU-1 signal intensity should be provided.
      • Figure 7C: malsu-1 knockout mutants exhibit PLM overextension phenotype, which is not consistent with their model. The authors should discuss this in detail.
      • 'To validate these findings, we also repeated these experiments with an independent allele of malsu-1, malsu-1(tm12122) and found similar results (Fig. 7A-C).' The malsu-1(tm12122) exhibits beading phenotype and more severe overextension phenotype which the authors must describe and discuss more carefully.
      • Figure 8: The authors should include data from L13V, malsu-1 and rbm-26; malsu-1 mutants. Quantification from multiple blots should be provided.

      Minor comments:

      • 'Consistent with a role for mitochondria in neurodevelopmental disorders, some of these disorders include a neurodegenerative phenotype.' Why is it consistent to have neurodegenerative phenotypes if mitochondria is associated with neurodevelopmental disorders? A better explanation would help.
      • L13V is generally more severe in axon overextension phenotype than P80L while protein level is more abundant. The authors should discuss about this.
      • Fig. 2E, F: 'Beading refers to focal enlargement or bubble-like lesions which were at least twice the diameter of the axon in size.' How are the diameters of axons measured? A more detailed quantification method, and examples of measurement should be provided.
      • Figure 3: The authors should also include low-magnification images to show where RBM-26 is expressed. The current image does not allow identifying cells. The transgene that labels the nuclei of hypodermis should be indicated in the manuscript. Specifically, the expression of the RBM-26 in the PLM should be shown.
      • Figure 3: 'Tissue specific degradation of RBM-26::SCARLET::AID was achieved due to cell-type specific TIR-1 driver lines (see methods for details).' This information is not provided in the method section.
      • Fig. 4 E. Values from individual samples should be indicated as dots. Representative images of P80L and L13V should be included. Conduct quantifications at adult stage as the authors use in other quantifications, or justify use of specific developmental stage (L3) they used.
      • The genes malsu-1 and mrpl-58 are not listed on wormbase. If the authors would like to designate names to these gene, they should clearly indicate that along with the sequence name.
      • The authors found that MRPL-58 amount is reduced in rbm-26 mutants (which require additional verifications). This can be explained by the fact that axonal mitochondria number is reduced in the rbm-26 mutants. How did the authors confirm that the reduction in MRPL-58 level is due to the disruption of mitoribosome assembly?
      • 'MALSU-1 is a mitoribosomal assembly factor that functions as part of the MALSU1:LOR8F8:mtACP anti-association module [37-39].' I don't think these are known for C. elegans MALSU-1.
      • 'Moreover, our results also suggest that disruption of this process can give rise to neurodevelopmental disorders.' I feel this is a quite a bit of stretch.

      Referees cross-commenting

      Yes, many of our comments overlap, and I fully agree with all comments from the other reviewer too.

      Significance

      I found the manuscript interesting, particularly the use of innovative techniques in identifying the target of RBM-26. The genetic analyses of rbm-26 and malsu-1 generally support the authors' main conclusions that rbm-26 inhibits malsu-1 and will be of potential interest to basic neuroscientists and cell biologists. However, the current manuscript looked premature which made my reading experience less pleasant. The phenotypic analyses are superficial compared to similar works and are insufficient to support the authors' claim of 'axon degeneration and targeting defects'. A number of issues listed above should be addressed before this manuscript is published.

      The reviewer's expertise: neurodevelopment in model organisms.

    1. Author response

      The following is the authors’ response to the previous reviews

      eLife assessment 

      This work is an attempt to establish conditions that accurately and efficiently mimic a drought response in Arabidopsis grown on defined agar-solidified media - an admirable goal as a reliable experimental system is key to conducting successful low water potential experiments and would enable high-throughput genetic screening (and GWAS) to assess the impacts of environmental perturbations on various genetic backgrounds. The authors compare transcriptome patterns of plants subjected to water limitation imposed with different experimental systems. The work is valuable in that it lays out the challenges of such an endeavor and points out shortcomings of previous attempts. There was concern, however, that a purely gene expression-based approach may not provide sufficient physiologically relevant information about plant responses to drought, and therefore, despite improvements from a previous version, the new methodology championed by this work remains inadequate.

      Molecular biologists who study drought stress must make choices about which assays to use in their investigation. Serious resources and effort are put into their endeavor, and choice of assay matters. Our manuscript’s goal was largely practical: to guide molecular biologists employing transcriptomics in their choice of drought stress assay, and thus help ensure their work will discover transcriptional signatures of importance, and not those that may be an artifact from lowering water potential using chemical agents on agar plates.  

      We examine how different approaches of reducing water potential impact the Arabidopsis root and shoot transcriptome. Our manuscript shows that each method of reducing water potential has a different effect on Arabidopsis root transcriptome responses. We acknowledge that drought stress induces a complex physiological response, and can vary depending on the method used. However, by comparing across assays, we find instances where a gene is downregulated by low water potential in one assay, and upregulated by low water potential in another assay. We feel it is only natural to question why this could be, and to hypothesize that it may be caused by secondary effects caused by the way low water potential is imposed.  We note that comparative transcriptomics has been a standard approach for decades. We take it as the reviewer’s opinion that it may not be insightful, but it does not factually impact our findings. 

      Reviewer #2 (Public Review): 

      This manuscript purports to develop a new system to study low water potential (drought) stress responses in agar plates. They make numerous problematic comparisons among transcriptome datasets, particularly to transcriptome data from a vermiculite drying experiment which they inappropriately present as representing an authentic "drought response" to the exclusion of all other data. For some reason, which the reviewer cannot fully understand, the authors seem intent on asserting the superiority of their experimental system to all others. They do not succeed in this and such an effort is ultimately a disservice to the field of drought research as a whole. 

      While they devote considerable effort in comparing transcriptome data among various experimental systems, the potentially more informative experiment at the end of the manuscript of testing growth responses of a number of Arabidopsis accessions is only done for their "LW" system. The focus of this manuscript on transcriptome data to the almost complete exclusion of other types of data is a symptom of a broader over-emphasis on transcriptome that unfortunately is quite prevalent in plant science now. It is worth reminding that for protein coding genes, which constitute the vast majority of genes, transcriptome data is a proxy measurement. The really important thing is protein amount, and even more so protein activity/function, which we know has an imperfect, at best, correlation with transcript level. We measure transcriptomes because we can, not because it is inherently the most informative thing to do. The authors' quixotic quest to see if the transcriptomes of different stress treatments match is of limited value and further diminished by their misleading presentation of one particular transcriptome data set (from their vermiculite drying experiments) as somehow a special data set that everything else must be evaluated against. This study sheds no new light on how to do relevant drought (low water potential) experiments in the lab.

      Although the reviewer acknowledges that the authors have made some effort to respond to previous comments, the fundamental flaws remain and the present version of this study is little improved from the first submission. 

      One challenge faced by the drought community is establishing consensus regarding the definition of drought itself. According to the criteria followed by the reviewer, any method leading to a reduction in water potential qualifies as drought stress. However, the findings presented in this manuscript demonstrate that transcriptional responses in roots vary considerably across five different methods of reducing water potential. This indicates that beyond responding to a change in water potential itself, root transcriptomes will also respond to the specific way low water potential is introduced. We believe this variability is of interest to the drought research community. 

      Of the five methods we explore, we hold the view that the gene expression changes induced by vermiculite drying are the most analogous to the expression signatures Arabidopsis would exhibit in response to low water potential in the natural environment. In contrast, we posit that Arabidopsis grown on agar plates - where the root system is exposed to air and light, and where water potential is lowered using chemical agents - may contain gene expression signatures plant molecular biologists may not find particularly relevant. However, we acknowledge that this is our opinion, and will make this more explicit in our revised text.

      More broadly, we believe that the reviewer’s observation regarding the ‘over-emphasis’ on transcriptomics that is prevalent within the plant science community justifies, rather than diminishes, the work presented here. If transcriptomics is a commonly employed method, then we anticipate that the outcomes of this study will hold value for a broad audience. Such researchers are likely not only using transcriptomics as a proxy measure for protein abundance, as the reviewer suggests, but also because it is one of the more straightforward genomic techniques biologists can use to identify candidate genes that may be chosen for further scrutiny. 

      Reviewer #3 (Public Review): 

      Comments on revised version: 

      Specific previous criticisms that were addressed are: 

      (1) that gene expression changes were only compared between the highest dose of each stress assay. In the revised version, the authors changed their framework and are now using linear modelling to detect genes that display a dose response to each specific treatment. I agree that this might be a more robust approach to selecting genes that are specific to a certain treatment. 

      (2) that concentrations of PEG, mannitol, NaCl, and the "low water" agar which were chosen are not comparable in regards to their specific osmotic component. I appreciate that the authors measured the osmotic potential of each treatment. It revealed that both PEG and NaCl at their highest concentration had a much more negative osmotic potential compared to the other treatments. The authors claim that using ANCOVA they did not detect any significant differences between the treatments (lines 113, 114). I do believe that ANCOVA is not the appropriate test in this case. ANCOVA has an assumption of linearity, while the dose response between concentration and osmotic potential is non-linear. This is particularly evident for PEG (Steuter AA. Water potential of aqueous polyethylene glycol. Plant Physiol. 1981 Jan;67(1):64-7. doi: 10.1104/pp.67.1.64.). Since the treatments are not the same at the highest level, I think this could have effects on the validity of comparisons by linear model. One approach could be to remove the treatment level with the highest concentration and compare the results or adjust the treatments to the same osmolarity.

      (3) that only two biological replicates were collected for RNA sequencing which makes it impossible to know how much variance exists between samples. The authors added a third replicate in the revised version for most treatments. However, some treatments still have only two replicates, which cannot be easily seen from the text or the figure. I would prefer that those differences are pointed out. 

      (4) that the original manuscript did not explore what effect the increase of agar and nutrient concentration in the "low water" agar had on water potentials. The authors conducted additional experiments showing that changes in water potential were exclusively caused by changes in the nutrient concentration (Figure 2-figure supplement 5; lines 222-224). However, the increase in agar strength also had some effect on gene expression. While this is not further discussed in the text, I believe this effect of agar on gene expression could be similar to root responses to soil compaction.

      (5) that the lower volume of media in the "low water" agar could have an effect on plants. The authors compared these effects in Figure 2-figure supplement 7. They claim that "different volumes of LW agar media do not play a significant part in modulating gene expression". While I can see that they detected 313 overlapping DEGs, there were still 146 and 412 non-overlapping DEGs. The heatmap in subpanel E also shows that there were differences, in particular in the up-regulated genes. My conclusion would be that the change in volume does play a role and this should be a consideration in the manuscript.

      We thank the reviewer for their suggestions. We plan to resubmit the manuscript reflecting the requested changes. Specifically, we will:

      -       Detail more thoroughly the effects of agar volume on gene expression changes elicited by LW agar treatment.

      -       Investigate whether the tensile stress introduced by hard agar is similar to soil compaction through a comparison with the existing literature.

      -       Assess more rigorously the suitability of the ANCOVA model for assessing water potential changes of different media types.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1

      The paper is overall convincing. However, a little more attention to data presentation and possibly the addition of at least another technique (see below) would greatly strengthen the findings.

      As we hope to demonstrate below, we have taken steps to improve our manuscript on both fronts (data presentation and experimental evidence).

      The absence of statistics immediately catches the eye. I am sure that the shown differences are statistically significant (thanks to the number of analyzed cells), but reporting the result of some statistical test would help the reader in identifying the relevant data in a plot. This is somehow necessary considering that sometimes in the text something is deemed to be "significant" or "not significant", and I felt that I really needed that when looking at the plot in Fig. 3D.

      To facilitate the interpretation of figures that contain data from multiple strains (such as the one mentioned by the reviewer), we have carried out a nonparametric single-step multiple comparison test (Games-Howell) to identify mutants whose means differ significantly from each other. To avoid overcrowding the figures, we have graphically summarized the p-values of all pairwise comparisons in a small matrix within the corresponding panel, and provided 99% confidence intervals and p-values of all differences in the Supplement.
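
      As a rough illustration of the pairwise comparison described above, the sketch below computes, for every pair of groups, the mean difference, the Games-Howell studentized-range statistic, and the Welch-Satterthwaite degrees of freedom. It is not the authors' analysis code: the function name, the strain labels, and the numbers are hypothetical. Turning the statistic into a p-value additionally requires the studentized range distribution for k groups (available, for example, as `scipy.stats.studentized_range` in recent SciPy versions), which is left out here to keep the example dependency-free.

```python
import math
from itertools import combinations
from statistics import mean, variance


def games_howell_stats(groups):
    """Return {(name_a, name_b): (mean_diff, q_statistic, welch_df)} for all pairs.

    groups: dict mapping a group name to a list of measurements.
    Each group needs at least two values and a non-zero variance.
    """
    results = {}
    for a, b in combinations(groups, 2):
        xa, xb = groups[a], groups[b]
        # Squared standard error of each mean; variances may differ between groups.
        se2_a = variance(xa) / len(xa)
        se2_b = variance(xb) / len(xb)
        diff = mean(xa) - mean(xb)
        # Studentized-range statistic used by the Games-Howell procedure.
        q = abs(diff) / math.sqrt((se2_a + se2_b) / 2)
        # Welch-Satterthwaite approximation of the degrees of freedom.
        df = (se2_a + se2_b) ** 2 / (
            se2_a ** 2 / (len(xa) - 1) + se2_b ** 2 / (len(xb) - 1)
        )
        results[(a, b)] = (diff, q, df)
    return results


# Hypothetical N/C ratios for three strains (illustrative numbers only).
example = {
    "strain_A": [3.1, 2.9, 3.3, 3.0],
    "strain_B": [2.1, 2.4, 2.2, 2.0],
    "strain_C": [1.7, 1.6, 1.8, 1.7],
}
for pair, (diff, q, df) in games_howell_stats(example).items():
    print(pair, round(diff, 2), round(q, 2), round(df, 1))
```

      Because each pair gets its own standard error and degrees of freedom, the procedure does not assume equal variances or equal sample sizes across strains.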

      Related to the previous point: for every N/C distribution analysis, a number of analyzed cells is reported. By the way it is written, it seems that the replication relies solely on the cells in that specific population, i.e.: each cell is treated as a replicate. At least I could not find if that is not the case in the legends or in the methods. I wonder what the results would be (and their significance) if each replicate were a new assay on another population.

      Cell populations exhibit significant variability in their phenotypic characteristics. Consequently, the quantification of a specific feature (e.g., the Sfp1 nuclear/cytoplasmic ratio) across a sample of cells from a given population results in a distribution rather than a single fixed value. For each quantification, we report the number of cells that were used to construct the corresponding distribution, i.e. the sample size. To compare samples from different populations (e.g., different Sfp1 mutant strains), we run them in parallel during microscopy experiments and compare their means, as described above. Throughout our study, we have tried to ensure that we quantify a sufficiently large number of cells to overcome cell-to-cell variability and enhance the reliability of our results.

      In this context, the question of the reviewer is not entirely clear to us, as individual measurements of a sample are not replicates. However, one can replicate the entire experiment on a different day by re-growing the different strains, running microscopy, quantifying the new movies etc. In this sense, the experiments shown in the manuscript consist of single replicates, i.e. experiments that were carried out on the same day, with all the relevant mutants and controls quantified together. However, we have monitored many of our mutants multiple times over the course of our work. For example, Fig. 1 below shows replicates of the Sfp1 N/C ratio distributions at steady-state in the analog-sensitive (A) and wild-type (B) background, which were quantified several times across various experiments. While some day-to-day variability exists in the empirical distributions of the same mutant, it is quite small.

      The scale of x axes in N/C ratio plots. Besides not being consistent throughout the figures, it originates from 1, visually enhancing the differences.

      We believe the reviewer was referring to the y-axes, as the x-axes represent time. Summarizing the N/C ratio dynamics of different Sfp1 mutants has been challenging. First, the average N/C ratios at steady-state vary considerably across different mutants, as shown in the panels that summarize steady-state N/C ratios. To compare the magnitude and features of their responses, normalization is necessary. We chose to normalize the time series of each mutant to have a mean of 1 prior to the onset of a perturbation. This allows the normalized time series to represent the percentage-wise changes in the Sfp1 N/C ratio upon perturbation.
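
      The normalization described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name, the sampling times, and the ratio values are hypothetical, and the perturbation is assumed to start at time 0.

```python
def normalize_to_baseline(times, ratios, onset):
    """Scale an N/C ratio time series so its pre-perturbation mean equals 1.

    times:  sampling times, in the same units as `onset`
    ratios: population-averaged N/C ratio at each time point
    onset:  time at which the perturbation starts
    """
    baseline = [r for t, r in zip(times, ratios) if t < onset]
    if not baseline:
        raise ValueError("no samples before the perturbation onset")
    baseline_mean = sum(baseline) / len(baseline)
    return [r / baseline_mean for r in ratios]


# Hypothetical time series: steady state around 3.0, then a drop after t = 0.
times = [-2, -1, 0, 1, 2]
ratios = [3.0, 3.0, 2.4, 1.8, 1.8]
normalized = normalize_to_baseline(times, ratios, onset=0)
print([round(v, 2) for v in normalized])
```

      After this scaling, a normalized value of 0.6 reads directly as a 40% decrease relative to the pre-perturbation level, which is what makes mutants with different steady-state ratios comparable.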

      Using a common y-axis scale for all plots of N/C ratio dynamics is not ideal, as some responses are subtler than others. Additionally, we do not believe that N/C dynamics across different figures need to (or should) be compared to each other. However, within a figure, panels that require comparison are placed in the same row and share the same y-axis scale. We believe that this approach optimizes data visualization and facilitates important visual comparisons.

      Related to the previous point: it is evident from the plots that the N/C ratio is always positive, even in the most deficient of the analyzed mutants. This implies that a relevant fraction of Sfp1 is still nuclear. I thus wonder what the impact of these mutations would be on the actual function of Sfp1. For this reason, I feel that qPCR evaluation of transcripts of Sfp1 target genes is particularly needed. Since lack of Sfp1 is known to yield some of the smallest cells possible, it would also be cool to have an estimate of the size of mutants where Sfp1 is less nuclear. These analyses could confer phenotypical relevance to the data, but would also help in assessing a currently unexplored possibility, that phosphorylation events by PKA influence Sfp1 function besides its localization, i.e.: the still somehow nuclear fraction is not as functional as wt Sfp1 in promoting transcription.

      It is indeed the case that the recorded N/C ratios are larger than 1 in all strains that we have monitored. We have never observed an N/C ratio smaller than 1 using widefield microscopy for two main reasons: first, out-of-focus light from the cytosol above and below the nucleus is added to the nuclear signal, causing the nuclear signal to always be non-zero, even for predominantly cytosolic proteins. Second, both in-focus and out-of-focus vacuoles are devoid of the fluorescent protein fusions that we quantify, which reduces the average brightness of the cytosol. For these reasons, even when a protein is largely cytosolic, the average N/C ratio over a cell population is no lower than around 1.5. Keeping these points in mind, one can observe that our most delocalized Sfp1 mutants have an N/C ratio that is around 1.6-1.7, which is very close to the lower limit. This means that these Sfp1 mutants are largely cytosolic, and the nuclear fraction (if non-zero) is quite small.

      We agree that assessing the phenotypic relevance of Sfp1 mutations is of interest. However, this was impossible with our original strains, as we introduced each Sfp1 mutant as an extra copy in the HO locus while leaving the endogenous Sfp1 locus intact. This was done in order to avoid any phenotypic changes that might result from changes in Sfp1 activity.

      To address the suggestion of the reviewer, we therefore deleted the endogenous Sfp1 copy in strains carrying sfp1PKA2A, sfp1PKA2D and sfp113A, leaving only the mutated Sfp1 copy at the HO locus. Surprisingly, the growth rate and drug sensitivity (determined by halo assays) of these single-copy mutants did not differ much from those of the mutants carrying the functional Sfp1 copy or of the wild type (Supp. Figs. 4J and 7). This observation aligns with findings for the single-copy sfp1-1 mutant in [Lempiäinen et al. 2009], which corresponds to sfp1TOR7A in our work. [Lempiäinen et al. 2009] had suggested that Sch9 compensates for the loss of Sfp1 activity via a feedback mechanism, which could explain our results as well. If this is the case, acute depletion of wild-type Sfp1 could unveil transient changes in cell growth, before the compensatory effect of Sch9 is established. Unfortunately, we were unable to efficiently degrade wild-type Sfp1 carrying a C-terminal auxin-inducible degron. Instead, we followed the same approach as [Lempiäinen et al. 2009] and deleted SCH9.

      As we describe in the last section of Results, the difference was dramatic for sfp113A mutants, which were extremely slow-growing in the absence of Sch9 (doubling time was around 4 hours, but it was hard to estimate because we could not grow the cells consistently). Interestingly, SCH9 deletion had a negative impact on sfp1PKA2D but not sfp1PKA2A cells (Supp. Fig. 7). Overall, these results demonstrate that Sch9 can compensate for loss of Sfp1 activity, which makes it challenging to study the impact of Sfp1 mutations on cellular phenotypes.

      To further understand to what extent Sch9 compensates for loss of Sfp1 phosphorylation, we carried out RNA-seq on WT and cells carrying a single copy of sfp113A (with the endogenous SFP1 copy removed). Despite the fact that sfp113A cells grow as well as WT, RNA-seq picked up several differentially expressed genes related to amino acid biosynthesis. This surprising finding is presented in the last section of Results, and in Supplementary Figures 8, 9 and 10. We explore the relevance of these results and their connection with past literature on Sfp1 and Sch9 in the Discussion section.

      I found some typos here and there, and it would greatly help in reporting them if line numbers were included in the manuscript.

      We apologize for the typos. We have tried to eliminate them, and we have also added line numbers to the manuscript.

      Reviewer 2

      There is no biochemical evidence presented that the putative PKA sites (S105 and S136) are genuinely phosphorylated by PKA. The fact that they match the PKA consensus motif, alone, does not guarantee this. In order to claim that they are looking at the effect of PKA by mutagenizing these residues, the authors have to demonstrate the PKA-dependency of S105 and S136 phosphorylation by, for example, mass spec experiments or western blotting with phospho-specific antibodies (Cell Signaling Technology #9624 for example). Also, is the band-shift caused by PKA inhibition (Fig 3C) canceled by the S105A/S136A mutation?

      We took several actions to demonstrate that the putative PKA sites are indeed phosphorylated by PKA. We first tried to detect Sfp1 phosphorylation using the antibody mentioned by the reviewer, but failed as the sensitivity of this antibody appears to be quite low. On the other hand, mass spectrometry did not produce the right fragments to detect the sites of interest. We therefore resorted to an in vitro kinase assay using [γ-32P]ATP together with purified PKA and Sfp1. Unfortunately, bacterial overexpression of MBP-tagged Tpk1, Tpk2 and Tpk3 (the catalytic subunits of PKA) was quite challenging and we were unable to produce soluble protein. We therefore resorted to commercially available bovine PKA (bPKA, PKA catalytic subunit, Sigma-Aldrich 539576), which shows high homology to the yeast Tpk kinases [Toda et al. 1987]. Moreover, 87% of bPKA substrates have been shown to also be Tpk1 substrates [Ptacek et al. 2005], and bPKA has been used to identify new Tpk substrates in budding yeast [Budovskaya et al. 2005]. As we show in the revised manuscript, bovine PKA does phosphorylate Sfp1. Moreover, phosphorylation is reduced by 50% in the double S105A, S136A mutant (Fig. 1F), and becomes undetectable in the 13A mutant (Supp Fig. 6). Together with the rapid response of Sfp1 localization to acute PKA inhibition which we had already reported, we believe that these results provide strong evidence that Sfp1 is a direct PKA substrate, and that the two phosphosites that we identified are functional.

      As the above in vivo experiments do not exclude S105/S136 phosphorylation by other kinases downstream of PKA, in order to claim the direct phosphorylation, the authors need in vitro PKA kinase assay. These biochemical experiments are not trivial, but I think absolutely necessary for this story.

      One cannot exclude that S105/S136 are also phosphorylated by other kinases of the AGC family (note that [Lempiäinen et al. 2009] has already excluded Sch9). However, as we hope to have shown, PKA indeed phosphorylates Sfp1. Examining if other kinases besides PKA and TORC1 target Sfp1 is a very interesting question that should be addressed in future work.

      The authors only look at the localization of Sfp1. To assess its functionality and so physiological impact, it would be informative to measure the mRNA level of target ribosomal genes in various Sfp1 mutants they created.

      As we described in our response to Reviewer 1 above, we did perform RNA-seq on WT and cells carrying a single copy of sfp113A. We observed a notable absence of differentially expressed ribosomal genes and ribosome-related categories in the GO analysis (Supp. Figs. 8, 9 and 10). Together with our observations on SCH9 deletion (Supp. Fig. 7), these results suggest that Sch9 can largely compensate for the loss of Sfp1 activity. On the other hand, the emergence of differentially expressed amino acid biosynthesis genes is a finding that merits further investigation, as it connects with previous observations made with Sch9 deletion mutants and the [ISP+] prion form of Sfp1 (cf. Discussion).

      In the experiments using analog-sensitive PKA (Fig 1D and E for example), they directly compare wildtype-PKA versus analog sensitive-PKA, or with 1-NM-PP1 versus without 1-NM-PP1. This makes interpretation difficult, particularly because 1-NM-PP1 itself has a significant impact even in the wild-type PKA strain. The real question is the difference between wild-type Sfp1 versus mutant Sfp1. In the current form, they compare Fig 1D versus 1E, but these two do not look like a single, side-by-side experiment. They should compare wild-type Sfp1 versus mutant Sfp1 side-by-side.

      Figure 1D shows that 1-NM-PP1 has a transient off-target effect on Sfp1 localization in WT cells, which could also affect Sfp1 mutants. This observation prompted us to use wild-type PKA as a control when testing the effect of 1-NM-PP1 on sfp1PKA2D in cells carrying PKAas (Figure 1E). As Fig. 1E shows, the effect of 1-NM-PP1 on sfp1PKA2D localization in PKAas cells is quite similar to the off-target effect in cells carrying sfp1PKA2D and wild-type PKA. This behavior of sfp1PKA2D is clearly different from the response of wild-type Sfp1 to PKAas inhibition, which results in sustained delocalization. We have made the latter observation repeatedly, both in this study and our previously published work [Guerra et al. 2021].

      In Figure 3, the argument around the additive effects of PKA and TORC1 is confusing. The authors say they are additive referring to Figure 3E, but say they are not additive referring to Figure 3B. Which is true? In fact, Figure 3B appears to show an additive effect as well.

      We did not use the word "additive" in the text, because we find it difficult to interpret. Instead, we state that PKA and TORC1 appear to control Sfp1 phosphorylation independently of each other. PKA and TORC1 phosphorylation converges to the same response, affecting Sfp1 localization. It appears that loss of either kinase delocalizes Sfp1, while loss of both kinases may only have a small additional effect.

    1. Josh is, by the way, a philosopher and a neuroscientist, so this gives him special powers. He doesn't sort of sit back in a chair, smoke a pipe and think, "Now why do you have these differences?" He says, "No, I would like to look inside people's heads, because in our heads we may find clues as to where these feelings of revulsion or acceptance come from." In our brains.

      AUTHORITY ??

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      The nucleus is recognised as a core component of mechanotransduction with many mechano-sensitive proteins shuttling between the nucleus and cytoplasm in response to mechanical stimuli. In this work, Granero-Moya et al characterise a live fluorescent marker of nucleocytoplasmic transport (NCT) and how it responds to a variety of cues. This work follows on from the authors' previous study (Andreu 2022) where they examined the response of passive and active NCT to mechanical signalling using a series of artificial constructs. One of these constructs (here named Sencyt) showed a differential localisation depending on substrate stiffness, accumulating in the nucleus on stiffer substrates (which the authors previously showed was due to differences in mechano-sensitivity of passive versus facilitated NCT). Here the authors use Sencyt as a tool to probe how different cues affect NCT and thus nuclear force-sensing in two different cell lines (one epithelial, one mesenchymal).

      They have established a 3D image segmentation pipeline to measure both the nuclear/cytoplasmic ratio of Sencyt and 3D nuclear shape parameters. As a proof-of-principle, they show that hypoosmotic shock (which inflates the nucleus and would be expected to increase nuclear tension) and hyper-osmotic shock (which shrinks and deforms the nucleus) alter Sencyt nuclear-cytoplasmic ratio as expected. They then show that inhibiting acto-myosin, which would be expected to block force transduction to the nucleus, reduces NCT, although interestingly this is without any changes to nuclear morphology. They then examine how cell density affects NCT and show that Sencyt localisation correlates only weakly with density but much more strongly with nuclear deformation (especially as measured by solidity). This is surprising considering that mechano-sensitive transcription factors such as YAP have been shown to exit the nucleus at high cell densities. Therefore, the authors directly compare Sencyt and YAP nucleo/cytoplasmic localisation and show that Sencyt behaves differently to YAP with YAP localisation correlating strongly with cell density. This reveals an added layer of complexity in YAP regulation beyond pure changes to NCT.

      Major points

      The data presented throughout this work are high quality and rigorous. The controls used are appropriate (including the use of a freely diffusing mCherry to illustrate the specificity of the Sencyt probe in osmotic shock experiments - figure S2). Experiments are properly replicated and the statistical analysis is appropriate. The data are beautifully presented in figures and the manuscript is well written and very clear. Overall this is a high quality work.

      We thank the reviewer for the positive assessment of the manuscript.

      The discussion is careful and the conclusions are supported by the data. My only small concern is that the authors place too much emphasis on how this work is in 'multicellular systems' as opposed to their previous work in single cells (for example "Here, we demonstrate that mechanics also plays a role in multicellular systems, in response to both hypo and hyper-osmotic shocks, and to cell contractility." L212). Cell density is only controlled in figures 3 and 4 and in some of the earlier experiments, cells look quite sparse (eg Figure 2). It's also debatable how far a monolayer of cancer cells, which lack contact inhibition of growth, is a multicellular system. Furthermore, the authors don't specifically look at cell/cell adhesion or observe major differences between the epithelial or mesenchymal lines. For this reason, the authors should tone down this discussion before publication.

      We agree with the reviewer that properly assessing cell-cell adhesion is important in the context of the work. To this end, we have stained for E-cadherin in both cell lines. As expected and as described previously, the results confirm that MCF7 cells do have clear cadherin-mediated cell-cell adhesions, with a cadherin staining localized specifically in cell-cell junctions. Also as expected, C26 cells show much lower cadherin expression, without a clear pattern. Further confirming this difference, MCF7 cells show clearly distinct actin organizations in their apical and basal sides, whereas C26 cells do not. Thus, we believe that the two cell models do represent a reasonable assessment of epithelial versus mesenchymal phenotypes, in a multicellular context. The data are presented in new supplementary fig. 1, and discussed in page 3 of the manuscript (first paragraph). We have also included a paragraph in the discussion to comment on the differences between cell types (page 7, 2nd paragraph).

      Optional experimental suggestions: For me, the most compelling finding is that nuclear deformation has a greater correlation with NCT than cell density and that this is different from the behaviour of YAP. To cement the importance of nuclear deformation, the authors could induce deformation in single cells, for example by culture on very thin micropatterned lines and assess the localisation of Sencyt and YAP. It would also be interesting to assess the role of force transduction in this context or in different densities by removing actin, which affects NCT without inducing nuclear shape changes. These functional experiments would allow the authors to draw stronger conclusions about the role of nuclear shape and deformation but they aren't necessary for publication.

      This is a very interesting suggestion. Following the reviewer's advice, we have now carried out experiments in which we have seeded cells on micropatterns of different sizes, and measured both sencyt and YAP ratios. In C26 cells, we have found as expected that increasing spreading leads to progressive nuclear deformation (as measured through nuclear solidity) and progressive increase in both sencyt and YAP ratios. Interestingly, cell spreading in MCF7 did not affect nuclear solidity, sencyt ratios, or YAP ratios. This further confirms the relationship between nuclear deformation and nucleocytoplasmic transport, and shows as well that different cell lines have different sensitivities. The lack of response of MCF7 cells is consistent with the lower sencyt response, and lower sencyt/nuclear shape correlation measured in fig. 4. It suggests that MCF7 cells may have mechanisms to shield the nucleus from deformation, something which we have reported in a different context (Kechagia et al., Nat. Mater. 2023). The new results are reported in new fig. 3, and supplementary fig. 8, and discussed in pages 5 (1st paragraph) and 6 (1st paragraph) of the manuscript results.
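
      For readers unfamiliar with the shape descriptor used above, solidity is conventionally defined as the size of an object divided by the size of its convex hull, so a fully convex shape scores 1 and indentations lower the value. The sketch below is a simplified 2D analogue (outline area over hull area) written only for illustration. The manuscript measures solidity on 3D segmented nuclei, and all function names and coordinates here are hypothetical.

```python
def polygon_area(points):
    """Shoelace area of a simple polygon given as ordered (x, y) vertices."""
    area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def solidity(outline):
    """Area of the outline divided by the area of its convex hull (2D analogue)."""
    return polygon_area(outline) / polygon_area(convex_hull(outline))


# Hypothetical outlines: a convex square and the same square with a notch.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
notched = [(0, 0), (2, 0), (2, 2), (1, 1), (0, 2)]
print(solidity(square), solidity(notched))
```

      In 3D the same ratio is taken between the segmented nuclear volume and the volume of its convex hull, so invaginations or wrinkles in the nuclear surface reduce the value in the same way the notch does here.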

      Minor points

      - I'd like to see better examples of 3D reconstructions of nuclei (ie fig 1C but bigger) in different conditions. This is especially important in figure 3 where it would be helpful to see examples of nuclei with high or low solidity. The differences in oblateness are clear to see from the images in 3a and 3f but solidity could be better illustrated.

      We have now added 3D reconstructions as requested, which illustrate the nuclear shape changes that take place. This is shown in figs. 1, 4 (which corresponds to figure 3 in the previous version of the manuscript), s3, and s7.

      - Where Sencyt index is plotted, it would be clearer to add labels to at least figure 1 which indicate whether it is more cytoplasmic or nuclear.

      We have done this as requested in figure 1.

      Reviewer #1 (Significance (Required)):

      In this work, Granero-Moya et al characterise a new tool for measuring NCT and show that it is mechanically regulated. Given the importance of NCT in mechano-transduction, this tool will be a great asset to the mechano-biology community and will likely be adopted by multiple groups in the future. The findings about the effects of cell density on NCT and differences from YAP are interesting but could be further fleshed out. This work is likely to be of greatest interest to a specialised audience working in the fields of mechano-biology and nuclear transport.

      We thank the reviewer for the positive assessment.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The study conducted by Granero Moya and colleagues describes the application of a synthetic protein which is observed to enter the nucleus in response to mechanical strains, rather than being influenced by cell density. However, the novelty of this work is minimal since the conceptual framework and the utilization of this identical or similar tool have been previously reported by the same team in earlier publications.

      We respectfully disagree with the assessment of the reviewer. Please see below for a detailed response regarding novelty.

      In their experiments, they employ this GFP-based sensor, referred to as Sencyt, in cells subjected to osmotic shocks. These shocks are highly stressful and impact a range of cellular processes, including stress response pathways MAPK and others; Osmoregulatory pathways; cell cycle regulations, autophagy and death pathway; ion channel regulations and others. The second set of findings is on cells treated with a combo of drugs affecting the actin cytoskeleton. The justification for using a combination of two specific drugs remains unclear, as the study does not adequately explain the rationale behind this choice. Additionally, there is a lack of information regarding the full range of targets these drugs affect. This raises questions about the comprehensiveness and applicability of the findings, as understanding the complete scope of the drugs' targets is crucial for interpreting the results within a minimal frame of physiological context.

      The two drugs used are paranitroblebbistatin (a photostable version of blebbistatin) and CK666. We apologize for not explaining in more detail the action of these drugs, both of which have been characterized and used extensively in the literature. Paranitroblebbistatin binds to myosin, preventing its ATPase activity and therefore impairing actomyosin contractility (https://doi.org/10.1002/anie.201403540). It acts on different myosin isoforms, including non-muscle myosin II, the main type of myosin responsible for actomyosin contractility in non-muscle cells. CK666 binds to and inhibits Arp2/3, a protein complex responsible for nucleating branched actin (https://doi.org/10.1016/j.chembiol.2013.03.019). This impairs lamellipodial formation and therefore cell spreading (see for instance https://doi.org/10.1371/journal.pone.0100943).

      The rationale for using both drugs in combination was explained in page 4 of the manuscript. In our previous work, we determined that myosin inhibition with blebbistatin is not sufficient to inhibit nuclear mechanotransduction. Indeed, in an epithelial context, we observed that due to reduced contractility, blebbistatin-treated epithelial cells in fact spread more on their substrate. This leads to more deformed (flattened) nuclei, leading to the counterintuitive result that YAP nuclear localization increases rather than decreases. If cell spreading is impaired by interfering with branched actin nucleation, then this spreading is prevented, and the combination of drugs leads to reduced nuclear deformation, and reduced YAP nuclear localization (see supplementary fig. 7 in Kechagia et al, Nat. Mater. 2023, https://doi.org/10.1038/s41563-023-01657-3). Similar results had been published previously by the group of Clare Waterman (https://doi.org/10.1074/jbc.M115.708313).

      Thus, the combination of drugs was designed to ensure that we were impairing nuclear mechanotransduction. Of course, we agree with the reviewer that all perturbations have potential side effects. Osmotic shocks will affect a range of cellular processes (as mentioned in the discussion of the manuscript), and any drug treatment can potentially have off-target effects. However, the fact that two orthogonal perturbations with different potential side effects (osmotic shocks versus actomyosin-targeting drugs) lead to the same effects in sencyt strongly suggests that the effect is mediated by mechanics, and not other factors. To reinforce this, we have now added an additional mechanical manipulation: seeding cells on micropatterned islands of different sizes. As spreading increases, cells are known to increase actomyosin contractility, and nuclear deformation (https://doi.org/10.1529/biophysj.107.116863, https://doi.org/10.1073/pnas.0235407100, https://www.nature.com/articles/ncomms1668, https://doi.org/10.1073/pnas.1902035116). As expected, nuclear solidity, sencyt ratios, and Yap ratios all increased with cell spreading. Interestingly, this occurred only for C26 and not MCF7 cells, where no changes were measured in solidity, sencyt, or YAP. The lack of response of MCF7 cells is consistent with the lower sencyt response, and lower sencyt/nuclear shape correlation measured in fig. 4. It suggests that MCF7 cells may have mechanisms to shield the nucleus from deformation, something which we have reported in a different context (Kechagia et al., Nat. Mater. 2023).

      The new results are shown in figs. 3 and s8. We have also expanded the explanation of drug treatments on page 4 (3rd paragraph).

      * The novelty is on the specificity of this synthetic fusion protein for these manipulations and not on cell density. Yet, the reasons behind this selective response remain unexplained, potentially attributable to the unique characteristics or sensitivity thresholds of their synthetic probe. As comparison, YAP localization and this is sensitive to both inputs, but this is also already published (fig4). The focus is anyway on Sencyt for which they offer simple observations and quantifications. *

      The main novelty of the work lies in the characterization of the role of nucleocytoplasmic transport in mechanotransduction, in the context of multicellular systems. We and others had shown that nucleocytoplasmic transport responds to mechanical force in the context of single cells (see for instance Andreu et al. 2022 from our group, but also https://doi.org/10.1126/science.abd9776 from the Martin Beck group). However, to what extent this applies to multicellular systems was unknown. It is true that in multicellular systems, the response of YAP and other mechanosensitive transcription factors has been characterized (such as in our Elosegui-Artola 2017 paper, mostly done at the single cell level but including one figure panel on epithelial cell monolayers). The reviewer argues here and in the consultation comments with other reviewers (see below) that this demonstrated the role of nucleocytoplasmic transport in multicellular systems. However, we respectfully disagree. As also noted by reviewer 3 in the consultation, the response of YAP, and of any transcription factor, may include effects on nucleocytoplasmic transport, but will also likely include effects caused by the complex biochemical signalling pathways that regulate them. Disentangling such effects requires a sensor that only responds to nucleocytoplasmic transport, and this is precisely what Sencyt provides.

      The reviewer also states that our manuscript does not explain why sencyt responds to mechanics and not cell density. We disagree: sencyt responds to mechanics for the reasons explained in our previous work (Andreu et al., Nat. Cell Biol. 2022), and there is no reason to expect a specific response to cell density. In this regard, we don't think there are any sensitivity thresholds to detect cell density, as the probe is not designed to sense this parameter in the first place. The fact that YAP responds to both mechanics and cell density shows that the response to density cannot be merely explained by mechanics, and is rather due to signalling through other means. Of course, we agree that we do not explain the mechanism by which YAP senses cell density, but we think this lies clearly out of the scope of our manuscript.

      In terms of novelty, our work also characterizes a tool to assess nucleocytoplasmic transport live in cells. We agree with the reviewer that the specific construct had been reported in our previous paper, but it had not been characterized in detail. This is done here, enabling its use by the community as a tool to measure nucleocytoplasmic transport in any context, be it related to mechanics or not.

      When reviewing the figures presented, I find it challenging to detected marked differences, despite their quantitative data suggesting otherwise.

      We assume here that the reviewer refers to differences in sencyt nuclear localization, that is, the sencyt index. We have now checked the example images showing changes in sencyt index, in figures 1 and 2. In figure 1, the example cells under hypo-osmotic shocks increase their sencyt index from 1.2 to 1.45 (C26). In figure 1, the example cells under hyper-osmotic shocks decrease their sencyt index from 0.9 to 0.3 (MCF7) and from 1.4 to 0.5 (C26). In figure 2, the example cells increase their sencyt index upon drug washout from 0.2 to 1.4 (MCF7) and from 0 to 0.9 (C26). Of course, these individual values don't reflect exactly average values, but they do reflect the reported average trends and their magnitudes faithfully. Here we note that even though sencyt changes with the different treatments, it is always more nuclear than cytosolic (sencyt index >0, as it has an NLS). Thus, to the naked eye, sencyt always seems to show a "bright" nucleus, and it is hard to intuitively see changes in its localization. Further, we also note that osmotic shocks lead to overall changes in fluorescence levels due to volume changes (as GFP molecules get diluted or concentrated in hypo or hyper osmotic shocks, respectively). This does not affect ratiometric quantifications as assessed with our mcherry control, but means that changes in ratios are hard to see by eye. To help in this visualization, we have now changed the images from green to grayscale, which is better perceived by the human eye. We have also specified the issue of fluorescence intensity changes in the legend of the figure.

      In addition to this, we have seen that there is indeed a case in which the examples did not follow the average trends. In the case of hypo-osmotic shocks in figure 1, the example MCF7 cells barely changed their sencyt index with treatment. We apologize for choosing this non-representative image for the figure; we have now changed the figure to show more representative cells.

      *Furthermore, the study attempts to correlate the behavior of Sencyt with the nuclear geometric parameter of solidity, a connection that seems to lack a clear basis in cell biology and could potentially lead to misconceptions.*

      Mechanical effects on nucleocytoplasmic transport are mediated by mechanical tension application to nuclear pores, which are embedded in the nuclear membrane (nuclear envelope). Whereas nuclear envelope tension is very challenging to measure directly, it can be indirectly related to nuclear shape. Indeed, a tense membrane will tend to even out membrane irregularities and appear rounded, whereas a membrane under low tension will tend to show wrinkles. Nuclear solidity is a geometric parameter that compares actual nuclear volume to the volume of the convex hull (intuitively, the volume of the smallest wrinkle-free object containing all of the nucleus). Thus, it is the geometric parameter that best reflects the presence of wrinkles, folds or irregularities, and as such the one that should best correlate to membrane tension. Of course, this correlation is not perfect, and there could be many situations in which changes in membrane tension may not directly affect nuclear solidity. But we do believe that solidity is the geometrical parameter that should best reflect membrane tension, and this is why we focus on it. Consistent with our hypothesis, solidity is the geometrical parameter that best correlates with sencyt. To further clarify this, we now explain this rationale in detail in page 4 of the manuscript (1st paragraph).
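      To illustrate the solidity measure described above, the following is a minimal two-dimensional sketch (area of an outline divided by the area of its convex hull). It is only an analogue of the volume-based definition given in the text, not the segmentation or analysis pipeline used in the manuscript; the function names and the example outlines are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull (Andrew's monotone chain), counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each chain (it repeats the start of the other).
    return lower[:-1] + upper[:-1]


def solidity(outline: List[Point]) -> float:
    """Outline area divided by convex-hull area (1.0 = no indentations)."""
    return polygon_area(outline) / polygon_area(convex_hull(outline))


# Hypothetical outlines: a smooth square and the same square with a notch.
smooth = [(0, 0), (2, 0), (2, 2), (0, 2)]
notched = [(0, 0), (2, 0), (2, 2), (1, 1), (0, 2)]
print(solidity(smooth), solidity(notched))
```

      In this toy case the notch lowers the solidity below 1, which is the sense in which wrinkles or folds of the nuclear outline reduce the parameter.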

      * Reviewer #2 (Significance (Required)): *

      * In sum, I think the MS is of interest for a very specialistic audience. There are no clear interpretations. The work is done in one or two cellular model systems in vitro; and the general significance of these observations is of very limited impact and no novelty. *

      We strongly disagree. The study is done on two cellular models, one with epithelial and the other with mesenchymal phenotype, and thus highly relevant for multicellular systems. Following suggestions by reviewers 1 and 2, we have now characterized the epithelial/mesenchymal behaviour of the cell types in detail (see supp. fig. 1). The results are novel in that they demonstrate the role of nucleocytoplasmic transport in multicellular systems, something which as argued above had not been done before. The difference with YAP, and the disentanglement between transport and signalling, is also novel. Finally, we believe the manuscript will be impactful because of this novelty, but also because of the availability of sencyt as a tool for the community. In fact, since posting this manuscript on bioRxiv, we have received many requests (directly and through Addgene) to share sencyt, which is currently being used in several labs across the world.

      *Reviewer #3 (Evidence, reproducibility and clarity (Required)): *

      In this very well-written manuscript, Pere Roca-Cusachs and colleagues investigated the response of nucleocytoplasmic transport (NCT) to mechanical stress and tested whether this response is similar in epithelial and mesenchymal cells using a combination of quantitative approaches. This study builds upon their earlier findings, which elegantly demonstrated that NCT is sensitive to mechanical forces transmitted to the nuclear membrane. Using a similar approach to their recent work, they quantitatively analyzed NCT and compared the two cell types using various treatments that impact nuclear membrane tension. The study is straightforward and experimentally sound, with an adequate number of replicates and independent experiments. While one might consider the limitations given their previous work, none have demonstrated that NCT is mechanosensitive in epithelial cells. Additionally, they provide a simple approach to measure NCT, which should be of interest in the field. However, it is unclear how the authors defined the epithelial phenotype in this work and whether they solely based this characterization on the tissue/cell's origin. Epithelia can be defined ultrastructurally with reference to their apico-basal polarity and specific cell-cell junctions (Alberts et al., 1994; Davies and Garrods, 1997). Changing cell density should affect cell/cell adhesion, but the authors provide no evidence that the cells tested in the study are attached to their neighbors on all sides and form an epithelium. While I recognize that the objective of this study is not to mimic the in vivo behavior of epithelial tissue, the authors should at least ensure that cells form a monolayer by quantitatively assessing cell-cell junctions (or they should adjust their conclusions adequately). This control is specifically important for Figure 3 and 4, whose objective is to test the impact of cell/cell contacts. 
      But it would also be important to provide this essential control for Figure 1 and 2, as it is unclear from the images provided if MCF7 cells are forming an epithelium (and form cell/cell junctions).

      We thank the reviewer for the positive assessment of our work. We fully agree with the reviewer that properly assessing cell-cell adhesion is important in the context of the work. To this end, we have stained for E-cadherin in both cell lines. As expected and as described previously, the results confirm that MCF7 cells do have clear cadherin-mediated cell-cell adhesions, with a cadherin staining localized specifically in cell-cell junctions. Also as expected, C26 cells show much lower cadherin expression, without a clear pattern. Further confirming this difference, MCF7 cells (but not C26 cells) show a clear apico-basal polarization, with distinct actin organizations in their apical and basal sides. Thus, we believe that the two cell models do represent a reasonable assessment of epithelial versus mesenchymal phenotypes, in a multicellular context. The data are presented in new supplementary fig. 1. We have also included a paragraph in the discussion to comment on the differences between cell types (page 7, 2nd paragraph).

      *Reviewer #3 (Significance (Required)):*

      The mechanosensitivity of NCT is an important question central to many aspects of cell biology. One might consider the impact of the proposed work limited, given their previous research. However, none have demonstrated that NCT is mechanosensitive in epithelial cells, making it a crucial question that needs to be addressed. Additionally, they provide a simple approach to measure NCT, which should be of interest to a broad audience.

      We again thank the reviewer for this positive assessment.

      *Referees cross-commenting *

      * Here comments from all 3 reviewers are reported *

      * Reviewer 1: *

      * I disagree with R2's comment that there is 'no novelty' here. Although this work is going to be of greater interest to a specialised rather than general audience, it characterises in depth a simple tool to measure NCT which will be useful for mechanobiology field. Also, using 'two cellular model systems in vitro' is very standard in the field when assessing subcellular processes like NCT. Using this approach in vivo would be very interesting but challenging and would be an entirely different study . *

      *I agree with R2's comments that the authors should better justify their combination of two actin inhibitors and R3s point on better assessing cell/cell junctions. *

      We thank the reviewer for these comments. Both issues have been addressed, as described in the response to reviewers above.

      * Reviewer 2 *

      * About Reviewer 3's comments, I believe it's a stretch to highlight the strength and novelty based on "NCT's mechanosensitivity in epithelial cells has not been demonstrated,". There are thousands of papers on the Hippo pathway, that is known to be mechanosensitive, on the regulation of YAP, that enters in the nucleus in Hippo inhibited conditions and exits to the cytoplasm in Hippo induced cells, including downstream of mechanical signals. The phenomenon of nuclear-cytoplasmic shuttling being a common event from neurons to endothelial and multiple types of epithelial, immune, and fibroblast cells is already established through NCT of this and other endogenous proteins. This is simply an accepted fact. Then, The Nature cell Biology 2022 was offering a very general claim. No warning that conclusions could have been cell type specific. In the Artola 2017 Cell paper they also showed NCT in mammary epithelial cells. We should definitively conclude that NCT's mechanosensitivity in epithelial cells has been well demonstrated. *

      We disagree with this assessment, for the same reasons laid out by reviewer 3 below. Previous work on YAP and other transcription factors cannot be seen as a demonstration of the role of nucleocytoplasmic transport per se. The localization of any transcription factor is highly regulated by complex signalling pathways, and can be affected by many factors. One of them is nucleocytoplasmic transport, but signalling events (for instance through phosphorylation) could change localization by promoting binding to cytosolic or nuclear binding partners, by promoting protein degradation, by masking nuclear localization signals, and others. To isolate the role of nucleocytoplasmic transport, a probe sensitive only to this factor should be designed. This is exactly what sencyt provides. In fact, this has allowed us to answer an important open question: is the sensitivity of YAP to cell density mediated by mechanics and nucleocytoplasmic transport, or is it mediated by some other factor? Our results suggest that some other factor, likely mediated by the Hippo pathway and not necessarily mechanotransduction, explains this sensing of cell density. This is a novel finding, which was not provided in either our Elosegui-Artola 2017 paper or our Andreu 2022 paper.

      * About Reviewer 1: I find it challenging to grasp the point made in the comment. On novelty, in their previous study in NBC 2022 Syncet was already shown to undergo NCT. The reviewer states that the study presents "a simple tool to measure nuclear-cytoplasmic transport (NCT) beneficial for the mechanobiology field, and evidence that this demonstrates a novel layer of regulation in hippo signaling (also because this is observational and not a mechanistic study). The tool in question is far from simple. Its application requires transfection into cell cultures, conducting live imaging, etc. If one aims to measure NCT of endogenous proteins, straightforward immunofluorescence or live imaging of endogenous proteins (like GFP-tagged YAP, Twist, Smads, etc.) using the same experimental setup should suffice to demonstrate relevance, without necessitating any additional experiments. What then, is the unique benefit of this proposed tool? Given it's an artificial construct combining NLS-GFP with a bacterial protein, questions arise about the effects of the forced nuclear localization signal (NLS) or the bacterial component. It is an empirical artificial construct and there is no mechanism to explain its behavior.The comparison of Syncet with YAP seems to me questionable and of limited utility. *

      As also noted by reviewer 3 below, the use of genetically encoded fluorescent sensors that require transfection is by now absolutely standard in biology, and cannot be considered to be "far from simple". And as stated above, imaging of endogenous transcription factors (which also requires transfection if it is done live) does not isolate the role of nucleocytoplasmic transport. We also disagree that "there is no mechanism to explain its behaviour". Sencyt was developed in our previous Andreu et al. 2022 paper, where the mechanism is explained in detail.

      *It's unsurprising that an artificial construct only mirrors some aspects of what is considered a genuine mechanosensitive protein. The utility of a synthetic tool lies in its ability to replicate actual phenomena, not in what it fails to do. In comparison to their NBC 2022 study, this manuscript focuses on what their reporter fails to detect. *

      We disagree that a synthetic tool is only useful if it replicates the behaviour of endogenous proteins. A synthetic tool, precisely due to its engineered, artificial nature, can be made to respond only to specific factors (in this case, nucleocytoplasmic transport). This can then be used to disentangle the role of such specific factors, as done here.

      The osmotic shock was the assay in their 2017 Cell paper. Here they demonstrate that a combination of Blebbistatin+CK (an unclear choice of drugs) is ineffective, as is cell density. Are there other specific peculiarities associated with this construct?

      Here, we note that our osmotic shock experiments in our 2017 paper were done for YAP (not nucleocytoplasmic transport in general). Regarding the choice of drugs, please refer to our answer to the reviewer comments above for a full explanation. Also, we want to clarify that this combination is not ineffective, as it leads to clear changes in sencyt.

      * My other concern is on the minor quantitative changes reported, which seem inconsistent with the provided representative images, where significant differences are difficult to appreciate. For instance, the claim that the transfected sensor differs from an endogenous NCT protein, YAP, after cell density treatment, is hard to detect in their images. In Figure 4, comparing YAP and Syncet in C26 cells, YAP appears uniformly nuclear at high cell density, potentially more nuclear than the synthetic sensor, which is not coherent with their claim.*

      Regarding the concern of the minor changes seen in images, please refer to our full response to the reviewer comments above. Regarding the comparison between sencyt and YAP, we want to clarify that in our manuscript we do not compare the absolute values of nuclear localization between YAP and sencyt. As the reviewer notes, these are two different proteins, so which one is more nuclear does not really provide useful information. So whether YAP is more or less nuclear than sencyt is unrelated to (not incoherent with) our claim. What we state in figure 4 is that YAP responds to cell density, whereas sencyt does not. This is clear from the quantifications and also from the images.

      *From the Hippo perspective, there is really an unusual amount of nuclear YAP left in their cells. This should be almost completely cytoplasmic from prior contact inhibition studies in the Hippo field. Syncet could be simply less sensitive than YAP in these borderline conditions. Although there's a more noticeable cytoplasmic noise in dense cells with YAP compared to Syncet, this could be attributed to several factors, including differences in protein degradation rates, which I suspect to be quicker for a synthetic protein. From a technical perspective it is complex to get strong conclusions after comparing something so unrelated with each other. One is a live GFP detection and the other is a staining by immunofluorescence. the nature of the background is also different and so conclusions from comparisons between unrelated systems is not justified.*

      In conditions of high density, average YAP ratios are close to one (zero in logarithmic scale, as reported in the figures) for MCF10A cells, so there is no nuclear localization. This is similar to what we and others have previously reported in similar conditions (Elosegui Artola et al 2017, Kechagia et al. 2023, for example). In C26 cells, YAP levels at high density are a bit higher. This is likely due to their mesenchymal nature, and therefore diminished cell-cell contact inhibition (as assessed in detail in this revision). This in fact further suggests that the response of YAP to cell-cell contacts is different from a mere mechanical factor, supporting our hypothesis. Regarding the issue of noise, background noise is removed from quantifications, and potential noise coming from non-specificities or autofluorescence is also cancelled by the fact that we compute fluorescence ratios between nucleus and cytoplasm (and not absolute values). Thus, we don't think noise is an issue. Further, we note again that we do not directly compare values between sencyt and yap.
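      To make the ratiometric readout discussed above concrete, the following is a minimal sketch of a nuclear-to-cytoplasmic index with background subtraction, reported on a logarithmic scale so that a ratio of one maps to zero. The base-2 logarithm, the use of mean intensities, the function name, and all intensity values are assumptions made for illustration; they are not taken from the manuscript's methods.

```python
import math
from statistics import mean


def localization_index(nuclear, cytoplasmic, background=0.0):
    """Log2 of the background-subtracted nuclear/cytoplasmic intensity ratio.

    The log base and mean-based averaging are assumptions of this sketch.
    """
    nuc = mean(nuclear) - background
    cyt = mean(cytoplasmic) - background
    if nuc <= 0 or cyt <= 0:
        raise ValueError("signal must exceed background in both compartments")
    return math.log2(nuc / cyt)


# Hypothetical pixel intensities (arbitrary units), not measured data.
before = localization_index([210, 210], [110, 110], background=10)

# Same hypothetical cell after a uniform two-fold dilution of the fluorophore:
# both compartments dim, but the ratio, and hence the index, is unchanged.
after = localization_index([110, 110], [60, 60], background=10)

print(before, after)
```

      Under these assumptions, a uniform change in fluorescence (as expected from volume changes) leaves the index unchanged, whereas the absolute brightness seen by eye changes.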

      * This suggests caution on what is heralded as the main claim here put forward. *

      * Reviewer 1: *

      *I do have some sympathy with R2s comments in the consultation. I agree that showing that NCT is mechanosensitive in an epithelium is not new. I also agree that sometimes it is difficult to see the quantitative differences by eye. This second point could be addressed by including more details of the segmentation and analysis in the supplemental material (along with some example images). *

      We thank the reviewer for the suggestions. Regarding the novelty, please see above for a detailed discussion, and also the comments of reviewer 3 below (previous work studied not NCT but transcription factors, affected by many parameters). Regarding quantitative differences, we have now addressed this issue by showing images in grayscale rather than green, and also by replacing one example cell in figure 1 which indeed did not reflect the average measured trends. We now also show examples of 3D rendered images of the nuclei in different conditions. We have also gone through the methods and clarified in detail how ratios are calculated; the segmentation procedure is now also explained in detail.

      * Regarding novelty, I would be interested to know if R2 thinks that there are experiments that the authors could do to improve the work. Or do they need to simply tone down their claims? It's perfectly acceptable to publish a well characterised tool with a series of observations and it's beneficial to the community to do so.*

      *Reviewer 3*

      * Thanks to Reviewers #1 and #2 for using this consultation option; I truly appreciate their feedback on my comments and find it extremely valuable. I agree with Reviewer #1 that the method proposed here is relatively simple. Transfecting cells and conducting live fluorescent imaging can hardly be considered difficult. I believe the construct used/designed by the authors is the main advantage as it provides a specific way to quantitatively assess NCT and not limit the analysis to a single nuclear protein (such as YAP). Reviewer #2 suggests using immunofluorescence staining of YAP or live imaging of fusion fluorescent protein (following transfection) to analyze NCT, but this approach would yield a readout not only based on NCT but also on the many other interacting partners/mechanisms that regulate the candidate localization, resulting in an unspecific readout (and similar transfection/live imaging set-up). *

      We thank the reviewer for this comment. We fully agree and have elaborated on this in our responses above.

      * Regarding the impact of the study, I agree that it is certainly not as impactful as previous publications on this topic. Although I find reviewer#2 argument on Yap irrelevant, as YAP is not the main focus of this paper. Some experiments have been done with cells of epithelial origin, but NCT mechanosensitivity has not been clearly tested in epithelial monolayer, which is the main claim of the proposed study here. The 2017 Cell paper focused on YAP transport into the nucleus (and not NCT in general) and they showed a correlation between YAP nuclear localization and traction force in MCF10A. I am not sure if one would say that "NCT mechanosensitivity has been well demonstrated in epithelial cells" based on this single panel. The impact of the proposed study is certainly not outstanding but offering a thorough analysis in epithelial cells (as monolayers and not as individual cells) and presenting a well-defined experimental approach should be of interest in the field. I agree with comments from reviewer#2 that some reported effects in graph are unclear on main images. More experimental details should hopefully clarify this aspect.*

      We fully agree with the reviewer. Regarding quantitative differences, we have now addressed this issue by showing images in grayscale rather than green, and also by replacing one example cell in figure 1 which indeed did not reflect the average measured trends.

    1. Author response:

      The following is the authors’ response to the original reviews.

      (1) The authors should show i) whether the variants exhibit the same surface expression as wildtype and ii) whether changes of surface expression (e.g. wt transporter expressed low and high) alters growth rates under conditions where growth depends on amino acid uptake. The authors say that the uptake of radioactive substrate and the overall fitness coincide (Figures 5 and 6), but it would be good to quantify the correlation, perhaps by using a scatterplot and linear regression.

      We thank the reviewer for the questions and proposals. A comparison of surface expression between the transporter-expressing variants was added to the manuscript (Figure 3- Figure supplement 1 and 2). For the AGP1 variants, surface expression is similar between the evolved mutants and the wild-type, indicating that transporter overexpression has no impact on the growth rate per se. The same analysis for the PUT4 variants showed a significant difference, with the PUT4-S variant seemingly expressed more than the wild-type. However, that does not seem to affect the uptake effect of the mutation for the original substrates Ala, Gly and GABA, since in those cases the transporter activity of the evolved variant is substantially decreased (Figure 5). Thus, the variation in surface expression between the mutant and the wild-type, which could be attributed to the small sample size and the inherent limitations of the analysis (imaging of a culture with cells in different planes), is not expected to interfere with the reported results.

      Additionally, a scatterplot with a linear regression line relating overall fitness to the uptake of 2 mM radioactive substrates was added to the manuscript, as advised (Figure 5- Figure supplement 2). For both 2 mM Phe and 2 mM Glu, the regression model explains 60-70% of the variation observed in amino acid uptake rate across the different variants, with uptake rate treated as the variable dependent on fitness.
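      For readers unfamiliar with how the "variation explained" figure is obtained, the following is a minimal sketch of an ordinary least-squares fit and its coefficient of determination (R²). The data points are hypothetical placeholders, not the measured fitness or uptake values, and the function name is ours.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical placeholder values: fitness (x) and uptake rate (y), arbitrary units.
fitness = [1.0, 2.0, 3.0, 4.0]
uptake = [1.0, 3.0, 2.0, 4.0]

slope, intercept, r_squared = linear_fit(fitness, uptake)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.2f}")
```

      An R² of 0.6 to 0.7 in such a fit means that 60-70% of the variance in the dependent variable is captured by the fitted line.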

      (2) The authors should further investigate to what extent the (over)expression of wildtype versus variant transporters impacts growth rates. I would recommend such experiments being done under conditions where nitrogen uptake does not depend on amino acid uptake. I could imagine that some of the fitness data are confounded by the general effects of mutations on growth rates. More concretely, I could imagine that overexpression of e.g. the AGP1-G variant is less of a burden for the yeast cells and would allow to grow them better in general. This could explain why its overall fitness is close to wt, whereas other variants exhibit diminished fitness (Fig. 4A).

      The growth curves of all transporter variant cultures in the absence of selection for amino acid uptake have been presented in Figure 4 - Supplement figure 1. As proposed, the growth rates of the variants in medium with ammonium as nitrogen source were calculated and presented in Figure 3- Supplement figure 1 and 2. For both cases of AGP1 and PUT4 expressing variants, statistical analysis showed no significant difference between the mutants and the wild-type.

      (3) It is quite remarkable that the PUT4-S variant has such a dramatically enlarged substrate spectrum. In addition, the fitness losses for Alanine and GABA are rather small. This striking finding asks the question of why yeast has not evolved this much better/more efficient variant in the first place?

      We thank the reviewer for this very good question. We now included an explanation in the Discussion, but to give a short answer here: One should keep in mind that we used a 10-gene deletion strain to select for given mutants. Wild-type cells have a wide spectrum of substrates through the use of many amino acid transporters, and their regulation is intricately tuned to achieve optimum transport under any environmental circumstance. Broadening the spectrum of a single transporter thus would not lead to increased fitness. On the contrary, it would probably throw off this fine balance.

      (4) It would be generally interesting which types of selections (transporter/amino acid combinations) were tried (maybe as part of the methods section). I could imagine that the examples that are shown in the paper are the "tip of the iceberg", and that many other trials may have failed either because the cultures died, or the identified clones would grow faster due to mutations outside of the plasmid. It would be helpful for researchers planning such experiments in the future to be made aware of potential stepping stones.

      The issues raised here are spot-on, as we actually did test the evolution of PUT4 towards transport of amino acids other than the two mentioned in the report. Aside from the successful Asp and Glu, we ran parallel cultures selecting for transport of Gln, Thr, Trp, Tyr, and Cit. None of these evolution regimes led to increased growth phenotypes that were linked to the evolved gene, and we did not investigate these cultures further. At this point, we cannot fully explain this result, which is why we decided to omit it from the report. The L207S variant of PUT4 was later shown to indeed support growth on Gln, Thr, and Cit. Therefore, we speculate that the reason for not evolving this mutant in the respective evolution cultures was that the fitness gain in these amino acids was not large enough for the mutant to be sufficiently enriched in the course of the evolution trial. Given that the Δ10AA strain still harbors nine amino acid transporter genes in its genome, it is conceivable that upregulation of some of these genes causes growth in some amino acids, prohibiting the selection of mutations in PUT4 (e.g., by mutations outside the plasmid, as the reviewer aptly suggested). We deemed these (negative) results not appropriate for the manuscript, as our main focus was characterizing the fitness effects of single mutations, not the laboratory evolution process of obtaining the mutants.

      (5) The authors took a genetic gain-of-function approach based on random mutagenesis of the transporter. In such approaches, it is difficult to know which mutation space is finally covered/tested, and information that can be gained from loss-of-function analyses is missed. Accordingly, the outcome is somewhat anecdotal. To provide an idea of the mutational landscape accessible, the authors could perform NGS of cultures without any selective pressure, and report the distribution of missense variants in the population.

      We very much appreciate the interest in the details of the mutagenesis. Based on the information given in the original OrthoRep publications (e.g., Ravikumar et al., DOI: 10.1016/j.cell.2018.10.021; mutation rate approx. 10⁻⁵ per generation and nucleotide), we calculated the expected number of mutations per passage in our experiments. For AGP1, it is about 5000 mutational events per passage (10 mL culture volume and 1:200 dilution), and for PUT4, it is about 1000 mutational events per passage (2 mL culture volume and 1:100 dilution). At a gene length of about 2000 bp, we expect to cover most single mutations already in the first or second passage (in the absence of selection). This is reflected in the result that the strongly beneficial mutation L207S in PUT4 was recovered in every selection on Asp or Glu we tested. We included this information in the Methods section.
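
      The coverage argument above can be sketched numerically. The following is only an illustration under a simplifying assumption that is not stated in the response: mutational events are taken to fall uniformly and independently on the gene (a Poisson approximation), and events are assumed to accumulate across passages. The function name and the per-site counting are hypothetical; the event counts and gene length are the approximate figures quoted above.

```python
import math

GENE_LENGTH_BP = 2000  # "about 2000 bp", as quoted in the response


def expected_site_coverage(events_per_passage, gene_length_bp, passages=1):
    """Expected fraction of nucleotide positions mutated at least once.

    Assumption (hypothetical, for illustration only): events land uniformly
    and independently on the gene, so hits per position are roughly Poisson.
    This counts positions hit, not each of the three possible substitutions.
    """
    mean_hits_per_site = events_per_passage * passages / gene_length_bp
    return 1.0 - math.exp(-mean_hits_per_site)


# Approximate event counts per passage quoted in the response
for name, events in (("AGP1", 5000), ("PUT4", 1000)):
    for passages in (1, 2):
        coverage = expected_site_coverage(events, GENE_LENGTH_BP, passages)
        print(f"{name}, {passages} passage(s): expected site coverage {coverage:.2f}")
```

      Real error-prone polymerases have mutational biases and dilution removes part of the diversity at each passage, so this estimate should be read as a rough upper-level sanity check rather than as the calculation performed by the authors.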

      That said, the present study was consciously designed to research gain-of-function mutations, as we wanted to know if and how membrane transporters can evolve new substrate specificities without losing the original functions. Our approach was chosen to reflect as closely as possible a natural scenario where a microorganism encounters a new ecological niche (a new nutrient to be transported). At the same time, we included selective pressure to keep the capacity to thrive in the original niche (to assimilate an ancestral nutrient). This approach is designed to specifically select against any loss-of-function mutations, which is in line with most modern theories about evolution of protein function (excellently reviewed in Soskine and Tawfik, DOI: 10.1038/nrg2808). We find that this approach gives a good idea of how transporters could evolve new functions in a natural setting. By engineering single mutations in the wild-type background of the transporters, we show the fitness effects of different single mutations - this finding thus does not depend on the mutational landscape that is covered in the experiment.

      (6) The authors do not discuss the impact of these mutations on transport rates/kinetics, which are known to play a role in substrate selection in solute carriers (https://www.nature.com/articles/s41467-023-39711-y). Do the authors think ligand binding/recognition is more important than kinetic selection in the evolution of function?

      Indeed, the observed phenotypes can stem from both changes in transport rate and changes in substrate binding. In our opinion, both are perfectly possible explanations for the behavior of evolved transporter variants. We are not discussing this in the manuscript as the weak transport of the novel substrates in the wild-type transporters did not allow us to unambiguously assign one or the other. Yet, we can lend minor circumstantial evidence pointing towards substrate affinity being the more important factor in evolving a new activity in transporters: Overall transport rate (for original substrates) declined in most evolved transporters. Therefore, it is a bit less likely that improved transport rate allowed novel substrates to be used as a nutrient. However, this is not to say that both processes cannot occur (even side by side).

      (7) Ultimately, what are the selective pressures that drive transporter function? The authors pose this question but don't fully develop the idea. Would promiscuous variants still be selected for if the limiting nitrogen source was taken up by the cell via a different pathway (i.e. ammonium or perhaps arginine)?

      Evolution and regulation of transporters is a very complex system, and we simplify this system in our single-transporter/single-amino acid approach. In nature, the selective forces are assumed to be much smaller than in our system, and multiple selective pressures might occur at the same time (maybe even in opposite directions). Therefore, such predictions are beyond the scope of the present study. To put it shortly, yeasts (and other organisms) have evolved the capacity to transport all natural amino acids. Yet, to actually allow fine-tuned regulation of transport of each individual amino acid, narrow- and broad-range transporters have evolved, including a lot of redundancy. This means that the question posed cannot be answered by yes or no, but by “it depends”.

      (8) Amino acids are a special class of metabolites, in that they all have the same basic structure. Thus, transport systems really only need to recognize the amino and carboxyl groups with high fidelity, and can modulate the side chain binding site to increase specificity. This was demonstrated in a bacterial APC transporter (https://www.nature.com/articles/s41467-018-03066-6#Sec2). Is this why the APC fold is largely responsible for AA uptake in biology?

      Indeed, typically, APC-type amino acid transporters bind the amino and carboxyl groups in the same position by backbone interactions. Therefore, this might be an ancestral feature of the APC superfamily and explain why this group represents the main group of amino acid transporters.

      (9) There isn't much discussion on the location of the mutations with respect to binding site vs. gating helices. Are there hotspots of mutations within the APC, and areas where variation is poorly tolerated? It would be helpful to briefly review what is known about mutations that change amino acid specificity in the APC family. My impression is that other studies applying rational mutagenesis have also shown that single-site mutations in the binding pocket alter substrate specificity - are these analogous to the L207 in PUT4? PUT4: I64T comes up in 3 of 5 selections. Did the authors consider a closer analysis of this mutation, and if not, why?

      We agree that it would be helpful to determine hotspots of mutations in APC transporters that lead to changes in selectivity. However, we feel that the current literature does not lend enough data to support an extended analysis of such hotspots. Conversely, the natural sequences of APC transporters are not similar enough to determine which residues are responsible for a certain selectivity profile. There are however some studies on site-directed mutagenesis, as mentioned by the reviewer. A short summary of those is discussed in the revised paper. Interpretation of the previous studies in the light of our results suggests that the sites that evolved in our work play a significant role in substrate selectivity and transporter function within the superfamily of the APC transporters.

      As to the question why we did not include the I64T mutation in our experiments: this mutation lies within the poorly defined N-terminus of the protein, which is not part of the transmembrane core. We therefore deemed this residue as probably not connected to the specificity of the protein; it might be related to the protein’s stability in the cell, as the termini of transporters are known to be important for post-translational regulation, especially vacuolar degradation.

      (10) What do we learn about the APC fold that informs our understanding of where substrate specificity arises in this fold? Do the authors think all SLC folds are equally capable of adaption, or are some more evolutionary-ready than others? An evolutionary analysis of these transporters to gain insights into whether the identified substitutions also occurred during natural evolution under real-life conditions would further strengthen the manuscript. Could the authors provide a sense of how similar the 18 yeast amino acid transporters are, such as sequence alignments or a matrix of pairwise sequence identity/similarity? Are they very diverged, or is the complement of amino acid substrates covered by a rather conserved suite of transporters?

      We do not want to make bold statements about adaptive evolution in other SLC folds, but we consider it not unlikely that a similar approach will lead to similar conclusions in other transporters. As advised, a pairwise identity matrix was added to the manuscript (Figure 1–figure supplement 2).

      As to the proposed analysis focusing on natural occurrence of the mutations we found: we have indeed looked into this, but have not found evidence of such mutations. This is actually expected, as our selection regime puts “unnatural” selective pressures on a single transporter in isolation, which in reality co-evolved with a whole suite of other transporters that already have the capacity to transport all amino acids. Therefore, it is unlikely that the same mutations would happen in a natural setting. Our study is designed to capture evolution where a completely novel substrate is encountered, for which no transport mechanism has evolved yet.

      (11) Throughout: some of the bar graphs show individual data points, but others do not (Figure 3, Figure 5). These should be shown for all experiments.

      We thank the reviewer for the comment. In the revised version of the manuscript, we included individual data points in all bar graphs.

      (12) For bar graphs in which no indication of significance is shown, does this mean that p>0.05? Comparisons that are not significant (p>0.05) should be indicated as such.

      We thank the reviewer for the comment. In the revised version of the manuscript, we indicated in the legends that in cases of no significant difference (p > 0.05) between the wild-type and the evolved variants, no asterisks are shown.

      (13) Figure 5, Figure 6: Are the three confocal images just three different fields of view? It might be useful to include a zoom-in on a single representative cell, as it is hard for the reader to evaluate the membrane localization.

      In the revised version of the manuscript, we clarified that the three confocal images represent three different cultures, as each variant was tested in triplicate. We also included a zoom-in of a representative cell, as suggested.

      (14) In the main text, page 9, the conditions used for each experimental evolution are not clear ("nitrogen limiting mixture of amino acids (1 mM final concentration)"). I think this is an important detail, since the mixtures are quite different for the more promiscuous vs. the more selective transporter, and it would be helpful if this was described more clearly in the main text.

      We thank the reviewer for the comment. We have included further clarification in the revised manuscript.

      (15) Figure 1-Supplement 1 and Figure 4 Supplement 4 - can't read the figure labels. Try labeling columns and rows rather than individual plots.

      We have taken the proposal into account and revised the proposed Figures accordingly.

      (16) Page 9: "The transporter gene was sequenced and re-introduced into Delta-10AA cells." Was the plasmid isolated, sequenced, and re-introduced, or was the gene cut-and-pasted into a new vector backbone?

      In the revised manuscript we have clarified that the gene was sequenced and then cloned into the expression vector and re-introduced into naïve Δ10AA cells.

    1. Author response:

      Reviewer #1 (Public Review):

      The authors report a high-quality genome assembly for a member of Xenacoelomorpha, a taxon that is at the center of the last remaining great controversies in animal evolution. The taxon and the species in question have "jumped around" the animal tree of life over the past 25 years, and seemed to have found their place as a sister-group to all remaining bilaterians. This hypothesis posits that the earliest split within Bilateria includes Xenacoelomorpha on the one hand and a clade known as Nephrozoa (Protostomia + Deuterostomia) on the other, and is thus referred to as the Nephrozoa hypothesis. Nephrozoa is supported by phylogenomic evidence, by a number of synapomorphic morphological characters in the Nephrozoa (namely, the presence of nephridia) and lack of some key bilaterian characters in Xenacoelomorpha, and by the presence of unique miRNAs in Nephrozoa.

      The Nephrozoa hypothesis has been challenged several times by the authors' groups who alternatively suggest placing Xenacoelomorpha within Deuterostomia as a sister group to a clade known as Ambulacraria. This hypothesis (the Xenambulacraria hypothesis) is supported by alternative phylogenomic datasets and by the shared presence of a number of unique molecular signatures. In this contribution, the authors aim to strengthen their case by providing full genome data for Xenoturbella bocki.

      The actual sequencing and analysis are technically and methodologically excellent. Some of the analyses were done several years ago using approaches that may now seem obsolete, but there is no reason not to include them. As a detailed report of a newly sequenced genome, the manuscript meets the highest standards.

      The authors emphasize a number of key findings. One is the fact that the genome is not as simple as one might expect from a "basal" taxon, and is on par with other bilaterian genomes and even more complex than the genome of secondarily simplified bilaterians. There is an implicit expectation here that the sister group to all Bilateria would represent the primitive state. This is of course not true, and the authors are aware of this, but it sometimes feels as though they are using this implicit assumption as a straw dog argument to say that since the genome is not as simple as expected, X. bocki must be nested within Bilateria. The authors get around this by acknowledging that their finding is consistent with a "weak version of the Nephrozoa hypothesis", which is essentially the Nephrozoa phylogenetic hypothesis without implicit assumptions of simplicity.

      We were NOT suggesting that Xenacoels are ‘basal’ though others have certainly done so. We were testing, instead, whether their supposed simplicity is reflected in the composition of the genome.

      Another finding is a refutation of the miRNA data supporting Nephrozoa. This is an important finding although it is somewhat flogging a dead horse, since there is already a fair amount of skepticism about the validity of the miRNA data (now over 20 years old) for higher-level phylogenetics.

      The missing bilaterian microRNAs were one of the early pieces of evidence excluding the Xenacoelomorpha from Nephrozoa. Our new data are an important refutation of this source of evidence and add to the picture that this phylum is not lacking characters of Bilateria as had been suggested (missing microRNAs and Hox genes were explicitly interpreted in this way).

      The finding that the authors feel is most important is gene presence-absence data that recovers a topology in which X. bocki is sister to Ambulacraria. The problem is that the same tree does not support the monophyly of Xenacoelomorpha. This may be an artifact of fast evolving acoel genomes, as the authors suggest, but it still raises questions about the robustness of the data.

      In sum, the authors' results and analyses leave an open window for the Xenambulacraria hypothesis, but do not refute the Nephrozoa hypothesis. The manuscript is a valuable contribution to the debate but does not go a significant way towards its resolution.

      The manuscript has gone through several rounds of review and revision on a preprint server and is thus fairly clear of typos, inconsistencies and lack of clarity. The authors are honest and open in their interpretation of the results and their strengths.

      We thank the reviewer for their assessment of our manuscript. We have responded to some of the points they make above. As there were no specific points to edit or change raised by reviewer 1, we are replying in detail only to reviewer 2. We would like to note that we have modified the text and thus the focus of our manuscript in accordance with what we think reviewer 1 is suggesting in the last two paragraphs of their review.

      Reviewer #2 (Public Review):

      The manuscript describes the genome assembly and analysis of Xenoturbella bocki, a worm that bears many morphological features ascribed to basal bilateria. The authors aim to analyse this genome in an attempt to determine the phylogenetic position of X. bocki as a representative of Xenacoelomorpha and its associated acoelomorphs. In doing so, they want to inform the debate as to whether xenacoelomorph belong among, or is in fact paraphyletic to all bilaterians.

      This paper presents a high-quality assembly of the X. bocki genome. By virtue of the phylogenetic position of this species, this genome has considerable scientific interest. This assembly appears to be highly complete and is a strength of the paper. The further characterisation of the genome is well executed and presented. Solid results from this paper include a comprehensive description of the Hox genes, miRNA and neuropeptide repertoire, as well as a description of the linkage groups and how they relate to the ancestral linkage groups.

      Where this paper is weaker is in its central claims and questions, i.e., the phylogenetic position of xenacoelomorphs and whether X. bocki is a slowly evolving, but otherwise representative member of this clade, which remain insufficiently resolved.

      The authors have achieved the goal of describing the X. bocki genome very well. By contrast, it is unclear, based on the presented evidence, whether xenacoelomorph is truly a monophyletic group. The balance of the evidence seems to suggest that the X. bocki genome belongs within the bilateria group. However, it is unclear as to what is driving the position of the other acoels. Assuming that X. bocki and the other two species in that group are monophyletic, then the evidence will favour the authors' conclusion (but without clearly rejecting the alternatives).

      This paper will likely further animate the debate regarding this basal species, and also questions related to the ancestral characters of bilateria as a whole. In particular the results from the HOX and paraHOX clusters, may provide an interesting counterpoint to the previous results based on the acoels.

      We thank the Reviewer for their extended comments on our manuscript. We would firstly like to point out that our work was not aiming to resolve the phylogenetic position of X. bocki. We discussed this question at length, as it was and is a major and important question in evolutionary biology; however, we think that we had phrased any conclusions in this regard very cautiously, as we are well aware of the limitations of our data in resolving the conundrum.

      In this revision we have further modified our text, specifically in the Introduction and Abstract, to make it clear that we are contributing to the understanding of the evolution and biology of a fascinating organism that cannot easily be cultured in the laboratory.

      In addition, we have supplied more explanation on why Xenacoelomorpha are generally seen as a monophyletic group and which lines of evidence point to this. Again, it should be noted here that colleagues who regard the Nephrozoa hypothesis as true, do not doubt the monophyly of Xenacoelomorpha.

    1. Scene One: A Typical Day in English Class, Tuesday, 12:20 p.m. When I walk into English class, there are only two students in the classroom; the tables are set up in a U-shape. The room is not organized, your desk is messy, and the room has trash everywhere. There is one TV in the back of the room. The room smells like scented board markers. I walk to my seat and wait for you to get everyone settled in the classroom. After more students arrive, you ask us to read our independent reading book for about 25 minutes. Some of us do what you ask while you work on your computer. Then three students get kicked out because they didn’t do what you wanted them to do, they were talking back, or maybe you were just having a bad day. We don’t have a journal to write about our books and you do not ask us what we are reading during this time. When independent reading time is over, you tell us to take out our Hamlet books. We read Hamlet as a class for the rest of the period. While we are reading, we have to take notes about what is happening or write summaries in our Hamlet notebook. You tell us what you think about the text and what is happening in the play. Most often, we simply write what you tell us to write. This happens every single day. Class is over and you didn’t assign any homework — you rarely do

      The disorderly environment suggests a lack of organization and may impact the learning atmosphere negatively. While reading "Hamlet" provides valuable literary exposure, the lack of student input or discussion beyond teacher-directed notes may limit critical thinking and analysis.

    2. The Letter-Writing Process with Students. I wanted to do this project not only for the experience of improving my writing but also I think that the students’ voice is not always heard entirely, even through dialogue. I feel that by doing this journal we can make a difference with our personal experience and touch the heart of someone who is willing to stand by us. I also wanted to get the attention of other students who may be feeling the same frustration I have felt

      In the letter-writing process with students, Rashida Registe expresses her motivation for the project. She sees it as an opportunity not only to enhance her writing skills but also to amplify the voices of students, which she feels may not always be fully heard even through dialogue. Rashida believes that by sharing their personal experiences through the journal, they can make a difference and touch the hearts of those willing to support them.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study assesses homeostatic plasticity mechanisms driven by inhibitory GABAergic synapses in cultured cortical neurons. The authors report that up- or down-regulation of GABAergic synaptic strength, rather than excitatory glutamatergic synaptic strength, is critical for homeostatic regulation of neuronal firing rates. The reviewers noted that the findings are potentially important, but they also raised questions. In particular, the evidence supporting the findings is currently incomplete and demonstration of independent regulation of mEPSCs and mIPSCs is a necessary experiment to support the major claims of the study. 

      We appreciate the detailed, thoughtful assessment of our paper by the reviewers and editors and now submit a revised version that addresses the reviewers’ comments as detailed below in response to each concern. We include a more open discussion of alternative possibilities and have added experiments demonstrating that AMPAergic scaling in our mouse cortical cultures is triggered differently than GABAergic scaling. We treated the cultured neurons exactly as described for triggering GABAergic scaling (20µM CNQX for 24 hours), however this did not trigger AMPAergic upscaling (new Figure 7), even though it did reduce spiking/bursting activity. Below we explain the result further, but ultimately this does demonstrate independent regulation of mEPSCs and mIPSCs as requested by the editor/reviewer (spike reductions induced by CNQX reduced mIPSC amplitude, but had no effect on mEPSC amplitude).

      Reviewer #1 (Public Review):

      While the paper is ambitious in its rhetorical scope and certainly presents intriguing findings, there are several serious concerns that need to be addressed to substantiate the interpretations of the data. For example, the CTZ data do not support the interpretations and conclusions drawn by the authors. Summarily, the authors argue that GABAergic scaling is measuring spiking (at the time scale of the homeostatic response, which they suggest is a key feature of a homeostat) yet their data in figure 5B show more convincingly that CTZ does not influence spiking levels - only one out of four time points is marginally significant (also, I suspect that the bootstrapping method mentioned in line 454-459 was conducted as a pairwise comparison of distributions. There is no mention of multiple comparisons corrections, and I have to assume that the significance at 3h would disappear with correction).

      We certainly understand the criticism here (similar to reviewer 2’s third point). We now discuss these complications in a more detailed description in the manuscript (CTZ section of results and at end of the discussion). First, we are presenting our entire dataset to be as transparent as possible. Unlike most synaptic scaling studies (including our own) that apply drugs to alter activity and assess mPSC amplitude at the final time point, here we are actually showing CTZ’s effect on spiking activity within the culture over time. This is critical because it has informed us of the drug’s true effect on spiking, the variability that is associated with these perturbations, and the ability and timing of the cultured network to homeostatically recover initial levels. This was important because it revealed that the drugs do not always influence activity in the way we assume, and this provides greater context to our results. Second, we are showing all of our data, and presenting it using estimation statistics which go beyond the dichotomy of a simple p value yes or no (Ho J, Tumkaya T, Aryal S, Choi H, Claridge-Chang A. 2019. Moving beyond P values: data analysis with estimation graphics. Nat Methods 16: 565-66). Estimation statistics has become a more standard statistical approach in the last 15 years and is the preferred method for the Society for Neuroscience’s eNeuro Journal. This method shows the effect size and the confidence interval of the distribution. For the 3 hr time point in Fig. 5B the CTZ/ethanol vs. ethanol data points exhibit very little overlap and the effect size demonstrates a near doubling of spike frequency, and the confidence interval shows a clear separation from 0. This was a pairwise comparison as we compared values at each time point after the addition of ethanol or ethanol/CTZ. Third, the plots illustrate an upward trend in spike frequency at 1 and 6 hrs, but there is also clear variability.
      It is important to note that these are multiunit recordings and not purely excitatory principal neurons that we target for mPSC recordings. This complication along with the variability inherent in these cultures could make simple comparisons difficult to interpret and we now discuss this (end of discussion). Regardless, we do see some increase in spiking with CTZ and we clearly see increases in mIPSC amplitude, thus providing some support for the idea that spiking could be a critical player in terms of GABAergic scaling, particularly when put in the context of all of our findings. Future work will be necessary to determine how alterations in spiking lead to changes in mIPSC amplitude and we now discuss this (2nd to last paragraph in discussion).
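
      The idea behind reporting an effect size with a bootstrapped confidence interval, as described above, can be sketched in a few lines. This is a minimal percentile-bootstrap illustration in plain Python, not the authors' analysis pipeline or the software from the cited paper; the function name and the spike-frequency values are invented for illustration only.

```python
import random
import statistics


def bootstrap_mean_difference(control, treated, n_resamples=5000, ci=0.95, seed=0):
    """Return (observed mean difference, CI lower bound, CI upper bound).

    Percentile bootstrap: resample each group with replacement, recompute
    the difference of means, and read the interval off the sorted resamples.
    """
    rng = random.Random(seed)
    observed = statistics.fmean(treated) - statistics.fmean(control)
    diffs = []
    for _ in range(n_resamples):
        resampled_control = rng.choices(control, k=len(control))
        resampled_treated = rng.choices(treated, k=len(treated))
        diffs.append(statistics.fmean(resampled_treated) - statistics.fmean(resampled_control))
    diffs.sort()
    alpha = (1.0 - ci) / 2.0
    lower = diffs[int(alpha * n_resamples)]
    upper = diffs[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return observed, lower, upper


# Invented, normalized spike frequencies (hypothetical values, not study data)
ethanol_only = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8]
ctz_ethanol = [2.0, 2.3, 1.8, 2.1, 1.9, 2.2]

effect, ci_low, ci_high = bootstrap_mean_difference(ethanol_only, ctz_ethanol)
print(f"mean difference = {effect:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

      A confidence interval that excludes 0 corresponds to the "clear separation from 0" described above. Note that this sketch does not address the reviewer's point about correcting for comparisons made at several time points.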

      Then, the fact that TTX applied on top of CTZ drives an increase in mIPSC amplitude is interpreted as a conclusive demonstration that GABAergic scaling is sensing spiking. It is inevitable, however, that TTX will also severely reduce AMPA-R activation - a very plausible alternative explanation is that the augmentation of AMPAR activation caused by CTZ is not sufficient to overcome the dramatic impact of TTX. All together, these data do not provide substantial evidence for the conclusion drawn by the authors.

      We believe that the most parsimonious explanation for our results is that spiking activity, not AMPAR activation, triggers GABAergic downscaling. GABAergic scaling is no different when comparing 24hr TTX treatment vs TTX+CTZ, and optogenetic restoration of spiking activity while continuing to block AMPAR activation was able to restore GABAergic mPSC amplitudes to control levels. It is important to emphasize that our results with TTX vs. TTX+CTZ are different for GABAergic scaling (no difference in this study) and AMPAergic scaling (CTZ diminished upward scaling in previous study – Fong et al., 2015 - PMID: 25751516) suggesting different triggers for the two forms of scaling. While we strongly believe we have demonstrated that GABAergic downscaling is dependent on spiking (not AMPAergic transmission), we now acknowledge that we cannot rule out the possibility that upward GABAergic scaling may be influenced by AMPAR activation (2nd paragraph discussion), although we have no evidence in support of this.

      Specific points:

      - The logic of the basis for the argument is somewhat flawed: A homeostat does not require a multiplicative mechanism, nor does it even need to be synaptic. Membrane excitability is a locus of homeostatic regulation of firing, for example. In addition, synapse-specific modulation can also be homeostatic. The only requirement of the homeostat is that its deployment subserves the stabilization of a biological parameter (e.g., firing rate). 

      We largely agree with the reviewer and should not have implied that this was a necessary requirement for a spike rate homeostat. What we should have said was that historically this definition has been applied to AMPAergic scaling, which is thought to be a spike rate homeostat. We have now corrected this (introduction and discussion).

      - Line 63 parenthetically references an important, but contradictory study as a brief "however". Given the tone of the writing, it would be more balanced to give this study at least a full sentence of exposition. 

      Agreed, and we have now done this.

      - The authors state (line 11) that expression of a hyperpolarizing conductance did not trigger scaling. More recent work ('Homeostatic synaptic scaling establishes the specificity of an associative memory') does this via expression of DREADDs and finds robust scaling.

      The purpose of citing this study was to argue that the spike rate homeostat hypothesis doesn’t make sense for AMPAergic scaling based on a study that hyperpolarized an individual cell while leaving the rest of the network unaltered and therefore leaving network activity and neurotransmission largely normal. In this previous study scaling was not triggered, suggesting reduced spike rate within an individual cell was insufficient to trigger scaling in that cell. The more recent study mentioned by the reviewer achieved scaling by hyperpolarizing a majority of cells in the network. Importantly, this approach alters neurotransmission throughout the network, making it challenging to isolate the specific contributions of spiking vs. receptor activation. Unlike the previous study, which focused on the impact within individual cells, this newer study involves global alterations in network activity, complicating the interpretation of the role of spiking versus receptor activation in triggering scaling.

      - Supplemental figure 1 looks largely linear to me? Out of curiosity, wouldn't you expect the left end to be aberrant because scaling up should theoretically increase the strength of some synapses that would have been previously below threshold for detection?

      We agree that the scaling ratio plot is largely linear. To be clear, the linearity of the ratio plot was not our point here; rather, the point was that there was a positive slope, meaning ratios (CNQX mEPSC amplitudes/control mEPSC amplitudes) got bigger for the larger CNQX-treated mEPSCs. Alternatively, a multiplicative relationship where mEPSCs are all increased by a single factor (e.g. 2X) would be a flat line with 0 slope at the multiplicative value (e.g. 2). In terms of the left side of the plot, we do see values that rise abruptly from 1 - this was partially obstructed by the Y axis in this figure and we have adjusted this. This left part of the plot is likely due to the CNQX-induced increases in mEPSC amplitudes of minis that were below our detection threshold of 5 pA, as suggested by the reviewer. Therefore, minis that were 4 pA could now be 5 pA after CNQX treatment, and these are then divided by the smallest control mEPSCs, which are 5 pA (ratio of 1). We tried to do a better job describing this in the resubmission (1st paragraph of results).
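      The distinction drawn above between a flat ratio plot and one with a positive slope can be illustrated with a small sketch. This is not the authors' analysis code: the function name `ratio_plot`, the amplitude values, and the non-uniform scaling rule are all hypothetical and chosen only to show the shape of the two cases.

```python
def ratio_plot(control, treated):
    """Rank-order both samples and pair them up.

    Returns a list of (treated_amplitude, treated/control ratio) tuples,
    smallest events first. Assumes both samples have comparable sizes.
    """
    c = sorted(control)
    t = sorted(treated)
    n = min(len(c), len(t))
    return [(t[i], t[i] / c[i]) for i in range(n)]


# Hypothetical control amplitudes in pA (illustrative values, not data).
control = [5.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 40.0]

# Case 1: purely multiplicative scaling, every event doubled.
multiplicative = [2.0 * a for a in control]
flat = [ratio for _, ratio in ratio_plot(control, multiplicative)]

# Case 2: hypothetical non-uniform scaling, larger events grow more.
non_uniform = [a * (1.0 + a / 40.0) for a in control]
rising = [ratio for _, ratio in ratio_plot(control, non_uniform)]

print("multiplicative ratios:", flat)
print("non-uniform ratios:   ", [round(r, 3) for r in rising])
```

      In the first case every ratio equals the scaling factor, so the plot is a horizontal line; in the second the ratio grows with amplitude, which is the positive slope referred to in the response.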

      - Given that figure 2B also shows warping at the tail ends of similar distributions, how is this to be interpreted? 

      The left side of the ratio plot shows evidence consistent with the idea that mIPSCs are dropping into the noise after CNQX treatment (the smallest GABA mIPSCs that don’t fall into the noise are 5 pA, and these are divided by the smallest control GABA mPSCs of 5 pA, so the ratio is 1). The rest of the distribution will then approach the scaling factor (50% in this case). On the right side of the ratio plot the values appear to slightly increase. We are not sure why this is happening, but it may be that a small percentage of mIPSCs are not purely multiplicative at 0.5; however, the biggest mPSCs can vary to a great degree from one cell to the next, and in other cases we do not see this (Figure 4B, Figure 5E). We tried to do a better job describing this in the resubmission (results describing Figure 2).

      - The readability of the figures is poor. Some of them have inconsistent boundary boxes, bizarre axes, text that appears skewed as if the figures were quickly thrown together and stretched to fit. 

      We have adjusted the figures to be more consistent throughout the manuscript.

      - I'm concerned about the optogenetic restoration of activity experiment. Cortical pyramidal neuron mean firing rates are log normally distributed and span multiple orders of magnitude. The stimulation experiments can only address the total firing at a network-level - given that a network level "mean" is meaningless in a lognormal distribution, how are we to think about the effect of this manipulation when it comes to individual neurons homeostatically stabilizing their own activities? In essence, the argument is made at the single-neuron level, but the experiment is conducted with a network-level resolution. 

      As described above, we do not have the capacity to know what the actual firing rate of a particular neuron was before and after perturbing the system, and certainly not for the specific cells we recorded from to obtain mPSC amplitudes, and so we cannot say that we have perfectly restored the original firing rates of neurons. However, there is reason to believe that this is achieved to some extent. Our optogenetic stimulation is only 50-100 ms long activating a subset of neurons. This is sufficient to provide a synaptic barrage that then triggers a full blown network burst where the majority of spikes occur, but this is after the light is off. In other words, the optogenetic light pulse only initiates what becomes a relatively normal network burst that fortunately allows the individual cells to express their relatively normal (pre-drug) activity pattern. In our previous study using optogenetic activity restoration (Fong et al., 2015) we were able to show that this was the case for individual units - the spiking of an individual unit during a burst is similar before and after CNQX/optogenetic stimulation (see Figure 4b and Suppl. Fig 4 in Fong et al. 2015). We are not claiming that we have restored spiking to exactly the pre-drug state, but bring it back toward those levels and we see this is associated with a return of the mIPSC amplitude to near control levels. We now include a brief description of this in the manuscript (results describing Figure 3).
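      The reviewer's concern about network-level averages of lognormally distributed firing rates can be made concrete with a short simulation. The sketch below uses made-up parameters (`mu`, `sigma`, the number of cells) rather than anything measured in the study, and the helper names are hypothetical; it only shows that the mean of such a distribution sits well above the median and is dominated by a minority of high-rate cells.

```python
import random
import statistics


def simulate_rates(n_cells, mu=0.0, sigma=1.5, seed=0):
    """Draw hypothetical firing rates (Hz) from a lognormal distribution."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_cells)]


def top_share(rates, fraction=0.1):
    """Fraction of all spikes contributed by the most active cells."""
    ordered = sorted(rates, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


rates = simulate_rates(10_000)
mean_rate = statistics.mean(rates)
median_rate = statistics.median(rates)
share = top_share(rates)

print(f"mean rate   = {mean_rate:.2f} Hz")
print(f"median rate = {median_rate:.2f} Hz")
print(f"top 10% of cells contribute {share:.0%} of all spikes")
```

      Under these assumptions a total or mean spike count mostly tracks the most active cells, which is why unit-level comparisons such as those cited from Fong et al. (2015) are informative alongside the network-level measure.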

      - Line 198-99: multiplicativity is not a requirement of a homeostatic mechanism.

      - Line 264-265 - again, neither multiplicativity nor synaptic mechanisms are fundamentally any more necessary for a homeostatic locus than anything else that can modulate firing rate via negative feedback. 

      As mentioned above, the multiplicative nature of scaling has been a historical proposal for AMPAergic scaling and we have now found such a relationship for GABAergic scaling. This is important for understanding how this plasticity works, but we agree that it is not necessary for a homeostat and we have adjusted the manuscript accordingly.

      - 277: do you mean AMPAR? 

      We were not clear enough here. We actually do mean GABAR. The idea was that CTZ increases network activity and thus increases both AMPAergic and GABAergic transmission. We have rewritten this part of the discussion to avoid any confusion (2nd paragraph discussion).

      - Example: Figure 1A is frustratingly unreadable. The axes on the raster insets are microscopic, the arrows are strangely large, and it seems unnecessary to fill so much real estate with 4 rasters. Only one is necessary to show the concept of a network burst. The effect of time+CNQX on the frequency of burst is shown in B and C.

      - Example: Figure 2 appears warped and hastily assembled. Statistical indications are shown within and outside of bounding boxes. Axes are not aligned. Labels are not aligned. Font sizes are not equal on equivalent axes. 

      These figures were generated by the estimation statistics website and text may have been resized inappropriately. We have tried to adjust this and now have attempted to standardize the axes text to the best of our ability.

      - The discussion should include mention of the limitations and/or constraints of drawing general conclusions from cell culture. 

      We have added this consideration at the end of the discussion. Further, this is why we cited studies that argue GABAergic neurons have a particularly important role in homeostatic regulation of firing following sensory deprivations in vivo.

      - The discussion should include mention of the role of developmental age in the expression of specific mechanisms. It is highly likely that what is studied at ~P14 is specific to early postnatal development. 

      We now discuss caveats of cortical cultures at the end of the discussion.

      It is essential to ensure that the data presented in the paper adequately supports the conclusions drawn. A more cautious approach in interpreting the results may lead to a stronger argument and a more robust understanding of the underlying mechanisms at play. 

      We have broadened our discussion of alternative interpretations throughout the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      While I am hesitant to judge a paper based on its tone, I would personally recommend revision of some of the subjective words and statements, as the manuscript undermines its own effectiveness by making unnecessarily strong statements. The text repeatedly paints an "either A or B" picture, and if there's any general lesson in biology, it's that it's always A and B. Global, multiplicative glutamatergic scaling could quite conceivably occur alongside GABAergic scaling, as well as synapse-specific homeostatic modifications. It seems that it would be wise to acknowledge that, while the data presented here point in one direction, in vivo results in an adult brain (for example) might present an entirely different set of patterns. This will not only enhance the readability of the paper but also ensure that the scientific community can engage with the work in a constructive and collaborative manner. Again, I present this as only a constructive and supportive suggestion. I am a big fan of work from this laboratory, and I would love to see this paper in an improved form - it's an important set of ideas and I do believe that these data are rigorously collected. 

      We have attempted to provide a more comprehensive interpretation of our results. We agree that a homeostat can come in many flavors, but do believe that GABAergic scaling is a strong candidate, whereas AMPAergic scaling does not currently fit such a role. We do now discuss caveats with our work and are open to other interpretations that need to be fleshed out in future work.

      Reviewer #2 (Public Review):

      Major points:

      (1) The reason why CNQX does not completely eliminate spiking is unclear (Fig. 1). What is the circuit mechanism by which spiking continues, although at lower frequency, in the absence of AMPA-mediated transmission and what is the mechanism by which spiking frequency grows back after 24h (still in the absence of AMPA transmission)?

      Is it possible that NMDA-mediated transmission takes over and triggers a different type of network plasticity?

      The bursting in AMPAR blockade is due to the remaining NMDA receptor-mediated transmission. We showed this in our previous study in Suppl. Figure 2 and 6 of Fong et al., 2015 (PMID: 25751516). Our ability to optically induce normal looking bursts of spikes was also dependent on NMDAR activation (Fong et al 2015 and Figure 6 Newman et al., 2015 - PMID: 26140329). Further, in Dr Fong’s PhD dissertation it was shown that the bursting activity was abolished when AMPA and NMDA receptors were both blocked. There are likely many factors that contribute to the recovery of activity, and certainly one of them is likely to be the weakening of inhibitory GABAergic currents as we had mentioned. We have now added the point about NMDARs mediating the remaining bursts in the manuscript (results associated with Figure 1). We are not clear on what the reviewer has in mind in terms of “NMDA-mediated transmission takes over and triggers a different kind of network plasticity”, but we do discuss the possibility that spiking triggers GABAergic scaling through its effect on NMDAergic transmission, which we cannot rule out, but also have no evidence in support of this idea (3rd and 5th paragraph of discussion). We do plan on addressing this in a future work.

      (2) A possible activation of NMDARs should be considered. One would think that experiments involving chronic glutamatergic blockade could have been conducted in the presence of NMDAR blockers. Why was this not the case?

      Unfortunately, it was not possible to optogenetically restore normal bursting in the presence of NMDAR blockade (even when AMPAergic transmission was intact), as NMDARs appeared to be critical for the optical restoration of the normal duration and form of the burst in rat cortical cultures (see Suppl. Figure 6 Fong et al., 2015 Nat Comm and Figure 6 Newman et al., 2015). Even high concentrations of CNQX (40µM) prevented us from restoring spiking in mouse cultures in the current study, which is why we moved to 20µM CNQX for this study. The reviewer raises an excellent point about a possible NMDAR contribution to altered synaptic strength, however. It is likely that NMDAR signaling is reduced in the presence of CNQX since burst frequency was dramatically reduced along with AMPAR-mediated depolarizations. We cannot rule out the possibility that NMDAR signaling could contribute to the alterations in GABAergic mIPSCs and discuss this in the resubmission (3rd and 5th paragraph of the discussion). We had not considered this previously because prior work suggested that a 24/48 hour block of NMDARs (APV) did not trigger AMPAergic scaling in cortical or hippocampal cultures (see Figure 1 Turrigiano et al., 1998 Nature and Suppl. Figure 4 Sutton et al., 2006 Cell). Moreover, our previous study showed that restoring NMDAergic transmission optogenetically, at least to some extent, had no influence on AMPAergic scaling (Fong et al., 2015).

      Also, experiments with global ChR2 stimulation with coincident pre and postsynaptic firing might also activate NMDARs and result in additional effects that should be taken into consideration for the global scaling mechanism.

      To be clear, our optical stimulation was of short duration (duration 50-100 ms) and was turned off before the vast majority of spiking that occurred in the bursts. So the light flash was a trigger that allowed a relatively normal looking burst to occur after the light was off (see lower panel of Figure 3B optogenetic stimulation – short duration only at onset of burst – we now make this clearer in resubmission). Therefore, we were unlikely to trigger significant synchronous activation that does not normally occur in network bursts.

      (3) Cultures exposed to CTZ to enhance AMPA receptors generated variable results (Fig. 5), somewhat increasing spiking activity in a non-significant manner but, at the same time, strengthening mIPSC amplitude. This result seems to suggest that spiking might be involved in GABAergic scaling, but it does not seem to prove it. Then, addition of TTX that blocked spiking reduced mIPSC amplitude. It was concluded here that the ability of CTZ to enhance GABAergic currents was primarily due to spiking, rather than the increase in AMPA-mediated currents. However, in addition to blocking action potentials, TTX would also prevent activation of AMPARs in the presence of CTZ due to the lack of glutamatergic release. Therefore, under these conditions, an effect of glutamatergic activation on GABAergic scaling cannot be ruled out.

      These concerns were very similar to reviewer 1’s first comments (see above). To be clear we are going a step beyond most scaling studies by assessing MEA-wide firing rate, but this still provides an incomplete picture of the particular cells that we target for patch recordings in terms of their firing before and after a drug. Further, we see considerable variability in effect on firing rate from culture to culture, which we now discuss in the resubmission (final paragraph discussion). The fact that mIPSCs are no different after TTX treatment vs CTZ+TTX treatment suggests that AMPAergic transmission is not so influential on GABAergic downscaling. While the CTZ results are not conclusive by themselves, taken together with the optogenetic results, where restoration of spiking in AMPAR blockade reverses scaling, they are most consistent with the idea that GABAergic scaling is triggered by spiking rather than AMPAR activation and place GABAergic scaling as a strong candidate as a spike rate homeostat. Although we do feel that we have demonstrated that downward GABAergic scaling is dependent on spiking, we cannot rule out the possibility that upward GABAergic scaling could be influenced by AMPAR activation to some extent. We now acknowledge this possibility (2nd paragraph discussion).

      (4) The sample size is not mentioned in any figure. How many cells/culture dishes were used in each condition?

      The individual dots represent either individual cells for mIPSC amplitude or individual cultures in MEA experiments. Number of cultures and cells are now stated in the figure legends.

      (5) Cortical cultures may typically contain about 5-10% GABAergic interneurons and 90-95 % pyramidal cells. One would think that scaling mechanisms occurring in pyramidal cells and interneurons could be distinct, with different impact on the network. Although for whole-cell recordings the authors selected pyramidal looking cells, which might bias recordings towards excitatory neurons, naked eye selection of recording cells is quite difficult in primary cultures. Some of the variability in mIPSC amplitude values (Fig. 2A for example) might be attributed to the cell type? One could use cultures where interneurons are fluorescently labeled to obtain an accurate representation. The issue of the possible differential effects of scaling in pyramidal cells vs. interneurons and the consequences in the network should be discussed.

      We now include this discussion in the resubmission (final paragraph discussion). Briefly, we chose large cells, which will be predominantly glutamatergic neurons as suggested by the reviewer. Ultimately, even among glutamatergic principal cells there may be variability in the response to drug application. All of these issues could contribute to variability and we have expanded our description of the variability in our results, including that based on cellular heterogeneity. 

      Reviewer #2 (Recommendations For The Authors):

      Minor comments –

      Fig S3: Please quantify changes in frequency

      We have done this (Supplemental Figure 5).

      Fig 2: please choose colors with higher contrast for CNQX/TTX

      We have done this.

      Fig. 3C: Why doesn't CNQX+PhotoStim reach control levels of bursting at 2h?

      The program was designed to follow and maintain total spike frequency and so it does a better job at this than maintaining burst frequency.

      Fig. 5A: please include a comparison between control and Ethanol

      We now do this in Figure 5C. Both are around 26 pA.

      Fig. 5C: where is the Etoh condition?

      We have made this figure more clear in terms of controls (Figure 5C & D).

      Reviewer #3 (Public Review):

      This paper concerns whether scaling (or homeostatic synaptic plasticity; HSP) occurs similarly at GABA and Glu synapses and comes to the surprising conclusion that these are regulated separately. This is surprising because these were thought to be co-regulated during HSP and in fact, the major mechanisms thought to underlie downscaling (TTX or CNQX driven), retinoic acid and TNF, have been shown to regulate both GABARs and AMPARs directly. (As a side note, it is unclear that the manipulations used in Joseph and Turrigiano represent HSP, and so might not be relevant). Thus the main result, that GABA HSP is dissociable from Glu HSP, is novel and exciting. This suggests either different mechanisms underlie the two processes, or that under certain conditions, another mechanism is engaged that scales one type of synapse and not the other.

      However, strong claims require strong evidence, and the results presented here only address GABA HSP, relying on previous work from this lab on Glu HSP (Fong, et al., 2015). But the previous experiments were done in rat cultures, while these experiments are done in mice and at somewhat different ages (DIV). Even identical culture systems can drift over time (possibly due to changes in the components of B27 or other media and supplements). Therefore it is necessary to demonstrate in the same system the dissociation. To be convincing, they need to show the mEPSCs for Fig 4, clearly showing the dissociation. Doing the same for Fig 5 would be great, but I think Fig 4 is the key.

      We understand the concern of the reviewer as we do see significant variability within our cultures and they were plated in different places, by different people, in different species (rat vs mouse). Therefore, we have attempted to redo the study on AMPAergic scaling on these mouse cortical neurons. Surprisingly, we found that 20µM CNQX did not trigger AMPAergic upscaling (new Figure 7), even though it did reduce spiking activity and was able to produce GABAergic downscaling. We did not carry out the optogenetic restoration of activity, because we did not trigger upscaling. The result does, however, show that the reductions in spiking/bursting that trigger GABAergic downscaling did not trigger AMPAergic upscaling, and it therefore dissociates the 2 forms of scaling in these mouse cultures. We do not know why 20 µM CNQX did not trigger scaling in these cultures since it does reduce spiking and AMPAR activation. In the Fong study we used 40µM CNQX because intracellular recordings from rat cortical neurons suggested this was required to completely block AMPAergic currents. Our initial studies in the current manuscript examining GABAergic scaling in mouse cortical cultures used 40µM CNQX; however, this concentration of CNQX prevented us from restoring spiking through optogenetic activation, so we reduced our concentration to 20µM CNQX, which did trigger GABAergic downscaling and allowed the restoration of spiking. We now show and discuss this result (Figure 7 and 3rd paragraph discussion).

      The paper also suggests that only receptor function or spiking could control HSP, and therefore if it is not receptor function then it must be spiking. This seems like a false dichotomy; there are of course other options. Details in the data may suggest that spiking is not the (or the only) homeostat, as TTX and CNQX cause identical changes in mIPSC amplitude but have different effects on spiking. Further, in Fig 5, CTZ had a minimal effect on spiking but a large effect on mIPSCs. Similar issues appear in Fig 6, where the induction of increased spiking is highly variable, with many cells showing control levels or lower spiking rates. Yet the synaptic changes are robust, across all cells. Overall, this is not persuasive that spiking is necessarily the homeostat for GABA synapses.

      Together our results argue against AMPAR or GABAR activation as a trigger for GABAergic scaling and that this is different than our results for AMPAergic scaling. These points alone are important to recognize. While changes in spiking do not perfectly follow the changes in GABAergic scaling they do always trend in the right direction. As mentioned above, total spiking activity is only one measure of spiking. It is possible that these drugs alter the pattern of spiking in a way that translates into altered calcium transients, which may be important for triggering the plasticity. Further, we acknowledge that we cannot rule out a role for NMDARs contributing to GABAergic scaling (3rd and 5th paragraph of discussion). Based on the variability that we observe and the nature of our MEA recordings we cannot precisely determine how the total activity or pattern of activity changes with drug application in the specific cells that we target for whole cell recordings, and this is now discussed (final paragraph of discussion). Again, it is important to note that we are going a step beyond most homeostatic plasticity studies that add a drug and simply assume it is having an effect on spiking (e.g. CNQX was initially thought to completely abolish spiking, but clearly does not). However, we believe that the most parsimonious explanation of our results supports our proposal that GABAergic scaling is a strong candidate as a spike rate homeostat. Regardless, in the resubmission we have included a broader discussion about these possibilities, and recognize that we cannot rule out the possibility that AMPAergic transmission could contribute to upward GABAergic scaling (2nd paragraph discussion).

      The paper also suggests that the timing of the GABA changes coincides with the spiking changes, but while they have the time course of the spiking changes and recovery, they only have the 24h time point for synaptic changes. It is impossible to conclude how the time courses align without more data.

      We can only say that by the 24 hour CNQX time point, when overall spiking is recovered in some but not all cultures and bursts have not recovered, that GABAergic scaling has already occurred. We now state this more clearly in the resubmission (near the end of the 2nd paragraph of the discussion).

      Reviewer #3 (Recommendations For The Authors):

      The statistics are inadequately described. The full information including actual p values should be given, particularly for the non-significant trends reported.

      We have done this in Figure legends.

      The abstract and introduction give the impression that GABA and Glu HSP are independent, though most work links them as occurring simultaneously and in a coordinated fashion to achieve homeostasis.

      While it is true that many studies have triggered both forms of scaling with activity or transmission blockade, these studies have not addressed whether these forms of scaling are actually triggered in the same way mechanistically, except potentially for the one study that we mentioned (Joseph et al.). Our results suggest they are independent. We now do mention the idea that these two forms of scaling have been assumed to be commonly triggered (3rd paragraph introduction).

      The data in Fig 6 is presented as if BIC treatment is a novel result, although BIC/Gabazine/PTX have been used to induce down-scaling in many previous papers. While it's good to have the results, they should be put in proper context. As suggested in the paper, testing if decreased GABAR function would lead to upscaling does not make sense given all the previous data. 

      Figure 6 shows GABAergic upscaling in response to GABAR block (bicuculline), but we are aware of only two other studies that looked at GABAergic scaling after treating with a GABAR blocker and they found upscaling but this was in hippocampal cultures, not cortical cultures (Peng et al., 2010 - PMID: 21123568, Pribiag et al., 2014 - PMID: 24753587). We now mention this in the results section describing Figure 6. While many studies have blocked GABARs and find AMPAergic downscaling, we are addressing the triggers for GABAergic scaling in Figure 6.

      Is Fig S4B mislabeled? The title says spike rate but the graph axis says burst frequency.

      The reviewer is correct and we have now adjusted this.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Protein conformational changes are often critical to protein function, but obtaining structural information about conformational ensembles is a challenge. Over a number of years, the authors of the current manuscript have developed and improved an algorithm, qFit protein, that models multiple conformations into high resolution electron density maps in an automated way. The current manuscript describes the latest improvements to the program, and analyzes the performance of qFit protein in a number of test cases, including classical statistical metrics of data fit like Rfree and the gap between Rwork and Rfree, model geometry, and global and case-by-case assessment of qFit performance at different data resolution cutoffs. The authors have also updated qFit to handle cryo-EM datasets, although the analysis of its performance is more limited due to a limited number of high-resolution test cases and less standardization of deposited/processed data.

      Strengths:

      The strengths of the manuscript are the careful and extensive analysis of qFit's performance over a variety of metrics and a diversity of test cases, as well as the careful discussion of the limitations of qFit. This manuscript also serves as a very useful guide for users in evaluating if and when qFit should be applied during structural refinement.

      Reviewer #2 (Public Review):

      Summary

      The manuscript by Wankowicz et al. describes updates to qFit, an algorithm for the characterization of conformational heterogeneity of protein molecules based on X-ray diffraction or cryo-EM data. The work provides a clear description of the algorithm used by qFit. The authors then proceed to validate the performance of qFit by comparing it to deposited X-ray entries in the PDB in the 1.2-1.5 Å resolution range as quantified by Rfree, Rwork-Rfree, detailed examination of the conformations introduced by qFit, and performance on stereochemical measures (MolProbity scores). To examine the effect of experimental resolution of X-ray diffraction data, they start from an ultra high-resolution structure (SARS-CoV2 Nsp3 macrodomain) to determine how the loss of resolution (introduced artificially) degrades the ability of qFit to correctly infer the nature and presence of alternate conformations. The authors observe a gradual loss of ability to correctly infer alternate conformations as resolution degrades past 2 Å. The authors repeat this analysis for a larger set of entries in a more automated fashion and again observe that qFit works well for structures with resolutions better than 2 Å, with a rapid loss of accuracy at lower resolution. Finally, the authors examine the performance of qFit on cryo-EM data. Despite a few prominent examples, the authors find only a handful (8) of datasets for which they can confirm a resolution better than 2.0 Å. The performance of qFit on these maps is encouraging and will be of much interest because cryo-EM maps will, presumably, continue to improve and because of the rapid increase in the availability of such data for many supramolecular biological assemblies. As the authors note, practices in cryo-EM analysis are far from uniform, hampering the development and assessment of tools like qFit.
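      For readers less familiar with the Rwork and Rfree metrics mentioned in this summary, the sketch below computes the standard crystallographic R-factor, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, separately over working and free reflections. It is an illustration only: the structure-factor amplitudes and free-set flags are invented, the function names are hypothetical, and the scale factor that refinement programs apply between observed and calculated amplitudes is assumed to be 1.

```python
def r_factor(f_obs, f_calc):
    """Standard R-factor over paired structure-factor amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_free(f_obs, f_calc, free_flags):
    """Return (R_work, R_free) given a boolean free-set flag per reflection."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Invented amplitudes; the last reflection is held out as the free set.
f_obs = [100.0, 50.0, 80.0, 20.0]
f_calc = [90.0, 55.0, 80.0, 25.0]
free_flags = [False, False, False, True]

r_work, r_free = r_work_free(f_obs, f_calc, free_flags)
gap = r_free - r_work
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}, gap = {gap:.3f}")
```

      A growing gap between the two values is commonly read as a sign of overfitting, which is why the review lists the Rwork-Rfree gap alongside Rfree itself.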

      Strengths

      qFit improves the quality of refined structures at resolutions better than 2.0 Å, in terms of reflecting true conformational heterogeneity and geometry. The algorithm is well designed and does not introduce spurious or unnecessary conformational heterogeneity. I was able to install and run the program without a problem within a computing cluster environment. The paper is well written and the validation thorough.

      I found the section on cryo-EM particularly enlightening, both because it demonstrates the potential for discovery of conformational heterogeneity from such data by qFit, and because it clearly explains the hurdles towards this becoming common practice, including lack of uniformity in reporting resolution, and differences in map and solvent treatment.

      Weaknesses

      The authors begin the results section by claiming that they made "substantial improvement" relative to the previous iteration of qFit, "both algorithmically (e.g., scoring is improved by BIC, sampling of B factors is now included) and computationally (improving the efficiency and reliability of the code)" (bottom of page 3). However, the paper does not provide a comparison to previous iterations of the software or quantitation of the effects of these specific improvements, such as whether scoring is improved by the BIC, how the application of BIC has changed since the previous paper, whether sampling of B factors helps, and whether the code is faster. It would help the reader to understand what, if any, the significance of each of these improvements was.

      Indeed, it is difficult (embarrassingly) to benchmark against our past work due to the dependencies on different python packages and the lack of software engineering. With the infrastructure we’ve laid down with this paper, made possible by an EOSS grant from CZI, that will not be a problem going forward. Not only is the code more reliable and standardized, but we have developed several scientific test sets that can be used as a basis for broad comparisons to judge whether improvements are substantial. We’ve also changed “substantial improvement” to “several modifications” to indicate the lack of comparison to past versions.
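      Since BIC-based scoring is mentioned without detail in this exchange, a generic sketch of model selection with the Bayesian information criterion may help. This is not qFit's implementation: it assumes the common Gaussian-residual form BIC = n·ln(RSS/n) + k·ln(n), and the residual values, the number of density points, and the parameter count per conformer are all invented for illustration.

```python
import math


def bic(rss, n_points, n_params):
    """BIC for a least-squares fit with Gaussian residuals (lower is better)."""
    return n_points * math.log(rss / n_points) + n_params * math.log(n_points)


def select_model(candidates, n_points, params_per_conformer):
    """Pick the conformer count with the lowest BIC.

    candidates maps a number of conformers to the residual sum of squares
    obtained when fitting that many conformers to the density.
    """
    scores = {
        n_conf: bic(rss, n_points, n_conf * params_per_conformer)
        for n_conf, rss in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits: a second conformer helps a lot, a third barely at all.
candidates = {1: 120.0, 2: 80.0, 3: 79.0}
best, scores = select_model(candidates, n_points=500, params_per_conformer=10)

for n_conf, score in sorted(scores.items()):
    print(f"{n_conf} conformer(s): BIC = {score:.1f}")
print("selected:", best)
```

      The penalty term grows with every added conformer, so a third conformer that barely lowers the residual is rejected; this is the general way a BIC criterion guards against spurious heterogeneity.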

      The exclusion of structures containing ligands and multichain protein models in the validation of qFit was puzzling since both are very common in the PDB. This may convey the impression that qFit cannot handle such use cases. (Although it seems that qFit has an algorithm dedicated to modeling ligand heterogeneity and seems to be able to handle multiple chains). The paper would be more effective if it explained how a user of the software would handle scenarios with ligands and multiple chains, and why these would be excluded from analysis here.

      qFit can indeed handle both. We left out multiple chains for simplicity in constructing a dataset enriched for small proteins while still covering diversity to speed the ability to rapidly iterate and test our approaches. Improvements to qFit ligand handling will be discussed in a forthcoming work as we face similar technical debt to what we saw in proteins and are undergoing a process of introducing “several modifications” that we hope will lead to “substantial improvement” - but at the very least will accelerate further development.

      It would be helpful to add some guidance on how/whether qFit models can be further refined afterwards in Coot, Phenix, ..., or whether these models are strictly intended as the terminal step in refinement.

      We added to the abstract:

      “Importantly, unlike ensemble models, the multiconformer models produced by qFit can be manually modified in most major model building software (e.g. Coot) and fit can be further improved by refinement using standard pipelines (e.g. Phenix, Refmac, Buster).”

      and introduction:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot12 unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      and results:

      “This model can then be examined and edited in Coot12 or other visualization software, and further refined using software such as phenix.refine, refmac, or buster as the modeler sees fit.”

      and discussion:

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Appraisal & Discussion

      Overall, the authors convincingly demonstrate that qFit provides a reliable means to detect and model conformational heterogeneity within high-resolution X-ray diffraction datasets and (based on a smaller sample) in cryo-EM density maps. This represents the state of the art in the field and will be of interest to any structural biologist or biochemist seeking to attain an understanding of the structural basis of the function of their system of interest, including potential allosteric mechanisms-an area where there are still few good solutions. That is, I expect qFit to find widespread use.

      Reviewer #3 (Public Review):

      Summary:

      The authors address a very important issue of going beyond a single-copy model obtained by the two principal experimental methods of structural biology, macromolecular crystallography and cryo electron microscopy (cryo-EM). Such multiconformer model is based on the fact that experimental data from both these methods represent a space- and time-average of a huge number of the molecules in a sample, or even in several samples, and that the respective distributions can be multimodal. Different from structure prediction methods, this approach is strongly based on high-resolution experimental information and requires validated single-copy high-quality models as input. Overall, the results support the authors' conclusions.

      In fact, the method addresses two problems which could be considered separately:

      - An automation of construction of multiple conformations when they can be identified visually;

      - A determination of multiple conformations when their visual identification is difficult or impossible.

      We often think about this problem similarly to the reviewer. However, in building qFit, we do not want to separate these problems - but rather use the first category (obvious visual identification) to build an approach that can accomplish part of the second category (difficult to visualize) without building “impossible”/nonexistent conformations - with a consistent approach/bias.

      The first one is a known problem, when missing alternative conformations may cost a few percent in R-factors. While these conformations are relatively easy to detect and build manually, the current procedure may save significant time being quite efficient, as the test results show.

      We agree with the reviewers' assessment here. The “floor” in terms of impact is automating a tedious part of high resolution model building and improving model quality.

      The second problem is important from the physical point of view and has been addressed first by Burling & Brunger (1994; https://doi.org/10.1002/ijch.199400022). The new procedure deals with a second-order variation in the R-factors, of about 1% or less, like placing riding hydrogen atoms, modeling density deformation or variation of the bulk solvent. In such situations, it is hard to justify model improvement. Keeping Rfree values or their marginal decreasing can be considered as a sign that the model is not overfitted data but hardly as a strong argument in favor of the model.

      We agree with the overall sentiment of this comment. What is a significant variation in R-free is an important question that we have looked at previously (http://dx.doi.org/10.1101/448795) and others have suggested an R-sleep for further cross validation (https://pubmed.ncbi.nlm.nih.gov/17704561/). For these reasons it is important to get at the significance of the changes to model types from large and diverse test sets, as we have here and in other works, and from careful examination of the biological significance of alternative conformations with experiments designed to test their importance in mechanism.

      In general, overall targets are less appropriate for this kind of problem and local characteristics may be better indicators. Improvement of the model geometry is a good choice. Indeed, yet Cruickshank (1956; https://doi.org/10.1107/S0365110X56002059) showed that averaged density images may lead to a shortening of covalent bonds when interpreting such maps by a single model. However, a total absence of geometric outliers is not necessarily required for the structures solved at a high resolution where diffraction data should have more freedom to place the atoms where the experiments "see" them.

      Again, we agree—geometric outliers should not be completely absent, but it is comforting when they and model/experiment agreement both improve.

      The key local characteristic for multi conformer models is a closeness of the model map to the experimental one. Actually, the procedure uses a kind of such measure, the Bayesian information criteria (BIC). Unfortunately, there is no information about how sharply it identifies the best model, how much it changes between the initial and final models; in overall there is not any feeling about its values. The Q-score (page 17) can be a tool for the first problem where the multiple conformations are clearly separated and not for the second problem where the contributions from neighboring conformations are merged. In addition to BIC or to even more conventional target functions such as LS or local map correlation, the extreme and mean values of the local difference maps may help to validate the models.

      We agree with the reviewer that the problem of “best” model determination is poorly posed here. We have been thinking a lot about this in the context of Bayesian methods (see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278553/); however, a major stumbling block is in how variable representations of alternative conformations (and compositions) are handled. The answers are more (but by no means simply) straightforward for ensemble representations where the entire system is constantly represented but with multiple copies.
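
      As an illustration of how a BIC-style score trades fit against model size, the sketch below uses the generic least-squares form of the BIC (Gaussian residuals, additive constants dropped). How qFit counts data points and parameters is not stated in this response, so the function name, the voxel count, and the residual values are hypothetical and chosen only to show the behaviour.

```python
import math


def bic_gaussian(rss, n_points, n_params):
    """Generic BIC for a least-squares fit, up to an additive constant.

    rss      : residual sum of squares between observed and modelled density
    n_points : number of data points (assumed here: map voxels around a residue)
    n_params : number of free parameters (assumed here: one per conformer)
    """
    return n_points * math.log(rss / n_points) + n_params * math.log(n_points)


# Hypothetical residuals for 1, 2, and 3 conformers fitted to 1000 voxels.
candidates = {1: 50.0, 2: 40.0, 3: 39.9}
scores = {k: bic_gaussian(rss, 1000, k) for k, rss in candidates.items()}

# The lowest BIC is preferred; the third conformer barely lowers the
# residual, so its extra parameter is not rewarded.
best = min(scores, key=scores.get)
print(best, scores)
```

      The gap between the best and second-best score is one way to express how sharply a criterion of this kind separates candidate models.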

      This method with its results is a strong argument for a need in experimental data and information they contain, differently from a pure structure prediction. At the same time, absence of strong density-based proofs may limit its impact.

      We agree - indeed we think it will be difficult to further improve structure prediction methods without much more interaction with the experimental data.

      Strengths:

      Addressing an important problem and automatization of model construction for alternative conformations using high-resolution experimental data.

      Weaknesses:

      An insufficient validation of the models when no discrete alternative conformations are visible and essentially missing local real-space validation indicators.

      While not perfect real space indicators, local real-space validation is implicit in the MIQP selection step and explicit when we do employ Q-score metrics.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A point of clarification: I don't understand why waters seem to be handled differently in for cryo-EM and crystallography datasets. I am interested about the statement on page 19 that the Molprobity Clashscore gets worse for cryo-EM datasets, primarily due to clashes with waters. But the qFit algorithm includes a round of refinement to optimize placement of ordered waters, and the clashscore improves for the qFit refinement in crystallography test cases. Why/how is this different for cryo-EM?

      We agree that this was not an appropriate point. We believe that the high clash score is coming from side chains being incorrectly modeled. We have updated this in the manuscript and it will be a focus of future improvements.

      Reviewer #2 (Recommendations For The Authors):

      - It would be instructive to the reader to explain how qFit handles the chromophore in the PYP (1OTA) example. To this end, it would be helpful to include deposition of the multiconformer model of PYP. This might also be a suitable occasion for discussion of potential hurdles in the deposition of multiconformer models in the PDB (if any!). Such concerns may be real concerns causing hesitation among potential users.

      Thank you for this comment. qFit does not alter the position or connectivity of any HETATM records (like the chromophore in this structure). Handling covalent modifications like this is an area of future development.

      Regarding deposition, we have noted above that the discussion now includes:

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Finally, we have placed all PDBs in a Zenodo deposition (XXX) and have included that language in the manuscript. It is currently under a separate data availability section (page XXX). We will defer to the editor as to the best header for it to go under.

      - It may be advisable to take the description of true/false pos/negatives out of the caption of Figure 4, and include it in a box or so, since these terms are important in the main text too, and the caption becomes very cluttered.

      We think adding the description of true/false pos/negatives to the Figure panel would make it very cluttered and wordy. We would like to retain this description within the caption. We have also briefly described each in the main text.

      - page 21, line 4: some issue with citation formatting.

      We have updated these citations.

      - page 25, second paragraph: cardinality is the number of members of a set. Perhaps "minimal occupancy" is more appropriate.

      Thank you for pointing this out. This was a mistake and should have been called the occupancy threshold.

      - page 26: it's - its

      Thank you, we have made this change. 

      - Font sizes in Supplementary Figures 5-7 are too small to be readable.

      We agree and will make this change. 

      Reviewer #3 (Recommendations For The Authors):

      General remarks

      (1) As I understand, the procedure starts from shifting residues one by one (page 4; A.1). Then, geometry reconstruction (e.g., B1) may be difficult in some cases joining back the shifted residues. It seems that such backbone perturbation can be done more efficiently by shifting groups of residues ("potential coupled motions") as mentioned at the bottom of page 9. Did I miss its description?

      We would describe the algorithm as sampling (which includes minimal shifts) in the backbone residues to ensure we can link neighboring residues. We agree that future iterations of qFit should include more effective backbone sampling by exploring motion along the Cβ-Cα, C-N, and (Cβ-Cα × C-N) bonds and exploring correlated backbone movements.

      (2) While the paper is well split in clear parts, some of them seem to be not at their right/optimal place and better can be moved to "Methods" (detailed "Overview of the qFit protein algorithm" as a whole) or to "Data" missed now (Two first paragraphs of "qFit improves overall fit...", page 8, and "Generating the qFit test set", page 22, and "Generating synthetic data ..." at page 26; description of the test data set), At my personal taste, description of tests with simulated data (page 15) would be better before that of tests with real data.

      Thank you for this comment, but we stand by our original decision to keep the general flow of the paper as it was submitted.

      (3) I wonder if the term "quadratic programming" (e.g., A3, page 5) is appropriate. It supposes optimization of a quadratic function of the independent parameters and not of "some" parameters. This is like the crystallographic LS which is not a quadratic function of atomic coordinates, and I think this is a similar case here. Whatever the answer on this remark is, an example of the function and its parameters is certainly missed.

      We think that the term quadratic programming is appropriate. We fit the coefficients by minimizing a quadratic loss on the residual (observed density - calculated density), while satisfying the constraints on the independent parameters. We agree that the quadratic function is missing from the paper, and we have now included it in the Methods section.
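
      To make the shape of such a problem concrete, here is a minimal sketch of a quadratic program over conformer weights. It assumes the weights are occupancies constrained to be non-negative and to sum to at most 1, and it solves the problem with projected gradient descent rather than a dedicated QP or MIQP solver. The density vectors, function names, and constraints are illustrative assumptions, not the function added to the Methods.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(w):
    """Euclidean projection onto {w_i >= 0, sum(w) <= 1}."""
    clipped = [max(x, 0.0) for x in w]
    if sum(clipped) <= 1.0:
        return clipped
    # Otherwise project onto the simplex sum(w) == 1 (sort-based method).
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(sorted(w, reverse=True), start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in w]


def fit_occupancies(rho_obs, rho_calc, steps=3000):
    """Minimize 0.5 * ||rho_obs - sum_i w_i * rho_calc[i]||^2 over w."""
    n = len(rho_calc)
    gram = [[dot(a, b) for b in rho_calc] for a in rho_calc]  # quadratic term
    lin = [dot(a, rho_obs) for a in rho_calc]                 # linear term
    # trace(gram) bounds the largest eigenvalue, so this step size is safe.
    step = 1.0 / sum(gram[i][i] for i in range(n))
    w = [0.0] * n
    for _ in range(steps):
        grad = [dot(gram[i], w) - lin[i] for i in range(n)]
        w = project([w[i] - step * grad[i] for i in range(n)])
    return w


# Toy 1D "densities" for three candidate conformers (hypothetical values).
rho_calc = [[1.0, 0.5, 0.0, 0.0], [0.0, 0.5, 1.0, 0.0], [0.0, 0.0, 0.5, 1.0]]
# Observed density built from 70% of conformer 0 and 30% of conformer 1.
rho_obs = [0.7, 0.5, 0.3, 0.0]
weights = fit_occupancies(rho_obs, rho_calc)
print([round(x, 3) for x in weights])
```

      The objective is quadratic in the weights because the candidate densities are fixed during the fit; it would not be quadratic in the atomic coordinates themselves, which is the distinction the reviewer raises.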

      Technical remarks to be answered by the authors :

      (1) Page 1, Abstract, line 3. The ensemble modeling is not the only existing frontier, and saying "one of the frontiers" may be better. Also, this phrase gives a confusing impression that the authors aim to predict the ensemble models while they do it with experimental data.

      We agree with this statement and have re-worded the abstract to reflect this.

      (2) Page 2. Burling & Brunger (1994) should be cited as predecessors. On the contrary, an excellent paper by Pearce & Gros (2021) is not relevant here.

      We agree that we should mention the Burling & Brunger paper. However, the Pearce & Gros (2021) reference should not be removed, as it is not discussing the method of ensemble refinement.

      (3) Page 2, bottom. "Further, when compared to ..." The preference to such approach sounds too much affirmative.

      We have amended this sentence to state:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot(Emsley et al. 2010) unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      The point we were trying to make in this sentence was that ensemble-based models are much harder to manually manipulate in Coot or other similar software compared to multiconformer models. We think that the new version of this sentence states this point more clearly.

      (4) Page 2, last paragraph. I do not see an obvious relation of references 15-17 to the phrase they are associated with.

      We disagree with this statement, and think that these references are appropriate.

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot12 unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      (5) Page 3, paragraph 2. Cryo-EM maps should be also "high-resolution"; it does not read like this from the phrase.

      We agree that high-resolution should be added, and the sentence now states:

      “However, many factors make manually creating multiconformer models difficult and time-consuming. Interpreting weak density is complicated by noise arising from many sources, including crystal imperfections, radiation damage, and poor modeling in X-ray crystallography, and errors in particle alignment and classification, poor modeling of beam induced motion, and imperfect Detector Quantum Efficiency (DQE) in high-resolution cryo-EM.”

      (6) Page 3, last paragraph before "results". The words "... in both individual cases and large structural bioinformatic projects" do not have much meaning, except introducing a self-reference. Also, repeating "better than 2 A" looks not necessary.

      We agree that this was unnecessary and have simplified the last sentence to state:

      “With the improvements in model quality outlined here, qFit can now be increasingly used for finalizing high-resolution models to derive ensemble-function insights.”

      (7) Page 3. "Results". Could "experimental" be replaced by a synonym, like "trial", to avoid confusing with the meaning "using experimental data"?

      We have replaced experimental with exploratory to describe the use of qFit on CryoEM data. The statement now reads:

      “For cryo-EM modeling applications, equivalent metrics of map and model quality are still developing, rendering the use of qFit for cryo-EM more exploratory.”

      (8) Page 4, A.1. Should it be "steps +/- 0.1" and "coordinate" be "coordinate axis"? One can modify coordinates and not shift them. I do not understand how, with the given steps, the authors calculated the number of combinations ("from 9 to 81"). Could a long "Alternatively, ...absent" be reduced simply to "Otherwise"?

      We have simplified and clarified the sentence on the sampling of backbone coordinates to state:

      “If anisotropic B-factors are absent, the translation of coordinates occurs in the X, Y, and Z directions. Each translation takes place in steps of 0.1 along each coordinate axis, extending to 0.3 Å, resulting in 9 (if isotropic) or 81 (if anisotropic) distinct backbone conformations for further analysis.”

      (9) Page 6, B.1, line 2. Word "linearly" is meaningless here.

      We have modified this to read:

      “Moving from N- to C- terminus along the protein,”

      (10) Page 9, line 2. It should be explained which data set is considered as the test set to calculate Rfree.

      We think this is clear and would be repetitive if we duplicated it.

      (11) Page 9, line 7. It should be "a valuable metric" and not "an"

      We agree and have updated the sentence to read:

      “Rfree is a valuable metric for monitoring overfitting, which is an important concern when increasing model parameters as is done in multiconformer modeling.”

      (12) Page 10, paragraph 3. "... as a string (Methods)". I did not find any other mention of this term "string", including in "Methods" where it supposed to be explained. Either this should be explained (and an example is given?), or be avoided.

      We agree that string is not necessary (discussing the programmatic datatype). We have removed this from the sentence. It now reads:

      “To quantify how often qFit models new rotameric states, we analyzed the qFit models with phenix.rotalyze, which outputs the rotamer state for each conformer (Methods).”

      (13) Page10, lines 3-4 from bottom. Are these two alternative conformations justified?

      We are unsure what this is referring to.

      (14) Page 12, Fig. 2A. In comparison with Supplement Fig 2C, the direction of axes is changed. Could they be similar in both Figures?

      We have updated Supplementary Figure 2C to have the same direction of axes as Figure 2A.

      (15) Page 15, section's title. Choose a single verb in "demonstrate indicate".

      We have amended the title of this section to be:

      “Simulated data demonstrate qFit is appropriate for high-resolution data.”

      (16) Page 15, paragraph 2. "Structure factors from 0.8 to 3.0 A resolution" does not mean what the author wanted apparently to tell: "(complete?) data sets with the high-resolution limit which varied from 0.8 to 3.0 A ...". Also, a phrase of "random noise increasing" is not illustrated by Figs.5 as it is referred to.

      We have edited this sentence to now read:

      “To create the dataset for resolution dependence, we used the ground truth 7KR0 model, including all alternative conformations, and generated artificial structure factors with a high resolution limit ranging from  0.8 to 3.0 Å resolution (in increments of 0.1 Å).”

      (17) Page 15, last paragraph is written in a rather formal and confusing way while a clearer description is given in the figure legend and repeated once more in Methods. I would suggest to remove this paragraph.

      We agree that this is confusing. Instead of creating a true positive/false positive/true negative/false negative matrix, we have just called things as they are: multiconformer or single conformer, and match or no match. We have edited the language in the manuscript and figure legends to reflect these changes.

      (18) Page 16. Last two paragraphs start talking about a new story and it would help to separate them somehow from the previous ones (sub-title?).

      We agree that this could use a subtitle. We have included the following subtitle above this section:

      “Simulated multiconformer data illustrate the convergence of qFit.”

      (19) Page 20. "or static" and "we determined that" seem to be not necessary.

      We have removed static and only used single conformer models. However, as one of the main conclusions of this paper is determining that qFit can pick up on alternative conformers that were modeled manually, we have decided to keep the “we determined that”.

      (20) Page 21, first paragraph. "Data" are plural; it should be "show" and "require"

      We have made these edits. The sentence now reads:

      “However, our data here show that not only does qFit need a high-resolution map to be able to detect signal from noise, it also requires a very well-modeled structure as input.”

      (21) Page 21, References should be indicated as [41-45], [35,46-48], [55-57]. A similar remark to [58-63] at page 22.

      We have fixed the reference layout to reflect this change.

      (22) Page 21, last paragraph. "Further reduce R-factors" (moreover repeated twice) is not correct neither by "further", since here it is rather marginal, nor as a goal; the variations of R-factors are not much significant. A more general statement like "improving fit to experimental data" (keeping in mind density maps) may be safer.

      We agree with the duplicative nature of these statements. We have amended the sentence to now read:

      “Automated detection and refinement of partial-occupancy waters should help improve fit to experimental data, further reduce Rfree15, and provide additional insights into hydrogen-bond patterns and the influence of solvent on alternative conformations.”

      (23) Page 22. Sub-sections of "Methods" are given in a little bit random order; "Parallelization of large maps" in the middle of the text is an example. Put them in a better order may help.

      We have moved some sections of the Methods around and made better headings by using an underscore to highlight the subsections (Generating and running the qFit test set, qFit improved features, Analysis metrics, Generating synthetic data for resolution dependence).

      (24) Page 24. Non-convex solution is a strange term. There exist non-convex problems and functions and not solutions.

      We agree and we have changed the language to reflect that we present the algorithm with non-convex problems which it cannot solve.

      (25) Page 26, "Metrics". It is worthy to describe explicitly the metrics and not (only) the references to the scripts.

      For all metrics, we provide a sentence or two on what each metric describes. As these metrics are well known in the structural biology field, we do not feel that we need to elaborate on them more.

      (26) Page 26. Multiplying B by occupancy does not have much sense. A better option would be to refer to the density value in the atomic center as occ*(4*pi/B)^1.5 which gives a relation between these two entities.

      We agree and have updated the B-factor figures and metrics to reflect this.
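
      For reference, the reviewer's expression occ*(4*pi/B)^1.5 can be evaluated as in the sketch below. It treats the atom as a single isotropic Gaussian and omits the element-dependent scattering factor, so the result is a relative peak height rather than an absolute density; the function name is hypothetical.

```python
import math


def relative_peak_density(occupancy, b_factor):
    """Relative density at the atomic center: occ * (4*pi / B)**1.5.

    b_factor is the isotropic B-factor in square angstroms and must be > 0.
    """
    if b_factor <= 0:
        raise ValueError("b_factor must be positive")
    return occupancy * (4.0 * math.pi / b_factor) ** 1.5


# Two hypothetical alternative conformers of the same atom.
major = relative_peak_density(0.7, 15.0)
minor = relative_peak_density(0.3, 25.0)
print(major, minor, major / minor)
```

      Unlike the product of B-factor and occupancy, this quantity falls as B rises and scales linearly with occupancy, which matches how the two parameters affect the map.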

      (27) Page 40, suppl. Fig. 5. Due to the color choice, it is difficult to distinguish the green and blue curves in the diagram.

      We have amended this by switching the colors of the curves.

      (28) Page 42, Suppl. Fig. 7. (A) How the width of shaded regions is defined? (B) What the blue regions stand for? Input Rfree range goes up to 0.26 and not to 0.25; there is a point at the right bound. (C) Bounds for the "orange" occupancy are inversed in the legend.

      (A) The width of the shaded region denotes the standard deviations among the values at every resolution. We have made this clearer in the caption.

      (B) The blue region denotes the confidence interval for the regression estimate. Size of the confidence interval was set to 95%. We have made this clearer in the caption.

      (C) This has been fixed now.

      The maximum R-free value is 0.2543, which we rounded down to 0.25.

      (29) Page 43. Letters E-H in the legend are erroneously substituted by B-E.

      We apologize for this mistake. It is now corrected.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study makes a valuable empirical contribution to our understanding of visual processing in primates and deep neural networks, with a specific focus on the concept of factorization. The analyses provide solid evidence that high factorization scores are correlated with neural predictivity, yet more evidence would be needed to show that neural responses show factorization. Consequently, while several aspects require further clarification, in its current form this work is interesting to systems neuroscientists studying vision and could inspire further research that ultimately may lead to better models of or a better understanding of the brain.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper investigates visual processing in primates and deep neural networks (DNNs), focusing on factorization in the encoding of scene parameters. It challenges the conventional view that object classification is the primary function of the ventral visual stream, suggesting instead that the visual system employs a nuanced strategy involving both factorization and invariance. The study also presents empirical findings suggesting a correlation between high factorization scores and good neural predictivity.

      Strengths:

      (1) Novel Perspective: The paper introduces a fresh viewpoint on visual processing by emphasizing the factorization of non-class information.

      (2) Methodology: The use of diverse datasets from primates and humans, alongside various computational models, strengthens the validity of the findings.

      (3) Detailed Analysis: The paper suggests metrics for factorization and invariance, contributing to a future understanding & measurements of these concepts.

      Weaknesses:

      (1) Vagueness (Perceptual or Neural Invariance?): The paper uses the term 'invariance', typically referring to perceptual stability despite stimulus variability [1], as the complete discarding of nuisance information in neural activity. This oversimplification overlooks the nuanced distinction between perceptual invariance (e.g., invariant object recognition) and neural invariance (e.g., no change in neural activity). It seems that by 'invariance' the authors mean 'neural' invariance (rather than 'perceptual' invariance) in this paper, which is vague. The paper could benefit from changing what is called 'invariance' in the paper to 'neural invariance' and distinguish it from 'perceptual invariance,' to avoid potential confusion for future readers. The assignment of 'compact' representation to 'invariance' in Figure 1A is misleading (although it can be addressed by the clarification on the term invariance). [1] DiCarlo JJ, Cox DD. Untangling invariant object recognition. Trends in cognitive sciences. 2007 Aug 1;11(8):333-41.

      Thanks for pointing out this ambiguity. In our Introduction we now explicitly clarify that we use “invariance” to refer to neural, rather than perceptual invariance, and we point out that both factorization and (neural) invariance may be useful for obtaining behavioral/perceptual invariance.

      (2) Details on Metrics: The paper's explanation of factorization as encoding variance independently or uncorrelatedly needs more justification and elaboration. The definition of 'factorization' in Figure 1B seems to be potentially misleading, as the metric for factorization in the paper seems to be defined regardless of class information (can be defined within a single class). Does the factorization metric as defined in the paper (orthogonality of different sources of variation) warrant that responses for different object classes are aligned/parallel like in 1B (middle)? More clarification around this point could make the paper much richer and more interesting.

      Our factorization metric measures the degree to which two sets of scene variables are factorized from one another. In the example of Fig. 1B, we apply this definition to the case of factorization of class vs. non-class information. Elsewhere in the paper we measure factorization of several other quantities unrelated to class, specifically camera viewpoint, lighting conditions, background content, and object pose. In our revised manuscript we have clarified the exposition surrounding Fig. 1B to make it clear that factorization, as we define it, can be applied to other quantities as well and that responses do not need to be aligned/parallel but simply live in a different set of dimensions whether linearly or nonlinearly arranged. Thanks for raising the need to clarify this point.

      (3) Factorization vs. Invariance: Is it fair to present invariance vs. factorization as mutually exclusive options in representational hypothesis space? Perhaps a more fair comparison would be factorization vs. object recognition, as it is possible to have different levels of neural variability (or neural invariance) underlying both factorization and object recognition tasks.

      We do not mean to imply that factorization and invariance are mutually exclusive, or that they fully characterize the space of possible representations. However, they are qualitatively distinct strategies for achieving behavioral capabilities like object recognition. In the revised manuscript we also include a comparison to object classification performance (Figures 5C & S4, black x’s) as a predictor of brain-like representations, alongside the results for factorization and invariance.

      In our revised Introduction and beginning of the Results section, we make it clearer that factorization and invariance are not mutually exclusive – indeed, our results show that both factorization and invariance for some scene variables like lighting and background identity are signatures of brain-like representations. Our study focuses on factorization because we believe its importance has not been studied or highlighted to the degree that invariance to “nuisance” parameters has in concert with selectivity to object identity in individual neuron tuning functions. Moreover, the loss functions used for supervised training of neural networks for image classification would seem to encourage invariance as a representational strategy. Thus, the finding that factorization of scene parameters is an equally good if not better predictor of brain-like representations may motivate new objective functions for neural network training.

      (4) Potential Confounding Factors in Empirical Findings: The correlation observed in Figure 3 between factorization and neural predictivity might be influenced by data dimensionality, rather than factorization per se [2]. Incorporating discussions around this recent finding could strengthen the paper.

      [2] Elmoznino E, Bonner MF. High-performing neural network models of the visual cortex benefit from high latent dimensionality. bioRxiv. 2022 Jul 13:2022-07.

      We thank the Reviewer for pointing out this important potential confound and the need for a direct quantification. We have now included an analysis computing how well dimensionality (measured using the participation ratio metric for natural images, as was done in [2], Elmoznino & Bonner, bioRxiv, 2022) can account for model goodness-of-fit (additional pink bars in Figure 6). Factorization of scene parameters appears to add more predictive power than dimensionality on average (Figure 6, light shaded bars), and critically, factorization+classification jointly predict goodness-of-fit significantly better than dimensionality+classification for V4 and IT/HVC brain areas (Figure 6, dark shaded bars). Indeed, dimensionality+classification is only slightly more predictive than classification alone for V4 and IT/HVC, indicating some redundancy in those measures with respect to neural predictivity of models (Figure 6, compare dark shaded pink bar to dashed line).

      That said, high-dimensional representations can, in principle, better support factorization, and thus we do not regard these two representational strategies necessarily in competition. Rather, our results suggest (consistent with [2]) that dimensionality is predictive of brain-like representation to some degree, such that some (but not all) of factorization’s predictive power may indeed owe to a partial correlation with dimensionality. We elaborate in the Discussion where this point comes up and now refer to the updated Figure 6 that shows the control for dimensionality.
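
      For readers unfamiliar with the participation ratio mentioned in this response, the following is a minimal sketch of the standard definition, PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues) of the response covariance matrix. It is an illustration only and not the authors' analysis code; the function name `participation_ratio` and the (stimuli x units) array layout are assumptions made for this example.

```python
import numpy as np

def participation_ratio(responses):
    """Effective dimensionality of a response matrix.

    responses: array of shape (n_stimuli, n_units); this layout is an
    assumption of the sketch, not something specified in the text above.
    """
    # Covariance across units; atleast_2d handles the single-unit case.
    cov = np.atleast_2d(np.cov(responses, rowvar=False))
    eig = np.linalg.eigvalsh(cov)
    eig = np.clip(eig, 0.0, None)  # remove tiny negative values from round-off
    return eig.sum() ** 2 / np.sum(eig ** 2)
```

      Under this definition, variance spread equally over d orthogonal directions gives a value of d, while variance concentrated in a single direction gives 1.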

      Conclusion:

      The paper offers insightful empirical research with useful implications for understanding visual processing in primates and DNNs. The paper would benefit from a more nuanced discussion of perceptual and neural invariance, as well as a deeper discussion of the coexistence of factorization, recognition, and invariance in neural representation geometry. Additionally, addressing the potential confounding factors in the empirical findings on the correlation between factorization and neural predictivity would strengthen the paper's conclusions.

      Taken together, we hope that the changes described above address the distinction between neural and perceptual invariance, provide a more balanced understanding of the contributions of factorization, invariance, and local representational geometry, and rule against dimensionality for natural images as contributing to the main finding of the benefits from factorization of scene parameters.

      Reviewer #2 (Public Review):

      Summary:

      The dominant paradigm in the past decade for modeling the ventral visual stream's response to images has been to train deep neural networks on object classification tasks and regress neural responses from units of these networks. While object classification performance is correlated to the variance explained in the neural data, this approach has recently hit a plateau of variance explained, beyond which increases in classification performance do not yield improvements in neural predictivity. This suggests that classification performance may not be a sufficient objective for building better models of the ventral stream. Lindsey & Issa study the role of factorization in predicting neural responses to images, where factorization is the degree to which variables such as object pose and lighting are represented independently in orthogonal subspaces. They propose factorization as a candidate objective for breaking through the plateau suffered by models trained only on object classification.

      They claim that (i) maintaining these non-class variables in a factorized manner yields better neural predictivity than ignoring non-class information entirely, and (ii) factorization may be a representational strategy used by the brain.

      The first of these claims is supported by their data. The second claim does not seem well-supported, and the usefulness of their observations is not entirely clear.

      Strengths:

      This paper challenges the dominant approach to modeling neural responses in the ventral stream, which itself is valuable for diversifying the space of ideas.

      This paper uses a wide variety of datasets, spanning multiple brain areas and species. The results are consistent across the datasets, which is a great sign of robustness.

      The paper uses a large set of models from many prior works. This is impressively thorough and rigorous.

      The authors are very transparent, particularly in the supplementary material, showing results on all datasets. This is excellent practice.

      Weaknesses:

      (1) The primary weakness of this paper is a lack of clarity about what exactly is the contribution. I see two main interpretations: (1-A) As introducing a heuristic for predicting neural responses that improves over classification accuracy, and (1-B) as a model of the brain's representational strategy. These two interpretations are distinct goals, each of which is valuable. However, I don't think the paper in its current form supports either of them very well:

      (1-A) Heuristic for neural predictivity. The claim here is that by optimizing for factorization, we could improve models' neural predictivity to break through the current predictivity plateau. To frame the paper in this way, the key contribution should be a new heuristic that correlates with neural predictivity better than classification accuracy. The paper currently does not do this. The main piece of evidence that factorization may yield a more useful heuristic than classification accuracy alone comes from Figure 5. However, in Figure 5 it seems that factorization along some factors is more useful than others, and different linear combinations of factorization and classification may be best for different data. There is no single heuristic presented and defended. If the authors want to frame this paper as a new heuristic for neural predictivity, I recommend the authors present and defend a specific heuristic that others can use, e.g. [K * factorization_of_pose + classification] for some constant K, and show that (i) this correlates with neural predictivity better than classification alone, and (ii) this can be used to build models with higher neural predictivity. For (ii), they could fine-tune a state-of-the-art model to improve this heuristic and show that doing so achieves a new state-of-the-art neural predictivity. That would be convincing evidence that their contribution is useful.

      Our paper does not make any strong claim regarding the Reviewer’s point 1-A (on heuristics for neural predictivity). In the Discussion, last paragraph, we better specify that our work is merely suggestive of claim 1-A about heuristics for more neurally predictive, more brainlike models. We believe that our paper supports the Reviewer’s point 1-B (on brain representation) as we discuss below.

      We leave it to future work to determine if factorization could help optimize models to be more brainlike. This treatment may require exploration of novel model architectures and loss functions, and potentially also more thorough neural datasets that systematically vary many different forms of visual information for validating any new models.

      (1-B) Model of representation in the brain. The claim here is that factorization is a general principle of representation in the brain. However, neural predictivity is not a suitable metric for this, because (i) neural predictivity allows arbitrary linear decoders, hence is invariant to the orthogonality requirement of factorization, and (ii) neural predictivity does not match the network representation to the brain representation. A better metric is representational dissimilarity matrices. However, the RDM results in Figure S4 actually seem to show that factorization does not do a very good job of predicting neural similarity (though the comparison to classification accuracy is not shown), which suggests that factorization may not be a general principle of the brain. If the authors want to frame the paper in terms of discovering a general principle of the brain, I suggest they use a metric (or suite of metrics) of brain similarity that is sensitive to the desiderata of factorization, e.g. doesn't apply arbitrary linear transformations, and compare to classification accuracy in addition to invariance.

      We agree with the Reviewer about the shortcomings of neural predictivity for comparing representational geometries, and in our revised manuscript we have provided a more comprehensive set of results that includes RDM predictivity in new Figures 6 & 7, alongside the results for neural fit predictivity. In addition, as suggested, we added classification accuracy predictivity in Figures 5C & S4 (black x’s) for visual comparison to factorization/invariance. In Figure S4 on RDMs, it is apparent that factorization is at least as good a predictor as classification on all V4 & IT datasets from both monkeys and humans (compare x’s to filled circles in Figure S4; note that some of the points from the original Figure S4 changed as we discovered a bug in the code that specifically affected the RDM analysis for a few of the datasets).

      We find that the newly included RDM analyses in Figures 6 & 7 are consistent with the conclusions of the neural fit regression analyses: the correlations of factorization metrics with RDM matches are strong, comparable in magnitude to that of classification accuracy (Figure 6, 3rd & 4th columns, compare black dashed line to faded colored bars) and are not fully accounted for by the model’s classification accuracy alone (Figure 6, 3rd & 4th columns, higher unfaded bars for classification combined with factorization, and see corresponding example scatters in Figure 7 middle/bottom rows).

      It is encouraging that the added benefit of factorization for RDM predictivity accounting for classification performance is at least as good as the improvement seen for neural fit predictivity (Figure 6, 1st & 2nd columns for encoding fits versus 3rd & 4th columns for RDM correlations).
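
      To make the RDM comparison discussed above concrete, here is a minimal sketch of how a representational dissimilarity matrix can be built and matched between a model and a neural dataset without fitting any linear map between the two. It is an illustration only: the use of correlation distance for the RDM, Pearson correlation for the match, and the names `rdm` and `rdm_match` are assumptions of this example, not details taken from the manuscript.

```python
import numpy as np

def rdm(responses):
    """Correlation-distance RDM.

    responses: (n_stimuli, n_units). Each entry is 1 minus the Pearson
    correlation between the response patterns of two stimuli.
    """
    return 1.0 - np.corrcoef(responses)

def rdm_match(resp_model, resp_brain):
    """Correlate the upper triangles of two RDMs over the same stimuli.

    The two inputs may have different numbers of units; no regression
    from model units to neurons is fitted.
    """
    n = resp_model.shape[0]
    iu = np.triu_indices(n, k=1)  # off-diagonal entries, each pair once
    return np.corrcoef(rdm(resp_model)[iu], rdm(resp_brain)[iu])[0, 1]
```

      Because the two RDMs are compared directly, the geometry of the model representation has to resemble that of the neural data as it stands, which is the property the Reviewer asked for.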

      (2) I think the comparison to invariance, which is pervasive throughout the paper, is not very informative. First, it is not surprising that invariance is more weakly correlated with neural predictivity than factorization, because invariant representations lose information compared to factorized representations. Second, there has long been extensive evidence that responses throughout the ventral stream are not invariant to the factors the authors consider, so we already knew that invariance is not a good characterization of ventral stream data.

      While we appreciate the Reviewer’s intuition that highly invariant representations are not strongly supported in the high-level visual cortex, we nevertheless thought it was valuable to put this intuition to a quantitative, detailed test. As a result, we uncovered effects that were not obvious a priori, at least to us – for example, that invariance for some scene parameters (camera view, object pose) is negatively correlated with neural predictions while invariance to others (background, lighting) is positively correlated. Thus, our work exercises the details of invariance for different types of information.

      (3) The formalization of the factorization metric is not particularly elegant, because it relies on computing top K principal components for the other-parameter space, where K is arbitrarily chosen as 10. While the authors do show that in their datasets the results are not very sensitive to K (Figure S5), that is not guaranteed to be the case in general. I suggest the authors try to come up with a formalization that doesn't have arbitrary constants. For example, one possibility that comes to mind is E[delta_a x delta_b], where 'x' is the normalized cross product, delta_a, and delta_b are deltas in representation space induced by perturbations of factors a and b, and the expectation is taken over all base points and deltas. This is just the first thing that comes to mind, and I'm sure the authors can come up with something better. The literature on disentangling metrics in machine learning may be useful for ideas on measuring factorization.

      Thanks to the Reviewer for raising this point. First, we wish to clarify a potential misunderstanding of the factorization metric: the number K of principal components we choose is not an arbitrary constant, but rather calibrated to capture a certain fraction of variance, set to 90% by default in our analyses. While this variance threshold is indeed an arbitrary hyperparameter, it has a more intuitive interpretation than the number of principal components.
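
      The variance-threshold choice of K described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the PCA based metric is one minus the fraction of the parameter-induced variance that falls inside the principal subspace of the other-parameter responses, and the function name, argument names, and (samples x units) layout are hypothetical.

```python
import numpy as np

def pca_factorization(resp_param, resp_other, var_threshold=0.90):
    """Sketch of a PCA based factorization score.

    resp_param: (n_a, n_units) responses as the parameter of interest varies.
    resp_other: (n_b, n_units) responses as the other parameters vary.
    var_threshold: fraction of other-parameter variance the subspace must
        capture; K is derived from it rather than fixed by hand.
    """
    a = resp_param - resp_param.mean(axis=0, keepdims=True)
    b = resp_other - resp_other.mean(axis=0, keepdims=True)

    # Principal axes of the other-parameter responses via SVD.
    _, s, vt = np.linalg.svd(b, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest K whose cumulative explained variance reaches the threshold.
    k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(s))
    basis = vt[:k]  # (k, n_units), orthonormal rows

    var_total = np.sum(a ** 2)
    if var_total == 0:
        raise ValueError("parameter of interest induces no response variance")
    var_inside = np.sum((a @ basis.T) ** 2)
    return 1.0 - var_inside / var_total
```

      In this sketch a score of 1 means the parameter of interest drives responses entirely outside the other-parameter subspace, and a score of 0 means it lies entirely inside it.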

      Nonetheless, the Reviewer’s comment did inspire us to consider another metric for factorization that does not depend on any arbitrary parameters. In the revised version, we now include a covariance matrix based metric which simply measures the elementwise correlation of the covariance matrices induced by varying the scene parameter of interest and the covariance matrix induced by varying the other parameters (and then subtracts this quantity from 1).
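
      The covariance based metric described in the preceding paragraph can be sketched in a few lines. This is an illustration only: the text above does not say whether the elementwise correlation is a mean-centered Pearson correlation, so that choice, like the function name and the (samples x units) layout, is an assumption of this example.

```python
import numpy as np

def covariance_factorization(resp_param, resp_other):
    """One minus the elementwise correlation of two covariance matrices.

    resp_param: (n_a, n_units) responses as the parameter of interest varies.
    resp_other: (n_b, n_units) responses as the other parameters vary.
    """
    cov_param = np.cov(resp_param, rowvar=False)
    cov_other = np.cov(resp_other, rowvar=False)
    # Pearson correlation between the flattened matrices (assumed form).
    r = np.corrcoef(cov_param.ravel(), cov_other.ravel())[0, 1]
    return 1.0 - r
```

      With the Pearson form assumed here, identical covariance structure gives 0 and the score can range up to 2, so values from this sketch are not directly comparable to the PCA based score without rescaling.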

      Correspondingly, we now present results for both the new covariance based measure and the original PCA based one in Figures 5C, 6, and 7. The main findings remain largely the same when using the covariance based metric (Figure 5C, compare light shaded to dark shaded filled circles; Figure 6, compare top row to bottom row; Figure 7, compare middle rows to bottom rows).

      Ultimately, we believe these two metrics are complementary and somewhat analogous to two metrics commonly used for measuring dimensionality (the number of components needed to explain a certain fraction of the variance, analogous to our original PCA based definition; the participation ratio, analogous to our covariance based definition). We have added the formula for the covariance based factorization metric along with a brief description to the Methods.

      (4) The authors defined the term "factorization" according to their metric. I think introducing this new term is not necessary and can be confusing because the term "factorization" is vague and used by different researchers in different ways. Perhaps a better term is "orthogonality", because that is clear and seems to be what the authors' metric is measuring.

      We agree with the Reviewer that factorization has become an overloaded term. At the same time, we think that in this context, the connotation of the term factorization effectively conveys the notion of separating out different latent sources of variance (factors) such that they can be encoded in orthogonal subspaces.

      To aid clarity, we now mention in the Introduction that factorization defined here is meant to measure orthogonalization of scene factors. Additionally, in the Discussion section, we now go into more detail comparing our metric to others previously used in the literature, including orthogonality, to help put it in context.

      (5) One general weakness of the factorization paradigm is the reliance on a choice of factors. This is a subjective choice and becomes an issue as you scale to more complex images where the choice of factors is not obvious. While this choice of factors cannot be avoided, I suggest the authors add two things: First, an analysis of how sensitive the results are to the choice of factors (e.g. transform the basis set of factors and re-run the metric); second, include some discussion about how factors may be chosen in general (e.g. based on temporal statistics of the world, independent components analysis, or something else).

      The Reviewer raises a very reasonable point about the limitation of this work. While we limited our analysis to generative scene factors that we know about and that could be manipulated, there are many potential factors to consider. It is not clear to us exactly how to implement the Reviewer’s suggestion of transforming the basis set of factors, as the factors we consider are highly nonlinear in the input space. Ultimately, we believe that finding unsupervised methods to characterize the “true” set of factors that is most useful for understanding visual representations is an important subject for future work, but outside the scope of this particular study. We have added a comment to this effect in the Discussion.

      Reviewer #3 (Public Review):

      Summary:

      Object classification serves as a vital normative principle in both the study of the primate ventral visual stream and deep learning. Different models exhibit varying classification performances and organize information differently. Consequently, a thriving research area in computational neuroscience involves identifying meaningful properties of neural representations that act as bridges connecting performance and neural implementation. In the work of Lindsey and Issa, the concept of factorization is explored, which has strong connections with emerging concepts like disentanglement [1,2,3] and abstraction [4,5]. Their primary contributions encompass two facets: (1) The proposition of a straightforward method for quantifying the degree of factorization in visual representations. (2) A comprehensive examination of this quantification through correlation analysis across deep learning models.

      To elaborate, their methodology, inspired by prior studies [6], employs visual inputs featuring a foreground object superimposed onto natural backgrounds. Four types of scene variables, such as object pose, are manipulated to induce variations. To assess the level of factorization within a model, they systematically alter one of the scene variables of interest and estimate the proportion of encoding variances attributable to the parameter under consideration.

      The central assertion of this research is that factorization represents a normative principle governing biological visual representation. The authors substantiate this claim by demonstrating an increase in factorization from macaque V4 to IT, supported by evidence from correlated analyses revealing a positive correlation between factorization and decoding performance. Furthermore, they advocate for the inclusion of factorization as part of the objective function for training artificial neural networks. To validate this proposal, the authors systematically conduct correlation analyses across a wide spectrum of deep neural networks and datasets sourced from human and monkey subjects. Specifically, their findings indicate that the degree of factorization in a deep model positively correlates with its predictability concerning neural data (i.e., goodness of fit).

      Strengths:

      The primary strength of this paper is the authors' efforts in systematically conducting analysis across different organisms and recording methods. Also, the definition of factorization is simple and intuitive to understand.

      Weaknesses:

      This work exhibits two primary weaknesses that warrant attention: (i) the definition of factorization and its comparison to previous, relevant definitions, and (ii) the chosen analysis method.

      Firstly, the definition of factorization presented in this paper is founded upon the variances of representations under different stimuli variations. However, this definition can be seen as a structural assumption rather than capturing the effective geometric properties pertinent to computation. More precisely, the definition here is primarily statistical in nature, whereas previous methodologies incorporate computational aspects such as deviation from ideal regressors [1], symmetry transformations [3], generalization [5], among others. It would greatly enhance the paper's depth and clarity if the authors devoted a section to comparing their approach with previous methodologies [1,2,3,4,5], elucidating any novel insights and advantages stemming from this new definition.

      [1] Eastwood, Cian, and Christopher KI Williams. "A framework for the quantitative evaluation of disentangled representations." International conference on learning representations. 2018.

      [2] Kim, Hyunjik, and Andriy Mnih. "Disentangling by factorising." International Conference on Machine Learning. PMLR, 2018.

      [3] Higgins, Irina, et al. "Towards a definition of disentangled representations." arXiv preprint arXiv:1812.02230 (2018).

      [4] Bernardi, Silvia, et al. "The geometry of abstraction in the hippocampus and prefrontal cortex." Cell 183.4 (2020): 954-967.

      [5] Johnston, W. Jeffrey, and Stefano Fusi. "Abstract representations emerge naturally in neural networks trained to perform multiple tasks." Nature Communications 14.1 (2023): 1040.

      Thanks to the Reviewer for this suggestion. We agree that our initial submission did not sufficiently contextualize our definition of factorization with respect to other related notions in the literature. We have added additional discussion of these points to the Discussion section in the revised manuscript and have included therein the citations provided by the Reviewer (please see the third paragraph of Discussion).

      Secondly, in order to establish a meaningful connection between factorization and computation, the authors rely on a straightforward synthetic model (Figure 1c) and employ multiple correlation analyses to investigate relationships between the degree of factorization, decoding performance, and goodness of fit. Nevertheless, the results derived from the synthetic model are limited to the low training-sample regime. It remains unclear whether the biological datasets under consideration fall within this low training-sample regime or not.

      We agree that our model in Figure 1C is very simple and does not fully capture the complex interactions between task performance and features of representational geometry, like factorization. We intend it only as a proof of concept to illustrate how factorized representations can be beneficial for some downstream task use cases. While the benefits of factorized representations disappear for large numbers of samples in this simulation, we believe this is primarily a consequence of the simplicity and low dimensionality of the simulation. Real-world visual information is complex and high-dimensional, and as such the relevant sample size regime in which factorization offers task benefits may be much greater. As a first step toward this real-world setting, Figure 2 shows how decreasing the amount of factorization in neural population data in macaque V4/IT can have an effect on object identity decoding.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      Missing citations: The paper could benefit from discussions & references to related papers, such as:

      Higgins I, Chang L, Langston V, Hassabis D, Summerfield C, Tsao D, Botvinick M. Unsupervised deep learning identifies semantic disentanglement in single inferotemporal face patch neurons. Nature communications. 2021 Nov 9;12(1):6456.

      We have added additional discussion of related work, including the suggested reference and others on disentanglement, to the Discussion section in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Here are several small recommendations for the authors, all much more minor than those in the public review:

      I suggest more use of equations in methods sections about Figure 1C and macaque neural data analysis.

      Thanks for this suggestion. We have added new Equation 1 for the method transforming neural data to reduce factorization of a variable while preserving other firing rate statistics.

      In Figure 1-C, the methods indicate that Gaussian noise was added. This is a very important detail, and complexifies the interpretation of the figure because it adds an assumption about the structure of noise. In other words, if I understand correctly, the correct interpretation of Figure 1C is "assuming i.i.d. noise, decoding accuracy improves with factorization." The i.i.d. noise is a big assumption, and it is debated how well the brain satisfies this assumption. I suggest you either omit noise for this figure or clearly state in the main text (e.g. caption) that the figure must be interpreted under an i.i.d. noise assumption.

      We have added an explicit statement of the i.i.d. noise assumption to the Figure 1C legend.

      For Figure 2B, I suggest labeling the x-axis clearly below the axis on both panels. Currently, it is difficult to read, particularly in print.

      We have made the x-axis labels clearer and included them on both panels.

      Figure 3A is difficult to read because of the very small text. I suggest avoiding such small fonts.

      We agree that Figure 3A is difficult to read. We have broken out Figure 3 into two new Figures 3 & 4 to increase clarity and sizing of text in Figure 3A.

      Reviewer #3 (Recommendations For The Authors):

      To strengthen this work, it is advisable to incorporate more comprehensive comparisons with previous research, particularly within the machine learning (ML) community. For instance, it would be beneficial to explore and reference works focusing on disentanglement [1,2,3]. This would provide valuable context and facilitate a more robust understanding of the contributions and novel insights presented in the current study.

      We have added additional discussion of related work and other notions similar to factorization to the Discussion section in the revised manuscript.

      Additionally, improving the quality of the figures is crucial to enhance the clarity of the findings:

      • Figure 2: The caption of subfigure B could be revised for greater clarity.

      Thank you, we have substantially clarified this figure caption.

      • Figure 3: Consider a more equitable approach for computing the correlation coefficient, such as calculating it separately for different types of models. In the case of supervised models, it appears that the correlation between invariance and goodness of fit may not be negligible across various scene parameters.

      We appreciate the suggestion, but we are not confident in our ability to conclude much from analyses restricted to particular model classes, given the relatively small N and the fact that the different model classes themselves are an important source of variance in our data.

      • Figure 4: To enhance the interpretability of subfigures A and B, it may be beneficial to include p-values (indicating confidence levels).

      As we supply bootstrapped confidence intervals for our results, which provide at least as much information as p-values, and most of the effects of interest are fairly stark when comparing invariance to factorization, p-values were not needed to support our points. We added a sentence to the legend of new Figure 5 (previously Figure 4) indicating that error bars reflect standard deviations over bootstrap resampling of the models.
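
      The bootstrap over models mentioned above can be sketched as follows. This is an illustration only and not the authors' code: it assumes the statistic of interest is a Pearson correlation between a per-model metric (for example a factorization score) and a per-model neural fit, and the function name, arguments, and defaults are hypothetical.

```python
import numpy as np

def bootstrap_corr_sd(metric, neural_fit, n_boot=1000, seed=0):
    """Mean and standard deviation of a correlation under resampling of models.

    metric, neural_fit: 1-D arrays with one entry per model.
    """
    metric = np.asarray(metric, dtype=float)
    neural_fit = np.asarray(neural_fit, dtype=float)
    rng = np.random.default_rng(seed)
    n = metric.shape[0]
    corrs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample models with replacement
        x, y = metric[idx], neural_fit[idx]
        if np.ptp(x) == 0 or np.ptp(y) == 0:
            continue  # correlation undefined for a constant resample
        corrs.append(np.corrcoef(x, y)[0, 1])
    corrs = np.asarray(corrs)
    return corrs.mean(), corrs.std(ddof=1)
```

      The returned standard deviation plays the role of the error bar; a percentile interval could be taken from the same resampled correlations if an explicit confidence interval is preferred.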

      • Figure 5: For subfigure B, it could be advantageous to plot the results solely for factorization, allowing for a clear assessment of whether the high correlation observed in Classification+Factorization arises from the combined effects of both factors or predominantly from factorization alone.

      First, we clarify/note that the scatters solely for factorization that the Reviewer seeks are already presented earlier in the manuscript across all conditions in Figures 4A,B and Figure S2.

      While we could also include these in new Figure 7 (previously Figure 5B) as the Reviewer suggests, we believe it would distract from the message of that figure at the end of the manuscript – which is that factorization is useful as a supplement to classification in predictive matches to neural data. Nonetheless, new Figure 6 (old Figure 5A) provides a summary quantification of the information that the reviewer requests (Fig. 6, faded colored bars reflect the contribution of factorization alone).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study elucidates a detailed molecular mechanism of the initial stages of transport in a medically relevant GABA neurotransmitter transporter GAT1 and thus generates useful new insights for this protein family. In particular, it presents convincing evidence for the presence of a "staging binding site" that locally concentrates Na+ ions to increase transport activity, whilst providing solid evidence for how Na+ binding affects the larger-scale dynamics.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript authored by Stockner and colleagues delves into molecular simulations of the Na+ binding pathway and the ionic interactions at the two known sodium binding sites, site 1 and site 2. They further identify a patch of two acidic residues in TM6 that seemingly populate the Na+ ions prior to entry into the vestibule. These results highlight the importance of studying the ion-entry pathways through computational approaches, and the authors also validate some of their findings through experimental work. They observe that sodium binding at site 1 is stabilized by the presence of the substrate in the S1 site; this is particularly vital because the GABA carboxylate is involved in coordinating the Na+ ion, unlike in monoamine transporters. They also observe that binding of sodium to the Na2 site stabilizes the conformation of GAT1 by reducing flexibility among the helical bundles involved in alternating access.

      Strengths:

      The study displays results that are generally consistent with available information from experiments on SLC6 transporters particularly GAT1 and puts forth the importance of this added patch of residues in the extracellular vestibule that could be of importance to the ion permeation in SLC6 transporters. This is a nicely performed study and could be improved if the authors could comment on and fix the following queries.

      We thank our reviewer for the overall positive evaluation.

      Weaknesses:

      (1) How conserved is the residue pair D281-E283 in other SLC6 transporters? The authors commented on the presence of these residues in SERT but it would be nice to know how widespread these residues are in other SLC6 transporters like NET, GlyT, and DAT.

      We have created a sequence alignment of the entire human SLC6 family (Supplementary Figure 1) and found that E283 is polar or charged in all SLC6 transporters. D281 shows a higher level of conservation across the family compared to E283. D281 is negatively charged in approximately 50% of the SLC6 family members, an aspartate in all GABA transporters and a glutamate in all monoamine transporters.

      (2) Further, one would like to see the effect of individual mutations D281A and E283A on transport, surface expression, and EC50 of Na+ to gauge the effect on transport.

      We have carried out experiments to investigate the effects of the individual mutations. The results revealed effects intermediate between WT and the double mutant (D281A-E283A) and showed that the effects mostly align with the degree of conservation, as neutralisation of D281 by alanine has a stronger effect than the E283A mutation. Both single mutants had minimal effects on the sodium dependence of uptake, whereas D281A had a stronger effect on expression, Km and Vmax than E283A. Only D281A reduced surface expression, while E283A is expressed at a level similar to wild-type GAT1.

      (3) A clear figure of the S1 site where Na+ tends to stay prior to Na1 site interactions needs to be provided. Further, it is not entirely clear how access to S1 is altered if the transporter is in an outward-occluded conformation and F294 is blocking solvent access. Please comment.

      We have modified the structural images in Figures 1, 5, 6 and 7 to improve their comprehensibility. We have also added a comment to the discussion on the role of F294 as part of the outer hydrophobic gate. In short, F294 does not occlude the passage to the S1 site as long as GAT1 is outward open, and we find that GAT1 is outward open in all sodium binding simulations.

      (4) The p-value of the EC50 difference between GAT1 WT and the GAT1 double mutant needs to be mentioned. The difference in sodium dependence EC50 seems less than twofold, and it would be useful to mention how critical the role of the recruitment site is. Since transport is not affected, the site could play a transient role in attracting ions.

      We have added p-values or standard deviation to our data.

      (5) It would be very nice to know how K+ ions are attracted by this recruitment site. This could further act as a control simulation to test the preference for Na+ ions among SLC6 members.

      We think that attraction of potassium to the recruitment site is not of relevance, as the residues are at the extracellular side and exposed to bulk, where the concentration of sodium is high (typically 130-150 mM), while the concentration of potassium is very low (3-5 mM). Exploring sodium binding by simulations for all SLC6 members could be interesting, but is clearly outside the scope of this manuscript.

      (6) Some of the important figures are not very clear. For instance, there should be a zoomed-in view of the recruitment site. The current one in Fig. 1b and 1c could be made clearer. Similarly as mentioned earlier the Na residence at the S1 site away from the Na1 and Na2 sites needs to be shown with greater clarity by putting side chain information in Fig. 6d.

      We have modified the structural images in Figures 1, 5, 6 and 7 to improve their comprehensibility.

      (7) The structural features that comprise the two principal components PC1 and PC2 should be described in greater detail.

      We have modified Figure 6 and added images that show the motions along PC1 and PC2. In addition, these are now better explained in the text.

      Reviewer #2 (Public Review):

      Summary:

      Starting from an AlphaFold2 model of the outward-facing conformation of the GAT1 transporter, the authors primarily use state-of-the-art MD simulations to dissect the role of the two Na+ ions that are known to be cotransported with the substrate, GABA (and a co-transported Cl- ion). The simulations indicated that Na+ binding to OF GAT depends on the electrostatic environment. The authors identify an extracellular recruiting site including residues D281 and E283 which they hypothesized to increase transport by locally increasing the available Na+ concentration and thus increasing binding of Na+ to the canonical binding sites NA1 and NA2. The charge-neutralizing double mutant D281A-E283A showed decreased binding in simulations. The authors performed GABA uptake experiments and whole-cell patch clamp experiments that taken together validated the hypothesis that the Na+ staging site is important for transport due to its role in pulling in Na+.

      Detailed analysis of the MD simulations indicated that Na+ binding to NA2 has multiple structural effects: The binding site becomes more compact (reminiscent of induced fit binding) and there is some evidence that it stabilizes the outward-facing conformation.

      Binding to NA1 appears to require the presence of the substrate, GABA, whose carboxylate moiety participates in Na+ binding; thus the simulations predict cooperativity between binding of GABA and Na+ binding to NA1.

      Strengths:

      -  MD simulations were used to propose a hypothesis (the existence of the staging Na+ site) and then tested with a mutant in simulations AND in experiments. This is an excellent use of simulations in combination with experiments.

      -  A large number of repeat MD simulations are generally able to provide a consistent picture of Na+ binding. Simulations are performed according to current best practices and different analyses illuminate the details of the molecular process from different angles.

      -  The role of GABA in cooperatively stabilizing Na+ binding to the NA1 site looks convincing and intriguing.

      We thank the reviewer for the very supportive assessment.

      Weaknesses:

      -  Assessing the effects of Na+ binding on the large-scale motions of the transporter is more speculative because the PCA does not clearly cover all of the conformational space and the use of an AlphaFold2 model may have introduced structural inconsistencies. For example, it is not clear if movements of the inner gate are due to an AF2 model that's not well packed or really a feature of the open outward conformation.

      Based on our data, the long-range effect of sodium binding to GAT1 on the stability of the inner gate is causal. PCA separates conformational motions into degrees of freedom and sorts them by the size of the motion. Motions of TM5a were among the two largest motions, which suggests that these motions are relevant. To directly quantify their behaviour, we measured informative distances at the inner gate of GAT1, as shown in Figure 6i,j,k, and separated the data according to the presence of sodium in NA2.

      For the following reasons, we rule out that the results are a consequence of structural inconsistencies introduced by AlphaFold2 and therefore do not reflect functionally relevant effects:

      (1) If the effects depended on the model rather than on sodium binding, they should not be correlated with the presence of sodium in the NA2 binding site.

      (2)  We carried out new simulations starting from the occluded GAT1 structure (Figure 6j,k). The data show that in the occluded state the distance across the inner vestibule and the length of TM5a differ, consistent with our interpretation of the data. As sodium binding fixes GAT1 in the outward-facing state, as also occurs in other SLC6 family members (Szöllősi and Stockner, 2022), the distances of the outward-open GAT1 are at the short extreme of the scale, the distances of the inward-open state of the cryo-EM structure(s) are at the other extreme, and the occluded conformation of GAT1 shows intermediate values.

      (3)  We have observed the same property in SERT, for which we used experimental structures as starting structures (Gradisch et al., 2024), suggesting that this could be a general mechanism.

      (4)  All available structures from the entire SLC6 family are consistent with structural effects on TM5a in response to bundle domain motions, and therefore in response to binding of sodium to NA2, as it stabilises the outward-open state, as well as to the transition to the inward-facing conformation.

      - Quantitative analyses are difficult with the existing data; for example, the tICA "free energy" landscape is probably not converged because unbinding events haven't been observed.

      Simulations can always be too short and therefore fail to describe the complete underlying conformational ensemble. We added a statement to the discussion indicating this shortcoming. With respect to the tICA analysis in our manuscript, the tICA approach does not, by design, need long simulations that capture full binding and unbinding in multiple instances to construct a correct free energy landscape. Instead, the tICA method builds on Markov chain dependencies and relies only on the convergence of transitions between hundreds of conformational microstates and the fluxes between them. The free energy profile derived for the S1 site, including NA1, TMP and NA2 and extending up to the salt bridge of the outer gate, is well converged, and we observed many transitions. In contrast, the entry from the recruitment site to the S1 site most likely has too low a density of microstates and too small a number of transitions to be considered converged with respect to quantifying the free energy of binding from bulk. We now explain this shortcoming.
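
      The Markov-chain argument above can be illustrated with a toy example. This is a minimal sketch and not the tICA pipeline used in the manuscript: the three microstates and the transition matrix `T` are hypothetical, and free energies are reported relative to the most populated state in units of kT. It shows that a free energy profile follows from converged transition probabilities between neighbouring microstates, even when no single trajectory connects the two end states directly.

```python
import math

def stationary_distribution(T, iters=2000):
    """Power iteration: repeatedly propagate a distribution through a
    row-stochastic transition matrix T until it stops changing."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    return pi

def free_energies_kT(pi):
    """Free energy of each microstate relative to the most populated one."""
    p_max = max(pi)
    return [-math.log(p / p_max) for p in pi]

# Hypothetical microstates: 0 = near recruitment site, 1 = intermediate, 2 = bound.
# State 0 and state 2 are never connected directly (T[0][2] == T[2][0] == 0).
T = [
    [0.90, 0.10, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.02, 0.98],
]

pi = stationary_distribution(T)
F = free_energies_kT(pi)
print([round(p, 3) for p in pi])
print([round(f, 3) for f in F])
```

      In this sketch the estimate is only as good as the individual entries of `T`; a sparsely sampled transition, as described above for the entry from the recruitment site, would make the corresponding free energy difference unreliable.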

      Recommendations for the authors:

      Reviewer #1 (Recommendations for The Authors):

      Authors should furnish p-values in the figure legends for experimental results.

      We have added the p-values to text and figure legends.

      Reviewer #2 (Recommendations For The Authors):

      -  Deposit simulation data in a public repository (input files, trajectories (possibly subsampled)).

      We deposited the data on Zenodo and provided the DOI: 10.5281/zenodo.10686813. As we were unable to upload the trajectories to Zenodo, we deposited the starting and end structures of the simulations.

      -  Please include a short discussion of the reliability of using an AF2 model instead of experimental structures. What is expected to be correct/which parts of the structure are potentially incorrect? What makes you think that the AF2 model is a good model of the OF conformation of GAT1?

      Unfortunately, an outward-facing structure of GAT1 is not available. We initially worked with an outward-open homology model of GAT1 based on SERT (built with MODELLER), but the structural differences between SERT and GAT1 are sufficiently large that these models did not behave well in simulations and too frequently could not maintain a sealed inner gate, also forming a channel. In contrast to the SERT-based GAT1 model, the AlphaFold2 model of GAT1 behaved as expected, consistent with the behaviour of SERT in simulations and with general knowledge of protein dynamics from the literature. Based on structural analysis of our simulations and on the comparison to SERT, we could not identify a region of GAT1 that behaves incorrectly or unexpectedly. We added a statement to the discussion on this potential limitation of the use of homology models.

      -  Fig 1a: Na+ densities are not very clear (both due to small size and the transparency). I have a hard time seeing where bulk, 2*bulk regions are --- are you showing "onion shells" of density? Perhaps investigate presenting as cuts through the full density?

      I like the labelling in terms of absolute density and multiples of bulk.

      We have created new images to improve the visualisation of the data. The data are shown as onion shells (isosurfaces), with the shells at the indicated densities. This is now clearly stated. Transparency is needed, as otherwise the inner onion shells, for example, would not be visible. The cut-through is intuitive, but we could not find a useful plane, as the densities are too extensively distributed in 3D and do not lie on a single plane.

      -  Fig 1h-k: would be clearer if "recruitment site" (TMP?) was indicated in the figure.

      We have created a new image for the recruiting site (Figure 1b,c) and temporary site (Figure 1g) and indicated these two sites as appropriate.

      -  Show time series of Na+ binding with a suitable order parameter (z or distances to NA1 and NA2?) to show how ions bind spontaneously. Mark the different sites. Mark pre- and post-binding parts of trajectories.

      We have added time series for every simulation that shows sodium binding to the NA1 or NA2 to the supplementary information Figure 2a,b,c. These quantify the distances to the recruiting site, the temporary site and the respective sodium binding site.

      -  PCA - how much of the total variance was captured by PC1 and PC2?

      The variance captured by the PCs is shown as eigenvalues in supplementary information Figure 4. PC1 captures about 19% of the variance, PC2 about 8%.
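
      For readers less familiar with PCA, the relation between eigenvalues and captured variance can be sketched as follows. The eigenvalue list is hypothetical and was chosen only so that the first two fractions match the percentages quoted above; it is not the spectrum from supplementary information Figure 4.

```python
def explained_variance_fractions(eigenvalues):
    """Fraction of the total variance captured by each principal component:
    each eigenvalue divided by the sum of all eigenvalues."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]

# Hypothetical eigenvalue spectrum: two dominant modes and a flat tail.
eigenvalues = [19.0, 8.0] + [1.0] * 73

fractions = explained_variance_fractions(eigenvalues)
cumulative_two = fractions[0] + fractions[1]
print(f"PC1: {fractions[0]:.0%}, PC2: {fractions[1]:.0%}, PC1+PC2: {cumulative_two:.0%}")
```

      Under these assumed values the first two components together cover roughly a quarter of the total variance, which is why the remaining modes cannot be ignored when judging how much of conformational space a two-dimensional projection describes.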

      -  "We found that the inner hydrophobic gate is dynamic in the absence of Na2" -- is this instability due to the AF2 model or likely realistic? E.g. was similar behaviour ever observed in simulations of the occluded state?

      In simulations of the occluded state we do not see the instabilities observed in the outward-open state in the absence of sodium (Figure 6). As these larger-scale fluctuations are not randomly distributed across all simulations starting from the AlphaFold2 models, but are confined to the systems without sodium, they are unlikely to be an effect of the AlphaFold2 model.

      Please note, we have seen comparable behaviour in simulations of SERT starting from experimental structures (Gradisch et al., 2024), therefore suggesting a more general mechanism.

      -  Cooperativity between GABA-binding and Na+ binding to NA1: How would this lead to an experimentally measurable signature, i.e., which experiments could validate this interesting prediction?

      Direct detection of cooperativity is difficult to separate from other effects in experiments, because sodium binding and transport involve both NA1 and NA2, NA2 has a higher affinity according to our data, and mutations will affect not only cooperativity but also other properties.

      Conformational changes can also complicate experimental detection, as NA2 stabilises the outward-open conformation, while NA1+GABA binding triggers the transition to the inward-open state. To quantify cooperativity, it would be important to isolate the cooperative effect from all other effects, which is a challenge. Support for cooperativity has been found by Zhou, Zomot and Kanner (2006) and by Meinild and Forster (2012) using this route. In the first paper the authors make use of lithium, which binds only to NA2, even though lithium is not merely an NA2-selective ligand that is otherwise identical to sodium. By comparing two GABA concentrations, the authors showed that the sodium dependence of GABA transport is left-shifted at the higher GABA concentration, which is not the case in the absence of lithium. These data are indirect, but consistent with cooperativity between GABA and NA1-bound sodium, as GABA transport mainly reflects binding of sodium to NA1. Similar approaches could be explored further, for example by varying the GABA concentration instead of sodium. Another option could be to create an outward-facing and conformationally locked GAT1 and to measure the cooperativity of sodium and GABA binding using, for example, the scintillation proximity assay. Most likely the assay would also need to be independent of NA2 binding. We are not aware of such a GABA transporter system.

      -  There are some instances of [SI Figure] or [citation needed] that should be cleaned up.

      We have corrected these instances.

      References

      Gradisch, R. et al. (2024) ‘Ligand coupling mechanism of the human serotonin transporter differentiates substrates from inhibitors’, Nature Communications, 15(1), p. 417. Available at: https://doi.org/10.1038/s41467-023-44637-6.

      Meinild, A.-K. and Forster, I.C. (2012) ‘Using lithium to probe sequential cation interactions with GAT1’, American Journal of Physiology. Cell Physiology, 302(11), pp. C1661-1675. Available at: https://doi.org/10.1152/ajpcell.00446.2011.

      Szöllősi, D. and Stockner, T. (2022) ‘Sodium Binding Stabilizes the Outward-Open State of SERT by Limiting Bundle Domain Motions’, Cells, 11(2), p. 255. Available at: https://doi.org/10.3390/cells11020255.

      Zhou, Y., Zomot, E. and Kanner, B.I. (2006) ‘Identification of a lithium interaction site in the gamma-aminobutyric acid (GABA) transporter GAT-1’, The Journal of Biological Chemistry, 281(31), pp. 22092–22099. Available at: https://doi.org/10.1074/jbc.M602319200.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Response to reviewer #1:

      We thank the reviewer for the further recommendations for improving our presentation. We would like to carefully address the remaining concerns of the reviewer.

      (1) I realize now that I didn't make my point clear enough, which was that as far as I know there is no reason to believe that an oscillatory state cannot be induced with synaptic depression as with spike frequency adaptation when used in the context of the author's model. I'm fine with how the authors have distinguished their model from R&T 2015, but I think the more interesting question is whether there is any reason to believe that STD is not equally capable of doing all the things mentioned in this paper as SFA, and if not why not. I would like the authors to go out on a limb and address this, if only with a few sentences in the discussion. 

      Thank you for pointing this out again. In response to your query regarding the comparison between STD and SFA in generating bump sweeps, we have done simulations based on STD. The results showed that both STD and SFA are capable of inducing bi-directional sweeps. However, (based on our simulations) only SFA can produce uni-directional sweeps. The absence of uni-directional sweeps based on STD may be due to the subtle yet important differences between the two mechanisms. Specifically, STD modulates the neural activity by weakening the recurrent connections, which theoretically can only inhibit recurrent inputs, while SFA can attenuate all forms of excitatory inputs, including external inputs. However, since we did not exhaustively explore the entire parameter space, we cannot conclude that STD is incapable of producing uni-directional sweeps. Future simulations are required.

      Following the Reviewer’s suggestion, we added a few sentences discussing the distinctions between STD and SFA in generating theta sweeps in the CANN, in lines 432 to 440 of the Discussion section:

      “Based on our simulation, both STD and SFA show the ability to produce bi-directional sweeps within a CANN model, with the SFA uniquely enabling uni-directional sweeps in the absence of external theta inputs. This difference might be due to the lack of exhaustively exploration of the entire parameter space. However, it might also attribute to the subtle yet important theoretical distinctions between STD and SFA. Specifically, STD attenuates the neural activity through a reduction in recurrent connection strength, whereas SFA provides inhibitory input directly to the neurons, potentially impacting all excitatory inputs. These differences might explain the diverse dynamical behaviors observed in our simulations. Future experiments could clarify these distinctions by monitoring changes in synaptic strength and inhibitory channel activation during theta sweeps.”
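
      The distinction drawn above can be made concrete with a deliberately simplified, single-unit sketch. It is not the CANN model from the manuscript: the function names, the additive form of the input, and all numerical values are assumptions for illustration only. STD is represented as a multiplicative reduction of the recurrent term, and SFA as an adaptation current subtracted from the total drive.

```python
def net_input_std(recurrent, external, depression):
    """Short-term depression (assumed form): only the recurrent input is
    scaled down, by the fraction of depleted synaptic resources."""
    return (1.0 - depression) * recurrent + external

def net_input_sfa(recurrent, external, adaptation):
    """Spike frequency adaptation (assumed form): an adaptation current is
    subtracted from the total drive, whatever its origin."""
    return recurrent + external - adaptation

# A unit driven purely by external input (no recurrent drive).
recurrent, external = 0.0, 1.0

std_drive = net_input_std(recurrent, external, depression=0.4)
sfa_drive = net_input_sfa(recurrent, external, adaptation=0.4)
print(std_drive, sfa_drive)
```

      In this toy setting STD leaves an externally driven unit untouched, whereas SFA still reduces its drive, which is the asymmetry proposed above as a possible reason for the different sweep behaviour.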

      (2) I appreciate the inclusion of the experimental data in Fig 6a (though I don't find the left-most panel very useful). I also understand what the authors are trying to convey with plots in 6c and 6c. However, I don't find the text that was added above very helpful at all. I was hoping for a simpler demonstration of the effect, by plotting a series of sequential sweeps (cell index vs time, with color indicating firing rate, as in Fig 2d) in the case of both the slow speed and fast speed regimes. Here, vertical lines could mark the individual theta cycles and the firing of individual cells, showing the constancy of the former but change of the latter. 

      Thank you for your constructive feedback. It seems there might be a misunderstanding in our previous explanation, for which we apologize. The phenomenon we want to elucidate is not an increase in the theta frequency as detected in LFPs, but rather the dependence of the slope of phase precession on the animal's movement speed. Due to phase precession, the oscillation frequency of place cells as the animal traverses the field is higher than the theta frequency. A plot like Fig. 2d will not make this point clearer, since it shows the baseline theta frequency (i.e., theta sweeps as we claimed previously). A straightforward way of thinking about this point is the one we added previously: “…The faster the animal runs, the faster the extra half cycle can be accomplished. Consequently, the firing frequency will increase more (a steeper slope in Fig. 6c red dots) than the baseline frequency”. We hope this clarification addresses the concerns raised.
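
      The "extra half cycle" argument can be written as a small calculation. This is a minimal sketch under stated assumptions, not a result from our model: it assumes that a cell precesses by exactly half a theta cycle while the animal crosses its place field at constant speed, and the theta frequency, field length, and speeds are hypothetical values.

```python
def place_cell_frequency(theta_hz, speed_cm_s, field_length_cm, extra_cycles=0.5):
    """Single-cell oscillation frequency if the cell gains `extra_cycles`
    relative to LFP theta while the animal crosses the field."""
    traversal_time_s = field_length_cm / speed_cm_s
    return theta_hz + extra_cycles / traversal_time_s

theta_hz = 8.0           # assumed baseline theta frequency
field_length_cm = 40.0   # assumed place field length

slow = place_cell_frequency(theta_hz, speed_cm_s=10.0, field_length_cm=field_length_cm)
fast = place_cell_frequency(theta_hz, speed_cm_s=40.0, field_length_cm=field_length_cm)
print(slow, fast)
```

      Under these assumptions the theta frequency itself is unchanged, while the offset between single-cell and theta frequency grows in proportion to running speed.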

      (3) This is still confusing to me. I just don't understand how the *phase* of the oscillating activity bump has anything to do with the movement of the animal. I would like to see a plot of the sweeps (again, cell index vs time, with color indicating the firing rate) before and after inactivation for short and long duration inactivation. Perhaps I am not understanding or appreciating how the bump recovers after inactivation and how this is related to the motion of the animal. 

      Thank you for pointing this out again. After we remove the inactivation, the activity bump will naturally pop up at the input location (which has moved further forward than before) and then start to sweep again as before the inactivation. Single-cell phase precession and population-level theta sweeps are actually two sides of the same coin (if all cells start at roughly the same phase in theta cycles). If the reviewer accepts this, then at the new location the activity bump sweeps again (around the new location), and therefore phase precession starts again at a later phase, since phase codes the position as the animal traverses the place field.

      (4) I am glad the authors are spending more time discussing this phenomenon, but I am unsure of their explanation: for a sweep moving at constant speed, neurons all along the path will be equally affected (inhibited), so where does the bias for suppressing the "end" neurons come from? 

      While it may appear that neurons along the path are equally inhibited as the bump sweeps over them, our model incorporates external inputs with Gaussian profiles. These inputs bias neurons closer to the input location, resulting in fewer activations in neurons further away from the input position.

      (5) Here I was hoping that the authors might comment on what they suspect happens when the animal starts (or stops) moving, and how the network shifts from tracking regime to oscillatory regime (or vice versa), as is typically seen in experimental data (see for example, Kay et al., 2020, fig 4b,c). My apologies for not making this point clearer. 

      Thank you for pointing this out. In our model, we observed that when the animal stops, the network continues to generate theta oscillations near the input location, albeit with reduced amplitude (so the network dynamics looks like in the tracking regime). However, we hypothesize that when the animal pauses its movement for enough time (immobile but awake states), sensory input into the hippocampus also decreases, which is similar to removing external inputs in our model. In this case, the activity bump spontaneously moves away, resembling the phenomenon of replay (see also Romani & Tsodyks 2015).

      Regarding the experimental data (Kay et al.), it indeed appears that theta sweeps decoded from neural activity become less pronounced when the mouse moves at slower speeds. This observation could potentially correspond to a decrease in the amplitude of bump oscillations when external inputs associated with movement are halted but not entirely removed in our model. However, in experiments, when the mouse's movement slows down, hippocampal activity no longer oscillates at theta frequency, making it challenging to decode theta sweeps.

      We appreciate your clarification on this point and recognize the importance of further investigating how our model can accurately replicate the transition between tracking and oscillatory regimes observed in experimental data.

    1. Reviewer #3 (Public Review):

      This manuscript examines the impact of congenital visual deprivation on the excitatory/inhibitory (E/I) ratio in the visual cortex using Magnetic Resonance Spectroscopy (MRS) and electroencephalography (EEG) in individuals whose sight was restored. Ten individuals with reversed congenital cataracts were compared to age-matched, normally sighted controls, assessing the cortical E/I balance and its interrelationship to visual acuity. The study reveals that the Glx/GABA ratio in the visual cortex and the intercept and aperiodic signal are significantly altered in those with a history of early visual deprivation, suggesting persistent neurophysiological changes despite visual restoration.

      My expertise is in EEG (particularly in the decomposition of periodic and aperiodic activity) and statistical methods. I have several major concerns in terms of methodological and statistical approaches along with the (over)interpretation of the results. These major concerns are detailed below.

      (1) Variability in visual deprivation:

      - The document states a large variability in the duration of visual deprivation (probably also the age at restoration), with significant implications for the sensitivity period's impact on visual circuit development. The variability and its potential effects on the outcomes need thorough exploration and discussion.

      (2) Sample size:

      - The small sample size is a major concern, as it may not provide sufficient power to detect subtle effects and/or may overestimate significant effects, which then tend not to generalize to new data. This is one of the biggest drivers of the replication crisis in neuroscience.

      - The main problem with the correlation analyses between MRS and EEG measures is that the sample size is simply too small to conduct such an analysis. Moreover, it is unclear from the methods section that this analysis was only conducted in the patient group (which the reviewer assumed from the plots), and not explained why this was done only in the patient group. I would highly recommend removing these correlation analyses.

      (3) Statistical concerns:

      - The statistical analyses, particularly the correlations drawn from a small sample, may not provide reliable estimates (see https://www.sciencedirect.com/science/article/pii/S0092656613000858, which clearly describes this problem).

      - Statistical analyses for the MRS: The authors should consider additional permutation statistics, which are more suitable for small sample sizes. The current statistical model (a 2x2 ANOVA design) is not ideal for such small sample sizes. Moreover, it is unclear why the condition (EO & EC) was chosen as a predictor and not the brain region (visual & frontal) or the neurochemicals. In addition, the authors did not provide any information on the alpha level or on correction for multiple comparisons (in the methods section). Finally, even if the groups are matched w.r.t. age, the time between surgery and measurement, the duration of visual deprivation (and sex?), these should be included as covariates, as it has been shown that they are highly related to the measurements of interest (especially for the EEG measurements) and the age range of the current study is large.

      - EEG statistical analyses: The same critique as for the MRS statistical analyses applies to the EEG analysis. In addition: was the 2x3 ANOVA conducted for EO and EC independently? This seems to be inconsistent with the approach in the MRS analyses, in which the authors chose EO & EC as predictors in their 2x2 ANOVA.

      - Figure 4: The authors report a p-value of >0.999 with a correlation coefficient of -0.42 with a sample size of 10 subjects. This can't be correct (it should be around: p = 0.22). All statistical analyses should be checked.

      - Figure 2c. Eyes closed condition: The highest score of the *Glx/GABA ratio seems to be ~3.6. In subplot 2a, there seem to be 3 subjects that show a Glx/GABA ratio score > 3.6. How can this be explained? There is also a discrepancy for the eyes-closed condition.
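
      The Figure 4 point above can be checked with a short calculation. The sketch below assumes a standard two-sided test of a Pearson correlation, which converts r to a t statistic with n - 2 degrees of freedom; the authors' actual test (and any multiple-comparison correction that might explain a value near 1) is not known here. The Student t density is integrated numerically with Simpson's rule so that only the standard library is needed.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_from_r(r, n, steps=10000):
    """Two-sided p-value for a Pearson correlation r from n observations.
    `steps` must be even (Simpson's rule)."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3   # probability mass in [-t, t]
    return t, 1 - central_area

t_stat, p_value = two_sided_p_from_r(r=-0.42, n=10)
print(f"|t| = {t_stat:.3f}, uncorrected two-sided p = {p_value:.3f}")
```

      With r = -0.42 and n = 10 the uncorrected p-value comes out between 0.2 and 0.25, in line with the value given above and far from > 0.999.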

      (4) Interpretation of aperiodic signal:

      - Several recent papers demonstrated that the aperiodic signal measured in EEG or ECoG is related to various important aspects such as age, skull thickness, electrode impedance, as well as cognition. Thus, currently, very little is known about the underlying effects which influence the aperiodic intercept and slope. The entire interpretation of the aperiodic slope as a proxy for E/I is based on a computational model and simulation (as described in the Gao et al. paper).

      - Especially the aperiodic intercept is a very sensitive measure to many influences (e.g. skull thickness, electrode impedance...). As crucial results (correlation aperiodic intercept and MRS measures) are facing this problem, this needs to be reevaluated. It is safer to make statements on the aperiodic slope than intercept. In theory, some of the potentially confounding measures are available to the authors (e.g. skull thickness can be computed from T1w images; electrode impedances are usually acquired alongside the EEG data) and could be therefore controlled.

      - The authors wrote: "Higher frequencies (such as 20-40 Hz) have been predominantly associated with local circuit activity and feedforward signaling (Bastos et al., 2018; Van Kerkoerle et al., 2014); the increased 20-40 Hz slope may therefore signal increased spontaneous spiking activity in local networks. We speculate that the steeper slope of the aperiodic activity for the lower frequency range (1-20 Hz) in CC individuals reflects the concomitant increase in inhibition." The authors confuse the interpretation of periodic and aperiodic signals. This section refers to the interpretation of the periodic signal (higher frequencies). This interpretation can not simply be translated to the aperiodic signal (slope).

      - The authors further wrote: "We used the slope of the aperiodic (1/f) component of the EEG spectrum as an estimate of E/I ratio (Gao et al., 2017; Medel et al., 2020; Muthukumaraswamy & Liley, 2018)." This is a highly speculative interpretation with very little empirical evidence. These papers were conducted with ECoG data (mostly in animals) and mostly under anesthesia. Thus, these studies allow only an indirect interpretation of what actually influences the 1/f slope in EEG measurements.

      (5) Problems with EEG preprocessing and analysis:

      - It seems that the authors neither identified bad channels nor addressed the line noise issue (which remains a problem even if a low-pass filter below the line-noise frequency was applied).

      - What was the percentage of segments that needed to be rejected due to the 120 μV criterion? This should be reported separately for EO and EC and for controls and patients.

      - The authors downsampled the data to 60 Hz "to match the stimulation rate". What is the intention of this? The subsequent spectral analyses are compromised by this choice (see the Nyquist theorem).

      - "Subsequently, baseline removal was conducted by subtracting the mean activity across the length of an epoch from every data point." The actual baseline time segment should be specified.

      - "We excluded the alpha range (8-14 Hz) for this fit to avoid biasing the results due to documented differences in alpha activity between CC and SC individuals (Bottari et al., 2016; Ossandón et al., 2023; Pant et al., 2023)." This does not really make sense, as the FOOOF algorithm first fits the 1/f slope, for which the alpha activity is not relevant.

      - The model fits of the 1/f fitting for EO, EC, and both participant groups should be reported.
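
      The downsampling concern raised in this section can be illustrated with a few lines of arithmetic. This is a sketch under assumptions: it takes the 60 Hz sampling rate and the 20-40 Hz fitting range quoted in this review at face value, and whether the fitted spectra were actually computed from the downsampled data cannot be determined from the manuscript text alone.

```python
def nyquist_hz(sampling_rate_hz):
    """Highest frequency that can be represented at a given sampling rate."""
    return sampling_rate_hz / 2

def alias_hz(signal_hz, sampling_rate_hz):
    """Apparent frequency of a sinusoid after sampling (folding around
    multiples of the sampling rate)."""
    return abs(signal_hz - sampling_rate_hz * round(signal_hz / sampling_rate_hz))

fs = 60                  # sampling rate after downsampling (Hz)
fit_range = (20, 40)     # upper aperiodic fitting range quoted above (Hz)

limit = nyquist_hz(fs)
range_exceeds_nyquist = fit_range[1] > limit
print(f"Nyquist limit: {limit} Hz; fit range exceeds it: {range_exceeds_nyquist}")
print(f"A 40 Hz component sampled at {fs} Hz appears at {alias_hz(40, fs)} Hz")
```

      If the spectra were estimated after downsampling, power above 30 Hz would either be removed by the anti-aliasing filter or folded back into lower frequencies, so a slope fitted up to 40 Hz would not be interpretable.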

      (6) Validity of GABA measurements and results:

      - According to a newer study by the authors of the Gannet toolbox (https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/abs/10.1002/nbm.5076), the reliability and reproducibility of the gamma-aminobutyric acid (GABA) measurement can vary significantly depending on acquisition and modeling parameters. Did the authors address these challenges? Furthermore, the authors wrote: "We confirmed the within-subject stability of metabolite quantification by testing a subset of the sighted controls (n=6) 2-4 weeks apart." Looking at supplementary Figure 5 (which would be better presented as ICC or Bland-Altman plots), the within-subject stability does not seem to be great compared to the between-subject variability. Furthermore, I don't think such a small sample size qualifies for a rigorous assessment of stability.

      - "Why might an enhanced inhibitory drive, as indicated by the lower Glx/GABA ratio" Is this interpretation really warranted, given that the group differences in the Glx/GABA ratio seem to be driven by a decreased Glx concentration in CC rather than by increased GABA (see Figure 2)?

      - Glx concentration predicted the aperiodic intercept in CC individuals' visual cortices during ambient and flickering visual stimulation. Why specifically investigate the Glx concentration, when the paper is about E/I ratio?

      (7) Interpretation of the correlation between MRS measurements and EEG aperiodic signal:

      - The authors wrote: "The intercept of the aperiodic activity was highly correlated with the Glx concentration during rest with eyes open and during flickering stimulation (also see Supplementary Material S11). Based on the assumption that the aperiodic intercept reflects broadband firing (Manning et al., 2009; Winawer et al., 2013), this suggests that the Glx concentration might be related to broadband firing in CC individuals during active and passive visual stimulation." These results should not be interpreted (or only with great caution) for several reasons (see also the problems with influences on the aperiodic intercept and the small sample size). They stem from exploratory analyses that correlated every EEG parameter with every MRS parameter, and they require a well-powered replication before any interpretation can be offered. Furthermore, and importantly: why should this hold only in CC patients and not in the SC control group?

      (8) Language and presentation:

      - The manuscript requires language improvements and correction of numerous typos. Over-simplifications and unclear statements are present, which could mislead or confuse readers (see also interpretation of aperiodic signal).

      - The authors state that "Together, the present results provide strong evidence for experience-dependent development of the E/I ratio in the human visual cortex, with consequences for behavior." The results of the study do not provide strong evidence, because of the small sample size, the exploratory analysis approach, and the failure to account for possible confounding factors.

      - "Our results imply a change in neurotransmitter concentrations as a consequence of *restoring* vision following congenital blindness." This is a speculative statement that infers a causal relationship from cross-sectional data.

      - In the limitation section, the authors wrote: "The sample size of the present study is relatively high for the rare population, but undoubtedly, overall, rather small." This sentence should be rewritten, as the study is plainly underpowered. The further justification reads: "We nevertheless think that our results are valid. Our findings neurochemically (Glx and GABA+ concentration), and anatomically (visual cortex) specific. The MRS parameters varied with parameters of the aperiodic EEG activity and visual acuity. The group differences for the EEG assessments corresponded to those of a larger sample of CC individuals (n=38) (Ossandón et al., 2023), and effects of chronological age were as expected from the literature." These statements do not validate or justify the small sample. Furthermore, the current data set is a subset of data published earlier by the same authors: "The EEG data sets reported here were part of data published earlier (Ossandón et al., 2023; Pant et al., 2023)." Thus, the statement "The group differences for the EEG assessments corresponded to those of a larger sample of CC individuals (n=38)" is a circular argument and should be avoided.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their insightful comments, which have helped to improve the manuscript. We provide specific examples and a point-by-point response to all comments below. Based on the Reviewers’ comments, we revised our manuscript, adding a considerable amount of new data (found in Fig. 1A,B, 4E-G, 7C,D, 8C,E, S1B,C, S2C-G, S4C, and Video 1). In the main manuscript text, blue font indicates added or revised text. An additional author (Lauren N. Juga) has been added for the newly generated data in the revised manuscript.

      Reviewer #1: 

      Sekulovski et al present an interesting and timely manuscript describing the temporal transition from epiblast to amnion. The manuscript builds on their previous work describing this process using stem cell models. 

      They suggest a multi-step process initiated by BMP induction of GATA3, followed by expression of TFAP2A, followed by ISL1/HAND1 in parallel with loss of pluripotency markers. This transition was reproduced through IF analysis of CS6/7 NHP embryo. 

      There are significant similarities in the expression of trophectoderm and the amnion. There are also ample manuscripts showing trophoblast induction following BMP stimulation of primed pluripotent stem cells. The authors should ensure that the amnion indeed is only amnion and not trophectoderm (or the amount of contribution to trophectoderm). As an extension, does the amnion character remain after the 48h BMP4 treatment, and is a trophectoderm-like state adopted as suggested by Ohgushi et al 2022?  

      Thank you for this insightful comment. As pointed out, Ohgushi et al. showed that, in their culture method, amnion is first induced, and extended culturing leads to the formation of trophectoderm-like cells (Ohgushi et al., 2022).

      Importantly, we would like to note that our culture system differs substantially from that of Ohgushi et al. in several respects. First, our system uses a 3D culture method while Ohgushi et al. employ 2D hPSC monolayers. Second, the two systems are chemically quite distinct. In our Glass-3D+BMP protocol, cells are cultured in mTeSR media (which contains FGF2 and TGFb1) for two days, by which time they generate 3D pluripotent cysts. BMP is then added to the culture medium for 24 hours, followed by another 24 hours without BMP4. In stark contrast, Ohgushi et al. employ A83-01, an Activin/Nodal signaling inhibitor, and PD173074, an FGF signaling inhibitor (a protocol which they call AP). This treatment leads to spontaneous activation of BMP signaling, but it also clearly inhibits Activin/Nodal and FGF signaling pathways, which remain active in our system. As a result of these distinct chemical as well as geometrical culturing protocols, their system produces amnion and trophectoderm, while our system produces exclusively amnion.

      Further analysis of gene expression data supports our contention that our system produces amnion. Though the gene expression profiles of amnion and trophectoderm are quite similar, specific markers of trophectoderm have been identified including GCM1, PSG1, PSG4 and CGB (Blakeley et al., 2015; Meistermann et al., 2021; Ohgushi et al., 2022; Okae et al., 2018; Petropoulos et al., 2016; Yabe et al., 2016). Importantly, while all of these markers are abundantly expressed in the Ohgushi et al. system, bulk RNA sequencing analysis of our Glass-3D+BMP hPSC-amnion cells reveals that none of these markers are detectable. Indeed, SDC1, a marker that Ohgushi et al. claim distinguishes trophoblast from amnion, actually decreases (more than 8-fold) as pluripotent cysts transition to amnion in Glass-3D+BMP. Finally, Ohgushi et al. report that ISL1, a key marker of the specified amnion population, is initially increased in their system, but is reduced to a basal level over time. In contrast, in Glass-3D+BMP hPSC-amnion, ISL1 expression continuously increases with time, and ISL1 protein expression is seen uniformly throughout the amnion cysts. This uniform expression is also seen in CS6/7 cynomolgus macaque amnion. Together, these results support our conclusion that the Glass-3D+BMP system leads to the formation of amniotic cells, and not trophectoderm cells.

      The functional data does not support a direct function of GATA3 prior to TFAP2A and the authors suggest compensatory mechanisms from other GATAs. If so, which GATAs are expressed in this system, with and without GATA3 targeting? Would it not be equally likely that the other early genes could be the key drivers of amnion initiation, such as ID2? 

      We appreciate this helpful comment. We agree that our data do not provide sufficient evidence for the role of GATA3 in early amniogenesis. We also agree that other early genes could be key drivers, and apologize for including our speculation that focuses only on GATA2. GATA2 was selected because, among the other GATAs, GATA2 and GATA3 are the only abundantly expressed GATA factors. This point suggesting a potentially redundant role of GATA2 is now removed from the manuscript (Line#355 of the original manuscript).

      The targeting of TFAP2A displays a very interesting phenotype which suggests that amnion and streak share an initial trajectory but where TFAP2A is necessary to adopt amnion fate. It would again be important to ensure that this alternative fate is indeed in streak and not misannotated alternative lineages, including trophoblast. 

      Is TBXT induced in this setting as well as in the wt situation during amnion induction? This should be displayed as in Figure 3D and would be nicely complemented by NHP IF analysis.

      We will address these two closely related comments together.

      TFAP2A-KO cysts contain ISL1+ squamous cells as well as SOX2+ pluripotent cells, suggesting that, while the initial focal amniogenesis is seen, the subsequent spreading event is not. Interestingly, our new data show that TFAP2A-KO cysts display cells with high TBXT expression (Fig. 8E, Line#373-374). This result suggests that, in the absence of TFAP2A, once amnion lineage progression is halted, a more primitive streak-like (TBXThigh) lineage emerges. It is important to note that TBXT expression is not seen in the trophectoderm population of cynomolgus macaque peri-gastrula (Sasaki et al., 2016; Yang et al., 2021).

      As suggested, we now include a TBXT expression time course during hPSC-amnion formation in Fig. S2D of the revised manuscript. These data show weak TBXT expression (transcripts) starting at the 24-hr timepoint. However, a clear TBXT protein signal could not be detected using IF (Fig. S2C), likely because TBXT expression is very low (Line#264-265). While statistically significant compared to the 12-hr timepoint, TBXT expression is 31 FPKM +/- 0.8 (standard deviation) at 24-hr and 48 FPKM +/- 6 at 48-hr. These are low expression values compared to, for example, TFAP2A, which displays 572 FPKM +/- 23 at 12-hr and 1169 FPKM +/- 27 at 24-hr, at which TFAP2A is readily detected using IF. While weak nuclear TFAP2A is seen using IF at 6-hr (187 FPKM +/- 7), no clear TFAP2A is detected at 3-hr (74 FPKM +/- 7). Another example is ISL1, which displays 758 FPKM +/- 55 at 24-hr and 1505 FPKM +/- 26 at 48-hr, when ISL1 can be detected using IF. Importantly, we were not able to detect ISL1 protein expression using IF at 12-hr, at which its expression level is 12 FPKM +/- 18. Lastly, we now show that, in the cynomolgus macaque peri-gastrula, while pSMAD1/5+ primitive streak-derived disseminating cells show abundant TBXT expression, no clear TBXT expression is seen in the amnion territory (Fig. S2G, Line#291-293).
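
      The comparison of expression levels above can be set out as simple arithmetic. The sketch below is illustrative only: it uses just the mean FPKM values quoted in this response (standard deviations are ignored), and the dictionary layout and function names are hypothetical choices made for illustration rather than part of the analysis pipeline.

```python
# Mean FPKM values quoted in the response, keyed by gene and hours post-BMP.
fpkm = {
    "TBXT": {24: 31.0, 48: 48.0},
    "TFAP2A": {3: 74.0, 6: 187.0, 12: 572.0, 24: 1169.0},
    "ISL1": {12: 12.0, 24: 758.0, 48: 1505.0},
}


def fold_change(gene: str, t_from: int, t_to: int) -> float:
    """Ratio of mean FPKM at t_to over mean FPKM at t_from for one gene."""
    return fpkm[gene][t_to] / fpkm[gene][t_from]


def peak(gene: str) -> float:
    """Highest quoted mean FPKM for a gene across the listed timepoints."""
    return max(fpkm[gene].values())


if __name__ == "__main__":
    print("TFAP2A 12-hr -> 24-hr fold change:", round(fold_change("TFAP2A", 12, 24), 2))
    print("TBXT 24-hr -> 48-hr fold change:", round(fold_change("TBXT", 24, 48), 2))
    # TFAP2A at 3-hr is the quoted level at which no clear IF signal was seen.
    print("TBXT peak below TFAP2A at 3-hr:", peak("TBXT") < fpkm["TFAP2A"][3])
```

      With these numbers, the highest quoted TBXT level (48 FPKM) lies below the TFAP2A level at 3-hr (74 FPKM), a timepoint at which no clear TFAP2A signal was detected by IF, which is consistent with the absence of a clear TBXT protein signal.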

      Together, these results show that while a TBXTlow state clearly emerges during hPSC-amnion development, in wild-type hPSC cultured in Glass-3D+BMP, TBXT levels remain low throughout amnion differentiation. However, in the absence of TFAP2A, a TBXThigh state is seen, suggesting that TFAP2A is critical for suppressing this TBXThigh state in fate spreading cells, perhaps by preventing BMP responding cells from acquiring embryonic lineages (e.g., mesodermal and/or primordial germ cells).

      The authors should address why they get different results from Castillo-Venzor et al 2023 DOI: 10.26508/lsa.202201706  

      Thank you very much for this helpful suggestion, and we now include a section detailing this in the Discussion (Line#410-432). In short, we propose several possibilities. First, culturing conditions are highly distinct. Castillo-Venzor et al. (Castillo-Venzor et al., 2023) utilize initial “pre-mesoderm” conditioning by Activin and CHIR, followed by treating floating embryoid bodies with a growth factor cocktail (BMP, SCF, EGF and LIF). In contrast, our system (Glass-3D+BMP) employs BMP stimulation of pluripotent cysts. Thus, we suspect that, in the PGCLC differentiation condition, cells are conditioned to the pre-mesodermal lineage. Moreover, we propose that amnion fate spreading may not be present in the PGCLC system, perhaps due to differences in geometry (aggregates versus cysts), or due to differing lineage commitment programs. That is, while initial amniogenesis is seen in the PGCLC system, most cells may already be committed to the PGC-like or mesodermal lineages by the time amnion fate spreading can occur. Alternatively, because several cell types (PGC-like, mesodermal and amniotic) co-exist in the culture by Castillo-Venzor et al., PGC-like and/or mesodermal cells may compensate for the loss of TFAP2A.

      Reviewer #2: 

      In this study, Sekulovski and colleagues report refinements to an in vitro model of human amnion formation. Working with 3D cultures and BMP4 to induce differentiation, the authors chart the time course of amnion induction in human pluripotent stem cells in their system using immunofluorescence and RNA-seq. They carry out validation through comparison of their data to existing embryo datasets, and through immunostaining of post-implantation marmoset embryos. Functional experiments show that the transcription factor TFAP2C drives the amnion differentiation program once it has been initiated. 

      There is currently great interest in the development of in vitro models of human embryonic development. While it is known that the amnion plays an important structural supporting role for the embryo, its other functions, such as morphogen production and differentiation potential, are not fully understood. Since a number of aspects of amnion development are specific to primates, models of amniogenesis will be valuable for the study of human development. Advantages of this model include its efficiency and the purity of the cell populations produced, a significant degree of synchrony in the differentiation process, benchmarking with single-cell data and immunocytochemistry from primate embryos, and identification of key markers of specific phases of differentiation. Weaknesses are the absence of other embryonic tissues in the model, and overinterpretation of certain findings, in particular relating bulk RNA-seq results to scRNA-seq data from published analyses of primate embryos and results from limited (though high quality) embryo immunostainings.  

      We are happy that Reviewer #2 agrees that our Glass-3D+BMP model is important for investigating additional roles of amniogenesis, as well as roles of amnion as a signaling hub, due to the purity of the amniotic cell population, and a high degree of synchrony of differentiation.

      We respectfully disagree that the absence of other embryonic tissues in the model is a weakness: rather, we believe it is a strength because this single lineage amnion model allows us to directly (and independently) investigate mechanisms underlying amnion lineage progression. For example, as noted above in our response to Reviewer #1, use of our hPSC-amnion model allowed us to see a very specific and interesting phenotype in the absence of TFAP2A (reduced amnion formation and emergence of an alternative lineage), though previous findings by Castillo-Venzor et al. concluded that amniogenesis is not affected by loss of TFAP2A. We noted that the culture method used by Castillo-Venzor et al. contains several cell types (amniotic, mesodermal and PGC-like), and that amniogenesis may be intact in that model due to compensation by the presence of these other cell types. That is, while cell-cell interactions can indeed be gleaned in culture systems with several cell types, the presence of multiple cell types and their additional signaling inputs can also confound some aspects of mechanistic investigations. We now include a paragraph in the Discussion of the revised manuscript (Line#410-432), in which we detail these ideas, and suggest that, because of the cell purity, our Glass-3D+BMP model enables robust mechanistic examinations, specifically during amnion formation.

      We address Reviewer #2’s point about bulk vs. single cell transcriptomic similarity analysis in Reviewer’s specific point #4 below. We do, however, want to note here that we have performed the same analysis using a 14-day old cynomolgus macaque peri-gastrula single cell RNA sequencing dataset generated by Yang et al. (Yang et al., 2021), and obtained a lineage trajectory (Fig. 4F, Line#265-268) similar to that seen when the Tyser et al. dataset (Tyser et al., 2021) was used (Fig. 4C).

      Importantly, while cynomolgus macaque early embryo samples are limited, we now include additional staining (Fig. S2G). 

      Reviewer #2 (Recommendations For The Authors): 

      Provide more confirmation of key findings in more than one stem cell line. 

      We now confirm key findings in the H7 human embryonic stem cell line (Fig. S1C).

      Provide stronger evidence e.g. scRNA-seq to support the existence of intermediate cells or tone down the conclusions.  

      We agree that this is a very important point. In our recent study (Sekulovski et al., 2023), we performed single cell RNA sequencing of Gel-3D, another hPSC-amnion model. In that study, we comprehensively described the transcriptome associated with the “intermediate” cell types, and identified CLDN10 as a marker of these cell types. Moreover, we now include additional data showing the molecular characteristics of the TBXTlow intermediate cells during amniogenesis in hPSC-amnion (Fig. S2C, S2D) and in d14 cynomolgus macaque peri-gastrula (Fig. 4G, a replot of single cell RNA-seq data from Yang et al., 2021; Line#264-268).

      Provide more data on the expression of DLX5 in the model. 

      We now provide a DLX5 staining time course in Fig. 7C. We find that, similar to ISL1, prominent DLX5 staining is seen in the focal cells at 24-hr post-BMP. Interestingly, at 48-hr, some cells show high levels of DLX5 while others show low levels; this is of interest for future investigations.

      (1) L159 - the authors should repeat more of the key results in at least one other hPSC line, to ensure reproducibility of the method. Figure S1 contains minimal information (one timepoint, three genes, one biological replicate) on a single different hPSC line. 

      We now include additional validation analysis using the H7 human ESC line (Fig. S1).

      (2) Figure 1- it is a little difficult to appreciate cyst formation from images taken at one level in the stack, can the authors perhaps show a 3D rendering or video to display morphogenesis better? 

      We now provide all optical sections of cysts shown in Movie 1.

      (3) Figure 1-did the authors carry out podocalyxin staining? This is a standard marker for lumenogenesis.  

      We now provide PODXL staining (Fig. 1A,1B).

      (4) L248 onwards and Figure 4-I am a little skeptical concerning conclusions drawn from an overlay of bulk RNA-seq onto scRNA-seq UMAP plots. I think the authors need to provide some strong justification for this approach. I would be particularly careful about concluding that cells depicted in Fig 4D represent an intermediate close to primitive streak and even more careful about claiming any lineage relationship between T-positive "primitive streak like intermediates" and the trajectory of cells in the model. UMAP is a dimension-reduction technique for the visualization of clusters in high-dimensional data. It is not a lineage-tracing methodology. It would have been preferable for the authors to present their own scRNA-seq data from the model.  

      We are sorry that it was not clear that our approach to finding similarity between bulk and single cell RNA-seq data is largely based on a published method named projectLSI (Granja et al., 2019). Please refer to our Methods section for details of the implementation and how we modified it for better visualization (addressed in Line#667-676 of the original manuscript, now in Line#718-730). The performance of projectLSI was extensively evaluated in the original article. Furthermore, as pointed out, UMAP is indeed a dimension reduction method that has been widely used in single cell RNA-seq research. In addition to visualizing clusters, trajectory analysis, such as RNA-velocity (which is used in this study), is another successful and widely adopted application of UMAP to gauge fate progression. Therefore, we believe that UMAP can be effectively used as a lineage prediction methodology, and that our use of bulk to single cell transcriptomic similarity analysis leveraging projectLSI is well justified at conceptual and technical levels.

      As illustrated in Fig. 5A, we performed RNA-velocity analysis of the Tyser et al. dataset, and our result clearly predicts a differentiation trajectory from Epiblast, a part of the TBXTlow population shown in Fig. 4D, and, then, to Ectoderm/Amnion cells. Consistent with this bioinformatic result, we now show that some cells show weak TBXT expression (at the transcript level) at the 24-hr post-BMP timepoint in control hPSC-amnion (Fig. S2D, Line#264-265). Importantly, our conclusion is drawn from a trajectory based on our time course (0, 0.5, 1, 3, 6, 12, 24, and 48 hours post-BMP treatment) which shows a clear transition from epiblast cells to TBXTlow and then finally to the ectoderm/amnion population. Moreover, using the transcriptomic similarity analysis, we found that the loss of TFAP2A leads to emergence of more primitive streak-like transcriptional characteristics (Fig. 8D). Indeed, using IF, we now show that several fate spreading cells in the TFAP2A-KO cysts are TBXThigh (Fig. 8E, Line#373-374). Thus, the new data provide additional evidence for the successful implementation of this bulk/single cell transcriptomic similarity analysis.
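
      The general idea of projecting new samples into a latent semantic indexing (LSI) space learned from a reference can be sketched as follows. This is a deliberately simplified illustration, not the projectLSI implementation of Granja et al. or the modified version described in the Methods: the TF-IDF weighting shown is one common variant, the UMAP step is omitted, and all function names are hypothetical.

```python
import numpy as np


def tf_idf(counts, idf=None):
    """TF-IDF transform of a genes x samples count matrix.

    If idf is given (learned from a reference), it is reused so that new
    samples are weighted on the same scale as the reference cells.
    """
    counts = np.asarray(counts, dtype=float)
    tf = counts / counts.sum(axis=0, keepdims=True)
    if idf is None:
        n_samples = counts.shape[1]
        detected = (counts > 0).sum(axis=1)
        idf = np.log(1.0 + n_samples / (1.0 + detected))
    return tf * idf[:, None], idf


def fit_lsi(ref_counts, n_components):
    """Learn an LSI space from reference cells via truncated SVD."""
    weighted, idf = tf_idf(ref_counts)
    u, _, _ = np.linalg.svd(weighted, full_matrices=False)
    gene_loadings = u[:, :n_components]        # genes x components
    ref_embedding = weighted.T @ gene_loadings  # cells x components
    return gene_loadings, idf, ref_embedding


def project(new_counts, gene_loadings, idf):
    """Place new samples (e.g. bulk profiles) in the reference LSI space."""
    weighted, _ = tf_idf(new_counts, idf)
    return weighted.T @ gene_loadings


def nearest_reference(projected, ref_embedding):
    """Index of the closest reference cell for each projected sample."""
    dists = np.linalg.norm(projected[:, None, :] - ref_embedding[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

      In a sketch like this, a bulk sample inherits its position from the reference cells it most resembles in the shared low-dimensional space; a subsequent 2D visualization would only display that placement and would not by itself establish lineage relationships.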

      Together, our bioinformatic and localization analyses show that the Glass-3D+BMP system recapitulates the trajectory found in our Tyser et al. RNA-velocity analysis, further supporting the validity of this differentiation trajectory. To avoid confusion, however, we now omit the “primitive streak-like” phrase when describing the TBXTlow cells because, while they may show some TBXT expression, they are likely intermediate fate transitioning cells. Indeed, a recent study by Ton et al. (Ton et al., 2023) showed that the Tyser et al. Primitive Streak cells consist of a mix of several lineage progressing cells (e.g., Epiblast, Non-neural ectoderm, Anterior or caudal primitive streak, PGC). Therefore, these cells are now specifically described as “TBXTlow” state; TBXThigh cells are described as primitive streak-like state.

      (5) L276 Tyser data do come from a primate model; the authors mean NHP.  

      We now specifically state that the validation is performed in a non-human primate model (Line#280).

      (6) Figure 5-though the immunostaining of the CS6/7 monkey embryos is excellent, the authors should not overinterpret these images. What is shown is not a time course, and one can only infer that a particular pattern of gene expression exists in a spatial sense from these images. In the model (Figure 2), the epiblast markers gradually fade and overlap for a time with emergent amnion markers, but in Figure 5 the transition between epiblast and amnion in the embryo seems pretty sharp, at least in terms of gene expression. There may be a few cells in D that show overlap of SOX2 and TFAP2A, but if the authors want to claim that a transition zone exists, they need to produce stronger evidence. Figure 7 is more convincing but see the next point. 

      Thank you for this insightful comment. We now address the nature of the transitioning boundary cell population extensively in our other recent study (Sekulovski et al., 2023).

      (7) Figure 7 further confuses the issue. A zone at either end of the epiblast is clearly positive for Sox2 and the two amnion markers, clearer than in Figure 5, but why does the marker DLX5 overlap with SOX2 in the embryo (7d) but not the model (7C)? Arguments regarding intermediate cell populations would be greatly strengthened by scRNA-seq data on the model system. 

      In our original manuscript, our DLX5 staining was performed at 48-hr post-BMP, at which SOX2 expression is absent in all cells. Our new analysis at the 24-hr timepoint now shows that DLX5 is expressed in SOX2+ cells (this is now presented in Fig. 7C).

      As stated in the point #6, our recent study comprehensively describes the transcriptomic and spatial characteristics of the transitioning boundary cell population (Sekulovski et al., 2023).

      (8) L357 TFAP2C KO does not resemble intermediate cysts in Figure 2. In Figure 2, both SOX2 and amnion markers are co-expressed in the same cells. In 8C, SOX2 and ISL1 are mutually exclusive.  

      We agree with this comment, and have now removed the statement pointing out the resemblance (Line#359 of the original manuscript).

      (9) Figure 8d-the same caveats noted above regarding the interpretation of superposition of bulk RNA-seq data with scRNA-seq UMAP analysis apply here.  

      Please refer to our explanation in point#4.

      Reviewer #3: 

      In this work, the authors tried to profile time-dependent changes in gene and protein expression during BMP-induced amnion differentiation from hPSCs. The authors depicted a GATA3 - TFAP2A - ISL1/HAND1 order of amniotic gene activation, which provides a more detailed temporal trajectory of amnion differentiation compared to previous works. As a primary goal of this study, the above temporal gene/protein activation order is amply supported by experimental data. However, the mechanistic insights on amniotic fate decision, as well as the transcriptomic analysis comparing amnion-like cells from this work and other works, remain limited. While this work allows us to see more details of amnion differentiation and understand how different transcription factors were turned on in a sequence, and might be useful for benchmarking the identity of amnion in ex utero cultured human embryos/embryoids, it provides limited insights on how amnion cells might diverge from primitive streak / mesoderm-like cells, despite some transcriptional similarity they shared, during early development.

      We are happy that Reviewer #3 appreciates that our model can be used effectively to identify a previously unrecognized amniotic gene activation cascade, providing a comprehensive time-course transcriptomic resource.

      As detailed below, we address specific concerns raised by Reviewer #3. We now provide additional mechanistic insights into amnion fate progression, and include additional transcriptomic comparisons with a cynomolgus macaque single cell RNA sequencing dataset.

      Reviewer #3 (Recommendations For The Authors): 

      (1) The authors generated KO cell lines lacking GATA3 and TFAP2A, respectively. Their results showed some disrupted amnion differentiation only in TFAP2A-KO. Therefore, these data do not provide sufficient evidence to support whether these transcription factors are crucial for amnion fate specification. Perhaps an experiment could be done with overexpression of these markers and testing if they could force hPSC to adopt amnion-like fate.  

      Thank you for this insightful comment. We generated cell lines that enable us to inducibly express GATA3 or TFAP2A, and the transgene expression was induced at d2 (when BMP treatment is normally initiated) until d4. However, this inducible expression did not lead to amniogenesis, and cysts maintained pluripotency. Because these results are difficult to interpret, they are not included in the revised manuscript.

      As detailed extensively in the manuscript, within each cyst, amniogenesis is initially seen focally, then spreads laterally resulting in fully squamous amnion cysts. This is also seen in our previously published Gel-3D amnion model (extensively described in (Shao et al., 2017)). In the absence of TFAP2A, we showed that the focal amniogenesis is observed, but spreading is not seen, suggesting that TFAP2A controls amnion fate progression. Therefore, while TFAP2A is not critical for the amnion fate specification in the focal cells, our results show that TFAP2A indeed helps to promote amniotic specification of cells neighboring the focal amniotic cells. Moreover, in the revised manuscript, we now show that TFAP2A transgene expression in the TFAP2A-KO background restores formation of fully squamous hPSC-amnion, further establishing the role of TFAP2A in amnion fate progression (Fig. 8C of the revised manuscript, Line#362-364).

      (2) The transcriptomic analysis made by the authors provides some comparison between BMP-induced amnion-like cells in vitro and the amnion-like cells from CS7 human embryo in vivo. However, the data set from the human embryo contains only a limited number of cells, and might not provide a sufficient base for decisive assessment of the true identity of amnion-like cells obtained in vitro. It might help if the authors could integrate their bulk sequencing data with other primate embryo data sets.  

      Thank you for this helpful comment. We have now performed our transcriptional similarity analysis using early (day 14) cynomolgus macaque embryo datasets generated in a study by (Yang et al., 2021), and found that the bulk time-course transcriptome of our hPSC-amnion model overlaps with the cynomolgus macaque amniotic lineage progression (Fig. 4F, Line#265-268). We also now provide the expression of key markers within the Yang et al. dataset (GATA3, TFAP2A, ISL1, TBXT, DLX5, Fig. 4G, S2F).

      (3) Following the point above, the authors used transcriptomic analysis to identify several intermediate states of cells during amnion differentiation and claimed that there is a primitive streak-like intermediate. However, this might be an overstatement. During stem cell culture and differentiation, intermediate states showing a mixture of biomarkers are very common and do not imply that such intermediates have any biological meaning. However, stating that amnion differentiation passes through primitive streak-like intermediates, might imply a certain connection between these two lineages, for which there is a lack of solid support. Instead, a more interesting question might be how amnion and primitive streak differentiation, despite some transcriptomic similarity, diverge from each other during early development. What factors make this difference? The authors might further analyze RNA-seq data to provide some insights.  

      Thank you very much for the insightful comments. 

      We understand Reviewer #3’s concern that the intermediate state that we see may not recapitulate a primitive streak-like state. However, in our original manuscript, we described these cells as “Primitive Streak-like” because those cells were annotated as Primitive Streak in the dataset by Tyser et al. Interestingly, a recent study by Ton et al. showed that the Tyser et al. Primitive Streak cells actually consist of a mixture of different cell lineages (e.g., Epiblast, Non-neural ectoderm, Anterior or caudal primitive streak, PGC (Ton et al., 2023)). Therefore, we agree that it was an overstatement to call them “Primitive Streak-like”, and, to avoid confusion, we now label the TBXTlow sub-population found in the Tyser et al. Primitive Streak population as “TBXTlow state” throughout the manuscript.

      Our data indicate that TFAP2A may play a role in controlling the lineage decision between amnion and primitive streak cells that abundantly express TBXT (TBXThigh). In the original manuscript, we included data showing that 48-hr TFAP2A-KO cysts show transcriptomic characteristics similar to some Primitive Streak cells (Fig. 8D). Intriguingly, our new data show that, in the absence of TFAP2A, some TBXThigh cells are indeed seen (Fig. 8E, Line#373-374). These results provide a body of evidence for the role of TFAP2A in promoting the amniotic lineage, perhaps by suppressing the TBXThigh state. This point is now addressed in the Discussion (Line#401-409).

      Additional new data:

      Using Western blot, we now show that GATA3 is absent in the GATA3-KO lines (Fig. S4C). We noticed that this was lacking in the original manuscript.

      We now show that an inducible expression of TFAP2A in the TFAP2A-KO cysts leads to control-like cysts (Fig. 8C, Line#362-364).

      Additional changes:

      Typos were fixed in Fig. 5I – “boundary” and “disseminating” were not spelled correctly.

      Line#350 – we originally noted “GATA3 expression precedes TFAP2A expression by approximately 12 hours”. This was incorrect, and is changed to 9 hours in the revised manuscript. We apologize for this mistake.

      REFERENCES

      Blakeley, P., Fogarty, N.M., del Valle, I., Wamaitha, S.E., Hu, T.X., Elder, K., Snell, P., Christie, L., Robson, P., and Niakan, K.K. (2015). Defining the three cell lineages of the human blastocyst by single-cell RNA-seq. Development 142, 3151-3165.

      Castillo-Venzor, A., Penfold, C.A., Morgan, M.D., Tang, W.W., Kobayashi, T., Wong, F.C., Bergmann, S., Slatery, E., Boroviak, T.E., Marioni, J.C., et al. (2023). Origin and segregation of the human germline. Life Sci Alliance 6.

      Granja, J.M., Klemm, S., McGinnis, L.M., Kathiria, A.S., Mezger, A., Corces, M.R., Parks, B., Gars, E., Liedtke, M., Zheng, G.X.Y., et al. (2019). Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia. Nature biotechnology 37, 1458-1465.

      Meistermann, D., Bruneau, A., Loubersac, S., Reignier, A., Firmin, J., Francois-Campion, V., Kilens, S., Lelievre, Y., Lammers, J., Feyeux, M., et al. (2021). Integrated pseudotime analysis of human pre-implantation embryo single-cell transcriptomes reveals the dynamics of lineage specification. Cell stem cell 28, 1625-1640 e1626.

      Ohgushi, M., Taniyama, N., Vandenbon, A., and Eiraku, M. (2022). Delamination of trophoblast-like syncytia from the amniotic ectodermal analogue in human primed embryonic stem cell-based differentiation model. Cell reports 39, 110973.

      Okae, H., Toh, H., Sato, T., Hiura, H., Takahashi, S., Shirane, K., Kabayama, Y., Suyama, M., Sasaki, H., and Arima, T. (2018). Derivation of Human Trophoblast Stem Cells. Cell stem cell 22, 50-63 e56.

      Petropoulos, S., Edsgard, D., Reinius, B., Deng, Q., Panula, S.P., Codeluppi, S., Plaza Reyes, A., Linnarsson, S., Sandberg, R., and Lanner, F. (2016). Single-Cell RNA-Seq Reveals Lineage and X Chromosome Dynamics in Human Preimplantation Embryos. Cell 165, 1012-1026.

      Sasaki, K., Nakamura, T., Okamoto, I., Yabuta, Y., Iwatani, C., Tsuchiya, H., Seita, Y., Nakamura, S., Shiraki, N., Takakuwa, T., et al. (2016). The Germ Cell Fate of Cynomolgus Monkeys Is Specified in the Nascent Amnion. Developmental cell 39, 169-185.

      Sekulovski, N., Juga, L.L., Cortez, C.L., Czerwinski, M., Whorton, A.E., Spence, J.R., Schmidt, J.K., Golos, T.G., Gumucio, D.L., Lin, C.-W., et al. (2023). Identification of amnion progenitor-like cells at the amnion-epiblast boundary in the primate peri-gastrula. bioRxiv doi: 10.1101/2023.09.07.556553.

      Shao, Y., Taniguchi, K., Townshend, R.F., Miki, T., Gumucio, D.L., and Fu, J. (2017). A pluripotent stem cell-based model for post-implantation human amniotic sac development. Nature communications 8, 208.

      Ton, M.N., Keitley, D., Theeuwes, B., Guibentif, C., Ahnfelt-Ronne, J., Andreassen, T.K., Calero-Nieto, F.J., Imaz-Rosshandler, I., Pijuan-Sala, B., Nichols, J., et al. (2023). An atlas of rabbit development as a model for single-cell comparative genomics. Nature cell biology 25, 1061-1072.

      Tyser, R.C.V., Mahammadov, E., Nakanoh, S., Vallier, L., Scialdone, A., and Srinivas, S. (2021). Single-cell transcriptomic characterization of a gastrulating human embryo. Nature 600, 285-289.

      Yabe, S., Alexenko, A.P., Amita, M., Yang, Y., Schust, D.J., Sadovsky, Y., Ezashi, T., and Roberts, R.M. (2016). Comparison of syncytiotrophoblast generated from human embryonic stem cells and from term placentas. Proceedings of the National Academy of Sciences of the United States of America 113, E2598-2607.

      Yang, R., Goedel, A., Kang, Y., Si, C., Chu, C., Zheng, Y., Chen, Z., Gruber, P.J., Xiao, Y., Zhou, C., et al. (2021). Amnion signals are essential for mesoderm formation in primates. Nature communications 12, 5126.

    1. Gutenberg Bible (from Wikipedia, the free encyclopedia). [Image: The copy of the Gutenberg Bible held at the Richelieu - Bibliothèques, musée, galeries.] The Gutenberg Bible, also known as the 42-line Bible, the Mazarin Bible or the B42, was the earliest major book printed in Europe using mass-produced metal movable type. It marked the start of the "Gutenberg Revolution" and the age of printed books in the West. The book is valued and revered for its high aesthetic and artistic qualities[1] and its historical significance. The Gutenberg Bible is an edition of the Latin Vulgate printed in the 1450s by Johannes Gutenberg in Mainz, in present-day Germany.
Forty-nine copies (or substantial portions of copies) have survived. They are thought to be among the world's most valuable books, although no complete copy has been sold since 1978.[2][3] In March 1455, the future Pope Pius II wrote that he had seen pages from the Gutenberg Bible displayed in Frankfurt to promote the edition, and that either 158 or 180 copies had been printed. The 36-line Bible, said to be the second printed Bible, is also sometimes referred to as a Gutenberg Bible, but may be the work of another printer.[4]

Text

[Image: Gutenberg Bible in the Beinecke Rare Book & Manuscript Library at Yale University in New Haven, Connecticut]

The Gutenberg Bible, an edition of the Vulgate, contains the Latin version of the Hebrew Old Testament and the Greek New Testament. It is mainly the work of St Jerome who began his work on the translation in AD 380, with emendations from the Parisian Bible tradition, and further divergences.[5]

Printing history

[Image: Gutenberg Bible of the New York Public Library; purchased by James Lenox in 1847, it was the first Gutenberg Bible to be acquired by a United States citizen.]

While it is unlikely that any of Gutenberg's early publications would bear his name, the initial expense of press equipment and materials and of the work to be done before the Bible was ready for sale suggests that he may have started with more lucrative texts, including several religious documents, a German poem, and some editions of Aelius Donatus's Ars Minor, a popular Latin grammar school book.[6][7][8] Preparation of the Bible probably began soon after 1450, and the first finished copies were available in 1454 or 1455.[9] It is not known exactly how long the Bible took to print.
The first precisely datable printing is Gutenberg's 31-line Indulgence which certainly existed by 22 October 1454.[10] Gutenberg made three significant changes during the printing process.[11]

[Image: Spine of the Lenox copy]

Some time later, after more sheets had been printed, the number of lines per page was increased from 40 to 42, presumably to save paper. Therefore, pages 1 to 9 and pages 256 to 265, presumably the first ones printed, have 40 lines each. Page 10 has 41, and from there on the 42 lines appear. The increase in line number was achieved by decreasing the interline spacing, rather than increasing the printed area of the page. Finally, the print run was increased, necessitating resetting those pages which had already been printed. The new sheets were all reset to 42 lines per page. Consequently, there are two distinct settings in folios 1–32 and 129–158 of volume I and folios 1–16 and 162 of volume II.[11][12] The most reliable information about the Bible's date comes from a letter. In March 1455, the future Pope Pius II wrote that he had seen pages from the Gutenberg Bible, being displayed to promote the edition, in Frankfurt.[13] It is not known how many copies were printed, with the 1455 letter citing sources for both 158 and 180 copies. Scholars today think that examination of surviving copies suggests that somewhere between 160 and 185 copies were printed, with about three-quarters on paper and the others on vellum.[14][15]

The production process: Das Werk der Bücher

[Image: A vellum copy of the Gutenberg Bible owned by the U.S. Library of Congress, on display at the Thomas Jefferson Building in Washington, D.C.]

In a legal paper, written after completion of the Bible, Johannes Gutenberg refers to the process as Das Werk der Bücher ("the work of the books").
He had introduced the printing press to Europe and created the technology to make printing with movable types finally efficient enough to facilitate the mass production of entire books.[16] Many book-lovers have commented on the high standards achieved in the production of the Gutenberg Bible, some describing it as one of the most beautiful books ever printed. The quality of both the ink and other materials and the printing itself have been noted.[1]

Pages

[Image: First page of the first volume: the epistle of St Jerome to Paulinus from the University of Texas copy. The page has 40 lines.]

The paper size is 'double folio', with two pages printed on each side (four pages per sheet). After printing the paper was folded once to the size of a single page. Typically, five of these folded sheets (ten leaves, or twenty printed pages) were combined to a single physical section, called a quinternion, that could then be bound into a book. Some sections, however, had as few as four leaves or as many as twelve leaves.[17]

[Image: Gutenberg Bible on display at the U.S. Library of Congress]

The 42-line Bible was printed on the size of paper known as 'Royal'.[18] A full sheet of Royal paper measures 42 cm × 60 cm (17 in × 24 in) and a single untrimmed folio leaf measures 42 cm × 30 cm (17 in × 12 in).[19] There have been attempts to claim that the book was printed on larger paper measuring 44.5 cm × 30.7 cm (17.5 in × 12.1 in),[20] but this assertion is contradicted by the dimensions of existing copies. For example, the leaves of the copy in the Bodleian Library, Oxford, measure 40 cm × 28.6 cm (15.7 in × 11.3 in).[21] This is typical of other folio Bibles printed on Royal paper in the fifteenth century.[22] Most fifteenth-century printing papers have a width-to-height ratio of 1:1.4 (e.g. 30:42 cm) which, mathematically, is a ratio of 1 to the square root of 2 or, simply, $\sqrt{2}$.
Many suggest that this ratio was chosen to match the so-called Golden Ratio, $\tfrac{1+\sqrt{5}}{2}$, of 1:1.6; in fact the ratios are, plainly, not at all similar (equating to a difference of about 12 per cent). The ratio of 1:1.4 was a long established one for medieval paper sizes.[23] A single complete copy of the Gutenberg Bible has 1,288 pages (4×322 = 1288) (usually bound in two volumes); with four pages per folio-sheet, 322 sheets of paper are required per copy.[24] The Bible's paper consists of linen fibers and is thought to have been imported from Caselle in Piedmont, Italy based on the watermarks present throughout the volume.[25]

Ink

      we have

      FORK LYFT

      https://philosophybreak.com/articles/if-a-tree-falls-in-the-forest-and-theres-no-one-around-to-hear-it-does-it-make-a-sound/#:~:text=So%2C%20the%20answer%20to%20this%20age-old%20question%20seems,lonesome%20falling%20tree%20does%20not%20make%20a%20sound.

      If a Tree Falls in the Forest, and There's No One Around to Hear It, Does It Make a Sound?

      The age-old question of whether a falling tree makes a sound when there's no one around to hear it exploits the tension between perception and reality. This article explores possible answers and their consequences.

      By Jack Maden  |  September 2022

      3-MIN BREAK  

      If a tree falls in the forest, and there's no one around to hear it, does it make a sound? Well, if by 'sound' we mean vibrating air, then yes, when the tree falls, it vibrates the air around it.

      However, if by 'sound' we mean the conscious noise we hear when our sensory apparatus interacts with the vibrating air, then if no one is around to hear the tree when it falls, there'd be no sensory apparatus for the vibrating air to interact with, and thus no conscious noise would be heard.

      So, the answer to this age-old question seems to be simple: it depends on how we define 'sound'. If we define it as 'vibrating air', the falling tree makes a sound. If we define it as a conscious experience, the lonesome falling tree does not make a sound.

      There, problem solved.

      The point of asking this question, however, is not so that it can be answered quickly and put aside.

      Rather, its point is to draw out the rather strange tension between our two very different definitions of the word 'sound'.

      On the one hand, we classify sound as a mechanistic process that exists without us, 'out there' in the world. On the other, we regard it as a private conscious experience, its existence entirely dependent on us.

      And when you dwell on this latter definition, you realize it doesn't just extend to sounds. Everything we experience --- everything we see, hear, smell, touch, taste --- all of it depends on our sensory apparatus, on us. Without us, our experiences would not exist.

      As the great 16th-century astronomer Galileo Galilei put it:

      Tastes, odors, colors, and so on... reside only in consciousness. If the living creature were removed, all these qualities would be wiped away and annihilated.

      Take away our senses, and the world of our experience would be replaced by a colorless, soundless, odorless, tasteless nothingness. Without us, what remains?

      The reason our original question --- When a tree falls in the forest, and there's no one around to hear it, does it make a sound? --- is such a teaser, is because it hits on a deeper question. Namely:

      If there was no conscious life, would the physical universe exist?

      Our kneejerk reaction to this question might be, 'of course it would'. But let's think about it again: if there was nothing conscious, then nothing would be experienced. There would be nothing resembling anything we call 'existence'. No colors, no sounds, no smells, no tastes, no touch, no sense of time, no sense of space.

      Is consciousness more fundamental than matter?

      Reflecting on this strange state of affairs, numerous great thinkers have concluded that consciousness must be more fundamental than the 'stuff' that consciousness experiences.

      For instance, in his 1710 work, A Treatise Concerning the Principles of Human Knowledge, the philosopher George Berkeley discusses the absurdity of a world existing independently of our conscious minds:

      It is indeed an opinion strangely prevailing amongst people that houses, mountains, rivers, and in a word all sensible objects, have an existence natural or real, distinct from their being perceived by the understanding... for what are the forementioned objects but things we perceive by sense? And what do we perceive besides our own ideas or sensations? And is it not plainly repugnant that any one of these or any combination of them should exist unperceived?

      On this view, it is absurd to say a lonesome falling tree makes a sound. For Berkeley, it is absurd to say the tree, without a conscious mind there perceiving it, even exists. (You can learn more about his mind-bending arguments for this position in our short explainer piece on Berkeley's subjective idealism, his theory that the world is in our minds).

      But to conclude this brief reflection on the tension between perception and reality, consider a comment from the Nobel Prize-winning quantum physicist Max Planck in a 1931 interview (italics added):

      I regard consciousness as fundamental. I regard matter as derivative from consciousness. We cannot get behind consciousness. Everything that we talk about, everything that we regard as existing, postulates consciousness.

      What do you think? Can we get behind consciousness?

      https://www.poetryfoundation.org/poems/44272/the-road-not-taken

      The Road Not Taken

      BY ROBERT FROST

      Two roads diverged in a yellow wood,

      And sorry I could not travel both

      And be one traveler, long I stood

      And looked down one as far as I could

      To where it bent in the undergrowth;

      Then took the other, as just as fair,

      And having perhaps the better claim,

      Because it was grassy and wanted wear;

      Though as for that the passing there

      Had worn them really about the same,

      And both that morning equally lay

      In leaves no step had trodden black.

      Oh, I kept the first for another day!

      Yet knowing how way leads on to way,

      I doubted if I should ever come back.

      I shall be telling this with a sigh

      Somewhere ages and ages hence:

      Two roads diverged in a wood, and I---

      I took the one less traveled by,

      And that has made all the difference.

    1. Consequently, our freedom has become a psychological problem, it has isolated us from the connections necessary for our survival and development (Fromm, 1941). The danger with this situation, according to Fromm, is that when an entire society is suffering from feelings of isolation and disconnection with the natural order (from nature itself, in Fromm’s view), the members of that society may seek connection with a societal structure that destroys their freedom and, thus, integrates their self into the whole (albeit in a dysfunctional way).

      Fromm's theory stated that freedom was a problem for humans because of the separation from nature it caused. Separated from their basic instincts, humans struggle to find their identity. In doing so they encounter crises, which lead to mental unwellness. As a society we have proven time and again that this disconnection from the natural order of things is beyond our ability to cope. Fromm uses Stalin and Hitler as examples, but more recently I think we can see this happening with former president Trump and his followers. We also see this in members of cults (I am not equating Trump supporters with cultists). I can see Fromm's theory playing out in many situations where humans feel powerless and seek power through connection with a belief or with a person supporting certain beliefs.

  2. May 2024
    1. We can achieve that goal by several practical devices. For example, under the American or Latin American presidential arrangements, we can allow both the president and the Congress to dissolve an impasse by calling early elections. The early elections would always have to be bilateral: the branch exercising the constitutional prerogative would share the electoral risk.

      I suspect such innovations would yield some benefit, but not nearly as much as we might imagine.

      I think this is a classic case of the primacy of being. Yes, structure (i.e. institutions) can make a bad case worse, or an OK case better, but it is limited in how much it can move things. The US is mostly hamstrung by deep cultural differences at a moment of paradigmatic change. Yes, the electoral college or the composition of the Senate may make things worse, but that is minor compared to what is going on at the cultural foundations.

      The most interesting structural innovations are therefore those that would hasten cultural evolution.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this article, the authors investigate whether the connectivity of the hippocampus is altered in individuals with aphantasia - people who have reduced mental imagery abilities and where some describe having no imagery, and others describe having vague and dim imagery. The study investigated this question using an fMRI paradigm, where 14 people with aphantasia and 14 controls were tested, and the researchers were particularly interested in the key regions of the hippocampus and the visual-perceptual cortices. Participants were interviewed using the Autobiographical Interview regarding their autobiographical memories (AMs), and internal and external details were scored. In addition, participants were queried on their perceived difficulty in recalling memories, imagining, and spatial navigation, and their confidence regarding autobiographical memories was also measured. Results showed that participants with aphantasia reported significantly fewer internal details (but not external details) compared to controls; that they had lower confidence in their AMs; and that they reported finding remembering and imagining in general more difficult than controls. Results from the fMRI section showed that people with aphantasia displayed decreased hippocampal and increased visual-perceptual cortex activation during AM retrieval compared to controls. In contrast, controls showed strong negative functional connectivity between the hippocampus and the visual cortex. Moreover, resting state connectivity between the hippocampus and visual cortex predicted better visualisation skills. The authors conclude that their study provides evidence for the important role of visual imagery in detail-rich vivid AM, and that this function is supported by the connectivity between the hippocampus and visual cortex. This study extends previous findings of reduced episodic memory details in people with aphantasia, and enables us to start theorising about the neural underpinnings of this finding.

      The data provided good support for the conclusion that the authors draw, namely that there is a 'tight link between visual imagery and our ability to retrieve vivid and detail-rich personal past events'. However, as the authors also point out, the exact nature of this relationship is difficult to infer from this study alone, as the slow temporal resolution of fMRI cannot establish the directionality between the hippocampus and the visual-perceptual cortex. This is an exciting future avenue to explore.

      We thank the reviewer for highlighting our contributions and suggesting that the relationship between visual imagery and autobiographical memory recall is an exciting future avenue.

      Weaknesses:

      A weakness of the study is that some of the questions used are a bit vague, and no objective measure is used, which could have been more informative. For example, the spatial navigation question (reported as 'How difficult is it typically for you to orient you spatially?' - a question which is ungrammatical, but potentially reflects a typo in the manuscript) could have been more nuanced to tap into whether participants relied mostly on cognitive maps (likely supported by the hippocampus) or landmarks. It would also have been interesting to conduct a spatial navigation task, as participants do not necessarily have insight into their spatial navigation abilities (they could have been overconfident or underconfident in their abilities).

      Secondly, the question 'how difficult is it typically for you to use your imagination?' could also be more nuanced, as imagination is used in a variety of ways, and we only have reason to hypothesise that people with aphantasia might have difficulties in some cases (i.e. sensory imagination involving perceptual details). It is unlikely that people with aphantasia would have more difficulty than controls in using their imagination to imagine counterfactual situations and engage in counterfactual thought (de Brigard et al., 2013, https://doi.org/10.1016%2Fj.neuropsychologia.2013.01.015) due to its non-sensory nature, but the question used does not distinguish between these types of imagination. Again, this is a ripe area for future research. The general phrasing of 'how difficult is [x]' could also potentially bias participants towards more negative answers, something which ought to be controlled for in future research.

      The main goal of our study was to examine autobiographical memory recall. Therefore, we used the gold standard Autobiographical Interview, or AI (Levine et al. 2002) and an fMRI paradigm to explore autobiographical memory recall in as standardised, precise, and objective a manner as possible.

      In addition to these experimentally rigorous tasks, we employed some loosely formulated questions with the intention for people to reflect on how they perceive their own abilities to recall autobiographical memories, navigate spatially, and use their imagination. We agree with the reviewer that these questions are vague and do not meet the experimental standard for an investigation into spatial cognition or imagination associated with aphantasia. Nonetheless, we believe that these questions provide important additional insights into what participants think about their own cognitive abilities. To put these questions into perspective, we argue in the discussion that spatial cognition and other cognitive functions should be investigated in more depth in individuals with aphantasia in the future.

      As an additional note, all tasks were conducted in German. Thus, we were able to correct the wording of the debriefing question in our revision. We thank the reviewer for bringing this to our attention.

      Strengths:

      A great strength of this study is that it introduces a fMRI paradigm in addition to the autobiographical interview, paralleling work done on episodic memory in cognitive science (e.g. Addis and Schacter, 2007, https://doi.org/10.1016%2Fj.neuropsychologia.2006.10.016 ), which has examined episodic and semantic memory in relation to imagination (future simulation) in non-aphantasic participants as well as clinical populations. Future work could build on this study, and for example use the recombination paradigm (Addis et al. 2009, 10.1016/j.neuropsychologia.2008.10.026 ), which would shed further light on the ability of people with aphantasia to both remember and imagine events. Future work could also build on the interesting findings regarding spatial navigation, which together with previous findings in aphantasia (e.g. Bainbridge et al., 2021, https://doi.org/10.1016/j.cortex.2020.11.014 ) strongly suggests that spatial abilities in people with aphantasia are unaffected. This can shed further light on the different neural pathways of spatial and object memory in general. In general, this study opens up a multitude of new avenues to explore and is likely to have a great impact on the field of aphantasia research.

      We much appreciate the acknowledgment of our work into autobiographical memory employing both the autobiographical interview and fMRI. Furthermore, we hope that our work inspires future research in the way the reviewer outlines and in the way we describe in our manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This study investigates to what extent neural processing of autobiographical memory retrieval is altered in people who are unable to generate mental images ('aphantasia'). Self-report as well as objective measures were used to establish that the aphantasia group indeed had lower imagery vividness than the control group. The aphantasia group also reported fewer sensory and emotional details of autobiographical memories. In terms of brain activity, compared to controls, aphantasics had a reduction in activity in the hippocampus and an increase in activity in the visual cortex during autobiographical memory retrieval. For controls, these two regions were also functionally connected during autobiographical memory retrieval, which did not seem to be the case for aphantasics. Finally, resting-state connectivity between the visual cortex and hippocampus was positively related to autobiographical vividness in the control group but negatively in the aphantasia group. The results are in line with the idea that aphantasia is caused by an increase in noise within the visual system combined with a decrease in top-down communication from the hippocampus.

      Recent years have seen a lot of interest in the influence of aphantasia on other cognitive functions and one of the most consistent findings is deficits in autobiographical memory. This is one of the first studies to investigate the neural correlates underlying this difference, thereby substantially increasing our understanding of aphantasia and the relationship between mental imagery and autobiographical memory.

      We thank the reviewer for highlighting the importance of our findings.

      Strengths:

      One of the major strengths of this study is the use of both self-report as well as objective measures to quantify imagery ability. Furthermore, the fMRI analyses are hypothesis-driven and reveal unambiguous results, with alterations in hippocampal and visual cortex processing seeming to underlie the deficits in autobiographical memory.

      Once again, we thank the reviewer for highlighting the quality of our methods and our results.

      Weaknesses:

      In terms of weaknesses, the control task, doing mathematical sums, also differs from the autobiographical memory task in aspects that are unrelated to imagery or memory, such as self-relevance and emotional salience, which makes it hard to conclude that the differences in activity are reflecting only the cognitive processes under investigation.

      We agree with the reviewer that our control task differs from autobiographical memory in many different ways. In fact, for this first investigation of the neural correlates of autobiographical memory in aphantasia, this is precisely the reason why we chose this mental arithmetic (MA) task. We know from previous studies that MA is, as far as possible, not dependent on hippocampal memory processes (Addis et al., 2007; McCormick et al., 2015, 2017; Leelaarporn et al., 2024). The main goal of the current study was to establish whether there are any differences between individuals with aphantasia and controls. In the next investigation, we can build on these findings to disentangle in more detail what this difference reflects.

      Overall, I believe that this is a timely and important contribution to the field and will inspire novel avenues for further investigation.

      This highly positive conclusion is much appreciated.

      References

      Addis, D. R., Wong, A. T., & Schacter, D. L. (2007). Remembering the past and imagining the future: Common and distinct neural substrates during event construction and elaboration. Neuropsychologia, 45(7), 1363-1377.

      Kriegeskorte, N., Simmons, W., Bellgowan, P. et al. Circular analysis in systems neuroscience: the dangers of double dipping. Nat Neurosci 12, 535–540 (2009). https://doi.org/10.1038/nn.2303

      Leelaarporn, P., Dalton, M. A., Stirnberg, R., Stöcker, T., Spottke, A., Schneider, A., & McCormick, C. (2024). Hippocampal subfields and their neocortical interactions during autobiographical memory. Imaging Neuroscience.

      Levine, B., Svoboda, E., Hay, J. F., Winocur, G., & Moscovitch, M. (2002). Aging and autobiographical memory: dissociating episodic from semantic retrieval. Psychology and Aging, 17(4), 677.

      McCormick, C., St-Laurent, M., Ty, A., Valiante, T. A., & McAndrews, M. P. (2015). Functional and effective hippocampal–neocortical connectivity during construction and elaboration of autobiographical memory retrieval. Cerebral Cortex, 25(5), 1297-1305.

      McCormick, C., Moscovitch, M., Valiante, T. A., Cohn, M., & McAndrews, M. P. (2018). Different neural routes to autobiographical memory recall in healthy people and individuals with left medial temporal lobe epilepsy. Neuropsychologia, 110, 26-36.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This is a very interesting article that makes a substantial contribution to the field of the study of aphantasia as well as the neural mechanisms of autobiographical memory. I would strongly recommend this manuscript to be accepted (with these minor revisions), as it makes a substantial and well-evidenced contribution to the research, and it opens up many interesting avenues for researchers to explore. I was especially excited to see that the Autobiographical Interview had been paired with an fMRI paradigm, something which this field of research highly benefits from, as there are yet so few fMRI studies into aphantasia. I understand that it is the authors' decision whether to accept or reject any of the revisions I recommend here, but I would like to stress that I encourage accepting the recommended revisions, especially as there are some minor inaccuracies in the manuscript as it currently stands. Finally, I would like to stress that though I am based in the area of cognitive science, am not trained in fMRI imaging techniques, and therefore do not stand in a position where I can comment on the methodology pertaining to this part of the study - I encourage the Editors to seek a second reviewer's opinion on this.

      Thank you for the positive evaluation of our manuscript as well as your comments. We have revised our manuscript according to your important suggestions as further explained below.

      Line 33: "aphantasia prohibits people from experiencing visual imagery". This characterisation of aphantasia is too strong, especially as the authors use 32 as a cut-off point on the VVIQ, which represents weak and dim imagery. I would recommend using language like 'people with aphantasia have reduced visual imagery abilities', as this more accurately captures the group of people studied. Please revise throughout the manuscript. Please consult Blomkvist and Marks (2023) on this point who have discussed this problem in the aphantasia literature.

      We agree that aphantasics may experience reduced visual imagery abilities. We have revised our wording throughout the manuscript.

      Line 49: The authors conclude that their results 'indicate that visual mental imagery is essential for detail-rich, vivid AM', but this seems to be a bit too strong, for example since AM can be detail-rich with external (rather than internal) detail, and a person could potentially use mnemonic tricks such as keeping a detail-rich diary in order to boost their memory. That visual imagery is 'essential' implies that it is the only way to achieve detail-rich vivid AM, and this does not seem to be supported by the findings. I would recommend rephrasing it as 'visual mental imagery plays an important role in detail-rich, vivid AM' or 'visual mental imagery mediated detail-rich vivid AM'.

      We altered the sentence in Line 49 using one of the recommended phrases:

      ‘Our results indicate that visual mental imagery plays an important role in detail-rich, vivid AM, and that this type of cognitive function is supported by the functional connection between the hippocampus and the visual-perceptual cortex.’

      Line 69: Blomkvist and Marks (2023) have warned against calling aphantasia a 'condition' and this moreover seems to fit with the authors' previous research (Monzel, 2022). Please consider instead calling aphantasia an 'individual difference' in mental imagery abilities.

      Thank you for the suggestion. We have revised our wording throughout the manuscript, avoiding the term ‘condition’.

      Line 72: Add reference for emotional strength which has also been researched (Wicken et al. 2021, https://doi.org/10.1016/j.cortex.2020.11.014).

      We have added the suggested reference in Line 75:

      ‘Indeed, a handful of previous studies report convergent evidence that aphantasics report less sensory AM details than controls (Bainbridge et al., 2021; Dawes et al., 2020, 2022; Milton et al., 2020; Zeman et al., 2020), which may also be less emotional (Monzel et al., 2023; Wicken et al., 2021).’

      72-73: 'absence of voluntary imagery' - too strong as many people with aphantasia report having weak/dim mental imagery on the VVIQ.

      We agree that aphantasics may experience reduced visual imagery. We have revised this notion throughout the manuscript.

      74: Add reference to Bainbridge study which found a difference between recall of object vs spatial memory. This would be relevant here.

      We have added the suggested reference in Line 76:

      ‘Spatial accuracy, on the other hand, was not found to be impaired (Bainbridge et al., 2021).’

      Lines 94-97: The authors mention 'a prominent theory' but it is unclear which theory is referred to here. The article cited by Pearson (2019) does not suggest the possibility that aphantasia is due to altered connectivity between the hippocampus and visual-perceptual cortices. It suggests that aphantasia is due to impairment in the ventral stream, and in fact says that the hippocampus is unlikely to be affected due to spared spatial abilities in people with aphantasia. Specifically, Pearson claims: "Accordingly, memory areas of the brain that process spatial properties, including the hippocampus, may not be the underlying cause of aphantasia." (page 631). The authors further come back to this point in the discussion section (see comment below), saying that the hypothesis attributed to Pearson is supported by their study. I do not disagree with the point that the hypothesis is supported by the data, but it is unclear to me why the hypothesis is attributed to Pearson.

      Thank you for pointing out this inaccuracy. We have edited the text to spell out our entire train of thought (see Lines 96-102):

      ‘A prominent theory posits that because of this hyperactivity, small signals elicited during the construction of mental imagery may not be detected (Pearson, 2019, Keogh et al., 2020). Pearson further speculates that since spatial abilities seem to be spared, the hippocampus may not be the underlying cause of aphantasia. In agreement, Bergmann and Ortiz-Tudela (2023) speculate that individuals with aphantasia might lack the ability to reinstate visually precise episodic elements from memory due to altered feedback from the visual cortex.’

      Line 97: Blomkvist reference should be 2022 (when first published online).

      The article ‘Aphantasia: In search of a theory’ by Blomkvist was first published on 1st July 2022. However, a correction was added on 13th March 2023. Therefore, we had cited the corrected version in this manuscript. However, we agree that the first publication date should be used and edited the reference accordingly.

      Line 116: 'one aphantasic' could be seen as offensive. I would suggest 'one aphantasic participant'.

      We have altered the paragraph according to your suggestion.

      Line 138: In line with the recommendations put forward by Blomkvist and Marks (2023), I would suggest removing the word 'diagnosed', as this medicalises aphantasia in a way that is not consistent with its not being a kind of mental disorder (Monzel et al., 2022). I would say that aphantasia is instead operationalised as a score between 16-32. However, note that Blomkvist (2022) and Blomkvist and Marks (2023, https://doi.org/10.1016/j.cortex.2023.09.004 ) point out that there is also a lot of inconsistency in this score and how it is used in different studies. In your manuscript, I would recommend removing all wording that indicates that people with aphantasia have no experience of mental imagery, as you have operationalised for a score up to 32 which indicates vague and dim imagery. Describing vague and dim imagery as no imagery/absence of imagery is inconsistent (but common practice in the literature).

      Thank you for your suggestion. We have revised the entire manuscript to eliminate any ambiguous meanings regarding the definition of aphantasia. Moreover, we replaced the word ‘diagnosed’ with ‘identified’ in Line 146.

      Line 153: maybe 'correlated with imagery strength' rather than 'measures imagery strength'?

      We have altered the sentence according to your suggestion in Line 160:

      ‘Previous studies have shown that the binocular rivalry task validly correlated with mental imagery strength.’

      Line 162: "For participants who were younger than 34 years, the middle-age memory was replaced by another early adulthood memory". Is there precedence for this? Please add one sentence to explain/justify for the reader why a memory from this time period was chosen.

      To keep the data set homogeneous, with five episodic autobiographical memories from five life periods per participant, we asked participants who were younger than 34 years at the time of the interview to provide a second early adulthood memory in place of the middle-age memory, as they had not yet reached middle age. This follows Levine et al. (2002), in which younger adults (age < 34 years) selected two events from the early adulthood period. For the last time period, all participants provided memories from the previous year. We have added an additional explanation in this section in Line 170:

      ‘In order to acquire five AMs in every participant, the middle age memory was replaced by another early adulthood memory for participants who were younger than 34 years old (see Levine et al., 2002). Hence, all participants provided the last time period with memories from their previous year.’

      Line 169: "During the general probe, the interviewer asked the participant encouragingly to promote any additional details." Consider a different word choice, 'promote' sounds odd.

      We have altered the sentence according to your suggestion in Line 180:

      ‘During the general probe, the interviewer asked the participant encouragingly to provide any additional details.’

      Line 196-198: the phrasing of these questions could have biased participants toward reporting it being more difficult. Did the authors control for this possibility in any way? The phrasing ‘How easy is it for you to [x]?’ might also be considered in a future study.

      Thank you for pointing this out. These debriefing questions were thought of as open questions to get people to talk about their experiences. They were not meant as rigorous scientific experiments. Framing it in a positive way is a good idea for future research.

      We have edited the manuscript on Line 394-396:

      ‘The debriefing questions were employed as a way for participants to reflect on their own cognitive abilities. Of note, these were not meant to represent or replace necessary future experiments.’

      Line 197: This question is ungrammatical. Is this a typo, or was this how the question was actually posed? What language was the study conducted in?

      All interviews within this study were conducted in German. Hence, the questions listed in the manuscript were all translated from German into English. We have added this information to the Materials and Methods section in Line 169 and rephrased the relevant questions in Lines 208-210:

      ‘All interviews were conducted in German.’

      (1) Typically, how difficult is it for you to recall autobiographical memories?

      (2) Typically, how difficult is it for you to orient yourself spatially? 

      (3) Typically, how difficult is it for you to use your imagination?’

      Line 211: The authors write that participants were asked to "re-experience the chosen AM and elaborate as many details as possible in their mind's eye" was this the instruction used? I think stating the explicit instruction here would be relevant for the reader. If this is the word choice, it is also interesting as the autobiographical interview does not normally specify to re-experience details 'in one's mind's eye'.

      The instructions given to the participants were to choose an AM and re-experience/elaborate it in their mind with as many details as possible without explaining them out loud. We have clarified this in Lines 221-223.

      ‘For the rest of the trial duration, participants were asked to re-experience the chosen AM and try to recall as many details as possible without speaking out loud.’

      Line 213: Were ‘vivid’ and ‘faint’ the only two options? Why was a 5-point scale (like the VVIQ scale) not used to better be able to compare?

      During the scanning session, participants were given a button box with two buttons: they pressed one with the index finger for 'vivid' and the other with the middle finger for 'faint'. We did not use a 5-point scale, to avoid confusion with the buttons during scanning. We have clarified this in Line 224:

      ‘We chose a simple two-button response in order to keep the task as easy as possible.’

      Line 347: Do the authors mean the same thing by 'imagery strength' and 'imagery vividness'? This would be good to clarify as it is not clear that these words mean the same thing.

      Imagery strength is often used to describe the results of the Binocular Rivalry Task, whereas vividness of mental imagery is often used to describe the results of the VVIQ. Although both tasks are correlated, the VVIQ measures vividness, whereas the dimension of the Binocular Rivalry Task is not clearly defined. We added this information in a footnote on page 10.

      Lines 353 - 356: When the authors first say that aphantasics described fewer memory details than controls, does this refer to external + internal details? Please clarify.

      Lines 353-360: The authors first say that aphantasics report "internal details (M = 43.59, SD = 17.91) were reported more often than external details (M = 20.64, SD = 8.94)" (line 355). But then they say: "a 2-way interaction was found between the type of memory details and group, F(1, 27)= 54.09, p < .001, ηp2 = .67, indicating that aphantasics reported significantly less internal memory details, t(27) = 5.07, p < .001, d = 1.83, but not significantly less external memory details, t(27) = 0.13, p = .898, compared to controls (see Figure 1b)" (line 358). This seems to first say that aphantasics didn't report fewer details than controls, but then that they did report fewer internal details than controls. Please clarify if this is correct.

      Line 383: Results from controls are not reported in this section.

      We first reported the main effects of the different factors: aphantasics reported fewer details than controls (regardless of memory period and type of memory details), internal details were reported more often than external details (regardless of group and memory period), and more details were reported for recent than remote memories (regardless of group and type of memory details). Subsequently, we report the simple effects for aphantasics and controls separately. To further clarify, we added the following segment in Line 360:

      ‘Regarding the AI, we found significant main effects of memory period, F(1, 27) = 11.88, p = .002, ηp2 = .31, type of memory details, F(1, 27) = 189.03, p < .001, ηp2 = .88, and group, F(1, 27) = 9.98, p = .004, ηp2 = .27. When the other conditions were collapsed, aphantasics (M = 26.29, SD = 9.58) described less memory details than controls (M = 38.36, SD = 10.99). For aphantasics and controls combined, more details were reported for recent (M = 35.17, SD = 14.19) than remote memories (M = 29.06, SD = 11.12), and internal details (M = 43.59, SD = 17.91) were reported more often than external details (M = 20.64, SD = 8.94). More importantly, a 2-way interaction was found between type of memory details and group, F(1, 27) = 54.09, p < .001, ηp2 = .67, indicating that aphantasics reported significantly less internal memory details, t(27) = 5.07, p < .001, d = 1.83, but not significantly less external memory details, t(27) = 0.13, p = .898, compared to controls (see Figure 1b).’

      Overall, the results were reported for aphantasics and controls separately in Lines 368-372.
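
      As a reader's cross-check of the passage quoted above, the reported partial eta squared values can be recovered from the F statistics and their degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is not the authors' analysis code; the function name and the labels are illustrative assumptions, and only the numbers are taken from the quoted text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported eta_p^2) as quoted in the response above.
reported = {
    "memory period": (11.88, 1, 27, 0.31),
    "type of memory details": (189.03, 1, 27, 0.88),
    "group": (9.98, 1, 27, 0.27),
    "details x group interaction": (54.09, 1, 27, 0.67),
}

for label, (f_value, df1, df2, eta_reported) in reported.items():
    eta = partial_eta_squared(f_value, df1, df2)
    # Compare at two decimals, the precision used in the manuscript.
    print(f"{label}: computed {eta:.2f}, reported {eta_reported:.2f}")
```

      All four recomputed values agree with the reported ones at two decimal places, so the F statistics and effect sizes in the quoted segment are internally consistent.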

      Line 386: The question does not specify that it's asking about using imagination in daily life, even though this is what results report. I'm not sure that the question implies the use of imagination in daily life, so I would recommend removing this reference here.

      We have removed the “in daily life” since this was not part of the original debriefing question.

      Line 394: Could this slowness in response reflect uncertainty about the vividness?

      Since the reason for this slowness is not known, we have refrained from adding this to the discussion. However, we added this as a short insertion in line 406:

      ‘Moreover, aphantasics responded slower (M = 1.34 s, SD = 0.38 s) than controls (M = 1.00 s, SD = 0.29 s) when they were asked whether their retrieved memories were vivid or faint, t(28) = 2.78, p = .009, possibly reflecting uncertainty in their response.’

      Line 443: Graph E, significance not indicated on the graph.

      After preprocessing, the fMRI data were statistically analyzed using the GLM contrast AM versus MA. The resulting images were then thresholded at p < 0.001, so that the highlighted voxels in Fig. 3 A, B, C, and D are only those in which we already know there is a statistical difference between our conditions. Graph E illustrates only the descriptive means and variance of the significant differences in Fig. 3 C and D. This display is useful since the reader can more easily assess the difference between two conditions and two groups at a glance. For a general discussion of this topic, please also see the literature on circular analysis in fMRI (Kriegeskorte et al., 2009).

      Line 521-522: The authors claim that Pearson (2019) forwards the hypothesis that heightened activity of visual-perceptual cortices hinders aphantasics from detecting small imagery-related signals. However, I find no statement of this hypothesis in Pearson (2019). It is unclear to me why this hypothesis is attributed to Pearson (2019). Please remove this reference or provide a correct citation for where the hypothesis is stated. Further, it is not clear from what is written how the results support this hypothesis as this is rather brief - please elaborate on this.

      We attributed this hypothesis to Pearson (2019) according to his Fig. 4, which states: ‘A strong top-down signal and low noise (bottom left) gives the strongest mental image (square), whereas a high level of neural noise and a weak top-down imagery signal would produce the weakest imagery experience (top right).’

      We have edited our manuscript to reflect Pearson better in Lines 543-550:

      ‘In a prominent review, Pearson synthesizes evidence about the neural mechanism of imagery strength (Pearson, 2019). Indeed, activity metrics in the visual cortex predict imagery strength (Cui et al., 2007; Dijkstra et al., 2017). Interestingly, lower resting activity and excitability result in stronger imagery, and reducing cortical activity in the visual cortex via transcranial direct current stimulation (tDCS) increases visual imagery strength (Keogh et al., 2020). Thus, one potential mechanism of aphantasia-related AM deficits is that the heightened activity of the visual-perceptual cortices observed in our and previous work hinders aphantasics to detect weaker imagery-related signals.’

      Line 575: Consider citing Blomkvist (2022) who has argued that aphantasia is an episodic memory condition

      We added the suggested reference in Line 601.

      Line 585: Consider citing Bainbridge et al (2021) https://doi.org/10.1016/j.cortex.2020.11.014

      We have added the suggested reference in Line 612.

      Line 581: It might be relevant here to also discuss non-visual details, which have indeed been investigated in your present study. E.g. the lower emotional details, temporal details, place details, etc.

      We have edited our discussion to reflect the non-visual details better in Line 605:

      ‘In fact, previous and the current study show that aphantasics and individuals with hippocampal damage report less internal details across several memory detail subcategories, such as emotional details and temporal details (Rosenbaum et al., 2008; St-Laurent et al., 2009; Steinvorth et al., 2005), and these deficits can be observed regardless of the recency of the memory (Miller et al., 2020). These similarities suggest that aphantasics are not merely missing the visual-perceptual details to specific AM, but they have a profound deficit associated with the retrieval of AM.’

      Place details are discussed on page 37 onwards.

      Line 605: I agree with this interesting suggestion for future research. It would also be relevant to reference Bainbridge (2021) here who tested spatial cognition in a drawing task and found that aphantasic participants correctly recalled spatial layouts of rooms but reported fewer objects than controls. It might also be worth pointing out that the present study does not actually test for accuracy in spatial cognition, so it could be the case that people with aphantasia feel confident that they can navigate well, but they might in fact not. Future studies relying on objective measures should test this possibility.

      We have added the suggested reference in Line 625.

      Lines 609-614: Is there any evidence that complex decision-making and complex empathy tasks depend on constructed scenes with visual-perceptual details? This hypothesis seems a bit far-fetched without any supporting evidence. In fact, it seems unlikely to be supported as we also know that people with aphantasia generally live normal lives, and often have careers that we can assume involve complex decision-making (see Zeman 2020 who report aphantasics who work as computer scientists, managers, etc). I would recommend that the authors provide evidence of the role of mental imagery in complex decision-making and complex empathy tasks, mediated by scene construction, to support this hypothesis as viable to test for future research. It is also unclear how this point connects to the argument made by Bergmann and Ortiz-Tudela (2023). In fact, Bergmann and Ortiz-Tudela seem to make the same argument as Pearson (2019) does - that aphantasia results from impairments in the ventral stream, but that the dorsal stream is unaffected. However, Blomkvist (2022) argues that this view is too simplistic to be able to account for the variety of deficits that we see in aphantasia. I would recommend either engaging more fully with this debate or cutting it, as it currently is too vague for a reader to follow.

      We have decided to leave the discussion about scene construction and its connection to complex decision making and empathy out of the current manuscript. We have included the argument of Bergmann & Ortiz-Tudela (2023) in the Introduction (Line 101):

      ‘In agreement, Bergmann and Ortiz-Tudela (2023) speculate that individuals with aphantasia might lack the ability to reinstate visually precise episodic elements from memory due to altered feedback from the visual cortex.’

      Reviewer #2 (Recommendations For The Authors):

      In general, I really enjoyed reading this paper.

      Thank you very much for the positive evaluation of our manuscript as well as your comments.

      There were only a few things that I had some concerns about. For example, it was unclear to me whether the whole-brain analysis (Figures 3 and 4) was corrected for multiple comparisons or why only a small volume correction was applied for the functional connectivity analysis. If these results are borderline significant, this should be made more explicit in the manuscript. I don't think this is a major issue as the investigation of both the hippocampus and visual cortex was strongly hypothesis-driven, but it would still be good to be explicit about the strength of the findings.

      For the whole-brain analysis, we applied a threshold of p < .001 with a voxel cluster size of 10; no other correction for multiple comparisons was applied. The peak in the right hippocampus did survive the whole-brain threshold, but we decided to lower this threshold just for display purposes in Figure 3, so that readers can easily see the cluster.

      We have made the statistical thresholds more easily accessible to the reader on the following pages:

      Figure 3 (Page 27): ‘Images are thresholded at p < .001, cluster size 10, uncorrected, except (D) which is thresholded at p < .01, cluster size 10, for display purposes only (i.e., the peak voxel and adjacent 10 voxels also survived p < .001, uncorrected).’

      Figure 4 (Page 30): ‘Image is displayed at p < .05, small volume corrected, and a voxel cluster threshold of 10 adjacent voxels.’
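
      For readers unfamiliar with this kind of thresholding, the following is a minimal sketch of an uncorrected voxel threshold combined with a cluster-extent criterion, as described above (p < .001, at least 10 adjacent voxels). It is not the authors' pipeline: the function name, the dictionary representation of the p-value map, and the use of face-adjacency (6 neighbours) to define "adjacent" are all assumptions made for illustration, and neuroimaging packages may define connectivity differently.

```python
from collections import deque

# Offsets to the six face-adjacent neighbours of a voxel (an assumed connectivity rule).
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def cluster_threshold(p_map, p_thresh=0.001, min_cluster=10):
    """Return voxels with p < p_thresh that belong to a connected cluster
    of at least min_cluster such voxels.

    p_map maps (x, y, z) voxel coordinates to uncorrected p-values.
    """
    supra = {voxel for voxel, p in p_map.items() if p < p_thresh}
    kept, seen = set(), set()
    for start in supra:
        if start in seen:
            continue
        # Breadth-first search to collect the cluster containing `start`.
        cluster, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in supra and neighbour not in seen:
                    seen.add(neighbour)
                    cluster.add(neighbour)
                    queue.append(neighbour)
        if len(cluster) >= min_cluster:
            kept |= cluster
    return kept


# Toy map: a 12-voxel cluster, a 3-voxel cluster, and one voxel above threshold.
p_map = {(x, 0, 0): 0.0005 for x in range(12)}
p_map.update({(x, 5, 5): 0.0001 for x in range(3)})
p_map[(20, 20, 20)] = 0.01

surviving = cluster_threshold(p_map)
print(len(surviving))
```

      In this toy map only the 12-voxel cluster survives: the 3-voxel cluster passes the voxel threshold but fails the extent criterion, and the isolated voxel fails the voxel threshold.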

      I was wondering whether it would be possible to use DCM to investigate the directionality of the connectivity. Given that there are only two ROIs and two alternative hypotheses (top-down versus bottom-up) this seems like an ideal DCM problem.

      We thank the reviewer for this suggestion and will consider testing the effective connectivity between both regions of interest in a future investigation. 

      Line 385: typo: 'great' should be 'greater'.

      We have corrected the typo from ‘great’ to ‘greater’ in Line 397.

      Line 400: absence of evidence of an effect is not evidence of absence of an effect.

      We agree with the reviewer that this was unclear. We changed the wording in Line 412:

      ‘In addition, aphantasics and controls did not differ significantly in their time searching for a memory in AM trials, t(19) = 1.03, p = .315.’

      Typo line 623: 'overseas'.

      We have altered the mistyped word from ‘overseas’ to ‘oversees’ in Line 647.

    1. Author response:

      Reviewer #1 (Public review):

      (1) The link between the background in the introduction and the actual study and findings is often tenuous or not clearly explained. A re-working of the intro to better set up and link to the study questions would be beneficial.

      Response: upon revision, we plan to rewrite the introduction of the manuscript.

      (2) For the sequencing, which kit was used on the Novaseq6000?

      Response: for sequencing, we used the Chromium Controller and Chromium Single Cell 3’ Reagent Kits (v3 chemistry CG000183) on the Novaseq6000. We apologize for omitting this important information and will add it to the Methods.

      (3) Additional details are needed for the analysis pipeline. How were batch effects identified/dealt with, what were the precise functions and settings for each step of the analysis, how was clustering performed and how were clusters validated etc. Currently, all that is given is software and sometimes function names which are entirely inadequate to be able to assess the validity of the analysis pipeline. This could alternatively be answered by providing annotated copies of the scripts used for analysis as a supplement.

      Response: we apologize for the inadequate description of the data analysis process, which was due to the word count limit. We plan to provide more information and, if possible, to include the scripts as supplementary data in the revised manuscript.

      (4) For Cell type annotation, please provide the complete list of "selected gene markers" that were used for annotation.

      Response: we will add the list of marker genes for cell type annotation in the revised manuscript.

      (5) No statistics are given for the claims on cell proportion differences throughout the paper (for cell types early, epithelial sub-clusters later, and immune cell subsets further on). This should be a multivariate analysis to account for ADC/SCC, HPV+/- and Early/Late stage.

      Response: to address this inadequacy, we plan to use statistical approaches in further analyses to compare the differences between each set of groups upon revision.

      (6) The Y-axis label is missing from the proportion histograms in Figure 2D. In these same panels, the bars change widths on the right side. If these are exclusively in ADC, show it with a 0 bar for SCC, not doubling the width which visually makes them appear more important by taking up more area on the plot.

      Response: we apologize for the imprecise presentation of histograms such as Fig 2D, and we will add the Y-axis labels. As for the width of the bars, we used the histograms as originally generated by the data package; we did not double the width on purpose to increase their visual importance. We sincerely apologize for this and will correct similar mistakes throughout the manuscript.

      (7) Throughout the manuscript, informatic predictions (differentiation potential, malignancy score, stemness, and trajectory) are presented as though they're concrete facts rather than the predictions they are. Strong conclusions are drawn on the basis of these predictions which do not have adequate data to support. These conclusions which touch on essentially all of the major claims made in the manuscript would need functional data to validate, or the claims need to be very substantially softened as they lack concrete support. Indeed, the fact that most of the genes examined that were characteristic of a given cluster did not show the expected expression patterns in IHC highlights the fact that such predictions require validation to be able to draw proper inferences.

      Response: we agree that many conclusions based on bioinformatic predictions are written too assertively. Upon revision, we will rewrite these conclusions more precisely.

      (8) The cluster Epi_10_CYSTM1 which is the basis for much of the paper is present in a single individual (with a single cell coming from another person), and heavily unconnected from the rest of the epithelial populations. If so much emphasis is placed on it, the existence of this cluster as a true subset of cells requires validation.

      Response: we are thankful for this suggestion. We think that each cluster of epithelial cells is distinguished from the other clusters and identified by DEGs, but that the clusters are not heavily unconnected from one another. Upon revision, we plan to add further validation of the existence of Epi_10_CYSTM1.

      (9) Claims based on survival analysis of TCGA for Epi_10_CYSTM1 are based on a non-significant p-value, though there is a slight trend in that direction.

      Response: in the TCGA survival analysis for Epi_10, we found a not-so-slight trend toward a difference between groups (with a small P value). We therefore presented these data, hoping to strengthen the clinical significance of this cluster. However, this indeed caused controversy because the P value is non-significant. We plan to rewrite the conclusion more precisely or delete these data in the revised manuscript.

      (10) The claim "The identification of Epi_10_CYSTM1 as the only cell cluster found in patients with stage IIICp raises the possibility that this cluster may be a potential marker to diagnose patients with lymph node metastasis." This is incorrect according to the sample distributions which clearly show cells from the patient who has EPI_10_CYSTM1 in multiple other clusters. This is then used as justification for SLC26A3 which appears to be associated with late stage, however, in the images SLC26A3 appears to be broadly expressed in later tumours rather than restricted to a minor subset as it should be if it were actually related to the EPI_10_CYSTM1 cluster.

      Response: we are thankful for this question. The conclusion “The identification of Epi_10_CYSTM1 as the only cell cluster found in patients with stage IIICp raises the possibility that this cluster may be a potential marker to diagnose patients with lymph node metastasis” was indeed stated too definitively given the sample distribution. We will correct the description in the upcoming revised manuscript. As for SLC26A3, we do not think it is “broadly” expressed; rather, it is specific to later tumors. When we presented the IHC data, we showed only the strongly positive area of each slide in order to emphasize the differences, but this has caused misunderstanding. Thus, upon revision, we would like to show the other areas of one case, or even the scan of one whole slide, as supplementary data.

      (11) The authors claim that cytotoxic T cells express KRT17, and KRT19. This likely represents a mis-clustering of epithelial cells.

      Response: we apologize for neglecting further validation of the cytotoxic T cells. As shown in fig. 4B and 4C, the four different clusters of T cells were identified based on canonical T cell markers. We then focused mainly on the validation and further analysis of Tregs and neglected the other clusters. In fig. 4D we intended only to show the top DEGs in each T cell cluster, hoping to find potential marker genes for the next step of analysis. However, we did not notice that epithelial cells might have contaminated the cytotoxic T cell cluster during clustering. We will optimize this part of the analysis in our revision.

      (12) Multiple claims are made for specific activities based on GO term biological process analysis which while not contradictory to the data, certainly are by no means the only explanation for it, nor directly supported.

      Response: our initial purpose was to use the GO analysis to support our conclusions. However, we recognize that these results are only suggestive and not direct evidence, which is also the problem with our writing raised in question (7). Therefore, in our revised manuscript, we plan to rewrite the conclusions drawn from the GO analysis more rigorously or delete these data.

      Reviewer #2 (Public review):

      (1) I believe that many of the proposed conclusions are over-interpretations or unwarranted generalizations of the single-cell analysis. These conclusions are often based on populations in the scRNA-seq data that are described as enriched or specific to a given group of samples (eg. ADC). This conclusion is based on the percentage of cells in that population belonging to the given group; for example, a cluster of cells that dominantly come from ADC. The data includes multiple samples for each group, but statistical approaches are never used to demonstrate the reproducibility of these claims.

      Response: we understand that many of the conclusions are stated too confidently and lack strong supporting evidence, so we will improve the writing in the revised manuscript. More importantly, to strengthen the validity of our data, we will use statistical approaches in further analysis.
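
      As one possible illustration of such a statistical approach (a sketch under stated assumptions, not the method the authors have committed to), per-sample proportions of a cluster can be compared between two groups with an exact permutation test, treating each patient sample, rather than each cell, as the unit of replication. The function name `permutation_p_value` and all proportions below are hypothetical and are not taken from the manuscript.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in mean proportion.

    Each entry is the fraction of one sample's cells that fall in the
    cluster of interest, so samples (not cells) are the replicates.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    total = sum(pooled)
    extreme = count = 0
    # Enumerate every way of relabelling n_a of the samples as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    return extreme / count

# Hypothetical proportions: cluster consistently present in group A only.
consistent = permutation_p_value([0.20, 0.25, 0.30, 0.22],
                                 [0.01, 0.02, 0.00, 0.03])
# Hypothetical proportions: cluster present in a single sample of group A.
single_sample = permutation_p_value([0.30, 0.0, 0.0, 0.0],
                                    [0.0, 0.0, 0.0, 0.0])
print(consistent, single_sample)
```

      A univariate test like this does not replace the multivariate analysis requested by Reviewer #1 (accounting for ADC/SCC, HPV status and stage together); it only shows why a cluster driven by one sample provides little evidence of a reproducible group difference.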

      (2) This leads to problematic conclusions. For example, the "ADC-specific" Epi_10_CYSTM1 cluster, which is a central focus of the paper, only contains cells from one of the 11 ADC samples and represents only a small fraction of the malignant cells from that sample (Sample 7, Figure 2A). Yet, this population is used to derive SLC26A3 as a potential biomarker. SLC26A3 transcripts were only detected in this small population of cells (none of the other ADC samples), which makes me question the specificity of the IHC staining on the validation cohort.

      Response: we are sincerely grateful for the questions about the validity, appropriateness and real potential of SLC26A3. We plan to add more explanation of the importance of SLC26A3 in the Discussion. We also apologize for some overconfident conclusions about ADC-specific cell clusters, as well as the marker gene SLC26A3. However, we do not think these conclusions are problematic. Given the heterogeneity among individuals, and even among different sampling sites within one individual, we think that a “small fraction” is not necessarily meaningless. Also, these ADC-specific clusters (including Epi_10_CYSTM1) do make up appreciable proportions when compared with the “big fraction” groups (Fig. 2D). Furthermore, because these DEGs are specific to ADC and not SCC, we think the genes of these ADC-specific clusters may play a central role in the differences between ADC and SCC, and we used a validation experiment to support this hypothesis. Lastly and most importantly, SLC26A3 came from sample 7, whose clinical stage is FIGO IIIC (late stage) and whose pathological type is ADC. Among the 15 cases, only 4 are late stage (3 of which are ADC). From this point of view, we think that expression of SLC26A3 (or the presence of cluster Epi_10_CYSTM1) in 1 of 3 cases (33%) should be considered as a potential candidate. Samples from early-stage and SCC patients contain no Epi_10_CYSTM1 cells, which likewise indicates the specificity of this cell cluster to ADC. Therefore, in our revised manuscript, we plan to add a more in-depth discussion of this question.

      (3) This is compounded by technical aspects of the analysis that hinder interpretation. For example, it is clear that the clustering does not perfectly segregate cell types. In Figures 2B and D, it is evident that C4 and C5 contain mixtures of cell type (eg. half of C4 is EPCAM+/CD3-, the other half EPCAM-/CD3+). These contaminations are carried forward into subclustering and are not addressed. Rather, it is claimed that there is a T cell population that is CD3- and EPCAM+, which does not seem likely.

      Response: do you mean Figure 1B and D? In the revised manuscript, we will list the canonical marker genes used to cluster the different cell types, to at least show that the clustering of cell types matches most published references. To further avoid contamination of cells in each cluster, we will apply quality controls and re-analyze these data upon revision.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work presents an in-depth characterization of the factors that influence the structural dynamics of the Clostridium botulinum guanidine-IV riboswitch (riboG). Using single-molecule FRET, the authors demonstrate that riboG undergoes ligand and Mg2+ dependent conformational changes consistent with the dynamic formation of a kissing loop (KL) in the aptamer domain. Formation of the KL is attenuated by Mg2+ and Gua+ ligand at physiological concentrations as well as the length of the RNA. Interestingly, the KL is most stable in the context of just the aptamer domain compared to longer RNAs capable of forming the terminator stem. To attenuate transcription, binding of Gua+ and formation of the KL must occur rapidly after transcription of the aptamer domain but before transcription of the rest of the terminator stem.

      Strengths:

      (1) Single-molecule FRET microscopy is well suited to unveil the conformational dynamics of KL formation and the authors provide a wealth of data to examine the effect of the ligand and ions on riboswitch dynamics. The addition of complementary transcriptional readthrough assays provides further support for the authors' proposed model of how the riboswitch dynamics contribute to function.

      (2) The single-molecule data strongly support that the effect of Gua+ ligand and Mg2+ influence the RNA structure differently for varying lengths of the RNA. The authors also demonstrate that this is specific for Mg2+ as Na+ and K+ ions have little effect.

      (3) The PLOR method utilized is clever and well adapted for both dual labeling of RNAs and examining RNA at various lengths to mimic co-transcriptional folding. Using PLOR, they demonstrate that a change in the structural dynamics and ligand binding can occur after the extension of the RNA transcript by a single nucleotide. Such a tight window of regulation has intriguing implications for kinetically controlled riboswitches.

      Weaknesses:

      (1) The authors use only one mutant to confirm that their FRET signal indicates the formation of the KL. Importantly, this mutation does not involve the nucleotides that are part of the KL interaction. It would be more convincing if the authors used mutations in both strands of the KL and performed compensatory mutations that restore base pairing. Experiments like this would solidify the structural interpretation of the work, particularly in the context of the full-length riboG RNA or in the cotranscriptional mimic experiments, which appear to have more conformational heterogeneity.

      We thank the reviewer for describing our work as an “in-depth characterization” of riboG. We agree with the reviewer and have added two more mutants, G71C and U72C, with mutations located at the KL (Figure 2– figure supplement 8A, 8B, 9A, 9B, Figure 3– figure supplement 6A, 6B, 7A, 7B, and Figure 4– figure supplement 6A, 6B, 7A, 7B). Furthermore, we have tested the compensatory mutations C30G-G71C and A29G-U72C, which were designed to restore base pairing in the KL (Figure 2– figure supplement 8C, 8D, 9C, 9D, Figure 3– figure supplement 6C, 6D, 7C, 7D, and Figure 4– figure supplement 6C, 6D, 7C, 7D). We added the experimental results to the revised manuscript accordingly as “The highly conserved nucleotides surrounding the KL are crucial for its formation (Lenkeit et al., 2020). To test our hypothesis that the state with EFRET ~ 0.8 corresponds to the conformation with the KL, we performed smFRET analysis on several mutations at these crucial nucleotides (Figure 2– figure supplement 8–10). Consistent with our expectations, the peak with EFRET ~ 0.8 was significantly diminished in the riboG-G71C mutant, which features a single nucleotide mutation at site 71 (with 97% nucleotide conservation) in the KL (Figure 2– figure supplement 8A and 8B). It is worth noting that the C30G-G71C mutant, which was initially expected to restore a base pair in the KL, did not successfully bring about the anticipated peak of EFRET ~ 0.8 (Figure 2– figure supplement 8C and 8D). On the other hand, the riboG-U72C mutant exhibited a lower proportion at the state with EFRET ~ 0.8 than riboG-apt. However, the A29G and U72C mutations restored a base pair in the KL, as well as the formation of the KL (Figure 2– figure supplement 9). Furthermore, our investigation revealed that the G77C mutant, involving a single nucleotide mutation at a highly conserved site, 77 (with 97% nucleotide conservation), also hindered the formation of the KL (Figure 2– figure supplement 10). This finding aligns with previous research (Lenkeit et al., 2020) and the predicted secondary structure of the G77C mutant by Mfold (Zuker, 2003)” (page 7), “In contrast to riboG-term, both its G71C and C30G-G71C mutants displayed a reduced proportion of the state with EFRET ~ 0.8. Remarkably, the fractions of EFRET ~ 0.8 remained unaffected by the addition of 1.0 mM Gua+ in these mutants. Distinct from riboG-term, no structural transitions between states were observed in the two mutants (Figure 3– figure supplement 6). Regarding the U72C mutant of riboG-term, the mutation at the site 72 had a reduced impact on the KL conformation in the presence of 1.0 mM Gua+ and 2.0 mM Mg2+. However, the increased proportion of EFRET ~ 0.8 in the A29G-U72C mutant of riboG-term suggests that these mutations can restore the base-pairing between sites 29 and 72, as well as facilitate the formation of the KL (Figure 3– figure supplement 7)” (page 8), and “Upon comparing the G71C and C30G-G71C mutants of the full-length riboG with their wild-type counterpart, it was observed that the wild-type adopted higher proportions of the state with EFRET ~ 0.8 (Figure 4– figure supplement 6). Regarding the U72C and A29G-U72C mutants of the full-length riboG, their behaviors with regards to the peak with EFRET ~ 0.8 were similar to that of their counterparts in riboG-term (Figure 4– figure supplement 7)” (page 9).

      (2) The existence of the pre-folded state (intermediate FRET ~0.5) is not well supported in their data and could be explained by an acquisition artifact. The dwell times are very short often only a single frame indicating that there could be a very fast transition (< 0.1s) from low to high FRET that averages to a FRET efficiency of 0.5. To firmly demonstrate that this intermediate FRET state is metastable and not an artifact, the authors need to perform measurements with a faster frame rate and demonstrate that the state is still present.

      We thank the reviewer for the great comment. We added smFRET experiments at a higher time resolution, 20 ms, as well as at a lower time resolution (Figure 2– figure supplement 3). Based on our experimental results, the intermediate state (EFRET ~0.5) is present in the smFRET data collected at 20 ms, 100 ms and 200 ms.

      (3) The PLOR method employs a non-biologically relevant polymerase (T7 RNAP) to mimic transcription elongation and folding near the elongation complex. T7 RNAP has a shorter exit channel than bacterial RNAPs and therefore, folding in the exit channel may be different between different RNAPs. Additionally, the nascent RNA may interact with bacterial RNAP differently. For these reasons, it is not clear how well the dynamics observed in the T7 ECs recapitulate riboswitch folding dynamics in bacterial ECs where they would occur in nature. 

      We thank the reviewer for the comment. We agree with the reviewer that bacterial and T7 RNAPs may behave differently owing to their differences in transcriptional speed, dynamics, interactions, and so on. We have added the following statement to the Discussion: “It is worth noting that the RNAP utilized in our study is T7 RNAP, which exhibits distinct characteristics compared to bacterial RNAP in terms of transcriptional speed, dynamics, and interactions. However, Xue et al. have reported similarities between T7 and E. coli RNAP in the folding of nascent RNA. Additionally, Lou and Woodson have provided valuable insights into the co-transcriptional folding of the glmS ribozyme using T7 RNAP (Xue et al., 2023; Lou & Woodson, 2024)” (page 13–14).

      Reviewer #2 (Public Review):

      Summary:

      Gao et al. used single-molecule FRET and step-wise transcription methods to study the conformations of the recently reported guanidine-IV class of bacterial riboswitches that upregulate transcription in the presence of elevated guanidine. Using three riboswitch lengths, the authors analyzed the distributions and transitions between different conformers in response to different Mg2+ and guanidine concentrations. These data led to a three-state kinetic model for the structural switching of this novel class of riboswitches whose structures remain unavailable. Using the PLOR method that the authors previously invented, they further examined the conformations, ligand responses, and gene-regulatory outcomes at discrete transcript lengths along the path of vectorial transcription. These analyses uncover that the riboswitch exhibits differential sensitivity to ligand-induced conformational switching at different steps of transcription, and identify a short window where the regulatory outcome is most sensitive to ligand binding.

      Strengths:

      Dual internal labeling of long RNA transcripts remains technically very challenging but essential for smFRET analyses of RNA conformations. The authors should be commended for achieving very high quality and purity in their labelled RNA samples. The data are extensive, robust, thorough, and meticulously controlled. The interpretations are logical and conservative. The writing is reasonably clear and the illustrations are of high quality. The findings are significant because the paradigm uncovered here for this relatively simple riboswitch class is likely also employed in numerous other kinetically regulated riboswitches. The ability to quantitatively assess RNA conformations and ligand responses at multiple discrete points along the path towards the full transcript provides a rare and powerful glimpse into cotranscriptional RNA folding, ligand-binding, and conformational switching.

      Weaknesses:

      The use of T7 RNA polymerase instead of a near-cognate bacterial RNA polymerase in the termination/antitermination assays is a significant caveat. It is understandable as T7 RNA polymerase is much more robust than its bacterial counterparts, which probably will not survive the extensive washes required by the PLOR method. The major conclusions should still hold, as the RNA conformations are probed by smFRET at static, halted complexes instead of on the fly. However, potential effects of the cognate RNA polymerase cannot be discerned here, including transcriptional rates, pausing, and interactions between the nascent transcript and the RNA exit channel, if any. The authors should refrain from discussing potential effects from the DNA template or the T7 RNA polymerase, as these elements are not cognate with the riboswitch under study.

      We thank the reviewer for describing our work as follows: “The data are extensive, robust, thorough, and meticulously controlled. The interpretations are logical and conservative. The writing is reasonably clear and the illustrations are of high quality”. We agree with the reviewer that bacterial and T7 RNAPs may behave differently owing to their differences in transcriptional speed, dynamics, interactions, and so on. We have added the following statement to the Discussion: “It is worth noting that the RNAP utilized in our study is T7 RNAP, which exhibits distinct characteristics compared to bacterial RNAP in terms of transcriptional speed, dynamics, and interactions. However, Xue et al. have reported similarities between T7 and E. coli RNAP in the folding of nascent RNA. Additionally, Lou and Woodson have provided valuable insights into the co-transcriptional folding of the glmS ribozyme using T7 RNAP (Xue et al., 2023; Lou & Woodson, 2024)” (page 14).

      Reviewer #3 (Public Review):

      Summary:

      In this article, Gao et al. use single-molecule FRET (smFRET) and position-specific labelling of RNA (PLOR) to dissect the folding and behavioral ligand sensing of the Guanidine-IV riboswitch in the presence and absence of the ligand guanidine and the cation Mg2+. The results provided valuable information on the mechanistic aspects of the riboswitch, including the confirmation of the kissing loop present in the structure as essential for folding and riboswitch activity. Co-transcriptional investigations of the system provided key information on the ligand-sensing behavior and ligand-binding window of the riboswitch. A plausible folding model of the Guanidine-IV riboswitch was proposed as a final result. The evidence presented here sheds additional light on the mode of action of transcriptional riboswitches.

      Strengths:

      The investigations were very thorough, providing data that supports the conclusions. The use of smFRET and PLOR to investigate RNA folding has been shown to be a valuable tool for the understanding of folding and behavior properties of these structured RNA molecules. The co-transcriptional analysis brought important information on how the riboswitch works, including the ligand-sensing and the binding window that promotes the structural switch. The fact that investigations were done with the aptamer domain, aptamer domain + terminator/anti-terminator region, and the full-length riboswitch was essential to show how each domain contributes to the final structural state in the presence of the ligand and Mg2+.

      Weaknesses:

      The system has its own flaws when compared to physiological conditions. The RNA polymerase used (the study uses T7 RNA polymerase) is different from the bacterial RNA polymerase, not only in complexity, but also in transcriptional speed, which can directly interfere with folding and ligand-sensing. Additionally, rNTPs concentrations were much lower than physiological concentrations during transcription, likely causing a change in the polymerase transcriptional speed. These important aspects and how they could interfere with results are important to be addressed to the broad audience. Another point of consideration to be aware of is that the bulky fluorophores attached to the nucleotides can interfere with folding to some extent.

      We thank the reviewer for describing our work as “The investigations were very thorough, providing data that supports the conclusions”. We agree with the reviewer that bacterial and T7 RNAPs may behave differently owing to their differences in transcriptional speed, dynamics, interactions, and so on. We have added the following statement to the Discussion: “It is worth noting that the RNAP utilized in our study is T7 RNAP, which exhibits distinct characteristics compared to bacterial RNAP in terms of transcriptional speed, dynamics, and interactions. However, Xue et al. have reported similarities between T7 and E. coli RNAP in the folding of nascent RNA. Additionally, Lou and Woodson have provided valuable insights into the cotranscriptional folding of the glmS ribozyme using T7 RNAP (Xue et al., 2023; Lou & Woodson, 2024)” (page 14). We also agree with the reviewer that the lower NTP concentrations may affect the transcriptional speed. Regarding the fluorophores, we purposely placed them away from the KL to avoid influencing the formation of the KL.

      Reviewer #1 (Recommendations For The Authors):

      Related to weakness 1

      - The authors cite a paper that investigated mutations in the KL duplex but do not include these mutations in their analysis. It is unclear why the authors chose the G77C mutation and not the other mutants previously tested. Can the authors explain their choice of mutation in detail in the text? I also did not see the proposed secondary structure for the G77C mutant shown in Figure 2 -supp 3A in the cited paper, is this a predicted structure? Please explain how this structure was determined. 

      We thank the reviewer for the comment. We chose the G77C mutation because a previous report showed that G77C can disturb the formation of the KL, as we state in the manuscript: “Furthermore, our investigation revealed that the G77C mutant, involving a single nucleotide mutation at a highly conserved site, 77 (with 97% nucleotide conservation), also hindered the formation of the KL (Figure 2– figure supplement 10). This finding aligns with previous research (Lenkeit et al., 2020) and the predicted secondary structure of the G77C mutant by Mfold (Zuker, 2003)” (page 7). The secondary structure of the G77C mutant was predicted by Mfold, which is cited in the manuscript and has been added to the reference list as “Zuker, M. (2003). Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Research, 31(13), 3406-3415”.

      - It is not clear to me that the structural interpretation of their FRET states is correct and that the FRET signal reports on the base pairing of the KL in only the high FRET state. The authors should perform experiments with additional mutations in the KL duplex to confirm that their construct reports on KL duplex formation alone and not other structural dynamics. 

      We thank the reviewer for the comment. We have included additional mutations to establish a connection between the high-FRET state and the formation of the KL. The results have been added to the manuscript as “The highly conserved nucleotides surrounding the KL are crucial for its formation (Lenkeit et al., 2020). To test our hypothesis that the state with EFRET ~ 0.8 corresponds to the conformation with the KL, we performed smFRET analysis on several mutations at these crucial nucleotides (Figure 2– figure supplement 8–10). Consistent with our expectations, the peak with EFRET ~ 0.8 was significantly diminished in the riboG-G71C mutant, which features a single nucleotide mutation at site 71 (with 97% nucleotide conservation) in the KL (Figure 2– figure supplement 8A and 8B). It is worth noting that the C30G-G71C mutant, which was initially expected to restore a base pair in the KL, did not successfully bring about the anticipated peak of EFRET ~ 0.8 (Figure 2– figure supplement 8C and 8D). On the other hand, the riboG-U72C mutant exhibited a lower proportion at the state with EFRET ~ 0.8 than riboG-apt. However, the A29G and U72C mutations restored a base pair in the KL, as well as the formation of the KL (Figure 2– figure supplement 9). Furthermore, our investigation revealed that the G77C mutant, involving a single nucleotide mutation at a highly conserved site, 77 (with 97% nucleotide conservation), also hindered the formation of the KL (Figure 2– figure supplement 10). This finding aligns with previous research (Lenkeit et al., 2020) and the predicted secondary structure of the G77C mutant by Mfold (Zuker, 2003)” (page 7), “In contrast to riboG-term, both its G71C and C30G-G71C mutants displayed a reduced proportion of the state with EFRET ~ 0.8. Remarkably, the fractions of EFRET ~ 0.8 remained unaffected by the addition of 1.0 mM Gua+ in these mutants. Distinct from riboG-term, no structural transitions between states were observed in the two mutants (Figure 3– figure supplement 6). Regarding the U72C mutant of riboG-term, the mutation at the site 72 had a reduced impact on the KL conformation in the presence of 1.0 mM Gua+ and 2.0 mM Mg2+. However, the increased proportion of EFRET ~ 0.8 in the A29G-U72C mutant of riboG-term suggests that these mutations can restore the base-pairing between sites 29 and 72, as well as facilitate the formation of the KL (Figure 3– figure supplement 7)” (page 8), and “Upon comparing the G71C and C30G-G71C mutants of the full-length riboG with their wild-type counterpart, it was observed that the wild-type adopted higher proportions of the state with EFRET ~ 0.8 (Figure 4– figure supplement 6). Regarding the U72C and A29G-U72C mutants of the full-length riboG, their behaviors with regards to the peak with EFRET ~ 0.8 were similar to that of their counterparts in riboG-term (Figure 4– figure supplement 7)” (page 9).

      - For the full-length riboG-136 (Cy3Cy5 riboG in Figure 4), the authors have clearly defined peaks at 0.6 and 0.4. However, the authors do not explain their structural interpretation of these states. Do the authors believe that the KL is forming in these states? It would be helpful to have data on mutations in the KL in the context of the full-length riboG to better understand the structural transitions of these intermediate states. 

      Based on our mutation studies, we proposed that the peak with EFRET ~0.8 corresponds to the conformation with the KL, while the states with EFRET ~0.4 and 0.6 are the states without a stable KL. 

      Related to weakness 2:

      - For the riboG-apt and riboG-term RNAs, the proposed intermediate FRET state (EFRET = 0.5) is poorly fit by a Gaussian and the dwell times in the state are almost entirely single-frame dwells. It is likely that this state is the result of a camera blurring artifact, in which RNAs undergo a FRET transition between two frames giving an apparent FRET efficiency which is between that of the two transitioning states. This artifact arises when the average dwell times of the true states (Elow and Ehigh) are comparable to the frame duration (within a factor of ~5-10; see https://doi.org/10.1021/acs.jpcb.1c01036). To confirm the presence of the intermediate state, the authors should perform at least a few experiments with higher time resolution to support the existence of the 0.5 state with a lifetime of 0.1 s. Alternatively, the data should be refit to a two-state HMM and the authors could explain in the text that the density in the FRET histogram between the two states is likely due to transitions that are faster than the time resolution of the experiment. 

      We thank the reviewer for the great comment. Taking the suggestion into consideration, we performed smFRET experiments with a higher time resolution of 20 ms. As a result, we still detected the intermediate state, supporting that it is not an artifact. The new data has been included in the revised manuscript (Figure 2-figure supplement 3).  
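
      The reviewer's concern about camera blurring can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes a hypothetical two-state molecule (E = 0.2 and E = 0.8, mean dwell time of 0.5 s in each state) and compares the fraction of frames that show an apparent intermediate FRET value at 100 ms and 20 ms frame times. All parameter values and function names are illustrative assumptions.

```python
import random


def simulate_frame_fret(mean_dwell_s, frame_s, total_s,
                        e_low=0.2, e_high=0.8, seed=0):
    """Time-average a two-state FRET trajectory over camera frames.

    Dwell times in both states are exponential with the same mean
    (a simplifying assumption). Returns one apparent FRET value per frame.
    """
    rng = random.Random(seed)
    n_frames = round(total_s / frame_s)
    high_time = [0.0] * n_frames  # time spent in the high state per frame
    t, state_high = 0.0, False
    while t < total_s:
        dwell = rng.expovariate(1.0 / mean_dwell_s)
        end = min(t + dwell, total_s)
        if state_high:
            # Spread the interval [t, end) over the frames it overlaps.
            first = min(int(t / frame_s), n_frames - 1)
            last = min(int(end / frame_s), n_frames - 1)
            for i in range(first, last + 1):
                lo = max(t, i * frame_s)
                hi = min(end, (i + 1) * frame_s)
                if hi > lo:
                    high_time[i] += hi - lo
        t = end
        state_high = not state_high
    return [e_low + (e_high - e_low) * (h / frame_s) for h in high_time]


def blurred_fraction(fret, e_low=0.2, e_high=0.8, tol=1e-6):
    """Fraction of frames whose apparent FRET lies strictly between the states."""
    mixed = [e for e in fret if e_low + tol < e < e_high - tol]
    return len(mixed) / len(fret)


# Same seed, so the same underlying trajectory is sampled at both frame times.
slow = simulate_frame_fret(mean_dwell_s=0.5, frame_s=0.1, total_s=200.0)
fast = simulate_frame_fret(mean_dwell_s=0.5, frame_s=0.02, total_s=200.0)
print(f"100 ms frames: {blurred_fraction(slow):.3f} of frames look intermediate")
print(f" 20 ms frames: {blurred_fraction(fast):.3f} of frames look intermediate")
```

      In this toy model, an intermediate population that arises purely from blurring shrinks roughly in proportion to the frame time, which is the rationale for repeating the measurement at higher time resolution.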

      Related to weakness 3:

      - The authors depict the polymerase footprint differently in some of the figures and it is unclear if this is part of their model. Is the cartoon RNAP supposed to indicate the RNA:DNA hybrid or the footprint of T7 RNAP on the RNA? For example, in Figure 8a there are 8 nts (left) and 9 nts (right) covered by RNAP, and only 6nts in Figure 6 - supp 2A. This is particularly misleading for the EC-87 and EC-88 in Figure 6 - supp 2, where it is likely that this stem is not formed at all and the KL strand is single-stranded. The authors should clarify and at least indicate in the figure legend if the RNAP cartoon is part of the model or only a representation. 

      We thank the reviewer for bringing these issues to our attention. Due to space limitations, we chose to represent the polymerase footprint differently in Figure 8. However, we have included the statement “DNA templates from EC-87 to EC-105 are not displayed in the model” in the legend of Figure 8 to avoid confusion.

      Moreover, we have corrected the error of 6 nts in Figure 6-figure supplement 2.  

      - With a correct 9 bp RNA:DNA hybrid, the EC-88 construct would not be able to form the top part of the P2 stem and the second half of the KL RNA would be single-stranded. In this case, an interaction between the KL nucleotides would resemble a pseudoknot and not a kissing loop interaction. Can the authors explain if this could explain the heterogeneity they observe in the EC-88 construct compared to the riboGapt  RNA?

      We thank the reviewer for the comment. We have added the statement in the revised manuscript as “The T7 RNA polymerase (RNAP) sequestered about 8 nt of the nascent RNA, preventing the EC-88 construct from forming the P2 stem (Durniak et al., 2008; Huang & Sousa, 2000; Lubkowska et al., 2011; Tahirov et al., 2002; Wang et al., 2022; Yin & Steitz, 2002). Consequently, a pseudoknot structure potentially formed instead of the expected KL. This distinction may account for the observed heterogeneity between EC-88 and riboG-apt” (page 11).

      Other comments:

      (1) It appears that the FRET histograms in the PLOR experiments (Figure 6 and related figures) only show the fits presumably to highlight the overlays. However, this makes it impossible to determine the goodness of the fit. The authors should instead show the outline of the raw histogram with the fit, or at least show the raw histograms with fits in the supplement. 

      We have replaced Figure 6- figure supplements 2-4 to enhance the clarity of the raw and fitted smFRET histograms.  

      (2) The authors should consider including a concluding paragraph to put the results into a larger context. How does the kinetic window compare to other transcriptional riboswitches? Would the authors comment on how the transcription speed compares to the kinetics for the formation of the KL? 

      We thank the reviewer for the comment. We have added the comparison of riboG to other transcription riboswitches to the manuscript as “Nevertheless, the ligand-sensitive windows of riboswitches during transcription vary. In a study conducted by Helmling et al. using NMR spectroscopy, they proposed a broad transcriptional window for deoxyguanosine-sensing riboswitches, whereby the ligand binding capability gradually diminishes over several nucleotide lengths (Helmling et al., 2017). However, more recent research by Binas et al. and Landgraf et al. on riboswitches sensing ZMP, c-di-GMP, and c-GAMP revealed a narrow window with a sharp transition in binding capability, even with transcript lengths differing by only one or three nucleotides (Binas et al., 2020; Landgraf et al., 2022). In line with the findings for the c-GAMP-sensing riboswitch, our study on the guanidine-IV riboswitch also demonstrated a sharp transition in binding capability with just a single nucleotide extension” ( page 14). 

      We appreciate the reviewer’s comment in comparing the transcription speed to the kinetics of the KL formation. However, we must acknowledge that we have limited kinetic data in this study to confidently make such a comparison.

      (3) Cy3Cy5 RiboG is a confusing name because it implies that the others are not also Cy3Cy5 labeled. The authors should consider changing the names and being consistent throughout. I suggest full-length riboG or riboG-136. 

      We have changed “Cy3Cy5 riboG” to “Cy3Cy5-full-length riboG” (pages 15 and 16).

      (4) The transcriptional readthrough experiment should be explained when first mentioned in line 109. 

      We have added the citation (Chien et al., 2023) of the transcriptional readthrough experiment to the manuscript as “we noted that the transcriptional read-through of the guanidine-IV riboswitch during the single-round PLOR reaction was sensitive to Gua+, exhibiting an apparent EC50 value of 68.7 ± 7.3 μM (Figure 1D) (Chien et al., 2023)” (page 5). 

      (5) Kd values in text should have uncertainties, and the way these uncertainties are obtained should be explained.

      We have added the uncertainties of Kd values in the revised manuscript (page 6) and the legend of Figure 2-supplement 6 as “The percentages of the folded state (EFRET ~ 0.8) of Cy3Cy5-riboG-apt were plotted with the concentrations of Gua+ at 0.5 mM Mg2+, with an apparent Kd of 286.0 ± 18.1 μM in three independent experiments”.

      (6) The authors mention "strategies" on line 306, but it is unclear what they are referring to. Are the strategies referring to the constructs (EC-87, etc) or Steps 1-8 in the supplemental figure? Please clarify. 

      We have clarified the confusion by adding “The detailed procedures of strategies 1-8 were shown in Figure 7–figure supplement 1” to the manuscript ( page 12).

      (7) What are the fraction of dynamic traces versus static traces in the cases for the full-length riboG? This would help depict the structural heterogeneity in the population. 

      We have added the fractions of dynamic single-molecule traces of the full-length riboG to Figure 4-supplements 1-5. 

      (8) The labels in Figure 4 (A-E) don't match the caption (A-H). 

      We have corrected the error. 

      (9) The coloring of the RNA strands in Figure 4A should be explained in the figure legend. It could be interpreted as multiple strands annealed instead of a continuous strand. 

      We have revised the legend of Figure 4A by adding “The full-length riboG contains the aptamer domain (black), terminator (red) and the extended sequence (blue). Cy3 and Cy5 are shown by green and red sparkles, respectively”.

      (10) Reported quantities and uncertainties should have the same number of decimal places. In many places, the uncertainties likely have too many significant figures, for example, in Figure 5 and related figures. 

      We have corrected the significant figures of the uncertainties. 
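
      The convention the reviewer describes (value and uncertainty reported to the same decimal place) can be sketched as a small helper. This is an illustrative assumption about one common way to apply the rule, not the procedure the authors used; the function name and the default of two significant figures for the uncertainty are hypothetical choices.

```python
import math


def format_with_uncertainty(value, uncertainty, sig_figs=2):
    """Round the uncertainty to `sig_figs` significant figures and the
    value to the same decimal place, then format both consistently.

    Edge case not handled: if rounding pushes the uncertainty up to the
    next power of ten (e.g. 9.96 -> 10.0), one extra digit is printed.
    """
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    exponent = math.floor(math.log10(uncertainty))
    decimals = sig_figs - 1 - exponent  # negative means round to tens, hundreds, ...
    rounded_u = round(uncertainty, decimals)
    rounded_v = round(value, decimals)
    places = max(decimals, 0)
    return f"{rounded_v:.{places}f} ± {rounded_u:.{places}f}"


# Values quoted in this response, used only to show the formatting.
print(format_with_uncertainty(286.0, 18.1))
print(format_with_uncertainty(68.7, 7.3))
```

      Keeping the rounding in one function makes it easier to apply the same rule to every figure and table.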

      (11) In Figure 5, A and B should have the same vertical scale to facilitate comparison. 

      We have adjusted Figure 5A to match the vertical scale of Figure 5B in the revised manuscript.

      (12) In Figure 5C-D, the construct from which those trajectories come should be indicated in the legend. 

      We have added the construct to the legend of Figures 5C and D.  

      (13) In Figure 6J, the splines between data points are confusing and can be misleading. They suggest that the data has been fit to a model, but I am not sure if it represents a model. The data points should be colored instead and lines removed. 

      We thank the reviewer for the comment. We have changed Figure 6J by coloring the data points and removing the lines to avoid confusion. 

      (14) Line 330 mentions a P2 structure in Figure 8, but there is no such label in Figure. Please clarify. 

      We thank the reviewer for the comment and have added P2 to Figure 8. 

      Reviewer #2 (Recommendations For The Authors):

      (1) Figure 1B. The authors don't seem to address the role of the blue stem-loop following Stems 1 and 2. Is this element needed at all for gene regulation? Does it impact the conformations or folding of the preceding Stems 1 and 2? It seems feasible to disrupt the stem and see whether there is an impact on riboswitch function. 

      We thank the reviewer for the comment. The presence of the sequence that forms the blue stem-loop indicates the formation of an anti-terminator conformation in riboG during transcription. Our smFRET data show that the inclusion of the stem-loop sequence induces additional peaks in the full-length riboG compared to riboG-term. This indicates that the stem-loop influences the folding of the kissing loop (KL) and potentially also affects stems 1 and 2.  

      (2) Figure 7 supplement 1, C &D. Maybe I am missing something, but it seems to me in reaction #8 (EC-105, last two lanes), the readthrough percentage is close to 50% based on the gel but plotted in D as 20%. Further, there is a strong effect of guanidine in reaction #8 but that is not reflected in the quantitation in panel D. 

      We thank the reviewer for the comment. The observed discrepancy between reaction 8 in (C) and (D) is from the differential handling of the crude product at the last step (step 17) in gel loading for (C), contrasted with the combination of crude products from steps 16 and 17 to calculate the read-through percentage in (D). We have corrected the discrepancy by replacing Figure 7-Supplement figure 1C (now Figure 7C), and revised the legend to include the following clarification: “Taking into consideration that the 17 step-PLOR reaction exhibited a pause within the terminator region, resulting in a significant amount of terminated product at step 16, crude products from steps 16 and 17 were collected for (C) and (D) of the 17 step-PLOR reaction (Lanes 15 and 16 in C)”.

      (3) Figure 7C is a control that shows the quality of the elongation complexes, which probably should be in the supplement. Instead, in Figure 7 supplement 1, panels C and D are actual experiments and could be moved into the main figure.  

      We thank the reviewer for the comment. We made the adjustment.  

      (4) Figure S7D. I would suggest not labelling the RNA polymerase halt/stoppage sites due to NTP deprivation as "pausing sites" because transcriptional pausing has previously been defined as natural sites where the RNA polymerase transiently halts itself, but not due to the lack of the next NTPs. In this case, the elongating complexes were artificially halted, which is technically not "pausing", as it will not restart/resume on its own without intervention. 

      We have changed “pausing” to “halting”.  

      (5) Figure 7 is titled "In vitro transcriptional performance of riboG." But the data is actually not about the performance of the riboswitch, or how well it functions. I would suggest the authors revise the title. This is mostly about the observed sensitivity window of the riboswitch to ligand-mediated conformational switching. 

      We have changed the title of Figure 7 to “Ligand-mediated conformational switching of riboG during transcription”.

      (6) Figure 7A, the illustration gives the visual impression that there are multiple RNA polymerases on the same DNA template, which is not the case. 

      We have revised Figure 7A by adding arrows between RNA polymerases to illustrate the movement of a single RNAP, rather than multiple RNAP on the same template.

      (7) It could be informative to compare the guanidine-IV riboswitch with the first three classes (I, II, III), to see how their architectures or gene regulatory mechanisms are similar or different. 

      We thank the reviewer for the comment. We have added a comparison of the guanidine-IV riboswitch with the other three guanidine riboswitches to the manuscript as “The guanidine-IV riboswitch exhibits similarities to the guanidine-I riboswitch in gene regulatory mechanism, functioning as a transcriptional riboswitch. Structurally, it resembles the guanidine-II riboswitch through the formation of loop-loop interactions upon binding to guanidine (Battaglia & Ke, 2018; L. Huang et al., 2017; Lin Huang et al., 2017; Lenkeit et al., 2020; Nelson et al., 2017; Reiss & Strobel, 2017; Salvail et al., 2020)” (page 12).  

      Reviewer #3 (Recommendations For The Authors):

      In addition to the public review items, I provide the following recommendations:

      (1) As a second language speaker, I understand that writing a compelling and concise story may be hard, and we tend to write more than needed or more repetitively. That being said, I do think that the writing could be improved to make it more concise, clear, and avoid repetitions.

      We thank the reviewer for the comment. We re-wrote the abstract and some sentences in the manuscript.

      (2) In the abstract, instead of saying that "...This lack of understanding has impeded the application of this riboswitch", which makes the statement too strong, perhaps, stating something along the lines of "this understanding would assist the application of this riboswitch", would be a better fit. 

      We have rewritten the abstract and revised the sentence.  

      (3) Methods should state which RNA polymerase was used. PLOR uses T7 RNA pol, so I assume it was the same. 

      We have added the statement “T7 RNAP was utilized in the PLOR and in vitro transcription reactions except noted” in the Methods ( page 15). 

      (4) The impact statement says comprehensive structure-function, where perhaps comprehensive folding-function would be more appropriate. We are still missing a lot of structural information about this particular riboswitch. 

      We agree with the reviewer, and changed “comprehensive structure-function” to “folding-function” in Impact statement ( page 2).

      (5) Higher Mg2+ concentrations implicated in a lesser extent of the switch of RiboGapt, a sentence talking about it would be useful (how Mg2+ could have promiscuous interaction and interfere with folding). 

      We have added the role of higher Mg2+ to the manuscript as “However, at a higher concentration of 50.0 mM Mg2+, the proportion of the pre-folded and unfolded conformations were more prevalent at 50.0 mM Mg2+ than at 20.0 mM Mg2+. This suggests that an excess of Mg2+ may promote the pre-folded and even unfolded conformations” ( page 6).

      (6) In the investigations of RiboG-term and RiboG, seems like that monovalents from the buffer are sufficient to promote secondary structure. A statement commenting on this would benefit the paper and the audience. 

      We agree with the reviewer and have revised the manuscript accordingly by adding “This indicates that monovalent ions in the buffer can facilitate the formation of stable guanidine-IV riboswitch” (page 8).

      (7) Figure 3. Figure goes to panel E and legend to panel H. G and H colors do not correspond to actual figure colors. 

      We made the correction.  

      (8) Figure 4. The same as Figure 3, the panels and figures are divergent.  

      We made the correction.  

      (9) During the discussion, stating that the DNA and RNA pol play a role in folding and ligand binding may be excessive. This could be an indirect effect of the transcriptional bubble hindering part of the nascent RNA from folding, which is something intrinsic to any transcription and not specific to this system. 

      We agree with the reviewer and have deleted the statement that the DNA and RNAP play a role in folding and ligand binding.

      (10) PLOR is not properly cited. When introduced in the manuscript, please cite the original PLOR paper (Liu et. al. Nature 2015) and additional related papers. 

      We cited the original PLOR paper (Liu et al, Nature 2015) and the related papers (Liu et al, Nature Protocols 2018). ( pages 4 and 15)

      (11) The kinetics race of folding and binding could be a little more emphasized in discussion, particularly from the perspective of its physiological importance. 

      We agree with the reviewer and deleted the kinetics race of folding and binding from the Discussion part.

    1. that we might form great friendship, for I knew that they were a people who could be more easily freed and converted to our holy faith by love than by force, gave to some of them red caps, and glass beads to put round their necks, and many other things of little value, which gave them great pleasure, and made them so much our friends that it was a marvel to see.

      I found this sentence from the excerpt quite informative about how Columbus saw the natives of the Americas. He talks about wanting to be "great friends", but in my mind he means that they are simply people who are easy to use in his aim of furthering European power. The contrast he draws between force and love also suggests that his treatment of them would differ depending on who conformed to Christianity and who did not want to. It's easy to see that Columbus saw the natives as lesser because they didn't know the value of the items he shared with them. I think this statement was the start of the notion that those who are not from Europe are lesser and incapable of being on the same level as Europeans, simply because the natives placed higher value on different things than the Europeans did.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript uses C. elegans as a model to interrogate the effects of autism-associated variants of previously unknown function in the RNA-binding protein RBM-26/RBM27.

      Despite its potential impact, there are several concerns related to the technical rigor and specificity of the observed effects.

      Major concerns: 1. The effects on PLM are interesting, but why was this neuron selected for study? Was this a lucky guess or are other axons also affected? It is important to clarify whether the effects of RBM-26 are specific to this neuron or act pleiotropically across many or all neurons. According to CeNGEN, rbm-26 is strongly expressed in the well-characterized neurons ASE, PVD, and HSN. Are there morphological defects in these neurons, or others? As a note, there are also functional assays for these neurons (salt sensing, touch response, and egg laying, respectively).

      We have added new data to the supplemental materials showing that loss of rbm-26 function also causes the beading phenotype in the axons and dendrites of the PVD neuron (Figure S4 and lines 196-199). We have focused on the PLM neuron because our preliminary studies indicated that it had a higher penetrance of axon defects relative to the PVD neuron. Moreover, we observed expression of endogenously tagged RBM-26 in the PLM neuron (Figure 3A-C and lines 210-215).

      Similarly, the choice of the MALSU homolog seemed like a shot in the dark. It is ranked 46th (out of 63 genes) for fold-enrichment following RBM-26 pull-down, and 9th for p-value. Were any of the mRNAs with greater fold-enrichment or smaller p-values examined further? It is important to determine whether many or all of these interacting genes are overexpressed in the absence of RBM-26 and whether they are also required for the phenotypic effects of RBM-26 mutants, or if the MALSU homolog is special.

      We have clarified our reasoning for selecting the MALS-1 ortholog of MALSU1 for further study (see lines 283-284 and Table S2). Amongst binding partners with human orthologs, MALS-1 was by far the top-ranked candidate. The adjusted p-value for MALS-1 was 0.0008. The next smallest adjusted p-value was two orders of magnitude larger (0.028 for dpy-4). Moreover, the log2 fold enrichment for MALS-1 was 1.98, about the same as the largest (ACADS with 2.13). Nonetheless, we agree that some of the other interactors may also be of interest and have thus included them in the supplemental table S2. Although these other potential binding partners are outside the scope of this study, we expect that future studies by ourselves or others may focus on the roles of these other binding partners.

      In addition to the specificity controls mentioned above, positive and negative controls are needed throughout the results. While each of these may be relatively minor by itself, as a group they raise questions about the technical rigor of the study. Briefly these include: Fig 1C. Missing loading controls and negative control (rbm-26 null allele). Additional exposures should be included to show whether RBM-26(P80L) protein or the lower band for RBM-26(L13V) are present at all, relative to the null allele.

      We have added no-stain loading controls to figure 1C. We have also switched to using ECL detection, which is much more sensitive and reveals faint bands for RBM-26(P80L) and additional faint bands for RBM-26(L13V). In addition, we have included a longer exposure for the blot (Figure S1). We are unable to test the null, as we can only produce a limited number of small maternally rescued progeny, thereby precluding western blot analysis.

      Fig 2. Controls to distinguish overextension of PLM axon from posterior mispositioning of ALM cell body are needed. Quantification of PLM axon lengths in microns (or normalized to body size) with standard deviation, not error of proportion, should be shown. Measurement of “beading phenotype” should be more rigorous, see for example the approach in Rawson et al. Curr. Biol. 2017 https://doi.org/10.1016/j.cub.2014.02.025 . The developmental stage examined, and the reason for choosing that stage, should be described for this and all figures.

      We have added new data that show PLM axon length relative to body length for each of the RBM-26 mutants (Figure S2 and lines 183-185). These results indicate that the PLM axon has a larger axon length to body length ratio, suggesting that the PLM/ALM overlap phenotype is a result of PLM axon overextension. For most experiments, we retain penetrance, as this has been standard practice in the field and allows for a much larger sample size (see examples listed below). We have also added examples of how the beading phenotype was measured (Figure S3). Moreover, we have now analyzed this phenotype and others at multiple developmental stages (Figures 2D-H and Table S1). In general, we have conducted experiments at the L3 stage because the rbm-26(null) mutants don’t survive past this stage. However, for many of our experiments we have also included additional stages. We have added this explanation to the methods section on phenotype analysis and also at various locations throughout the text. We have also labeled all graphs to clearly indicate the developmental stages included.

      10.1038/s41467-019-12804-3 Article by laboratory of Brock Grill

      10.1371/journal.pgen.1002513 Article by laboratory of Ian Chin-Sang

      doi.org/10.1073/pnas.1410263111 Article by laboratory of Chun-Liang Pan

      10.1016/j.neuron.2007.07.009 Article by laboratory of Yishi Jin

      doi.org/10.1523/JNEUROSCI.5536-07.2008 Article by laboratory of William Wadsworth
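
      The two ways of summarizing the PLM phenotype discussed in this exchange, penetrance with the standard error of a proportion versus a continuous axon-to-body length ratio with a standard deviation, can be contrasted in a short sketch. The counts and lengths below are hypothetical placeholders, not data from the study, and the function names are illustrative.

```python
import math
import statistics


def penetrance_with_se(n_affected, n_total):
    """Penetrance (fraction of animals scored as affected) and the
    standard error of that proportion, sqrt(p * (1 - p) / n)."""
    p = n_affected / n_total
    se = math.sqrt(p * (1 - p) / n_total)
    return p, se


def normalized_axon_length(axon_um, body_um):
    """Mean and sample standard deviation of axon length / body length."""
    ratios = [a / b for a, b in zip(axon_um, body_um)]
    return statistics.mean(ratios), statistics.stdev(ratios)


# Hypothetical example: 20 of 100 animals scored with PLM/ALM overlap.
p, se = penetrance_with_se(20, 100)
print(f"penetrance = {p:.2f} ± {se:.2f} (SE of proportion)")

# Hypothetical per-animal measurements in microns.
mean_ratio, sd_ratio = normalized_axon_length([500, 550, 600], [1000, 1000, 1000])
print(f"axon/body ratio = {mean_ratio:.2f} ± {sd_ratio:.2f} (SD)")
```

      Penetrance needs only a yes/no score per animal, which is why it scales to large sample sizes, whereas the ratio requires a length measurement for every animal.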

      Fig 3. Controls without auxin and with neuronal TIR1 expression alone should be included. Controls demonstrating successful RBM-26 depletion, in larvae as well as in embryos at the time of PLM extension, should be included (weak embryonic depletion might explain why the overextension phenotype is only 14% instead of 40% as in the null). According to CeNGEN, rbm-26 expression in PLM is barely detected, thus depletion with a PLM-specific TIR1 should also be tested. To confirm the authors' identification of the cell marked "N" as the PLM cell body, co-expression of rbm-26 and a PLM-specific marker should be added. Rescue of the rbm-26 mutants with neuronal (and PLM-only) expression should be included to test sufficiency in PLM, and as a further control for potential artifacts of the AID system.

      We have added new data showing that an endogenously tagged RBM-26::Scarlet protein is expressed in the PLM neuron (Figure 3A-C). Moreover, we have added rescue experiments, showing that a Pmec-7::rbm-26::scarlet transgene can rescue the beading phenotype and the PLM/ALM overlap phenotype (Figure 3 F-G). We have also added controls without auxin (Figure S7) and without the rbm-26::scarlet::aid gene (Figure S8). We have added a new figure showing auxin-mediated depletion of RBM-26::Scarlet::AID in the PLM neuron (Figure S10). We examined auxin-mediated depletion at the L3 stage for consistency with our auxin-mediated phenotypic experiments. Moreover, these were done at the L3 stage for consistency with other experiments that included the rbm-26(null) mutants, which don’t survive past this stage.

      In general, auxin-mediated knockdown tends to be hypomorphic in neurons. This is likely due to the fact that the neuronal TIR1 driver is expressed at much lower levels relative to the other drivers. In addition, the lower penetrance observed in auxin-mediated PLM/ALM overlap phenotype could reflect the fact that this phenotype resolves by the L4 stage in the hypomorphic mutants. For example, in P80L mutants at the L3 stage we see only about a 20% penetrance of the PLM/ALM overlap phenotype (relative to about 15% in auxin-mediated knockdown).

      Fig 4. More rigorous quantification of the distribution of mitochondria along the axon should be included, not only total number, and it should be clarified what region of the axon the images are taken from. Including the AID-depletion strain with and without auxin would further add to the sense of rigor. For the mitoTimer experiments, why is RBM-26(L13V) not included and why do wild-type values differ ~5-fold between experiments (despite error bars being almost non-existent)? A more rigorous approach to standardizing imaging conditions may be needed. Positive controls using compounds that affect oxidation should be included. Measurements of individual mitochondria with standard deviations should be shown, rather than aggregate averages with error of proportion.

      We have changed our methodology for measuring mitochondria, so that we now report the density of mitochondria in the axon (number per 100 µm; Figure 4E-F). We agree that this method is much better than counting the total number of mitochondria per axon, as it corrects for differences in body length and axon length. We also now include data for the whole axon (Figure 4E), proximal axon (Figure 4G), and distal axon (Figure 4H). These data suggest that the mitochondrial density defects occur in the proximal axon but not in the distal axon. Using the null allele, we have also examined the timing of mitochondria defects in the axon and report that the defects begin in the L1 stage and continue throughout larval development (Figure 4F). Individual datapoints have been added for all graphs in Figure 4.
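
      A density measurement of this kind can be sketched as follows. The mitochondrial positions, the axon length, and the choice of the axon midpoint as the proximal/distal boundary are all hypothetical assumptions made for illustration; the response does not state how the two regions were delimited.

```python
def density_per_100um(positions_um, start_um, end_um):
    """Number of mitochondria per 100 µm within [start_um, end_um).

    positions_um: distance of each mitochondrion from the cell body,
    measured along the axon, in microns.
    """
    length = end_um - start_um
    if length <= 0:
        raise ValueError("region must have positive length")
    count = sum(1 for x in positions_um if start_um <= x < end_um)
    return 100.0 * count / length


# Hypothetical axon of 400 µm with six mitochondria.
positions = [10, 40, 90, 150, 260, 390]
axon_length = 400.0
midpoint = axon_length / 2  # assumed proximal/distal boundary

print("whole axon:", density_per_100um(positions, 0.0, axon_length))
print("proximal  :", density_per_100um(positions, 0.0, midpoint))
print("distal    :", density_per_100um(positions, midpoint, axon_length))
```

      Normalizing by region length is what lets animals with different body and axon lengths be compared on the same axis.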

      For the mitoTimer experiments (Figure 5), we have added data for L13V and have added the individual datapoints to the graph. In the prior version, the values did not differ 5-fold between experiments with the same stage, rather the different graphs were from different stages (as noted in the figure legends/main text) and the L4 stage has much more oxidation than the L2 stage. To clear this up, we have added labels to the graphs to indicate the stages for each experiment. We have also added new data, so that we now show results for the L2, L3, and L4 stages for all three rbm-26 mutants (see Figure 5C-E). We didn’t test the L1 stage because the signal was not sufficient for accurate quantitation.

      Fig 5. Additional positive and negative controls should be added, including additional rbm-26 alleles, the AID-tagged strain with and without auxin, and a rescued mutant.

      The old Figure 5 has become Figure 6 in the new version. We have added the rbm-26(L13V) allele to each experiment (Figure 6B-D). We have also added the loading controls for the western blot along with quantification for 3 biological replicates of the western blot analysis (Figure 6D). We agree that these additions significantly strengthen the data because they show that two independent alleles of rbm-26 cause a very substantial increase in the expression of mals-1 at both the mRNA and protein levels. We did not do these experiments with the rescuing transgene or with the AID-tagged strain because these experiments are done on whole worm lysates, whereas the AID-tagged and rescuing transgene are neuron-specific.

      Fig 6. Controls showing whether the Scarlet-tagged protein is functional are needed, to rule out dominant negative or toxicity-related effects.

      This is Figure 7 in the new version. For this experiment, we are showing that overexpression of MALS-1 does cause defects. The idea is that excessive amounts of MALS-1 cause deleterious effects on the mitochondria. In fact, these defects could be considered as dominant negative or toxic. We considered the possibility of crossing the Pmec-7::mals-1::scarlet transgene with rbm-26; mals-1 double mutants. However, this does not seem workable, because the single copy Pmec-7::mals-1::scarlet transgene produces the phenotypes at penetrances that are similar to what we observe in rbm-26; mals-1 double mutants. We concede that the results of the overexpression experiments in Figure 7 are limited when considered in isolation. However, we think that they are meaningful when considered in combination with the results on the mals-1;rbm-26 double mutants in Figure 8.

      Fig 8. Controls for other mitochondrial components need to be included. It is important to determine if the decrease in ribosomes is specific or reflects a general decrease in mitochondria. If there are fewer mitochondria as suggested in Fig. 4, then of course mitochondrial ribosomal protein levels are also reduced. Additional rbm-26 alleles should be included here as well. Is this effect dependent on the MALSU homolog?

      This is Figure 8D-E in the new version. We have added new data showing that the decrease in MRPL-58 expression that is caused by the rbm-26(P80L) mutation is dependent on MALS-1. We concede that these experiments cannot be used to determine anything about the mitoribosomes per se, but rather serve as an alternative way of testing the effect of rbm-26 on mitochondria. We have revised the text accordingly (lines 355-357). Given these limitations we have elected not to try additional mitochondrial markers and have also not included additional rbm-26 alleles for this experiment.

      Finally the authors should address concerns about image manipulation, which amplify the concerns about technical rigor outlined above. The image in Fig. 2A appears to have a black box placed over the lower-right portion of the field to hide some features. Black boxes also appear to have been placed over the tops of images in Fig. 4B and 4D and at the left of Fig. 6A, 6B, and 6C. While these manipulations probably do not affect the conclusions, they further undermine confidence in data integrity and experimental rigor.

      We have corrected all of these image processing errors. The box in 2A was for the purpose of squaring off a corner that was clipped during image rotation. The boxes in Figures 4 and 6 (of the prior version) were added to give space for labels (without obscuring image features). We have now used alternative methods to accomplish the same goals. For example, in Figures 4-D we have placed the labels outside of the images.

      Minor points. 1. C. elegans nomenclature conventions should be followed: - C. elegans gene names have three or four letters, thus the MALSU homolog cannot be named "malsu-1". Please have new gene names approved by WormBase BEFORE submitting for publication http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/gene_name.cgi

      We have changed malsu-1 to mals-1. In addition, both mals-1 and mrpl-58 have now been approved by WormBase and will be listed on the website upon its next update.

      • If two sequential CRISPR edits are made on the same gene then they should be listed as a compound allele, such as rbm-26(cue22cue25)

      We have updated our gene names to reflect this convention.

      • Genes on the same chromosome should not be separated with a semicolon, for example rbm-26(cue40) K12H4.2(syb6330)

      We have updated our gene names to reflect this convention.

      Describing the defects as "neurodevelopmental" is misleading in the case of axon beading or degeneration. Similarly, there is no evidence for an "axon targeting" defect as stated in the abstract.

      We have revised such that instead of referring to degeneration phenotypes as neurodevelopmental, we now refer to axon degeneration phenotypes that occur during development. For example, in the abstract we now say, “These observations reveal a mechanism that regulates expression of a mitoribosomal assembly factor to protect against axon degeneration during neurodevelopment.”

      Regarding targeting defects, this was meant to refer to the misplacement of the PLM axon tip (which contains electrical synapses). However, our subsequent analysis has revealed that these defects are transient in P80L and L13V mutants, as they resolve by the L4 stage. The rbm-26 null axon development defects do not resolve, though these mutants die prior to the L4 stage. Given these findings, we have decided not to use the term “targeting defects”. Instead, we now refer to this as an axon tiling defect or PLM/ALM overlap phenotype.

      In Fig. 5A, the symbol that appears to correspond to F59C6.15 (lowest p-value) is a different size than the others and is colored as ncRNA, whereas WormBase annotates this gene as snoRNA.

      This error has been corrected.

      In the Introduction, the last sentences of the first two paragraphs should be varied ("However, little is known about the [...] mechanisms that protect [...] during neurodevelopment.")

      This has been done.

      Why is RBM-26 protein running as a doublet at both sizes?

      We have improved our western blotting methodology by using a 12% gel, allowing for better resolution. We have also switched from colorimetric detection to ECL detection, allowing for greater sensitivity. In our new blots, we identify 6 different RBM-26 protein bands. We don’t know the reason for these bands, but speculate that they are the result of post-translational processing (lines 148-150).

      When showing the RBM-26 expression pattern (Fig. 3) please include a lower-magnification image of the entire animal.

      This has been done (Figure S6)

      It is confusing to refer to the RNA IP experiments as an "unbiased screen", which in C. elegans typically refers to a genetic screen.

      We now refer to this as a “biochemical screen”.

      The relationship between axon overextension, beading, and mitochondrial localization is not clear. What causal connection between these is being proposed? The causal connections between these phenotypes, if any, should be clarified experimentally. For example, if the axon extension defects develop before mitochondrial localization defects, then it is unlikely that mitochondrial defects cause axon overextension.

      We have added new data showing that the reduction in mitochondrial density within the axon begins during the L1 stage and increases throughout larval development (Figure 4F). We have also added additional data showing that the increase in mitochondrial oxidation is weak in the L2 stage and surges in the L3 stage (Figure 5C-E), coincident with the beginning of the axon degeneration phenotypes. We propose (lines 383-391) that a low level of mitochondrial defects is present in L1 larvae, giving rise to the axon tiling defects. In the L3 stage there is a surge in excessive mitochondrial oxidation, giving rise to the axon degeneration phenotypes. We have added a new section to the discussion that addresses the relationship between defects in axon development and axon degeneration (lines 375-405).

      Please explain how to interpret the difference in axon beading in the two deletion alleles of the MALSU homolog (axon beading defects in tm12122 but not in syb6330). Is syb6330 not a null allele? Or are the defects in tm12122 due to other mutations in this strain background?

      One likely reason for this difference is that tm12122 is predicted to cause a partial deletion of the mals-1 coding sequence, whereas the syb6330 is a full deletion. Thus, the tm12122 could be acting as a dominant negative. In fact, prior work on the MALSU1 ortholog has indicated that this protein is subject to interference by a dominant negative construct (see Rorbach et al, Nucleic Acids Res 2012). Nonetheless, we cannot rule out the possibility of a linked second mutation in tm12122. However, since we have found similar phenotypes and genetic interactions with both alleles, we can conclude that these phenotypes and interactions are due to loss of MALS-1, rather than a second mutation.

      Are mitochondria reduced in number or mislocalized? If they are reduced in number, is this due to altered balance of fission/fusion?

      We have adjusted our methods for quantifying mitochondria and have also analyzed the proximal vs distal axon (Figure 4). We find that the density of mitochondria is decreased in the proximal axon, but not in the distal axon. We speculate that this might reflect a higher demand on mitochondria in the proximal axon, due to a higher amount of trafficking activity in the proximal axon (lines 255-257). We propose that the loss of RBM-26 causes dysfunction in mitochondria. Since fission and fusion are mechanisms that can help to repair damaged mitochondria, it is likely that they would be involved in the phenotypes that we observe.

      In Fig. 3A-D, please keep the labels in the same position in all panels and do not alter brightness settings between single-color and merged panels.

      These images have been moved to the supplemental data section (Figure S5). We have adjusted the labels as suggested. We have not changed the brightness settings, as they were already the same in all panels. However, the blue signal in the merged panel does obscure some of the red signal, giving an appearance of an alteration in color balance.

      The claim that rbm-26 acts cell-autonomously requires PLM-specific depletion and rescue experiments.

      We have added new data indicating that a Pmec-7::rbm-26::scarlet transgene can rescue the beading phenotype (Figure 3F-G).

      **Referees cross-commenting** I appreciate the use of the consultation session to resolve differences between reviewers, but in this case I fully agree with the content and tone of all the comments from the other reviewer -- I think our remarks are very well aligned!

      Reviewer #1 (Significance (Required)):

      The study engineers autism-associated variants in conserved residues of RBM27 into the C. elegans homolog RBM-26 and identifies neuronal phenotypes potentially relevant to autism and a potential molecular mechanism involving regulation of mitochondrial ribosome assembly.

      The key claims of the study are 1) that autism-associated variants in RBM-26 decrease its protein expression; 2) that impaired RBM-26 function leads to a variety of defects in development and maintenance of a single neuron called PLM, including altered axonal localization of mitochondria; 3) that RBM-26 normally binds the mRNA for the C. elegans homolog of MALSU, a mitochondrial ribosomal assembly factor; 4) that loss of RBM-26 leads to overexpression of the MALSU homolog; and 5) that MALSU is required for some of the deleterious effects on the PLM neuron seen in RBM-26 mutants.

      This study will be of interest to the autism research community because it bolsters the idea that variants in RBM27 are likely to disrupt gene function and to affect neuronal health. It will also be of interest to the broader cell biology community because it suggests an interesting potential nucleus-to-mitochondria signaling mechanism, in which a nuclear RNA-binding protein might regulate assembly of mitochondrial ribosomes.

      My field of expertise is developmental biology in C. elegans.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this manuscript, the authors studied an ASD-associated gene, rbm-26 in neuronal morphology using the touch receptor neuron PLM in C. elegans, and found that loss-of-function rbm-26 causes overextension and the formation of bulb-like structures in the axon. Using UV-crosslinking RNA immunoprecipitation and RNA-Seq, they identify malsu-1 as a target of rbm-26. Genetic analyses suggest malsu-1 likely functions downstream of rbm-26 in controlling the PLM morphology.

      Major comments:

      • The authors describe RBM27 is associated with ASD and ID while they only cite SFARI paper that describes a weak association of RBM27 to ASD. The appropriate referenced that show link between RBM27 and ID should be provided. The link with ID was an error. We had meant to say “ASD or other neurodevelopmental disorders.” This has been corrected.

      • SFARI database only has three (P79L, R190Q, G348D) mutations listed as ASD-associated. Where are other mutations L13V and R455H, particularly L13V that the authors used to generate the C. elegans mutant come from? Are they associated with intellectual disabilities?

      The others came from denovo-db. We have added a reference for this database and have also added the primary source references for each of the five de novo variants (see line 121).

      • The authors should be very careful when describing 'gene X causes Y diseases'. Many (if not all) of the examples described in this manuscript are disease-associated genes without validation to be causal genes.

      We have revised accordingly. For example, on lines 433-435, we now say, “For example, mutations in the EXOSC3, EXOSC8 and EXOSC9 are thought to cause syndromes that include defects in brain development such as hypoplasia of the cerebellum and the corpus callosum.” We have decided to use the phrase “thought to cause” because three of the five referenced articles on these genes use titles that indicate causation.

      • The authors refer PLM axon beading and overextension phenotypes to 'axon degeneration and targeting defects'. The authors must provide additional evidence of axon degeneration (see below). Also the term 'targeting defects' is misleading as the authors did not examine if overextension of the PLM axon causes targeting defects. At least they should examine some synaptic markers.

      To provide more evidence of degeneration we have analyzed several additional phenotypes at multiple developmental stages (Figure 2 and Table S1). Regarding targeting defects, this was meant to refer to the misplacement of the PLM axon tip (which contains electrical synapses). However, our subsequent analysis has revealed that these defects are transient in P80L and L13V mutants, as they resolve by the L4 stage. The rbm-26 null axon development defects do not resolve, though these mutants die prior to the L4 stage. Given these findings, we have decided not to use the term “targeting defects”. Instead, we now refer to this as an axon tiling defect or PLM/ALM overlap phenotype.

      • Neuronal phenotypes (axon overextension and beading) should be examined at different developmental timepoints (larval, young adult, and aged animals) to test if these phenotypes are indeed degenerative instead of developmental defects. We have included new data to observe all of these phenotypes at multiple developmental time points (Figure 2 and Table S1).

      • The authors use the blebbing (beading) phenotype in the axon as the sole evidence of neurodegenerative properties of the PLM neuron. A more thorough analysis of this phenotype as done by others (Pan PNAS 2006) must be provided to support the authors' claim that this phenotype represents neurodegeneration. We have included new data on multiple degenerative phenotypes in axons including: blebbing, beading, waviness and breaks (Table S1).

      • The number of beads per axon should be quantified to better represent the severity of rbm-26 mutant. Individual samples should be plotted in the quantification instead of showing the percentage of animals. We have added data on the density of beads in rbm-26(null), rbm-26(P80L), and rbm-26(L13V) mutants (Figure S3). For most experiments we have decided to use penetrance to measure axon degeneration because this is a standard in the field and allows for a larger sample size. For examples please see:

      10.1523/JNEUROSCI.1494-11.2012 (Toth et al, 2012)

      https://doi.org/10.1016/j.cub.2014.02.025 (Rawson et al, 2014)

      10.1073/pnas.1011711108 (Pan et al, 2012)

      https://doi.org/10.7554/eLife.80856 (Czech et al, 2023)

      https://doi.org/10.1016/j.celrep.2016.01.050 (Nichols et al, 2016)

      • Based on the single gel image in Fig. 1C with no loading control, the P80L mutant appears to have no protein expression. How is the P80L viable while the null mutant is lethal? The authors should quantify the protein expression levels from multiple blots with proper loading controls. If P80L mutation is introduced into RBM-26::mScarlet strain can it cause depletion of the signal in vivo? We have added new data showing that the RBM-26::Scarlet signal is diminished by the P80L mutation in vivo (Figure 1E-F). We have also added quantification from 3 biological replicate blots (Figure 1D). Finally, we have improved the sensitivity of our blots by using ECL detection and also show various exposures to highlight the fainter bands (Figures 1C and S1). Therefore, we are now able to detect low level expression of RBM-26(P80L) mutant protein. It is likely that the low level of RBM-26(P80L) and RBM-26(L13V) seen on western blots is sufficient to prevent the lethal phenotype.

      • 'Moreover, loss of either the SPTBN1 or ADD1 genes causes a neurodevelopmental syndrome that includes autism and ADHD' References are missing, and as described above, be extra careful when indicating causality. Very few genes are known to cause ASD and ADHD. We have added the citations for this work (line 81). We also note that the titles for both of the cited articles indicate causation. To be on the safe side we have revised this line to say, “Moreover, loss of either the SPTBN1 or ADD1 genes are thought to cause a neurodevelopmental syndrome that includes autism and ADHD”

      • Fig. 3E F, the authors should use the strains that express TIR1 specifically in the touch receptor neurons to argue cell autonomous function of RBM-26. Alternatively, the authors may conduct PLM neuron-specific rescue experiments to test the sufficiency. We have added new data indicating that a Pmec-7::rbm-26::scarlet transgene can rescue the beading phenotype and the PLM/ALM overlap phenotype (see Figure 3F-G).

      • 'Loss of RBM-26 causes mitochondria dysfunction in axons.' The authors did not examine mitochondria function in axons. They only examined the number of mitochondria, and ROS production in the soma. The authors should provide additional evidence to support the idea that elevated ROS production in the soma is due to mitochondrial dysfunction in axons. Also, the authors should use both P80L and L13V for this experiment, and indicate individual datapoint as dots. Here, they quantified at the L4 stage, which the authors should justify. We have added the L13V data to this experiment and now show the individual data points. In addition, we have now conducted this analysis at the L2, L3 and L4 stages (Figure 5C-E). We have also revised the text to indicate that loss of rbm-26 function causes mitochondrial dysfunction in the cell body which could potentially cause a reduction of mitochondria in the axon (see lines 100-101 and 268-270). We speculate that mitochondria in the axon are also dysfunctional. However, the mitoTimer signal is not bright enough in axons to allow for quantification.

      • Figure 5B and C: the authors should also use L13V to quantify malsu-1 mRNA and protein level, and include quantifications in panel C (from multiple blots). This is Figure 6 in the new version. We have added new data for expression of mals-1 mRNA and protein in rbm-26(L13V) mutants (Figure 6B-D). We have also included quantifications from 3 biological replicates (Figure 6D).

      • In the rbm-26 mutant, the number of mitochondria is reduced, while the amount of MALSU-1 protein is increased. If MALSU-1 is specifically localized at mitochondria in wild type, where does the excessive MALSU-1 go in the rbm-26 mutants? Quantification of MALSU-1 signal intensity should be provided. Our Pmec-7::mals-1::scarlet transgene uses the tbb-2 3’UTR and causes an overexpression phenotype. To address the question posed by the reviewer, we would need to express MALS-1 at endogenous levels. Given that endogenous levels of MALS-1 are very low, it is unlikely that we would be able to visualize its expression. Nonetheless, as a way to address this question we have attempted to create a single copy Pmec-7::mals-1::scarlet transgene that utilizes the mals-1 endogenous 3’UTR. We have tried multiple approaches for generating this construct, but all have failed, likely due to sequence complexities within the mals-1 3’UTR. While we cannot say where the extra MALS-1 protein goes, we think that it is likely overloaded into the remaining mitochondria and could also be in the cytosol as well.

      • Figure 7C: malsu-1 knockout mutants exhibit PLM overextension phenotype, which is not consistent with their model. The authors should discuss this in detail. We have added a paragraph to the discussion explaining that mitochondria function could be disrupted by either MALS-1 overexpression or by MALS-1 loss of function (lines 471-480).

      • 'To validate these findings, we also repeated these experiments with an independent allele of malsu-1, malsu-1(tm12122) and found similar results (Fig. 7A-C).' The malsu-1(tm12122) exhibits beading phenotype and more severe overextension phenotype which the authors must describe and discuss more carefully. One likely reason for this difference is that tm12122 is predicted to cause a partial deletion of the mals-1 coding sequence, whereas the syb6330 is a full deletion. Thus, the tm12122 could be acting as a dominant negative. In fact, prior work on the MALSU1 ortholog has indicated that this protein is subject to interference by a dominant negative construct (see Rorbach et al, Nucleic Acids Res 2012). Nonetheless, we cannot rule out the possibility of a linked second mutation in tm12122. However, since we have found similar phenotypes and genetic interactions with both alleles, we can conclude that these phenotypes and interactions are due to loss of MALS-1, rather than a second mutation (albeit at a slightly different penetrance). We have added these considerations to the results section (lines 342-244).

      • Figure 8: The authors should include data from L13V, malsu-1 and rbm-26; malsu-1 mutants. Quantification from multiple blots should be provided. This is Figure 8D in the new version. We have added the malsu-1 and rbm-26;malsu-1 double mutants to this experiment. We have also added quantification from multiple biological replicate blots. As pointed out by the other reviewer, we think that this experiment does not give specific information about mitoribosomes, but is an alternative approach to looking at the reduction in mitochondria. Given this limitation and considering that we have added L13V data to the mitochondria experiment in Figure 8B, we have elected not to add additional data on L13V to the western blot experiment in Figure 8D

      Minor comments: • 'Consistent with a role for mitochondria in neurodevelopmental disorders, some of these disorders include a neurodegenerative phenotype.' Why is it consistent to have neurodegenerative phenotypes if mitochondria is associated with neurodevelopmental disorders? A better explanation would help.

      We have changed this sentence to, “Some neurodevelopmental syndromes feature neurodegenerative phenotypes that occur during neuronal development.”

      • L13V is generally more severe in axon overextension phenotype than P80L while protein level is more abundant. The authors should discuss about this.

      We have also added a time course for the PLM/ALM overlap phenotype in these mutants (Figure 2D). This new data shows that the PLM/ALM overlap is quite similar overall between the P80L and L13V mutants. Both of these mutations cause an increase in PLM/ALM overlap in early larval development that is resolved by the L4 stage. The P80L phenotype resolves slightly sooner for reasons that are unknown. This could reflect differences in expression within the PLM that are not reflected in the whole worm lysate. This could also be due to a slight difference in the genetic background or other stochastic factors. The key point is that these two independent alleles cause a similar phenotype overall, indicating that this phenotype is the result of loss of RBM-26 function.

      • Fig. 2E, F: 'Beading refers to focal enlargement or bubble-like lesions which were at least twice the diameter of the axon in size.' How are the diameters of axons measured? A more detailed quantification method, and examples of measurement should be provided.

      We have added example measurements to the supplemental section (Figure S3). Additional details on the measurements are in the Methods section (lines 517-518).

      • Figure 3: The authors should also include low-magnification images to show where RBM-26 is expressed. The current image does not allow identifying cells. The transgene that labels the nuclei of hypodermis should be indicated in the manuscript. Specifically, the expression of the RBM-26 in the PLM should be shown.

      We have added a low magnification image (Figure S6) and have also added images of endogenously tagged RBM-26::Scarlet in the PLM (Figure 3A-C). The transgenic label for the hypodermis has been added to the legend of Figure S5.

      • Figure 3: 'Tissue specific degradation of RBM-26::SCARLET::AID was achieved due to cell-type specific TIR-1 driver lines (see methods for details).' This information is not provided in the method section.

      This information has been added to the Methods section, “Auxin protein degradation”.

      • Fig. 4 E. Values from individual samples should be indicated as dots. Representative images of P80L and L13V should be included. Conduct quantifications at adult stage as the authors use in other quantifications, or justify use of specific developmental stage (L3) they used.

      Figure 4 has become Figures 4 and 5 in the revised version. We have updated the graphs to include dots for individual data points. We have added quantifications of the mitoTimer experiments for the L2, L3 and L4 stages (Figure 5C-E). We note that our other experiments were done at the L1, L2, L3, L4 and adult stages. The mitoTimer signal is not sufficient at the L1 stage for quantification. At the adult stage, the red signal becomes saturated. We have added representative images for mitoTimer in P80L and L13V mutants (Figure S9).

      • The genes malsu-1 and mrpl-58 are not listed on wormbase. If the authors would like to designate names to these gene, they should clearly indicate that along with the sequence name.

      We have changed malsu-1 to mals-1. In addition, both mals-1 and mrpl-58 have now been approved by WormBase and will be listed on the website upon its next update.

      • The authors found that MRPL-58 amount is reduced in rbm-26 mutants (which require additional verifications). This can be explained by the fact that axonal mitochondria number is reduced in the rbm-26 mutants. How did the authors confirm that the reduction in MRPL-58 level is due to the disruption of mitoribosome assembly? This is Figure 8D-E in the new version. We have added new data showing that the decrease in MRPL-58 expression that is caused by the rbm-26(P80L) mutation is dependent on MALS-1. We concede that these experiments cannot be used to determine anything about the mitoribosomes per se, but rather serve as an alternative way of testing the effect of rbm-26 on mitochondria. We have revised the text accordingly (lines 355-357).

      • 'MALSU-1 is a mitoribosomal assembly factor that functions as part of the MALSU1:L0R8F8:mtACP anti-association module [37-39].' I don't think these are known for C. elegans MALSU-1.

      We have revised to, “MALS-1 is an ortholog of the MALSU1 mitoribosomal assembly factor that functions as part of the MALSU1:L0R8F8:mtACP anti-association module.”

      • 'Moreover, our results also suggest that disruption of this process can give rise to neurodevelopmental disorders.' I feel this is a quite a bit of stretch.

      This has been replaced with, “Therefore, we speculate that human RBM26/27 could function with the RNA exosome complex to protect against neurodevelopmental defects and axon degeneration in infants.” (lines 371-373)

      **Referees cross-commenting** Yes, many of our comments overlap, and I fully agree with all comments from the other reviewer too.

      Reviewer #2 (Significance (Required)):

      I found the manuscript interesting, particularly the use of innovative techniques in identifying the target of RBM-26. The genetic analyses of rbm-26 and malsu-1 generally support the authors' main conclusions that rbm-26 inhibits malsu-1 and be of potential interest to basic neuroscientists and cell biologists. However, the current manuscript looked premature which made my reading experience less pleasant. The phenotypic analyses is superficial compared to works similar to this work, which are insufficient to support the authors' claim of 'axon degeneration and targeting defects'. A number of issues listed above should be addressed before this manuscript is published. The reviewer's expertise: neurodevelopment in model organisms.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this manuscript, the authors studied an ASD-associated gene, rbm-26 in neuronal morphology using the touch receptor neuron PLM in C. elegans, and found that loss-of-function rbm-26 causes overextension and the formation of bulb-like structures in the axon. Using UV-crosslinking RNA immunoprecipitation and RNA-Seq, they identify malsu-1 as a target of rbm-26. Genetic analyses suggest malsu-1 likely functions downstream of rbm-26 in controlling the PLM morphology.

      Major comments:

      • The authors describe RBM27 is associated with ASD and ID while they only cite SFARI paper that describes a weak association of RBM27 to ASD. The appropriate referenced that show link between RBM27 and ID should be provided.
      • SFARI database only has three (P79L, R190Q, G348D) mutations listed as ASD-associated. Where are other mutations L13V and R455H, particularly L13V that the authors used to generate the C. elegans mutant come from? Are they associated with intellectual disabilities?
      • The authors should be very careful when describing 'gene X causes Y diseases'. Many (if not all) of the examples described in this manuscript are disease-associated genes without validation to be causal genes.
      • The authors refer PLM axon beading and overextension phenotypes to 'axon degeneration and targeting defects'. The authors must provide additional evidence of axon degeneration (see below). Also the term 'targeting defects' is misleading as the authors did not examine if overextension of the PLM axon causes targeting defects. At least they should examine some synaptic markers.
      • Neuronal phenotypes (axon overextension and beading) should be examined at different developmental timepoints (larval, young adult, and aged animals) to test if these phenotypes are indeed degenerative instead of developmental defects.
      • The authors use the blebbing (beading) phenotype in the axon as the sole evidence of neurodegenerative properties of the PLM neuron. A more thorough analysis of this phenotype as done by others (Pan PNAS 2006) must be provided to support the authors' claim that this phenotype represents neurodegeneration.
      • The number of beads per axon should be quantified to better represent the severity of rbm-26 mutant. Individual samples should be plotted in the quantification instead of showing the percentage of animals.
      • Based on the single gel image in Fig. 1C with no loading control, the P80L mutant appears to have no protein expression. How is the P80L viable while the null mutant is lethal? The authors should quantify the protein expression levels from multiple blots with proper loading controls. If P80L mutation is introduced into RBM-26::mScarlet strain can it cause depletion of the signal in vivo?
      • 'Moreover, loss of either the SPTBN1 or ADD1 genes causes a neurodevelopmental syndrome that includes autism and ADHD' References are missing, and as described above, be extra careful when indicating causality. Very few genes are known to cause ASD and ADHD.
      • Fig. 3E F, the authors should use the strains that express TIR1 specifically in the touch receptor neurons to argue cell autonomous function of RBM-26. Alternatively, the authors may conduct PLM neuron-specific rescue experiments to test the sufficiency.
      • 'Loss of RBM-26 causes mitochondria dysfunction in axons.' The authors did not examine mitochondria function in axons. They only examined the number of mitochondria, and ROS production in the soma. The authors should provide additional evidence to support the idea that elevated ROS production in the soma is due to mitochondrial dysfunction in axons. Also, the authors should use both P80L and L13V for this experiment, and indicate individual datapoint as dots. Here, they quantified at the L4 stage, which the authors should justify.
      • Figure 5B and C: the authors should also use L13V to quantify malsu-1 mRNA and protein level, and include quantifications in panel C (from multiple blots).
      • In the rbm-26 mutant, the number of mitochondria is reduced, while the amount of MALSU-1 protein is increased. If MALSU-1 is specifically localized at mitochondria in wild type, where does the excessive MALSU-1 go in the rbm-26 mutants? Quantification of MALSU-1 signal intensity should be provided.
      • Figure 7C: malsu-1 knockout mutants exhibit PLM overextension phenotype, which is not consistent with their model. The authors should discuss this in detail.
      • 'To validate these findings, we also repeated these experiments with an independent allele of malsu-1, malsu-1(tm12122) and found similar results (Fig. 7A-C).' The malsu-1(tm12122) exhibits beading phenotype and more severe overextension phenotype which the authors must describe and discuss more carefully.
      • Figure 8: The authors should include data from L13V, malsu-1 and rbm-26; malsu-1 mutants. Quantification from multiple blots should be provided.

      Minor comments:

      • 'Consistent with a role for mitochondria in neurodevelopmental disorders, some of these disorders include a neurodegenerative phenotype.' Why is it consistent to have neurodegenerative phenotypes if mitochondria is associated with neurodevelopmental disorders? A better explanation would help.
      • L13V is generally more severe than P80L in the axon overextension phenotype, even though its protein level is more abundant. The authors should discuss this.
      • Fig. 2E, F: 'Beading refers to focal enlargement or bubble-like lesions which were at least twice the diameter of the axon in size.' How are the diameters of axons measured? A more detailed quantification method, and examples of measurement should be provided.
      • Figure 3: The authors should also include low-magnification images to show where RBM-26 is expressed. The current image does not allow identifying cells. The transgene that labels the nuclei of hypodermis should be indicated in the manuscript. Specifically, the expression of the RBM-26 in the PLM should be shown.
      • Figure 3: 'Tissue specific degradation of RBM-26::SCARLET::AID was achieved due to cell-type specific TIR-1 driver lines (see methods for details).' This information is not provided in the method section.
      • Fig. 4 E. Values from individual samples should be indicated as dots. Representative images of P80L and L13V should be included. Conduct quantifications at adult stage as the authors use in other quantifications, or justify use of specific developmental stage (L3) they used.
      • The genes malsu-1 and mrpl-58 are not listed on wormbase. If the authors would like to designate names to these gene, they should clearly indicate that along with the sequence name.
      • The authors found that MRPL-58 amount is reduced in rbm-26 mutants (which require additional verifications). This can be explained by the fact that axonal mitochondria number is reduced in the rbm-26 mutants. How did the authors confirm that the reduction in MRPL-58 level is due to the disruption of mitoribosome assembly?
      • 'MALSU-1 is a mitoribosomal assembly factor that functions as part of the MALSU1:LOR8F8:mtACP anti-association module [37-39].' I don't think these are known for C. elegans MALSU-1.
      • 'Moreover, our results also suggest that disruption of this process can give rise to neurodevelopmental disorders.' I feel this is a quite a bit of stretch.

      Referees cross-commenting

      Yes, many of our comments overlap, and I fully agree with all comments from the other reviewer too.

      Significance

      I found the manuscript interesting, particularly the use of innovative techniques in identifying the target of RBM-26. The genetic analyses of rbm-26 and malsu-1 generally support the authors' main conclusion that rbm-26 inhibits malsu-1 and will be of potential interest to basic neuroscientists and cell biologists. However, the current manuscript looked premature, which made my reading experience less pleasant. The phenotypic analyses are superficial compared to those in similar works and are insufficient to support the authors' claim of 'axon degeneration and targeting defects'. A number of issues listed above should be addressed before this manuscript is published.

      The reviewer's expertise: neurodevelopment in model organisms.

    1. Author response:

      The following is the authors’ response to the original reviews. 

      eLife assessment

      This important manuscript follows up on previous findings from the same lab supporting the idea that deficits in learning due to enhanced synaptic plasticity are due to saturation effects. Compelling evidence is presented that behavioral learning deficits associated with enhanced synaptic plasticity in a transgenic mouse model can be rescued by manipulations designed to reverse the saturation of synaptic plasticity. In particular, the finding that a previously FDA-approved therapeutic can rescue learning could provide new insights for biologists, psychologists, and others studying learning and neurodevelopment.

      eLife assessment, Significance of findings

      This valuable manuscript follows up on previous findings from the same lab supporting the idea that deficits in learning due to enhanced synaptic plasticity are due to saturation effects. 

      According to the eLife criteria for assessing significance, the “valuable” assessment indicates “findings that have theoretical or practical implications for a subfield.” We have revised the manuscript to emphasize the “theoretical and practical implications beyond a single subfield” which “substantially advance our understanding of major research questions”, with “profound implications” and the potential for “widespread influence,” the eLife criteria for a designation of “landmark” significance.   

      The most immediate implications of our results are for the two major neuroscience subfields of cerebellar research and autism research. However, as recognized by Reviewer 2, the implications are much broader than that: “the finding that a previously FDA-approved therapeutic can rescue learning could provide important new insights for biologists, psychologists, and others studying learning and neurodevelopment.” We have substantially revised the Discussion section of the manuscript to more explicitly lay out how the central idea of our manuscript-- that the capacity for learning at any given moment is powerfully influenced by dynamic, activity- and plasticity-dependent changes in the threshold for synaptic plasticity over short timescales of tens of minutes to hours --has implications for scientific thinking and experiments on plasticity and learning throughout the brain, as well as clinical practice for a wide array of brain disorders associated with altered plasticity and learning impairment. 

      To emphasize the broad conceptual implications of our research, we have reframed our conclusions in terms of metaplasticity rather than saturation of plasticity throughout the revised manuscript. In our previous submission, we had used the “saturation” terminology for continuity with our previous Nguyen-Vu et al 2017 eLife paper, and mentioned the related idea of threshold metaplasticity in a single sentence: “Similarly, the aberrant recruitment of LTD before training may lead, not to its saturation per se, but to some other kind of reduced availability, such as an increased threshold for its induction (Bienenstock, Cooper, and Munro, 1982; Leet, Bear, and Gaier, 2022).” However, we now appreciate that metaplasticity is a more general conceptual framework for our findings, and therefore emphasize this concept in the revised manuscript, while still making the conceptual link with the “saturation” idea presented in Nguyen-Vu et al 2017 (lines 236-238).

      The concept of a sliding threshold for synaptic plasticity (threshold metaplasticity) was proposed four decades ago by Bienenstock, Cooper and Munro (1982) as a mechanism for countering an instability inherent in Hebbian plasticity whereby correlated pre- and post-synaptic activity strengthens a synapse, which leads to an increase in correlated activity, which in turn leads to further strengthening. To counter this, BCM proposed a sliding threshold whereby increases in neural activity increase the threshold for LTP and decreases in activity decrease the threshold for LTP, thereby providing a mechanism for stabilizing firing rates and synaptic weights. This BCM sliding threshold model has been highly influential in theoretical and computational neuroscience, but experimental evidence for whether and how such a mechanism functions in vivo has been quite limited.  
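
      A minimal sketch of the sliding-threshold idea described above, assuming the textbook rate-based form of the BCM rule (weight change proportional to x·y·(y − θ), with θ tracking recent y²). The function name `bcm_step` and the parameters `eta` and `tau` are hypothetical choices for illustration; this is not the model or analysis used in the manuscript.

```python
def bcm_step(w, x, y, theta, eta=0.01, tau=10.0):
    """One Euler step of a textbook BCM-style rule (illustrative only).

    w     : synaptic weight
    x, y  : pre- and postsynaptic activity (rates)
    theta : sliding modification threshold
    eta   : learning rate (assumed value)
    tau   : time constant of the threshold (assumed value)
    """
    # Potentiation when postsynaptic activity exceeds the threshold,
    # depression when it falls below it.
    dw = eta * x * y * (y - theta)
    # The threshold slides toward the recent squared postsynaptic activity,
    # so sustained high activity raises it and sustained low activity lowers it.
    dtheta = (y * y - theta) / tau
    return w + dw, theta + dtheta


if __name__ == "__main__":
    w, theta = 1.0, 1.0
    for _ in range(50):
        # Hold activity high: the threshold rises toward y**2.
        w, theta = bcm_step(w, x=1.0, y=2.0, theta=theta)
    print(f"weight={w:.3f}, threshold={theta:.3f}")
```

      In this toy form, a period of elevated activity raises the threshold, so the same activity later produces less potentiation or even depression, which is the stabilizing behavior the BCM proposal was designed to provide.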

      Our work extends the previous, limited experimental evidence for a BCM-like sliding threshold in vivo in several significant ways, which we now discuss in the revised manuscript:

      First, we analyze threshold metaplasticity at synapses where the plasticity is not Hebbian and lacks the inherent instability that inspired the BCM model. The synapses onto cerebellar Purkinje cells have been described as “anti-Hebbian” because the associative form of plasticity is synaptic LTD of excitatory inputs. This anti-Hebbian associative plasticity lacks the instability inherent in Hebbian plasticity. Moreover, a BCM-like sliding threshold that increases the threshold for associative LTD with increased firing rates and decreases the threshold for LTD with decreased firing rates would tend to oppose rather than support the stability of firing rates; nevertheless, we find evidence for such a threshold in our experimental results. Thus, for cerebellar LTD, the central function of the sliding threshold may not be the stabilization of firing rates, but rather to limit plasticity in order to suppress the overwrite of new memories or to allocate different memories to the synapses of different Purkinje cells.

      Second, we analyze the influence of a BCM-like sliding threshold for plasticity on behavioral learning. Most previous evidence for the BCM model in vivo has derived from studies of the effects of sensory deprivation (e.g., monocular occlusion) on the functional connectivity of sensory circuits (Kirkwood et al., 1996; Desai et al. 2002; Fong et al., 2021) rather than on learning per se.  

      Third, our results provide evidence for major changes in the threshold for plasticity over short time scales and with more subtle manipulations of neural activity than used in previous studies, with practical implications for clinical application. Previously, metaplasticity has been demonstrated with sensory deprivation over multiple days (Kirkwood et al., 1996; Desai et al. 2002) or with drastic changes in neural activity, such as with TTX in the retina (Fong et al, 2021), TMS (Hamada et al 2008), or high frequency electrical stimulation in vitro (Holland & Wagner 1998; Montgomery & Madison 2002) or in vivo (Abraham et al 2001). In contrast, we provide evidence for metaplasticity induced by 30 min of behavioral manipulation (pre-training) and by the relatively subtle pharmacological manipulation of activity with systemic administration of diazepam, a drug approved for humans. Thus, our work contributes not only conceptually to understanding the function of threshold metaplasticity in vivo, but also offers practical observations that could pave the way for novel therapeutic interventions.  

      Fourth, whereas efforts to enhance plasticity and learning have largely focused on increasing the excitability of neurons during learning to help cross the threshold for plasticity (e.g., Albergaria et al., 2018; Yamaguchi et al., 2020; Le Friec et al., 2017), we take the opposite, somewhat counterintuitive approach of inhibiting the excitability of neurons during a period before learning to reset the threshold for plasticity to a state compatible with new learning. To our knowledge, the only other application of such an approach in an animal model of a brain disorder has been inhibiting peripheral (retinal) activity with TTX for treatment of amblyopia (Fong et al, 2021). Our findings from CNS inhibition with a single systemic dose of diazepam greatly expand the potential applications, which could readily be tested in other mouse models of human disorders, and other learning deficits. Even in cases where the specific synaptic impairments and circuitry are less fully understood, the impact of suppressing neural activity during a period before training to reduce the threshold for plasticity could be empirically tested.

      Fifth, our work extends the consideration of a BCM-like sliding threshold for plasticity to the cerebellum, whereas previous work has focused on models and experimental studies of forebrain circuits. Currently there is a surge of interest in the contribution of the cerebellum to functions and brain disorders previously ascribed to forebrain, hence we anticipate broad interest in this work. 

      Sixth, our results suggest that the history of plasticity rather than the history of firing rates may be the homeostat controlling the threshold for plasticity, at least at the synapses under consideration. Diazepam pre-treatment only enhanced learning in the L7-Fmr1 KO mice with a low “baseline” threshold for plasticity, as measured in vitro, and not WT mice. This suggests it is not the neural activity per se that drives the change in threshold for plasticity, but the interaction of activity with the plasticity mechanism.

      In the revised Discussion, we make all of the above points, to make the implications more clear to readers.  

      The broad interest in this topic is illustrated by two concrete examples. First, an abstract of this work was honored with selection for oral presentation at the November 2023 Symposium of the Molecular and Cellular Cognition Society, a conceptually wide-ranging organization with thousands of members worldwide. Second, the most closely related published work on activity-dependent metaplasticity in vivo, the Fong et al 2021 eLife paper demonstrating reversal of amblyopia by suppression of activity in the retina by TTX, attracted such broad interest, not just of professional scientists, but also the general public, as to be reported on National Public Radio’s All Things Considered, with an audience of 11.9 million people worldwide.  

      In considering the potential of this work for widespread influence, it is important to note that activity-driven changes in the threshold for plasticity could very well be a general property of most if not all synapses, yet very little is known about its function in vivo, especially during learning. Therefore, the seminal conceptual and practical advances described above have the potential for profound implications throughout neuroscience, psychiatry, neurology and computer science/AI, the eLife criterion for designation as “landmark” in significance. We respectfully request that the reviewers and editor reassess the significance of our findings in light of our much-improved discussion of the broad significance of the work.

      eLife assessment, Strength of support

      Convincing evidence is presented that behavioral learning deficits associated with enhanced synaptic plasticity in a transgenic mouse model can be rescued by manipulations designed to reverse the saturation of synaptic plasticity. In particular, the finding that a previously FDA-approved therapeutic can rescue learning could provide important new insights for biologists, psychologists, and others studying learning and neurodevelopment.

      The designation of “Convincing” indicates “methodology in line with current state-of-the-art.” In the revised Discussion, we more clearly highlight that our evidence is “more rigorous than current state-of-the-art” in several respects, thereby meeting the eLife criterion for “Compelling”:

      (1) Comparison of learning deficits and effects of behavioral and pharmacological pretreatment across five closely related oculomotor learning tasks, which all depend on the same region of the cerebellum (the flocculus), but which previous work has found to vary in their dependence on LTD at the cerebellar parallel fiber-to-Purkinje cell synapses. 

      The “state-of-the-art” behavioral standard in the field of learning is assessment of a single learning task that depends on a given brain area, with the implicit or explicit assumption that the task chosen is representative of “cerebellum-dependent learning” or hippocampus-, amygdala-, basal ganglia-, cortex- dependent learning, etc. Sometimes there is a no-learning behavioral control. 

      Our study exceeds this standard by comparing across many different closely related learning tasks, which all depend on the cerebellar flocculus and other shared vestibular, visual, and oculomotor circuitry, but vary in their dependence on LTD at the cerebellar parallel fiber-to-Purkinje cell synapses. In the original submission, we reported results for high-frequency VOR-increase learning that were dramatically different than for three other VOR learning tasks for which there is less evidence for a role of LTD. Reviewer 2 noted, “the specificity of the effects to forms of plasticity previously shown to require LTD is remarkable.” In the revised manuscript, we provide new data for a second oculomotor learning task in which LTD has been implicated, OKR adaptation, with very similar results as for high-frequency VOR-increase learning. The remarkable specificity of both the learning deficits and the effects of pre-training manipulations, in two different lines of mice, for the two specific learning tasks in which LTD has been most strongly implicated, and not the other three oculomotor learning tasks, substantially strengthens the evidence for the conclusion that the learning deficits and effects of pre-training are related specifically to the lower threshold for LTD, rather than the result of some other effect of the gene KO or pre-treatment on the cerebellar or oculomotor circuitry (discussed on lines 270-290 of revised manuscript).

      (2) Replication of findings in more than one line of mice, targeting distinct signaling pathways, with a common impact of enhancing LTD at the cerebellar PF-Purkinje cell synapses.  

      State-of-the-art is to report the effects of one specific molecular signaling pathway on behavior. 

      In the first part of this Research Advance, we replicate the findings of Nguyen-Vu et al 2017 for a completely different line of mice with enhanced LTD at the parallel fiber-to-Purkinje cell synapses. Like the comparison across LTD-dependent and LTD-independent oculomotor learning tasks, the comparison across completely different lines of mice with enhanced LTD strengthens the evidence that the shared behavioral phenotypes are a reflection of the state of LTD rather than other “off-target” effects of each mutation (discussed on lines 291-309 of revised manuscript).

      (3) Reversal of learning impairments with more than one type of treatment. 

      State-of-the-art is to be able to reverse a learning deficit or other functional impairment in an animal model of a brain disorder with a single treatment; indeed, success in this respect is viewed as wildly exciting, as evidenced by the reception by the scientific and lay communities of the Fong et al, 2021 eLife report of reversal of amblyopia by TTX treatment of the retina. 

      In the current work, we demonstrate reversal of learning deficits with two different types of treatment during the period before training, one behavioral and one pharmacological. The current diazepam pretreatment results provide a fundamentally new type of evidence for the hypothesis that the threshold for LTD and LTD-dependent learning varies with the recent history of activity in the circuit, complementing the evidence from behavioral and optogenetic pre-training approaches used previously in Nguyen-Vu et al, 2017 (discussed on lines 151-158 and 246-255 of revised manuscript).

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Shakhawat et al. investigated how enhancement of plasticity and impairment could result in the same behavioral phenotype. The authors tested the hypothesis that learning impairments result from saturation of plasticity mechanisms and had previously tested this hypothesis using mice lacking two class I major histocompatibility molecules. The current study extends this work by testing the saturation hypothesis in a Purkinje-cell (L7) specific Fmr1 knockout mouse, which has enhanced parallel fiber-Purkinje cell LTD. The authors found that L7-Fmr1 knockout mice are impaired on an oculomotor learning task and both pre-training, to reverse LTD, and diazepam, to suppress neural activity, eliminated the deficit when compared to controls.

      Strengths:

      This study tests the "saturation hypothesis" to understand plasticity in learning using a well-known behavior task, VOR, and an additional genetic mouse line with a cerebellar cell-specific target, L7-Fmr1 KO. This hypothesis is of interest to the community as it evokes a novel inquisition into LTD that has not been examined previously.

      Utilizing a cell-specific mouse line that has been previously used as a genetic model to study Fragile X syndrome is a unique way to study the role of Purkinje cells and the Fmr1 gene. This increases the understanding in the field in regards to Fragile X syndrome and LTD.

      The VOR task is a classic behavior task that is well understood, therefore using this metric is very reliable for testing new animal models and treatment strategies. The effects of pretraining are clearly robust and this analysis technique could be applied across different behavior data sets.

      The rescue shown using diazepam is very interesting as this is a therapeutic that could be used in clinical populations as it is already approved.

      There was a proper use of controls and all animal information was described. The statistical analysis and figures are clear and well describe the results.

      We thank the reviewer for summarizing the main strengths of our original submission. We have further strengthened the revised submission by 

      (1) more fully discussing the broad conceptual implications, as outlined above; 

      (2) adding new data (Fig. 5) showing that another LTD-dependent oculomotor learning task, optokinetic reflex (OKR) adaptation, is impaired in the L7-Fmr1 KO mice and rescued by pre-treatment with diazepam, as we had already shown for high-frequency VOR-increase learning;

      (3) responding to the specific points raised by the reviewers, as detailed below.

      Weaknesses:

      While the proposed hypothesis is tested using genetic animal models and the VOR task, LTD itself is not measured. This study would have benefited from a direct analysis of LTD in the cerebellar cortex in the proposed circuits.

      Our current experiments were motivated by the direct analysis of cerebellar LTD in Fmr1 knock out mice that was already published (Koekkoek et al., 2005). In that previous work, LTD was analyzed in both Purkinje cell selective L7-Fmr1 KO mice (Koekkoek et al., 2005; Fig. 4D), as used in our study, and global Fmr1 knock out mice (Koekkoek et al., 2005; Fig. 4B). Both lines were found to have enhanced LTD, as cited in the Introduction of our manuscript (lines 48-51, 63-64). The goal of our current study was to build on this previous work by analyzing the behavioral correlates of the findings from this previous, direct analysis of LTD. 

      Diazepam was shown to rescue learning in L7-Fmr1 KO mice, but this drug is a benzodiazepine and can cause a physical dependence. While the concentrations used in this study were quite low and animals were dosed acutely, potential side-effects of the drug were not examined, including any possible withdrawal. 

      In humans, diazepam (Valium) is one of the most frequently prescribed drugs in the world, and the side effects and withdrawal symptoms have been extensively studied and documented. Withdrawal symptoms are generally not observed with treatments of less than 2 weeks (Brett and Murnion, 2015). After long-term treatments, tapering of the dosage is recommended to mitigate withdrawal (Brett and Murnion, 2015 and https://americanaddictioncenters.org/valium-treatment/withdrawal-duration). The extensive data on the safety of diazepam in humans lowers the barrier to potential clinical translation of our basic science findings, although we emphasize that our own expertise is scientific, and translation to Fragile X patients or other patient groups will require additional development of the research by clinicians.

      Given the extensive history of research on this drug, we focused on looking for side effects that would reflect an adverse effect of diazepam on the function of the same oculomotor neural circuitry whose ability to support certain oculomotor learning tasks was improved after diazepam. In other words, we assessed whether the pharmacological manipulation was enhancing certain functions of a given circuit at the expense of others. As we note (line 164), “The acute effect of diazepam administration [measured 2 hours after administration] was to impair learning” in both WT and L7-Fmr1 KO mice. One could consider this a side effect. More importantly, we also tested extensively for oculomotor side-effects during the therapeutic period when learning impairments were eliminated in the L7-Fmr1 KOs, 18-24 hours post-administration, and have a full section of the Results describing our findings about this, titled “Specificity of pre-training effects on learning.” As described in the Results and Discussion (lines 184-195, 312-318, Figure 3, figure 3-supplement 1; figure 4B; figure 5-supplement 1), we found no such adverse side-effects, which is again encouraging with respect to the translational potential of our findings.

      This drug is not specific to Purkinje cells or cerebellar circuits, so the action of the drug on cerebellar circuitry is not well understood for the study presented.

      The effects of diazepam are indeed not specific to Purkinje cells, but rather are known to be widespread. Diazepam is a positive allosteric modulator of GABAA receptors, which are found throughout the brain, including the cerebellum. When delivered systemically, as we did in our experiments, diazepam will suppress neural activity throughout the brain by facilitating inhibition, as documented by decades of previous research with this and related benzodiazepines, including dozens of studies of the effects of diazepam in the cerebellum. 

      To our knowledge, there is currently no drug that can specifically inhibit Purkinje cells, especially one that can be given systemically to cross the blood-brain barrier. Moreover, if such a drug did exist, we would not predict it to have the same effect as diazepam in reversing the learning deficits of the L7-Fmr1 KO mice, because the latter presumably depends on suppression of activity in the cerebellar granule cells and neurons of the inferior olive, whose axons form the parallel fibers and climbing fibers, and whose correlated activity controls LTD at the parallel fiber-Purkinje cell synapses.  

      We have revised the text to clarify the key point that despite its widespread action on the brain, the effects of diazepam on cerebellum-dependent learning were remarkably specific (lines 184-195, 210-228, 312-318). During the period 18-24 hours after a single dose of diazepam, the learning deficits of L7-Fmr1 KO mice on two LTD-dependent oculomotor learning tasks were completely reversed, with no effects on the same tasks in WT mice, and no effects (“side-effects”) in L7-Fmr1 KO mice or WT mice on other, LTD-independent oculomotor learning tasks that depend on the same region of the cerebellum, and no effects on baseline performance of visually or vestibularly driven eye movements.

      As described in the revised Discussion (lines 318-323), the non-specific mild suppression of neural activity throughout the brain by diazepam makes it a potentially generalizable approach for inducing BCM-like shifts in the threshold for associative plasticity to facilitate subsequent learning. More specifically, diazepam-mediated reduction of activity throughout the brain has the potential to lower any aberrantly high thresholds for associative plasticity at synapses throughout the brain, and thereby reverse any learning deficits associated with such aberrantly high plasticity thresholds. This approach might even be useful in cases where the neural circuitry supporting a given behavior is not well characterized and the specific synapses responsible for the learning deficit are unknown. On lines 323-327 we compare this generalizable approach with the challenges of designing task- and circuit-specific approaches to reset the threshold for plasticity, particularly in circuits that are less well characterized than the oculomotor circuit.

      It was not mentioned if L7-Fmr1 KO mice have behavior impairments that worsen with age or if Purkinje cells and the cerebellar microcircuit are intact throughout the lifespan. 

      At the adult ages used in our study (8-22 weeks), the oculomotor circuitry, including the Fmr1-deficient Purkinje cells, appears to be functionally intact because all of the oculomotor performance and learning tasks we tested were either normal, or could be restored to normal with brief behavioral and/or pharmacological pre-treatment.  

      Any degeneration of the Fmr1-deficient Purkinje cells or cerebellar microcircuit or additional behavioral impairments at older ages, if they should exist, would not alter our interpretation of the results from 8-22 week old adults regarding history- and activity-dependent changes in the capacity for LTD-dependent learning. Therefore, we leave the question of changes throughout the lifespan to investigators with an interest and expertise in development and/or aging. 

      Only a small handful of the scores of previous studies of the Fmr1 KO mouse model have investigated age-dependent effects; the reviewer may be interested in papers such as Tang et al., 2015 (doi: 10.1073/pnas.1502258112) or Martin et al., 2016 (doi: 10.1093/cercor/bhv031). 

      Connections between Purkinje cells and interneurons could also influence the behavior results found.

      This comment is repeated below in a more general form (Reviewer 1, second to last comment)—please see our response there and lines 270-309 of the revised manuscript for a discussion of how concerns about “off-target” effects are mitigated by the high degree of specificity of the learning deficits and effects of pre-training for the specific learning tasks in which LTD has been previously implicated, and the very similar findings in two different lines of mice with enhanced LTD.

      While males and females were both used for the current study, only 7 of each sex were analyzed, which could be underpowered. While it might be justified to combine sexes for this particular study, it would be worth understanding this model in more detail.

      We performed additional analyses to address the question of whether there might be sex differences that were not detected because of the sample size.

      (1) In a new figure, Fig. 1-figure supplement 1, we break out the results for male and female mice in separate plots, and show that all of the effects of both the KO of Fmr1 from the Purkinje cells and of pretreatment with diazepam that are observed in the full cohort are also statistically significant in just the subset of male mice, and just the subset of female mice (see Fig. 1-figure supplement 1 legend for statistics). In other words, qualitatively, there are no sex differences, and all of the conclusions of our manuscript are statistically valid in both male and female mice. This strengthens the justification for combining sexes for the specific scientific purposes of our study.  

      (2) We performed a power analysis to estimate how many mice would be needed to determine whether the very small quantitative differences between male and female mice are significant. The analysis indicates that this would require upwards of 70 mice of each sex for WT mice (Cohen’s d, 0.6162; power 0.95) and upwards of 2500 mice of each sex for L7-Fmr1 KO mice (Cohen’s d, 0.0989; power 0.95). Since the very small quantitative sex differences observed in our cohorts would not alter our scientific conclusions or the possibility for clinical application to patients of both sexes, even if the small quantitative differences turned out to be significant, the very large number of animals needed did not seem warranted for the current scientific purposes. Researchers focused on sex differences may find a motivation to pursue this issue further.
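
      As a rough illustration of how sample sizes of this order follow from the reported effect sizes, the sketch below applies the standard normal-approximation formula for a two-group comparison. The two-sided alpha of 0.05 and the two-sample design are assumptions made here for illustration; the response does not state which test or alpha level was used, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.95):
    """Approximate animals needed per group to detect effect size d.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2
    for a two-sided, two-sample comparison. An exact t-test calculation
    would give slightly larger numbers.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha (assumed 0.05)
    z_power = z.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Effect sizes (Cohen's d) quoted in the response above.
for label, d in [("WT", 0.6162), ("L7-Fmr1 KO", 0.0989)]:
    print(label, n_per_group(d))
```

      Under these assumptions the estimates land close to the figures quoted above (on the order of 70 and 2,500 or more animals per sex). Because the required sample size scales with 1/d², the roughly sixfold smaller effect size in the L7-Fmr1 KO mice translates into a roughly fortyfold larger cohort.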

      Training was only shown up to 30 minutes and learning did not seem to plateau in most cases. What would happen if training continued beyond the 30 minutes? Would L7-Fmr1 KO mice catch up to WT littermates?

      (1) For VOR learning, we used a 30 min training time because in our past (e.g., Boyden and Raymond, 2003; Kimpo and Raymond, 2007; Nguyen-Vu et al., 2013; Nguyen-Vu et al., 2017) and current results, we find that VOR learning does plateau quite rapidly, with little or no additional adaptive change in the VOR observed between the tests of learning after 30 min vs 20 min of VOR-increase training, in WT or L7-Fmr1 KO mice (Fig. 1A; WT, p=0.917; L7-Fmr1 KO, p=0.861; 20 vs. 30 min; Tukey). In the L7-Fmr1 KO mice, there is no significant high-frequency VOR-increase learning after 30 min training, and the mean VOR gain is even slightly lower on average (not significant) than before training (Fig. 1A, red). Therefore, we have no reason to expect that the L7-Fmr1 KO mice would catch up to WT after additional VOR-increase training.

      (2) We have added new data on OKR adaptation, induced with 60 min of training (Fig. 5). The L7-Fmr1 KO mice exhibited impaired OKR adaptation, even with 60 min of training (p = 1.27 × 10⁻⁴, Tukey). In our experience, restraint for longer than 60 min produces a behavioral state that is not conducive to learning, as also reported by Katoh and Yamagiwa (2018); therefore, longer training times were not attempted.

      The pathway discussed as the main focus for VOR in this learning paradigm was connections between parallel fibers (PF) and Purkinje cells, but the possibility of other local or downstream circuitry being involved was not discussed. PF-Purkinje cell circuits were not directly analyzed, which makes this claim difficult to assess.

      In the revised manuscript (lines 299-309), we have expanded our discussion of the possibility that loss of expression of Fmr1 from Purkinje cells in the Purkinje cell-specific L7-Fmr1 KO mice might influence other synapses or intrinsic properties of the Purkinje cells (including synapses from interneurons, as raised in this reviewer’s comment above), in addition to enhancing associative LTD at the parallel fiber-Purkinje cell synapses.

      It is a very general limitation of all perturbation studies, even cell-type specific perturbation studies as in the current case, that it is never possible to completely rule out “off-target” effects of the manipulation. Because of this, causality cannot be definitively concluded from correlations (e.g., between the effects of a perturbation observed at the cellular and behavioral level), and therefore we make no such claim in our manuscript. Rather, we conclude that our results “provide evidence for,” “support,” “predict,” or “are consistent with” the hypothesis of a history- and activity-dependent change in the threshold for associative LTD at the parallel fiber-Purkinje cells.

      That said, perturbation is still one of the major tools in the experimental toolbox, and there are approaches for mitigating concern about off-target effects. We highlight three aspects of our experimental design that accomplish this (lines 184-228, 256-309). First, we show nearly identical learning impairments and effects of behavioral pretreatment in lines of mice with two completely different molecular manipulations that have the common effect of enhancing PF-Purkinje cell LTD, but are likely to have different off-target cellular effects on the Purkinje cells and their synapses. Second, we show that the learning impairments were highly specific to oculomotor learning tasks in which PF-Purkinje cell LTD was previously implicated, with no such effects on three other oculomotor learning tasks that depend on the same region of the cerebellum and oculomotor circuitry. In the original submission, we provided data for one LTD-dependent oculomotor learning task, high-frequency VOR-increase learning; in the revised manuscript we provide new data for a second LTD-dependent oculomotor learning task, optokinetic reflex adaptation, with nearly identical results (Fig. 5). Third, we show that the effects of diazepam pre-treatment were highly specific to the same two LTD-dependent oculomotor learning tasks and also highly specific to the L7-Fmr1 KO mice with enhanced LTD and not WT mice. These three features of the experimental design are not common in studies of learning, especially in combination. On lines 256-309, we provide an expanded discussion of how together, these three features of the design strengthen the evidence that the learning impairments and effects of diazepam pre-treatment on learning are related to LTD at the PF-Pk synapses, while acknowledging the possibility of other effects on the circuit.

      The authors mostly achieved their aim and the results support their conclusion and proposed hypothesis. This work will be impactful on the field as it uses a new Purkinje-cell specific mouse model to study a classic cerebellar task. The use of diazepam could be further analyzed in other genetic models of neurodevelopmental disorders to understand if effects on LTD can rescue other pathways and behavior outcomes.

      We agree that the present findings are potentially relevant for a very wide array of behavioral tasks, disease models, and brain areas beyond the specific ones in our study, and we make this point on lines 310-338 of the revised manuscript. 

      Reviewer #2 (Public Review):

      This manuscript explores the seemingly paradoxical observation that enhanced synaptic plasticity impairs (rather than enhances) certain forms of learning and memory. The central hypothesis is that such impairments arise due to saturation of synaptic plasticity, such that the synaptic plasticity required for learning can no longer be induced. A prior study provided evidence for this hypothesis using transgenic mice that lack major histocompatibility class 1 molecules and show enhanced long-term depression (LTD) at synapses between granule cells and Purkinje cells of the cerebellum. The study found that a form of LTD-dependent motor learning, increasing the gain of the vestibulo-ocular reflex (VOR), is impaired in these mice and can be rescued by manipulations designed to "unsaturate" LTD. The present study extends this line of investigation to another transgenic mouse line with enhanced LTD, namely, mice with the Fragile X gene knocked out. The main findings are that VOR gain-increase learning is selectively impaired in these mice but can be rescued by specific manipulations of visuomotor experience known to reverse cerebellar LTD. Additionally, the authors show that a transient global enhancement of neuronal inhibition also selectively rescues gain-increase learning. This latter finding has potential clinical relevance since the drug used to boost inhibition, diazepam, is FDA-approved and commonly used in the clinic. The evidence provided for saturation is somewhat indirect because directly measuring synaptic strength in vivo is technically difficult. Nevertheless, the experimental results are solid. In particular, the specificity of the effects to forms of plasticity previously shown to require LTD is remarkable.
      The authors should consider including a brief discussion of some of the important untested assumptions of the saturation hypothesis, including the requirement that cerebellar LTD depends not only on pre- and postsynaptic activity (as is typically assumed) but also on the prior history of synaptic activation.

      We thank the reviewer for this exceptionally clear and concise assessment of the findings and strengths of the manuscript.

      We agree that one of the most “remarkable” aspects of our findings is the specificity of the effects for oculomotor learning tasks for which there is the strongest previous evidence for a role of PF-Purkinje cell LTD. In the original manuscript, we tested just one LTD-dependent oculomotor learning task, high-frequency VOR-increase learning; in the revised manuscript, we strengthen the case for LTD-dependent task specificity by adding new data (Fig. 5) showing the same effects for OKR adaptation, an additional LTD-dependent oculomotor learning task.

      The reviewer’s suggestion to include discussion of “untested assumptions”, “including the requirement that cerebellar LTD depends not only on pre- and postsynaptic activity (as is typically assumed) but also on the prior history of synaptic activation” prompted us to more deeply consider the broader implications of our results, and extensively revise the Discussion accordingly. We clarify that we consider history-dependent changes in the threshold for LTD to be a prediction of the behavioral and pharmacological findings (lines 339-347, 356) rather than an assumption. In addition, we highlight the broader implications of the results by putting them in the context of work in other brain areas on history-dependent changes in the threshold for plasticity, i.e., metaplasticity, going back to the seminal Bienenstock-Cooper-Munro (BCM; year) theory (lines 348-378).

      Reviewer #1 (Recommendations for The Authors):

      The text and figures are very clear to read, but there are a couple of questions that remain:

      The concentrations chosen for diazepam are not well described and it is unclear why the concentrations jump from 2.5 mg/kg to 0.5 mg/kg. Please add an explanation for these concentrations and if any additional behavior outcomes were observed.

      Our choice of diazepam concentrations was guided by the concentrations reported in the literature to be effective in mice, which suggest that a higher dose (2 mg/kg) can have additional effects not observed with a lower effective dose (0.5 mg/kg) (Pádua-Reis et al, 2021). Since we did not know how much enhancement of inhibition/suppression of activity might be necessary to substantially reduce the induction of PF-Purkinje cell LTD, we did pilot experiments to test concentrations at the low and high ends of the doses typically used in mice. These pilot experiments revealed that a lower dose of 0.4 or 0.5 mg/kg was comparable to the higher dose of 2.5 mg/kg in suppressing VOR-increase learning 2 hours after administration (Fig. 3 – figure supplement 2). Anecdotally, we observed higher levels of locomotor activity and other abnormal cage behavior during the period immediately after administration of the higher compared to the lower dose. To limit these side effects and any possibility of dependence, we used only the lower dose in all subsequent experiments. We clarify this rationale for using a lower dose in the legend of Fig. 3 – figure supplement 2.   

      Figure 4 describes low-frequency VOR, but the paragraph discussing these results (line 191) mentions high-frequency VOR-increase learning. It is unclear where the results are for the high-frequency data. Please include or rephrase for clearer understanding.

      In the revised manuscript, we clarify that the 1 Hz frequency of the vestibular and visual stimuli used in Figs. 1-3 is the “high” frequency, which yields different results than the “low” frequency of 0.5 Hz (Fig. 4), as also observed in Boyden et al., 2006, and Nguyen-Vu et al., 2017.

      Reviewer #2 (Recommendations For The Authors):

      The authors should consider including a brief discussion of some of the important untested assumptions of the saturation hypothesis, including the requirement that cerebellar LTD depends not only on pre- and postsynaptic activity (as is typically assumed) but also on the prior history of synaptic activation.

      We thank the reviewer for this comment, which, along with your public comments, inspired us to thoroughly reconsider and revise our Discussion. We think this has greatly improved the manuscript, and will substantially increase its appeal to a broad segment of the neuroscience research community, including computational neuroscientists as well as those interested in synaptic physiology, learning and memory, or plasticity-related brain disorders including autism. 

      Note that we consider the idea that “LTD depends not only on pre- and post-synaptic activity but also on the prior history of synaptic activation” to be the central prediction of the threshold metaplasticity hypothesis rather than an assumption, and in the revised manuscript we explicitly refer to this as a prediction (lines 339, 356). We also added a discussion of multiple known cellular phenomena in the Purkinje cells and their synapses that can regulate LTD and thus represent candidate mechanisms for LTD threshold metaplasticity (lines 339-347). Again, sincere thanks for prompting us to write a vastly improved Discussion section.

      Editor's note:

      Should you choose to revise your manuscript, please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported in the main text for all key questions and not only when the p-value is less than 0.05.

      We have added exact p-values throughout the manuscript.  

      References

      Albergaria C, Silva NT, Pritchett DL, Carey MR. (2018). Locomotor activity modulates associative learning in mouse cerebellum. Nat Neurosci. 21:725-735. doi: 10.1038/s41593-018-0129-x.

      Abraham WC, Mason-Parker SE, Bear MF, Tate WT. (2001). Heterosynaptic metaplasticity in the hippocampus in vivo: A BCM-like modifiable threshold for LTP. Proc Natl Acad Sci USA. 98:10924-10929.

      Bienenstock E, Cooper L, Munro P. (1982). Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex. J Neurosci. 2:32-48. https://doi.org/10.1523/JNEUROSCI.02-01-00032.1982

      Brett J, Murnion B. (2015). Management of benzodiazepine misuse and dependence. Aust Prescr. 38:152-155. doi: 10.18773/austprescr.055.

      Boyden ES, Raymond JL. (2003). Active Reversal of Motor Memories Reveals Rules Governing Memory Encoding. Neuron. 39:1031-1042. https://doi.org/10.1016/S0896-6273(03)00562-2

      Boyden ES, Katoh A, Pyle JL, Chatila TA, Tsien RW, Raymond JL. (2006). Selective engagement of plasticity mechanisms for motor memory storage. Neuron. 51:823-834. https://doi.org/10.1016/j.neuron.2006.08.026

      Desai NS, Cudmore RH, Nelson SB, Turrigiano GG. (2002). Critical periods for experience-dependent synaptic scaling in visual cortex. Nat Neurosci. 5:783-789. doi: 10.1038/nn878.

      Fong M, Duffy KR, Leet MP, Candler CT, Bear MF. (2021). Correction of amblyopia in cats and mice after the critical period. eLife. 10:e70023. https://doi.org/10.7554/eLife.70023

      Hamada M, Terao Y, Hanajima R, Shirota Y, Nakatani-Enomoto S, Furubayashi T, Matsumoto H, Ugawa Y. (2008). Bidirectional long-term motor cortical plasticity and metaplasticity induced by quadripulse transcranial magnetic stimulation. J Physiol. 586:3927-3947. doi: 10.1113/jphysiol.2008.152793.

      Katoh A, Yamagiwa A. (2018). Inhibition of PVN neurons influences stress-induced changes of motor learning in the VOR. Society for Neuroscience. Online Program No. 067.14.

      Kimpo RR, Raymond JL. (2007). Impaired motor learning in the vestibulo-ocular reflex in mice with multiple climbing fiber input to cerebellar Purkinje cells. J Neurosci. 27:5672-5682. doi: 10.1523/JNEUROSCI.0801-07.2007.

      Kirkwood A, Rioult MG, Bear MF. (1996). Experience-dependent modification of synaptic plasticity in visual cortex. Nature. 381:526–528. https://doi.org/10.1038/381526a0

      Koekkoek SK, Yamaguchi K, Milojkovic BA, Dortland BR, Ruigrok TJ, Maex R, De Graaf W, Smit AE, VanderWerf F, Bakker CE, Willemsen R, Ikeda T, Kakizawa S, Onodera K, Nelson DL, Mientjes E, Joosten M, De Schutter E, Oostra BA, Ito M, De Zeeuw CI. (2005). Deletion of FMR1 in Purkinje Cells Enhances Parallel Fiber LTD, Enlarges Spines, and Attenuates Cerebellar Eyelid Conditioning in Fragile X Syndrome. Neuron. 47:339–352. https://doi.org/10.1016/j.neuron.2005.07.005

      Le Friec A, Salabert AS, Davoust C, Demain B, Vieu C, Vaysse L, Payoux P, Loubinoux I. (2017). Enhancing Plasticity of the Central Nervous System: Drugs, Stem Cell Therapy, and Neuro-Implants. Neural Plast. 2017:2545736. doi: 10.1155/2017/2545736.

      Leet MP, Bear MF, Gaier ED. (2022). Metaplasticity: a key to visual recovery from amblyopia in adulthood? Curr Opin Ophthalmol. 33:512–518. https://doi.org/10.1097/ICU.0000000000000901

      Martin HGS, Lassalle O, Brown JT, Manzoni OJ. (2016). Age-Dependent Long-Term Potentiation Deficits in the Prefrontal Cortex of the Fmr1 Knockout Mouse Model of Fragile X Syndrome. Cereb Cortex. 26:2084–2092. doi: 10.1093/cercor/bhv031.

      Montgomery JM, Madison DV. (2002). State-dependent heterogeneity in synaptic depression between pyramidal cell pairs. Neuron. 33:765-777. doi: 10.1016/s0896-6273(02)00606-2.

      Nguyen-Vu TDB, Kimpo RR, Rinaldi JM, Kohli A, Zeng H, Deisseroth K, Raymond JL. (2013). Cerebellar Purkinje cell activity drives motor learning. Nat Neurosci. 16:1734-1736. doi: 10.1038/nn.3576.

      Nguyen-Vu TB, Zhao GQ, Lahiri S, Kimpo RR, Lee H, Ganguli S, Shatz CJ, Raymond JL. (2017). A saturation hypothesis to explain both enhanced and impaired learning with enhanced plasticity. eLife. 6:e20147. https://doi.org/10.7554/eLife.20147

      Pádua-Reis M, Nôga DA, Tort ABL, Blunder M. (2021). Diazepam causes sedative rather than anxiolytic effects in C57BL/6J mice. Sci Rep. 11:9335.

      Singh A, Nagpal R, Mittal SK, Bahuguna C, Kumar P. (2017). Pharmacological therapy for amblyopia. Taiwan J Ophthalmol. 7:62-69. doi: 10.4103/tjo.tjo_8_17.

      Tang B, Wang T, Wan H, Han L, Qin X, Zhang Y, Wang J, Yu C, Berton F, Francesconi W, Yates JR 3rd, Vanderklish PW, Liao L. (2015). Fmr1 deficiency promotes age-dependent alterations in the cortical synaptic proteome. Proc Natl Acad Sci USA. 112:E4697-E4706. doi: 10.1073/pnas.1502258112.

      Yamaguchi T, Moriya K, Tanabe S, Kondo K, Otaka Y, Tanaka S. (2020). Transcranial direct-current stimulation combined with attention increases cortical excitability and improves motor learning in healthy volunteers. J Neuroeng Rehabil. 17:23. doi: 10.1186/s12984-020-00665-7.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This valuable work performed fMRI experiments in a rodent model of absence seizures. The results provide new information regarding the brain's responsiveness to environmental stimuli during absence seizures. The authors suggest reduced responsiveness occurs during this type of seizure, and the evidence leading to the conclusion is solid, although reviewers had divergent opinions.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this paper, the effects of two sensory stimuli (visual and somatosensory) on fMRI responsiveness during absence seizures were investigated in GAERS rats with concurrent EEG recordings. SPM analysis of fMRI showed a significant reduction in whole-brain responsiveness during the ictal period compared to the interictal period under both stimuli, and this phenomenon was replicated in a structurally constrained whole-brain computational model of rat brains.

      The conclusion of this paper is that whole-brain responsiveness to both sensory stimuli is inhibited and spatially impeded during seizures.

      Reviewer #2 (Public Review):

      Summary:

      This study examined the possible effect of spike-wave discharges (SWDs) on the response to visual or somatosensory stimulation using fMRI and EEG. This is a significant topic because SWDs often are called seizures and because there is non-responsiveness at this time, it would be logical that responses to sensory stimulation are reduced. On the other hand, in rodents with SWDs, sensory stimulation (a noise, for example) often terminates the SWD/seizure.

      In humans, these periods of SWDs are due to thalamocortical oscillations. A certain percentage of the normal population can have SWDs in response to photic stimulation at specific frequencies. Other individuals develop SWDs without stimulation. They disrupt consciousness. Individuals have an absent look, or "absence", which is called absence epilepsy.

      The authors use a rat model to study the responses to stimulation of the visual or somatosensory systems during and in between SWDs. They report that the response to stimulation is reduced during the SWDs. While some data show this nicely, the authors also report on lines 396-8 "When comparing statistical responses between both states, significant changes (p<0.05, cluster-) were noticed in somatosensory auditory frontal..., with these regions being less activated in interictal state (see also Figure 4)." That statement is at odds with their conclusion. I do not see that this issue was addressed.

      See comments below starting with “We acknowledge the reviewer…”.

      They also conclude that stimulation slows the pathways activated by the stimulus. I do not see any data proving this. It would require repeated assessments of the pathways in time. This issue was not addressed.

      See comments below starting with “We acknowledge the reviewer…”.

      The authors also study the hemodynamic response function (HRF) and it is not clear what conclusions can be made from the data. This is still an issue. No conclusions appear to be possible to make.

      See comments below starting with “We acknowledge the reviewer…”.

      Finally, the authors use a model to analyze the data. This model is novel and while that is a strength, its validation is unclear. The authors did not add any validation of their model.

      See comments below starting with “We acknowledge the reviewer…”.

      Strengths:

      Use of fMRI and EEG to study SWDs in rats.

      Weaknesses:

      Several aspects of the Methods and Results were improved but some are still unclear.

      We acknowledge the reviewer’s concern that we did not address the comments above. However, we emphasize that most of the comments were addressed in the previously submitted “Response to Review Comments” and in the updated manuscript. Here we repeat those responses and provide additional clarification for some of the comments.

      We thank the reviewer for noting the discrepancy in the statement “less activated in interictal state”. The statement should have been written the other way around. We also note that the direction of the activation change between groups can be misinterpreted from the statistical maps themselves (Figure 3), where only statistical changes are visible and not the polarity of the response (which can be seen in Figure 4). Therefore, we have made the following changes in section 3.3: “There were more voxels with significant changes of activity during interictal state compared to ictal state (136% more). Comparing the statistical responses between interictal and ictal states revealed significant changes (p<0.05, cluster-level corrected) in the visual, somatosensory, and medial frontal cortices. In the ictal state, these regions showed significant hemodynamic decreases when comparing to interictal state, and these polarity changes can be seen the hemodynamic response functions (Figure 4).”

      We agree with the reviewer that there are no data showing slowing of the pathways in response to the stimulus. However, we are unsure which part of the Conclusion section this comment refers to. We did not intend to claim in the manuscript that stimulation slows the activated pathways.

      The reviewer is right that strong claims cannot be made from the HRF by itself. Therefore, we have avoided such phrasing throughout the manuscript. In the conclusion section, we speculate that HRF decreases “could play a role in decreased sensory perception” but also state that “further studies are required”. The observed HRF decreases (rather than increases) in the cortex when stimulation was applied during SWD were discussed in section 4.4, where we speculated that neuronal suppression (possibly apparent in negative HRFs) caused by SWD can prevent responsiveness. The Conclusion now states the following: “Moreover, the detected decreases in the cortical HRF when sensory stimulation was applied during spike-and-wave discharges, could play a role in decreased sensory perception. Further studies are required to evaluate whether this HRF change is a cause or a consequence of the reduced neuronal response.”

      We point out that the main validation of the model and its details were provided in the previous answer to the reviewer and added to the manuscript. The model presented in the paper is based on a mean-field formalism that captures neuronal activity at the mesoscale level. This mean-field formalism is derived via a detailed statistical description of the activity of a spiking neuronal population of excitatory and inhibitory neurons with conductance-based synaptic interactions. Thus, the validation of the mean-field model is performed via direct comparison between the dynamics obtained from the mean-field model and the dynamics obtained from the underlying spiking neural network model. This comparison is shown in the supplementary material of the manuscript, where the transition studied in the paper between interictal (asynchronous irregular activity) and ictal (SWD dynamics) activity, which is predicted by the mean-field model, is indeed observed in the underlying spiking neuronal model. The existence of these two types of dynamics and the transition between them is the main component of the model used to build the analysis of the responsiveness performed in the paper (which has been properly validated).

      Reviewer #3 (Public Review):

      Summary:

      This is an interesting paper investigating fMRI changes during sensory (visual, tactile) stimulation and absence seizures in the GAERS model. The results are potentially important for the field and do suggest that sensory stimulation may not activate brain regions normally during absence seizures. But the findings are limited by substantial methodological issues that do not enable fMRI signals related to absence seizures to be fully disentangled from fMRI signals related to the sensory stimuli.

      Strengths:

      Investigating fMRI brain responses to sensory stimuli during absence seizures in an animal model is a novel approach with potential to yield important insights.

      Use of an awake, habituated model is a valid and potentially powerful approach.

      Weaknesses:

      The major difficulty with interpreting the results of this study is that the duration of the visual and tactile stimuli were 6 seconds, which is very close to the mean seizure duration per Table 1. Therefore the HRF model looking at fMRI responses to visual or auditory stimuli occurring during seizures was simultaneously weighting both seizure activity and the sensory (visual or auditory) stimuli over the same time intervals on average. The resulting maps and time courses claiming to show fMRI changes from visual or auditory stimulation during seizures will therefore in reality contain some mix of both sensory stimulation-related signals and seizure-related signals. The main claim that the sensory stimuli do not elicit the same activations during seizures as they do in the interictal period may still be true. But the attempts to localize these differences in space or time will be contaminated by the seizure related signals.

      In their response to this comment the authors state that some seizures had longer than average duration, and that they attempted to model the effects of both seizures and sensory stimulation. However these factors do not mitigate the concern because the mean duration of seizures and sensory stimulation remain nearly identical, and the models used therefore will not be able to effectively separate signals related to seizures and related to sensory stimulation.

      Regressors for seizures were formed by including periods of seizures without any stimulation present. In theory, if seizures were perfectly modeled by the regressor, the remaining variance would be completely orthogonal to the main effect of the stimulus. Furthermore, only the cases where the seizures are longer than the stimulus are used to calculate the responsiveness to the stimulus (while the cases where the seizures are shorter than the stimulus are used as nuisance regressors to account for error variance). However, we agree with the reviewer that in practice the effects of the seizure cannot be removed completely from the effect of the stimulus. We have addressed this concern in the “physiologic and methodology consideration” section: “We note a caution that presented maps and time courses showing fMRI changes from visual or whisker stimulation during seizures may contain a mixture of both sensory stimulation-related signals and seizure-related signals. To minimize this contamination in the linear model used, we considered both stimulation and seizure-only states as regressors of interest and used seizure-only responses as nuisance regressors to account for error variance. Thereby, the effects caused by the stimulation should be separated as much as possible from the effects caused by the seizure itself.”
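
      The separability issue discussed in this exchange can be illustrated with a toy calculation. The sketch below is not the authors' analysis pipeline: it builds simple boxcar regressors on a hypothetical 1 s sampling grid, omits HRF convolution, and uses made-up onset times. It only shows why a seizure regressor that coincides with the stimulus cannot be distinguished from it, whereas a regressor built from seizure-only periods is close to uncorrelated with the stimulus regressor.

```python
from math import sqrt


def boxcar(n, onsets, duration):
    """Return a 0/1 regressor of length n with blocks of the given duration."""
    x = [0.0] * n
    for t0 in onsets:
        for t in range(t0, min(t0 + duration, n)):
            x[t] = 1.0
    return x


def pearson(x, y):
    """Pearson correlation between two equal-length regressors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


n = 300                                   # hypothetical run length in samples
stim_onsets = [20, 80, 140, 200, 260]     # hypothetical stimulus onsets
stim = boxcar(n, stim_onsets, 6)          # 6 s stimulation blocks, as in the study

# Case 1: seizures coincide exactly with the stimulation blocks.
seizure_coincident = boxcar(n, stim_onsets, 6)
# Case 2: seizure-only periods that never overlap with stimulation.
seizure_only = boxcar(n, [50, 110, 170, 230], 6)

r_coincident = pearson(stim, seizure_coincident)
r_seizure_only = pearson(stim, seizure_only)
print(r_coincident, r_seizure_only)
```

      In the coincident case the two regressors are identical, so a linear model cannot assign variance uniquely to either one; in the seizure-only case the correlation is small, which is the situation the seizure-only regressors described above aim for. Real seizures that partly overlap the stimulus fall between these two extremes.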

      The claims that differences were observed for example between visual cortex and superior colliculus signals with visual stim during seizures vs interictal remain unconvincing due to above.

      Maps shown in Figure 3 do not show clear changes in the areas claimed to be involved.

      In their response the authors enlarged the cross sections. However there are still discrepancies between the images and the way they are described in the text. For example, in the Results text the authors say that comparing the interictal and ictal states revealed less activation in the somatosensory cortex during the ictal than during the interictal state, yet Figure 3 bottom row left shows greater activation in somatosensory cortex in this contrast.

      We note that the direction of the activation change between groups can be misinterpreted from the statistical maps themselves (Figure 3), where only statistical changes are visible and not the polarity of the response (which can be seen in Figure 4). Therefore, we have made the following changes to section 3.3: “There were more voxels with significant changes of activity during interictal state compared to ictal state (136% more). Comparing the statistical responses between interictal and ictal states revealed significant changes (p<0.05, cluster-level corrected) in the visual, somatosensory, and medial frontal cortices. In the ictal state, these regions showed significant hemodynamic decreases when comparing to interictal state, and these polarity changes can be seen the hemodynamic response functions (Figure 4).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Authors have revised this paper with a lot of detail. The paper can be accepted for publication in this version.

      Reviewer #2 (Recommendations For The Authors):

      Reviewer #1

      (1) The analysis in this paper does not directly answer the scientific question posed by the authors, which is to explore the mechanisms of the reduced brain responsiveness to external stimuli during absence seizures (in terms of altered information processing), but merely characterizes the spatial involvement of such reduced responsiveness. The same holds for the use of mean-field modeling, which merely reproduces experimental results without explaining them mechanistically as what the authors have claimed at the head of the paper.

      We agree with the reviewer that the manuscript does not specifically address the mechanisms of reduced brain responsiveness. The main scientific question addressed in the manuscript was to compare whole-brain responsiveness to stimulus between ictal and interictal states. The sentence in the manuscript abstract that could lead to misinterpretation, "The mechanism underlying the reduced responsiveness to external stimulus remains unknown.", was therefore modified to the following: "The whole-brain spatial and temporal characteristics of reduced responsiveness to external stimulus remains unknown".

      This change did not address the issue. The problem is that there is no experimentation to address the underlying mechanisms of the results. I also think the changed language in the abstract is less clear than the original.

      We fully agree that this manuscript does not address, or claim to address, the mechanisms of reduced brain responsiveness. The main scientific question addressed in the manuscript was to compare whole-brain responsiveness to stimulus between ictal and interictal states, by means of hemodynamics and mean-field simulation.

      We have changed the language of the abstract to the following:

      “In patients suffering absence epilepsy, recurring seizures can significantly decrease their quality of life and lead to yet untreatable comorbidities. Absence seizures are characterized by spike-and-wave discharges on the electroencephalogram associated with a transient alteration of consciousness. However, it is still unknown how the brain responds to external stimuli during and outside of seizures.

      This study aimed to investigate responsiveness to visual and somatosensory stimulation in GAERS, a well-established rat model for absence epilepsy. Animals were maintained in a non-curarized awake state allowing for naturally occurring seizures to be produced inside the magnet. They were imaged continuously using a quiet zero-echo-time functional magnetic resonance imaging (fMRI) sequence. Sensory stimulations were applied during interictal and ictal periods. Whole brain responsiveness and hemodynamic responses were compared between these two states. Additionally, a mean-field simulation model was used to mechanistically explain the changes of neural responsiveness to visual stimulation between interictal and ictal states.

      Results showed that, during a seizure, whole-brain responses to both sensory stimulations were suppressed and spatially hindered. In several cortical regions, hemodynamic responses were negatively polarized during seizures, despite the application of a stimulus. The simulation experiments also showed restricted propagation of spontaneous activity due to stimulation and so agreed well with fMRI findings. These results suggest that sensory processing observed during an interictal state is hindered or even suppressed by the occurrence of an absence seizure, potentially contributing to decreased responsiveness during this absence epileptic process.”

      The authors also study the hemodynamic response function (HRF) and it is not clear what conclusions can be made from the data.

      The response of the authors did not clarify this issue. Instead, they explained why they examined HRF and that they can only speculate what the data means.

      The reviewer is right that strong claims cannot be made from the HRF by itself. Therefore, we have avoided such phrasing throughout the manuscript. In the conclusion section, we speculate that HRF decreases “could play a role in decreased sensory perception” but also state that “further studies are required”.

      Finally, the authors use a model to analyze the data. This model is novel and while that is a strength, its validation is unclear. The conclusion is that the modeling supports the conclusions of the study, which is useful.

      Details about the model were added.

      This is not entirely satisfactory because there is still no validation of the model.

      We point out that the main validation of the model and its details were provided in the previous answer to the reviewer and added to the manuscript. The model presented in the paper is based on a mean-field formalism that captures neuronal activity at the mesoscale level. This mean-field formalism is derived via a detailed statistical description of the activity of a spiking neuronal population of excitatory and inhibitory neurons with conductance-based synaptic interactions. Thus, the validation of the mean-field model is performed via direct comparison between the dynamics obtained from the mean-field model and the dynamics obtained from the underlying spiking neural network model. This comparison is shown in the supplementary material of the manuscript, where the transition studied in the paper between interictal (asynchronous irregular activity) and ictal (SWD dynamics) activity, which is predicted by the mean-field model, is indeed observed in the underlying spiking neuronal model. The existence of these two types of dynamics and the transition between them is the main component of the model used to build the analysis of the responsiveness performed in the paper (which has been properly validated).

      How is ROI defined in this paper? What type of atlas is used?

      Anatomical ROIs were drawn based on the Paxinos and Watson rat brain atlas, 7th edition. A region was selected if statistically significant activations were detected inside that region, based on the activation maps. We clarified the definition of ROI as follows:

      "Anatomical ROIs, based on Paxinos atlas (Paxinos and Watson rat brain atlas 7th edition), were drawn on the brain areas where statistical differences were seen in activation maps."

      This is helpful, but the unstained brain does not show the borders of the areas. Therefore just saying an atlas was used is not enough. How in an unstained brain can the areas be accurately outlined?

      Areas of the brain were differentiated by co-registering the functional MRI images with a T1-weighted anatomical reference brain that was created on site from the same data set that was used for the manuscript. Potential co-registration inaccuracies that would arise from using a reference brain acquired at a different site, with a different sequence and rat strain, can thus be avoided. T1-weighted images provide sufficient contrast to differentiate the main brain areas, but for more accurate border definition (e.g., to differentiate thalamic nuclei), the coordinate system of the atlas and known coordinates in the anatomical reference brain were used to pinpoint the exact borders of the brain areas.

      Reviewer #2

      The following also is not precise:

      "Although seizures are initially triggered by hyperactive somatosensory cortical neurons, the majority of neuronal populations are deactivated rather than activated during the seizure, resulting in an overall decrease in neuronal activity during SWD (McCafferty et al. 2023)."

      What neuronal populations? Cortex? Which neurons in the cortex? Those projecting to the thalamus? What about thalamocortical relay cells? Thalamic gabaergic neurons?

      Please check that these issues were corrected.

      The issues were addressed as follows:

      “Although SWDs are initially triggered by hyperactive somatosensory cortical neurons, neuronal firing rates, especially in majority of frontoparietal cortical and thalamocortical relay neurons, are decreased rather than increased during SWD, resulting in an overall decrease in activity in these neuronal populations (McCafferty et al., 2023). Previous fMRI studies have demonstrated blood volume or BOLD signal decreases in several cortical regions including parietal and occipital cortex, but also, quite surprisingly, increases in subcortical regions such as thalamus, medulla and pons (David et al., 2008; McCafferty et al., 2023).”

      Results

      After removing problematic animals and sessions, was there sufficient power? There probably wasn't enough to determine sex differences.

      After removing problematic sessions, we found statistically significant (multiple-comparison corrected) results in both activation maps and hemodynamic responses. There were not enough animals to obtain statistically significant findings on sex differences (p>0.05).

      This is not the question. The question is whether there was sufficient power.

      A simple power calculation was performed as follows: considering a t-test, a risk alpha of 0.05, a power of 0.8, matched pairs (seizure/control), we can detect an effect size of 0.37 with our 4 animals, considering repeated measurements (4 sessions/animal x 11 seizure/control pairs per session). This is now mentioned in the manuscript.
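
      The kind of calculation described above can be sketched with a normal approximation for a two-sided paired test. This is only an illustration under stated assumptions: the function name `min_detectable_effect` is hypothetical, and the number of pairs treated as independent is a free choice here, because the response does not say how the repeated measurements were weighted. The sketch is therefore not expected to reproduce the reported effect size of 0.37.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n_pairs, alpha=0.05, power=0.8):
    """Smallest standardized paired difference detectable by a two-sided test.

    Normal approximation: d = (z_{1-alpha/2} + z_{power}) / sqrt(n_pairs).
    Assumes the n_pairs paired differences are independent.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return (z_alpha + z_power) / sqrt(n_pairs)


# Hypothetical choices of the independent unit: animals, sessions, or all pairs.
for n_pairs in (4, 4 * 4, 4 * 4 * 11):
    print(n_pairs, round(min_detectable_effect(n_pairs), 2))
```

      Treating every seizure/control pair as independent gives the most optimistic value, while counting only animals gives the most conservative one; correlated repeated measurements fall somewhere in between.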

      Table 1 has no statistical comparisons.

      Table 1 is purely an illustration of stimulation and seizure occurrence. There is no specific interest in comparing stimulation types (in what state of seizure they occurred), as this would not provide any meaningful inferences for the study.

      Table 1 could be improved by statistics. More could be said and there would be justification to include it.

      We thank the reviewer for the suggestion, but as it remains unclear which statistical comparison would be feasible, we opt not to add one.

      Statistical activation maps - it is not clear how this was done.

      Creation of statistical maps is explained in section 2.5.3.

      This section is not clear.

      We have added a reference (https://doi.org/10.1002/hbm.460020402) for readers to familiarize themselves with the concept of statistical parametric mapping.

      Fig 3 "F-contrast maps." Please explain.

      Creation of statistical maps is explained in section 2.5.3.

      This section is unclear.

      We have added a reference (https://doi.org/10.1002/hbm.460020402) for readers to familiarize themselves with the concept of statistical parametric mapping.

      Reviewer #3 (Recommendations For The Authors):

      Aside from the concerns listed as weaknesses above which were not addressed, most of the more minor comments were addressed by the authors in the resubmission. However, the comment below was not addressed because it is impossible to see any firing rate changes elicited by sensory stimuli (if they are present) due to the scale during seizures. The seizure signals should be removed or accounted for by the model so that any possible sensory stimulus-related signals could be seen, and displayed on the same scale as firing rates without seizures. Prior comment (unaddressed) is repeated below:

      Figure 6-figure supplement 1, the scales are very different for many of the plots so they are hard to compare. Especially in the ictal periods (D, E, F) it is hard to see if any changes are happening during ictal stimulation similar to interictal stimulation due to very different scales. The activity related to SWD is so large that it overshadows the rest, and perhaps should be subtracted out.

      These two comments were addressed and replied to in the previous round of reviews. Regarding the different scales of the plots in Figure 6-figure supplement 1, we point out that all the plots are already presented on the same scale in Figure 6 of the main text. Regarding the activity related to SWD and sensory stimulation, we remark that the effect of the stimulation should be (and was) evaluated with respect to the ongoing activity. All the results concerning the neuronal responsiveness presented in the paper evaluate the statistical significance of the changes in activity produced by the stimulation with respect to the ongoing activity (during ictal and interictal states respectively). For this reason, all the plots containing the time series of neuronal activity in the simulations include the ongoing activity (with SWD dynamics when present) for proper comparison and relevant analysis.

      Additional changes:

      In section 3.2, the sentence “In addition, responses were observed in the somatosensory cortex during a seizure state.” was removed for clarity, as deactivation rather than activation was observed in this brain area during a seizure state.

    1. Reviewer #1 (Public Review):

      Summary:

      This study examines a hypothesized link between autism symptomatology and efference copy mechanisms. This is an important question for a number of reasons. Efference copy is both a critical brain mechanism that is key to rapid sensorimotor behaviors, and one that has important implications for autism given recent empirical and theoretical work implicating atypical prediction mechanisms and atypical reliance on priors in ASD.

      The authors test this relationship in two different experiments, both of which show larger errors/biases in spatial updating for those with heightened autistic traits (as measured by AQ in neurotypical (NT) individuals).

      Strengths:

      The empirical results are convincing - effects are strong, sample sizes are sufficient, and the authors also rule out alternative explanations (ruling out differences in motor behavior or perceptual processing per se).

      Weaknesses:

      My main residual concern is that the paper should be more transparent about both (1) that this study does not include individuals with autism, and (2) acknowledging the limitations of the AQ.

      On the first point, and I don't think this is intentional, there are several instances where the line between heightened autistic traits in the NT population and ASD is blurred or absent. For example, in the second sentence of the abstract, the authors state "Here, we examine the idea that sensory overload in ASD may be linked to issues with efference copy mechanisms". I would say this is not correct because the authors did not test individuals with ASD. I don't see a problem with using ASD to motivate and discuss this work, but it should be clear in key places that this was done using AQ in NT individuals.

      For the second issue, the AQ measure itself has some problems. For example, reference 38 in the paper (a key AQ paper) also shows that the AQ is skewed more male than modern estimates of ASD, suggesting that the AQ may not fully capture the full spectrum of ASD symptomatology.

      Of course, this does not mean that the AQ is not a useful measure (the present data clearly show that it captures something important about spatial updating during eye movements), but it should not be confused with ASD, and its limitations need to be acknowledged. My recommendation would be to do this in the title as well - e.g. note impaired visuomotor updating in individuals with "heightened autistic traits".

      Suggestions for improvement:

      - Figure 5 is really interesting. I think it should be highlighted a bit more, perhaps even with a model that uses the results of both tasks to predict AQ scores.

      - Some discussion of the memory demands of the tasks will be helpful. The authors argue that memory is not a factor, but some support for this is needed.

      - With 3 sessions for each experiment, the authors also have data to look at learning. Did people with high AQ get better over time, or did the observed errors/biases persist throughout the experiment?

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study tests the hypothesis that a high autism quotient in neurotypical adults is strongly associated with suboptimal motor planning and visual updating after eye movements, which in turn, is related to a disrupted efference copy mechanism. The implication is that such abnormal behavior would be exaggerated in those with ASD and may contribute to sensory overload - a key symptom in this condition. The evidence presented is convincing, with significant effects in both visual and motor domains, adequate sample sizes, and consideration of alternatives. However, the study would be strengthened with minor but necessary corrections to methods and statistics, as well as a moderation of claims regarding direct application to ASD in the absence of testing such patients.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study examines a hypothesized link between autism symptomatology and efference copy mechanisms. This is an important question for several reasons. Efference copy is both a critical brain mechanism that is key to rapid sensorimotor behaviors, and one that has important implications for autism given recent empirical and theoretical work implicating atypical prediction mechanisms and atypical reliance on priors in ASD.

      The authors test this relationship in two different experiments, both of which show larger errors/biases in spatial updating for those with heightened autistic traits (as measured by AQ in neurotypical (NT) individuals).

      Strengths:

      The empirical results are convincing - effects are strong, sample sizes are sufficient, and the authors also rule out alternative explanations (ruling out differences in motor behavior or perceptual processing per se).

      Weaknesses:

      My main concern is that the paper should be more transparent about both (1) that this study does not include individuals with autism, and (2) acknowledging the limitations of the AQ.

      On the first point, and I don't think this is intentional, there are several instances where the line between heightened autistic traits in the NT population and ASD is blurred or absent. For example, in the second sentence of the abstract, the authors state "Here, we examine the idea that sensory overload in ASD may be linked to issues with efference copy mechanisms". I would say this is not correct because the authors did not test individuals with ASD. I don't see a problem with using ASD to motivate and discuss this work, but it should be clear in key places that this was done using AQ in NT individuals.

      For the second issue, the AQ measure itself has some problems. For example, reference 38 in the paper (a key paper on AQ) also shows that those with high AQ skew more male than modern estimates of ASD, suggesting that the AQ may not fully capture the full spectrum of ASD symptomatology. Of course, this does not mean that the AQ is not a useful measure (the present data clearly show that it captures something important about spatial updating during eye movements), but it should not be confused with ASD, and its limitations need to be acknowledged. My recommendation would be to do this in the title as well - e.g. note impaired visuomotor updating in individuals with "heightened autistic traits".

      We thank the reviewer for the kind words. We now specify more carefully that our sample of participants consists of neurotypical adults scored for autistic traits and none of them was diagnosed with autism before participating in our experiment. Regarding the Autistic Quotient Questionnaire (AQ) on page 5 of the Introduction we now write:

      “The autistic traits of the whole population form a continuum, with ASD diagnosis usually situated on the high end 31-33. Moreover, autistic traits share a genetic and biological etiology with ASD 34. Thus, quantifying autistic-trait-related differences in healthy people can provide unique perspectives as well as a useful surrogate for understanding the symptoms of ASD 31,35.”

      In the Discussion (page 9) we now write:

      “It is essential to note that our participant pool lacked pre-existing diagnoses before engaging in the experiments and we must address limitations associated with the AQ questionnaire. The AQ questionnaire demonstrates adequate test-retest reliability 36, normal distribution of sum scores in the general population 50, and cross-cultural equivalence has been established in Dutch and Japanese samples 51-53. The AQ effectively categorizes individuals into low, average, and high degrees of autistic traits, demonstrating sensitivity for both group and individual assessments 54.

      However, evolving research underscores many aspects that are not fully captured by the self-administered questionnaire: for example, gender differences in ASD trait manifestation 55. Autistic females may exhibit more socially typical interests, often overlooked by professionals 56. Camouflaging behaviors, employed by autistic women to blend in, pose challenges for accurate diagnosis 57. Late diagnoses are attributed to a lack of awareness, gendered traits, and outdated assessment tools 58. Moving forward, complementing AQ evaluations in the general population with other questionnaires, such as those assessing camouflaging abilities 59, or motor skills in everyday situation (MOSES-test 60) becomes crucial for a comprehensive understanding of autistic traits.”

      Suggestions for improvement:

      - Figure 5 is really interesting. I think it should be highlighted a bit more, perhaps even with a model that uses the results of both tasks to predict AQ scores.

      We thank the reviewer for the suggestion. However, the sample size is relatively small for building a robust and generalizable model to predict AQ scores. Statistical models built on small datasets can be prone to overfitting, meaning that they might not accurately predict the AQ for new individuals.

      - Some discussion of the memory demands of the tasks will be helpful. The authors argue that memory is not a factor, but some support for this is needed. 

      The reviewer raises an important point regarding the potential for memory demands to influence our results. We have now also investigated the accuracy of the second saccade separately for the x and y dimensions. As shown in Figure 3, panel A, a motor bias was observed in only one dimension (x), weakening the memory argument, which would imply a bias in both dimensions (for example, if participants remembered the position of the target relative to both screen borders). We performed a t-test between our two subsamples of participants and indeed found a difference in saccade accuracy for the x dimension (p = 0.03) but not in the y dimension (p = 0.88).

      We now add these analyses in Discussion on page 8.

      - With 3 sessions for each experiment, the authors also have data to look at learning. Did people with high AQ get better over time, or did the observed errors/biases persist throughout the experiment? 

      We thank the reviewer for pointing this out. On page 7 (Results) we now write:

      “Understanding how these biases might change over time could provide further insights into this mechanism. Specifically, we investigated whether participants exhibited any learning effects throughout the experiments. For data of Experiment 1 – motor updating – we divided our data into 10 separate bins of 30 trials each. We conducted a repeated-measures ANOVA with the within-subject factor “number of sessions” (two main sessions of 5 bins each, ~150 trials) and the between-subject factor “group” (lower vs upper quartile of the AQ distribution). We found no main effect of “number of sessions” (F(1,7) = 0.25, p = 0.66), a main effect of “group” (F(1,7) = 2.52, p = 0.015), and no interaction between the two subsamples of participants and the sessions tested (F(1,7) = 0.51, p = 0.49). Data of Experiment 2 – visual updating – were separated into 3 sessions. For each session we extracted the PSE and we conducted a repeated-measures ANOVA with within-subject factor “sessions” and between-subject factor “groups” (lower vs upper quartile of the AQ distribution). Also here we found no main effect of sessions (F(1,13) = 0.86, p = 0.39), a main effect of group (F(1,14) = 11.85, p = 0.004), and no interaction between the two subsamples of participants and the sessions tested (F(1,13) = 0.20, p = 0.73). In conclusion, the current study found no evidence of learning effects across the experimental sessions. However, a significant main effect of group was observed in both Experiment 1 (motor updating) and Experiment 2 (visual updating). Participants in the group with higher autistic traits performed systematically differently on the task compared to those in the group with lower autistic traits, regardless of the number of sessions completed.”
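
      The binning step described for Experiment 1 can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the helper names `bin_means` and `group_bins` are hypothetical, and the per-trial values are synthetic placeholders rather than real saccade errors.

```python
from statistics import mean


def bin_means(values, bin_size=30):
    """Mean of consecutive, non-overlapping bins; a trailing partial bin is dropped."""
    n_bins = len(values) // bin_size
    return [mean(values[i * bin_size:(i + 1) * bin_size]) for i in range(n_bins)]


def group_bins(bins, bins_per_session=5):
    """Group consecutive bins into sessions of `bins_per_session` bins each."""
    return [bins[i:i + bins_per_session] for i in range(0, len(bins), bins_per_session)]


# Placeholder per-trial errors for one participant (300 trials, synthetic values).
errors = [float(i % 7) for i in range(300)]

bins = bin_means(errors)     # 10 bins of 30 trials each
sessions = group_bins(bins)  # 2 sessions of 5 bins (150 trials) each
print(len(bins), [len(s) for s in sessions])
```

      The per-session bin means produced this way are the kind of values that would then enter a repeated-measures ANOVA with session as the within-subject factor and AQ group as the between-subject factor.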

      Reviewer #2 (Public Review):

      Summary:

      The idea that various clinical conditions may be associated, at least partially, with a disrupted corollary discharge mechanism has been present for a long time.

      In this paper, the authors draw a link between sensory overload, a characteristic of autism spectrum disorder, and a disturbance in the corollary discharge mechanism. The authors substantiate their hypothesis with strong evidence from both the motor and perceptual domains. As a result, they broaden the clinical relevance of the corollary discharge mechanism to encompass autism spectrum disorder.

      The authors write:

      "Imagine a scenario in which you're watching a video of a fast-moving car on a bumpy road. As the car hits a pothole, your eyes naturally make quick, involuntary saccades to keep the car in your visual field. Without a functional efference copy system, your brain would have difficulty accurately determining the current position of your eye in space, which in turn affects its ability to anticipate where the car should appear after each eye movement."

      I appreciate the use of examples to clarify the concept of efference copy. However, I believe this example is more related to a gain-field mechanism, informing the system about the position of the eye with respect to the head, rather than an example of efference copy per se.

      Without an efference copy mechanism, the brain would have trouble accurately determining where the eyes will be in space after an eye movement, and it will have trouble predicting the sensory consequences of the eye movement. However, it can be argued that the gain-field mechanism would be sufficient to inform the brain about the current position of the eyes with respect to the head.

      We now use a different example. On page 3 of the Introduction, we now write:

      “During a tennis game, rapid oculomotor saccades are employed to track the high-velocity ball across the visual display. In the absence of a functional efference copy mechanism, the brain would encounter difficulty in anticipating the precise retinal location of the ball following each saccade. This could result in a transient period of visual disruption as the visual system adjusts to the new eye position. The efference copy, by predicting the forthcoming sensory consequences of the saccade, would bridge this gap and facilitate the maintenance of a continuous and accurate representation of the ball's trajectory.”

      The authors write:

      "In the double-step paradigm, two consecutive saccades are made to briefly displayed targets 21, 22. The first saccade occurs without visual references, relying on internal updating to determine the eye's position."

      Maybe I have missed something, but in the double-step paradigm the first saccade can occur without the help of visual references if no visual feedback is present, that is, when saccades are performed in total darkness. Was this the case for this experiment? I could not find details about room conditions in the methods. Please provide further details.

      In case saccades were not performed in total darkness, then the first saccade can be based on the remembered location of the first target presented, which can be derived from the retinotopic trace of the first stimuli, as well as the contribution from the surroundings, that is: the remembered relative location of the first target with respect to the screen border along the horizontal meridian (i.e. allocentric cues).

      A similar logic could be applied to the second saccade. If the second saccade were based only on the retinotopic trace, without updating, then it would go up and 45 deg to the right, based on the example shown in Figure 1. With appropriate updating, the second saccade would go straight up. However, if saccades were not performed in total darkness, then the location of the second target could also be derived from its relationship with the surroundings (for example, the remembered distance from screen borders, i.e. allocentric cues).

      If saccades were not performed in total darkness, the results shown in Figures 2 and 3 could then be related to i) differences in motor updating between AQ score groups; ii) differences in the use of allocentric cues between AQ score groups; iii) a combination of i) and ii). I believe this is a point worth mentioning in the discussion.

      Thank you for raising the important issue of visual references in the double-step saccade task. Participants performed saccades in a dimly lit room where visual references, i.e. the screen borders, were barely visible. At the time we collected the data, a laboratory that allowed performing experiments in complete darkness was not at our disposal. We acknowledge the possibility that participants could have memorized the target locations relative to the screen borders. The bias of high AQ participants could then be attributed to differences in either encoding, memorization or decoding of the target location relative to the screen borders. However, the potentially abnormal use of visual references must reflect an altered remapping process since we did not find differences in saccade landing in the vertical dimension. A t-test between our two groups of participants revealed a difference in saccade accuracy for the x dimension (p = 0.03) but not in the y dimension (p = 0.88). We thus agree that in addition to an altered efference copy signal in high AQ participants, altered use of visual references might also affect their saccadic remapping.

      In Discussion we now write: “Our findings suggest that a general memory deficit is unlikely to fully explain the observed bias in high-AQ participants' second saccades. As highlighted in Figure 3A, the bias was specific to the horizontal dimension, weakening the argument for a global memory issue affecting both vertical and horizontal encoding of target location. However, it's important to acknowledge that even under non-darkness conditions, participants might rely on a combination of internal updating based on the initial target location and visual cues from the environment, such as screen borders. This potential use of visual references could contribute to the observed bias in the high-AQ group. If high-AQ participants differed in their reliance on visual cues compared to the low-AQ group, it could explain the specific pattern of altered remapping observed in the horizontal dimension. This possibility aligns with our argument for an abnormal remapping process underlying the results. While altered efference copy signals remain a strong candidate, the potential influence of visual cues on remapping in this population warrants further investigation. Future studies could incorporate a darkness condition to isolate the effects of internal updating on the first saccade, and systematically manipulate the availability of visual cues throughout the task. This would allow for a more nuanced understanding of how internal updating and visual reference use interact in the double-step paradigm, particularly for individuals with varying AQ scores.”

      The authors write:

      "According to theories of saccadic suppression, an efference copy is necessary to predict the occurrence of a saccade."

      I would also refer to alternative accounts, where saccadic suppression appears to arise as early as the retina, due to the interaction between the visual shift introduced by the eye movement, and the retinal signal associated with the probe used to measure saccadic suppression. This could potentially account for the scaling of saccadic suppression magnitude with saccade amplitude.

      Idrees, S., Baumann, M.P., Franke, F., Münch, T.A. and Hafed, Z.M., 2020. Perceptual saccadic suppression starts in the retina. Nature communications, 11(1), p.1977. 

      We thank the reviewer. Now on page 4 of Introduction we write:

      “Some theories consider saccadic omission and saccadic suppression as resulting from an active mechanism. In this view an efference copy would signal the occurrence of a saccade, yielding a transient decrease in visual sensitivity20-22. Others however have pointed out the possibility that a purely passive mechanism suffices to induce saccadic omission23. A recent study has found evidence for saccadic suppression already in the retina. Idrees et al.24 demonstrated that retinal ganglion cells in isolated retinae of mice and pigs respond to saccade-like displacements, leading to the suppression of responses to additional flashed visual stimuli through visually triggered retinal-circuit mechanisms. Importantly, their findings suggest that perisaccadic modulations of contrast sensitivity may have a purely visual origin, challenging the need for an efference copy in the early stages of saccadic suppression. However, the suppression they measured lasted much longer than time-courses observed in behavioral data. An efference copy signal could thus be necessary to release perception from suppression.”

      Reviewer #3 (Public Review): 

      Summary:

      This work examined efference copy related to eye movements in healthy adults who have high autistic traits. Efference copies allow the brain to make predictions about sensory outcomes of self-generated actions, and thus serve important roles in motor planning and maintaining visual stability. Consequently, disrupted efference copies have been posited as a potential mechanism underlying motor and sensory symptoms in psychopathology such as Autism Spectrum Disorder (ASD), but so far very few studies have directly investigated this theory. Therefore, this study makes an important contribution as an attempt to fill in this knowledge gap. The authors conducted two eye-tracking experiments examining the accuracy of motor planning and visual perception following a saccade and found that participants with high autistic traits exhibited worse task performance (i.e., less accurate second saccade and biased perception of object displacement), consistent with their hypothesis of less impact of efference copies on motor and visual updating. Moreover, the motor and visual biases are positively correlated, indicative of a common underlying mechanism. These findings are promising and can have important implications for clinical intervention if they can be replicated in a clinical sample.

      Strengths:

      The authors utilized well-established and rigorously designed experiments and sound analytic methods. This enables easy translations between similar work in non-human primates and humans and readily points to potential candidates for underlying neural circuits that could be further examined in follow-up studies (e.g., superior colliculus, frontal eye fields, mediodorsal thalamus). The finding of no association between initial saccade accuracy and level of autistic trait in both experiments also serves as an important control analysis and increases one's confidence in the conclusion that the observed differences in task performance were indeed due to disrupted efference copies, not confounding factors such as basic visual/motor deficits or issues with working memory. The strong correlation between the observed motor and visual biases further strengthens the claim that the findings from both experiments may be explained by the same underlying mechanism - disrupted efference copies. Lastly, the authors also presented a thoughtful and detailed mechanistic theory of how efference copy impairment may lead to ASD symptomatology, which can serve as a nice framework for more research into the role of efference copies in ASD.

      Weaknesses:

      Although the paper has a lot of strengths, the main weakness of the paper is that a direct link with ASD symptoms (i.e., sensory overload and motor inflexibility as the authors suggested) cannot be established. First of all, the participants are all healthy adults who do not meet the clinical criteria for an ASD diagnosis. Although they could be considered a part of the broader autism phenotype, the results cannot be easily generalized to the clinical population without further research. Secondly, the measure used to quantify the level of autistic traits, Autistic Quotient (AQ), does not actually capture any sensory or motor symptoms of ASD. Therefore, it is unknown whether those who scored high on AQ in this study experienced high, or even any, sensory or motor difficulties. In other words, more evidence is needed to demonstrate a direct link between disrupted efference copies and sensory/motor symptoms in ASD.

      This is a valid point, and we thank the reviewer for raising it. Moving forward, complementing AQ evaluations in the general population with other questionnaires, such as those assessing camouflaging abilities (Hull, L., Mandy, W., Lai, MC., et al., 2019) or motor skills in everyday situations (MOSES-test, Hillus J, Moseley R, Roepke S, Mohr B. 2019), becomes crucial for a comprehensive understanding of autistic traits.

      We now address this point in Discussion page 9.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments

      - The pothole example in the introduction was really hard to follow. I wonder if there is a better example. 

      We now use a different example. On page 3 of the Introduction, we now write:

      “During a tennis game, rapid oculomotor saccades are employed to track the high-velocity ball across the visual display. In the absence of a functional efference copy mechanism, the brain would encounter difficulty in anticipating the precise retinal location of the ball following each saccade. This could result in a transient period of visual disruption as the visual system adjusts to the new eye position. The efference copy, by predicting the forthcoming sensory consequences of the saccade, would bridge this gap and facilitate the maintenance of a continuous and accurate representation of the ball's trajectory.”

      - This is really minor; I would say that saccades are not the most frequent movement that humans perform. Some of the balance-related adjustments and even heartbeats are faster. Maybe just add "voluntary". 

      We thank the reviewer for the suggestion, now added.

      - "Severe consequences" on page 4 is a bit strong. If that were true, there would be pretty severe impairments in eye movement behavior in ASD, which I don't think is the case.

      We agree with the reviewer. We have now removed the term “severe”.

      - The results section would read better if each experiment had a short paragraph reiterating its overall goal and the specific approach each experiment took to achieve that goal. 

      Now on page 5, for the first experiment, we write:

      “We investigated the influence of autistic traits on visual updating during saccadic eye movements using a classic double-step saccade task. This task relies on participants making two consecutive saccades to briefly presented targets. The accuracy of the second saccade serves as an indirect measure of how effectively the participant's brain integrated the execution of the first saccade into their internal representation of visual space. Participants were divided into quartiles based on the severity of their autistic traits, as assessed by the Autistic quotient questionnaire (cite). We hypothesized that individuals with higher autistic traits would exhibit greater difficulty in visual updating compared to those with lower autistic traits. This would be reflected in reduced accuracy of their second saccades in the double-step task. Figure 2C illustrates examples from participants at the extremes of the autistic trait distribution (Autistic quotient = 3, in orange and Autistic quotient = 31, in magenta). As shown, both participants were instructed to make saccades to the locations indicated by two brief target appearances (T1 and T2), as quickly and accurately as possible, following the order of presentation. However, successful execution of the second saccade requires accurate internal compensation for the first saccade, without any visual references or feedback available during the saccade itself.”

      On page 6, for experiment 2, we write:

      “With a trans-saccadic localization task, we explored how autistic traits affect the integration of eye movements into visual perception. Participants were presented with stimuli before and after a single saccade, creating an illusion of apparent motion. We measured the perceived direction of this displacement, which is influenced by how well the participant's brain accounts for the saccadic eye movement. We predicted that individuals with higher autistic traits would show a stronger bias in the perceived displacement direction, suggesting a less accurate integration of the eye movement into their visual perception.”

      - On page 6, the text about "vertical displacement" is confusing. The spatial displacements in this experiment were horizontal? 

      Yes, they were. The spatial displacement is horizontal, but the perceived trajectory (due to the saccade) is vertical. We have now changed “vertical displacement” to “vertical trajectory”.

      - Page 6, grammatical problems in "while we report a slightly slant of the dots trajectory". 

      Thank you. Now fixed.

      - It would be helpful to discuss the apparent motion part of Experiment 2 in the main text. This important part is not made clear. 

      In the Introduction, page 4, we now write:

      “In this paradigm, one stimulus is shown before and another after saccade execution. Together these two stimuli produce the perception of “apparent motion”. If stimuli are placed such that the apparent motion path is orthogonal to the saccade path, then the orientation of the apparent motion path indicates how the saccade vector is integrated into vision. The apparent motion trajectory can only appear vertical if the movement of the eyes is perfectly accounted for, that is the retinotopic displacement is largely compensated, ensuring spatial stability. However, small biases of motion direction – implying under- (or over-) compensation of the eye movement – can indicate relative failures in this stabilization process. In a seminal study, Szinte and Cavanagh 27 found a slight over-compensation of the saccade vector leading to apparent motion slightly tilted against the direction of the saccade. More importantly, when efference copies are not available, i.e. localization occurring at the time of a second saccade in a double step task, a strong saccade under-compensation occurs 28.

      This phenomenon cannot be explained by perisaccadic mislocalization of flashed visual stimuli 29,30, but the two phenomena may be related in that they may both depend upon efference copy information.”

      - Figure 1 could be improved. For example, the text talks about the motor plan, but this is not clearly shown in the figure.

      We have now added the motor plan to the model. Thank you.

      - Figure 2A, the scale is off (the pictures make it look like the horizontal movement was longer than the vertical). 

      Now fixed.

      - Figure 4, it would be helpful if the task was also described in the figure. 

      We thank the reviewer for the comment. We have now modified the figure by also adding the perceptual judgment task.

      - Figure 5A, the y-axis shows p(correct), but that is not what the y-axis shows (the legend makes the same mistake). 

      We apologize; the y-axis shows the proportion of times participants reported the second dot to be further to the right than the first one. We have now changed the figure and the text accordingly.

      - A recent study on motion and eye movement prediction in ASD is very relevant to the work presented here: Park et al. (2021). Atypical visual motion-prediction abilities in autism spectrum disorder. Clinical Psychological Science, 9(5), 944-960.

      Indeed. We now refer to the cited study in Discussion, on page 9.

      Reviewer #2 (Recommendations For The Authors):

      Statistics and plotting.

      I believe some of the reported statistics are not clear. For example, the authors write:

      "Saccade landing positions of participants in the lower quartile (mean degree ± SEM: 10.17 ± 0.50) did not deviate significantly from those in the upper quartile (mean degree ± SEM: 9.65 ± 0.77). This result was also confirmed by a paired sample t-test (t(7) = 0.66; p = 0.66, BF10 = 0.40)"

      Maybe I am missing something, but why use a paired-sample t-test when the upper and lower quartiles constitute different groups of participants? Shouldn't a two-sample t-test be used in this case?

      We apologize for the confusion. It is indeed a two-sample t-test.

      Along the same lines, I do not understand the link between the number of degrees of freedom reported in the t-test (7) and the number of participants reported in the study (41).

      This is also evident when looking at the scatterplot in Figure 3C. How many participants formed the averages and standard errors reported in Figures 3B and 3D? Please clarify.

      I have the same comment(s) also for the visual updating task (and related figures), where 13 degrees of freedom are reported in the t-tests. Please clarify. 

      We thank the reviewer for pointing this out. The number of participants reported in the scatter plots was indeed 42. However, we opted to compare the averages only in the lower and upper quartiles of the AQ distribution to avoid dealing with a median split (which would imply a skewed distribution). Of our sample of participants in Exp 1, 8 fell into the lower quartile of the AQ distribution and 8 into the upper quartile (14 degrees of freedom); in Exp 2, 8 participants fell into the lower and 7 into the upper quartile (13 degrees of freedom).

      We have now corrected the values accordingly.
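
      The degrees of freedom quoted in this response follow from the standard equal-variance two-sample t-test, where df = n1 + n2 - 2. A minimal sketch of that arithmetic is given below; the function names are illustrative and are not taken from the manuscript. For comparison, a paired test on 8 pairs would give 7 degrees of freedom, which matches the t(7) value the reviewer queried.

```python
def two_sample_df(n_lower: int, n_upper: int) -> int:
    """Degrees of freedom of an equal-variance two-sample t-test."""
    return n_lower + n_upper - 2


def paired_df(n_pairs: int) -> int:
    """Degrees of freedom of a paired-sample t-test."""
    return n_pairs - 1


# Group sizes as stated in the authors' response.
df_exp1 = two_sample_df(8, 8)  # Exp 1: lower vs. upper AQ quartile
df_exp2 = two_sample_df(8, 7)  # Exp 2: lower vs. upper AQ quartile

# A paired test on 8 pairs, shown only for comparison with the queried t(7).
df_paired = paired_df(8)

print(df_exp1, df_exp2, df_paired)
```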

      Reviewer #3 (Recommendations For The Authors):

      (1) The language can be a bit misleading (especially the title and abstract) as it wasn't always clear that the participants don't actually have clinical ASD. I'd suggest avoiding using words like "symptom" as that would indicate clinical severity, and using words like "traits/characteristics" instead for more precise language. 

      We apologize for the misleading terminology used. Now fixed.

      (2) In the Intro: "...perfect compensation results in a vertical trajectory, while small biases indicate stabilization issues23-25." This is a bit confusing without knowing the details of the paradigm. Consider clarifying or at least referring to Figure 4. 

      Thank you.

      (3) In the Results: "This result was also confirmed by a paired sample t-test (t(7) = 0.66;..." This is confusing as a two-sample t-test is the appropriate test here. Also, the degree of freedom seems very low - could the authors clarify how many participants are in each subgroup (i.e., low vs. high AQ quartile), for both experiments? 

      Of our sample of participants in Exp 1, 8 fell into the lower quartile of the AQ distribution and 8 into the upper quartile (14 degrees of freedom); in Exp 2, 8 participants fell into the lower and 7 into the upper quartile (13 degrees of freedom).

      (4) In the Methods: Experiment 2: "The first dot could appear randomly above or below gaze level at a fixed horizontal location, halfway between the two fixations (x = 0, y = -5° or +5° depending on the trial). The second dot was then shown orthogonal to the first one at a variable horizontal location (x = 5° ± 2.5°)." This would mean that the position of the 2nd dot relative to the 1st one would be 2.5°-7.5°, but the task description in Results and Figure 5A would suggest the horizontal location of the second dot is x = 0° ± 2.5°. Which one is correct?

      The second option is the correct one. We have now fixed the typo in the Methods section.

      (5) There is another study that examined oculomotor efference copies in children with ASD using a similar trans-saccadic perception task (Yao et al., 2021, Journal of Vision). In that study, they found a correlation between task performance and an ASD motor symptom (repetitive behavior). This seems quite relevant to the authors' hypothesis and discussion. 

      We thank the reviewer for the suggestion. We have now added the mentioned paper to the Discussion.

      (6) Please proofread the entire paper carefully as there were multiple grammatical and spelling errors.

      Thank you.

    1. Reviewer #2 (Public Review):

      Overview

      In this work, Manley and Vaziri investigate the neural basis for variability in the way an animal responds to visual stimuli evoking prey-capture or predator-avoidance decisions. This is an interesting problem and the authors have generated a potentially rich and relevant data set. To do so, the authors deployed Fourier light field microscopy (Flfm) of larval zebrafish, improving upon prior designs and image processing schemes to enable volumetric imaging of calcium signals in the brain at up to 10 Hz. They then examined associations between neural activity and tail movement to identify populations primarily related to the visual stimulus, responsiveness, or turn direction - moreover, they found that the activity of the latter two populations appears to predict upcoming responsiveness or turn direction even before the stimulus is presented. While these findings may be valuable for future more mechanistic studies, issues with resolution, rigor of analysis, clarity of presentation, and depth of connection to the prior literature significantly dampen enthusiasm.

      Imaging

      - Resolution: It is difficult to tell from the displayed images how good the imaging resolution is in the brain. Given scattering and lensing, it is important for data interpretation to have an understanding of how much PSF degrades with depth.

      - Depth: In the methods it is indicated that the imaging depth was 280 microns, but from the images of Figure 1 it appears data was collected only up to 150 microns. This suggests regions like the hypothalamus, which may be important for controlling variation in internal states relevant to the behaviors being studied, were not included.

      - Flfm data processing: It is important for data interpretation that the authors are clearer about how the raw images were processed. The de-noising process specifically needs to be explained in greater detail. What are the characteristics of the noise being removed? How is time-varying signal being distinguished from noise? Please provide a supplemental with images and algorithm specifics for each key step.

      - Merging: It is noted that nearby pixels with a correlation greater than 0.7 were merged. Why was this done? Is this largely due to cross-contamination due to a drop in resolution? How common was this occurrence? What was the distribution of pixel volumes after aggregation? Should we interpret this to mean that a 'neuron' in this data set is really a small cluster of 10-20 neurons? This of course has great bearing on how we think about variability in the response shown later.

      - Bleaching: Please give the time constants used in the fit for assessing bleaching.
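
      To make the merging question concrete, the sketch below shows one generic way nearby pixels whose traces correlate above 0.7 could be grouped. The authors' actual procedure is not described here, so everything in it is an assumption: the function names, the distance cutoff `max_dist`, and the use of union-find to form groups are all hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length traces (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def merge_correlated(positions, traces, max_dist, min_corr=0.7):
    """Group pixels that are both spatially close and highly correlated.

    Hypothetical illustration only: returns a sorted list of index groups.
    """
    parent = list(range(len(traces)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(traces)), 2):
        close = math.dist(positions[i], positions[j]) <= max_dist
        if close and pearson(traces[i], traces[j]) > min_corr:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(traces)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data: pixels 0 and 1 are adjacent and co-vary; pixel 2 is far away.
positions = [(0, 0), (1, 0), (10, 0)]
traces = [[1, 2, 3, 4], [2, 4, 6, 9], [1, 2, 3, 4]]
groups = merge_correlated(positions, traces, max_dist=2.0)
print(groups)
```

      Reporting the distribution of group sizes produced by such a step would indicate how many underlying cells a merged unit is likely to contain.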

      Analysis

      - Slow calcium dynamics: It does not appear that the authors properly account for the slow dynamics of calcium-sensing in their analysis. Nuclear-localized GCaMP6s will likely have a kernel with a multiple-second decay time constant for many of the cells being studied. The value used needs to be given and the authors should account for variability in this kernel time across cell types. Moreover, by not deconvolving their signals, the authors allow for contamination of their signal at any given time with a signal from multiple seconds prior. For example, in Figure 4A (left turns), it appears that much of the activity in the first half of the time-warped stimulus window began before stimulus presentation - without properly accounting for the kernel, we don't know if the stimulus-associated activity reported is really stimulus-associated firing or a mix of stimulus and pre-stimulus firing. This also suggests that in some cases the signals from the prior trial may contaminate the current trial.

      - Partial Least Squares (PLS) regression: The steps taken to identify stimulus coding and noise dimensions are not sufficiently clear. Please provide a mathematical description.

      - No response: It is not clear from the methods description if cases where the animal has no tail response are being lumped with cases where the animal decides to swim forward and thus has a large absolute but small mean tail curvature. These should be treated separately.
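
      The concern about slow calcium dynamics can be illustrated with a toy model. The sketch below assumes a single-exponential indicator kernel with a hypothetical decay constant of 3 s, sampled at 10 Hz; neither value comes from the manuscript. It shows that an event occurring before stimulus onset still dominates the fluorescence at onset, and that a simple noise-free deconvolution recovers the event timing.

```python
import math


def convolve_with_decay(events, tau_s, dt_s):
    """Convolve an event train with a causal exponential kernel exp(-t / tau)."""
    decay = math.exp(-dt_s / tau_s)
    trace, level = [], 0.0
    for e in events:
        level = level * decay + e  # earlier signal decays, new events add
        trace.append(level)
    return trace


def deconvolve(trace, tau_s, dt_s):
    """Invert the recursion above (exact only in the noise-free case)."""
    decay = math.exp(-dt_s / tau_s)
    events, previous = [], 0.0
    for f in trace:
        events.append(f - decay * previous)
        previous = f
    return events


dt = 0.1   # frame interval in seconds (10 Hz imaging)
tau = 3.0  # hypothetical decay time constant in seconds
stim_onset = 20  # frame index of stimulus onset in this toy example

events = [0.0] * 50
events[15] = 1.0  # one event 0.5 s BEFORE stimulus onset, none afterwards

trace = convolve_with_decay(events, tau, dt)
recovered = deconvolve(trace, tau, dt)

# Fluorescence at stimulus onset is still large despite no activity after onset.
print(round(trace[stim_onset], 3))
```

      Real data would additionally require handling noise and cell-type-specific kernels, which this sketch deliberately ignores.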

      Results

      - Behavioral variability: Related to Figure 2, within- and across-subject variability are confounded. Please disambiguate. It may also be informative on a per-fish basis to examine associations between reaction time and body movement.

      - Data presentation clarity: All figure panels need scale bars - for example, in Figure 3A there is no indication of timescale (or time of stimulus presentation). Figure 3I should also show the time series of the w_opt projection.

      - Pixel locations: Given the poor quality of the brain images, it is difficult to tell the location of highlighted pixels relative to brain anatomy. In addition, given that the midbrain consists of much more than the tectum, it is not appropriate to put all highlighted pixels from the midbrain under the category of tectum. To aid in data interpretation and better connect this work with the literature, it is recommended that the authors register their data sets to standard brain atlases and determine if there is any clustering of relevant pixels in regions previously associated with prey-capture or predator-avoidance behavior.

      Interpretation

      - W_opt and e_1 orthogonality: The statement that these two vectors, determined from analysis of the fluorescence data, are orthogonal, actually brings into question the idea that true signal and leading noise vectors in firing-rate state-space are orthogonal. First, the current analysis is confounding signals across different time periods - one could assume linearity all the way through the transformations, but this would only work if earlier sources of activation were being accounted for. Second, the transformation between firing rate and fluorescence is most likely not linear for GCaMP6s in most of the cells recorded. Thus, one would expect a change in the relationship between these vectors as one maps from fluorescence to firing rate.

      - Sources of variability: The authors do not take into account a fairly obvious source of variability in trial-to-trial response - eye position. We know that prey capture responsiveness is dependent on eye position during stimulus (see Figure 4 of PMID: 22203793). We also expect that neurons fairly early in the visual pathway with relatively narrow receptive fields will show variable responses to visual stimuli as the degree of overlap with the receptive field varies with eye movement. There can also be small eye-tracking movements ahead of the decision to engage in prey capture (Figure 1D, PMID: 31591961) that can serve as a drive to initiate movements in a particular direction. Given these possibilities indicating that the behavioral measure of interest is gaze, and the fact that eye movements were apparently monitored, it is surprising that the authors did not include eye movements in the analysis and interpretation of their data.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, Hoops et al. showed that Netrin-1 and UNC5c can guide dopaminergic innervation from nucleus accumbens to cortex during adolescence in rodent models. 

      We showed this with respect to Netrin-1 only. With respect to UNC5c, we showed that the timing of its expression suggests that it may be involved, but did not conduct the UNC5c manipulation experiments necessary to prove it. We state this clearly in the manuscript.

      They found that these dopamine axons project to the prefrontal cortex in a Netrin-1 dependent manner and knocking down Netrin-1 disrupted motor and learning behaviors in mice. 

      We would like to clarify that we did not show that learning or motor behaviors are affected. We showed that inhibitory control, measured in the Go/No-Go task, is altered in adulthood.

      Furthermore, the authors used hamsters, a seasonal model that is affected by the length of daylight, to demonstrate that the guidance of dopamine axons is mediated by the environmental factor such as daytime length and in sex dependent manner. 

      We agree with this characterization of our hamster experiments, but want to emphasize that it is the timing of the adolescent dopamine axon input to the prefrontal cortex that is impacted by daytime length in a sex-dependent manner.

      Regarding the cell type specificity of Netrin-1 expression, the authors began by stating "this question is not the focus of the study and we consider it irrelevant to the main issue we are addressing, which is where in the forebrain regions we examined Netrin-1+ cells are present." This statement contradicts the exact issue regarding the specificity issue I raised.

      We are not sure why the identities of the cell types expressing Netrin-1 are at issue. As a secreted protein, Netrin-1 can be attached to the extracellular cell surface or in the extracellular matrix, where it interacts with its receptors, which are embedded in the cell surfaces of growing axons (Finci et al., 2015; Rajasekharan & Kennedy, 2009). Netrin-1 is expressed by a wide variety of cell types; for example, it is expressed in medium spiny neurons in the striatum of rodents as well as in cholinergic neurons (Shatzmiller et al., 2008). However, we cannot see why showing exactly what type(s) of cells have Netrin-1 on their surfaces, or have secreted it into the matrix, would be at issue for our study.

      They then went on to show the RNAscope data for Netrin-1 in Figure 2, which showed Netrin-1 mRNA was actually expressed quite ubiquitously in anterior cingulate cortex, dorsopeduncular cortex, infralimbic cortex, prelimbic cortex, etc. 

      Figure 2 - this is referring to Author response image 2 of our first response to reviewers.

      We agree that Netrin-1 mRNA is present throughout the forebrain. In particular, its presence in the regions mentioned by Reviewer #1 is a key component of our theory for how dopamine axons grow to the prefrontal cortex in adolescence.

      In addition, contrary to the authors' statement that Netrin-1 is a "secreted protein", the confocal images in Figure 1 in the rebuttal letter actually show Netrin-1 present in "granule-like" organelles inside the cytoplasm of neurons. 

      The rebuttal letter’s Figure 1 is not sufficient to determine the subcellular location of the Netrin-1; however, we agree that it is likely that Netrin-1 is present in the cytoplasm of neurons. Indeed, its presence in vesicles in the cytoplasm is to be expected as this is a common mechanism for cells to secrete proteins into the extracellular space (Glasgow et al., 2018). We are not sure whether Reviewer #1’s “granule-like” organelles are in fact secretory vesicles or not, and we do not think our immunohistochemical images are an appropriate method by which to answer this kind of question. We find, however, that a detailed characterization of the subcellular distribution of Netrin-1 is beyond the scope of our study.

      That Netrin-1 is a secreted protein is well-established in the literature (for example, see Glasgow et al., 2018). The confocal images we provide suggest, but do not prove, that it is likely Netrin-1 is present both extracellularly and intracellularly, which is entirely consistent with its synthesis, secretion, and function. It is also consistent with our methodology and findings. 

      Finally, the authors presented Figure 7 to indicate the location where virus expressing Netrin-1 shRNA might be located. Again, the brain region targeted was quite focal and most likely did not cover all the Netrin-1+ brain regions in Figure 2. 

      Figure 2 - this is referring to Author response image 2 of our first response to reviewers.

      Figure 7 - this is referring to Author response image 4 of our first response to reviewers.

      We agree with Reviewer #1’s characterization of our experiment. We intended to interrupt the Netrin-1 pathway to the prefrontal cortex, like removing a bridge along a road. The Netrin-1 signal remained intact along the dopamine axon’s route before and after the location of the viral injection; however, it was lost at the site of the virus injection. This is like a road remaining intact on either side of a destroyed bridge, but becoming impassable at the location where the bridge was destroyed. We are glad that Reviewer #1 agrees our experimental design achieved the desired outcome (a focal reduction in Netrin-1 expression).

      Collectively, these results raised more questions regarding the specificity of Netrin-1 expression in brain regions that are behaviorally relevant to this study.

      We do not agree with this assessment. Our manipulation of Netrin-1 expression was highly localized and specific, as Reviewer #1 seems to acknowledge. We are not clear on what questions this might raise that would call into question our findings as described in our manuscript. We have now added the following paragraph to our manuscript:  

      “It remains unknown exactly what types of cells are expressing Netrin-1 along the dopamine axon route, and how this expression is regulated to produce the Netrin-1 gradients that guide the dopamine axons. It also remains unclear where the misrouted axons end up in adulthood. Future experiments aimed at addressing these questions will provide further valuable insight into the nature of the “Netrin-1 pathway”. Nonetheless, our results allow us to conclude that Netrin-1 expressing cells “pave the way” for dopamine axons growing to the medial prefrontal cortex.”

      With respect to the effectiveness of Netrin-1 knockdown in the animals in this study, the authors cited data in HEK293 cells (Cuesta et al., 2020. Figure 2a), which did not include any statistics, and previously published in vivo data in a separate, independent study (Cuesta et al., 2020. Figure 2c). They do not provide any data regarding the effectiveness of Netrin-1 knockdown in THIS study.

      Indeed, we understand the concerns of Reviewer 1 here. This issue was discussed at the time all the experiments (both in the current manuscript and in Cuesta et al., (2020)) were conducted, and we decided that it was sufficient to show the virus was capable of knocking down Netrin-1 in vitro and in vivo in the forebrain. These characterization experiments were published in the first manuscript to present results using the virus, which was Cuesta et al., 2020. However, all experiments from both manuscripts were conducted contemporaneously.

      We do not see how repeating the same characterization experiments again is useful. 

      Similar concerns regarding UNC5C knockdown (points #6, #7, and #8) were not adequately addressed.

      There is no UNC5c knockdown in this manuscript. Furthermore, points #6, #7 and #8 do not deal with UNC5c knockdown. Point #6 is regarding the Netrin-1 virus efficacy, which we discuss above. Points #7 and #8 are requesting numerous additional experiments that we feel are worthy of their own manuscripts, and we do not feel that they call into question the findings we present here. Rather, answering points #7 and #8 would further refine our understanding of how dopamine axons grow to the prefrontal cortex beyond our current manuscript.

      In brief, while this study provides a potential role of Netrin-1-UNC5C in target innervation of dopaminergic neurons and its behavioral output in risk-taking, the data lack sufficient evidence to firmly establish the cause-effect relationship.

      We do not claim a cause-effect relationship here or anywhere in the manuscript. Concrete establishment of a cause-effect relationship will require several more manuscripts worth of experiments.

      Reviewer #2 (Public Review):

      In this manuscript, Hoops et al., using two different model systems, identified key developmental changes in Netrin-1 and UNC5C signaling that correspond to behavioral changes and are sensitive to environmental factors that affect the timing of development. They found that Netrin-1 expression is highest in regions of the striatum and cortex where TH+ axons are travelling, and that knocking down Netrin-1 reduces TH+ varicosities in mPFC and reduces impulsive behaviors in a Go-No-Go test. 

      We want to point out that we examined the Netrin-1 expression in the septum rather than the striatum but otherwise feel the above description is accurate.

      Further, they show that the onset of Unc5 expression is sexually dimorphic in mice, and that in Siberian hamsters, environmental effects on development are also sexually dimorphic. This study addresses an important question using approaches that link molecular, circuit and behavioral changes. Understanding developmental trajectories of adolescence, and how they can be impacted by environmental factors, is an understudied area of neuroscience that is highly relevant to understanding the onset of mental health disorders. I appreciated the inclusion of replication cohorts within the study.

      We appreciate Reviewer #2’s comments, which we feel accurately describe our experimental approach and findings, including their limitations.

      Reviewer #3 (Public Review):

      This study from the Flores group aims at understanding neuronal circuit changes during adolescence which is an ill-defined, transitional period involving dramatic changes in behavior and anatomy. They focus on DA innervation of the prefrontal cortex, and their interaction with the guidance cue Netrin1. They propose DA axons in the PFC increase in the postnatal period, and their density is reduced in a Netrin 1 knockdown, suggesting that Netrin abets the development of this mesocortical pathway. 

      We feel it necessary to point out that we are not the first to propose that dopamine axons in the prefrontal cortex increase in the postnatal period. This is well-established and was first documented in rodents in the 1980s (Kalsbeek et al., 1988). Otherwise, we agree with Reviewer 3’s characterization.

      In such mice impulsivity gauged by a go-no go task is reduced. They then provide some evidence that Unc5c is developmentally regulated in DA axons. Finally they use an interesting hamster model, to study the effect of light hours on mesocortical innervation, and make some interesting observations about the timing of innervation and Unc5c expression, and the fact that females housed in winter day length conditions display an accelerated innervation of the prefrontal cortex.

      We agree with Reviewer #3’s characterization of our study and findings here.

      Comments on the revision. Several points were addressed; some remain to be addressed.

      (4) It's not clear to me that TH doesn't stain noradrenergic axons in the PFC. See Islam and Blaess, 2021, and references therein.

      Presuming that Reviewer #3 is referring to Islam et al. (2021), the review they cite supports our position that TH-stained axons in the forebrain are by-and-large dopamine axons.

      Nonetheless, Islam et al. do point out that it is important to keep in mind that TH-positive axons have a slight possibility of being noradrenaline axons. We are very conscious of this possibility and are careful to minimize this risk. As we state in the methods, we only examine axons that are morphologically consistent with dopamine axons and are localized to areas within the forebrain where dopamine axons are known to innervate, in addition to being TH-positive. The localization and morphology of noradrenaline axons in the forebrain differ from those of dopamine axons. This is stated in our methods on lines 76-94, where we describe in detail the differentiation between dopamine and norepinephrine axons and include a full list of relevant citations.

      (6) The Netrin knockdown data provided is from a previous study/samples.

      Indeed; however, the experiments for the two manuscripts were conducted contemporaneously. We believe two sets of validation experiments are not required.

      (8) While the authors make the argument that the behavior is linked to DA, they still haven't formally tested it, in my opinion.

      We agree that we have not formally tested this link. However, we disagree that we claim to have established a formal link in our manuscript.

      (1) Fig. 3: UNC5C levels are not yet quantified. Furthermore, I agree with the previous reviewer that UNC5C knockdown would corroborate key aspects of the model.

      We present UNC5c quantities for mice in our first response to reviewers (Figure 11 therein); however, we did not do so for the hamsters due to the time involved. We are planning further experiments with the hamsters and may include quantification of UNC5c in the nucleus accumbens at that time. However, we do not feel its absence from this manuscript calls into question our findings.

      With regards to the UNC5c knockdown, we agree it would be an informative extension of our findings here, but again we do not feel that it is necessary to corroborate our current findings.

      New - Developmental trajectory of prefrontal TH-positive axons from early adolescence to adulthood is similar in male and female rats, (Willing Juraska et al., 2017). This needs discussion.

      Willing et al. (2017) reported an increase in prefrontal dopamine density during adolescence in male and female rats, with a non-significant trend towards an earlier increase in females.

      This is in line with our current results in mice indicating that the timing of dopamine axon targeting and growth is sex specific. We are currently testing this idea directly using intersectional viral tracing methods. We have now added the following sentence to the manuscript:

      “Differences in the precise timing of dopamine innervation to the PFC in adolescence have been suggested by findings reported in male and female rats (Willing et al., 2017)”.

      References

      Brignani, S., Raj, D. D. A., Schmidt, E. R. E., Düdükcü, Ö., Adolfs, Y., Ruiter, A. A. D., Rybiczka-Tesulov, M., Verhagen, M. G., Meer, C. van der, Broekhoven, M. H., Moreno-Bravo, J. A., Grossouw, L. M., Dumontier, E., Cloutier, J.-F., Chédotal, A., & Pasterkamp, R. J. (2020). Remotely Produced and Axon-Derived Netrin-1 Instructs GABAergic Neuron Migration and Dopaminergic Substantia Nigra Development. Neuron, 107(4), 684-702.e9. https://doi.org/10.1016/j.neuron.2020.05.037

      Cuesta, S., Nouel, D., Reynolds, L. M., Morgunova, A., Torres-Berrio, A., White, A., Hernandez, G., Cooper, H. M., & Flores, C. (2020). Dopamine axon targeting in the nucleus accumbens in adolescence requires Netrin-1. Frontiers in Cell and Developmental Biology, 8. https://doi.org/10.3389/fcell.2020.00487

      Finci, L., Zhang, Y., Meijers, R., & Wang, J. H. (2015). Signaling mechanism of the netrin-1 receptor DCC in axon guidance. Progress in Biophysics and Molecular Biology, 118(3), 153-160. https://doi.org/10.1016/j.pbiomolbio.2015.04.001

      Glasgow, S. D., Labrecque, S., Beamish, I. V., Aufmkolk, S., Gibon, J., Han, D., Harris, S. N., Dufresne, P., Wiseman, P. W., McKinney, R. A., Séguéla, P., Koninck, P. D., Ruthazer, E. S., & Kennedy, T. E. (2018). Activity-Dependent Netrin-1 Secretion Drives Synaptic Insertion of GluA1-Containing AMPA Receptors in the Hippocampus. Cell Reports, 25(1), 168-182.e6. https://doi.org/10.1016/j.celrep.2018.09.028

      Islam, K. U. S., Meli, N., & Blaess, S. (2021). The Development of the Mesoprefrontal Dopaminergic System in Health and Disease. Frontiers in Neural Circuits, 15, 746582. https://doi.org/10.3389/fncir.2021.746582

      Kalsbeek, A., Voorn, P., Buijs, R. M., Pool, C. W., & Uylings, H. B. M. (1988). Development of the Dopaminergic Innervation in the Prefrontal Cortex of the Rat. The Journal of Comparative Neurology, 269(1), 58–72. https://doi.org/10.1002/cne.902690105

      Rajasekharan, S., & Kennedy, T. E. (2009). The netrin protein family. Genome Biology, 10(9), 239. https://doi.org/10.1186/gb-2009-10-9-239

      Shatzmiller, R. A., Goldman, J. S., Simard-Émond, L., Rymar, V., Manitt, C., Sadikot, A. F., & Kennedy, T. E. (2008). Graded expression of netrin-1 by specific neuronal subtypes in the adult mammalian striatum. Neuroscience, 157(3), 621–636. https://doi.org/10.1016/j.neuroscience.2008.09.031

      Willing, J., Cortes, L. R., Brodsky, J. M., Kim, T., & Juraska, J. M. (2017). Innervation of the medial prefrontal cortex by tyrosine hydroxylase immunoreactive fibers during adolescence in male and female rats. Developmental Psychobiology, 59(5), 583–589. https://doi.org/10.1002/dev.21525

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2023-02235

      Corresponding author: Adriano Aguzzi

      1. General Statements

      We thank the reviewers for providing valuable comments. We are pleased that our study is considered important to advance the knowledge on IL-1-independent inflammatory functions of inflammasomes. We have clarified and revised the manuscript (track changed) as detailed below in the point-by-point response in this letter.

      2. Point-by-point description of the revisions

      Referee 1

      General: In this manuscript, et al. investigate the role of the inflammasome adapter ASC in AA amyloidosis. This condition involves the aggregation of serum amyloid A (SAA) and is linked to chronic inflammation. Firstly, I can directly say that I do recommend this study for publication. This is a well-conducted and well-written study which advances the knowledge on IL-1-independent inflammatory functions of inflammasomes. Furthermore, I find it particularly impressive that, although the inflammasome research community is well aware that amyloidosis is a hallmark of inflammatory diseases, it took a neuroscientist specialized in prion diseases to raise the question of whether ASC would be involved in seeding serum AA aggregation. Key findings include:

      - ASC forms extracellular aggregates that enhance SAA aggregation, as observed through superresolution microscopy.

      - In a mouse model, the absence of ASC significantly reduced amyloid load, not due to increased phagocytosis but likely due to diminished aggregation.

      - Treatment with anti-ASC antibodies reduced amyloid load and mitigated weight loss in mice with AA amyloidosis.

      These findings suggest that ASC plays a crucial role in AA amyloidosis and that targeting ASC could be a potential therapeutic strategy. The study expands our understanding of the involvement of ASC in proteinopathies beyond neural diseases, pointing to its role in systemic conditions like AA amyloidosis.

      Significance: In conclusion, this manuscript offers valuable insights into the role of ASC in AA amyloidosis, presenting compelling findings that support its potential as a therapeutic target. Addressing the mentioned concerns and making the suggested revisions will further enhance the manuscript's scientific rigor and impact. Overall, this study is a valuable contribution to the field of inflammasome research and its relevance in systemic conditions like AA amyloidosis.

      Comment 1: Overall, the experiments are well-conducted and mostly all controls I would expect were included. With few exceptions, the data is convincing. With that said, I have issues with some of the staining employed in Fig 1. In Fig. 1, the authors assess ASC staining in cardiac tissues from a patient with vasculitis and systemic inflammation-related AA amyloidosis, and a control patient who died of a heart attack but had no signs of amyloidosis. However, most of the data shown is related to the AL177 anti-ASC. More importantly, no isotype stainings are included. We have previously demonstrated that the AL177 anti-ASC, used here, reacts quite strongly with ASC−/− cells, and it is one of the less specific anti-ASC commercially available (PMID: 27221487). As this is data from one patient (understandably), I wonder if the authors could counterstain ASC in the same samples using a specific human anti-ASC with a different color (ex: Biolegend HASC), and confirm that the signal overlays with the AL-177.

      Response: We conducted additional experiments to address the anti-ASC antibody specificity, as now described in Results, Methods, and Fig. S1. We tested a set of anti-ASC antibodies (AL177, MY6745, 1C3D7) for their ASC specificity. We confirmed that both the AL177 and the MY6745 antibodies have high ASC-specificity (Fig. S1A). Moreover, for illustration purposes (and to warn other scientists), we included a third anti-ASC antibody (1C3D7) found to be unspecific, as it yielded a strong signal in PYCARD-/- (ASC-/-) THP-1 cells (Fig. S1B). In addition, isotype controls were included in these experiments (Fig. S1A, right panels), as suggested by the reviewer. They showed no target protein detection in either PYCARD+/+ (ASC+/+) or PYCARD-/- cells, underscoring the anti-ASC specificity of the AL177 and MY6745 antibodies.

      Comment 2: Finally, in Figure 1H it seems from the description that another anti-ASC was used: "referred in the legend as ASC (MAB ASC, Yellow)". Is this a monoclonal anti-ASC? Also, the images show large and bright antibody aggregates (middle of the image, top left corner behind the "H", and a massive fluorescence in the bottom right of the image), indicating the presence of staining artifacts. Again, no counterstaining with isotype controls is shown.

      Response: We apologize for the confusing jargon in Figure 1H. “MAB ASC” refers to the anti-ASCPYD antibody (MAB/MY6745). We have corrected the antibody terminology in the legend. MAB/MY6745 is a monoclonal antibody generated by Mabylon that is highly reactive to both human and murine ASC. This antibody was generated 1) to perform an in vivo immunotherapy study and 2) to serve as an alternative specific antibody, in addition to AL177, for showing co-localization of SAA and ASC in a human AA patient using STED superresolution microscopy. MAB/MY6745 is a rabbit monoclonal anti-ASC antibody targeting the pyrin domain (PYD), in which the rabbit Fcγ domain was replaced with a mouse IgG2a domain to avoid xenogeneic anti-drug responses in recipients and to improve its effector functions in vivo. To examine possible staining artefacts, which can occur with formalin-fixed paraffin-embedded (FFPE) human tissues, we assessed the specificity of a variety of anti-ASC antibodies (Fig. S1). Our data presented in Fig. S1 show that the monoclonal anti-ASC antibody binds specifically. It is conceivable that AL177 and MAB/MY6745 target different epitopes of ASC, resulting in different staining patterns. An isotype control, included in Fig. S1, was used to test the specificity of the secondary antibodies and did not show any nonspecific staining. We have added this information to the main text and figure legend accordingly.

      Comment 3: Overall, although I don't dispute the possibility that ASC would co-localize with SAA deposits, I don't think the data presented can safely sustain that claim. I would, therefore, suggest that alternative methods to be employed to substantiate these conclusions: Supposedly, would it be possible to immuno-precipitate (IP) amyloid SAA and assess ASC via western blotting? As well as IP ASC and detect SAA? Or use DSS-crosslinking to find ASC oligomers in tissue areas rich in SAA?

      Response: In addition to assessing co-localization by means of STED superresolution microscopy (Fig. 1), we also employed LiP-MS with various forms of ASC (monomeric and ASC specks) and identified a previously unrecognized biophysical interaction of SAA and the ASC PYD domain (Fig. 2C-F). As an orthogonal line of evidence, we provided kinetic data showing that SAA aggregation is enhanced in the presence of ASC specks (Fig. 2A-B). We feel that these results are reasonably convincing, but we agree that co-localization is almost invariably an aspirational finding, and even superresolution microscopy cannot fully exclude the presence of artifacts (nor can, in fairness, co-immunoprecipitation, which must often rely on overexpression). A sentence acknowledging this limitation was added to the Discussion.

      Comment 4: For example, it would be reasonable to quantify the results in Figure 3G and providing clarification regarding the controls in the figure legend. Though there is significantly less SAA in spleen homogenates from Asc−/−, there also seems to be the case for b-actin in Fig 3G. Moreover, in the figure legend the authors state: "...Spleen homogenate from untreated (-ctrl) and AA+ (+ctrl) C57BL/6 wt mice from an independent experiment served as negative and positive control, respectively." I don't know what the authors mean with that. Is this a montage, or samples from different experiments were run together in one blot? And if so, for what reason? This is confusing and should be clarified.

      Response: We reworded the figure legend to provide clarity about the technical assay controls and adjusted the labels in Fig. 3E accordingly: To ascertain SAA antibody functionality, mouse spleen homogenate from independently obtained and Congo red-confirmed AA+ tissue served as the positive technical control, whereas non-induced (AA-) spleen tissue served as the negative technical control (Fig. 3E). We decided to show the two technical controls (positive/AA+ and negative/AA-) in Fig. 3E.

      Comment 5: Furthermore, in the Abstract, a slight rephrasing is suggested to accurately describe ASC specks as molecular aggregates formed inside cells, which are subsequently released into the extracellular space.

      Response: We thank the referee for bringing this to our attention. We rephrased the abstract accordingly.

      Comment 6: Lastly, enhancing the text size in figures, particularly in Fig 3, is advised to improve legibility and overall clarity.

      Response: We have adjusted the text sizes and style in main Fig. 3 to improve legibility and have performed additional figure formatting.

      Referee 2

      General: The manuscript by Losa et al. investigates whether ASC is involved in serum AA amyloidosis. The authors report that ASC colocalizes with SAA in human AA amyloidosis and that purified ASC specks accelerate SAA fibril formation in vitro. In addition, splenic AA amyloid was decreased in Pycard-/- mice compared to Pycard+/+ mice, and treatment with anti-ASC antibodies decreased amyloid loads in Pycard+/+ mice. Lastly, they analyzed serum of 19,334 patients to show that the prevalence of anti-ASC antibodies did not correlate with any specific disease. The authors conclude that ASC plays a role in extraneural proteinopathies of humans and experimental animals and suggest that anti-ASC immunotherapy may contribute to resolving such diseases. The findings in the study are novel and demonstrate a new role for ASC in aggregation proteinopathies. However, there are a number of issues that need to be addressed before acceptance for publication.

      Significance: The findings in the study are novel and demonstrate a new role for ASC in aggregation proteinopathies. This study reports a crucial role for ASC in SAA interaction and recruitment, SAA serum level modulation, SAA fibril formation acceleration, and controlling the extent of inflammation-associated amyloidosis with respect to AA amyloid deposition.

      Comment 1: Figure 3 E depicts Western blots of monomeric SAA in spleen of Pycard+/+ and Pycard-/- mice. The authors should include immunoblots depicting the levels of ASC in these tissues and to demonstrate that the Pycard-/- mice lack ASC.

      Response: We did not perform ASC immunoblots for Pycard-/- and Pycard+/+ mice since the absence of the ASC protein in this well-established mouse line has been demonstrated in several key publications, including under inflammation conditions (right side of the figure below, from Mariathasan et al., Nature, 2014). However, we show ASC IHC of Pycard+/+ and Pycard-/- AA+ mice on spleen, confirming the absence of an ASC signal in Pycard-/- mice and its presence in the Pycard+/+ (Fig. 3F). Moreover, our genotyping data confirmed the presence and absence of the Pycard gene in Pycard+/+ and Pycard-/- AA+ mice.

      Comment 2: Fig. 3B shows that at 96 hours after injection there was no difference in SAA serum concentration. How do the authors explain this drop in SAA serum concentration? No explanation is provided.

      Response: The acute-phase response peaks at 24 hours after injury (e.g., Kushner I, 1982; Gabay and Kushner, 1999; Gitlin and Colten, 1987, Calif.: Academic Press, 1987:123-53). Beyond 24 hours, acute-phase proteins decay over time, mirroring the process of tissue integrity restoration and the clearance of the insulting stimuli. This is in line with our data, where the inflammatory injury was induced by subcutaneous AgNO3 injection, resulting in a statistically non-significant difference in serum SAA between the Pycard+/+ and Pycard-/- experimental mice at 96 hours post AgNO3 injection. In addition, the majority of SAA in Pycard+/+ mice was incorporated into amyloid deposits. As suggested by the reviewer, we have included this explanation and the references in the revised manuscript.

      Comment 3: Figure 4 shows anti-ASC administration reduces amyloid load. The immunoblot in Figure 4C does not represent the quantification of the blot. In fact, there are only 3 samples per treatment group whereas the quantification shows 5-6 animals per group.

      Response: We ran two independent immunoblots at the same time as technical replicates (duplicates). As pointed out by the reviewer, this resulted in 6 samples and data points that were visualized and analyzed in main Fig. 4C. To avoid duplicating data and overloading the main figures with technical replicates, we opted to show only one representative immunoblot in main Fig. 4C. The other blots are shown in supplementary Fig. S13A and Fig. S13B for full transparency.

      Comment 4: Additionally, the authors have not shown that the drug penetrates the target tissue and how much drug is present in spleen to provide a therapeutic effect. What is the half-life of the drug? These parameters are critical to assess the MOA of the anti-ASC used in these studies.

      Response: To assess the pharmacokinetics of the anti-ASC antibody, we determined its titers in serum by ELISA at various time points up to 96 hpi after the first injection. The anti-ASC antibody serum levels peaked at 24 hpi and declined to about half maximal serum concentration levels at 96 hpi. This serum half-life, for the injected concentration, is in the range of reported kinetic parameters of engineered monoclonal antibodies (e.g., Unverdorben et al., MAbs, 2016; Foss et al., Nat Comm, 2024) (Fig. 4B). Because of the high permeability of splenic red pulp vasculatures, and because of the absence of any selectively permeable barrier, efficacious imbibement of the splenic extracellular space can be plausibly expected. Theoretically, one could perfuse mice intracardially with PBS and then measure antibody in tissue. Such measurements can work relatively well in the brain, which possesses a highly impermeable barrier. However, here we would find it difficult to convince ourselves that such measurements would not be contaminated by residual blood in splenic capillaries that may be difficult to clean up through perfusion. Therefore, we did not measure the antibody levels in the spleen.

      Comment 5: The authors should expand the discussion section to include the work of other groups that have successfully employed anti-ASC antibodies. For example, PMID: 35793783, PMID: 32366256

      Response: We thank the referee for pointing out this literature. We extended the discussion section accordingly and added these important references.

      Comment 6: Methods: The authors provide the number of animals employed in the Supplemental Tables 5 and 7. These numbers should be provided in the methods section or in the Figure legends. Additionally, how many replicates were performed for the data in Figure 2?

      Response: As suggested by the reviewer, we now provide the number of animals in the figure legends of main Fig. 2 and Fig. 3, in addition to those in Table 5 and Supp Table 7, to enhance clarity.

      Referee 3

      General: The manuscript by Losa et al. explores the co-aggregation of ASC with serum amyloid A (SAA) in vivo and in mouse models. It posits that, similar to Amyloid beta, SAA is cross-seeded by ASC foci both in vitro and in vivo. This review only addresses the co-localization and in vitro cross-seeding data (Figs. 1 and 2A, B), not the mouse experiments or mass spectrometry data. The manuscript first shows co-deposition of ASC with SAA amyloid. SAA was stained both with Congo red and ThS, both standard dyes for amyloid staining. Figure S2 shows CR birefringence, the hallmark of amyloid deposits. The authors then move to demonstrate co-localization of SAA and ASC in confocal and STED immuno-fluorescence microscopy.

      Significance: The discovery of the role of ASC in Alzheimer's disease generated an exciting new hypothesis to the etiology of sporadic AD, for which the cause is unknown. The current manuscript finds that ASC may also play a role in AA amyloidosis, which is a significant finding.

      Comment 1: Confocal images C-E show overlapping staining of markers for both SAA and ASC. Similarly, STED images show co-aggregation of ASC and SAA in amyloidosis patients. However, since confocal images F and G seem to show overlapping staining of the yellow and magenta channels as well, a careful quantitative analysis of the data is needed. Quantify co-localization (Pearson coefficient) in confocal and STED images. STED images from control patients are missing and need to be included.

      Response: AA amyloidosis is a relatively rare disease, and tissue samples thereof are even rarer. We only had access to the samples of one patient in both control and SAA groups. This limitation prevented us from conducting quantitative analyses. Rather than looking at the Pearson – or, possibly better, Spearman – correlation coefficient, we opted for an unbiased method of correlation in which we reconstructed the picture using 3D surface rendering with the Imaris software (see Fig. 1). From this reconstruction, we exported the barycenter of each surface on a 3D plot for both SAA and ASC markers (see Fig. S2B-C). Each point represents the center of a surface, while the box plots on the sides represent the distribution of the markers in space, demonstrating the overlap of the markers for ASC and SAA. We also understand the suggestion to conduct STED imaging on control samples to show the absence of co-aggregation. However, we could not be sure of which region to capture and how to decide on the focus, as we did not detect strong signal from confocal images of the control sample. Imaging blindly would almost necessarily lead to irrelevant imaging and aberrant comparison. We do not claim any quantitative data out of these images; however, we report an observation. Quantitative and mechanistic co-aggregation data are presented in Fig. 2 using LiP-MS.

      Comment 2: The authors then move on to demonstrate that ASC foci can cross-seed SAA amyloid formation in vitro, by recording SAA aggregation kinetics in the presence and absence of ASC foci. Curves recorded in the presence of ASC foci have accelerated kinetics, as shown by a decrease in the time to reach half-maximal fluorescence (t1/2). However, these data (Fig 2A, B) are not very clean. Only three data points out of the five curves shown in panel A are presented in the fitting of the control (yellow) aggregation kinetics in panel B. Why was this done? Panel B shows a significant difference between the control and the kinetics seeded with ASC specks. It looks doubtful that the results are still statistically significant if these data are included, so their exclusion impacts the overall conclusion of the paper. The significance of the cross-seeding results needs to be substantiated experimentally.

      Response: The in vitro SAA aggregation assay was performed under established conditions (Claus S et al., EMBO Rep 2017) and the resulting data was processed using the AmyloFit software from the Knowles lab in Cambridge, UK (Meisl G et al., Nat Protoc 2016). The AmyloFit technology uses global fitting resulting in high-accuracy kinetics. Given the software algorithm, only curves that show a sigmoidal ThT fluorescence signal over time can be fitted. Therefore, replicates that do not show aggregation (characteristic ThT signal) over time cannot be fitted. As a result, only three out of six curves could be fitted, resulting in three t1/2 values. Conversely, in the presence of ASC specks, all six replicates aggregated in a dose-dependent manner, and could be fitted perfectly, yielding six t1/2 values. Thus, all available data points are plotted and used for statistical analysis. Moreover, the fact that in the presence of ASC specks all SAA replicates aggregated/converted successfully in a dose-dependent manner (whereas in the SAA-only condition some replicates do not aggregate) further underscores the pivotal role of ASC specks in SAA seeding, conversion, and aggregation enhancement.
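
      As a rough illustration of why a replicate that does not aggregate yields no t1/2 value, the sketch below estimates a half-time from a single trace by linear interpolation. It is a simplified stand-in and not the global fitting performed by AmyloFit; the function names, the amplitude threshold, and the synthetic curves are all assumptions made for illustration.

```python
import math


def half_time(times, signal, min_amplitude=0.5):
    """Estimate t1/2 as the first time the trace crosses the midpoint
    between its starting baseline and its maximum.

    Returns None when the trace never rises by at least `min_amplitude`
    (a hypothetical cutoff), i.e. when no aggregation is detected.
    """
    baseline = signal[0]
    plateau = max(signal)
    amplitude = plateau - baseline
    if amplitude < min_amplitude:
        return None  # flat trace: there is no half-maximal point to report
    midpoint = baseline + amplitude / 2.0
    for i in range(1, len(signal)):
        lo, hi = signal[i - 1], signal[i]
        if lo < midpoint <= hi:
            # Linear interpolation between the two bracketing samples.
            frac = (midpoint - lo) / (hi - lo)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None


def logistic(t, t_half, rate):
    """Idealised sigmoidal fluorescence curve, normalised to [0, 1]."""
    return 1.0 / (1.0 + math.exp(-rate * (t - t_half)))


# Synthetic example data (hours on the x-axis; values are made up).
times = [float(h) for h in range(0, 49)]
aggregating = [logistic(t, 20.0, 0.5) for t in times]
flat = [0.01 for _ in times]

print(half_time(times, aggregating))  # close to 20 for this synthetic curve
print(half_time(times, flat))         # None: no sigmoidal rise to evaluate
```

      In this toy setting the flat trace contributes no half-time at all, which mirrors the situation described above where only the replicates that aggregated could enter the t1/2 comparison.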

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this potentially useful study, the authors attempt to use comparative meta-analysis to advance our understanding of life history evolution. Unfortunately, both the meta-analysis and the theoretical model are inadequate and proper statistical and mechanistic descriptions of the simulations are lacking. Specifically, the interpretation overlooks the effect of well-characterised complexities in the relationship between clutch size and fitness in birds.

      Public Reviews:

      We would like to thank the reviewers for their helpful comments, which have been considered carefully and have been valuable in progressing our manuscript. The following bullet points summarise the key points and our responses, though our detailed responses to specific comments can be found below:

      - Two reviewers commented that our data was not made available. Our data was provided upon submission and during the review process but was not made accessible to the reviewers. Our data and code are available at https://doi.org/10.5061/dryad.q83bk3jnk.

      - The reviewers have highlighted that some of our methodology was unclear and we have added all the requested detail to ensure our methods can be easily understood.

      - The reviewers highlight the importance of our conclusions, but also suggest some interpretations might be missing and/or are incomplete. To make clear how we objectively interpreted our data and the wider consequences for life-history theory we provide a decision tree (Figure 5). This figure makes clear where we think the boundaries are in our interpretation and how multiple lines of evidence converge to the same conclusions.

      Reviewer #1 (Public Review):

      This paper falls in a long tradition of studies on the costs of reproduction in birds and its contribution to understanding individual variation in life histories. Unfortunately, the meta-analyses only confirm what we know already, and the simulations based on the outcome of the meta-analysis have shortcomings that prevent the inferences on optimal clutch size, in contrast to the claims made in the paper.

      There was no information that I could find on the effect sizes used in the meta-analyses other than a figure listing the species included. In fact, there is more information on studies that were not included. This made it impossible to evaluate the data-set. This is a serious omission, because it is not uncommon for there to be serious errors in meta-analysis data sets. Moreover, in the long run the main contribution of a meta-analysis is to build a data set that can be included in further studies.

      It is disappointing that two referees comment on data availability, as we supplied a link to our full dataset and the code we used in Dryad with our submitted manuscript. We were also asked to supply our data during the review process and we again supplied a link to our dataset and code, along with a folder containing the data and code itself. We received confirmation that the reviewers had been given our data and code. We support open science and it was our intention that our dataset should be fully available to reviewers and readers. Our data and code are at https://doi.org/10.5061/dryad.q83bk3jnk.

      The main finding of the meta-analysis of the brood size manipulation studies is that the survival costs of enlarging brood size are modest, as previously reported by Santos & Nakagawa on what I suspect to be mostly the same data set.

      We disagree that the main finding of our paper is the small survival cost of manipulated brood size. The major finding of the paper, in our opinion, is that the effect sizes for experimental and observational studies are in opposite directions, therefore providing the first quantitative evidence to support the influential theoretical framework put forward by van Noordwijk and de Jong (1986), that individuals differ in their optimal clutch size and are constrained to reproducing at this level due to a trade-off with survival. We further show that while the manipulation experiments have been widely accepted to be informative, they are not in fact an effective test of whether within-species variation in clutch size is the result of a trade-off between reproduction and survival.

      The comment that we are reporting the same finding as Santos & Nakagawa (2012) is a misrepresentation of both that study and our own. Santos & Nakagawa found an effect of parental effort on survival only in males who had their clutch size increased – but no effect for males who had their clutch size reduced and no survival effect on females for either increasing or reducing parental effort. However, we found an overall reduction in survival for birds who had brood sizes manipulated to be larger than their original brood (for both sexes and mixed sex studies combined). In our supplementary information, we demonstrate that the overall survival effect of a change in reproductive effort is close to zero for males, negative (though non-significant) for females and significantly negative for mixed sexes (which are not included in the Santos & Nakagawa study). Please also note that the Santos & Nakagawa study was conducted over 10 years ago. This means we added additional data (L364-365). Furthermore, meta-analyses are an evolving practice and we also corrected and improved on the overall analysis approach (e.g. L358-359 and L393-397, and see detailed SI).

      The paper does a very poor job of critically discussing whether we should take this at face value or whether instead there may be short-comings in the general experimental approach. A major reason why survival cost estimates are barely significantly different from zero may well be that parents do not fully adjust their parental effort to the manipulated brood size, either because of time/energy constraints, because it is too costly and therefore not optimal, or because parents do not register increased offspring needs. Whatever the reason, as a consequence, there is usually a strong effect of brood size manipulation on offspring growth and thereby presumably their fitness prospects. In the simulations (Fig.4), the consequences of the survival costs of reproduction for optimal clutch size were investigated without considering brood size manipulation effects on the offspring. Effects on offspring are briefly acknowledged in the discussion, but otherwise ignored. Assuming that the survival costs of reproduction are indeed difficult to discern because the offspring bear the brunt of the increase in brood size, a simulation that ignores the latter effect is unlikely to yield any insight in optimal clutch size. It is not clear therefore what we learn from these calculations.

      The reviewer’s comment is somewhat of a paradox. We take the best studied example of the trade-off between reproductive effort and parental survival – a key theme in life history and the biology of ageing – and subject this to a meta-analysis. The reviewer suggests we should interpret our finding as if there must be something wrong with the method or studies we included, rather than considering that the original hypothesis could be false or inflated in importance. We do not consider questioning the premise of the data over questioning a favoured hypothesis to necessarily be the best scientific approach here. In many places in our manuscript, we question and address, at length, the underlying data and their interpretation (L116-117, L165-167, L202-204 and L277-282). Moreover, we make it clear that we focus on the trade-off between current reproductive effort and subsequent parental survival, while being aware that other trade-offs could counter-balance or explain our findings (discussed on L208-210 & L301-316). Note that it is also problematic, when you do not find the expected response, to search for an alternative that has not been measured. In the case here, of potential trade-offs, there are endless possibilities of where a trade-off might operate between traits. We purposefully focus on the one well-studied and most commonly invoked trade-off. We clearly acknowledge, though, that when all possible trade-offs are taken into account a trade-off on the fitness level can occur and cite two famous studies (Daan et al., 1990 and Verhulst & Tinbergen 1991) that have shown just that (L314-316).

      So whilst we agree with the reviewer that the offspring may incur costs themselves, rather than costs being incurred by the parents, the aim of our study was to test for a general trend across species in the survival costs of reproductive effort. It is unrealistic to suggest that incorporating offspring growth into our simulations would add insight, as a change in offspring number rarely affects all offspring in the nest equally and there can even be quite stark differences; for example, this will be most evident in species that produce sacrificial offspring. This effect will be further confounded by catch-up growth, for example, and so it is likely that increased sibling competition from added chicks alters offspring growth trajectories, rather than absolute growth as the reviewer suggests. There are mixed results in the literature on the effect of altering clutch size on offspring survival, with an increased clutch size through manipulation often increasing the number of recruits from a nest.

      What we do appreciate from the reviewer’s comment is that the interpretation of our findings is complex. Even though our in-text explanation includes the caveats the reviewer refers to, and discusses them at length, their inter-relationships are hard to appreciate in a text format. To improve the presentation and for the ease of the reader, we have added a decision tree (Figure 5), which represents the logical flow from the hypothesis being tested through to the overall conclusion that can be drawn from our results. We believe this clarifies what conclusions can be drawn from our results. We emphasise again that the theory that trade-offs between reproductive effort and parental survival are the major driver of variation in offspring production was not supported, even though it is the one that practitioners in the field would be most likely to invoke, and our result is important for this reason.

      There are other reasons why brood size manipulations may not reveal the costs of reproduction animals would incur when opting for a larger brood size than they produced spontaneously themselves. Firstly, the manipulations do not affect the effort incurred in laying eggs (which also biases your comparison with natural variation in clutch size). Secondly, the studies by Boonekamp et al on Jackdaws found that while there was no effect of brood size manipulation on parental survival after one year of manipulation, there was a strong effect when the same individuals were manipulated in the same direction in multiple years. This could be taken to mean that costs are not immediate but delayed, explaining why single year manipulations generally show little effect on survival. It would also mean that most estimates of the fitness costs of manipulated brood size are not fit for purpose, because typically restricted to survival over a single year.

      First, our results did show a survival cost of reproduction for brood manipulations (L107-123, Figure 1, Table 1). Note, however, that much theory is built on the immediate costs of reproduction and, as such, these costs are likely overinterpreted, meaning that our overall interpretation still holds, i.e. “parental survival trade-off is not the major determinative trade-off in life history within-species” (Figure 5).

      We agree with the reviewer that lifetime manipulations could be even more informative than single-year manipulations. Unfortunately, there are currently too few studies available to be able to draw generalisable conclusions across species for lifetime manipulations. This is, however, the reason we used lifetime change in clutch size in our fitness projections, which the reviewer seems to have missed – please see methods line 466-468, where we explicitly state that this is lifetime enlargement. Of course, such interpretations do not include an accumulation of costs that is greater than the annual cost, but currently there is no clear evidence that such an assumption is valid. Such a conclusion can also not be drawn from the study on jackdaws by Boonekamp et al (2014) as the treatments were life-long and, therefore, cannot separate annual from accrued (multiplicative) costs that are more than the sum of the annual costs incurred. Note that we have now included specific discussion of this study in response to the reviewer (L265-269).

      Details of how the analyses were carried out were opaque in places, but as I understood the analysis of the brood size manipulation studies, manipulation was coded as a covariate, with negative values for brood size reductions and positive values for brood size enlargements (and then variably scaled or not to control brood or clutch size). This approach implicitly assumes that the trade-off between current brood size (manipulation) and parental survival is linear, which contrasts with the general expectation that this trade-off is not linear. This assumption reduces the value of the analysis, and contrasts with the approach of Santos & Nakagawa.

      We thank the reviewer for highlighting a lack of clarity in places in our methods. We have added additional detail to the methodology section (see “Study sourcing & inclusion criteria” and “Extracting effect sizes”) in our revised manuscript. Note that our data and code were not shared with the reviewers despite us supplying them upon submission and again during the review process; they would have explained much more of the detail required.

      For clarity in our response, each effect size was extracted by performing a logistic regression with survival as a binary response variable and clutch size as the predictor, taken as the absolute number of offspring in the nest (i.e., for a bird that laid a clutch of 5 but was manipulated to have -1 egg, we used a clutch size value of 4). The clutch size was also standardised and, separately, expressed as a proportion of the species’ mean.
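
      As an illustration of the extraction step described above, the following is a minimal sketch and not the authors' code (which is deposited on Dryad). It fits a logistic regression of parental survival on the clutch size actually raised, using three predictor scalings: raw, standardised, and proportion of the species' mean. The survival counts, the species mean of 5 and the function name `fit_logistic` are hypothetical, chosen only for the example.

```python
import math
import statistics


def fit_logistic(x, survived, total, max_iter=100, tol=1e-10):
    """Fit logit(p) = b0 + b1 * x to grouped survival counts by Newton-Raphson.

    Returns (b0, b1, se_b1), where se_b1 is the standard error of the slope.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ni in zip(x, survived, total):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = ni * p * (1.0 - p)        # binomial variance weight
            g0 += yi - ni * p             # score for the intercept
            g1 += (yi - ni * p) * xi      # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    # Slope variance: matching diagonal entry of the inverse information matrix
    return b0, b1, math.sqrt(h00 / det)


# Hypothetical study: every bird laid 5 eggs; broods were reduced, left alone or enlarged.
laid = [5, 5, 5]
manipulation = [-1, 0, 1]
raised = [l + m for l, m in zip(laid, manipulation)]   # clutch size actually raised
survived = [32, 30, 27]                                # parents alive the next season
total = [50, 50, 50]                                   # parents monitored

species_mean_clutch = 5.0                              # assumed species mean
mean_raised = statistics.mean(raised)
sd_raised = statistics.stdev(raised)
standardised = [(c - mean_raised) / sd_raised for c in raised]
proportional = [c / species_mean_clutch for c in raised]

_, slope_per_egg, se_per_egg = fit_logistic(raised, survived, total)
_, slope_z, _ = fit_logistic(standardised, survived, total)
_, slope_prop, _ = fit_logistic(proportional, survived, total)

print("log-odds of survival per egg:", slope_per_egg, "SE:", se_per_egg)
print("per SD of clutch size:", slope_z)
print("per unit of species-mean proportion:", slope_prop)
```

      The slope (a log-odds change in survival per unit of the predictor) and its standard error are the quantities a meta-analysis would pool across studies. Because the three scalings are linear transformations of the same predictor, the slopes differ only by a constant factor within a study, but they are not interchangeable across species with different mean clutch sizes.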

      We disagree that our approach reduces the value of our analysis. First, our approach allows a direct comparison between experimental and observational studies, which is the novelty of our study. Our approach does differ from Santos & Nakagawa but we disagree that it contrasts. Our approach allows us to take into consideration the severity of the change in clutch size, which Santos & Nakagawa do not. Therefore, we do not agree that our approach is worse at accounting for non-linearity of trade-offs than the approach used by Santos & Nakagawa. Arguably, the approach by Santos & Nakagawa is worse, as they dichotomise effort as increased or decreased, factorise their output and thereby inflate their number of outcomes, of which only 1 cell of 4 categories is significant (for males and females, increased and decreased brood size). The proof is in the pudding as well, as our results clearly demonstrate that the magnitude of the manipulation is a key factor driving the results, i.e. one offspring for a seabird is a larger proportion of care (and fitness) than one offspring for a passerine. Such insights were not achieved by Santos & Nakagawa’s method and, again, did not allow a direct quantitative comparison between quality (correlational) and experimental (brood size manipulation, i.e. “trade-off”) effects, which forms a central part of our argumentation (Figure 5). 

      Our analysis, alongside a plethora of other ecological studies, does assume that the response to our predictor variable is linear. However, it is common knowledge that there are very few (if any) truly linear relationships. We use linear relationships because they serve as a good approximation of the trend and provide a more rigorous test for an underlying relationship than would fitting nonlinear models. For many datasets the range of added chicks required to estimate a non-linear relationship was not available. The question also remains of what shape such a non-linear relationship should take, which is hard to determine a priori. There is also a real risk when fitting non-linear terms that they are spurious and overinterpreted, as they often present a better fit (denoting one df is not sufficient especially when slopes vary). We have added this detail to our discussion.

      The observational study selection is not complete and apparently no attempt was made to make it complete. This is a missed opportunity - it would be interesting to learn more about interspecific variation in the association between natural variation in clutch size and parental survival.

      We clearly state in our manuscript that we deliberately tailored the selection of studies to match the manipulation studies (L367-369). We paired species extracted for observational studies with those extracted in experimental studies to facilitate a direct comparison between observational and experimental studies, and to ensure that the respective datasets were comparable. The reviewer’s focus in this review seems to be solely on the experimental dataset. This comment dismisses the equally important observational component of our analysis and thereby fails to acknowledge one of the key questions being addressed in this study. Note that in our revised version we have edited the phylogenetic tree to indicate for which species we have both types of information, which highlights our approach to selecting observational data (Figure 3).

      Reviewer #2 (Public Review):

      I have read with great interest the manuscript entitled "The optimal clutch size revisited: separating individual quality from the costs of reproduction" by LA Winder and colleagues. The paper consists of a meta-analysis comparing survival rates from studies providing clutch sizes of species that are unmanipulated and from studies where the clutch sizes are manipulated, in order to better understand the effects of differences in individual quality and of the costs of reproduction. I find the idea of the manuscript very interesting. However, I am not sure the methodology used allows one to reach the conclusions provided by the authors (mainly that there is no cost of reproduction, and that the entire variation in clutch size among individuals of a population is driven by "individual quality").

      We would like to highlight that we do not conclude that there is no cost of reproduction. Please see lines 336–339, where we state that our lack of evidence for trade-offs driving within-species variation in clutch size does not necessarily mean the costs of reproduction are non-existent. We conclude that individuals are constrained to their optima by the survival cost of reproduction. It is also an over-statement of our conclusion to say that we believe that variation in clutch size is only driven by quality. Our results show that unmanipulated birds that have larger clutch sizes also lived longer, and we suggest that this is evidence that some individuals are “better” than others, but we do not say, nor imply, that no other factors affect variation in clutch size. We have added Figure 5 to our manuscript to help the reader better understand what questions we can answer with our study and what conclusions we can draw from our results.

      I write that I am not sure, because in its current form, the manuscript does not contain a single equation, making it impossible to assess. It would need at least a set of mathematical descriptions for the statistical analysis and for the mechanistic model that the authors infer from it.

      We appreciate this comment, and have explained our methods in terms that are accessible to a wider audience. Note, however, that our meta-analysis is standard and based on logistic regression and standard meta-analytic practices. We have added the model formula to the model output tables.

      For the simulation, we simply simulated the resulting effects. We of course supplied our code for this along with our manuscript (https://doi.org/10.5061/dryad.q83bk3jnk), though as we mentioned above, we believe this was not shared with the reviewers despite us making this available for the review process. We therefore understand why the reviewer feels the simulations were not explained thoroughly. We have revised our methods section and added details which we believe make our methodology clearer without needing to consult the supplemental material. However, we have also added the equations used in the process of calculating our simulated data to the Supplementary Information for readers who wish to have this information in equation form.

      The text mixes concepts of individual vs population statistics, of within-individual vs among-individuals measures, of allocation trade-offs and fitness trade-offs, etc., which means it would also require a glossary of the definitions the authors use for these various terms, in order to be evaluated.

      We would like to thank the reviewer for highlighting this lack of clarity in our text. Throughout the manuscript we have refined our terminology and indicated where we are referring to the individual level or the population level. The inclusion of our new Figure 5 (decision tree) should also help in this context, as it makes clear the level on which we base our interpretation and conclusions.

      This problem is emphasised by the following sentence to be found in the discussion: "The effect of birds having naturally larger clutches was significantly opposite to the result of increasing clutch size through brood manipulation". The "effect" is defined as the survival rate (see Fig 1). While it is relatively easy to intuitively understand what the "effect" is for the unmanipulated studies (the sensitivity of survival to clutch size at the population level), this should be mentioned and detailed in a formula. Moreover, the concept of effect size is not at all obvious for the manipulated ones (effect of the manipulation? or survival rate whatever the manipulation (then how could it measure a trade-off?)? at the population level? at the individual level?) despite a whole appendix dedicated to it. This absolutely needs to be described properly in the manuscript.

      Thank you for identifying this sentence for which the writing was ambiguous, our apologies. We have now rewritten this and included additional explanation. L282-290: ‘The effect on parental annual survival of having naturally larger clutches was significantly opposite to the result of increasing clutch size through brood manipulation, and quantitatively similar. Parents with naturally larger clutches are thus expected to live longer and this counterbalances the “cost of reproduction” when their brood size is experimentally manipulated. It is, therefore, possible that quality effects mask trade-offs. Furthermore, it could be possible that individuals that lay larger clutches have smaller costs of reproduction, i.e. would respond less in terms of annual survival to a brood size manipulation, but with our current dataset we cannot address this hypothesis (Figure 5).’

      We would also like to thank the reviewer for bringing to our attention the lack of clarity about the details of our methodology. We have added details to our methodology (see “Extracting effect sizes” section) to address this (see highlighted sections). For clarity, the effect size for both manipulated and unmanipulated nests was survival, given the brood size raised. We performed a logistic regression with survival as a binary response variable (i.e., number of individuals that survived and number of individuals that died after each breeding season), and clutch size was the absolute value of offspring in the nest (i.e., for a bird that laid a clutch size of 5 but was manipulated to have -1 egg, we used a clutch size value of 4). This allows for direct comparison of the effect size (survival given clutch size raised) between manipulated and unmanipulated birds.

      Despite the lack of information about the underlying mechanistic model tested and the statistical model used, my impression is still that the interpretation in the introduction and discussion is not warranted by the outputs of the figures and tables. Let's use a model similar to that of (van Noordwijk and de Jong, 1986): imagine that the mechanism at the population level is

      a.c_(i,q)+b.s_(i,q)=E_q

      where c_(i,q) and s_(i,q) are respectively the clutch size and the survival rate of individual i, which is of quality q, and E_q is the level of "energy" that an individual of quality q has available during the given time-step. Here a and b are constants that turn clutch size and survival rate into the energy cost of reproduction and the energy cost of survival; both are quite "high", so that an extra egg at the current time-step (c_(i,q) increased by 1) decreases s_(i,q) markedly, E_q being independent of the number of eggs produced. That is, we have strong individual costs of reproduction. Imagine now that the variance of c_(i,q) (when the population is not manipulated) among individuals of the same quality group is very small (and therefore the variance of s_(i,q) is very small also) and that the expectations of both are proportional to E_q. Then, in the unmanipulated population, the variance in clutch size is mainly due to the variance in quality. And therefore, the larger the clutch size c_(i,q), the higher E_q, and the higher the survival s_(i,q).

      In the manipulated populations however, because of the large a and b, an artificial increase in clutch size, for a given E_q, will lead to a lower survival s_(i,q). And the "effect size" at the population level may vary according to a, b and the variances mentioned above. In other words, the costs of reproduction may be strong but hidden in the data when there is variance in quality; there are nevertheless strong costs of reproduction (so strong, in fact, that they are deterministic and that the probability to survive is a direct function of the number of eggs produced).
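
      The reviewer's thought experiment can be sketched numerically. The simulation below is an illustration only and is not part of the manuscript or of the review: it assumes a = 1 and b = 10, quality E_q drawn uniformly between 6 and 14, half of the energy allocated to reproduction, clutch size treated as a continuous quantity, and a manipulation of -1, 0 or +1 eggs assigned at random. All of these values are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(1)      # fixed seed so the run is repeatable
a, b = 1.0, 10.0            # assumed energy cost per egg and per unit of survival
n = 2000

# Quality varies a lot among individuals; clutch size varies little within a quality level.
energy = [rng.uniform(6.0, 14.0) for _ in range(n)]
clutch = [0.5 * e / a + rng.gauss(0.0, 0.1) for e in energy]

# Budget constraint a*c + b*s = E_q, solved for survival.
survival = [(e - a * c) / b for e, c in zip(energy, clutch)]

# Brood manipulation: eggs added or removed at random, independently of quality.
delta = [rng.choice([-1, 0, 1]) for _ in range(n)]
survival_manip = [(e - a * (c + d)) / b for e, c, d in zip(energy, clutch, delta)]

observational_slope = ols_slope(clutch, survival)        # natural variation
experimental_slope = ols_slope(delta, survival_manip)    # effect of the manipulation

print("survival change per egg, natural variation:", observational_slope)
print("survival change per egg, manipulation:", experimental_slope)
```

      Under these assumptions the observational slope is positive, because clutch size and survival both track quality, while the experimental slope is negative and close to -a/b, the within-individual cost of one extra egg. The sketch shows only that the two signs can coexist; it says nothing about which values of a and b are realistic.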

      We would like to thank the reviewer for these comments. We have added detail to our methodology section so our models and rationale are more clear. Please note that our simulations only take the experimental effect of brood size on parental survival into account. Our model does not incorporate quality effects. The reviewer is right that the relationship between quality and the effects exposed by manipulating brood size can take many forms and this is a very interesting topic, but not one we aimed to tackle in our manuscript. In terms of quality we make two points: (1) overall quality effects connecting reproduction and parental survival are present, (2) these effects are opposite in direction to the effects when reproduction is manipulated and similar in magnitude. We do not go further than that in interpreting our results. The reviewer is correct, however, that we do suggest and repeat suggestions by others that quality can also mask the trade-off in some individuals or circumstances (L74-76, L95-98 & L286-289), but we do not quantify this, as it is dependent on the unknown relationship between quality and the response to the manipulation. A focussed set of experiments in that context would be interesting and there are some data that could get at this, i.e. the relationship between produced clutch size and the relative effect of the manipulation (now included L287-290). Such information is, however, not available for all studies and, although we explored the possibility of analysing this, currently this is not possible with adequate confidence and there is the possible complexity of non-linear effects. We have added this rationale in our revision (L259-265).

      Moreover, it seems to me that the costs of reproduction are a concept closely related to generation time. Looking beyond the individual allocative (and other individual components of the trade-off) cost of reproduction and towards a populational negative relationship between survival and reproduction, we have to consider the intra-population slow-fast continuum (some types of individuals survive more and reproduce less (are slower) than others (which are faster)). This continuum is associated with a metric: the generation time. Some individuals will produce more eggs and survive less in a given time-period because this time-period corresponds to a higher ratio of their generation time (Gaillard and Yoccoz, 2003; Gaillard et al., 2005). It seems therefore important to me to control for generation time and, in general, to account for the time-step used for each population studied when analysing costs of reproduction. The data used in this manuscript are not just clutch size and survival rates, but clutch size per year (or another time step) and annual (or other) survival rates.

      The reviewer is right that this is interesting. There is a longstanding unexplained difference in temperate (seasonal) and tropical reproductive strategies. Most of our data come from seasonal breeders, however. Although there is some variation in second brooding and such, these species mostly only produce one brood. We do agree that a wider consideration here is relevant, but we are not trying to explain all of life history in our paper. It is clearly the case that other factors will operate and the opportunity for trade-offs will vary among species according to their respective life histories. However, our study focuses on the two most fundamental components of fitness – longevity and reproduction – to test a major hypothesis in the field, and we uncover new relationships that contrast with previous influential studies and cast doubt on previous conclusions. We question the assumed trade-off between reproduction and annual survival. We show that quality is important and that the effect we find in experimental studies is so small that it can only explain between-species patterns but is unlikely to be the selective force that constrains reproduction within species. We do agree that there is a lot more work that can be done in this area. We hope we are contributing to the field, by questioning this central trade-off. We have incorporated some of the reviewer’s suggestions in the revision (L309-315). We have added Figure 5 to make clear where we are able to reach solid conclusions and the evidence on which these are based as clearly as possible in an easily accessible format.

      Finally, it is important to relate any study of the costs of reproduction in a context of individual heterogeneity (in quality for instance), to the general problem of the detection of effects of individual differences on survival (see, e.g., Fay et al., 2021). Without an understanding of the very particular statistical behaviour of survival, associated to an event that by definition occurs only once per life history trajectory (by contrast to many other traits, even demographic, where the corresponding event (production of eggs for reproduction, for example) can be measured several times for a given individual during its life history trajectory).

      Thank you for raising this point. The reviewer is right that heterogeneity can dampen or augment selection. Note that by estimating the effect of quality here we give an example of how heterogeneity can possibly do exactly this. We thank the reviewer for raising that we should possibly link this to wider effects of heterogeneity and we have added to our discussion of how our results play into the importance of accounting for among-individual heterogeneity (L252-256).

      References:

      Fay, R. et al. (2021) 'Quantifying fixed individual heterogeneity in demographic parameters: Performance of correlated random effects for Bernoulli variables', Methods in Ecology and Evolution, 2021(August), pp. 1-14. doi: 10.1111/2041-210x.13728.

      Gaillard, J.-M. et al. (2005) 'Generation time: a reliable metric to measure life-history variation among mammalian populations', The American Naturalist, 166(1), pp. 119-123; discussion 124-128. doi: 10.1086/430330.

      Gaillard, J.-M. and Yoccoz, N. G. (2003) 'Temporal Variation in Survival of Mammals: a Case of Environmental Canalization?', Ecology, 84(12), pp. 3294-3306. doi: 10.1890/02-0409.

      van Noordwijk, A. J. and de Jong, G. (1986) 'Acquisition and Allocation of Resources: Their Influence on Variation in Life History Tactics', The American Naturalist, p. 137. doi: 10.1086/284547.

      Reviewer #3 (Public Review):

      The authors present here a comparative meta-analysis designed to detect evidence for a reproduction/survival trade-off, central to expectations from life history theory. They present variation in clutch size within species as an observation in conflict with expectations of optimisation of clutch size and suggest that this may be accounted for by weak selection on clutch size. The results of their analyses support this explanation - they found little evidence of a reproduction-survival trade-off across birds. They extrapolated from this result to show in a mathematical model that the fitness consequences of enlarged clutch sizes would only be expected to have a significant effect on fitness in extreme cases, outside of normal species' clutch size ranges. Given the centrality of the reproduction-survival trade-off, the authors suggest that this result should encourage us to take a more cautious approach to applying concepts of the trade-off in life history theory and of optimisation in behavioural ecology more generally. While many of the findings are interesting, I don't think the argument for a major re-think of life history theory and the role of trade-offs in fitness maximisation is justified.

      The interest of the paper, for me, comes from highlighting the complexities of the link between clutch size and fitness, and the challenges facing biologists who want to detect evidence for life history trade-offs. Their results highlight apparently contradictory results from observational and experimental studies on the reproduction-survival trade-off and show that species with smaller clutch sizes are under stronger selection to limit clutch size.

      Unfortunately, the authors interpret the failure to detect a life history trade-off as evidence that there isn't one. The construction of a mathematical model based on this interpretation serves to give this possible conclusion perhaps more weight than is merited on the basis of the results of this, necessarily quite simple, meta-analysis. There are several potential complicating factors that could explain the lack of detection of a trade-off in these studies, which are mentioned and dismissed as unimportant (lines 248-250) without any helpful, rigorous discussion. I list below just a selection of complexities which perhaps deserve more careful consideration by the authors to help readers understand the implications of their results:

      We would like to thank the reviewer for their thoughtful response and summary of the findings that we also agree are central to our study. The reviewer also highlights areas where our manuscript could benefit from a deeper consideration and we have added detail accordingly to our revised discussion.

      We would like to highlight that we do not interpret the failure to detect a trade-off as evidence that there is not one. First, and importantly, we do find a trade-off but show this is only incurred when individuals produce a clutch beyond their optimal level. Second, we also state on lines 322-326 that the lack of evidence to support trade-offs being strong enough to drive variation in clutch size does not necessarily mean there are no costs of reproduction.

      The statement that we have constructed a mathematical model based on the interpretation that we have not found a trade-off is, again, factually incorrect. We ran these simulations because the opposite is true – we did find a trade-off. There is a significant effect of clutch size when manipulated on annual parental survival. We benefit from our unique analysis allowing for a quantitative fitness estimate from the effect size on annual survival (as this is expressed on a per-egg basis). This allowed us to ask whether this quantitative effect size can alone explain why reproduction is constrained, and we evaluate this using simulations. From these simulations we find that this effect size is too small to explain the constraint, so something else must be going on, and we do spend a considerable amount of text discussing the possible explanations (L202-215). Note that the possibly most parsimonious conclusion here is that costs of reproduction are not there, or simply small, so we also give that explanation some thought (L221-224 and L315-331).
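
      The logic of asking whether a per-egg survival cost can, on its own, offset the gain from a larger clutch can be illustrated with a minimal sketch. This is not the simulation code used in the manuscript: the function names, the logistic form of survival, and every parameter value below are assumptions chosen purely for illustration, and the sketch assumes that the same clutch size (and hence the same survival cost) applies in every year of life.

```python
import math


def annual_survival(clutch_size, intercept, slope_per_egg):
    """Annual parental survival under an assumed logistic model:
    logit(s) = intercept + slope_per_egg * clutch_size."""
    eta = intercept + slope_per_egg * clutch_size
    return 1.0 / (1.0 + math.exp(-eta))


def expected_lifetime_eggs(clutch_size, intercept, slope_per_egg):
    """Expected number of eggs over a lifetime when the same clutch size is
    laid every year: c * (1 + s + s^2 + ...) = c / (1 - s)."""
    s = annual_survival(clutch_size, intercept, slope_per_egg)
    return clutch_size / (1.0 - s)


if __name__ == "__main__":
    # Hypothetical values only: baseline log-odds of survival and a small
    # negative per-egg effect on survival.
    intercept, slope_per_egg = 0.5, -0.05
    for c in range(2, 13, 2):
        s = annual_survival(c, intercept, slope_per_egg)
        w = expected_lifetime_eggs(c, intercept, slope_per_egg)
        print(f"clutch={c:2d}  survival={s:.3f}  lifetime eggs={w:.2f}")
```

      Comparing lifetime output across clutch sizes for different assumed slopes shows how strong the per-egg survival effect would need to be before larger clutches stop paying off.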

      We are disappointed by the suggestion that we have dismissed complicating factors that could prevent detection of a trade-off, as this was not our intention. We were aiming to highlight that what we have demonstrated to be an apparent trade-off can be explained through other mechanisms, and that the trade-off between clutch size and survival is not as strong in driving within-species variation in clutch size as previously assumed. We have added further discussion to our revised manuscript to make this clear and give readers a better understanding of the complexity of factors associated with life-history theory, including the addition of a decision tree (Figure 5).

      • Reproductive output is optimised for lifetime reproductive success and so the consequences of being pushed off the optimum for one breeding attempt are not necessarily detectable in survival but in future reproductive success (and, therefore, lifetime reproductive success).

      We agree this is a valid point, which is mentioned in our manuscript in terms of alternative stages where the costs of reproduction might be manifested (L316-320). We would also like to highlight that, in our simulations, the change in clutch size (and subsequent survival cost) was assumed for the lifetime of the individual, for this very reason.

      • The analyses include some species that hatch broods simultaneously and some that hatch sequentially (although this information is not explicitly provided (see below)). This is potentially relevant because species which have been favoured by selection to set up a size asymmetry among their broods often don't even try to raise their whole broods but only feed the biggest chicks until they are sated; any added chicks face a high probability of starvation. The first point this observation raises is that the expectation of more chicks= more cost, doesn't hold for all species. The second more general point is that the very existence of the sequential hatching strategy to produce size asymmetry in a brood is very difficult to explain if you reject the notion of a trade-off.

      We agree with the reviewer that the costs of reproduction can be absorbed by the offspring themselves, and may not be equal across offspring (we also highlight this at L317-318 in the manuscript). However, we disagree that for some species the addition of more chicks does not equate to an increase in cost, though we do accept this might be less for some species. This is, however, difficult to incorporate into a sensible model as the impacts will vary among species and some species do also exhibit catch-up growth. So, without a priori knowledge on this, we kept our model simple to test whether the effect on parental survival (often assumed to be a strong cost) can explain the constraint on reproductive effort, and we conclude that it does not.

      We would also like to make clear that we are not rejecting the notion of a trade-off. Our study shows evidence that a trade-off between survival and reproductive effort probably does not drive within-species variation in clutch size. We do explicitly say this throughout our manuscript, and also provide suggestions of other areas where a trade-off may exist (L317-320). The point of our study is not whether trade-offs exist or not, it is whether there is a generalisable across-species trend for a trade-off between reproductive effort and survival – the most fundamental trade-off in our field but for which there is a lack of conclusive evidence within species. We believe the addition of Figure 5 to our revised manuscript also makes this more evident.

      • For your standard, pair-breeding passerine, there is an expectation that costs of raising chicks will increase linearly with clutch size. Each chick requires X feeding visits to reach the required fledge weight. But this is not the case for species which lay precocious chicks which are relatively independent and able to feed themselves straight after hatching - so again the relationship of care and survival is unlikely to be detectable by looking at the effect of clutch size but again, it doesn't mean there isn't a trade-off between breeding and survival.

      Precocial birds still provide a level of parental care, such as protection from predators. Though we agree that the level of parental care in provisioning food (and in some cases in all parental care given) is lower in precocial than altricial birds, this would only make our reported effect size for manipulated birds to be an underestimate. Again, we would like to draw the reviewer’s attention to the fact we did detect a trade-off in manipulated birds and we do not suggest that trade-offs do not exist. The argument the reviewer suggests here does not hold for unmanipulated birds, as we found that birds that naturally lay larger clutch sizes have higher survival.

      • The cost of raising a brood to adulthood for your standard pair-breeding passerine is bound to be extreme, simply by dint of the energy expenditure required. In fact, it was shown that the basal metabolic rate of breeding passerines was at the very edge of what is physiologically possible, the human equivalent being cycling the Tour de France (Nagy et al. 1990). If birds are at the very edge of what is physiologically possible, is it likely that clutch size is under weak selection?

      If birds are at the very edge of what is physiologically possible, then indeed it would necessarily follow that if they increase the resource allocated in one area then expenditure in another area must be reduced. In many studies, however, the overall brood mass is increased when chicks are added and cared for in an experimental setting, suggesting that birds are not operating at their limit all the time. Our simulations show that if individuals increase their clutch size, the survival cost of reproduction counterbalances the fitness gained by increasing clutch size and so there is no overall fitness gain to producing more offspring. Therefore, selection on clutch size is constrained to the within-species level. We do not say in our manuscript that clutch size is under weak selection – we only ask why variation in clutch size is maintained if selection always favours high-producing birds.

      • Variation in clutch size is presented by the authors as inconsistent with the assumption that birds are under selection to lay the Lack clutch. Of course, this is absurd and makes me think that I have misunderstood the authors' intended point here. At any rate, the paper would benefit from more clarity about how variable clutch size has to be before it becomes a problem for optimality in the authors' view (lines 84-85; line 246). See Perrins (1965) for an exquisite example of how beautifully great tits optimise clutch size on average, despite laying between 5-12 eggs.

      We thank the reviewer for highlighting that our manuscript may be misleading in places; however, we are unsure which part of our conclusions the reviewer is referring to here. The question we pose is “Why don’t all birds produce a clutch size at the population optimum?”, and is central to the decades-long field of life-history theory. Why is variation maintained? As the reviewer outlines, there is extensive variability, with some birds laying half of what other birds lay.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Title: while the costs of reproduction are possibly important in shaping optimal clutch size, it is not clear what you can say about it given that you do not consider clutch / brood size effects on fitness prospects of the offspring.

      We have expanded on our discussion of how some costs may be absorbed by the offspring themselves. However, a change in offspring number rarely affects all offspring in the nest equally and there can even be quite stark differences; for example this will be most evident in species that produce sacrificial offspring. This effect will be further confounded by catch-up growth. There are mixed results in the literature on the effect of altering clutch size on offspring survival, with an increased clutch size through manipulation often increasing the number of recruits from a nest. We have focussed on the relationship between reproductive effort and survival because it is given the most weight in the field in terms of driving intra-specific variation in clutch size. We have altered our title to show we focus on the survival costs specifically: “The optimal clutch size revisited: separating individual quality from the parental survival costs of reproduction”.

      (2) L.11-12: I agree that this is true for birds, but this is phrased more generally here. Are you sure that that is justified?

      The trade-off between survival and reproductive effort has largely been tested experimentally through brood manipulations in birds as this provides a good system in which to test the costs and benefits of increasing parental effort. The work in this area has provided theory beyond just passerine birds, which are the most commonly manipulated group, to across-taxa theories. We are unaware of any study/studies that provide evidence that the reproduction/survival trade-off is generalisable across multiple species in any taxa. As such, we do believe this sentence is justified. An example is the lack of a consistent negative genetic correlation in populations of fruit flies, which has also been hailed as a lack-of-cost paradigm. Furthermore, some mutants that live longer do so without a cost on reproduction.

      (3) L.13-14: Not sure what you mean with this sentence - too much info lacking.

      We have added some detail to this sentence.

      (4) L.14: it is slightly awkward to say 'parental investment and survival' because it is the survival effect that is usually referred to as the 'investment'. Perhaps what you want to say is 'parental effort and survival'?

      We have replaced “parental investment” with “reproductive effort”.

      (5) L.15: you can omit 'caused'. Compared to control treatment or to reduced broods? Why not mention effects or lack thereof of brood reduction? And it would be good to also mention here whether effects were similar in the sexes.

      Please see our methodology where we state that we use clutch size as a continuous variable (we do not compare to control or reduced but include the absolute value of offspring in a logistic regression). The effects of a brood reduction are drawn from the same regression and so are opposite. Though we appreciate the detail here is lacking to fully comprehend our study, we would like to highlight this is the abstract and details are provided in the main text.
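
      The sense in which enlargement and reduction effects are “opposite” can be written out as a minimal sketch. The symbols below ($p_i$, $c_i$, $\beta_0$, $\beta_1$, $k$) are notation assumed here for illustration, and the full model in the manuscript may well include further terms (for example random effects) that are omitted:

```latex
\begin{align}
  \operatorname{logit}(p_i) &= \beta_0 + \beta_1 c_i \\
  \Delta_{+k} &= \beta_1 k, \qquad \Delta_{-k} = -\beta_1 k = -\Delta_{+k}
\end{align}
```

      Here $p_i$ is the annual survival probability of parent $i$ and $c_i$ the number of offspring in its nest. With a single slope $\beta_1$, adding $k$ young and removing $k$ young shift the log-odds of survival by equal and opposite amounts.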

      (6) L. 15: I am not sure why you write 'however', as the finding that experimental and natural variation have opposite effects is in complete agreement with what is generally reported in the literature and will therefore surprise no one that is aware of the literature.

      We use “however” to highlight the change in direction of the effect size from the results in the previous sentence. We also believe that ours is the first study that provides a quantitative estimate of this effect and that previous work is largely theoretical. The reviewer states that this is what is generally reported but it is not reported in all cases, as some relationships between reproductive effort and survival are negative (for the quality measurement, in correlational space, see Figure 1).

      (7) L.16: saying 'opposite to the effect of phenotypic quality' seems difficult to justify, as clutch size cannot be equated with phenotypic quality. Perhaps simply say 'natural variation in clutch size'? If that is what you are referring to.

      Please note we are referring to effect sizes here, that is, the survival effect of a change in clutch size. By phenotypic quality we are referring to the fact that we find higher parental survival when natural clutch sizes are higher. It is not the case that we refer to quality only as having a higher clutch size. This is explicitly stated in the sentence you refer to. We have changed “effect” to “effect size” to highlight this further.

      (8) L.18: why do you refer to 'parental care' here? Brood size is not equivalent to parental care.

      Brood size manipulations are used to manipulate parental care. The effect on parental survival is expected to be incurred because of the increase in parental care. We have changed “parental care” to “reproductive effort” to reduce the number of terms we use in our manuscript.

      (9) L.18-19: suggest to tone down this claim, as this is no more than a meta-analytic confirmation of a view that is (in my view) generally accepted in the field. That does not mean it is not useful, just that it does not constitute any new insight.

      We are unaware of any other study which provides generalisable across-species evidence for opposite effects of quality and costs of reproduction. The work in this area is also largely theoretical and is yet to be supported experimentally, especially in a quantitative fashion. It is surprising to us that the reviewer considers there to be general acceptance in a field, rather than being influenced by rigorous testing of hypotheses, made possible by meta-analysis, the current gold standard in our field.

      (10) L.21: what does 'parental effort' mean here? You seem to use brood size, parental care, parental effort, and parental investment interchangeably but these are different concepts. Daan et al (1990, Behaviour), which you already cite, provide a useful graph separating these concepts. Please adjust this throughout the manuscript, i.e. replace 'reproductive effort' with wording that reflects the actual variable you use.

      We have not used the phrase “parental effort” in this sentence. We agree these are different concepts but in this context are intertwined. For example, brood size is used to manipulate parental care as a result of increased parental effort. We do agree the manuscript would benefit from keeping terminology consistent throughout the manuscript and have adjusted this throughout.

      (11) L.23: perhaps add 'in birds' somewhere in this sentence? Some reference to the assumptions underlying this inference would also be useful. Two major assumptions being that birds adjusted their effort to the manipulation as they would have done had they opted for a larger brood size themselves, and that the costs of laying and incubating extra eggs can be ignored. And then there is the effect that laying extra eggs will usually delay the hatch date, which in many species reduces reproductive success.

      Though our study does exclusively use birds, birds have been used to test the survival/reproduction trade-off because they present a convenient system in which to experimentally test this. The conclusions from these studies have a broader application than in birds alone. We believe that although these details are important, they are not appropriate in the abstract of our paper.

      (12) L.26: how is this an explanation? It just repeats the finding.

      We intend to refer to all interpretations from all results presented in our manuscript. We have made this more clear by adjusting our writing.

      (13) L.27: I do not see this point. And 'reproductive output' is yet another concept, that can be linked to the other concepts in the abstract in different ways, making it rather opaque.

      We have changed “reproductive output” to “reproductive effort”.

      (14) L.33: here you are jumping from 'resources' to 'energetically' - it is not clear that energy is the only or main limiting resource, so why narrow this down to energy?

      We do not say energy is the only or main limiting resource. We simply highlight that reproduction is energetically demanding and so, intuitively, a trade-off with a highly energetically demanding process would be the focal place to observe a trade-off. We have, though, replaced “energetically” with “resource”.

      (15) L.35-36: this is new to me - I am not aware of any such claims, and effects on the residual reproductive value could also arise through effects on future reproduction. The authors you cite did not work on birds, or (in their own study systems) presented results that as far as I remember warrant such a general statement.

      The trade-off between reproduction and survival is seminal to the disposable soma theory, proposed by Kirkwood. Though Kirkwood’s work was largely not focussed on birds, it had fundamental implications for the field of evolutionary ecology because of the generalisable nature of his proposed framework. In particular, it has had wide-reaching influence on how the biology of aging is interpreted. The readership of the journal here is broad, and our results have implications for that field too. The work of Kirkwood (many of the papers on this topic have over 2000 citations each) has been perhaps overly influential in many areas, so a link to how that work should be interpreted is highly relevant. If the reviewer is interested in this topic the following papers by one of the co-authors and others could be of interest, some of which we could not cite in the main manuscript due to space considerations:

      https://www.science.org/doi/pdf/10.1126/sciadv.aay3047

      https://agingcelljournal.org/Archive/Volume3/stochasticity_explains_non_genetic_inheritance_of_lifespan/

      https://pubmed.ncbi.nlm.nih.gov/21558242/

      https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2435.13444

      https://www.nature.com/articles/362305a0

      https://www.cell.com/trends/ecology-evolution/fulltext/S0169-5347(12)00147-4

      https://www.cell.com/cell/pdf/S0092-8674(15)01488-9.pdf

      https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-018-0562-z

      (16) L.42: this could be preceded with mentioning the limitations of observational data.

      We have added detail as to why brood manipulations are a good test for trade-offs and so this is now inherently implied.

      (17) L.42-43: why?

      We have added detail to this sentence.

      (18) L.45: do any of the references cited here really support this statement? I am certain that several do not - in these this statement is an assumption rather than something that is demonstrated. It may be useful to look at Kate Lessell's review on this that appeared in Etologia, I think in the 1990's. Mind however that 'reproductive effort' is operationally poorly defined for reproducing birds - provisioning rate is not necessarily a good measure of effort in so far as there are fitness costs.

      We have updated the references to support the sentence.

      (19) L.47: Given that you make this statement with respect to brood size manipulations in birds, it seems to me that the paper by Santos & Nakagawa is the only paper you should cite here. Given that you go on to analyze the same data it deserves to be discussed in more detail, for example to clarify what you aim to add to their analysis. What warrants repeating their analysis?

      Please first note that our dataset includes Santos & Nakagawa and additional studies, so it is not accurate to say we analyse the same data. Furthermore, we believe our study has implications beyond birds alone and so believe it is appropriate to cite the papers that do support our statement. We have added details to the methods to explicitly state what data is gathered from Santos & Nakagawa (it is only used to find the appropriate literature and data was re-extracted and re-analysed in a more appropriate way) and, separately, how we gathered the observational studies (see L352-381).

      (20) L.48: There are more possible explanations to this, which deserve to be discussed. For example, brood size manipulations may not have been that effective in manipulating reproductive effort - for example, effects on energy expenditure tend to be not terribly convincing. Secondly, the manipulations do not affect the effort incurred in laying eggs (which also biases your comparison with natural variation in clutch size). Thirdly, the studies by Boonekamp et al on Jackdaws found that while there was no effect of brood size manipulation on parental survival after one year of manipulation, there was a strong effect when the same individuals were manipulated in the same direction in multiple years. This could be taken to mean that costs are not immediate but delayed, explaining why single year manipulations generally show little effect on survival. It would also mean that most estimates of the fitness costs of manipulated brood size are not fit for purpose, because typically restricted to survival over a single year.

      Please see our response to this comment in the public reviews.

      Out of interest and because the reviewer mentioned “energy expenditure” specifically: There are studies that show convincing effects of brood size manipulation on parental energy expenditure. We do agree that there are also studies that show ceilings in expenditure. We therefore disagree that they “tend to be not terribly convincing”. Just a few examples:

      https://academic.oup.com/beheco/article/10/5/598/222025 (Figure 2)

      https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2435.12321 (Figure 1)

      https://besjournals.onlinelibrary.wiley.com/doi/full/10.1046/j.1365-2656.2000.00395.x (but ceiling at enlarged brood).

      (21) L.48, "or, alternatively, that individuals may differ in quality": how do you see that happening when brood size is manipulated, and hence 'quality' of different experimental categories can be assumed to be approximately equal? This point does apply to observational studies, so I assume that that is what you had in mind, but that distinction should be clear (also on line 54).

      We have made it more clear that we determine if there are quality effects separate to the costs of reproduction found using brood manipulation studies.

      (22) L.50: Drent & Daan, in their seminal paper on "The prudent parent" (1980, Ardea) were among the earliest to make this point and deserve to be cited here.

      We have added this citation.

      (23) L.51, "relative importance": relative to what? Please be more specific.

      We have adjusted this sentence.

      (24) L.54: Vedder & Bouwhuis (2018, Oikos) go some way towards this point and should be explicitly mentioned with reference to the role of 'quality' effects on the association between reproductive output and survival.

      We have added this reference.

      (25) L.55: can you be more specific on what you want to do exactly? What you write here could be interpreted differently.

      We have added an explicit aim after this sentence to be more clear.

      (26) L.57: Here also a more specific wording would be useful. What does it mean exactly when you say you will distinguish between 'quality' and 'costs'?

      We have added detail to this sentence.

      (27) L.62: it should be clearer from the introduction that this is already well known, which will indirectly emphasize what you are adding to what we know already.

      We would argue this is not well known and has only been theorised but not shown empirically, as we do here.

      (28) L.62: you equate clutch size with 'quality' here - that needs to be spelled out.

      We refer to quality as the positive effect size of survival for a given clutch size, not clutch size alone. We appreciate this is not clear in this sentence and have reworded.

      (29) L.64: this looks like a serious misunderstanding to me, but in any case, these inferences should perhaps be left to the discussion (this also applies to later parts of this paragraph), when you have hopefully convinced readers of the claims you make on lines 62-63.

      We are unsure of what the reviewer is referring to as a misunderstanding. We have chosen this format for the introduction to highlight our results. If this is a problem for the editors we will change as required.

      (30) L.66: quantitative comparison of what?

      Comparison of species. We have changed the wording of this sentence.

      (31) L.67-69: this should be in the methods.

      We have used a modern format which highlights our result. We are happy to change the format should the editors wish us to.

      (32) L.74-88: suggest to (re)move this entire paragraph, presenting inferences in such an uncritical manner before presenting the evidence is inappropriate in my view. I have therefore refrained from commenting on this paragraph.

      We have chosen a modern format which highlights our result. We are happy to change the format should the editors wish us to.

      (33) L.271, "must detail variation in the number of raised young": it is not sufficiently clear what this means - what does 'detail' mean in this context? And what does 'number of raised young' mean? The number hatched or raised to fledging?

      We have now made this clear.

      (34) L.271, "must detail variation in the number of raised young": looking at table S4, it seems that on the basis of this criterion also brood size manipulation studies where details on the number of young manipulated were missing are excluded. I see little justification for this - surely these manipulations can, for example, be coded as having the average manipulation size in the meta-analysis data set, thereby contributing to tests of manipulation effects, but not to variation within the manipulation groups?

      We have done in part what the reviewer describes. We are specifically interested in the manipulation size, so we required this to compare effect sizes across species and categories, a key advance of our study and outlined in many places in our manuscript. Note, however, that we only need comparative differences, and have used clutch size metrics more generally to obtain a mean clutch size for a species, as well as SD where required. Please also note that our supplement details exactly why studies were excluded from our analysis, as is the preferred practice in a meta-analysis.

      (35) L.271, "referred to as clutch size": the point of this simplification is not clear to me, while it is clearly confusing - why not refer to 'brood size' instead?

      Brood size and clutch size can be used interchangeably here because, in the observational studies, the individuals vary in the number of eggs produced, whereas for brood manipulations this obviously happens after hatching and brood is perhaps a more appropriate term, but we wanted to simplify the terminology used. However, we use clutch size throughout as the aim of our study is to determine why individuals differ in the number of offspring they produce, and so clutch size is the most appropriate term for that.

      (36) L.280: according to the specified inclusion criteria (lines 271/272) these studies should already be in the data set, so what does this mean exactly?

      Selection criteria refers to whether a given study should be kept for analysis or not. It does not refer to how studies were found. Please see lines 361-378 for details on how we found studies (additional details are also in the Supplementary Methods).

      (37) L.281: the use of 'quality' here is misleading - natural variation in clutch or brood size will have multiple causes, variation in phenotypic quality of the individuals and their environment (territories) is only one of the causes. Why not simply refer to what you are actually investigating: natural and experimental variation in brood size.

      We disagree. Our study aims to separate quality effects from the costs of reproduction, and we use observational studies to test for quality differences, though we make no inference about the mechanisms. We do not imply that the environment causes differences in quality, but that to directly compare observational and experimental groups, they should contain similar species. So, to be clear again, quality refers to the positive covariation of clutch size with survival. We feel that we explain this clearly in our study’s rationale and have also improved our writing in several sections on this to avoid any confusion (see responses to earlier comments by the three reviewers).

      (38) L.283, "in most cases": please be exact and say in xx out xx cases.

      We have added the number of studies for each category here.

      (39) L.283-285: presumably readers can see this directly in a table with the extracted data?

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We believe the data are too large to include as a table in the main text and are not essential in understanding the paper. We do, however, believe all readers should have access to this information if they wish, and so it is publicly available.

      (40) L.293: there does not seem to be a table that lists the included studies and effect sizes. It is not uncommon to find major errors in such tables when one is familiar with the literature, and absence of this information impedes a complete assessment of the manuscript.

      We supplied a link to our full dataset and the code we used in Dryad with our submitted manuscript. We were also asked to supply our data during the review process and we again supplied a link to our dataset and code, along with a folder containing the data and code itself. We received confirmation that the reviewers had been given our data and code. We support open science and it was our intention that our dataset should be fully available to reviewers and readers. We believe the data are too large to include as a table in the main text and are not essential in understanding the paper. Our data and code are at https://doi.org/10.5061/dryad.q83bk3jnk.

      (41) L.293: from how many species?

      We have added this detail.

      (42) L.296, "longevity": this is a tricky concept, not usually reported in the studies you used, so please describe in detail what data you used.

      We have removed longevity as we did not use this data in our current version of the manuscript.

      (43) L. 298: again: where can I see this information?

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We did supply this information when we submitted our manuscript and again during the review process, but we believe this was not passed on to the reviewers.

      (44) L. 304, "we used raw data": I assume that for the majority of papers the raw data were not available, so please explain how you dealt with this. Or perhaps this applies to a selection of the studies only? Perhaps the experimental studies?

      By raw data, we mean the absolute value of offspring in the nest. We have changed the wording of this sentence and added detail about whether the absolute value of offspring was not present for brood manipulation studies (L393-397).

      (45) L.304: When I remember correctly, Santos and Nakagawa examined effects of reducing and enlarging brood size separately, which is of importance because trade-off curves are unlikely to be linear and whether they are or not has major effects on the optimization process. But perhaps you tackled this in another way? I will read on.....

      You are correct that Santos & Nakagawa compared brood increases and reductions to control separately. Note that this only partially accounts for non-linearity and does not take into account the severity of the change in brood size. By using a logistic regression of absolute clutch size, as we have done, we are able to directly compare brood manipulations with observational studies. Please see Supplementary Methods lines 11-12, where we have added additional detail as to why our approach is beneficial in this analysis.

      (46) L.319: what are you referring to exactly with "for each clutch size transformation"?

      We refer to the raw, standardised and proportional clutch size transformations. We have added detail here to be more clear.
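
      To make the three transformations concrete, the following is a minimal sketch rather than the code used in the analysis. It assumes that "standardised" means a z-score (centred on the mean and divided by the sample standard deviation) and that "proportional" means clutch size divided by the mean; the function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def transform_clutch_sizes(clutch_sizes):
    """Return raw, standardised and proportional versions of clutch sizes.

    Assumptions (not taken from the manuscript):
    - standardised = (x - mean) / sample standard deviation
    - proportional = x / mean
    """
    m = mean(clutch_sizes)
    sd = stdev(clutch_sizes)  # needs at least two values
    return {
        "raw": list(clutch_sizes),
        "standardised": [(c - m) / sd for c in clutch_sizes],
        "proportional": [c / m for c in clutch_sizes],
    }


# Hypothetical population in which birds raise 2, 3 or 4 eggs.
example = transform_clutch_sizes([2, 3, 4])
```

      Under these assumptions, a one-egg change is a larger proportional change for a species with a small mean clutch than for a species with a large one.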

      (47) L.319: is there a cost of survival? Perhaps you mean 'survival cost'? This would be appropriate for the experimental data, but not for the observational data, where the survival variation may be causally unrelated to the brood size variation, even if there is a correlation.

      We have changed “cost of survival” to “effect of parental survival”. We only intend to imply causality for the experimental studies. For observational studies we do not suggest that increasing clutch size is causal for increasing survival, only correlative (and hence we use the phrase “quality”).

      (48) L.320: please replace "parental effort" with something like 'experimental change in brood size'.

      We have changed “parental effort” to “reproductive effort”.

      (49) L.321: due to failure of one or more eggs to hatch, and mortality very early in life, before brood sizes are manipulated, it is not likely that say an enlargement of brood size by 1 chick can be equated to the mean clutch size +1 egg / chick. For example, in the Wytham great tit study, as re-analysed by Richard Pettifor, a 'brood size manipulation' of unmanipulated birds is approximately -1, being the number of eggs / chicks lost between laying and the time of brood size manipulation. Would this affect your comparisons?

      We agree these are important factors in determining what a clutch/brood size actually is for a given individual/pair, as this can vary from egg laying to fledging. However, we do not believe that accounting for this (if it were possible to do so) would significantly affect our conclusions, as observational studies are comparable in that these birds would also likely experience early-life mortality of their offspring. It is also possible that parents already factor in this loss, and so a brood manipulation still changes the parental care effort an individual has to incur.

      (50) L.332: instead of "adjusted" perhaps say 'mean centred'?

      We have implemented this suggestion.

      (51) L.345: this statement surprised me, but is difficult to verify because I could not locate a list of the included studies. However, to my best knowledge, most studies reporting brood size manipulation effects on parental survival had this as their main focus, in contrast to your statement.

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We did supply this information when we submitted our manuscript and again during the review process, but we believe this was not passed on to the reviewers by the journal, although supplied by us on several occasions. We regret that the reviewer was impeded by this unfortunate communication failure, but we did our best to make the data available to the reviewers during the initial review process.

      (52) L.361-362: this seems a realistic approach from an evolutionary perspective, but we know from the jackdaw study by Boonekamp that the survival effect of brood size manipulation in a single year is very different from the survival effect of manipulating as in your model, i.e. every year of an individual's life the same manipulation. For very short-lived species this possibly does not make much difference, but for somewhat longer-lived species this could perhaps strongly affect your results. This should be discussed, and perhaps also explored in your simulations?

      Note that the Boonekamp study does not separate whether the survival effects are additive or multiplicative. As such, we do not know whether the survival effects for a single year manipulation are just small and hard to detect, or whether the survival effects are multiplicative. Our simulations assumed that the brood enlargement occurred every year throughout their lives. We have added some text to the discussion on the point you raise.

      (53) L.360: what is "lifetime reproductive fitness"? Is this different from just "fitness"?

      We have changed “lifetime reproductive fitness” to “lifetime reproductive output”.

      (54) L.363: when you are interested in optimal clutch size, why not also explore effects of reducing clutch size?

      As we find that a reduction in clutch size leads to a reduction in survival (for experimental studies), we already know that these individuals would have a reduced fitness return compared to reproducing at their normal level, and so we would not learn anything from adding this into our simulations. The interest in using clutch size enlargements is to find out why an individual does not produce more offspring than it does, and the answer is that it would not have a fitness benefit (unless its clutch size and survival rate combination is out of the bounds of that observable in the wild).

      (55) Fig.1 - using 'parental effort' in the y-axis label is misleading, suggest to replace with e.g. "clutch or brood size". Using "clutch size" in the title is another issue, as the experimental studies typically changed the number of young rather than the number of eggs.

      We have updated the figure axes to say “clutch size” rather than “parental effort”. Please see response to comment 35 where we explain our use of the term “clutch size” throughout this manuscript.

      (56) L.93 - 108: I appreciate the analysis in Table 1, in particular the fact that you present different ways of expressing the manipulation. However, in addition, I would like to see the results of an analysis treating the manipulations as factor, i.e. without considering the scale of the manipulation. This serves two purposes. Firstly, I believe it is in the interest of the field that you include a detailed comparison with the results of Santos & Nakagawa's analysis of what I expect to be largely the same data (manipulation studies only - for this purpose I would also like to see a comparison of effect size between the sexes). Secondly, there are (at least) two levels of meta-analysis, namely quantifying an overall effect size, and testing variables that potentially explain variation in effect size. You are here sort of combining the two levels of analysis, but including the first level also would give much more insight in the data set.

      Our main intention here was to improve on how the same hypothesis was approached by Santos & Nakagawa. We did this by improving our analysis (on a by “egg” basis) and by adding additional studies (i.e. more data). In this process mistakes are corrected (as we re-extracted all data, and did not copy anything across from their dataset – which was used simply to ensure we found the same papers); more recent data were also added, including studies missed by Santos & Nakagawa. This means that the comparison with Santos & Nakagawa becomes somewhat irrelevant, apart from maybe technical reasons, i.e. pointing out mistakes or limitations in certain approaches. We would not be able to pinpoint these problems clearly without considering the whole dataset, yet Santos & Nakagawa only had a small subset of the data that were available to us. In short, meta-analysis is an iterative process and similar questions are inevitably analysed multiple times and updated. This follows basic meta-analytic concepts and Cochrane principles. Except where there is a huge flaw in a prior dataset or approach (like we sometimes found and highlighted in our own work, e.g. Simons, Koch, Verhulst 2013, Aging Cell), in itself a comparison of the kind the reviewer suggests distracts from the biology. With the dataset being made available others can make these comparisons, if required. On the sex difference, we provide a comparison of effect sizes separated between both sexes and mixed sex in Table S2 and Figure S1.

      (57) L.93 - 108: a thing that does not become clear from this section is whether experimentally reducing brood size affects parental survival similarly (in absolute terms) as enlarging brood size. Whether these effects are symmetric is biologically important, for example because of its effect on clutch size optimization. In the text you are specific about the effects of increasing brood size, but the effect you find could in theory be due entirely to brood size reduction.

      We have added detail to make it clear that a brood reduction is simply the opposite trend. We use linear relationships because they serve as a good approximation of the trend and provide a more rigorous test for an underlying relationship than would fitting nonlinear models. For many datasets there is not a range of chicks added for which a non-linear relationship could be estimated. It also remains unclear what the shape of this non-linear relationship should be, and this is hard to determine a priori.

      We have added some discussion on this to our manuscript (L278-282), in response to an earlier comment.

      (58) L.103-107: this is perhaps better deferred to the discussion, because other potential explanations should also be considered. For example, there have been studies suggesting that small birds were provisioning their brood full time already, and hence had no scope to increase provisioning effort when brood size was experimentally increased.

      We agree this is a discussion point but we believe it also provides an important context for why we ran our simulations, and so we believe this is best kept brief but in place. We agree the example you give is relevant but believe this argument is already contained in this section. See line 121-123 “...suggesting that costs to survival were only observed when a species was pushed beyond its natural limits”.

      (59) L.103-107: this discussion sort of assumes that the results in Table 1 differ between the different ways that the clutch/brood size variation is expressed. Is there any statistical support for this assumption?

      We are unsure of what the reviewer means here exactly. Note that in each of the clutch size transformations, experimental and observational effect sizes are significantly opposite. For the proportional clutch size transformation, experimental and observational studies are both separately significantly different from 0.

      (60) L.104: at this point, I would like to have better insight into the data set. Specifically, a scatter plot showing the manipulation magnitude (raw) plotted against control brood size would be useful.

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We did supply this information when we submitted our manuscript and again during the review process, but we believe this was not passed on to the reviewers by the journal.

      Thank you for this suggestion, which is also useful for illustrating how manipulations are relatively stronger for species with smaller clutches, in line with our interpretation of the result presented in Figure 2. We have added Figure S1, which shows the strength of manipulation compared to the species average.

      (61) L. 107: this seems a bold statement - surely you can test directly whether effect size becomes disproportionally stronger when manipulations are outside the natural range, for example by including this characterization as a factor in the models in Table 1.

      It is hard to define exactly what the natural range is here, so it is not easy to factorise objectively, which is why we chose not to do this. However, it is clear that for species with small clutches the manipulation itself is often outside the natural range. Thank you for your suggestion to include a figure for this as it is clear manipulations are stronger in species with smaller clutches. We attribute this to species being forced outside their natural range. We consider our wording makes it clear that this is our interpretation of our findings and we therefore do not think this is a bold statement, especially as it fits with how we interpret our later simulations.

      (62) Fig.3, legend: the term 'node support' does not mean much to me, please explain.

      Node support is a value given in phylogenetic trees to dictate the confidence of a branch. In this case, values are given as a percentage and so can translate to how many times out of 100 the estimate of the phylogeny gives the same branching. Our values are low, as we have relatively few species in our meta-analysis.
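
      As an illustration of how such a percentage can be read, the sketch below counts how often a given grouping appears across a set of sampled trees. It is a simplified, hypothetical example: each tree is represented only as a set of clades, the taxon labels A, B and C are placeholders, and this is not the software used to build the phylogeny in Fig. 3.

```python
def node_support(clade, sampled_trees):
    """Percentage of sampled trees that contain `clade` as a grouping.

    Each tree is represented (as a simplifying assumption) by the set of
    clades it contains, with every clade stored as a frozenset of taxa.
    """
    clade = frozenset(clade)
    hits = sum(1 for tree in sampled_trees if clade in tree)
    return 100.0 * hits / len(sampled_trees)


# Four hypothetical sampled trees for placeholder taxa A, B and C.
trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"B", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
]

# The grouping (A, B) appears in two of the four trees.
support = node_support({"A", "B"}, trees)
```

      A low value therefore indicates that the branching in question is recovered in only a small share of the sampled trees.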

      (63) Fig.3: it would be informative when you indicate in this figure whether the species contributed to the experimental or the observational data set or both.

      We have added into Fig 3 whether the species was observational, experimental or both.

      (64) L.139: the p-value refers to the interaction between species clutch size and treatment (observational vs. experimental), but it appears that no evidence is presented for the correlation being significant in either observational or experimental studies.

      We agree that our reporting of the effect size could be misinterpreted and have added detail here. The statistic provided indicates that the slopes are significantly different between observational and experimental studies, implying there are differences between the slopes of small and large clutch-laying species.

      (65) L.140: I am wondering to what extent these correlations, which are potentially interesting, are driven by the fact that species average clutch size was also used when expressing the manipulation effect. In other words, to what extent is the estimate on the Y-axis independent from the clutch size on the X-axis? Showing that the result is the same when using survival effect sizes per manipulation category would considerably improve confidence in this finding.

      We are unsure what the reviewer means by “per manipulation category”. Please also note that we have used a logistic regression to calculate our effect sizes of survival, given a unit increase in reproductive effort. So, for example, if a population contained birds that lay 2, 3 or 4 eggs, provided that the number of birds which survived and died in each category did not change, if we changed the number of eggs raised to 10, 11 or 12, respectively, then our effect size would be the same. In this way, our effect sizes are independent of the species’ average clutch size.
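
      The invariance described above can be checked numerically. The following is a minimal sketch, not the model code used in the meta-analysis: it fits a plain two-parameter logistic regression by Newton-Raphson to grouped counts, and the survival counts and the function name `fit_logistic` are hypothetical values chosen for illustration.

```python
import math


def fit_logistic(xs, survived, died, iters=25):
    """Fit logit(p) = b0 + b1 * x to grouped counts by Newton-Raphson.

    Returns (b0, b1), where b1 is the change in log-odds of survival
    per additional egg.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, s, d in zip(xs, survived, died):
            n = s + d
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            r = s - n * p            # score contribution of this group
            w = n * p * (1.0 - p)    # information weight of this group
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical counts of parents that survived or died in each category.
survived = [60, 50, 40]
died = [40, 50, 60]

_, slope_small = fit_logistic([2, 3, 4], survived, died)
_, slope_large = fit_logistic([10, 11, 12], survived, died)
# Both slopes equal -ln(1.5), about -0.405; only the intercept differs.
```

      Shifting every clutch size by the same amount changes the intercept but leaves the per-egg slope untouched, which is the sense in which the effect size does not depend on the species’ average clutch size.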

      (66) L.145: when I remember correctly, Santos & Nakagawa considered brood size reduction and enlargement separately. Can this explain the contrasting result? Please discuss.

      You are correct, in that Santos & Nakagawa compared reductions and enlargements to controls separately. However, we found some mistakes in the data extracted by Santos & Nakagawa that we believe explain the differences in our results for sex-specific effect sizes. We do not feel that highlighting these mistakes in the main text is fair, useful or scientifically relevant, as our approach is to improve the test of the hypothesis.

      (67) L.158-159: looking at table S2 it seems to me you have a whole range of estimates. In any case, there is something to be said for taking the estimates for females because it is my impression (and experience) that clutch size variation in most species is a sex-linked trait, in that clutch size tends to be repeatable among females but not among males.

      We agree that, in many cases, the female is the one that ultimately decides on the number of chicks produced. We did also consider using female effect sizes only, however, we decided against this for the following reasons: (1) many of the species used in our meta-analysis exhibit biparental care, as is the case for many seabirds, and so using females only would bias our results towards species with lower male investment; in our case this would bias the results towards passerine species. (2) it has also been shown that, as females in some species are operating at their maximum of parental care investment, it is the males who are able to adjust their workload to care for extra offspring. (3) we are ultimately looking at how many offspring the breeding adults should produce, given the effort it costs to raise them, and so even if the female chooses a clutch size completely independently of the male, it is still the effort of both parents combined that determines whether the parents gain an overall fitness benefit from laying extra eggs. (4) some studies did not clearly specify male or female parental survival and we would not want to reduce our dataset further.

      (68) L.158-168: please explain how you incorporated brood size effects on the fitness prospects of offspring, given that it is a very robust finding of brood size manipulation studies that this affects offspring growth and survival.

      We would argue this is near-on impossible to incorporate into our simulations. It is unrealistic to suggest that incorporating offspring growth into our simulations would add insight, as a change in offspring number rarely affects all offspring in the nest equally and there can even be quite stark differences; for example, this will be most evident in species that produce sacrificial offspring. This effect will be further confounded by catch-up growth, for example, and so it is likely that increased sibling competition from added chicks alters offspring growth trajectories, rather than absolute growth as the reviewer suggests. There are mixed results in the literature on the effect of altering clutch size on offspring survival, with an increased clutch size through manipulation often increasing the number of recruits from a nest. It would be interesting, however, to explore this further using estimates from the literature, but this is beyond our current scope, and would in our initial intuition not be very accurate. It would be interesting to explore how big the effect on offspring should be to constrain effect size strongly. Such work would be more theoretical. The point of our simple fitness projections here is to aid interpretation of the quantitative effect size we estimated.

      (69) L.163: while I can understand that you select the estimate of -0.05 for computational reasons, it has enormous confidence intervals that also include zero. This seems problematic to me. However, in the simulations, you also examined the results of selecting -0.15, which is close to the lower end of the 95% C.I., which seems worth mentioning here already.

      Thank you for this suggestion. Yes, indeed, our range was chosen based on the CI, and we have now made this explicit in the manuscript.

      (70) L.210: defined in this way, in my world this is not what is generally taken to be a selection differential. Is what you show not simply scaled lifetime reproductive success?

      As far as we are aware, a selection differential is the relative change between a given group and the population mean, which is what we have done here. We appreciate this is a slightly unusual context in which to place this, but it is more logical to consider the individuals who produce more offspring as carrying a potential mutation for higher productivity. However, we believe that “selection differential” is the best terminology for the statistic we present. We also detail in our methodology how we calculate this. We have adjusted this sentence to be more explicit about what we mean by selection differential.

      (71) L.177-180: is this not so because these parameter values are closest to the data you based your estimates on, which yielded a low estimate and hence you see that here also?

      We are unsure of what exactly the reviewer means here. The effect sizes for our exemplar species were predicted from each combination of clutch size and survival rate. Note that we used a range of effect sizes, higher than that estimated in our meta-analysis, to explore a large parameter space and that these same conclusions still hold.

      (72) L.191-194: these statements are problematic, because based on the assumption that an increase in brood size does not impact the fitness prospects of the offspring, and we know this assumption to be false.

      Though we appreciate that some cost is often absorbed by the offspring themselves, we are unaware of any evidence that these costs are substantial and large enough to drive within-species variation in reproductive effort, though for some specific species this may be the case. However, in terms of explaining a generalisable, across-species trend, the fitness costs incurred by a reduction in offspring quality are unlikely to be significantly larger than the survival costs to reproduce. We also find it highly unlikely the cost to fitness incurred by a reduction in offspring quality is large enough to counter-balance the effect of parental quality that we find in our observational studies. We do also discuss other costs in our discussion.

      (73) L.205: here and in other places it would be useful to be more explicit on whether in your discussion you are referring to observational or experimental variation.

      We have added this detail to our manuscript. Do note that many of our conclusions are drawn by the combination of results of experimental and observational studies. We believe the addition of Figure 5 makes this more clear to the reader.

      (74) L.225: this may be true (at least, when we overlook the misuse of the word 'quality' here), but I would expect some nuance here to reflect that there is no surprise at all in this result as this pattern is generally recognized in the literature and has been the (empirical) basis for the often-repeated explanation of why experiments are required to demonstrate trade-offs. On a more quantitative level, it is worth mentioning the paper of Vedder & Bouwhuis (2017, Oikos) that essentially shows the same thing, i.e. a positive association between reproductive output and parental survival.

      We have added some discussion on this point, including adding the citation mentioned. However, we would like to highlight that our results demonstrate that brood manipulations are not necessarily a good test of trade-offs, as they fail to recognise that individuals differ in their underlying quality. Though we agree that this result should not necessarily be a surprising one, we have also not found it to be the case that differences in individual quality are accepted as the reason that intra-specific clutch size is maintained – in fact, we find that, when costs of reproduction are not identified, it is most commonly concluded that the costs must be elsewhere – yet we cannot find conclusive evidence that the costs of reproduction (wherever they lie) are driving intra-specific variation in reproductive effort. Furthermore, some studies in our dataset have reported negative correlations between reproductive effort and survival (see observational studies, Figure 1).

      (75) L.225-226: perhaps present this definition when you first use the term.

      We have added more detail to where we first use and define this term to improve clarity (L57-58).

      (76) L.227-228, "currently unknown": this statement surprised me, given that there is a plethora of studies showing within-population variation in clutch size to depend on environmental conditions, in particular the rate at which food can be gathered.

      We mean to question that if an individual is “high quality”, why is it not selected for? We have rephrased, to improve clarity.

      (77) L.231: this seems no more than a special case of the environmental effect you mention above.

      We think this is a relevant special case, as it constitutes within-individual variation in reproduction that is mistaken for between-individual variation. This is a common problem in our field that we feel needs addressing. We only have between-individual variation here in our study on quality, and by highlighting this we show that there might not be any variation between individuals, but this could come about fully (doubtful) or partly (perhaps likely) due to terminal effects.

      (78) L235-236: but apparently depending on how experimental and natural variation was expressed? Please specify here.

      We are not sure what results the reviewer is referring to here, as we found the same effect (smaller clutch laying species are more severely affected by a change in clutch size) for both clutch size expressed as raw clutch size and standardised clutch size.

      (79) L.237: the concept of 'limits' is not very productive here, and it conflicts with the optimality approach you apply elsewhere. What you are saying here can also be interpreted as there being a non-linear relationship between brood size manipulation and parental survival, but you do not actually test for that. A way to do this would be to treat brood size reduction and enlargement separately. Trade-off curves are not generally expected to be linear, so this would also make more sense biologically than your current approach.

      We have replaced “limits” with “optima”. We believe our current approach of treating clutch size as a continuous variable, regardless of manipulation direction, is the best approach, as it allows us to directly compare with observational studies and between species that use different manipulations (now nicely illustrated by the reviewer’s suggested Figure S1). Also note that transforming clutch size to a proportion of the mean allows us to account for the severity in change in clutch size. We also do not believe that treating reductions and enlargements separately accounts for non-linearity, as either we are separating this into two linear relationships (one for enlargements and one for reductions) or we compare all enlargements/reductions to the control, as in Santos & Nakagawa 2012, which does not take into account the severity of the increase, which we would argue is worse for accounting for non-linearity. Furthermore, in the cases where the manipulation involved one offspring only, we also cannot account for non-linearity.

      (80) L.239: assuming birds are on average able to optimize their clutch size, one could argue that any manipulation, large or small, on average forces birds to raise a number of offspring that deviates from their natural optimum. At this point, it would be interesting to discuss in some detail studies with manipulation designs that included different levels of brood size reduction/enlargement.

      We agree with the reviewer that any manipulation is changing an individual’s clutch size away from its own individual optimum, which we have argued also means brood manipulations are not necessarily a good test of whether a trade-off occurs in the wild (naturally), as there could be interactions with quality – we have now edited to explicitly state this (L299-300).

      (81) L.242-244: when you choose to maintain this statement, please add something along the lines of "assuming there is no trade-off between number and quality of offspring".

      As explained above, though we agree that the offspring may incur some of the cost themselves, we are not aware of any evidence suggesting this trade-off is also large enough to drive intra-specific variation in clutch size across species. Furthermore, in the context here, the trade-off between number and quality of offspring would not change our conclusion – that the fitness benefit of raising more offspring is offset by the cost on survival. We have added detail on the costs incurred by offspring earlier in our discussion (L309-315). The addition of Figure 5 should help interpret these data.

      (82) L.253: instead of reference 30 the paper by Tinbergen et al in Behaviour (1990) seems more appropriate.

      We believe our current citation is relevant here but we have also added the Tinbergen et al (1990) citation.

      (83) L.253-254: such trade-offs may perfectly explain variation in reproductive effort within species if we were able to estimate cost-benefit relations for individuals. In fact, reference 29 goes some way to achieve this, by explaining seasonal variation in reproductive effort.

      We are unaware of any quantitative evidence that any combination of trade-offs explains intra-specific variation in reproductive effort, especially as a general across-species trend.

      (84) L.255: how does one demonstrate "between species life-history trade-offs"? The 'trade-off' between reproductive rate and survival we observe between species is not necessarily causal, and hence may not really be a trade-off but due to other factors - demonstrating causality requires some form of experimental manipulation.

      Between-species trade-offs are well established in the field, stemming from GC Williams’ seminal paper in 1966, and for example in r/K selection theory. It is possible to move from these correlations to testing for causation, and this is happening currently by introducing transgenes (genes from other species) that promote longevity into shorter-lived species (e.g., naked mole-rat genes into mice). As yet it is unclear what the effects on reproduction are.

      (85) L.256: it is quite a big claim that this is a novel suggestion. In fact, it is a general finding in evolutionary theory that fitness landscapes tend to be rather flat at equilibrium.

      It is important to note here that we simulate the effect size found, and hence this is the novel suggestion, that because the resulting fitness landscape is relatively flat there is no directional selection observed. We did not intend to suggest our interpretation of flat fitness landscapes is novel. We have changed the phrasing of this sentence to avoid misinterpretation.

      (86) L.259: why bring up physiological 'costs' here, given that you focus on fitness costs? Do you perhaps mean fitness costs instead of physiological costs? Furthermore, here and in the remainder of this paragraph it would be useful to be more specific on whether you are considering natural or experimental variation.

      The cost of survival is a physiological cost incurred by the reduction of self-maintenance as a result of lower resource allocation. This is one arm of fitness; we feel it would be confusing here to talk about costs to fitness, as we do not assess costs to future reproduction (which formed the large part of the critique offered by the reviewer). We would like to highlight that the aim of this manuscript was to separate costs of reproduction from the effects of quality, and this is why we have observational and experimental studies in one analysis, rather than separately. Our conclusion that we have found no evidence that the survival cost to reproduce drives within-species variation in clutch size comes both from the positive correlation found in the observational studies and our negligible fitness return estimates in our simulations. We therefore do not believe it is helpful to separate observational and experimental conclusions throughout our manuscript, as the point is that they are inherently linked. We hope that the addition of Figure 5 makes this clearer.

      (87) L.262: The finding that naturally more productive individuals tend to also survive better one could say is by definition explained by variation in 'quality', how else would you define quality?

      We agree, and hence we believe quality is a good term to describe individuals who perform highly in two different traits. Note that we also say the lack of evidence that trade-offs drive intra-specific variation in clutch size also potentially suggests an alternative theory, including intra-specific variation driven by differences in individual quality.

      Supplementary information

      (88) Table S1: please provide details on how the treatment was coded - this information is needed to derive the estimates of the clutch size effect for the treatments separately.

      We have added this detail.

      (89) Table S2: please report the number of effect sizes included in each of these models.

      We have added this detail.

      (90) Table S4: references are not given. Mentioning species here would be useful. For example, Ashcroft (1979) studied puffins, which lay a single egg, making me wonder what is meant when mentioning "No clutch or brood size given" as the reason for exclusion. A few more words to explain why specific studies were excluded would be useful. For example, what does "Clutch size groups too large" mean? It surprises me that studies are excluded because "No standard deviation reported for survival" - as the exact distribution is known when sample size and proportion of survivors is known.

      We have updated this table for more clarity.
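      The reviewer's point in comment (90), that the spread of a survival estimate follows from the sample size and the proportion of survivors, can be illustrated with a minimal sketch. It assumes survival is recorded as a 0/1 outcome per individual, so the binomial formulas apply. The function name and the example numbers are hypothetical and are not taken from the authors' dataset or code.

```python
import math


def binomial_sd(n_individuals: int, proportion_survived: float) -> tuple[float, float]:
    """Return (SD of the 0/1 survival outcome, standard error of the proportion).

    Assumes each individual either survives (1) or does not (0), so the
    variance of a single outcome is p * (1 - p).
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    if not 0.0 <= proportion_survived <= 1.0:
        raise ValueError("proportion_survived must lie in [0, 1]")
    variance = proportion_survived * (1.0 - proportion_survived)
    sd = math.sqrt(variance)
    se = math.sqrt(variance / n_individuals)
    return sd, se


# Hypothetical study: 40 birds, of which 25% survived to the next season.
sd, se = binomial_sd(40, 0.25)
print(f"SD = {sd:.3f}, SE of proportion = {se:.3f}")
```

      Under this assumption, a study that reports only the sample size and the survival proportion still carries enough information to derive a standard deviation.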

      (91) Fig.S1: please plot different panels with the same scale (separately for observational and experimental studies). You could add the individual data points to these plots - or at least indicate the sample size for the different categories (female, male, mixed).

      We have scaled all panels to have the same y axis and added sample sizes to the figure legend.

      (92) Fig.S3: please provide separate plots for experimental and observational studies, as it seems entirely plausible that the risk of publication bias is larger for observational studies - in particular those that did not also include a brood size manipulation. At the same time, one can wonder what a potential publication bias among observational studies would represent, given that apparently you did not attempt to collect all studies that reported the relevant information.

      We have coloured the points for experimental and observational studies. Note that a study is an independent effect size and, therefore, does not indicate whether multiple data (i.e., both experimental and observational studies) came from the same paper. As we detail in the paper and above in our reviewer responses, we searched for observational studies from species used in the experimental studies to allow direct comparison between observational and experimental datasets.

      Reviewer #2 (Recommendations For The Authors):

      I strongly recommend improving the theoretical component of the analysis by providing a solid theoretical framework before, from it, drawing conclusions.

      This, at a minimum, requires a statistical model and most importantly a mechanistic model describing the assumed relationships.

      We thank the reviewer for highlighting that our aims and methodology are unclear in places. We have added detail to our model and simulation descriptions and have improved the description of our rationale. We also feel the failure of the journal to provide code and data to the reviewers has not helped their appreciation of our methodology and use of data.

      Because the field uses the same wording for different concepts and different wording for the same concept, a glossary is also necessary.

      We thank the reviewer for raising this issue. During the revision of this manuscript, we have simplified our terminology or given a definition, and we believe this is sufficient for readers to understand our terminology.

      Reviewer #3 (Recommendations For The Authors):

      • The files containing information of data extracted from each study were not available so it has not been possible to check how any of the points raised above apply to the species included in the study. The ms should include this file on the Supp. Info as is standard good practice for a comparative analysis.

      We supplied a link to our full dataset and the code we used in Dryad with our submitted manuscript. We were also asked to supply our data during the review process and we again supplied a link to our dataset and code, along with a folder containing the data and code itself. We received confirmation that the reviewers had been given our data and code. We support open science and it was our intention that our dataset should be fully available to reviewers and readers. We believe the data is too large to include as a table in the main text and is not essential in understanding the paper. Our data and code are at https://doi.org/10.5061/dryad.q83bk3jnk.

      • For clarity, refer to "the effect size of clutch size on survival" rather than simply "effect size". Figures 1 and 2 require cross-referencing with the main text to understand the y-axis.

      We have added detail to the figure legend to increase the interpretability of the figures.

      • Silhouettes in Figure 3 (or photos) would help readers without ornithological expertise to understand the taxonomic range of the species included in the analyses.

      We have added silhouettes into Figure 3.

      • Throughout the discussion: superscripts shouldn't be treated as words in a sentence so please add authors' names where appropriate.

      We have added author names and dates where required.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We would like to thank the reviewers for their attentive reading of our manuscript. We appreciate all the comments and suggestions. We have addressed all the concerns and have included point-by-point responses.

      Reviewer #1

      Evidence, reproducibility and clarity

      Summary:

      Cacioppo et al perform a meta-analysis of public omics data examining AURKA protein and mRNA expression (including mRNA isoforms with alternative cleavage and polyadenylation), and hsa-let-7a miRNA (shown to target AURKA mRNA) in multiple cancer types from The Cancer Genome Atlas. They conclude AURKA mRNA and protein expression may be discordant in cancer in part due to the interplay between alternative polyadenylation and hsa-let-7a miRNA.

      Major comments:

      1) Unfortunately, there is a major flaw in the TCGA AURKA protein quantification data that underpins much of this study. Following the protein data trail (via https://docs.gdc.cancer.gov/Data/Introduction and its dependents), it appears to rely on the CST anti-AURKA #14475 which is raised to an antigen around Pro70.

      Response: We believe the reviewer refers to the paper by Bertolin et al. 2018 (https://doi.org/10.7554/eLife.38111.001), which describes the appearance of truncated versions of AURKA in mitochondrial fractions of cell extracts and shows they depend upon the presence of PMPCB mitochondrial matrix peptidase. We are not familiar with any other literature describing this phenomenon. In our own hands we find AURKA present in the mitochondrial fraction, but the protein is mostly full-length (Grant et al. 2018, https://doi.org/10.1098/rsob.170272). In both papers the mitochondrial pool is small relative to the total cellular pool of AURKA. In fact, this mitochondrial pool is so difficult to detect in intact cells that it has not been reported by other labs and is not universally acknowledged. Given the small size of the mitochondrial pool, any increased amount of mitochondrial AURKA in cancers would be unlikely to significantly impact the measured total protein levels.

      2) Following the flaws identified in the protein foundation data, the study would then benefit from some post-validation of findings with actual biological data derived from their own independent assessment of the cancers being examined.

      Response: The literature thoroughly reports empirical evidence on AURKA protein expression levels in the cancers analysed in this study, therefore we don't believe our own post-validation of findings would add any novelty in this sense.

      Minor comments:

      1) All of the Correlation analysis have been tested for statistical significance and these results are available in the supplementary data. However, I think it would be useful if these statistics were also included in the main figures themselves. (Figures 1B, 2B and 2C) A low correlation that is statistically significant is a more powerful statement.

      Response: We agree, and plan to add the results of the statistical analyses in the Figures 1B, 2B and 2C.
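      To make concrete what reporting a correlation together with its significance involves, here is a minimal sketch that computes a Pearson coefficient and a two-sided permutation p-value. The permutation test is an assumption chosen so that the example needs only the standard library; the manuscript's own statistical test is not specified in this exchange. The function names and the per-sample values are hypothetical and are not TCGA data.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Return (r, p), where p is a two-sided permutation p-value for r."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any pairing between x and y
        if abs(pearson_r(x, shuffled)) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the estimate strictly above zero.
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical per-sample values, for illustration only.
mrna = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7, 1.5, 4.4]
protein = [1.0, 1.9, 0.7, 2.6, 1.2, 2.2, 0.9, 2.4]
r, p = permutation_p_value(mrna, protein)
print(f"r = {r:.2f}, permutation p = {p:.4f}")
```

      Annotating each panel with both numbers would let readers tell a weak but significant correlation from a weak and non-significant one, which is the distinction the reviewer asks for.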

      2) In the materials and methods, Correlation is separated into distinct degrees: none to very strong, but apart from some lines on the graphs, these degrees of correlation strength are never revisited, so they should be included. Perhaps there is a biological difference between AURKA post transcriptional regulation and protein levels with different R score strength?

      Response: We believe that reiterating a discussion on the degrees of correlation strength in the main text would appear repetitive. We do however plan to add a sentence to appropriate points in the main text to redirect the reader to the materials and methods section for information on the distinct degrees of correlation.

      3) In Figure 2D a clustering analysis was performed to show the possible relationships between hsa-let-7a and protein levels. The current visualization is hard to understand. A 3D graph with Protein, mRNA and hsa-let-7a axes would be easier to follow. I believe it would also be beneficial to do something similar including the APA data as this is the area that the paper lacks depth.

      Response: We agree that 3D graphs could aid visualization and plan to provide a link to an interactive 3D view of our analysis.

      4) Figure 3B and 3C, can you apply a statistical test on the SLR ratios given the magnitude difference between CCND1 and AURKA SLRs?

      Response: Since the values of AURKA and CCND1 SLRs are not always coming from the same dataset and are therefore not matched for patients, we believe it would not be appropriate to make comparisons applying statistical tests.

      5) Even though the paper does not claim to provide a unifying hypothesis for APA/hsa-let-7a regulation of AURKA, I think a more in depth look at the data would be useful. The discussion starts off well when describing what was found with the analysis, but as is, is mostly a re-statement of the results without added insight.

      Response: We agree that more in depth analysis of more data would be useful in strengthening conclusions. However, given the variability in interplay between APA and hsa-let-7a we describe, it is well beyond the scope of this study (or the extent of TCGA database) to come up with a unifying hypothesis.

      Significance

      The study is novel in attempting to show additional layers of AURKA regulation that hadn't been previously investigated. Furthermore, factors controlling AURKA expression are of broad interest. Overall, I would like to say this is an interesting investigation into AURKA mRNA expression in cancers. In our opinion the choice of bioinformatic tools is appropriate and well controlled.

      General Assessment: As noted in the major comments, a major weakness is the reliance on a flawed measure of AURKA protein levels from the foundation dataset. Thus, the study needs to be repeated using an alternative MS derived dataset to accurately quantify total AURKA protein levels. This would greatly improve the study and subsequent claims.

      Advance: The study has potential to extend knowledge in the field in a conceptual way, predicting the complex interplay of factors that regulate AURKA mRNA processing and translation.

      Audience: Currently the paper is only fully accessible to a specialized bioinformatician audience but the topic (factors controlling AURKA expression) has a broad interest in many fields not limited to just cancer but also development and other non-cancer diseases.

      This review was jointly completed by a mouse model of human disease AURKA biologist with 24 years' experience, and a bioinformatician.

      Reviewer #2

      Evidence, reproducibility and clarity

      In the manuscript "Post-transcriptional control drives Aurora kinase A expression in human cancers", authors Cacioppo, Lindon and colleagues analyze publicly available data on transcript and protein levels for many cancer types to determine correlations between transcript and protein levels for Aurora A and the microRNA hsa-let-7a. This study builds on a recent publication from their lab where they show that different polyadenylation isoforms of the Aurora A transcript in triple negative breast cancer correlate with patient survival and affect protein abundance. In this study, they aim to extend this analysis to 18 different cancer types to determine if posttranscriptional regulation potentially plays a role in Aurora A protein abundance. The authors find that for certain cancer types, Aurora A protein abundance does not correlate with mRNA abundance, suggesting that posttranscriptional regulation may be responsible for differences in protein expression in these cancer types. Furthermore, they find negative correlations between expression of hsa-let-7a and mRNA and protein abundance in certain cancer types, implicating this microRNA as a potential regulator of Aurora A mRNA stability.

      Major comments:

      1. The biggest issue that I have with this analysis relates to the assumption that Aurora A levels will be meaningfully different between individual tumors in all cancer types. For some cancers, the lack of a correlation between mRNA and protein levels for Aurora A could simply be because Aurora A overexpression is not a feature of that cancer type. Looking at the data, the cancer types where they see little-to-no correlation are the cancer types where none of the tumors have high levels of Aurora A mRNA or protein. Therefore, the lack of correlation is likely because differences in protein levels result from noise in the measurements rather than posttranscriptional regulation. Since the lack of correlation between protein and mRNA in these cancer types is the main evidence for the primary conclusion in the paper that "AURKA mRNA and protein expression are often discordant in cancer as a result of dynamic post-transcriptional regulation", I don't think that this conclusion is supported by the data. If anything, the data seems to show that substantial changes in Aurora A protein levels are almost always accompanied by a corresponding change in mRNA levels.

      To address this issue, the authors could look at the variability in Aurora A protein levels for each cancer type, and then focus their correlation analyses on cancer types where overexpression of Aurora A is a feature.

      Response: We thank the reviewer for this thoughtful comment. We decided not to consider data on AURKA protein levels between healthy and tumour samples because of the lack of proteomic datasets of matching normal tissues for all cancers (except BRCA) in the TCGA database. For this reason, it cannot be excluded that the tumours where we see little-to-no protein-mRNA correlation have in fact high levels of AURKA protein. Indeed, the literature reports wide empirical evidence that AURKA protein is overexpressed in the cancer tissues where we see little-to-no protein-mRNA correlation (Thyroid cancer: Zhao et al, Cell Biosci, 2022; Jingtai et al, Cell Death Dis, 2023. Prostate cancer: Das et al, Pathol, 2010; Chun Yu Lee et al, Cancer Res, 2006. Kidney cancers: Wen et al, Heliyon, 2024; Li et al, Cell Death Dis, 2022. No evidence available for PCPG). Therefore, we believe that it is reasonable to propose that in these cancers, which according to our analysis of TCGA data only show minor or no increase in AURKA mRNA expression compared to the normal tissue, lack of correlation is because of post-transcriptional regulation.

      2. The statistical significance of the analyses is often unclear. For the correlations between Aurora A protein levels and hsa-let-7a, authors mention that two cancers have a correlation with "statistical significance", but I cannot find any indication of how that was determined, and it is not shown in the corresponding figure (2C). The only time significance is indicated for a correlation is in Figure 4A. Is this the only correlation in the whole manuscript with a p-value less than .05?

      Response: The results of the statistical analyses are included in the corresponding supplementary data (Sup. Fig 1, Sup. Fig. 2A-B). We plan to add them to the Figures 1B, 2B and 2C as requested by another reviewer.

      3. The SLR for the Aurora A transcripts is only shown in terms of a ratio between cancer and normal tissue. Without the numbers in the absence of normalization, it is difficult to determine how meaningful this is. Is a two-fold change going from .3 to .6 or .001 to .002?

      Response: We plan to add a supplementary table containing the SLR values for matched normal and cancer samples in the absence of normalization.
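      The reviewer's concern about normalized SLR values can be shown with a short sketch: the same two-fold change arises from very different raw ratios, which is why the un-normalized values matter. The function name is hypothetical, and the input numbers are the reviewer's illustrative values, not measured SLRs.

```python
def slr_change(normal_slr: float, tumour_slr: float) -> dict:
    """Summarize a change in short/long ratio (SLR) between matched samples."""
    if normal_slr <= 0:
        raise ValueError("normal_slr must be positive to compute a fold change")
    return {
        "normal_slr": normal_slr,
        "tumour_slr": tumour_slr,
        "fold_change": tumour_slr / normal_slr,      # what a normalized plot shows
        "absolute_change": tumour_slr - normal_slr,  # what normalization hides
    }


# The two scenarios raised by the reviewer.
large_shift = slr_change(0.3, 0.6)
tiny_shift = slr_change(0.001, 0.002)

for label, result in (("0.3 to 0.6", large_shift), ("0.001 to 0.002", tiny_shift)):
    print(
        f"{label}: fold change = {result['fold_change']:.1f}, "
        f"absolute change = {result['absolute_change']:.3f}"
    )
```

      Both cases report the same fold change, yet the absolute shift in isoform usage differs by a factor of 300, so a table of raw SLR values lets readers judge which situation applies.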

      4. Figure 5B is nearly impossible to interpret due to the extreme differences in overall transcript levels between the cancer types. The differences in scaling of the y-axis between the plots makes this even more challenging. The authors state that "It is evident that each isoform has an individual profile of expression across cancers", but this could only be determined from relative expression levels between the different isoforms instead of absolute levels.

      Response: We retrieved this plot from the GEPIA2 platform without possibility of editing the y-axis. We plan to edit the text to "It is likely that each isoform has an individual profile of expression across cancers, however a measure of the relative expression levels between the different isoforms would be required".

      Minor comments:

      1. In supplementary figure 3, SLR is plotted on a log scale in A and a linear scale in B.

      Response: We plan to convert the SLR scale in Sup. Fig. 3B to a log scale.

      2. Figure 4D is a correlation of correlations. I don't see how to interpret this in a meaningful way.

      Response: Figure 4D is not intended for quantitative analysis of correlation of correlations (no quantitative coefficients were in fact calculated), rather to visualize how the link of AURKA SLR with AURKA protein levels and that with hsa-let-7a levels can be differently associated in different cancers.

      Significance

      Aurora A is overexpressed in a wide variety of cancer types. This overexpression is commonly believed to result primarily from increased mRNA abundance. The identification of additional mechanisms regulating Aurora A protein levels would therefore be of interest to the field, as these regulatory mechanisms could be contributing to Aurora A's role in cancer progression.

      To some degree, the significance of the findings presented here depends on whether they convincingly demonstrate substantial post-transcriptional regulation. My interpretation of the data presented in this manuscript is that it largely supports Aurora A protein levels being extremely well correlated with mRNA levels, which is in line with previous findings.

      Reviewer #3

      Evidence, reproducibility and clarity

      Aurora A misregulation at both mRNA and protein levels has been known since the 1990s to be causally associated in vivo, and strongly associated in vitro, with tumourigenesis. The study builds the case that dysregulation of Aurora A mRNA and protein levels (most previously established) are more prevalent in cancer cells than 'normal' cells, using data from TCGA, and extends this to a mechanistic explanation. It evaluates miRNA and the short/long ratio (SLR) of the two mRNA isoforms across cancer types compared to healthy controls. The work concludes that an interplay between APA (alternative polyadenylation) and hsa-let-7a miRNA (which has known tumor suppressor properties) regulation of AURKA mRNA contributes to alternative splicing, revealing a new factor explaining changes in AURKA expression in many (if not all) cancers.

      Minor points:

      1) To strengthen the study, some analysis of AURKB mRNA would be useful in the same datasets, because this is also an M-phase kinase.

      Response: We carried out a specific study of AURKA (and to some extent also of the cell cycle regulator CCND1) using time-limited access to private TCGA datasets. Although we agree that investigation of AURKB would potentially enable us to strengthen some conclusions, this would be a new project that we do not currently have resources for.

      2) What happens to TPX2 or CEP192 mRNA (splicing or levels) in the same samples? For TPX2 in particular, this is described in the literature to help form the oncogenic holoenzyme, as well as dictating AURKA protein stability.

      Response: Again, we like this suggestion but are not in a position to carry out analyses of TPX2 and CEP192 within the scope of this study.

      3) Does an alternative AURKA splicing change G1/S to G2/M-phase roles of AURKA? I understand that mRNA is repressed by hsa-let-7a in G1 and S phases but not in G2, so how does non M-phase AURKA protein get made? This may be beyond the scope of the study at this point.

      Response: Whether alternative AURKA transcripts change non-mitotic roles of AURKA is an open and intriguing question. In acknowledgement of this point raised by the reviewer, we plan to add a discussion on this in the main text: "Although there is no evidence to date that different AURKA transcripts might influence AURKA activity, instances of isoform-dependent protein localization and function are increasingly reported (Mitschka and Mayr, Nat Rev Mol Cell Biol, 2022). In a previous study, we have detected higher nuclear localization of a reporter protein under the regulation of AURKA short 3'UTR (Cacioppo et al., eLife, 2023). Therefore, there is a possibility that AURKA mRNA isoforms are targeted to different subcellular localizations to support localized translation - or that AURKA protein is co-translationally targeted to different compartments - and AURKA may be preferentially localized in the nucleus when coded by the short 3'UTR mRNA".

      AURKA protein levels are maintained very low in G1 to S phase compared to G2 and M phases. At the level of translation, this is likely ensured by the absence of factors/mechanisms that activate AURKA translation (e.g., hnRNP Q1) and the presence of factors/mechanisms that repress its translation (e.g., hsa-let-7a), the combination of which results in basal translation of AURKA in G1/S until full translational activation in G2 (where a switch likely occurs whereby activating factors operate while repressing factors are disabled). However, the combination and synergy of these factors/mechanisms are likely cell type- and context-dependent.

      Significance

      I think the study is strong overall, and the authors are humble enough to describe the work as an exploratory analysis, which though not directly in my area of expertise (since it relies on data assembly and statistical analysis), has the right team to ask the questions and interrogate the data. It builds on a huge amount of literature and a recent study from this team showing that alternative translation is relevant to activation of AURKA, and which linked let-7a to this process. Overall, the study provides a very useful resource for other researchers, assembling a large amount of data around AURKA mRNA variants, Let-7a miRNA and coming to the conclusions that

      1) hsa-let-7a potentially negatively controls the rate of degradation or translation of AURKA mRNA in cancer cells.

      2) Splicing-related architecture of the 5'UTR of AURKA mRNA likely plays a role in determining the context-dependent expression profile in cancer.

      Overall, with some extra information around the key regulators of AURKA (TPX2 mRNA?) the work is likely to be cited and spur on future studies.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable paper presents a new protocol for quantifying tRNA aminoacylation levels by deep sequencing. The improved methods for discrimination of aminoacyl-tRNAs from non-acylated tRNAs, more efficient splint-assisted ligation to modify the tRNAs' ends for the following RT-PCR reaction, and the use of an error-tolerating mapping algorithm to map the tRNA sequencing reads provide new tools for anyone interested in tRNA concentrations and functional states in different cells and organisms. The results and conclusions are solid with well-designed tests to optimize the protocol under different conditions.

      Public Reviews:

      We thank both reviewers for suggestions, feedback and improvements. We address these pointwise below.

      Reviewer #1 (Public Review):

      Summary:

      The manuscript of Davidsen and Sullivan describes an improved tRNA-seq protocol to determine aminoacyl-tRNA levels. The improvements include: (i) optimizing the Whitfeld or oxidation reaction to select aminoacyl-tRNAs from oxidation-sensitive non-acylated tRNAs; (ii) using a splint-assisted ligation to modify the tRNAs' ends for the following RT-PCR reaction; (iii) using an error-tolerating mapping algorithm to map the tRNA sequencing reads that contain mismatches at modified nucleotides.

      Strengths:

      The two steps, the oxidation, and the splint-assisted ligation are yield-diminishing steps, thus the protocol of Davidsen and Sullivan is an important improvement of the current protocols to enhance the quantification of aminoacyl-tRNAs.

      Weaknesses:

      The oxidation and the selection of aminoacyl-tRNA is the first step in all protocols. Thereafter they differ on whether blunt ligation, hairpin (DM-tRNA-seq, YAMAT-seq, QuantM-seq, mim-tRNA-seq, LOTTE tRNA-seq), or splint ligation is used and finally what detection method is applied (i-tRAP, tRNA microarrays). What is the correlation to those alternative approaches (e.g. i-tRAP (PMID 36283829), tRNA microarrays (PMID: 31263264) etc.)? What is the correlation with other approaches with which this improved protocol shares some steps (DM-tRNA-seq, mim-tRNA-seq)?

      We appreciate the fair assessment and fully agree that our work would benefit from a large comparison between all known tRNA-seq methods. We did directly compare many elements of our method to those of other methods (e.g. ligation efficiency and barcode bias); however, as noted by the reviewer we did not perform a direct end-to-end comparison with all other methods. An ideal comparison would require running several different sample conditions and technical replicates through our protocol and repeating the process across a half dozen or so other methods as they are described. Unfortunately, this approach is unlikely to be feasible since each method uses different oligos, reagents and kits, and all would have to be acquired at substantial cost. Some methods also rely on other detection methods such as microarrays, qPCR, or Illumina sequencing, which would also make this goal all the more onerous. There are also different pipelines for data processing that, in some instances, make the final results hard to compare. In short, this would be a monumental and expensive task to do comprehensively. We also worry that, even if these experiments were conducted such that some variables were concluded to be superior, they could still be challengeable based on perceived or actual protocol differences from the prior art. In summary, we think that an overall comparison with each method would be ideal, but practical concerns limit us to optimizing and comparing the variables that we found to be most prone to introducing bias in the results.

      For methods that measure tRNA expression levels (DM-tRNA-seq, YAMAT-seq, QuantM-seq, mim-tRNA-seq, LOTTE tRNA-seq etc.) there are some fundamental problems regarding absolute quantification using NGS that preclude simple comparisons. These problems are well known in the field of microRNA (Fuchs et al. (2012) [PMID: 25942392]) and arise due to several factors introduced during processing steps such as purification, ligation, reverse transcription and amplification. With the lack of a “true” quantitation benchmark it would be difficult to make quantitative claims from each. Therefore, in our own work we benchmark tRNA expression levels for sample-to-sample reproducibility (i.e. precision) as further explained in the response to reviewer #2.

      For comparison to methods that measure tRNA charge we did have an opportunity to compare our results with those of another study. To this end, we have added a figure comparing the baseline charge found using our method and the one used in Evans et al. (Revised manuscript Figure 2—figure supplement 9). This comparison finds broadly similar results for tRNA charge, including similar trends for a subset of Glu, Ser and Pro codons that are notable for their lowered basal tRNA charge.

      Reviewer #2 (Public Review):

      Davidsen and Sullivan present an improved method for quantifying tRNA aminoacylation levels by deep sequencing. By combining recent advances in tRNA sequencing with lysine-based chemistry that is more gentle on RNA, splint oligo-based adapter ligation, and full alignment of tRNA reads, they generate an interesting new protocol. The lab protocol is complemented by a software tool that is openly available on Github. Many of the points highlighted in this protocol are not new but have been used in recent protocols such as Behrens et al. (2021) or McGlincy and Ingolia (2017). Nevertheless, a strength of this study is that the authors carefully test different conditions to optimize their protocol using a set of well-designed controls.

      The conclusions of the manuscript appear to be well supported by the data presented. However, there are a few points that need to be clarified.

      We appreciate the acknowledgement of the strength of our aminoacylation controls and agree that our method relies on many aspects of the mentioned prior work.

      (1) One point that remains unsatisfactory is a better benchmarking against the state of the art. It is currently impossible to estimate how much the results of this new protocol differ from alternative methods and in particular from Behrens et al. (2021). Here it will be helpful to perform experiments with samples similar to those used in the mim-tRNAseq study and not with H1299 cells.

      We fully agree that more rigorous benchmarking would be desirable. As also noted in the response to reviewer #1, a full end-to-end comparison of methods would be ideal but would be onerous and expensive in practice, so we focused on optimizing the steps we found to be most prone to introducing bias in the data.

      We agree that Behrens et al. (2021) has substantial methodological overlap with our work and was instrumental in our efforts; however, the focus of their manuscript was largely on quantification of tRNA abundance and modifications, rather than the tRNA charge. In fact, tRNA charge was only determined for yeast in that study. Quantifying the abundance of short RNAs using NGS is very difficult (Fuchs et al. (2012) [PMID: 25942392]) and will likely require the use of a mixture of tRNAs as spike-in references for normalization (Bissels et al. (2009) [PMID: 19861428]). In the case of Behrens et al. (2021), they did not use a spike-in tRNA reference, but instead correlated gene copy number with their measured tRNA abundance. They also compare to Northern blotting for two tRNA transcripts, showing a directionally similar result; however, no quantitative claims can be made about measurement accuracy. Until a good method of normalizing tRNA quantification is found, we believe that sample-to-sample reproducibility (i.e. precision) is the most useful objective to optimize because this will allow detection of differential expression. Towards that end, we quantified the precision of our method (Figure 4 and its two supplementary figures) with associated statistics, which can be used to estimate the number of samples required to detect significance during differential expression analysis. For tRNA charge, quantification is easier, which is why we present statistics on both accuracy and precision. In this case we can better compare results across methods, and so we have added a comparison of our results to the charge quantification from Evans et al. (2017) (Figure 2—figure supplement 9).

      (2) While the protocol aims to implement an improved method for quantification of tRNA aminoacylation, it can also be used for tRNA quantification and analysis of tRNA modifications. It will increase the impact of this study if the authors benchmark the outcomes of their protocol with other tRNA sequencing protocols with samples similar to these papers, which will be important for certain research teams that are unlikely to implement two different tRNA sequencing methods. Are there any possible adaptations that would allow the analysis of tRNA fragments?

      The first part of this comment, regarding comparison of methods, is addressed in the response to the prior reviewer comment and in the response to reviewer 1. In the specific case of tRNA modifications, the issue is similar to abundance quantification in that a “true” reference of modified tRNA is likely necessary for proper quantification, alongside testing of each method simultaneously.

      Regarding tRNA fragments, our method is not suitable for this use case. This is because our adapter ligation step depends on an intact tRNA structure with either a CCA or CC overhang on the 3’-end, and thus we almost exclusively get reads with CCA/CC ends and no reads from fragments. This specificity is good for increasing charge quantification accuracy but limits the method's versatility. For a more versatile method we recommend Watkins et al. (2022) [PMID: 35513407].

      (3) Like Behrens et al. (2021), Davidsen and Sullivan use TGIRT-III RT for their analyses. The enzyme is not currently available in a form suitable for tRNA-seq. It would be very helpful to test different new RT enzymes that are commercially available. The example of Maxima RT - Figure 2 Supp 6 - shows significantly lower performance than the presented TGIRT-III RT data. In lines 296-298, the authors mention improvements to the protocol by using ornithine. Why are these improvements not included?

      We share similar concerns that the TGIRT-III enzyme is no longer commercially available. It became unavailable while we were preparing this manuscript, reflected by the fact that almost all our figures are made using this enzyme. Others have discovered this too and Lucas et al. (2023) [PMID: 37024678] tested several RT polymerases using TapeStation as a readout for readthrough. As they reported that Maxima has good performance, we decided to test it on a full run with replicates. The results are outlined in Figure 2—figure supplement 6 and for resubmission we have added a table to the appendix that compares the alignment statistics. Unfortunately, the readthrough of the Maxima polymerase on cytoplasmic tRNAs is not as high as for TGIRT-III; however, interestingly it seems to have better performance for mitochondrial tRNAs (Figure 2 – Figure Supplement 6). Regardless, in the initial paper submission we failed to evaluate whether this readthrough difference affected charge measurements. We have now fixed this by adding Figure 2—figure supplement 7, which shows that there are no differences in charge measurements between TGIRT-III and Maxima. Not surprisingly, there are substantial differences between polymerases when looking at relative tRNA abundance (which affirms the discussion above related to the difficulty of tRNA abundance quantification); however, the high sample-to-sample reproducibility remains intact with either polymerase. An exhaustive search for better polymerases is warranted but falls outside the scope of our work.

      Regarding the improvement we suggested, using ornithine as a cleavage catalyst instead of lysine, we only learned about this possibility later and thus only want to make readers aware that other options exist. We have revised the paragraph to make this clearer.

      (4) A technical concern: The samples are purified multiple times using a specific RNA purification kit. Did the authors test different methods to purify the RNA and does this influence the result of the method?

      In the past, we have relied exclusively on alcohol precipitation but during the development of this protocol we found it easier and more reproducible to use column-based purification when possible. However, as we have not made a direct comparison this remains anecdotal evidence. Nonetheless, to minimize any possible bias of column-based purification you will notice that we use columns with binding capacity 5x higher than the highest amount of RNA/DNA added to the column.

      (5) The study would benefit from an explicit step-by-step protocol, including the choice of adapters that are shown to work best in the protocol.

      This is a great point! We have included tables with all the oligos used (Supplementary file 1), a detailed step-by-step protocol with pictures of anticipated gel results (Supplementary file 2) and an overview of the RNA/DNA manipulations to make it clear where adapter sequences are located (Supplementary file 3). For the data processing we provide a comprehensive example in the Github repository. All this was included in our first submission of this manuscript (as well as on bioRxiv), but we suspect this was not readily accessible to the reviewers. We will make sure that these documents are available through eLife and have emphasized their existence in the main text of the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      To stratify this improvement a comparison to the most common methods should be made. For example, how do the results with the improved protocol with i-tRAP (PMID 36283829), tRNA microarrays (PMID: 31263264), or with the approaches the improved protocol shares with some other tRNA-seq approaches (DM-tRNA-seq, mim-tRNA-seq)?

      Once again, we thank the reviewer for the good recommendations. The points about direct comparisons were discussed above.

      Reviewer #2 (Recommendations For The Authors):

      These are all great points; we address them below.

      Minor points:

      - Please use chemical conventions, e.g. for mcm5s2U and NaIO4 with superscript or subscript.

      Fixed.

      - Figure 2F: Glu GAA is only 82% charged; can this be due to mcm5s2U (Figure 3 supp 2) leading to a misalignment? What happens to Ser-NNN? Why is mitochondrial tRNA so much less charged?

      Regarding the Glu-GAA charge at baseline, we do not think this is an artifact of the mcm5s2U modification as it would then also be expected for Gln-CAA and Lys-AAA. The same occurs in the charge data in Evans et al. (2017) and they use a very different alignment strategy. Lastly, the charge titration and half-life experiments show no evidence of inaccuracy/bias for Glu-GAA.

      But the question remains – why is the charge of Glu-GAA so low? At this point our best guess is speculative. It may have something to do with the strong enrichment of Glu-GAA codons in the A site found by ribosome profiling on mouse embryonic stem cells (Ingolia et al. (2011) [PMID: 22056041]).

      - Spell out "clvg" or "dphs" in the figure legend of Figure 2 and others. Similar for other abbreviations in figures. They are not always explained in the legends.

      Fixed.

      - Figure 3 supp 2: Please use U instead of T in the anticodons. The labels are a bit confusing. Please clearly align to the tick (also for Figure 3C).

      Fixed.

      - Line 220-223. Which RT enzyme was used for Figure 3 supp 2? Does it make a difference?

      TGIRT-III was used. Only Figure 2—figure supplement 6 and Figure 2—figure supplement 7 (added for resubmission) show data with the Maxima polymerase. To address the second part of the question we have added a comparison between TGIRT-III and Maxima for mcm5s2U modification detection (Figure 3—figure supplement 3). Interestingly, there is a polymerase specific signature for mcm5s2U modifications; however, more work would be required to determine which polymerase is best suited for detection of this and other modifications.

      - Figure 4 supp 1 and Figure 4 supp 2 change order.

      Fixed.

      Typos:

      - Figure 1 and Figure 1-figure supplement 1: In the periodate the "-" is in a small box (at least in my PDF viewer). Can this box be removed?

      - Line 175: duplicated verb.

      - Line 348: "moved".

      Thanks for catching these. They have now been fixed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) Measurement of secreted amylase could be seen as direct evidence of sweating, however, how to determine the causal relationship between climbing behavior and sweating? Friction force may also be reduced when there is too much fingertip moisture.

      As the reviewer notes, measurement of secreted amylase can provide direct evidence of sweating, and we performed an iodine and starch reaction. Upon observing the involvement of TRPV4 in mouse foot pad perspiration, we then considered which type of behavioral analysis would be suitable to evaluate this perspiration. We agree with the reviewer’s point that friction force in the climbing test may be reduced by excessive sweating. However, we did not observe severe sweating in the absence of acetylcholine treatment. Accordingly, we interpreted that the increase in the climbing test failure rate for TRPV4KO mice could reflect the reduced friction force associated with the lack of TRPV4 activity.

      (2) For the human skin immunostaining, did the author use the same TRPV4 antibody as used in the mouse staining? Did they validate the specificity of the antibody for the human TRPV4 channel? 

      We used different antibodies for human and mouse samples. Since commercially available anti-TRPV4 antibodies do not work well with mouse samples, we generated our own anti-TRPV4 antibody and validated its specificity.

      (3) In lines 116-117, the authors tried to determine "the functional interaction of TRPV4 and ANO1 is involved in temperature-dependent sweating", however, they only used the TRPV4 ko mice and did not show any evidence supporting the relationship between TRPV4 and ANO1. 

      As the reviewer pointed out, based on the data presented in the original submission we cannot conclude that an interaction between TRPV4 and ANO1 is involved in perspiration. However, we think that the data for TRPV4KO mice presented in Figure 3 of the original version does indicate that TRPV4 is involved in perspiration. The finding that menthol and its related compounds, which inhibit the function of both TRPV4 and ANO1 (see our publication in Scientific Reports 7: 43132, 2017), blocked perspiration in both wild-type and TRPV4KO mice (original Figure 3C, D) indicates involvement of either TRPV4 or ANO1 in perspiration. In the revised version, we present results for additional iodine and starch reaction experiments using Ani9, a potent and specific ANO1 inhibitor. Ani9 drastically inhibited perspiration from mouse foot pads both at 25 °C and 35 °C. Based on these collective results, we concluded that both TRPV4 and ANO1, likely acting as a complex, are involved in perspiration. We present the new data with Ani9 in the revised Figure 3E, F.

      (4) Figure 3-4 is quite confusing. At 25˚C, no sweating difference was observed between TRPV4 and wt mice (Fig 3A-3D), suggesting both Ach-induced sweating and basal sweating are TRPV4-independent at 25˚C, however, the climbing test was done at 26-27 ˚C and the data showed a climbing deficit in TRPV4 ko mice. How to interpret the data is unclear. 

      Thank you for raising this point. In the iodine and starch reaction experiment, we observed no significant reduction in perspiration in the absence of acetylcholine at 25 °C, which is the same condition as in the climbing test, whereas we detected less perspiration for TRPV4KO mice. In a trial using additional mice, we detected significantly less perspiration under control conditions without acetylcholine at 25 °C, which is consistent with the results of the climbing test. We have added this new data to the revised Figure 3A, B.

      (5) Were there any gender differences associated with sweating in mice? In Figure 3, the mouse number for behavior tests should be at least 5. 

      The TRPV4KO mice reproduced poorly and we were unable to obtain sufficient numbers of male and female mice to determine whether there were gender differences in sweating. However, according to the reviewer’s suggestion, and as mentioned above, we increased the number of experiments to obtain the results shown in the revised Figure 3. We did not observe a significant difference in sweating with the larger sample size, which supports our conclusions.

      (6) 8- to 21-week-old mice were used in the immunostaining, the time span is too long. 

      Given the difficulty in obtaining sufficient numbers of TRPV4KO mice, we used a somewhat wider age distribution to obtain samples for immunostaining. However, we did not observe age-dependent differences in immunostaining. We reference this point in the revised manuscript.

      (7) The authors used homozygous TRPV4 ko mice for all experiments. What are control mice? Are they littermates of the TRPV4 ko mice? 

      We did not use littermates for our in vivo experiments because the TRPV4KO mice reproduced poorly and the litter sizes were small. However, we did backcross the KO mice to the commercially available wild-type mice more than ten times. As such, we expect that the wild-type and TRPV4KO mice will have similar genetic backgrounds. In addition, we have published multiple studies that have successfully used this method, which we think supports the reliability of our results for experiments involving mice.

      Reviewer #2 (Public Review):

      (1) The coexpression data needs additional controls. In the TRPV4 KO mice, there appears to be staining with the TRPV4 Ab in TRPV4 KO mice below the epidermis. This pattern appears similar to that of the location of the secretory coils of the sweat glands (Fig 1A). Is the co-staining the authors note later in Figure 1 also seen in TRPV4 KOs? This control should be shown, since the KO staining is not convincing that the Ab doesn't have off-target binding. 

      We thank the reviewer for raising these concerns about immunostaining. As the reviewer notes, in the low power image the signals appeared to be weak and punctate signals were present in the basal region of glandular cells. Although we did not identify immunohistochemical conditions that produced no signal, tissue sections from WT mice stained with anti-TRPV4 antibody showed conspicuous apical signals for the glandular cells facing the lumen. Meanwhile, TRPV4KO tissues showed no signals at the apical region of the glandular cells, where the TRPV4-ANO1 interaction is expected to occur. We confirmed by immunoblotting that the TRPV4KO tissues showed no trace signals.

      (2) Are there any other markers besides CGRP for dark cells in mice to support the conclusion that mouse secretory cells have clear cell and dark cell properties? 

      We did not stain with other dark cell markers. Based on previous studies describing the differences between clear and dark cells in mouse eccrine glands, we think that dark and clear cells cannot be clearly discriminated, as we described in lines 93-96 of the Results. We identified secretory cells using CK8 and dark cells with CGRP, a marker of dark cells in human eccrine glands (Zancanaro et al. 1999 J Anat). Our result showed that CGRP immunostaining could not discriminate between clear and dark cells, which is consistent with a previous report showing that mouse secretory cells were assumed to be undifferentiated and primitive based on electron microscopic observation (Kurosumi et al. 1970 Arch Histol Jap).

      (3) The authors utilize menthol (as a cooling stimulus) in several experiments. In the discussion, they interpret the effect of menthol as potentially disrupting TRPV4-ANO1 interactions independent of TRPM8. Yet, the role of TRPM8, such as in TRPM8 KO mice, is not evaluated in this study.

      We performed the iodine and starch reaction experiments with TRPM8KO mice. In the TRPM8KO mice, the sweat spots did not differ from those seen for WT mice (p=0.63, t-test), and there was also a significant reduction in sweating with menthol treatment following acetylcholine stimulation that was similar to that seen for WT mice. These results would rule out the involvement of TRPM8 in a menthol-induced reduction in sweating. We have included this data in the revised Figure 3D.

      (4) Along those lines, the authors suggest that menthol inhibits eccrine function, which might lead to a cooling sensation. But isn't the cooling sensation of sweating from evaporative cooling? In which case, inhibiting eccrine function may actually impair cooling sensations.

      Menthol has a non-specific effect that activates TRPM8, TRPV3 and TRPA1, and inhibits TRPV1, TRPV4 and ANO1. Therefore, we did not carry out a climbing test with menthol in part because menthol-dependent TRPA1 activation decreased the propensity of the mice to climb. As the reviewer notes, TRPM8 activation following topical application of menthol may cause a cooling sensation elicited in sensory neurons beneath the skin. However, the comfortable cooling sensation could also be caused in part by decreased sweating. The relationship between a comfortable cooling sensation and less perspiration following menthol application may be difficult to determine, and we have mentioned this in the updated Discussion.

      (5) The climbing assay is interesting and compelling. The authors note performing this under certain temperature and humidity conditions. Presumably, there is an optimal level of skin moisture, where skin that is too dry has less traction, but skin that is too wet may also have less traction. It would bolster this section of the study to perform this assay under hot conditions (perhaps TRPV4 KO mice, with impaired perspiration, would outperform WT mice with too much sweating?), or with pharmacologic intervention using TRPV4 agonists or antagonists to more rigorously evaluate whether this model correlates to TRPV4 function in the setting of different levels of perspiration.

      We thank the reviewer for this suggestion. Upon detecting the involvement of TRPV4/ANO1 interaction in perspiration, we considered different behavioral analyses that can be performed to demonstrate whether the TRPV4/ANO1 interactions are involved in perspiration. As the reviewer suggested, there should be an optimal level of sweating. Therefore, we first set the room temperature at 26-27 ˚C and humidity at 35-50%. To our knowledge, this is the first demonstration of temperature-dependent sweating of mouse foot pads. In humans, palm sweating is often referred to as psychogenic sweating, which is known to be regulated by sympathetic nerve activity. Here we tested whether foot pad sweating might be related to friction force, wherein sufficient amounts of sweating could increase the friction force and in turn increase the success rate for the climbing test. The test used a vinyl-covered slippery slope that was selected based on several trials to determine the optimal surface material and slope angles. As the reviewer suggests, the success rates could be affected by multiple factors, and hot temperatures likely induce more sweating that could increase the success rates in the climbing test. We will need to carry out additional experiments that are beyond the scope of this study to examine these temperature-dependent effects. Generally, sweating is regulated by sympathetic nerve activity that occurs in response to increased brain neuron excitation. However, here we raise for the first time the possibility that sweating might be regulated by local temperature sensation mediated through TRPV4 that may be effective for fine-tuning of perspiration activity. We have updated the Discussion to reference this possibility.

      (6) There are other studies (PMID 33085914, PMID 31216445) that have examined the role of TRPV4 in regulating perspiration. The presence of TRPV4 in eccrine glands is not a novel finding. Moreover, these studies noted that TRPV4 was not critical in regulating sweating in human subjects. These prior studies are in contradiction to the mouse data and the correlation to human anhidrotic skin in the present study. Neither of these studies is cited or discussed by the authors, but they should be. 

      We thank the reviewer for referencing these other studies concerning the possible involvement of TRPV4 in perspiration in humans. These studies focused on the vasodilating effects of TRPV4 and drew the conclusion that TRPV4 is not involved in sweating in humans, which is in contrast to our data for mice and humans. Multiple factors could explain the apparent difference between the two studies. For example, the parameters they examined differed from ours in that we assessed patients with AIGA, whereas the previous studies involved healthy volunteers. We have updated the Discussion to note the difference in the results of our and previous studies.   

      Reviewer #3 (Public Review):

      (1) Figure 2: The calcium imaging-based approach shows average traces from 6 cells per genotype, but it was unclear if all acinar cells tested with this technique demonstrated TRPV4-mediated calcium influx, or if only a subset was presented.

      “n = 6” does not indicate the number of cells, but rather 6 independent experiments that each had over 20 ROIs of sweat glands. We have clarified this point in the updated figure legend.

      (2) Figure 4: The climbing behavioral test shows a significant reduction in climbing success rate in TRPV4-deficient mice. The authors ascribe this to a lack of hind paw 'traction' due to deficiencies in hind paw perspiration, but important controls and evidence that could rule out other potential confounds were not provided or cited. 

      As noted in our response to Comment 5 made by Reviewer #2, we spent considerable time identifying optimal conditions that would delineate success rates in the climbing experiments. We are confident that TRPV4KO mice had significantly lower success rates than WT mice, but there are various factors that could affect the experimental outcomes. We reference these factors in the updated Discussion.

      (3) In general, the results support the authors' claims that TRPV4 activity is a necessary component of sweat gland secretion, which may have important implications for controlling perspiration as well as secretion from other glands where TRPV4 may be expressed. 

      As described above, the results we obtained in the climbing test can be affected by various factors. However, based on the consistency of the results obtained for the climbing test and the iodine and starch reaction assay, we think that our interpretation is correct. In terms of the involvement of TRPV4/ANO1 interactions in fluid secretion, we previously reported that the TRPV4/ANO1 complex is involved in cerebrospinal fluid secretion in the mouse choroid plexus (FASEB J. 2014) and in saliva and tear secretion in mouse salivary and lacrimal glands (FASEB J. 2018). Together, these findings suggest that this mechanism is common to water efflux from exocrine glands.

      Reviewer #1 (Recommendations For The Authors):

      (1) An exocrine gland-specific trpv4 knockout mouse should be used, as TRPV4 is also expressed by muscles, global knockout TRPV4 may affect the TRPV4-dependent muscle strength and reduce the climbing ability in mice. 

      As the reviewer suggests, use of mice with TRPV4 knockout specific to exocrine glands would be preferable to mice having global TRPV4 knockout given that TRPV4 is expressed in multiple tissues. We agree with this suggestion, but we do not currently have such mice in hand. However, as mentioned above, we have reported the involvement of the TRPV4/ANO1 interaction in cerebrospinal fluid secretion from the choroid plexus in mice (FASEB J. 28: 2238-2248, 2014), as well as saliva and tear secretion in mouse salivary and lacrimal glands (FASEB J. 32: 1841-1854, 2018), suggesting that the TRPV4/ANO1 interaction could be widely involved in exocrine gland functions that involve water movement. We have updated the Discussion to reference this point.

      (2) The authors showed Calcium imaging data that Menthol inhibits TRPV4-dependent calcium influx. However, it is well known that menthol induces the sensation of cooling by activating TRPM8. More evidence, including patch clamp recordings, should be done to verify the inhibition effects of menthol on TRPV4 and ANO1. Moreover, Fig 3E-3F could only suggest that menthol-induced cooling sensation may affect sweating but not the inhibition effect of menthol on TRPV4 and ANO1 channels. 

      We agree that more evidence including patch-clamp recordings can verify the inhibitory effects of menthol on TRPV4 and ANO1. We did not include such experiments here since we previously showed that menthol and related agents indeed inhibit TRPV4- and ANO1-mediated currents (Sci. Rep. 7: 43132, 2017). We now cite this paper in the revised version.

      (3) Excepting the climbing test, are there any other better models to assess the sweating-related behaviors? 

      When we detected the involvement of TRPV4/ANO1 interactions in perspiration, we considered different types of behavioral analyses that could be used to demonstrate TRPV4/ANO1-dependent perspiration. We think that the climbing experiment is the best test, particularly since foot pads are one of the few regions on mice that is not covered by fur and thus amenable to evaluation of perspiration using an iodine and starch test.

      Reviewer #2 (Recommendations For The Authors):

      (1) I was confused by a section in the introduction on lines 59-60: How does Cl- efflux lead to the formation of a physical complex in cells with high intracellular Cl-? What is the physical complex? This seems like several disparate concepts combined together, which need to be clarified.

      We apologize for the incomplete descriptions of several of our previous works. We have amended the Introduction section in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      (1) TRPV4 is expressed by multiple other cell types in the skin (keratinocytes, macrophages etc.) which may have an impact on peripheral sensory function. Is there evidence that TRPV4-deficient animals have relatively normal sensory acuity and/or proprioception? Such evidence would lend more credibility to the reported findings in the climbing test. 

      As the reviewer points out, TRPV4 is expressed by multiple other cell types in the skin. To date we have found that TRPV4KO mice show no differences in sensory functions compared to WT mice. Whether TRPV4 is involved in proprioception is unclear, based on both our own observations and those that appear in the literature, although TRPV4 is clearly activated by mechanical stimuli. We previously compared the mechanical sensitivity of TRPV4 and Piezo1 in bladder epithelial cells, and found that Piezo1 shows much higher sensitivity relative to TRPV4 (J. Biol. Chem. 289: 16565-16575, 2014), which is consistent with the involvement of Piezo1, rather than TRPV4, in proprioception. Although TRPV4 is reported to be expressed in sensory neurons, we did not detect TRPV4-mediated responses in isolated rat and mouse DRG neurons, suggesting that TRPV4-positive sensory neurons are relatively rare.

      (2) The methods section refers to loading entire sweat glands with Fura-2 dye for calcium imaging, but the figure legend refers to sweat gland acinar cells. Resolving this ambiguity would help readers to interpret the data. 

      We apologize for this error and have made an appropriate correction in the revised manuscript.

      (3) Alternatively, could acute intraplantar injection of a TRPV4 antagonist (e.g. GSK205) in wild-type mice phenocopy the TRPV4-knockout mouse deficits, or could normal climbing behavior be restored in the TRPV4 knockout by adding artificial perspiration to their hindpaws?

      We thank the reviewer for raising this interesting possibility and suggesting use of TRPV4 agonists or antagonists in the climbing tests. We agree that results of such an experiment would support the involvement of TRPV4 in sweating. We tried to do such experiments using injection of TRPV4 regulators into mouse hindpaws. However, the injections themselves appeared to impact climbing ability, perhaps in part due to painful sensations associated with the injection. Similarly, menthol injection appeared to reduce climbing activity, likely through pain sensations associated with TRPA1 activation. As such, we did not pursue these experiments.

    1. Author response:

      Reviewer 1:

      A limit of the paper is that the biological mechanisms by which intracellular mechanics is modulated (e.g. among cell types) remains unexplored and only briefly discussed. Yet this limit is greatly offset by the rigor of the approach.

      We thank the reviewer for the valuable feedback. The question regarding the biological mechanisms responsible for the different mechanical properties is, indeed, a highly important and interesting issue. In line with the reviewer, we consider this so important that it requires an extra, dedicated research focus, which is far beyond the scope of this article. By introducing the concept of the mechanical fingerprint, we provide in this work the framework to systematically investigate biological mechanisms but also the functional relevance of the intracellular mechanical properties in future studies. In the revised manuscript, we’ll elaborate on the discussion.

      Reviewer 2:

      The most difficult part of the method is the part with actin polymerization inhibition with cytochalasin B. The data shows that viscoelastic parameters as well as active energy parameters are unaffected by cytochalasin B. It is reasonable to expect that elasticity will reduce and fluidity will increase upon application of such a drug. The stiffness-reducing effect was observed only when CB was used with nocodazole most likely because of phagocytosis of the bead, which is governed by microtubule. The use of other actin-depolymerizing drugs such as latrunculin A would be needed to test actin’s role in mechanical fingerprints. If actin’s role is only explained by accompanying microtubule inhibition, it is not a convenient system to directly test the mechano-adaptation process.

      We thank the reviewer for the time and the instructive feedback. Our finding that actin depolymerization has no effect on the intracellular mechanics may appear unfamiliar, as many rheological studies performed on the cell’s cortex highlight the importance of actin on the mechanical properties of the whole cell. However, as the actin network is reported to be very sparse away from the cortex, it is not impossible that the mechanical properties may be dominated by other structures in the cytoplasm. Indeed, our findings are consistent with other studies that see no strong effect of actin depolymerization on the interphase intracellular mechanics (e.g. https://doi.org/10.1016/j.bpj.2023.04.011 or https://doi.org/10.1038/s41567-021-01368-z). Still, we fully agree with the reviewers that this is an important point. In a revised version we aim to investigate the effect of other actin-depolymerizing drugs and will try to perform immunostaining to visualize and further illuminate the potential compensation mechanism between actin and MT.

      Depolymerization of MT with nocodazole did not reduce the solid-like property A. Adding discussion and comparison with other papers in the literature using nocodazole will be helpful in understanding why.

      Again, we agree with the reviewer and propose to further study this point by performing additional immunostainings and by elaborating on the discussion, also including the results of other studies.

      Reviewer 3:

      The importance of the mechanical fingerprint is diluted due to some missing controls needed for biological relevance.

      We thank the reviewer for his valuable time and feedback. This comment is in line with the point already raised by reviewer 1 and highlights the important question of how the intracellular mechanical properties are related to the actual cell function. We fully agree with the reviewers that at this point we can only report on differences, but cannot claim a biological function that depends on the fingerprint. Although we think the alignment between function and the mechanical fingerprints allows the hypothesis that the biological system is tuning its mechanical properties for a specific function, we do not want to make any claim in this direction at the current state of our research. Hence, to answer these intriguing questions, carefully designed control experiments are required, as pointed out by the reviewer. However, this direction is beyond the scope of this manuscript. Here, we establish the tools we’ll use in future studies to address these highly relevant questions. Therefore, we propose to discuss these important future directions in a revised manuscript.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Kroll et al. conduct an in-depth behavioral analysis of F0 knockouts of 4 genes associated with late-onset Alzheimer's Disease (AD), together with 3 genes associated with early-onset AD. Kroll and colleagues developed a web application (ZOLTAR) to compare sleep-associated traits between genetic mutants with those obtained from a panel of small molecules to promote the identification of affected pathways and potential therapeutic interventions. The authors make a set of potentially important findings vis-à-vis the relationship between AD-associated genes and sleep. First, they find that loss-of-function in late-onset AD genes universally results in nighttime sleep loss, consistent with the well-supported hypothesis that sleep disruption contributes to Alzheimer's-related pathologies. psen-1, an early-onset associated AD gene, which the authors find is principally responsible for the generation of AB40 and AB42 in zebrafish, also shows a slight increase in activity at night and slight decreases in nighttime sleep. Conversely, psen-2 mutations increase daytime sleep, while appa/appb mutations have no impact on sleep. Finally, using ZOLTAR, the authors identify serotonin receptor activity as potentially disrupted in sorl1 mutants, while betamethasone is identified as a potential therapeutic to promote reversal of psen2 knockout-associated phenotypes.

      This is a highly innovative and thorough study, yet a handful of key questions remain. First, are nighttime sleep loss phenotypes observed in all knockouts for late-onset AD genes in the larval zebrafish a valid proxy for AD risk?

      We cannot say, but it is an interesting question. We selected the four late-onset Alzheimer’s risk genes (APOE, CD2AP, CLU, SORL1) based on human genetics data and brain expression in zebrafish larvae, not based on their likelihood to modify sleep behaviour, which we could have tried by searching for overlaps with GWAS of sleep phenotypes, for example. Consequently, we find it remarkable that all four of these genes caused a night-time sleep phenotype when mutated. We also find it reassuring that knockout of appa/appb and psen2 did not cause a night-time sleep phenotype, which largely excludes the possibility that the phenotype is a technical artefact (e.g. caused by the F0 knockout method) or a property of every gene expressed in the larval brain.

      Having said that, it could still be a coincidence, rather than a special property of genes associated with late-onset AD. In addition to testing additional late-onset Alzheimer’s risk genes, the ideal way to answer this question would be to test in parallel a random set of genes expressed in the brain at this stage of development. From this random set, one could estimate the proportion of genes that cause a night-time sleep phenotype when mutated. One could then use that information to test whether late-onset Alzheimer’s risk genes are indeed enriched for genes that cause a night-time sleep phenotype when mutated.
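
      The enrichment test described above can be sketched as a one-sided Fisher's exact test. This is only an illustration: the function name and the random-set counts (10 of 50 genes with a night-time sleep phenotype) are hypothetical placeholders, not results; only the 4 of 4 late-onset risk genes comes from the present study.

```python
from math import comb


def enrichment_p_value(hits_risk, n_risk, hits_random, n_random):
    """One-sided Fisher's exact test (hypergeometric tail).

    Probability of observing at least `hits_risk` phenotype-positive genes
    among the `n_risk` risk genes if risk genes were no more likely than
    randomly chosen brain-expressed genes to cause the phenotype.
    """
    total = n_risk + n_random
    total_hits = hits_risk + hits_random
    denominator = comb(total, n_risk)
    p = 0.0
    for k in range(hits_risk, min(n_risk, total_hits) + 1):
        # k positive genes fall in the risk set, the rest of the risk set is negative
        p += comb(total_hits, k) * comb(total - total_hits, n_risk - k) / denominator
    return p


# 4 of 4 late-onset risk genes gave a night-time sleep phenotype (this study).
# 10 of 50 random genes is a HYPOTHETICAL background, used only for illustration.
p_value = enrichment_p_value(hits_risk=4, n_risk=4, hits_random=10, n_random=50)
print(f"one-sided p = {p_value:.4f}")
```

      With only four risk genes, the outcome of such a test would depend strongly on the background rate measured in the random set, which is why testing additional risk genes would also help.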

      For those mutants that cause nighttime sleep disturbances, do these phenotypes share a common underlying pathway? e.g. Do 5-HT reuptake inhibitors promote sleep across all 4 late-onset genes in addition to psen1? Can 5-HT reuptake inhibitors reverse other AD-related pathologies in zebrafish? Can compounds be identified that have a common behavioral fingerprint across all or multiple AD risk genes? Do these modify sleep phenotypes?

      To attempt to answer these questions, we used ZOLTAR to generate predictions for all the knockout behavioural fingerprints presented in the study, in the same way as for sorl1 in Fig. 5 and Fig. 5–suppl. 1. Here are the indications, targets, and KEGG pathways which are shared by the largest number of knockouts:

      – Four indications are shared by 4/7 knockouts: “mydriasis” (dilated pupils, significant for psen1, apoea/apoeb, cd2ap, clu); “fragile X syndrome” (psen1, apoea/apoeb, cd2ap, sorl1), “insomnia” (psen2, apoea/apoeb, cd2ap, sorl1); “malignant essential hypertension” (appa/appb, psen1, apoea/apoeb, cd2ap).

      – Two targets are shared by 5/7 knockouts: “glycogen synthase kinase-3 alpha” (psen1, apoea/apoeb, cd2ap, clu, sorl1) and “neuronal acetylcholine receptor beta-2” (appa/appb, psen1, apoea/apoeb, cd2ap, clu).

      – Two KEGG pathways are shared by 5/7 knockouts: “cholinergic synapse” (psen1, apoea/apoeb, cd2ap, clu, sorl1) and “nitrogen metabolism” (appa/appb, psen1, psen2, cd2ap, clu).
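
      The tally behind statements such as “shared by 4/7 knockouts” can be sketched as follows. The dictionary is filled in only with the four indications listed above; the full ZOLTAR output contains many more annotations per knockout, and the function name is hypothetical.

```python
from collections import Counter

# Significant indications per knockout, restricted to the four listed above.
significant_indications = {
    "appa/appb": {"malignant essential hypertension"},
    "psen1": {"mydriasis", "fragile X syndrome", "malignant essential hypertension"},
    "psen2": {"insomnia"},
    "apoea/apoeb": {"mydriasis", "fragile X syndrome", "insomnia",
                    "malignant essential hypertension"},
    "cd2ap": {"mydriasis", "fragile X syndrome", "insomnia",
              "malignant essential hypertension"},
    "clu": {"mydriasis"},
    "sorl1": {"fragile X syndrome", "insomnia"},
}


def count_shared(annotations_by_knockout):
    """Count, for each annotation, how many knockouts have it as significant."""
    counts = Counter()
    for annotations in annotations_by_knockout.values():
        counts.update(annotations)
    return counts


shared = count_shared(significant_indications)
n_knockouts = len(significant_indications)
for indication, n in shared.most_common():
    print(f"{indication}: {n}/{n_knockouts} knockouts")
```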

      As a reminder, we hypothesised that loss of Sorl1 affected serotonin signalling based on the following annotations being significant: indication “depression”, target “serotonin transporter”, and KEGG pathway “serotonergic synapse”. All three are also significant for psen2 knockouts, but for no other knockout. ZOLTAR therefore does not predict serotonin signalling to be a major theme common to all mutants with a night-time sleep loss phenotype.

      While perhaps not surprising, we find it reassuring that insomnia appears in the indications shared by the largest number of knockouts. apoea/apoeb, cd2ap, and sorl1 also happen to be the knockouts with the largest loss in night-time sleep.

      Particularly interesting is cholinergic signalling appearing in the most common targets and KEGG pathways. Acetylcholine signalling is a major theme in research on Alzheimer’s disease. For example, the first four drugs ever approved by the FDA to treat Alzheimer’s disease were acetylcholinesterase inhibitors, which increase acetylcholine signalling by preventing its breakdown by acetylcholinesterase. These drugs are generally considered only to treat symptoms and not modify disease course, but this view has been called into question (Munoz-Torrero, 2008; Relkin, 2007). If, as ZOLTAR suggests, mutations in several Alzheimer’s risk genes affect cholinergic signalling early in development, this would point to a potential causal role of cholinergic disruption in Alzheimer’s disease.

      We see that literature also exists on the involvement of glycogen synthase kinase-3 in AD (Lauretti et al., 2020). We plan to explore further these predictions in a future study.

      Finally, the web-based platform presented could be expanded to facilitate comparison of other behavioral phenotypes, including stimulus-evoked behaviors.

      Yes, absolutely. The behavioural dataset we used (Rihel et al., 2010) did not measure other stimuli than day/night light transitions, but the “SauronX” platform and dataset (Myers-Turnbull et al., 2022) seems particularly well suited for this. To provide some context, we and collaborators have occasionally used the dataset by Rihel et al. (2010) to generate hypotheses or find candidate drugs that reverse a behavioural phenotype measured in the sleep/wake assay (Ashlin et al., 2018; Hoffman et al., 2016). The present work was the occasion to enable a wider and more intuitive use of this dataset through the ZOLTAR app, which has already proven successful. Future versions of ZOLTAR will seek to incorporate larger drug datasets using more types of measurements.

      Finally, the authors propose but do not test the hypothesis that sorl1 might regulate localization/surface expression of 5-HT2 receptors. This could provide exciting / more convincing mechanistic support for the assertion that serotonin signaling is disrupted upon loss of AD-associated genes.

      5-HT receptor type 4a is another candidate as it was shown to interact with sorting nexin 27, a subunit of retromer (Joubert et al., 2004). We see that antibodies against human 5-HT receptor type 2 and 4a exist; whether they would work in zebrafish remains to be tested, and in our experience, the availability of antibodies suitable for immunohistochemistry in the zebrafish is a serious experimental roadblock.

      Despite these important considerations, this study provides a valuable platform for high-throughput analysis of sleep phenotypes and correlation with small-molecule-induced sleep phenotypes.

      Strengths:

      - Provides a useful platform for comparison of sleep phenotypes across genotypes/drug manipulations.

      - Presents convincing evidence that nighttime sleep is disrupted in mutants for multiple late-onset AD-related genes.

      - Provides potential mechanistic insights for how AD-related genes might impact sleep and identifies a few drugs that modify their identified phenotypes

      Weaknesses:

      - Exploration of potential mechanisms for serotonin disruption in sorl1 mutants is limited.

      - The pipeline developed can only be used to examine sleep-related / spontaneous movement phenotypes and stimulus-evoked behaviors are not examined.

      - Comparisons between mutants/exploration of commonly affected pathways are limited.

      Thank you for these excellent suggestions, please see our answers above.

      Reviewer #2 (Public Review):

      Summary:

      This work delineates the larval zebrafish behavioral phenotypes caused by the F0 knockout of several important genes that increase the risk for Alzheimer's disease. Using behavioral pharmacology, comparing the behavioral fingerprint of previously assayed molecules to the newly generated knockout data, compounds were discovered that impacted larval movement in ways that suggest interaction with or recovery of disrupted mechanisms.

      Strengths:

      This is a well-written manuscript that uses newly developed analysis methods to present the findings in a clear, high-quality way. The addition of an extensive behavioral analysis pipeline is of value to the field of zebrafish neuroscience and will be particularly helpful for researchers who prefer the R programming language. Even the behavioral profiling of these AD risk genes, regardless of the pharmacology aspect, is an important contribution. The recovery of most behavioral parameters in the psen2 knockout with betamethasone, predicted by comparing fingerprints, is an exciting demonstration of the approach. The hypotheses generated by this work are important stepping stones to future studies uncovering the molecular basis of the proposed gene-drug interactions and discovering novel therapeutics to treat AD or co-occurring conditions such as sleep disturbance.

      Weaknesses:

      - The overarching concept of the work is that comparing behavioral fingerprints can align genes and molecules with similarly disrupted molecular pathways. While the recovery of the psen2 phenotypes by one molecule with the opposite phenotype is interesting, as are previous studies that show similar behaviorally-based recoveries, the underlying assumption that normalizing the larval movement normalizes the mechanism still lacks substantial support. There are many ways that a reduction in movement bouts could be returned to baseline that are unrelated to the root cause of the genetically driven phenotype. An ideal experiment would be to thoroughly characterize a mutant, such as by identifying a missing population of neurons, and use this approach to find a small molecule that rescues both behavior and the cellular phenotype. If the connection to serotonin in the sorl1 was more complete, for example, the overarching idea would be more compelling.

      Thank you for this cogent criticism.

      On the first point, we were careful not to claim that betamethasone normalises the molecular/cellular mechanism that causes the psen2 behavioural phenotype. Having said that, yes, to a certain extent that would be the hope of the approach. As you say, not every compound that normalises the behavioural fingerprint will normalise the underlying mechanism, but the converse seems true: every compound that normalises the underlying mechanism should also normalise the behavioural fingerprint. We think this logic makes the “behaviour-first” approach innovative and interesting. The logic is to discover compounds that normalise the behavioural phenotype first, and only subsequently test whether they also normalise the molecular mechanism, akin to testing first whether a drug resolves the symptoms before testing whether it actually modifies disease course. While in practice testing thousands of drugs in sufficient sample sizes and replicates on a mutant line is challenging, the dataset queried through ZOLTAR provides a potential shortcut by shortlisting in silico compounds that have the opposite effect on behaviour.

      You mention a “reduction in movement bouts” but note here that the number of behavioural parameters tested is key to our argument. To take the two extremes, say the only behavioural parameter we measured in psen2 knockout larvae was time active during the day, then, yes, any stimulant used at the right concentration could probably normalise the phenotype. In this situation, claiming that the stimulant is likely to also normalise the underlying mechanism, or even that it is a genuine “phenotypic rescue”, would not be convincing. Conversely, say we were measuring thousands of behavioural parameters under various stimuli, such as swimming speed, position in the well, bout usage, tail movements, and eye angles, it seems almost impossible for a compound to rescue most parameters without also normalising the underlying mechanism. The present approach is somewhere in-between: ZOLTAR uses six behavioural parameters for prediction (e.g. Fig 6a), but all 17 parameters calculated by FramebyFrame can be used to assess rescue during a subsequent experiment (Fig. 6c). For both, splitting each parameter in day and night increases the resolution of the approach, which partly answers your criticism. For example, betamethasone rescued the day-time hypoactivity without causing night-time hyperactivity, so we are not making the “straw man argument” explained above of using any broad stimulant to rescue the hypoactivity phenotype.
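
      To make the idea of shortlisting compounds with the opposite behavioural effect concrete, here is a minimal sketch. It assumes each fingerprint is a vector of z-scores and uses cosine similarity as the comparison metric; this is an illustrative choice and not necessarily what ZOLTAR computes. The compound names, parameter values, and function names are all hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two fingerprints of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm


def rank_opposite(knockout, library):
    """Return compound names sorted from most opposite to most similar."""
    return sorted(library, key=lambda name: cosine(knockout, library[name]))


# Hypothetical z-scores for four parameters (e.g. day activity, night activity,
# day sleep, night sleep); real fingerprints have more parameters.
knockout_fingerprint = [-1.0, -0.5, 0.2, 0.0]
compound_library = {
    "compound_A": [1.0, 0.5, -0.2, 0.0],   # mirror image of the knockout
    "compound_B": [-1.0, -0.5, 0.2, 0.0],  # same direction as the knockout
    "compound_C": [0.0, 0.0, 0.0, 1.0],    # unrelated effect
}

ranking = rank_opposite(knockout_fingerprint, compound_library)
print(ranking)  # candidates for rescue come first
```

      The more parameters enter the vector, the harder it becomes for an unrelated compound to score as strongly opposite by chance, which is the point made above about the number of parameters.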

      Furthermore, for diseases where the behavioural defect is the primary concern, such as autism or bipolar disorder, perhaps this behaviour-first approach is all that is needed, and whether or not the compound precisely rescues the underlying mechanism is somewhat secondary. The use of lithium to prevent manic episodes in bipolar disorder is a good example. It was initially tested because mania was thought to be caused by excess uric acid and lithium can dissolve uric acid (Mitchell and Hadzi-Pavlovic, 2000). The theory is now discredited, but lithium continues to be used without a precise understanding of its mode of action. In this example, behavioural rescue alone, with tolerable secondary effects, is sufficient to be beneficial to patients, and whether it modulates the correct causal pathway is secondary.

      On the second point, we agree that testing first ZOLTAR on a mutant for which we have a fairly good understanding of the mechanism causing the behavioural phenotype could have been a productive approach. Note, however, that examples already exist in the literature. First, Hoffman et al. (2016) found that drugs generating behavioural fingerprints that positively correlate with the cntnap2a/cntnap2b double knockout fingerprint are enriched with NMDA and GABA receptor antagonists. In experiments analogous to our citalopram treatment (Fig. 5c,d), cntnap2a/cntnap2b knockout larvae were found to be overly sensitive to the NMDA receptor antagonist MK-801 and the GABAA receptor antagonist pentylenetetrazol (PTZ). Among other drugs tested, zolpidem, a GABAA receptor agonist, caused opposite effects on wild-type and cntnap2a/cntnap2b knockout larvae. Knockout larvae also had fewer GABAergic neurons in the forebrain. Second, Ashlin et al. (2018) found that the fingerprint of pitpnc1a knockout larvae clustered with anti-inflammatory compounds. Flumethasone, an anti-inflammatory corticosteroid, caused a lower increase in activity when added to knockout larvae compared to wild-type larvae. While these studies did not use precisely the same analysis that ZOLTAR runs, they used the same rationale and behavioural dataset to make these predictions (Rihel et al., 2010), which shows that approaches like ZOLTAR can point to causal processes.

      Related to your next point, we may reduce the discussion on sorl1 and serotonin and add some of the present arguments instead, depending on the results from testing a second SSRI (see next point).

      - The behavioral difference between the sorl1 KO and scrambled at the higher dose of the citalopram is based on a small number of animals. The KO Euclidean distance measure is also more spread out than for the other datasets, and it looks like only five or so fish are driving the group difference. It also appears as though the numbers were also from two injection series. While there is nothing obviously wrong with the data, I would feel more comfortable if such a strong statement of a result from a relatively subtle phenotype were backed up by a higher N or a stable line. It is not impossible that the observed difference is an experimental fluke. If something obvious had emerged through the HCR, that would have also supported the conclusions. As it stands, if no more experiments are done to bolster the claim, the confidence in the strength of the link to serotonin should be reduced (possibly putting the entire section in the supplement and modifying the discussion). The discussion section about serotonin and AD is interesting, but I think that it is excessive without additional evidence.

      We mostly agree with this criticism. One could interpret the larger spread of the data for sorl1 larvae treated with 10 µM citalopram as evidence that the knockout larvae do indeed react differently to the drug at this dose. However, the result indeed does not survive removing the top 5 (p = 0.87) or top 3 (p = 0.18) sorl1 larvae.
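
      A robustness check of this kind can be sketched as follows. The data are synthetic and the permutation test is a stand-in chosen for illustration; it is not the statistical test used in the manuscript, and the function names are hypothetical.

```python
import random
from statistics import mean


def drop_top(values, n):
    """Return the values without the n largest entries."""
    return sorted(values)[:len(values) - n]


def permutation_p(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero
    return (hits + 1) / (n_perm + 1)


# SYNTHETIC distances: the knockout group matches controls except for three
# extreme larvae that drive the group difference.
control = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
knockout = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 20.0, 21.0, 22.0]

p_all = permutation_p(knockout, control)
p_trimmed = permutation_p(drop_top(knockout, 3), control)
print(f"all larvae: p = {p_all:.3f}; top 3 removed: p = {p_trimmed:.3f}")
```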

      Given that the HCR did not reveal anything striking, we agree with you that too much of our argument relies on this result being robust. As you and reviewer #3 suggest, we plan on repeating this experiment with a different selective serotonin reuptake inhibitor (SSRI). If the other SSRI also shows a differential effect, this should strengthen the claim that ZOLTAR correctly predicted serotonin signalling as being affected by the loss of Sorl1, even if we did not discover the molecular mechanism.

      - The authors suggest two hypotheses for the behavioral difference between the sorl1 KO and scrambled at the higher dose of the citalopram. While the first is tested, and found to not be supported, the second is not tested at all ("Ruling out the first hypothesis, sorl1 knockouts may react excessively to a given spike in serotonin." and "Second, sorl1 knockouts may be overly sensitive to serotonin itself because post-synaptic neurons have higher levels of serotonin receptors."). Assuming that the finding is robust, there are probably other reasons why the mutants could have a different sensitivity to this molecule. However, if this particular one is going to be mentioned, it is surprising that it was not tested alongside the first hypothesis. This work could proceed without a complete explanation, but additional discussion of the possibilities would be helpful or why the second hypothesis was not tested.

      There are no strong scientific reasons why this hypothesis was not tested. The lead author (F Kroll) moved to a different lab and country so the project was finalised at that time. We do not plan on testing this hypothesis at this stage. However, we will adapt the wording to make it clear this is one possible alternative hypothesis which could be tested in the future, rather than the only alternative.

      - The authors claim that "all four genes produced a fairly consistent phenotype at night". While it is interesting that this result arose in the different lines, the second clutch for some genes did not replicate as well as others. I think the findings are compelling, regardless, but the sometimes missing replicability should be discussed. I wonder if the F0 strategy adds noise to the results and if clean null lines would yield stronger phenotypes. Please discuss this possibility, or others, in regard to the variability in some phenotypes.

      For the first part of this point, please see below our answer to Reviewer #3, point (2) c.

      Regarding the F0 strategy potentially adding variability, it is an interesting question which we tested in a larger dataset of behavioural recordings from F0 and stable knockouts for the same genes (unpublished). In summary, the F0 knockout method does not increase clutch-to-clutch or larva-to-larva variability in the assay. F0 knockout experiments found many more significant parameters and larger effect sizes than stable knockout experiments, but this difference could largely be explained by the larger sample sizes of F0 knockout experiments. In fact, larger sample sizes within individual clutches appear to be a major advantage of the F0 knockout approach over in-cross of heterozygous knockout animals, as they increase the sensitivity of the assay without causing substantial variability. We plan to report on this analysis in more detail in a separate paper, as we think it would dilute the focus of the present work.

      - In this work, the knockout of appa/appb is included. While APP is a well-known risk gene, there is no clear justification for making a knockout model. It is well known that the upregulation of app is the driver of Alzheimer's, not downregulation. The authors even indicate an expectation that it could be similar to the other knockouts ("Moreover, the behavioural phenotypes of appa/appb and psen1 knockout larvae had little overlap while they presumably both resulted in the loss of Aβ." and "Comparing with early-onset genes, psen1 knockouts had similar night-time phenotypes, but loss of psen2 or appa/appb had no effect on night-time sleep."). There is no reason to expect similarity between appa/appb and psen1/2. I understand that the app knockouts could unveil interesting early neurodevelopmental roles, but the manuscript needs to be clarified that any findings could be the opposite of expectation in AD.

      On “there is no reason to expect similarity […]”, we disagree. Knockout of appa/appb and knockout of psen1 will both result in loss of Aβ (appa/appb encode Aβ and psen1 cleaves Appa/Appb to release Aβ, cf. Fig. 3e). Consequently, a phenotype caused by the loss of Aβ, or possibly other Appa/Appb cleavage products, should logically be found in both appa/appb and psen1 knockouts.

      On “it is well known that the upregulation of APP is the driver of Alzheimer’s, not downregulation”; we of course agree. Among others, the examples of Down syndrome, APP duplication (Sleegers et al., 2006), or mouse models overexpressing human APP show definitively that overexpression of APP is sufficient to cause AD. Having said that, we would not be so quick to dismiss APP knockout as potentially relevant to the understanding of Alzheimer’s disease. Loss of soluble Aβ due to aggregation could contribute to pathology (Espay et al., 2023). Without getting too much into this intricate debate, links between levels of Aβ and risk of disease are often counter-intuitive too. For example, out of 138 PSEN1 mutations screened in vitro, 104 reduced total Aβ production and 11 even seemingly abolished the production of both Aβ40 and Aβ42 (Sun et al., 2017). In short, loss of soluble Aβ occurs both in AD and in our appa/appb knockout larvae, but the ideal approach would be to study zebrafish larvae with an in-frame deletion in the Aβ sequence within appa/appb.

      We will adapt the language to address your point. We would not want to imply, for example, that the absence of a night-time sleep phenotype for appa/appb is contradictory to the body of literature showing links between Aβ and sleep, including in zebrafish (Özcan et al., 2020). As you say, our experiment tested loss of App, including Aβ, while the literature typically reports on overexpression of APP, as in APP/PSEN1-overexpressing mice (Jagirdar et al., 2021).

      Reviewer #3 (Public Review):

      In this manuscript by Kroll and colleagues, the authors describe combining behavioral pharmacology with sleep profiling to predict disease and potential treatment pathways at play in AD. AD is used here as a case study, but the approaches detailed can be used for other genetic screens related to normal or pathological states for which sleep/arousal is relevant. The data are for the most part convincing, although generally the phenotypes are relatively small and there are no major new mechanistic insights. Nonetheless, the approaches are certainly of broad interest and the data are comprehensive and detailed.

      A notable weakness is the introduction, which overly generalizes numerous concepts and fails to provide the necessary background to set the stage for the data.

      Major points

      (1) The authors should spend more time explaining what they see as the meaning of the large number of behavioral parameters assayed and specifically what they tell readers about the biology of the animal. Many are hard to understand--e.g. a "slope" parameter.

      We agree that some parameters do not tell something intuitive about the biology of the animal. It would be easy to speculate. For example, the “activity slope” parameter may indicate how quickly the animal becomes tired over the course of the day. On the other hand, fractal dimension describes the “roughness/smoothness” of the larva’s activity trace (Fig. 2–suppl. 1a); but it is not obvious how to translate this into information about the physiology of the animal. We do not see this as an issue though. While some parameters do provide intuitive information about the animal’s behaviour (e.g. sleep duration or sunset startle as a measure of startle response), the benefit of having a large number of behavioural parameters is to compare behavioural fingerprints and assess rescue of the behavioural phenotype by small molecules (Fig. 6c). For this purpose, the more parameters the better. The “MoSeq” approach from Wiltschko et al., 2020 is a good example from literature that inspired our own Fig. 6c. While some of the “behavioural syllables” may be intuitive (e.g. running or grooming), it is probably pointless to try to explain the ‘meaning’ of the “small left turn in place with head motion” syllable (Wiltschko et al., 2020). Nonetheless, this syllable was useful to assess whether a drug specifically treats the behavioural phenotype under study without causing too many side effects. Unfortunately, ZOLTAR has to reduce the FramebyFrame fingerprint (17 parameters) to just six parameters to compare it to the behavioural dataset from Rihel et al., 2010, but here, more parameters would almost certainly translate into better predictions too, regardless of their intuitiveness.

      It is true however that we do not give much information on how some of the less intuitive parameters, such as activity slope or fractal dimension, are calculated or what they describe about the dataset (e.g. roughness/smoothness for fractal dimension). We will improve this in our revised version.

      (2) Because in the end the authors did not screen that many lines, it would increase confidence in the phenotypes to provide more validation of KO specificity. Some suggestions include:

      a. The authors cite a psen1 and psen2 germline mutant lines. Can these be tested in the FramebyFrame R analysis? Do they phenocopy F0 KO larvae?

      We unfortunately do not have those lines. We investigated the feasibility of importing a psen2 knockout line from abroad, but shipping live animals is becoming increasingly cost- and time-prohibitive. However, we observed the same pigmentation phenotype in psen2 knockouts as reported by Jiang et al., 2018, which at least partially confirms that the F0 knockouts phenocopy a stable loss-of-function mutant.

      b. psen2KO is one of the larger centerpieces of the paper. The authors should present more compelling evidence that animals are truly functionally null. Without this, how do we interpret their phenotypes?

      We disagree that there should be significant doubt about these mutants being truly functionally null, given the high mutation rate and presence of the expected pigmentation phenotype (Jiang et al., 2018, Fig. 3f and Fig. 3–suppl. 2). The psen2 F0 knockouts were virtually 100% mutated at three exons across the gene (mutation rates were locus 1: 100 ± 0%; locus 2: 99.99 ± 0.06%; locus 3: 99.85 ± 0.24%). Additionally, two of the three mutated exons had particularly high rates of frameshift mutations (locus 1: 97 ± 5%; locus 2: 88 ± 17% frameshift mutation rate). It is virtually impossible that a functional protein is translated given this burden of frameshift mutations. Phenotypically, in addition to the pigmentation defect, double psen1/psen2 F0 knockout larvae had curved tails, the same phenotype as caused by a high dose of the γ-secretase inhibitor DAPT (Yang et al., 2008). These double F0 knockouts were lethal, while knockout of psen1 or psen2 alone did not cause obvious morphological defects. Evidently, most larvae must have been psen2 null mutants in this experiment, otherwise functional Psen2 would have prevented early lethality.
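
      The arithmetic behind this argument can be sketched as follows. The example treats the mean frameshift rates at locus 1 and locus 2 as independent per-allele probabilities, which is a simplifying assumption not stated in the text, and asks what fraction of alleles would carry no frameshift at either locus. The helper name `fraction_without_frameshift` is hypothetical.

```python
def fraction_without_frameshift(frameshift_rates):
    """Fraction of alleles escaping a frameshift at every listed locus.

    Assumes (for illustration only) that loci are mutated independently.
    """
    escaped = 1.0
    for rate in frameshift_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be fractions between 0 and 1")
        escaped *= 1.0 - rate
    return escaped


# Mean frameshift rates quoted above: locus 1 = 97%, locus 2 = 88%.
escape_fraction = fraction_without_frameshift([0.97, 0.88])
print(f"{escape_fraction:.4f}")
```

      Under these assumptions, fewer than 1 in 200 alleles would escape a frameshift at both loci, and such alleles may still carry disruptive in-frame mutations at these loci or at locus 3.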

      Translation of zebrafish psen2 can start at downstream start codons if the first exon has a frameshift mutation, generating a seemingly functional Psen2 missing the N-terminus (Jiang et al., 2020). Zebrafish homozygous for this early frameshift mutation had normal pigmentation, showing it is a reliable marker of Psen2 function even when it is mutated. This mechanism is not a concern here as the alternative start codons are still upstream of two of the three mutated exons (the alternative start codons discovered by Jiang et al., 2020 are in exon 2 and 3, but we targeted exon 3, exon 4, and exon 6).

      We understand that the zebrafish community may be cautious about F0 phenotyping compared to stably generated mutants. As mentioned to Reviewer 2, we are planning to assemble a paper that expressly examines F0s vs. stable mutants to allay some of these concerns. We would also suggest that our current manuscript, which combines CRISPR-F0 rapid screening with in silico pharmacological predictions, ultimately represents a first step in characterizing the functions of genes.

      c. Related to the above, for cd2AP and sorl1 KO, some of the effect sizes seem to be driven by one clutch and not the other. In other words, great clutch-to-clutch variability. Should the authors increase the number of clutches assayed?

      Correct, there is great clutch-to-clutch variability in this behavioural assay. This is not specific to our experiments. Even within the same strain, wild-type larvae from different clutches (i.e. non-siblings) behave differently (Joo et al., 2021). This is why it is essential to compare behavioural phenotypes within individual clutches (i.e., from a single pair of parents, one male and one female), as we explain in Methods (section Behavioural video-tracking) and in the documentation of the FramebyFrame package. We often see two different experimental designs in literature: comparing non-sibling wild-type and mutant larvae, or pooling different clutches which include all genotypes (e.g., pooling multiple clutches from heterozygous in-crosses or pooling wild-type clutches before injecting them). The first experimental design causes false positive findings, as the clutch-to-clutch variability we and others (Joo et al., 2021) observe gets interpreted as a behavioural phenotype. The second experimental design should not cause false positives but will decrease the sensitivity of the assay by increasing the spread within genotypes. In both cases, the clutch-to-clutch variability is hidden, either by interpreting it as a phenotype (first case) or by adding it to animal-to-animal variability (second case). Our experimental design is technically more challenging as it requires obtaining large clutches from unique pairs of parents. However, this approach is better as it clearly separates the different sources of variability (clutch-to-clutch or animal-to-animal). As for every experiment, yes, a larger number of replicates would be better, but we do not plan to assay additional clutches at this time. Our work heavily focuses on the sorl1 and psen2 knockout behavioural phenotypes. The key aspects of these phenotypes were effectively tested in four clutches, as sorl1 was also tested in the citalopram experiment (Fig. 5) and psen2 was also tested in the small molecule rescue experiment (Fig. 6 and Fig. 6–suppl. 1). In the citalopram experiment, one H2O-treated sorl1 knockout clutch (n = 10) replicates the baseline recordings in Fig. 4–suppl. 5 fairly well; the other does not, but it had an especially low sample size (n = 6).
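
      The difference between the three designs can be illustrated with a small simulation. This is a toy sketch under stated assumptions, not an analysis of our data: the clutch offset, group size, unit variance, and all names are hypothetical, and no true genotype effect is simulated.

```python
import random
import statistics


def simulate(n_per_group=2000, clutch_offset=3.0, seed=0):
    """Compare non-sibling, pooled, and within-clutch designs with no true effect."""
    rng = random.Random(seed)

    def draw(mean):
        return [rng.gauss(mean, 1.0) for _ in range(n_per_group)]

    # Two clutches differ only in their baseline; genotype has no effect.
    clutch_a = {"control": draw(0.0), "knockout": draw(0.0)}
    clutch_b = {"control": draw(clutch_offset), "knockout": draw(clutch_offset)}

    return {
        # Design 1: knockouts from one clutch versus controls from another.
        "non_sibling_diff": statistics.mean(clutch_b["knockout"])
        - statistics.mean(clutch_a["control"]),
        # Design 2: clutches pooled within genotype, which inflates the spread.
        "pooled_sd": statistics.stdev(clutch_a["control"] + clutch_b["control"]),
        # Design 3: comparison within a single clutch.
        "within_clutch_diff": statistics.mean(clutch_a["knockout"])
        - statistics.mean(clutch_a["control"]),
        "within_clutch_sd": statistics.stdev(clutch_a["control"]),
    }


result = simulate()
```

      In this toy setting the non-sibling comparison reports a difference close to the clutch offset even though there is no genotype effect, and pooling gives a larger standard deviation than the within-clutch comparison.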

      We also plan to test another SSRI on sorl1 knockouts, so this point will be addressed.

      (3) The authors make the point that most of the AD risk genes are expressed in fish during development. Is there public data to comment on whether the genes of interest are expressed in mature/old fish as well? Just because the genes are expressed early does not at all mean that early- life dysfunction is related to future AD (though this could be the case, of course). Genes with exclusive developmental expression would be strong candidates for such an early-life role, however. I presume the case is made because sleep studies are mainly done in juvenile fish, but I think it is really a pretty minor point and such a strong claim does not even need to be made.

      This is a fair criticism but we do not make this claim, at least not from expression. The reviewer is probably referring to the following quote:

      “[…] most of these were expressed in the brain of 5–6-dpf zebrafish larvae, suggesting they play a role in early brain development or function,”

      which does not mention future risk of Alzheimer’s disease. We do suggest that these genes have a function in development. After all, every gene that plays a role in brain development must be expressed during development, so this wording seems reasonable. As noted, the primary goal was to check that the genes we selected were indeed expressed in zebrafish larvae before performing knockout experiments. Our discussion does raise the hypothesis that mutations in Alzheimer’s risk genes impact brain development and sleep early in life, but this argument primarily relies on our observation that knockout of late-onset Alzheimer’s risk genes causes sleep phenotypes in 7-day old zebrafish larvae and from previous work showing brain structural differences in infants and children at high genetic risk of Alzheimer’s disease (Dean et al., 2014; Quiroz et al., 2015), not solely on gene expression early in life.

      (4) A common quandary with defining sleep behaviorally is how to rectify sleep and activity changes that influence one another. With psen2 KOs, the authors describe reduced activity and increased sleep during the day. But how do we know if the reduced activity drives increased behavioral quiescence that is incorrectly defined as sleep? In instances where sleep is increased but activity during periods during wake are normal or elevated, this is not an issue. But here, the animals might very well be unhealthy, and less active, so naturally they stop moving more for prolonged periods, but the main conclusion is not sleep per se. This is an area where more experiments should be added if the authors do not wish to change/temper the conclusions they draw. Are psen2 KOs responsive to startling stimuli like controls when awake? Do they respond normally when quiescent? Great care must be taken in all models using inactivity as a proxy for sleep, and it can harm the field when there is no acknowledgment that overall health/activity changes could be a confound. Particularly worrisome is the betamethasone data in Figure 6, where activity and sleep are once again coordinately modified by the drug.

      This is a fair criticism. We agree it is a concern, especially in the case of psen2 as we claim that day-time sleep is increased while zebrafish are diurnal. We do not rely heavily on the day-time inactivity being sleep (the ZOLTAR predictions or the small molecule rescue do not change whether the parameter is called sleep or inactivity), but our choice of labelling may be misleading. We will try to test this claim by plotting the distribution of the inactive period durations. If psen2 knockout larvae indeed sleep more during the day compared to controls, we would predict that inactive periods longer than 1 minute increase disproportionately compared to shorter inactive periods.
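
      A minimal sketch of this planned analysis is given below, assuming the tracking data can be reduced to one moved/did-not-move flag per frame. The function names, the 25 frames-per-second default, and the toy trace are assumptions for illustration and are not taken from the FramebyFrame package.

```python
def inactive_bout_durations(moved, fps=25):
    """Durations (in seconds) of consecutive runs of frames without movement."""
    durations = []
    run = 0  # number of consecutive inactive frames in the current bout
    for frame_moved in moved:
        if frame_moved:
            if run:
                durations.append(run / fps)
            run = 0
        else:
            run += 1
    if run:  # close a bout that lasts until the end of the recording
        durations.append(run / fps)
    return durations


def fraction_longer_than(durations, threshold_s=60.0):
    """Share of inactive bouts strictly longer than the threshold."""
    if not durations:
        return 0.0
    return sum(d > threshold_s for d in durations) / len(durations)


# Toy trace: a 2 s bout and a 1 s bout separated by single active frames.
trace = [True] + [False] * 50 + [True] + [False] * 25
bouts = inactive_bout_durations(trace, fps=25)
```

      Comparing `fraction_longer_than(bouts, 60.0)` between knockout and control larvae from the same clutch would be one way to ask whether long inactive periods increase disproportionately.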

      To address the question “are psen2 KO responsive to startling stimuli like controls when awake/when quiescent”, we can try to look at the behaviour of psen2 knockout larvae that were awake (i.e., moved in the preceding one minute) or ‘asleep’ (i.e., did not move in the preceding one minute) at the light transitions and count the proportion of psen2 knockout or control larvae which displayed a startle response. If most psen2 knockouts react to the light transition, it should at least exclude the concern that they are very unhealthy, as the reviewer suggests. This criticism seems challenging to address definitively by experiment, though. A possible approach could be to use a closed-loop system which, after one minute of inactivity, triggers a stimulus which is sufficient to startle an awake larva but not an asleep larva. If psen2 knockout larvae indeed sleep more during the day, the stimulus should usually not be sufficient to startle them. Note that how to calibrate this stimulus is also not straightforward. We do not plan to test this, but our analysis of the light transitions may provide a decent proxy.

      (5) The conclusions for the serotonin section are overstated. Behavioural pharmacology purports to predict a signaling pathway disrupted with sorl1 KO. But is it not just possible that the drug acts in parallel to the true disrupted pathway in these fish? There is no direct evidence for serotonin dysfunction - that conclusion is based on response to the drug. Moreover, it is just 1 drug - is the same phenotype present with another SSRI? Likewise, language should be toned down in the discussion, as this hypothesis is not "confirmed" by the results (consider "supported"). The lack of measured serotonin differences further raises concern that this is not the true pathway. This is another major point that deserves further experimental evidence, because without it, the entire approach (behavioral pharm screen) seems more shaky as a way to identify mechanisms. There are any number of testable hypotheses to pursue such as a) Using transient transgenesis to visualize 5HT neuron morphology (is development perturbed: cell number, neurite morphology, synapse formation); b) Using transgenic Ca reporters to assay 5HT neuron activity.

      Regarding the comment, “is it not just possible that the drug acts in parallel to the true disrupted pathway”, we think no, assuming we understand your question correctly. Key to our argument is the fact that sorl1 knockout larvae react differently to the drug than control larvae. As an example, take night-time sleep bout length, which was not affected by knockout of sorl1 (Fig. 4–suppl. 5). For the sake of the argument, say only dopamine signalling (the “true disrupted pathway”) was affected in sorl1 knockouts but that serotonin signalling was intact. Assuming that citalopram specifically alters serotonin signalling, then treatment should cause the same increase in sleep bout length in both knockouts and controls as serotonin signalling is intact in both. This is not what we see, however. Citalopram caused a greater increase in sleep bout length in sorl1 knockouts than in scrambled-injected larvae. In other words, the effect is non-additive, in the sense that citalopram did not add the same number of Z-scores to sorl1 knockouts and controls. We think this shows that serotonin signalling is somehow different in sorl1 knockouts. Nonetheless, we would concede that the experiment does not necessarily say much about the importance of the serotonin disruption caused by loss of Sorl1. It could be, for example, that the most salient consequence of loss of Sorl1 is cholinergic disruption (see reply to Reviewer #1 above) and that serotonin signalling is a minor theme.
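
      The non-additivity argument amounts to a difference of differences, which can be sketched as below. The Z-score values and function names are hypothetical and chosen only to illustrate the logic; they are not measurements from the citalopram experiment.

```python
from statistics import mean


def drug_effect(treated, untreated):
    """Mean shift (in Z-scores) caused by the drug within one genotype."""
    return mean(treated) - mean(untreated)


def interaction(ko_drug, ko_vehicle, ctrl_drug, ctrl_vehicle):
    """Drug effect in knockouts minus drug effect in controls.

    A value near zero means the drug adds the same shift to both genotypes
    (additive); a non-zero value means the genotypes respond differently.
    """
    return drug_effect(ko_drug, ko_vehicle) - drug_effect(ctrl_drug, ctrl_vehicle)


# Hypothetical sleep bout length Z-scores (vehicle = H2O).
ctrl_vehicle = [0.0, 0.2, -0.2]
ctrl_drug = [1.0, 1.2, 0.8]
ko_vehicle = [0.1, -0.1, 0.0]  # no baseline difference between genotypes
ko_drug = [2.4, 2.6, 2.5]

extra_shift_in_knockouts = interaction(ko_drug, ko_vehicle, ctrl_drug, ctrl_vehicle)
```

      In this toy example the drug shifts knockouts by about 1.5 Z-scores more than controls despite identical baselines, which is the pattern we interpret as altered serotonin signalling.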

      Furthermore, we agree with you and Reviewer #2 that the conclusions are overly confident. We will repeat this experiment with another SSRI as you suggest. Your suggestions to further test the serotonin system in the sorl1 knockouts are excellent as well; however, we do not plan to pursue them at this stage.

      References:

      Ashlin TG, Blunsom NJ, Ghosh M, Cockcroft S, Rihel J. 2018. Pitpnc1a Regulates Zebrafish Sleep and Wake Behavior through Modulation of Insulin-like Growth Factor Signaling. Cell Rep 24:1389–1396. doi:10.1016/j.celrep.2018.07.012

      Chen D, Wang X, Huang T, Jia J. 2022. Sleep and Late-Onset Alzheimer’s Disease: Shared Genetic Risk Factors, Drug Targets, Molecular Mechanisms, and Causal Effects. Front Genet 13. doi:10.3389/fgene.2022.794202

      Cirrito JR, Disabato BM, Restivo JL, Verges DK, Goebel WD, Sathyan A, Hayreh D, D’Angelo G, Benzinger T, Yoon H, Kim J, Morris JC, Mintun MA, Sheline YI. 2011. Serotonin signaling is associated with lower amyloid-β levels and plaques in transgenic mice and humans. Proc Natl Acad Sci U S A 108:14968–14973. doi:10.1073/pnas.1107411108

      Dean DC, Jerskey BA, Chen K, Protas H, Thiyyagura P, Roontiva A, O’Muircheartaigh J, Dirks H, Waskiewicz N, Lehman K, Siniard AL, Turk MN, Hua X, Madsen SK, Thompson PM, Fleisher AS, Huentelman MJ, Deoni SCL, Reiman EM. 2014. Brain Differences in Infants at Differential Genetic Risk for Late-Onset Alzheimer Disease A Cross-sectional Imaging Study. JAMA Neurol 71:11–22. doi:10.1001/jamaneurol.2013.4544

      Eriksen JL, Sagi SA, Smith TE, Weggen S, Das P, McLendon DC, Ozols VV, Jessing KW, Zavitz KH, Koo EH, Golde TE. 2003. NSAIDs and enantiomers of flurbiprofen target γ-secretase and lower Aβ42 in vivo. J Clin Invest 112:440–449. doi:10.1172/JCI18162

      Espay AJ, Herrup K, Kepp KP, Daly T. 2023. The proteinopenia hypothesis: Loss of Aβ42 and the onset of Alzheimer’s Disease. Ageing Res Rev 92:102112. doi:10.1016/j.arr.2023.102112

      Hoffman EJ, Turner KJ, Fernandez JM, Cifuentes D, Ghosh M, Ijaz S, Jain RA, Kubo F, Bill BR, Baier H, Granato M, Barresi MJF, Wilson SW, Rihel J, State MW, Giraldez AJ. 2016. Estrogens Suppress a Behavioral Phenotype in Zebrafish Mutants of the Autism Risk Gene, CNTNAP2. Neuron 89:725–733. doi:10.1016/j.neuron.2015.12.039

      in ’t Veld BA, Ruitenberg A, Hofman A, Launer LJ, van Duijn CM, Stijnen T, Breteler MMB, Stricker BHC. 2001. Nonsteroidal Antiinflammatory Drugs and the Risk of Alzheimer’s Disease. N Engl J Med 345:1515–1521. doi:10.1056/NEJMoa010178

      Jagirdar R, Fu C-H, Park J, Corbett BF, Seibt FM, Beierlein M, Chin J. 2021. Restoring activity in the thalamic reticular nucleus improves sleep architecture and reduces Aβ accumulation in mice. Sci Transl Med 13:eabh4284. doi:10.1126/scitranslmed.abh4284

      Jiang H, Newman M, Lardelli M. 2018. The zebrafish orthologue of familial Alzheimer’s disease gene PRESENILIN 2 is required for normal adult melanotic skin pigmentation. PLOS ONE 13:e0206155. doi:10.1371/journal.pone.0206155

      Jiang H, Pederson SM, Newman M, Dong Y, Barthelson K, Lardelli M. 2020. Transcriptome analysis indicates dominant effects on ribosome and mitochondrial function of a premature termination codon mutation in the zebrafish gene psen2. PloS One 15:e0232559. doi:10.1371/journal.pone.0232559

      Joo W, Vivian MD, Graham BJ, Soucy ER, Thyme SB. 2021. A Customizable Low-Cost System for Massively Parallel Zebrafish Behavioral Phenotyping. Front Behav Neurosci 14.

      Joubert L, Hanson B, Barthet G, Sebben M, Claeysen S, Hong W, Marin P, Dumuis A, Bockaert J. 2004. New sorting nexin (SNX27) and NHERF specifically interact with the 5-HT4a receptor splice variant: roles in receptor targeting. J Cell Sci 117:5367–5379. doi:10.1242/jcs.01379

      Lauretti E, Dincer O, Praticò D. 2020. Glycogen synthase kinase-3 signaling in Alzheimer’s disease. Biochim Biophys Acta Mol Cell Res 1867:118664. doi:10.1016/j.bbamcr.2020.118664

      Leng Y, Ackley SF, Glymour MM, Yaffe K, Brenowitz WD. 2021. Genetic Risk of Alzheimer’s Disease and Sleep Duration in Non-Demented Elders. Ann Neurol 89:177–181. doi:10.1002/ana.25910

      Mitchell PB, Hadzi-Pavlovic D. 2000. Lithium treatment for bipolar disorder. Bull World Health Organ 78:515–517.

      Munoz-Torrero D. 2008. Acetylcholinesterase Inhibitors as Disease-Modifying Therapies for Alzheimer’s Disease. Curr Med Chem 15:2433–2455. doi:10.2174/092986708785909067

      Muto V, Koshmanova E, Ghaemmaghami P, Jaspar M, Meyer C, Elansary M, Van Egroo M, Chylinski D, Berthomier C, Brandewinder M, Mouraux C, Schmidt C, Hammad G, Coppieters W, Ahariz N, Degueldre C, Luxen A, Salmon E, Phillips C, Archer SN, Yengo L, Byrne E, Collette F, Georges M, Dijk D-J, Maquet P, Visscher PM, Vandewalle G. 2021. Alzheimer’s disease genetic risk and sleep phenotypes in healthy young men: association with more slow waves and daytime sleepiness. Sleep 44. doi:10.1093/sleep/zsaa137

      Myers-Turnbull D, Taylor JC, Helsell C, McCarroll MN, Ki CS, Tummino TA, Ravikumar S, Kinser R, Gendelev L, Alexander R, Keiser MJ, Kokel D. 2022. Simultaneous analysis of neuroactive compounds in zebrafish. doi:10.1101/2020.01.01.891432

      Özcan GG, Lim S, Leighton PL, Allison WT, Rihel J. 2020. Sleep is bi-directionally modified by amyloid beta oligomers. eLife 9:e53995. doi:10.7554/eLife.53995

      Quiroz YT, Schultz AP, Chen K, Protas HD, Brickhouse M, Fleisher AS, Langbaum JB, Thiyyagura P, Fagan AM, Shah AR, Muniz M, Arboleda-Velasquez JF, Munoz C, Garcia G, Acosta-Baena N, Giraldo M, Tirado V, Ramírez DL, Tariot PN, Dickerson BC, Sperling RA, Lopera F, Reiman EM. 2015. Brain Imaging and Blood Biomarker Abnormalities in Children With Autosomal Dominant Alzheimer Disease: A Cross-Sectional Study. JAMA Neurol 72:912–919. doi:10.1001/jamaneurol.2015.1099

      Relkin NR. 2007. Beyond symptomatic therapy: a re-examination of acetylcholinesterase inhibitors in Alzheimer’s disease. Expert Rev Neurother 7:735–748. doi:10.1586/14737175.7.6.735

      Rihel J, Prober DA, Arvanites A, Lam K, Zimmerman S, Jang S, Haggarty SJ, Kokel D, Rubin LL, Peterson RT, Schier AF. 2010. Zebrafish Behavioral Profiling Links Drugs to Biological Targets and Rest/Wake Regulation. Science 327:348–351. doi:10.1126/science.1183090

      Sleegers K, Brouwers N, Gijselinck I, Theuns J, Goossens D, Wauters J, Del-Favero J, Cruts M, van Duijn CM, Van Broeckhoven C. 2006. APP duplication is sufficient to cause early onset Alzheimer’s dementia with cerebral amyloid angiopathy. Brain J Neurol 129:2977–2983. doi:10.1093/brain/awl203

      Sun L, Zhou R, Yang G, Shi Y. 2017. Analysis of 138 pathogenic mutations in presenilin-1 on the in vitro production of Aβ42 and Aβ40 peptides by γ-secretase. Proc Natl Acad Sci 114:E476–E485. doi:10.1073/pnas.1618657114

      Weggen S, Rogers M, Eriksen J. 2007. NSAIDs: small molecules for prevention of Alzheimer’s disease or precursors for future drug development? Trends Pharmacol Sci 28:536–543. doi:10.1016/j.tips.2007.09.004

      Wiltschko AB, Tsukahara T, Zeine A, Anyoha R, Gillis WF, Markowitz JE, Peterson RE, Katon J, Johnson MJ, Datta SR. 2020. Revealing the structure of pharmacobehavioral space through motion sequencing. Nat Neurosci 23:1433–1443. doi:10.1038/s41593-020-00706-3

      Yang T, Arslanova D, Gu Y, Augelli-Szafran C, Xia W. 2008. Quantification of gamma-secretase modulation differentiates inhibitor compound selectivity between two substrates Notch and amyloid precursor protein. Mol Brain 1:15. doi:10.1186/1756-6606-1-15

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      A. General Statements

      We thank the reviewers for their constructive feedback. We have made significant revisions to the mathematical modelling section of the manuscript to address your concerns. Therefore, some of the specific issues and concerns raised in previous reviews no longer apply. Where that is the case, please see the relevant context in the revision as indicated in the point-by-point description section below. We summarize the key points in the revised manuscript as follows.

      1. The key finding of our study, involving experimental measurements and mathematical modelling, is plasticity in the MinD concentration gradient, which results from spatial differences in molecular interactions and is an intrinsic property of the Min system during cell growth. This study not only reveals the role of the MinD concentration gradient in modulating bacterial cell division site placement but also showcases an example of cellular components acting in the form of a concentration gradient in fundamental cellular processes, a concept crucial in cell biology. This work provides a conceptual advance in the quantitative understanding of MinD oscillations in the cellular environment and has implications for further studies of bacterial cell division regulation.

      2. The reviewer requested clarification on the differences between our study and previous studies involving experimental measurements and mathematical modelling of Min oscillations in cells. We would like to emphasize that although the goal of the previous works was also to measure the spatiotemporal distribution of oscillating MinD concentration gradients as a function of cell length, these works conceived the problem differently and therefore used different experimental designs and execution methods, which differentiates our key conclusions from theirs. This is also true for mathematical modelling. Although similar observations can be found in some respects, they are not directly comparable due to the different mathematics and assumptions used in the simulations. For example, our model was built to investigate the biological question of the MinD concentration gradient during cell elongation, not to evaluate the impact of cell shape and confinement or the nucleation effect of MinD. Thus, our model cannot be generalized to other shapes, such as those observed in the study by Wu et al. (2015). Therefore, we would like to draw attention to the experimental rigor and to the specific points and views that contribute to our understanding of Min systems. We now provide a comprehensive comparison between them in the Supplemental Information.

      3. We have re-run the simulation to refine and improve the modelling procedures and results, and the corresponding text and illustration are provided in the Results section of the main text (Lines 265-279, 614-653) and Fig. S6. In brief, we fixed the diffusion coefficients D_D and D_E to the values from Meacci et al. (2006), the dissociation rate constant k_de to the value from a previous simulation (Wu et al., 2015), and the MinD and MinE concentrations to those measured experimentally in this study. Meanwhile, the diffusion coefficients D_d and D_de were assigned assumed values based on bacterial membrane protein diffusion (Schavemaker et al., 2018). This approach allowed us to probe the general behaviours of the system. As a result, we obtained a few parameter sets, including #2728, that generate oscillation period, λ_N and I_Ratio features that closely mimic MinD oscillation in the cellular context (Figs. 4C, S7-9). We further tested the impact of different kinetic constants, k_de, k_dD, k_dE, k_D, and k_(ADP→ATP), which represent different molecular interactions, on the oscillation period, λ_N and I_Ratio (Fig. 4D-H). These findings provide a solid theoretical view of how oscillation features may be controlled by different molecular interactions. Furthermore, the modelling results help us understand the possible mechanisms associated with oscillation cycle maintenance and length-dependent variable concentration gradients.

      4. Regarding the inclusion or removal of results from more culture conditions, we decided to keep only one condition as in the previous version for the following reasons. In order to draw convincing conclusions, we consider it more important to characterize all aspects under the same growth condition and avoid manipulation. Therefore, the main conclusions are drawn from our experiments characterizing several aspects of MinD oscillations in cells growing with 0.4% glucose. In support of these observations, we decided to maintain only one other condition, 0.1% glucose. Further analysis of cells growing under other conditions will not change the main conclusions but will increase the difficulty of determining how the MinD concentration changes with cell growth.

      5. Studying the variable concentration gradient underlying the dynamic oscillations of the Min system may be of broad interest to cell biologists since the concentration gradient plays a fundamental role in various cellular processes, and the concept of concentration gradients is crucial in cell biology. Examples of related processes include passive and active transport, osmosis, cell signalling, and maintenance of cellular homeostasis. These processes allow cells to respond to their environment, regulate their internal conditions, and perform important functions required for survival and normal function. In addition, the variable concentration gradient, which is characterized by the numerical descriptor λ_N and was reproduced in a simple mathematical model, demonstrates nonlinear dynamic behaviour in physical biology. Therefore, the audience of this work can include the broader general audience of cell biology and physical biology rather than just the immediate specialized audience interested in the Min system. We also reiterate the importance of specialized research, which often provides the basis for broader application and understanding.

      B. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Parada et al. studied both experimentally and theoretically the MinD concentration distribution of Min waves during cell growth. The main finding was that (i) the gradient of MinD is steeper for longer cells and accordingly the MinD concentration at the middle of cell is lower, (ii) period of the oscillation is independent to the cell length, and (iii) those features are shared even under glucose starvation except the MinD gradient is steeper. (iv) Those results are supplemented by the analyses of the reaction-diffusion equations in which parameters that can reproduce the MinD concentration distribution are identified. I think the results are interesting; basically, as the cell grows, the contrast of the wave becomes clearer, such the MinD concentration at the cell centre decreases. The results may clarify the mechanism of FtsZ accumulation at the cell centre more quantitatively. The experiments were performed by measuring the fluorescent intensity of MinD during cell growth and analysing the intensity distribution along the long axis of the cell. The theoretical results were based on the analyses of the reaction-diffusion model. Both approaches are already well established and the results sound. Nevertheless, I do not think the novelty of this work is not well highlighted in the current manuscript; I think most of the results, except (iii) and (iv), have already been shown explicitly or implicitly in the previous studies. Min oscillations in a growing cell have been analysed both theoretically and experimentally in (Meacci 2005) and [1] (Fischer-Friedrich et al, 2010). The concentration distribution and period of the oscillation were measured. The complete results were presented in [2] (Meacci et al., 2006), and I am not aware of those results in scientific journals (the thesis is available online). Nevertheless, I think it is fair to cite those studies and compare the current results with them. 
In fact, in [2], it was shown that the concentration of MinD near the cell centre decreases as the cell grows, the total MinD concentration is approximately constant during the growth (therefore, the number of the molecules increases), and that the variance of the period becomes smaller as the cell grows. I do not think those previous studies spoil this work, and this work deserves publication somewhere. Still, the authors should highlight the novelty of this study more clearly.

      ANS: We thank the reviewer for recognizing the soundness of our experimental and theoretical approaches and results. The key finding of our study, involving experimental measurements and mathematical modelling, is plasticity in the MinD concentration gradient, which results from spatial differences in molecular interactions and is an intrinsic property of the Min system during cell growth. This study not only reveals the role of the MinD concentration gradient in modulating bacterial cell division site placement but also showcases an example of cellular components acting in the form of a concentration gradient in fundamental cellular processes, a concept crucial in cell biology. We believe that established techniques and methods are integral to a broad range of works and give confidence when they are improved and used to test hypotheses and obtain results. We also thank the reviewer for pointing out that Meacci's PhD thesis entitled "Physical aspects of Min oscillations in Escherichia coli" (Meacci & Kruse, 2005) is available online for public access. This thesis, along with two publications (Meacci & Kruse, 2005; Meacci et al., 2006), explored Min oscillations in growing cells and used mathematical models. These two published works were cited in the previous version of the manuscript because we agree that these earlier works provide valuable context. As recommended, we went through these works again, together with the work by Fischer-Friedrich et al. (2010), to compare their wet experiments and mathematical models with ours; the comparison is detailed in the Supplemental Information (Lines 26-147).
Here, we emphasize that although the published works and our work set the goal of measuring the spatiotemporal distribution of oscillating MinD concentration gradients as a function of cell length, we conceived the problem differently and therefore used different experimental designs and analysis approaches, which have led to the key conclusions that differentiate our work from theirs.

      Major comments: (i) In (Meacci 2005) and [1,2], it was claimed that the standard deviation of the period is comparable with the mean period, particularly for the shorter cell. Therefore, they did not claim the period is independent to the cell length. As far as I understood, the variance arises from the variance of the total protein concentration in the assemble of cells. I am wondering how the authors are able to conclude the constant period in different cell length. I also point out that in the theoretical part of (Meacci 2005), the period is, in fact, increasing as the cell grows and suddenly decreases at the length in which cell division occurs.

      ANS: In our experiments, we found that the oscillation periods ranged from 36.8 to 65.6 sec, as measured from a population of cells (length of 1.9-4.5 µm; main text, Fig. 1E). Moreover, the standard deviations of the period ranged from 5.4% to 34.8% of the period, with larger standard deviations more common in shorter cells (Fig. 1D), indicating that regular interpolar oscillations are more likely to occur in longer cells. This observation echoes the study by Fischer-Friedrich et al. (2010), who reported stochastic switching of MinD between the two cell poles in cells shorter than 2.5 µm. In their study, MinD started to oscillate regularly from pole to pole at 2.5-3 µm with an oscillation period of 80 sec. Above 3.5 µm, MinD invariably underwent regular oscillation with an initial period of 87 sec, which then decreased to 70 sec at the end. They focused on the length-dependent switching from stochastic to regular oscillation states and speculated that the amount of MinE bound to the membrane critically influenced the shift from stochastic to regular interpolar oscillations. In addition, their observation of a longer period at the initial phase and a shorter period after the cells grew beyond 3.5 µm somewhat coincided with our simulation results, as shown in Fig. 4C-H, left. In Meacci's work (Thesis: Figure 2.14; Meacci & Kruse, 2005: Figure 5(b)), the temporal oscillation periods ranged from 40 to 120 sec when focusing on cells with lengths similar to those in our measurements (black dots in Meacci's chart). Our measurements of oscillation periods clearly show much smaller fluctuations than those in Meacci's study and are more comparable to Fischer-Friedrich's measurements. Differences can arise across different bacterial strains and culture conditions that may significantly affect the amount and quality of protein expressed in individual studies.
In short, all three works differ in terms of experimental design and execution. Although similar observations can be found in some aspects, they are not directly comparable. Therefore, we would like to draw attention to the experimental rigor and specific points and views that contribute to our understanding of the Min system. We have changed the wording from 'constant period' to 'fairly stable period' throughout the manuscript. This description is based on our experimental measurements (Fig. 1D, E) and is also supported by our mathematical modelling (Fig. 4C-H, left). Regarding the statement about the theoretical model of Meacci & Kruse (2005) that "the period is increasing as the cell grows and suddenly decreases at the length in which cell division occurs", we make two points. First, our simulation results revealed a mild increase in the oscillation period during cell elongation (Fig. 4C). The increase is adjustable by varying the reaction rate constants in the simulation (Fig. 4D-H). Second, although we did not simulate dividing cells, our experimental measurements clearly showed that the period increased in newborn cells (Fig. S4). As mentioned above, although similar observations can be found in different studies, they are not directly comparable because the experiments were performed differently for different purposes. We have added a comparison of the different models in the Supplemental Information (Lines 26-147).
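
The relative spread quoted in this response (standard deviation expressed as a percentage of the period) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the period of a single cell is estimated from the times at which the MinD signal peaks at one pole. The function name and the peak times are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def period_stats(peak_times_sec):
    """Return (mean period, standard deviation as % of the mean period).

    peak_times_sec: times (in seconds) of successive intensity maxima at
    one cell pole. This is an assumed input format, not the authors' pipeline.
    """
    times = np.asarray(peak_times_sec, dtype=float)
    periods = np.diff(times)          # time between successive peaks
    mean_period = periods.mean()
    sd = periods.std(ddof=1)          # sample standard deviation
    return mean_period, 100.0 * sd / mean_period

# Hypothetical peak times for one cell
mean_p, rel_sd = period_stats([0, 50, 98, 150, 201])
print(f"mean period = {mean_p:.1f} s, relative SD = {rel_sd:.1f} %")
```

A small relative standard deviation indicates regular oscillation, whereas values approaching the mean period correspond to the irregular switching discussed for short cells.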

      (ii) I do not think the explanations of the reaction-diffusion model were well described. The authors mentioned that they studied a one-dimensional model and used the delta function to describe the membrane reaction. Did the authors study 1D cytosol and 0D membrane? Then, why the surface diffusion term exists in (4) and (5)? I believe the authors simply assumed that both the membrane and the cytosol are 1D (with larger diffusion constants for cytosolic Min concentrations). Then, the delta functions in (1)-(5) are not necessary. In (Wu 2015), the delta function was used in order to treat a 2D membrane embedded in 3D space.

      Besides that, there is no description of the initial conditions for the concentration fields to solve the reaction-diffusion equations. I think the description of the no-flux boundary condition is better put in the Methods rather than supplementary materials.

      ANS: Thank you for your suggestions to improve the description of the numerical model. As summarized below, we have rewritten this section of 'Simulating the dynamic MinD concentration gradient in growing cells' in the manuscript (Lines 237-279). We have specified the dimensionality of the rate and diffusion constants of each molecule, where applicable, in our 1D model from Lines 237-264. Their dimensionality can also be conceived from their units, as listed in Tables 2 and S4. We have specified the initial 'no-flux' boundary conditions in Lines 267, 630, and 647. We agree that the delta function is not necessary and have removed it from the equations.
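
To make the 'no-flux' boundary condition concrete, the following Python sketch shows one common way to impose it in a 1D finite-difference diffusion step. It is a generic illustration under stated assumptions (explicit Euler time stepping, mirrored ghost cells, arbitrary grid and diffusion coefficient) and is not the solver used in the manuscript; the function name is hypothetical.

```python
import numpy as np

def diffuse_no_flux(c, diff_coeff, dx, dt):
    """One explicit Euler diffusion step on a 1D grid with no-flux ends.

    Ghost cells copy the edge values, so the gradient (and therefore the
    flux) across each boundary is zero and total mass is conserved.
    Stability of this explicit scheme requires diff_coeff * dt / dx**2 <= 0.5.
    """
    c = np.asarray(c, dtype=float)
    padded = np.concatenate(([c[0]], c, [c[-1]]))
    laplacian = (padded[2:] - 2.0 * padded[1:-1] + padded[:-2]) / dx**2
    return c + dt * diff_coeff * laplacian

# Hypothetical example: a point-like pulse spreading in a closed 1D domain
conc = np.zeros(50)
conc[10] = 1.0
for _ in range(100):
    conc = diffuse_no_flux(conc, diff_coeff=1.0, dx=1.0, dt=0.2)
print(conc.sum())  # total amount stays (numerically) constant
```

Because nothing leaves through the ends of the domain, the total amount of protein is fixed by the initial condition, which is why the initial concentrations matter in such a model.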

      (iii) As in the previous comment, the current model did not take into account the geometry of the system; namely, cytosol is in 3D and membrane is on 2D. Recent theoretical studies can handle the effect, and also the effect of confinement. I would appreciate it if the authors would make a comment on whether those issues are relevant or not for the conclusion of this work.

      ANS: Thank you for pointing out this interesting aspect of cell geometry, as investigated by Wu et al. (2015). Our model is built to adequately describe changes in the MinD concentration gradient during cell elongation under the assumption that a 1D description is sufficient. Thus, our model cannot be generalized to other shapes, such as those observed by Wu et al. (2015). This point is now commented upon in the Supplemental Information, lines 120-123.

      (iv) I would appreciate it if the authors would describe the screening process more clearly. I did understand the first screening is a finite imaginary part and a positive real part at the first mode of spatial inhomogeneity in the eigenvalues. However, I did not understand the other processes clearly. The second screening is based on \lambda_N and I_Ratio, but its criteria is not clear. I think both quantities fluctuated in experimental results and I am not sure what to define numerical results match them. The third process is based on a fitting error using the fitting function of linear increase plus a constant. I am not sure why we need to exclude, for example, the bottom right example in Fig.S6 because it shows no oscillation until the cell length of 3um but then the gradient linearly increases. Please clarify how to justify the criteria. The same argument applies to the fourth screening process. It is not clear why the slope should be smaller than 2.

      ANS: Thank you for your suggestions to improve the description of the screening process. We have re-run the simulation to refine and improve the screening process, and the corresponding text and illustration are provided in the Results section of the main text (Lines 237-279, 614-653) and Fig. S6.
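
For readers unfamiliar with the first screening criterion as summarized in the reviewer's comment (an eigenvalue with a positive real part and a non-zero imaginary part at the first spatial mode), the Python sketch below shows how such a test is typically written for a linearized reaction-diffusion system with no-flux boundaries. The Jacobian, diffusion coefficients and function name are hypothetical placeholders; this is not the authors' screening code.

```python
import numpy as np

def is_oscillatory_unstable(jacobian, diffusion, length, mode=1, tol=1e-9):
    """Check whether spatial mode `mode` grows and oscillates.

    jacobian : reaction Jacobian at the homogeneous steady state (assumed given)
    diffusion: diffusion coefficient of each species
    length   : domain (cell) length; no-flux ends give modes cos(n*pi*x/length)
    """
    q = mode * np.pi / length                      # wavenumber of the mode
    a = np.asarray(jacobian, dtype=float) - q**2 * np.diag(diffusion)
    eigenvalues = np.linalg.eigvals(a)
    leading = eigenvalues[np.argmax(eigenvalues.real)]
    return bool(leading.real > tol and abs(leading.imag) > tol)

# Toy two-species example with made-up numbers
toy_jacobian = [[0.1, -1.0], [1.0, 0.1]]
print(is_oscillatory_unstable(toy_jacobian, [0.0, 0.0], length=3.0))
print(is_oscillatory_unstable(toy_jacobian, [1.0, 1.0], length=1.0))
```

In the toy example, strong diffusion in a short domain damps the first mode, so the same reaction kinetics pass the test in one case and fail it in the other.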

      (v) The authors claimed that the steeper gradient of MinD under glucose starvation results in cell division for shorter cells. I do not think the claim is convincing. It is necessary to measure the correlation between the length at the cell division and the gradient. It would also be nicer to show the correlation under other parameters. I think those studies truly support the authors' claim and the novelty of this work.

      ANS: Thank you for the comments. We would like to draw your attention to the right side of the graphs shown in Fig. 3B, E, where measurements were obtained from cells prior to division. Our claim that "the steeper gradient of MinD under glucose starvation results in cell division for shorter cells" is also supported by the range of the wave slope (λ_N): 1.49-2.66 with 0.4% glucose (cell length range: 1.7-4.5 µm) versus 1.34-3.54 under glucose starvation (cell length range: 2.1-3.8 µm). Therefore, under glucose starvation, λ_N increases more significantly with increasing length, allowing us to speculate that a steeper concentration gradient contributes to division in stressed, shorter cells. In the revised manuscript, the statement is kept in the Results section (Lines 217-218) but removed from the abstract. Regarding the correlation between the concentration gradient and cell length at division under different conditions, we consider it more important to characterize all aspects under the same growth condition and avoid manipulation. In this study, the main conclusions are drawn from our experiments characterizing several aspects of MinD oscillations in cells growing with 0.4% glucose. In support of these observations, we decided to maintain only one other condition, 0.1% glucose. Further analysis of cells growing under other conditions will not change the main conclusions but will increase the difficulty of determining how the MinD concentration changes with cell growth.

      (vi) The conclusion at Line 346 "This plasticity arises from spatial differences in molecular interactions between MinD and MinE, as demonstrated..." looks unclear to me. My understanding is that (i) by screening the randomly sampled parameters in the reaction-diffusion model, the authors found the parameters that "match" experimental results, and (ii) the parameters after screening show the correlation between them (k_dD-k_dE and k_D-k_ATP->ADP). The logic heavily relies on the reaction-diffusion model is quantitatively correct. First, I think it is better to explain the logic more explicitly, that is, the claim of the molecular interaction is not based on the experimental facts. Second, I personally think the reaction-diffusion model used in this work does not reproduce quantitatively the experimental results, as discussed in (iii) and also (iv). Please make some discussions on how to justify the comparison between the model and experiments.

      ANS: Thank you for your constructive comments. To address these questions, we have re-run the simulation to refine and improve the results, and the corresponding text and illustration are provided in the Results section of the main text (Lines 237-279, 614-653) and Fig. S6. The kinetic parameters used in this study are described in the main text, lines 258-264: 'To randomly search for combinations of the parameter sets k_dD, k_dE, k_D, and k_(ADP→ATP), the following parameters were fixed in the simulation: the diffusion coefficients D_d and D_de were assumed values based on bacterial membrane proteins (Schavemaker et al., 2018), the diffusion coefficients D_D and D_E were from Meacci et al. (2006) (Meacci et al., 2006), and the dissociation rate constant k_de were from a previous simulation (Wu et al., 2015). This operation allowed us to probe for the general behaviours of the system.' Lines 277-279: 'This screening process reduced the parameter sets to 23, including set #2827, which, judging by the correlation plots for length vs. period, λ_N, and I_Ratio (Figs. S7-S9), showed features similar to those of the experimental data (Figs. 1E, 3B, C).' Based on the parameters of set #2827, we rigorously tested the impact of different kinetic constants that represent different molecular interactions on the oscillation period, λ_N and I_Ratio (Fig 4D-H). The results are described in the section of 'Effect of the kinetic rate constant on the MinD concentration gradient' of the main text, lines 323-349. This effort has provided us with a solid theoretical view of how oscillation features may be controlled by different molecular interactions. In addition, a comparison between our modelling and experimental results is described in the main text, section 'In silico oscillation resembles oscillation in a cellular context', lines 300-321.

      (vii) I did not capture the point why the authors can claim "... further distinguishing in vivo and in vitro observations. " at Line 350. I did not find the results comparing with vitro studies. I would appreciate a demonstration of vitro results and/or references.

      ANS: To avoid confusion, this sentence has been removed.

      Minor comments: (1) Line 214: It should be "Fange and Elf".

      ANS: Line 238 in the revised manuscript: This has been corrected.

      (2) I think it is better to show sampled points in Fig. 4C and 4D to show how dense the authors sampled in the parameter space.

      ANS: Since we have rewritten this part, the suggested revision is no longer applicable.

      REFERENCES: [1] Fischer-Friedrich, Elisabeth / Meacci, Giovanni / Lutkenhaus, Joe / Chaté, Hugues / Kruse, Karsten, "Intra- and intercellular fluctuations in Min-protein dynamics decrease with cell length", Proceedings of the National Academy of Sciences, 107, 6134-6139 (2010). [2] Meacci, Giovanni, "Physical Aspects of Min Oscillations in Escherichia Coli", PhD thesis (2006) available at

      Reviewer #1 (Significance (Required)):

      General assessment: I think the strength of this study is that it potentially shows the quantitative correlation between the MinD concentration gradient during the oscillation and the cell length when it divides. However, the current data of glucose starvation is not convincing enough. The model parts are interesting but their connection to the experiments is not clear in the current manuscript.

      ANS: Thank you for your comment. The key finding of our study, involving experimental measurements and mathematical modelling, is plasticity in the MinD concentration gradient, which results from spatial differences in molecular interactions and is an intrinsic property of the Min system during cell growth. We hypothesized that if the plasticity of the MinD concentration gradient is an intrinsic property of the system, then this property would be robust and show consistent behaviour under different growth conditions. Therefore, we tested this hypothesis by studying MinD oscillations under a low-glucose condition, and the results strengthened the main conclusion derived from experiments under the regular growth condition containing 0.4 % glucose. We believe that further analysis of cells growing under other conditions will not change the main conclusions but may increase the difficulty of determining how the MinD concentration changes with cell growth. Therefore, we decide to make this section concise, containing only one additional condition, even though we have more data than presented here. As mentioned earlier in this response letter, we have re-run the simulation to refine and improve the results, and the corresponding text and illustration are provided in the Results section of the main text (Lines 237-279, 614-653) and Fig. S6. This operation allowed us to probe for the general behaviours of the system. As a result, we were able to obtain a few parameter sets, including #2728, that generate features of the oscillation period, λ_N and I_Ratio, that strongly mimic MinD oscillation in the cellular context (Figs. 4C, S7-9). We further tested the impact of different kinetic constants, k_de, k_dD, k_dE, k_D, and k_(ADP→ATP), which represent different molecular interactions influencing the oscillation period, λ_N and I_Ratio (Figs. 4D-H). 
This effort has provided us with a solid theoretical view of how oscillation features may be controlled by different molecular interactions.

      Advance: The advance of this study is to measure the MinD concentration gradient under glucose starvation, and to compare the experimental results with the (simplified) model under a wide range of parameters. I do not think the advance in the current manuscript looks conceptual level because the conceptual conclusions are not really convincing from the results. In this respect, the advance of this work may be technical.

      ANS: Thank you for this constructive comment. We respond as follows. By combining experimental and theoretical efforts in the revised manuscript, this work provides a conceptual advance in the quantitative understanding of MinD oscillations in the cellular environment and has implications for further studies of bacterial cell division regulation. Specifically, we would like to emphasize that this work revealed the inherent plasticity and adaptability of the MinD concentration gradient that contributes to division site selection. The mathematical modelling provided us with a solid theoretical view of how oscillation features may be controlled by different molecular interactions.

      Audience: As a theoretician working on biophysics, including the model of the Min system, I think a specialised audience would be interested in this study. People who are studying the mechanism of the Min oscillation and resulting cell division, particularly those who are interested in both experiments and models, would be interested in this work. For the broad audience, I do not think the novelty of this study is well described.

      ANS: Thank you for your comment. We would like to point out that studying the variable concentration gradient underlying the dynamic oscillations of the Min system may be of broad interest to cell biologists, since concentration gradients play a fundamental role in various cellular processes and the concept is crucial in cell biology. Examples include passive and active transport, osmosis, cell signalling, and maintenance of cellular homeostasis. These processes allow cells to respond to their environment, regulate their internal conditions, and perform important functions required for survival and normal function. In addition, the variable concentration gradient, characterized by the numerical descriptor λ_N and reproduced in a simple mathematical model, demonstrates nonlinear dynamic behaviour in physical biology. Therefore, the audience of this work may include the broader general audience of cell biology and physical biology rather than just the immediate specialized audience interested in the Min system. We will also reiterate the importance of specialized research, which often provides the basis for broader application and understanding.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: This work by Parada et al showed that in the oscillatory Min System, MinD gradient was steeper in longer e.coli cells, while period was stable. This behavior was recapitulated in a mathematical model and it also revealed coordinated reaction rates in a wide range of parameter space.

      ANS: We thank the reviewer for the concise summary of our work.

      Major comments: 1. There were some inconsistencies between experimental and modeling data. Wave slope (𝜆𝑁) plateaued at ~3um in the model but not shown in the experiment (Fig.3B). The period was much less in the model (Fig. S8) than in the experiment (Fig. 1B).

      ANS: Thank you for pointing out this problem. We have re-run the simulation to refine and improve the results, and the corresponding text and illustration are provided in the Results section of the main text (Lines 237-279, 614-653) and Fig. S6. This operation allowed us to probe for the general behaviours of the system. As a result, we were able to obtain a few parameter sets, including #2728, that generate features of the oscillation period, λ_N and I_Ratio, that closely mimic MinD oscillation in the cellular context (Figs. 4C, S7-9). Regarding the oscillation period, the simulated period was shorter than the experimental measurements. Even so, based on the parameters of set #2827, we rigorously tested the impact of different kinetic constants that represent different molecular interactions on the oscillation period, λ_N and I_Ratio (Main text, lines 323-349; Figs. 4D-H). This effort has provided us with a theoretical view of how oscillation features may be controlled by different molecular interactions. We found that the rate constants k_de, representing detachment of the MinDE complex from the membrane, and k_(ADP→ATP), representing recharging of MinD-ADP with ATP, affected the oscillation period more significantly than the other rate constants. The results suggested that the oscillation cycle time is tunable. In response to the question of why the wave slope (λ_N) plateaued at ~3 µm in the modelling (Fig. 3B) but not in the experiment (Fig. 1D), we think this is because the experiments examined a heterogeneous population of cells, whereas the simulation followed a single growing bacterial cell. We arrived at our conclusions and hypotheses through wet experiments, which were further strengthened using mathematical modelling, providing insights into the kinetic properties of the Min system.

      2. Generally, I found that the data of starved condition added little to the major message. Unless the model can recapitulate the even steeper gradient in such condition by tuning starvation-related parameters, it may be removed.

      ANS: We thank the reviewer for this suggestion. The key finding of our study, involving experimental measurements and mathematical modelling, is plasticity in the MinD concentration gradient, which results from spatial differences in molecular interactions and is an intrinsic property of the Min system during cell growth. We hypothesized that if the plasticity of the MinD concentration gradient is an intrinsic property of the system, then this property would be robust and show consistent behaviour under different growth conditions. Therefore, we tested this hypothesis by studying MinD oscillations under a low-glucose condition, and the results strengthened the main conclusion derived from experiments under the regular growth condition containing 0.4% glucose. We agree that further analysis of cells growing under other conditions will not change the main conclusions but may increase the difficulty of determining how the MinD concentration changes with cell growth. Therefore, we decided to keep this section concise, containing only one additional condition, even though we have more data than presented here.

      3. The authors need to compare what was different/novel between the model in this study and previous models such as Wu, et al 2015 and highlight the uniqueness of this work.

      ANS: Thank you for this suggestion. We now provide a comprehensive comparison between them in the Supplemental Information (Lines 26-147). We would like to emphasize that although the goal of the previous works was to measure the spatiotemporal distribution of oscillating MinD concentration gradients as a function of cell length, these works conceived the problem differently and therefore used different experimental designs and execution methods, which differentiates our key conclusions from theirs. This is also true for mathematical modelling. Although similar observations can be found in some respects, they are not directly comparable due to the different mathematics and assumptions used in the simulations. Therefore, we would like to draw attention to the experimental rigor and to the specific points and views that contribute to our understanding of Min systems.

      4. The model explored parameter space of reaction rates and found 60 sets. The KdE, KD, KdD, KADP-ATP ranged 6 orders of magnitude. It is interesting data in itself, but cells were not likely to vary that much for reaction rates. The relevance should be discussed.

      ANS: Thank you for pointing out this problem. For this revision, we re-ran the simulation to refine and improve the results, allowing us to identify parameter sets that generate features resembling the experimental measurements. Using set #2728 as an example, the variations in the five rate constants k_de, k_dD, k_dE, k_D, and k_(ADP→ATP) fall within a small range (Tables 2 and S4), eliminating the concern that arose from the previous version of the manuscript. We found that this parameter set allows for maximum utilization of MinD and MinE molecules, which are fixed in number according to experimental measurements, to drive membrane-associated oscillations in the simulation.

      Minor comments: 1. Fig.1B colors were conflicting. The legend was different than diagram. Fig.1C no scale for x axis.

      ANS: We have resolved the colour conflict in Fig. 1B, and a time range has been added to Fig. 1C.

      1. Fig.S6A How the 638 oscillatory parameter sets were matched with experimental data and screened to 174 sets was not clear. Data of fitting error

      ANS: Thank you for your suggestions to improve the description of the screening process. In this revision, we have re-run the simulation to refine and improve the results, and the corresponding text and illustration are provided in the Results section of the main text (Lines 237-279, 614-653) and Fig. S6. This operation allowed us to probe for the general behaviours of the system. The mentioned filter no longer applies.

      2. Significant digits were not used properly. For example, the period (table 1) was showed as 46.00 sec, but the imaging interval was 12 sec, the 2 decimal digits were thus meaningless. The same argument goes for length measurement at 2.84 um, while the optical resolution of the microscope used should be no good than 200nm.

      ANS: We have corrected the use of significant digits throughout the manuscript.

      1. For scatter plot like Fig.1D-G, generally smaller dots would show trend more obvious.

      ANS: We have modified the plots and used smaller dots in Figs. 1D-G, 3B, C, E, F, S3D, and S5B, C.

      1. The molecular mechanism of why MinD gradient increases with length was not the scope of the current study, but better to be discussed.

      ANS: We would like to address this comment from a different angle. The key finding of our study, involving experimental measurements and mathematical modelling, is plasticity in the MinD concentration gradient, which results from spatial differences in molecular interactions and is an intrinsic property of the Min system during cell growth. In the revised manuscript, we have re-run the simulation to refine and improve the modelling procedures and results, and the corresponding text and illustration are provided in the Results section of the main text (Lines 265-279, 614-653) and Fig. S6. In brief, we fixed the diffusion coefficients D_D and D_E from Meacci et al. (2006); the dissociation rate constant k_de from a previous simulation (Wu et al., 2015); and the experimentally measured MinD and MinE concentrations in this study. Meanwhile, the diffusion coefficients D_d and D_de were assumed values based on bacterial membrane protein diffusion (Schavemaker et al., 2018). This operation allowed us to probe for the general behaviours of the system. As a result, we were able to obtain a few parameter sets, including #2728, that generate features of the oscillation period, λ_N and I_Ratio, that closely mimic MinD oscillation in the cellular context (Figs. 4C, S7-9). We further tested the impact of different kinetic constants, k_de, k_dD, k_dE, k_D, and k_(ADP→ATP), which represent different molecular interactions influencing the oscillation period, λ_N and I_Ratio (Figs. 4D-H). Our findings have provided us with a solid theoretical view of how oscillation features may be controlled by different molecular interactions. Furthermore, the modelling results help us understand the possible mechanisms associated with oscillation cycle maintenance and length-dependent variable concentration gradients.

      1. Fig. S8, why sudden jump in period in many of the sets of both groups?

      ANS: This supplemental figure is now Fig. S7. A slower oscillation at the initiation of oscillation appears to be a common property in our simulation.

      Reviewer #2 (Significance (Required)):

      Min system was well-studied oscillation mechanism to restrict FtsZ at cell center. Previous work has shown how the system work molecularly, simulated the behavior and reconstituted many different patterns in vitro. The major new information from this work was: 1. the rigorously measured endogenous level of MinD and MinE; 2. gradient increased with length; 3. a model recapitulated this relationship and explored parameter space of reaction rates. The paper was well presented, experiments and analysis were rigorous, and the conclusions were not overstated. It should interest specialized cell biologists studying cell size, oscillation pattern.

      ANS: Many thanks to Reviewer 2 for recognizing the contributions of our work to the understanding of the Min system and its role in cell division. We also thank you for identifying specialized cell biologists studying cell size and oscillation patterns as readers of our paper. We would like to emphasize that cellular concentration gradients play a fundamental role in various cellular processes and that the concept of concentration gradients is crucial in cell biology. These concentration gradient-mediated processes allow cells to respond to their environment, regulate their internal conditions and perform important functions required for survival. In addition, the variable concentration gradient, characterized by the numerical descriptor λ_N and reproduced in a simple mathematical model, demonstrates nonlinear dynamic behaviour in physical biology. Therefore, the audience of this work may include a broader audience in the field of cell biology and physical biology rather than just an immediate specialist audience. We will also reiterate the importance of specialized research, which often provides the basis for broader application and understanding.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript shows that the concentration of MinD does not change during the division cycle of E. coli. Due to the oscillation pattern the concentration of MinD decreases at the mid-cell which makes it favorable for the division. The mid-cell decrease in concentration of MinD is majorly length dependent. The oscillation pattern is not due to the change in concentration of MinD, but due to the plasticity arises from the spatial differences in molecular interactions between MinD and MinE. The manuscript is well written, the experiments are performed carefully and the results will be of interest to readers from variety of field. However, there are several concerns need explanation.

      ANS: We greatly appreciate the positive feedback from the reviewer, and we address the specific concerns below.

      Major concerns: One of my major concern is these interactions are not shown experimentally but explained using either the previously published literature or mathematical models. Further, the previous literatures are shown on in vitro models which does not mimic the in vivo system fully.

      ANS: We thank the reviewer for the important point that reaction rates in previous studies and in our model of Min oscillations have not been experimentally tested. We are aware of the lack of experimental measurements, but these reaction rates cannot be measured in batch reactions using classical biochemical methods. To accurately measure these reaction rates, the experiments require advanced techniques and methods to handle spatial and temporal resolution, which is beyond the scope of our current study. However, in the revised manuscript, we have re-run the simulation to refine and improve the results, and the corresponding text and illustration are provided in the Results section of the main text (Lines 237-279, 614-653) and Fig. S6. In our simulation, we fixed the diffusion coefficients D_D and D_E from Meacci et al. (2006); the dissociation rate constant k_de from a previous simulation (Wu et al., 2015); and the experimentally measured MinD and MinE concentrations in this study. Meanwhile, the diffusion coefficients D_d and D_de were assumed values based on bacterial membrane protein diffusion (Schavemaker et al., 2018). This operation allowed us to probe for the general behaviours of the system. As a result, we were able to obtain a few parameter sets, including #2728, that generate features of the oscillation period, λ_N and I_Ratio, that closely mimic MinD oscillation in the cellular context (Figs. 4C, S7-9). Interestingly, we found that this parameter set allows for maximum utilization of MinD and MinE molecules, which are fixed numbers from experimental measurements, to drive membrane-associated oscillations in the simulation. We further tested the impact of different kinetic constants, k_de, k_dD, k_dE, k_D, and k_(ADP→ATP), which represent different molecular interactions influencing the oscillation period, λ_N and I_Ratio (Figs. 4D-H). Our findings have provided us with a solid theoretical view of how oscillation features may be controlled by different molecular interactions, and help us understand the possible mechanisms associated with oscillation cycle maintenance and length-dependent variable concentration gradients.

      The concentration of MinD does not change with the increasing length of the cell. Is the MinD concentration (or copy number) different in cells growing in low glucose compared with cells growing in high glucose?

      ANS: Thank you for the comments. As shown in Figs. 2B, C, the concentration of MinD changed with cell length, but the number of MinD molecules per unit area did not change significantly with cell length. Although how the number of MinD molecules changes when cells are grown under low-glucose conditions is unclear, this number does not appear to be essential for the following reasons. We focused on studying Min oscillations during the normal growth cycle, minimizing experimental manipulations to analyse oscillation dynamics. Measurements of oscillations in cells grown under low-glucose conditions support the primary measurements. We think that further analysis of MinD concentration changes in growing cells under low-glucose conditions will not change the main conclusion of this manuscript: 'plasticity in the MinD concentration gradient is an intrinsic property of the Min system during cell growth'.

      As per the current study a particular I-ratio at the mid-cell is required to initiate the cell division. In the case of cells growing at low glucose, how this required I-ratio is achieved at the mid-cell?

      ANS: Thank you for the excellent question. As described in the main text, lines 199-201, I_Ratio is defined as the ratio of the minimum intensity to the maximum intensity measured from the experimental data, which gradually decreases as the cell length increases (Fig. 3C). Since the minimum and maximum intensities were measured from the concentration gradient, which is characterized by the slope of the concentration gradient (λ_N), there exists a correlation between I_Ratio and λ_N. That is, a larger λ_N will result in a smaller I_Ratio, and vice versa. When comparing measurements made from cells grown with 0.4% and 0.1% glucose (Fig. 3B, C, E, F), the changes in λ_N are more drastic within a shorter length under low-glucose condition, which is accompanied by more drastic changes in I_Ratio. Furthermore, when the I_Ratio value was approximately 0.5, the corresponding cell length was significantly shorter under low-glucose condition. Therefore, we speculate that there may be an effective I_Ratio that is low enough for stable FtsZ ring formation. This effective I_Ratio can occur at any cell length, allowing us to see that bacteria divide at shorter cell lengths under low-glucose conditions. This property necessitates a faster reduction in the concentration gradient to reach the effective I_Ratio for cells dividing at shorter lengths. As a result, by adjusting λ_N as a function of length, the steepness of the I_Ratio reduction can be altered. Please see the main text, lines 389-406.
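
      The inverse relationship described above, in which a steeper gradient gives a smaller I_Ratio, can be illustrated with a minimal Python sketch. It computes I_Ratio as defined in the answer (minimum intensity divided by maximum intensity) for a toy intensity profile. The cosh-shaped profile, the `steepness` parameter, and the function names are illustrative assumptions; they are not the fitting function or the image-processing procedure used to obtain λ_N in the manuscript.

```python
import math


def i_ratio(profile):
    """Ratio of the minimum to the maximum intensity of a profile."""
    lo, hi = min(profile), max(profile)
    if hi <= 0:
        raise ValueError("profile must contain a positive maximum")
    return lo / hi


def toy_profile(length_um, steepness, n_points=101):
    """Hypothetical time-averaged intensity along the long axis.

    Assumed shape: a cosh curve centred at midcell, so the minimum
    sits at midcell and the maxima sit at the two poles.
    """
    xs = [length_um * (i / (n_points - 1) - 0.5) for i in range(n_points)]
    return [math.cosh(steepness * x) for x in xs]


# Same cell length, two assumed gradient steepness values.
shallow = i_ratio(toy_profile(length_um=3.0, steepness=0.5))
steep = i_ratio(toy_profile(length_um=3.0, steepness=1.0))
```

      In this toy setting, `steep` is smaller than `shallow`, which mirrors the qualitative statement that a larger λ_N is accompanied by a smaller I_Ratio.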

      There is a decrease in the MinD oscillation time under the low-glucose condition. As explained by the authors, the MinD oscillation is mainly guided by the FtsE-induced removal of MinD from the membrane; how can the authors explain this decrease?

      ANS: Thank you for raising the question of how the MinE-induced detachment of membrane-bound MinD contributes to the oscillation time of MinD under low-glucose conditions. Although this is an interesting question, determining what regulates MinE-induced detachment of membrane-bound MinD under low-glucose conditions is beyond the scope of the current study. This unknown regulatory mechanism that regulates MinD-MinE interactions in growing cells under low glucose conditions is worthy of further investigation. However, our modelling results have provided a theoretical view of how oscillation features may be controlled by different molecular interactions between MinD and MinE and may guide future experiments investigating the underlying mechanism involved. Please refer to the Results section: 'Spatiotemporal distribution of the concentration gradient' in the main text, lines 351-373.

      Further, it is explained that the concentration of cellular ATP is in much higher concentration compared to the required amount for this oscillation. As the Iratio is majorly dependent on the cell length, what could be the reason for the differential N in the case of low and high glucose condition?

      ANS: Please refer to the previous answer to the question: 'As per the current study a particular I-ratio at the mid-cell is required to initiate the cell division. In the case of cells growing at low glucose, how this required I-ratio is achieved at the mid-cell?'. (this letter, Lines 764-779) In addition, our modelling in search of parameter sets that generate characteristics of MinD oscillation resembling oscillation in vivo allowed us to evaluate the impact of different molecular interactions, as represented by different rate constants (Fig. 4), which has provided important information for future mechanistic investigations, although not in the present study. Please see the Results section: 'Effect of the kinetic rate constant on the MinD concentration gradient' in the main text, lines 323-349.

      MinD is a highly insoluble protein. It also has an amphipathic helix and thus most of the time it binds to the membrane. The method used by the authors to determine the cellular MinD concentration (mentioned in Fig S1) will only give the concentration of the soluble MinD and not of the total MinD. How do the authors justify this as the total concentration? The same applies to the MinE copy number calculation. Authors may need to perform the transcriptome analysis and compare both the data.

      ANS: We thank the reviewer for the comments. Since the attachment of MinD and MinE to the membrane is transient and MinD-membrane interactions require ATP, we expected that most of the protein would be released from the membrane into the cytoplasm after cell disruption, sufficiently representing the total MinD concentration. Furthermore, our measurements of molecule numbers are within the range of previous measurements (Di Ventura & Sourjik, 2011; Juarez & Margolin, 2010; Meacci & Kruse, 2005; Tostevin & Howard, 2006; Touhami et al, 2006). Thus, we believe that our current measurements are reliable and sufficient for subsequent interpretation.

      One of the main question asked by the authors in the abstract is. "How the intracellular Min protein concentration gradients are coordinated with cell growth to achieve spatiotemporal accuracy of cell division is unknown". Although the authors have shown that there is a change in concentration gradient during cell growth, the mechanism for the same is not very well explained. Authors have not provided any specific explanation for the increase in the velocity of the MinD oscillation and the gradient formation. How the velocity of MinD is increasing although there is no increase in the MinD concentration.

      ANS: We have changed 'the mechanism' to 'the exact way' in the abstract (Abstract, line 28). Moreover, in the revised manuscript, we have improved the mathematical model and performed a thorough investigation of the variations in the kinetic constants. This effort has provided us with a solid theoretical view of how oscillation features may be controlled by different molecular interactions. The results may guide future experiments investigating the underlying mechanism involved. Please refer to the answers to the previous questions above.

      Figure 2B: shows the overall concentration of MinD in a single cell varies between 1180 - 1160 molecules/um2. In Fig 2C it is mentioned that mid-cell has a MinD concentration of 120-20 molecules/um2. Further, Fig 3C and 3F show that I-ratio values vary between 0.6-0.4. Considering the values given, the I-ratio (I min/ I max) should be between 0.1- 0.01. Authors need to explain the same. Figure 2C: The data in both the Y-axes are not matching and need more clarification in the legend. Were the numbers of molecules counted only in the marked 200 nm area? If so, why does Y-axis 1 (molecules/um2) decrease 7 times, whereas Y-axis 2 (molecules) decreases only 2 times?

      ANS: In this work, we measured sfGFP-MinD intensity through fluorescence microscopy. The fluorescence intensity was converted into molecular numbers based on estimates from Western blot analyses (Fig. S1). This number of molecules for MinD and MinE was assumed to be the mean number, which was fit into the midpoint of the doubling time (Fig. 2B, black dashed line; main text, lines 166-167). Fig. 2C was obtained by further processing the same dataset to restrict the region of analysis to the midcell zone. Please refer to the main text, lines 158-178. However, the λ_N and I_Ratio values were calculated from the processed intensity data (Fig. S2; main text, lines 190-209, 533-559). Because of the conversion from intensity to molecule number in Figs. S2B, C and the image processing procedure applied to the calculation of λ_N and I_Ratio, it is not possible to directly compare the fold change and the upper and lower limits between molecule numbers and the λ_N and I_Ratio values.

      Other comments: Line 84: Requires reference for this statement.

      ANS: A recent review article has been added in the main text, line 84: '(Cameron & Margolin, 2024)'.

      Line 96: Can authors provide other evidence or validation for the determination of the copy numbers such as transcriptome analysis.

      ANS: We thank the reviewer for this suggestion. However, we believe that direct measurement of cellular protein abundance is reliable and sufficient for our purposes. Furthermore, transcriptome-measured RNA abundance does not translate directly to protein abundance in living cells because posttranscriptional processing, translation, posttranslational processing, and protein stability issues complicate the interpretation. Therefore, protein abundance measurement from cell extracts is straightforward for our purpose.

      Fig 1C: what is the units of time in Fig 1C? Is it equal for all the cell lengths?

      ANS: As described in the main text, lines 511-512, 'Time-lapse images of sfGFP-MinD were acquired at 12-sec intervals for 10 min or before the fluorescence diminished'. This condition is applied to all the acquired images in this work.

      Page 6, line 136-138: what could be the possible mechanism for change in velocity at different cell cycle time?

      ANS: To avoid confusion, we have modified the text and toned down the emphasis on velocity where it is mentioned. This is because the mentioned velocity is inferred from the measured oscillation period and cell length but not from direct measurements; our emphasis is on understanding how the oscillation period remains fairly stable during cell growth rather than how the velocity changes. In the revised manuscript, we used modelling results to elucidate the possible mechanism related to period maintenance. The corresponding text and illustration are provided in the Results section (Lines 300-373) and the Discussion section of the main text (Lines 407-446) and Figs. 4, 5. In brief, this simulation allowed us to probe for general behaviours of the system, allowing us to obtain a few parameter sets that generate features of the oscillation period, λ_N and I_Ratio closely mimicking MinD oscillation in the cellular context (Fig 4C, S7-9). We further tested the impact of different kinetic constants, k_de, k_dD, k_dE, k_D, and k_(ADP→ATP), which represent different molecular interactions influencing the oscillation period, λ_N and I_Ratio (Fig 4D-H). This effort has provided us with a solid theoretical view of how oscillation features may be controlled by different molecular interactions. Please see the Results section: 'Effect of the kinetic rate constant on the MinD concentration gradient' in the main text, lines 323-349.

      Page 7, line 155: Any evidence for claiming the same?

      ANS: The sentence has been modified as follows: 'Thus, the fairly stable oscillation period and variable velocity did not change the precision of the septum placement.' (Main text, lines 155-156)

      Page 7, line 156: Is there any proof authors can show that burst MinD synthesis occurs during the division? If not in the case of MinD, is it shown in any other protein?

      ANS: The text is now in lines 168-171: 'Interestingly, the value after division was not doubled, which could indicate a balanced outcome between de novo synthesis and degradation or a burst of MinD synthesis at cell division followed by constant synthesis.' In previous studies by Männik et al. (2018) and Vischer et al. (2015), the cellular concentration of the division protein FtsZ increased throughout the cell cycle under slow growth conditions, and FtsZ was degraded rapidly at the end of the cell cycle, a process controlled by the ClpXP protease. Because we do not know the relevance of these observations to our study, which focused on the plasticity of the MinD concentration gradient, we decided not to discuss them in the manuscript.

      Page 9, line 217: Fig 4A is not explained clearly and all the terms mentioned need to be explained. This figure is used to explain the differential concentration of MinD at the poles and the mid-cell, and thus needs to be explained more clearly.

      ANS: Thank you for your comments. Please refer to the above answer to the question: 'One of my major concern is these interactions are not shown experimentally but explained using either the previously published literature or mathematical models. Further, the previous literatures are shown on in vitro models which does not mimic the in vivo system fully.', in this letter, lines 691-715.

      Page 12, line 285: What is the meaning of the default speed of MinD oscillation in new-born cells? Did the authors observe any specific velocity in the new-born cells? What is the explanation for the length-dependent oscillation velocity of MinD?

      ANS: Thank you for the questions. As mentioned earlier, the emphasis of this study is on understanding how the oscillation period remains relatively stable while showing plasticity of the concentration gradient during cell growth. The velocity is inferred from the oscillation period and cell length but is not a direct measurement. To avoid confusion, we have modified the text and placed less emphasis on the velocity when mentioned.
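
      To show why an inferred velocity varies with cell length even when the period is stable, here is a minimal Python sketch. The manuscript's exact formula is not given in this letter, so the relation below (one period covers a pole-to-pole round trip, i.e. twice the cell length) is an assumption, and the period and lengths are hypothetical values chosen only for illustration.

```python
def inferred_velocity(cell_length_um, period_s):
    """Velocity inferred from length and period (not a direct measurement).

    Assumption: one oscillation period corresponds to a pole-to-pole
    round trip, i.e. a travelled distance of 2 x cell length.
    """
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 2.0 * cell_length_um / period_s


# Hypothetical values: a fixed period and three cell lengths.
period = 40.0                      # seconds (illustrative only)
lengths = [2.0, 3.0, 4.0]          # micrometres (illustrative only)
velocities = [inferred_velocity(length, period) for length in lengths]
```

      Under this assumption, a constant period forces the inferred velocity to scale linearly with cell length, so the two quantities are not independent observations.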

      Reviewer #3 (Significance (Required)):

      General assessment: Major work of the manuscript is relying on the mathematical models, whereas the audience are majorly from the biology fields and thus simplified explanations are required in many places. Many of the legends in the figures require more explanation for better understanding. If possible more experimental data can be added, specifically to explain the model mentioned in figure 4A.

      ANS: We have modified the figure legends to include more explanations. As mentioned above, we have also revised Fig. 4 to include improvements in modelling results to better fit the experimental data and to examine the impacts of the kinetics constants of the reaction steps in the Min system. Please refer to lines 691-715 in this letter.

      Advance: The study is adding to the existing knowledge and will be helpful to fill the conceptual gaps in understanding the mid-cell MinD concentration and what may favor the initiation of bacterial division. Audience: Majorly the microbiology community will be interested in the study. This will also be interest to Physicists and mathematical persons working to understand bacterial division.

      ANS: We thank the reviewer for this positive comment.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      The study by Parada et al. illuminates the intricate interplay between Min proteins, exemplified by MinD, and cell growth in E. coli. Their findings demonstrate that the MinD concentration gradient steepens progressively as cells elongate, potentially influencing FtsZ ring formation via MinC. Moreover, their comprehensive reaction-diffusion model not only corroborates experimental observations of length-dependent concentration gradients but also underscores the critical role of kinetic interactions involving Min proteins, the membrane, and ATP. This elucidation significantly advances our understanding of the oscillatory mechanisms within the Min system. Both the experimental and simulation data are robust, and the manuscript is exceptionally well-written. I express my full support for publication pending the satisfactory resolution of the outlined concerns.

      ANS: We appreciate the reviewer's positive feedback and have addressed most issues to the best of our ability.

      1. Remove the dot in front of "Min" in line 57.

      ANS: This has now been removed.

      1. In lines 82-84, the statement "...The distribution of the division inhibitor MinC may be synchronized with spatiotemporal differences in MinD concentrations, leading to a stable placement of the FtsZ ring at the midcell..." suggests a potential synchronization between MinC and MinD oscillations. It is crucial to investigate if sfGFP-MinC exhibits similar concentration gradient oscillatory behavior in vivo as observed with MinD.

      ANS: Thank you for bringing up this question. The key finding of our study, involving experimental measurements and mathematical modelling, is plasticity in the MinD concentration gradient, which results from spatial differences in molecular interactions and is an intrinsic property of the Min system during cell growth. With many investigations already covered in this manuscript, we prefer to investigate sfGFP-MinC in future studies, which will have different focuses on how MinC dynamics are coupled with the variable MinD concentration gradient to directly impact FtsZ ring formation.

      1. Ensure consistent significant digits throughout the text. For instance, 1.95 ± 0.16 μM in line 97, 1.4 ± 0.13 μM in line 98, and 1.9 ± 0.2 μM in line 100 have varying precision. Consider using integers for molecules.

      ANS: We have corrected the significant digits in the main text and supplemental information.

      1. Address the discrepancy in expression levels of MinD and MinE between strain FW1541 and its parental strain W3110. Given the labeling effect, it is possible that MinD expression levels differ. However, MinC's expression level should be approximately the same. Conduct whole-genome sequencing of both strains to identify any additional mutations.

      ANS: Thank you for the comments. As described in the main text (Lines 67-70), the most important aspect is the concentration ratio between MinD and MinE. Although the numbers are not the same, they are comparable to those in previous studies (Hale et al, 2001; Li et al, 2014; Schmidt et al, 2016; Shih et al, 2002) (Main text, lines 113-115). Furthermore, we performed whole-genome sequencing of the W3110 and FW1541 strains. We confirmed that sfGFP was correctly inserted. The sequence alignment of the minCDE locus is provided for your reference but not for publication. Although there are some sporadic point mutations, there is no obvious reason to believe that the mutations would impact Min protein expression. We will organize the deposition data as soon as we can.

      1. Clarify the apparent discrepancy between lines 112 and 127. Line 112 suggests that the periodic regularity of interpolar oscillations increases with cell length, as demonstrated in Fig 1B-C, 1E, Fig S5. However, in the subsequent section (starting from line 127), the authors state that oscillation periods remain relatively stable across cells of different lengths. Provide clarification on this apparent discrepancy.

      ANS: Thank you for pointing out this confusion caused by misuse of the term. In Lines 122-123, the statement has been modified as follows: '...the uniformity of the oscillation intervals appears to increase with length...' In line 139, 'The oscillation period' refers to the time required for the oscillation cycle. Since the correction in line 123 should suffice to clarify, we did not modify the statement in line 139.

      1. Specify if the analysis was limited to non-constricted cells. If so, state this explicitly in the text, as it could impact the interpretation of results, especially in relation to the linear dependence of cell length on time before constriction, as shown in Fig S3C.

      ANS: We did not specifically remove those constricted cells, but cells before splitting were considered one cell. We have added a statement to clarify in Lines 144-145.

      1. Improve clarity in Fig 2A by using distinct colors (e.g., green and red) for differentiation on the Y-axis.

      ANS: The Y axes of Fig. 2A have been modified.

      1. Correct "of" to "from" in line 223 for improved clarity and accuracy.

      ANS: Corrected.

      1. Include the missing "A" in Fig S6A for completeness and accuracy.

      ANS: This figure has been updated.

      1. Ensure consistency in referencing style (full names versus short names) throughout the manuscript.

      ANS: This has now been done.

      Reviewer #4 (Significance (Required)):

      While numerous commendable in vitro studies have explored the oscillatory behavior of the Min system, this work uniquely delves into the oscillation of MinD within live cells. It unveils the remarkable coordination between intracellular Min protein concentration gradients and cell growth, shedding light on the precise spatiotemporal regulation of cell division.

      ANS: We thank the reviewer for this positive comment.

      References

      Di Ventura B, Sourjik V (2011) Self-organized partitioning of dynamically localized proteins in bacterial cell division. Molecular systems biology 7: 457

      Fischer-Friedrich E, Meacci G, Lutkenhaus J, Chate H, Kruse K (2010) Intra- and intercellular fluctuations in Min-protein dynamics decrease with cell length. Proceedings of the National Academy of Sciences of the United States of America 107: 6134-6139

      Hale CA, Meinhardt H, de Boer PA (2001) Dynamic localization cycle of the cell division regulator MinE in Escherichia coli. The EMBO journal 20: 1563-1572

      Juarez JR, Margolin W (2010) Changes in the Min oscillation pattern before and after cell birth. Journal of bacteriology 192: 4134-4142

      Li GW, Burkhardt D, Gross C, Weissman JS (2014) Quantifying absolute protein synthesis rates reveals principles underlying allocation of cellular resources. Cell 157: 624-635

      Mannik J, Walker BE, Mannik J (2018) Cell cycle-dependent regulation of FtsZ in Escherichia coli in slow growth conditions. Molecular microbiology 110: 1030-1044

      Meacci G, Kruse K (2005) Min-oscillations in Escherichia coli induced by interactions of membrane-bound proteins. Phys Biol 2: 89-97

      Meacci G, Ries J, Fischer-Friedrich E, Kahya N, Schwille P, Kruse K (2006) Mobility of Min-proteins in Escherichia coli measured by fluorescence correlation spectroscopy. Phys Biol 3: 255-263

      Schavemaker PE, Boersma AJ, Poolman B (2018) How Important Is Protein Diffusion in Prokaryotes? Front Mol Biosci 5: 93

      Schmidt A, Kochanowski K, Vedelaar S, Ahrne E, Volkmer B, Callipo L, Knoops K, Bauer M, Aebersold R, Heinemann M (2016) The quantitative and condition-dependent Escherichia coli proteome. Nature biotechnology 34: 104-110

      Shih YL, Fu X, King GF, Le T, Rothfield L (2002) Division site placement in E. coli: mutations that prevent formation of the MinE ring lead to loss of the normal midcell arrest of growth of polar MinD membrane domains. The EMBO journal 21: 3347-3357

      Tostevin F, Howard M (2006) A stochastic model of Min oscillations in Escherichia coli and Min protein segregation during cell division. Phys Biol 3: 1-12

      Touhami A, Jericho M, Rutenberg AD (2006) Temperature dependence of MinD oscillation in Escherichia coli: running hot and fast. Journal of bacteriology 188: 7661-7667

      Vischer NO, Verheul J, Postma M, van den Berg van Saparoea B, Galli E, Natale P, Gerdes K, Luirink J, Vollmer W, Vicente M, den Blaauwen T (2015) Cell age dependent concentration of Escherichia coli divisome proteins analyzed with ImageJ and ObjectJ. Front Microbiol 6: 586

      Wu F, van Schie BG, Keymer JE, Dekker C (2015) Symmetry and scale orient Min protein patterns in shaped bacterial sculptures. Nature nanotechnology 10: 719-726

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Parada et al. studied both experimentally and theoretically the MinD concentration distribution of Min waves during cell growth. The main findings were that (i) the gradient of MinD is steeper for longer cells and accordingly the MinD concentration at the middle of the cell is lower, (ii) the period of the oscillation is independent of the cell length, and (iii) those features are shared even under glucose starvation, except that the MinD gradient is steeper. (iv) Those results are supplemented by the analyses of the reaction-diffusion equations in which parameters that can reproduce the MinD concentration distribution are identified.

      I think the results are interesting; basically, as the cell grows, the contrast of the wave becomes clearer, such that the MinD concentration at the cell centre decreases. The results may clarify the mechanism of FtsZ accumulation at the cell centre more quantitatively. The experiments were performed by measuring the fluorescent intensity of MinD during cell growth and analysing the intensity distribution along the long axis of the cell. The theoretical results were based on the analyses of the reaction-diffusion model. Both approaches are already well established and the results are sound. Nevertheless, I do not think the novelty of this work is well highlighted in the current manuscript; I think most of the results, except (iii) and (iv), have already been shown explicitly or implicitly in the previous studies. Min oscillations in a growing cell have been analysed both theoretically and experimentally in (Meacci 2005) and [1]. The concentration distribution and period of the oscillation were measured. The complete results were presented in [2], and I am not aware of those results in scientific journals (the thesis is available online). Nevertheless, I think it is fair to cite those studies and compare the current results with them. In fact, in [2], it was shown that the concentration of MinD near the cell centre decreases as the cell grows, the total MinD concentration is approximately constant during the growth (therefore, the number of the molecules increases), and that the variance of the period becomes smaller as the cell grows. I do not think those previous studies spoil this work, and this work deserves publication somewhere. Still, the authors should highlight the novelty of this study more clearly.

      Major comments:

      (i) In (Meacci 2005) and [1,2], it was claimed that the standard deviation of the period is comparable with the mean period, particularly for the shorter cell. Therefore, they did not claim the period is independent of the cell length. As far as I understood, the variance arises from the variance of the total protein concentration in the ensemble of cells. I am wondering how the authors are able to conclude that the period is constant across different cell lengths. I also point out that in the theoretical part of (Meacci 2005), the period is, in fact, increasing as the cell grows and suddenly decreases at the length at which cell division occurs.

      (ii) I do not think the explanations of the reaction-diffusion model were well described. The authors mentioned that they studied a one-dimensional model and used the delta function to describe the membrane reaction. Did the authors study 1D cytosol and 0D membrane? Then, why the surface diffusion term exists in (4) and (5)? I believe the authors simply assumed that both the membrane and the cytosol are 1D (with larger diffusion constants for cytosolic Min concentrations). Then, the delta functions in (1)-(5) are not necessary. In (Wu 2015), the delta function was used in order to treat a 2D membrane embedded in 3D space.

      Besides that, there is no description of the initial conditions for the concentration fields to solve the reaction-diffusion equations. I think the description of the no-flux boundary condition is better put in the Methods rather than supplementary materials.

      (iii) As in the previous comment, the current model did not take into account the geometry of the system; namely, cytosol is in 3D and membrane is on 2D. Recent theoretical studies can handle the effect, and also the effect of confinement. I would appreciate it if the authors would make a comment on whether those issues are relevant or not for the conclusion of this work.

      (iv) I would appreciate it if the authors would describe the screening process more clearly. I did understand that the first screening requires a finite imaginary part and a positive real part in the eigenvalues at the first mode of spatial inhomogeneity. However, I did not understand the other processes clearly. The second screening is based on \lambda_N and I_Ratio, but its criteria are not clear. I think both quantities fluctuated in experimental results and I am not sure how to define whether numerical results match them.

      The third process is based on a fitting error using the fitting function of linear increase plus a constant. I am not sure why we need to exclude, for example, the bottom right example in Fig.S6 because it shows no oscillation until the cell length of 3um but then the gradient linearly increases. Please clarify how to justify the criteria. The same argument applies to the fourth screening process. It is not clear why the slope should be smaller than 2.
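
      The first screening step referred to in this comment (a positive real part and a finite imaginary part at the first spatial mode) can be sketched in Python for a generic two-component reaction-diffusion system. This is only a toy illustration of the linear-stability criterion: the 2x2 Jacobian, the diffusion coefficients, and the function names are hypothetical, and the manuscript's model has more variables and its own rate constants.

```python
import cmath
import math


def mode_eigenvalues(jac, diff, length_um, mode=1):
    """Eigenvalues of J - q^2 D for a 2x2 Jacobian and diagonal diffusion.

    q = mode * pi / length is the wavenumber of a cosine mode that
    satisfies no-flux boundaries on a 1D domain of the given length.
    """
    q2 = (mode * math.pi / length_um) ** 2
    a = jac[0][0] - q2 * diff[0]
    b = jac[0][1]
    c = jac[1][0]
    d = jac[1][1] - q2 * diff[1]
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


def passes_first_screen(jac, diff, length_um, tol=1e-9):
    """True if some eigenvalue at the first mode both grows and oscillates."""
    return any(ev.real > tol and abs(ev.imag) > tol
               for ev in mode_eigenvalues(jac, diff, length_um))


# Hypothetical Jacobians chosen only to exercise both outcomes.
oscillatory = passes_first_screen([[1.0, -2.0], [2.0, 0.5]], [0.01, 0.01], 3.0)
decaying = passes_first_screen([[-1.0, -2.0], [2.0, -0.5]], [0.01, 0.01], 3.0)
```

      A parameter set would be retained by such a screen only when the first call pattern applies; the later screening steps questioned above (matching \lambda_N and I_Ratio, fitting error, slope threshold) are not represented here.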

      (v) The authors claimed that the steeper gradient of MinD under glucose starvation results in cell division for shorter cells. I do not think the claim is convincing. It is necessary to measure the correlation between the length at the cell division and the gradient. It would also be nicer to show the correlation under other parameters. I think those studies truly support the authors' claim and the novelty of this work.

      (vi) The conclusion at Line 346 "This plasticity arises from spatial differences in molecular interactions between MinD and MinE, as demonstrated..." looks unclear to me. My understanding is that (i) by screening the randomly sampled parameters in the reaction-diffusion model, the authors found the parameters that "match" experimental results, and (ii) the parameters after screening show the correlation between them (k_dD-k_dE and k_D-k_ATP->ADP). The logic heavily relies on the reaction-diffusion model being quantitatively correct. First, I think it is better to explain the logic more explicitly, that is, the claim of the molecular interaction is not based on the experimental facts. Second, I personally think the reaction-diffusion model used in this work does not reproduce quantitatively the experimental results, as discussed in (iii) and also (iv). Please discuss how to justify the comparison between the model and experiments.

      (vii) I did not understand why the authors can claim "... further distinguishing in vivo and in vitro observations. " at Line 350. I did not find results compared with in vitro studies. I would appreciate a demonstration of in vitro results and/or references.

      Minor comments:

      1. Line 214: It should be "Fange and Elf".
      2. I think it is better to show sampled points in Fig.4C and 4D to show how dense the authors sampled in the parameter space.

      REFERENCES:

      [1] Fischer-Friedrich, Elisabeth / Meacci, Giovanni / Lutkenhaus, Joe / Chaté, Hugues / Kruse, Karsten, "Intra- and intercellular fluctuations in Min-protein dynamics decrease with cell length", Proceedings of the National Academy of Sciences, 107, 6134-6139 (2010).

      [2] Meacci, Giovanni, "Physical Aspects of Min Oscillations in Escherichia Coli", PhD thesis (2006) available at https://www.pks.mpg.de/fileadmin/user_upload/MPIPKS/group_pages/BiologicalPhysics/dissertations/GiovanniMeacci2006.pdf

      Significance

      General assessment:

      I think the strength of this study is that it potentially shows the quantitative correlation between the MinD concentration gradient during the oscillation and the cell length when it divides. However, the current data of glucose starvation is not convincing enough. The model parts are interesting but their connection to the experiments is not clear in the current manuscript.

      Advance:

      The advance of this study is to measure the MinD concentration gradient under glucose starvation, and to compare the experimental results with the (simplified) model under a wide range of parameters. I do not think the advance in the current manuscript is at the conceptual level, because the conceptual conclusions are not convincingly supported by the results. In this respect, the advance of this work may be technical.

      Audience:

      As a theoretician working on biophysics, including the model of the Min system, I think a specialised audience would be interested in this study. People who are studying the mechanism of the Min oscillation and resulting cell division, particularly those who are interested in both experiments and models, would be interested in this work. For the broad audience, I do not think the novelty of this study is well described.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This structural and biochemical study of the mouse homolog of acidic mammalian chitinase (AMCase) enhances our understanding of the pH-dependent activity and catalytic properties of mouse AMCase and sheds light on its adaptation to different physiological pH environments. The methods and analysis of data are solid, providing several lines of evidence to support the development of mechanistic hypotheses. While the findings and interpretation will be valuable to those studying AMCase in mice, the broader significance, including extension of the results to other species including human, remains unclear.

      Public Reviews:

      Reviewer #1 (Public Review):

      General comments:

      This paper investigates the pH-specific enzymatic activity of mouse acidic mammalian chitinase (AMCase) and aims to elucidate its function's underlying mechanisms. The authors employ a comprehensive approach, including hydrolysis assays, X-ray crystallography, theoretical calculations of pKa values, and molecular dynamics simulations to observe the behavior of mouse AMCase and explore the structural features influencing its pH-dependent activity.

      The study's key findings include determining kinetic parameters (kcat and Km) under a broad range of pH conditions, spanning from strong acid to neutral. The results reveal pH-dependent changes in enzymatic activity, suggesting that mouse AMCase employs different mechanisms for protonation of the catalytic glutamic acid residue and the neighboring two aspartic acids at the catalytic motif under distinct pH conditions.

      The novelty of this research lies in the observation of structural rearrangements and the identification of pH-dependent mechanisms in mouse AMCase, offering a unique perspective on its enzymatic activity compared to other enzymes. By investigating the distinct protonation mechanisms and their relationship to pH, the authors reveal the adaptive nature of mouse AMCase, highlighting its ability to adjust its catalytic behavior in response to varying pH conditions. These insights contribute to our understanding of the pH-specific enzymatic activity of mouse AMCase and provide valuable information about its adaptation to different physiological conditions.

      Overall, the study enhances our understanding of the pH-dependent activity and catalytic properties of mouse AMCase and sheds light on its adaptation to different physiological pH environments.

      Reviewer #2 (Public Review):

      Summary:

      In this study of the mouse homolog of acidic mammalian chitinase, the overall goal is to provide a mechanistic explanation for the unusual observation of two pH optima for the enzyme. The study includes biochemical assays to establish kinetic parameters at different solution pH, structural studies of enzyme/substrate complexes, and theoretical analysis of amino acid side chain pKas and molecular dynamics.

      Strengths:

      The biochemical assays are rigorous and nicely complemented by the structural and computational analysis. The mechanistic proposal that results from the study is well rationalized by the observations in the study.

      Weaknesses:

      The overall significance of the work could be made more clear. Additional details could be provided about the limitations of prior biochemical studies of mAMC that warranted the kinetic analysis. The mouse enzyme seems unique in terms of its behavior at high and low pH, so it remains unclear how the work will enhance broader understanding of this enzyme class. It was also not clear whether the findings can be used for therapeutic purposes, as detailed in the abstract, if the human enzyme works differently.

      We have edited the paper to address these concerns.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      (1) Regarding the pH profiles of mouse AMCase, previous studies have reported its activity at pH 2.0 and within the pH range of 3-7. In this paper, the authors conducted kinetic measurements and showed that pH 6.5 is optimal for kcat/Km. The authors emphasize the significance of mouse AMCase's activity in the neutral region, particularly at pH 6.5, for understanding its physiological relevance in humans. To provide a comprehensive overview, it would be valuable for the authors to summarize the findings from previous and current studies, discuss their implications for future pulmonary therapy in humans, and cite relevant literature. Additionally, the authors should highlight their research's specific contributions and novel findings, such as the determination of kinetic parameters (kcat and Km) under different pH conditions. Emphasizing why previous studies may have required these observations and underscoring the importance of the present findings in addressing those knowledge gaps will help readers understand the significance of the study and its impact on the field of enzymology.

      We thank the reviewer for this comment. In keeping with the knowledge gaps addressed directly by this paper, we have not augmented the discussion of future pulmonary therapy in humans. We have summarized the present findings at the end of the introduction as follows:

      “We measured the mAMCase hydrolysis of chitin, which revealed significant activity increase under more acidic conditions compared to neutral or basic conditions. To understand the relationship between catalytic residue protonation state and pH-dependent enzyme activity, we calculated the theoretical pKa of the active site residues and performed molecular dynamics (MD) simulations of mAMCase at various pHs. We also directly observed conformational and chemical features of mAMCase between pH 4.74 to 5.60 by solving X-ray crystal structures of mAMCase in complex with oligomeric GlcNAcn across this range.”
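
      For readers less familiar with the kinetic parameters discussed in this comment, the following is a minimal sketch of how kcat, Km and kcat/Km relate under the standard Michaelis-Menten rate law, v = kcat [E] [S] / (Km + [S]). It uses a Hanes-Woolf linearisation purely for illustration; this is an assumption, not the fitting method used in the manuscript, and all numerical values are made up rather than measured mAMCase data.

```python
import numpy as np

def michaelis_menten_rate(substrate, kcat, km, enzyme):
    """Initial rate v = kcat * [E] * [S] / (Km + [S])."""
    substrate = np.asarray(substrate, dtype=float)
    return kcat * enzyme * substrate / (km + substrate)

def fit_hanes_woolf(substrate, rate, enzyme):
    """Estimate kcat and Km from the line [S]/v = [S]/Vmax + Km/Vmax."""
    substrate = np.asarray(substrate, dtype=float)
    rate = np.asarray(rate, dtype=float)
    slope, intercept = np.polyfit(substrate, substrate / rate, 1)
    vmax = 1.0 / slope          # slope of the Hanes-Woolf line is 1/Vmax
    km = intercept * vmax       # intercept is Km/Vmax
    kcat = vmax / enzyme
    return {"kcat": kcat, "km": km, "kcat_over_km": kcat / km}

# Hypothetical values: kcat in 1/s, Km and concentrations in uM.
substrate_um = np.array([5.0, 10.0, 25.0, 50.0, 100.0, 200.0])
rates = michaelis_menten_rate(substrate_um, kcat=2.0, km=50.0, enzyme=0.01)
estimate = fit_hanes_woolf(substrate_um, rates, enzyme=0.01)
```

      Repeating such a fit at each pH would give a pH profile of kcat, Km and kcat/Km. For noisy data, direct nonlinear regression is generally preferred over linearised forms.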

      (2) Regarding the implications of the pKa values and Asp138 orientation for the pH optima, it would be valuable for the authors to discuss the variations in optimal activity by pH among GH-18 chitinases and investigate the underlying factors contributing to these differences. In particular, exploring the role of Asp138 orientation in chitotriosidase, another mammalian chitinase, would provide important insights. Chitotriosidase is known to be inactive at pH 2.0, and it would be interesting to investigate whether the observed orientation of Asp138 towards Glu140 in mouse AMCase for pH 2.0 activity is lacking in chitotriosidase.

      There are similar rotations of the two acidic residues in the literature on Chit1. The variety of crystal pH conditions and the lack of a straightforward mechanism for pKa shifts in AMCase make it difficult to draw a comparison to why Chit1 is inactive at low pH, but this is an interesting area for future study. See a fuller discussion in: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2760363/

      Furthermore, considering the lower activity of human AMCase at pH 2.0, it would be worthwhile to examine whether the Asp138 orientation towards Glu140, as observed in mouse AMCase, is also absent in human AMCase. Exploring this aspect will help determine if the orientation of Asp138 plays a critical role in pH-dependent activity in human AMCase.

      The situation for hAMCase is similar to Chit1, as the rotations observed here for mAMCase are also present. The issue is not whether Asp138 can rotate, but rather the relevant energetic penalties, as we discuss in the manuscript.

      (3) In a previous study by Okawa et al.(Loss and gain of human acidic mammalian chitinase activity by nonsynonymous SNPs. Mol Biol Evol 33, 3183-3193, 2016), it was reported that specific amino acid substitutions (N45D, D47N, and R61M) encoded by nonsynonymous single nucleotide polymorphisms (nsSNPs) in the N-terminal region of human AMCase had distinct effects on its chitinolytic activity. Introducing these three residues (N45D, D47N, and R61M) could activate human AMCase. This activation significantly shifted the optimal pH from 4-5 to 2.0.

      Considering the significant impact of these amino acid substitutions on the pH-dependent activity of human AMCase, the authors should discuss this point in the manuscript's discussion section. Incorporating the findings and relating them to the current study's observations on pH optima and Asp138 orientation can provide a comprehensive understanding of the factors influencing pH-dependent activity in AMCase.

      We added a citation and discuss how the mutations identified by this study could potentially shift the pKa of key catalytic residues:

      “Okawa et al identified how primate AMCase lost activity by integration of specific, potentially pKa-shifting, mutations relative to the mouse counterpart42b.”

      (4) To further strengthen the discussion, the authors could explore the ancestral insectivorous nature of placental mammals and the differences in chitinase activity between herbivorous and omnivorous species. Incorporating these aspects would add depth and relevance to the overall discussion of AMCase. AMCase is an enzyme known for its role in digesting insect chitin in the stomachs of various insectivorous and omnivorous animals, including bats, mice, chickens, pigs, pangolins, common marmosets, and crab-eating monkeys 1-7. However, in certain animals, such as dogs (carnivores) and cattle (herbivores), AMCase expression and activity are significantly low, leading to impaired chitin digestion 8. These observations suggest a connection between dietary habits and the expression and activity of the AMCase gene, ultimately influencing chitin digestibility across different animal species 8.

      (1) Strobel et al. (2013). Insectivorous bats digest chitin in the stomach using acidic mammalian chitinase. PLoS One 8, e72770.

      (2) Ohno et al. (2016). Acidic mammalian chitinase is a proteases-resistant glycosidase in mouse digestive system. Sci Rep 6, 37756.

      (3) Tabata et al. (2017). Gastric and intestinal proteases resistance of chicken acidic chitinase nominates chitin-containing organisms for alternative whole edible diets for poultry. Sci Rep 7, 6662.

      (4) Tabata et al. (2017). Protease resistance of porcine acidic mammalian chitinase under gastrointestinal conditions implies that chitin-containing organisms can be sustainable dietary resources. Sci Rep 7, 12963.

      (5) Ma et al. (2018). Acidic mammalian chitinase gene is highly expressed in the special oxyntic glands of Manis javanica. FEBS Open Bio 8, 1247-1255.

      (6) Tabata et al. (2019). High expression of acidic chitinase and chitin digestibility in the stomach of common marmoset (Callithrix jacchus), an insectivorous nonhuman primate. Sci Rep 9, 159.

      (7) Uehara et al. (2021). Robust chitinolytic activity of crab-eating monkey (Macaca fascicularis) acidic chitinase under a broad pH and temperature range. Sci Rep 11, 15470.

      (8) Tabata et al. (2018). Chitin digestibility is dependent on feeding behaviors, which determine acidic chitinase mRNA levels in mammalian and poultry stomachs. Sci Rep 8, 1461.

      This overall point is covered by our brief discussion on diet differences:

      “However, hAMCase is likely too destabilized at low pH to observe an increase in kcat. hAMCase may be under less pressure to maintain high activity at low pH due to humans’ non-insect-based diet, which contains less chitin compared to other mammals with primarily insect-based diets42.”

      (5) It is important for the authors to clearly state the limitations of their simulations and emphasize the need for experimental validation or additional supporting evidence. This will provide transparency and enable readers to understand the boundaries of the study's findings. A comprehensive discussion of limitations would contribute to a more robust interpretation of the results.

      We added a sentence to the discussion:

      “Our simulations have important limitations that could be overcome by quantum mechanical simulations that allow for changes in protonation state and improved consideration of polarizability.”

      Minor comments:

      (1) Regarding the naming of AMCase, it is important to accurately describe it based on its acidic isoelectric point rather than its enzymatic activity under acidic conditions based on the original paper (Reference #14 (Boot, R. G. et al. Identification of a novel acidic mammalian chitinase distinct from chitotriosidase. J. Biol. Chem. 276, 6770-6778 (2001)).

      We have made this modification

      (2) In the introduction, providing more context regarding the terminology of acidic mammalian chitinase (AMCase) would be beneficial. While AMCase was initially discovered in mice and humans, subsequent research has revealed its presence in various vertebrates, including birds, fish, and other species. Therefore, it would be appropriate to include the alternative enzyme name, Chia (chitinase, acidic), in the introduction to reflect its broader distribution across different organisms. This clarification would enhance the readers' understanding of the enzyme's taxonomy and facilitate further exploration of its functional significance in diverse biological systems.

      We have made this modification

      (3) The authors mention that AMCase is active in tissues with neutral pHs, such as the lung. However, it is important to consider that the pH in the lung is lower, around 5, due to the presence of dissolved CO2 that forms carbonic acid. The lung microenvironment is known to vary, and specific regions or conditions within the lung may have slightly different pH levels. By addressing the pH conditions in the lungs and their relationship to AMCase's activity, the authors can enhance our understanding of the enzyme's function within its physiological context. A thorough discussion of the specific pH conditions in the lung and their implications for AMCase's activity would provide valuable insights into the enzyme's role in lung pathophysiology.

      To keep the focus on the insights we have made, we have elected not to expand this discussion.

      (4) It would be helpful for the authors to provide more information about the substrate or products of AMCase. The basic X-ray crystal structures used in this study are GlcNAc2 or GlcNAc3, known products of AMCase. Including details about the specific ligands involved in the enzymatic reactions would enhance the understanding of the study's focus.

      We are unclear about what this means - and since it is a minor comment, we have elected not to change the discussion of substrates here.

      (5) The authors should critically evaluate the inclusion of the term "chitin-binding" in the Abstract and Introduction. If substantial evidence or discussion regarding the specific chitin-binding properties of the enzyme or its relevance to the immune response is not included, removing or modifying that statement might be appropriate.

      We are unclear about what this means - and since it is a minor comment, we have elected not to change the discussion of “chitin-binding” here.

      (6) The authors developed an endpoint assay to measure the activity of mouse AMCase across a broad pH range, allowing for direct measurement of kinetic parameters. The authors should provide a more detailed description of the methods used, including any specific modifications made to the previous assay, to ensure reproducibility and facilitate further research in the field. It is important to clearly show the novelty of their endpoint assay compared to previous methods employed in other reports. The authors should also explain how their modified endpoint assay differs from existing assays and highlight its advancements or improvements. This will help readers understand the unique features and contributions of the assay in the context of previous methods.

      We have included a detailed method description and figures already. See also our previous paper by Barad which includes other, related, assays.

      (7) The authors suggest that mouse AMCase may be subject to product inhibition, potentially due to its transglycosylation activity, which can affect the Michaelis-Menten model predictions at high substrate concentrations. However, the reviewer needed help understanding the specific impact of transglycosylation on the kinetic parameters. It would be helpful for the authors to provide a more appropriate and detailed explanation, clarifying how transglycosylation activity influences the kinetic behavior of AMCase and its implications for the observed results.

      The experiments to conclusively demonstrate this are beyond our current capabilities.

      (8) In the Abstract, the authors state, "We also solved high resolution crystal structures of mAMCase in complex with chitin, where we identified extensive conformational ligand heterogeneity." This reviewer suggests replacing "chitin" with "oligomeric GlcNAcn" throughout the text, specifically about biochemical experiments. It is important to accurately describe the experimental conditions and ligands used in the study.

      We have made these changes throughout the manuscript

      (9) In the introduction, the authors mention "a polymer of β(1-4)-linked N-acetyl-D-glucosamine (GlcNAc)". In this case, the letter "N" should be italicized to conform to the proper notation for the monosaccharide abbreviation.

      Corrected (and hopefully this would have been caught by the copy editor!)

      (10) In the introduction, the authors state, "In the absence of AMCase, chitin accumulates in the airways, leading to epithelial stress, chronic activation of type 2 immunity, and age-related pulmonary fibrosis5,6". It is recommended to clarify that "AMCase" refers to "acidic mammalian chitinase (AMCase)" in this context, as it is the first mention of the enzyme in the introduction.

      We moved that section so that it flows better and is introduced with the full name.

      (11) In the introduction, the authors state, "Mitigating the negative effects of high chitin levels is particularly important for mammalian lung and gastrointestinal health." This reviewer requests further clarification on the connection between chitin and gastrointestinal health. Please provide an explanation or reference to support this statement.

      We have modified this sentence to:

      “Chitin levels can be potentially important for mammalian lung and gastrointestinal health.”

      (12) In the introduction, the authors mention that "Acidic Mammalian Chitinase (AMCase) was originally discovered in the stomach and named for its high enzymatic activity under acidic conditions." It is recommended to include Reference #14 (Boot et al. J. Biol. Chem. 276, 6770-6778, 2001) as it provides the first report on mouse and human AMCase, contributing to the understanding of the enzyme.

      However, it is worth noting that while this paragraph primarily focuses on human tissues, Reference #14 primarily discusses mouse AMCase but also reports on human AMCase. Additionally, References #8 and #9 mainly discuss mouse AMCase. This creates confusion in the description of human and mouse AMCase within the paragraph.

      Considering that this paper aims to focus on the unique features of mouse AMCase, it is suggested that the authors provide a more specific and balanced description of both human and mouse AMCase throughout the main text.

      We have clarified the origin of the name AMCase, and we distinguish the two orthologs in the text as hAMCase or mAMCase.

      (13) Figure 1A in the Introduction section has been previously presented in several papers. The authors should consider moving this figure to the Results section and present an alternative figure based on their experimental results to enhance the novelty and impact of the study.

      We have considered this option, but prefer the original placement.

      (14) In the Results section, the authors mentioned, "Prior studies have focused on relative mAMCase activity at different pH18,20, limiting the ability to define its enzymological properties precisely and quantitatively across conditions of interest." It would be beneficial for the authors to include reference #14, the first report showing the pH profile of mouse AMCase, to support their statement.

      We have added this reference

      (15) Regarding the statement, "To overcome the pH-dependent fluorescent properties of 4MU-chitobioside, we reverted the assay into an endpoint assay, which allowed us to measure substrate breakdown across different pH (Supplemental Figure 1A)", the authors should provide a more detailed description of the improvements made to measure AMCase activity. Additionally, it would be helpful to include a thorough explanation of the figure legend for Supplementary Figure 1A to provide clarity to readers.

      We have included a detailed method description and figures already. See also our previous paper by Barad which includes other, related, assays.

      (16) Figure 1B shows that the authors used the AMCase catalytic domain. It would benefit the authors to explain the rationale behind this choice in the figure legend or the main text.

      This point is addressed in the text:

      “Previous structural studies on AMCase have focused on interactions between inhibitors like methylallosamidin and the catalytic domain of the protein.”

      (17) For Figures 1C-E, it is recommended that the authors include error bars in their results to represent the variability or uncertainty of the data. In Figure 1E, the authors should clarify the units of the Y-axis (e.g., sec-1 µM-1). Additionally, in Figure 1F, the authors should explain how the catalytic acidity is shown.

      We have added error bars and axis labels. Figure 1F is conceptual, so we are leaving it as is.

      (18) The authors stated, "These observations raise the possibility that mAMCase, unlike other AMCase homologs, may have evolved an unusual mechanism to accommodate multiple physiological conditions." It would be helpful for the authors to compare and discuss the pH-dependent AMCase activity of mouse AMCase with other AMCase homologs to support this statement.

      That is an excellent idea for future comparative studies, but beyond the scope of what we are examining in this paper.

      (19) The authors should explain Supplemental Figures 1B and C in the Results or Methods sections to provide context for these figures.

      We are unclear about what this means - and since it is a minor comment, we have elected not to change these sections.

      (20) Supplemental Figure 3 is missing any description. It would be important for the authors to include a mention of this figure in the main text before Supplemental Figure 4 to guide the readers.

      The full legend is in there now and the reference to Supplemental 4 was mislabeled.

      (21) For Supplemental Figure 4, the authors should explain the shape of the symbol used in the figure. Additionally, they should explain "apo" and "holoenzyme" in the context of this figure.

      Unclear what a shape means in this context - perhaps the confusion arises because these are violin plots showing distributions.

      (22) Table 1 requires a more detailed explanation of its contents. Additionally, Tables 2 and 3 need to be included. The authors should include these missing tables in the revised version and explain their contents appropriately.

      Table 1 is the standard crystallographic table - there isn’t much more detailed explanation that can be offered. Tables 2 and 3 were not transferred properly by BioRxiv but were included in the review packet as requested a day after submission.

      (23) In Figure 4, it would be beneficial to enlarge Panels A-C to improve the ease of comprehension for readers. Additionally, it is recommended to use D136, D138, and E140 instead of D1, D2, and E to label the respective parts. The authors should also explain the meaning of the symbol used in the figure.

      Since it is a minor comment, we have elected not to change these figures.

      (24) In Figure 5, it would be beneficial to enlarge Panels A-C to improve the ease of comprehension for readers.

      Since it is a minor comment, we have elected not to change these figures.

      (25) Similarly, in Figure 6, all panels should be enlarged to enhance the ease of comprehension for readers.

      Since it is a minor comment, we have elected not to change these figures.

      Reviewer #2 (Recommendations For The Authors):

      In general, I did not identify many detailed or technical concerns with the work. A few items for the authors to consider are listed below.

      (1) The interpretation of the crystallographic datasets seems complicated by the heterogeneity in the substrate component. It might be nice to see more critical analysis of the approach here. Are there other explanations or possible models that were considered? Do other structures of chitinases or other polysaccharide hydrolases exhibit the same phenomenon?

      We have tried in writing it to provide a very critical approach to this and it is quite likely that other structures contain unmodeled density containing similar heterogeneity (but it is just unmodeled).

      (2) It would be ideal to include more experimental validation of the proposed mechanism. Much of the manuscript includes theoretical validations (pKa estimation, dynamics, etc) - but it would be optimal to make an enzyme variant or do an experiment with a substrate analog.

      Yes - we agree that follow-on experiments are needed to fully test the mechanism, and those will be the subject of future work.

      (3) For an uninitiated reviewer, I think the major issue with this study is that the broader significance of the work and how it fits into the context of other work on these enzymes is not clear. It would be helpful to be more specific about what we know of mechanism from work on other enzymes to help the reader understand the motivation for this study.

      We have added a few additional references, guided by reviewer 1 comments, that should help in this respect.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2024-02393

      Corresponding author(s): Katja Petzold

      1. General Statements [optional]

      We thank the reviewers for recognising the impact of our manuscript. The reviewers noted the novelty of the miRNA bulge structure, the importance of the three observed binding modes and their potential for use in future structure-based drug design, and the possible importance of the duplex release phenomenon. We are also thankful for the relevant and constructive feedback provided.

      Our responses to the comments are written point by point in blue, and any changes in the manuscript are shown in red.

      2. Description of the planned revisions

      In response to Reviewer 1 - major comment 2

      Some of the data is over-interpreted. For example, in Figure 3A, it is concluded that supplementary regions are more important for weaker seeds. Only two 8-mer seeds are present among the twelve target sites and thus it might be difficult to generalize.

      We found the relationship between seed type and the effect of supplementary pairing in our data intriguing. To further investigate this effect, we tested whether it exists in published microarray data from HCT116 cells transfected with six different miRNAs (Linsley et al., 2007; Agarwal et al., 2015). Here we found that for the two miRNAs (miR-103 and miR-106b) where we see an impact of supplementary pairing, the difference is primarily driven by 7mer-m8 seeds.

      Since the effect appears to be specific to the miRNA, we would like to test whether it can be observed for miR-34a in a larger dataset. Therefore, we plan to transfect HEK293T cells with miR-34a and analyse the mRNA response via RNAseq. We will repeat the analysis shown above, using the predicted number of supplementary pairs to categorise the dataset into groups with or without the effect of supplementary pairing. We will then compare the three seed types within these groups.
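
      As an illustration of how target sites could be grouped by seed type before such a comparison, the sketch below classifies a site as 8mer, 7mer-m8 or 7mer-A1. It assumes the commonly used canonical definitions (match to miRNA nucleotides 2-8 with or without an A opposite nucleotide 1, or match to nucleotides 2-7 plus that A); the manuscript's exact definitions may differ. The miRNA and target sequences are hypothetical toy sequences, not miR-34a or its targets.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def classify_seed_site(mirna, target):
    """Return the strongest canonical seed-match type found in `target` (5'->3')."""
    # miRNA positions are 1-based from the 5' end; Python slices are 0-based.
    match_2_8 = reverse_complement(mirna[1:8])  # pairs with miRNA nt 2-8
    match_2_7 = reverse_complement(mirna[1:7])  # pairs with miRNA nt 2-7
    if match_2_8 + "A" in target:
        return "8mer"       # nt 2-8 match followed by an A opposite miRNA nt 1
    if match_2_8 in target:
        return "7mer-m8"    # nt 2-8 match without the flanking A
    if match_2_7 + "A" in target:
        return "7mer-A1"    # nt 2-7 match followed by the flanking A
    return "none"

# Hypothetical toy miRNA (not a real sequence) and toy target fragments.
toy_mirna = "UACGGAUCCAGUUAGCAUGCAA"
site_types = {
    site: classify_seed_site(toy_mirna, site)
    for site in ["CCGAUCCGUAGG", "CCGAUCCGUCGG", "CCCAUCCGUAGG", "CCCCCCCC"]
}
```

      Supplementary pairing toward the miRNA 3' end is not considered here; a fuller analysis would score it separately and then compare the seed types within each group.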

      In response to Reviewer 2 - minor comment 1, "why was the 34-nt 3'Cy3-labeled miR34a complementary probe shifted up in the presence of AGO?".

      We plan to investigate the upper band, which we hypothesise is a result of duplex release, using EMSA to ascertain whether the band height agrees with the size of the duplex.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1

      Evidence, reproducibility and clarity

      Sweetapple et al. Biophysics of microRNA-34a targeting and its influence on down-regulation

      In this study, the authors have investigated binding of miR-34a to a panel of natural target sequences using EMSA, luciferase reporter systems and structural probing. The authors compared binding within a binary and a ternary complex that included Ago2 and find that Ago2 affects affinity and strengthens weak binders and weakens strong binders. The affinity is, however, generally determined by binary RNA-RNA interactions also in the ternary complex. Luciferase reporter assays containing 12 different target sites that belong to one of three seed-match types were tested. Generally, affinity is a strong contributor to repression efficiency. Duplex release, a phenomenon observed for specific miRNA-target complementarities, seems to be more pronounced when high affinity within the binary complex is observed. Furthermore, the authors use RABS for structural probing either in a construct in CIS or binding by the individual miRNA in TRANS or in a complex with Ago2. They find pronounced asymmetric target binding and Ago2 does not generally change the binding pattern. The authors observe one specific structural group that was unexpected, which was mRNA binding with bulged miRNAs, which was expected to be sterically problematic based on the known structures. MD simulations, however, revealed that such structures could indeed form.

      This is an interesting manuscript that contributes to our mechanistic understanding of the miRNA-target pairing rules. The combination of affinity measurements, structural probing and luciferase reporters allow for a broad correlation of target binding and repression strength, which is a well-thought and highly conclusive approach. However, there are a number of shortcomings that are summarized below.

      The manuscript is not easy to read and to follow for several reasons. First, many of the sub-Figures are not referenced in the text of the results section (1C, 1D, 2C, 4D), which is somewhat annoying. Figure 4A seems to be mis-labeled. Second, a lot of data is presented in suppl. Figures. It should be considered to move more data into the main text in order to make it easier for readers to evaluate and follow.

      Thank you for bringing this to our attention. We have now revised the figure references accordingly.

      We have relocated gel images of BCL2, WNT1, MTA2 and the control samples from Figure S3 and S4 to the main results (Figure 2A-B) to improve readability and provide controls and details that aid in clear understanding. Additionally, we have relocated panel C from Figure S6 to Figure 2C to enhance the clarity of our rationale for using polyuridine (pU) in our AGO2 binding assays.

      The updated figure is shown below, with changes to the legend marked in red.

      Figure 2. Binary and ternary complex binding affinities measured by EMSA. (A) Binary (mRNA:miR-34a) binding assays showing examples of BCL2, WNT1 and MTA2. (B) Ternary (mRNA:miR-34a-AGO2) binding assays showing examples of BCL2, WNT1, MTA2, and the three control targets PERFECT, SCRseed, and SCRall. The Cy5-labelled species is indicated with an asterisk (*). F indicates the free labelled species (miR-34a or mRNA), B indicates binary complex, and T indicates ternary complex. Adjacent titration points differ two-fold in concentration, with maximum concentrations stated at the top right. Adjacent titration points for MTA2 differed three-fold to assess a wider concentration range. In the ternary assay, miRNA duplex release from AGO2 was observed for, amongst others, BCL2, WNT1, PERFECT, and SCRseed (band indicated with B), while it was not observed for SCRall and MTA2. See Figures S3 and S4 for representative gel images for all targets. See Supplementary files 2 and 3 for all images and replicates. (C) Titrations with increasing miR-34a-AGO2 concentration against Cy5-labelled SCRall (left) or PNUTS (right) comparing the absence and presence of 20 μM polyuridine (pU) during equilibration. pU acted as a blocking agent, reducing nonspecific binding, as seen by the different KD,app values for SCRall and PNUTS after addition of 20 μM pU. Therefore, all final mRNA:miR-34a-AGO2 EMSAs were carried out in the presence of 20 μM pU. Labels are as stated above. (D) Individual binding profiles for each of the 12 mRNA targets assessed by electrophoretic mobility shift assay (EMSA). Each datapoint represents an individual experiment (n=3). Blue represents results for the binary complex, and green represents results for the ternary complex. Dotted horizontal lines represent the KD,app values, which are also stated in blue and green with standard deviations (units = nM). Note that the x-axis spans from 0.1 to 100,000 in CCND1, MTA2 and NOTCH2, whereas the remaining targets span 0.1 to 10,000.

      Some of the data is over-interpreted. For example, in Figure 3A, it is concluded that supplementary regions are more important for weaker seeds. Only two 8-mer seeds are present among the twelve target sites and thus it might be difficult to generalize.

      We have revised our wording to recognise that more 8-mer sites would be required to draw a stronger conclusion based on this hypothesis. This hypothesis would be interesting to confirm in a larger dataset but is unfortunately outside of the scope of this paper.

      Our hypothesis also aligns with recent data from Kosek et al. (NAR 2023; Figure 2D), where SIRT1 with an 8mer and 7mer-A1 seed was compared. Only the 7mer-A1 was sensitive to mutations in the central region or to switching all mismatches to WC pairs.

      Page 21 now states:

      "This result indicates that the impact of supplementary binding may be greater for targets with weaker seeds, as has been observed earlier in a mutation study of miR-34a binding to SIRT1 (Kosek et al., 2023), although a larger sample size would be needed to confirm this observation."

      Furthermore, we found the relationship between seed type and the effect of supplementary pairing in our data intriguing. To further investigate this effect, we tested whether it exists in published microarray data from HCT116 cells transfected with six different miRNAs (Linsley et al., 2007; Agarwal et al., 2015). Here we found that, for the two miRNAs (miR-103 and miR-106b) where we see an impact of supplementary pairing, the difference is primarily driven by 7mer-m8 seeds. We therefore plan to test whether the effect can be observed for miR-34a in a larger dataset. We have outlined our preliminary data and planned experiments in Section 2 - description of the planned revisions.

      I did not understand why the CIS system shown in 4A is a good test case for miR-34a-target binding. It appears very unnatural and artificial. This needs to be rationalized better. Otherwise it remains questionable, whether these data are meaningful at all.

      Thank you for pointing out the need for clearer rationalisation.

      The TRANS construct, where the scaffold carries the mRNA targeting sequence, provides reactivity information for the mRNA side only, while the microRNA is bound within RISC, with the backbone protected by AGO2. Therefore, to gain information on the miR-34a side of each complex we used the CIS construct, which provides reactivity information from both the miRNA and mRNA. We used the miRNA and mRNA reactivities to calculate all possible secondary structures for the binary complex, and then compared these structures to the mRNA reactivity in TRANS to find which structure fitted the reactivity patterns observed in the ternary complex.

      We have included an additional statement in the manuscript to clarify this point on pages 12-13:

      "Two RNA scaffolds were used for each mRNA target; i) a CIS-scaffold: RNA scaffold containing both mRNA target and miRNA sequence separated by a 10 nucleotide non-interacting closing loop, and ii) a TRANS-scaffold: RNA scaffold containing only the mRNA target sequence, to which free miR-34a or the miR-34a-AGO2 complex was bound (Figure 4A). The CIS constructs therefore provided reactivity information on the miRNA side, which is lacking in the TRANS construct, and was used to complement the TRANS data."

      It may be worth noting that a non-interacting 10-nucleotide loop was inserted between the miRNA and mRNA of the CIS constructs, allowing the miRNA and mRNA strands to bind and release freely. The reactivity patterns of each mRNA:miRNA duplex were compared between CIS and TRANS, and showed similar base pairing (Figure 4D). Furthermore, we have previously compared the two scaffolds in our RABS methodology paper (Banijamali et al. 2022), where no differences were observed besides reduced end fraying in the CIS construct.

      For the TRANS experiments, only one specific scaffold structure is used. This structure might impact binding as well and thus at least one additional and independent scaffold should be selected for a generalized statement.

      For each construct, the potential for interaction with the scaffold was tested using the RNAstructure package (Reuter & Mathews, 2010). Based on the results of this assessment, two different scaffolds were used for our TRANS experiments. The testing and use of scaffolds has now been clarified further on page 13:

      "The overall conformation of each scaffold with the inserted RNA was assessed using the RNAstructure (Reuter & Mathews, 2010) package to ensure that the sequence of interest did not interact with the scaffold. If any interaction was observed between the RNA of interest and the scaffold, then the scaffold was modified until no predicted interaction occurred. The different scaffolds and their sequence details are shown in supplementary information (Table S1)."

      We have previously examined the scaffold's effect on binding and structure during the development of the RABS method. We tested the same mRNA (SIRT1) in separate, independent scaffolds to verify the consistency of the results. An example of this can be found in the supplementary information (Figure S1a) of Banijamali et al. (2022).

      Generally, it would be nice to have some more information about the experiments also in the result section. Recombinant Ago2 is expressed in insect cells and re-loaded with miR-34a, luciferase reporters are transfected into tissue culture cells, I guess.

      We have now stated the cell types used for AGO2 expression and luciferase reporter assays in the results.

      On page 17 we have included:

      "Samples of each of the 12 mRNA targets, as well as miR-34a and AGO2, were synthesised in-house for biophysical and biological characterisation. Target mRNA constructs were produced via solid-phase synthesis while miR-34a was transcribed in vitro and cleaved from a tandem transcript (Feyrer et al., 2020), ensuring a 5' monophosphate group. AGO2 was produced in Sf9 insect cells."

      "To measure the affinity of each mRNA target binding to miR-34a, both within the binary complex (mRNA:miR-34a) and the ternary complex (mRNA:miR-34a-AGO2), we optimised an RNA:RNA binding EMSA protocol to suit small RNA interactions. The protocol is loosely based on Bak et al. (2014)36, with major differences being use of a sodium phosphate buffering system so as not to disturb weaker interactions (James et al., 1996; Stellwagen et al., 2000), supplemented with Mg2+ as a counterion to reduce electrostatic repulsion between the two negatively charged RNAs (Misra & Draper, 1998), and fluorescently labelled probes."

      Page 19:

      "We successfully tested various RNA backgrounds, including polyuridine (pU) and total RNA extract (Figure S6B) to block any unspecific binding. Ultimately, we supplemented our binding buffer with pU at a fixed concentration of 20 µM for the ternary assays to achieve the greatest consistency."

      Page 20:

      "Repression efficacy for the 12 mRNA targets by miR-34a was assessed through a dual luciferase reporter assay6. Target mRNAs were cloned into reporter constructs and transfected into HEK293T cells."

      Page 22:

      "To infer base pairing patterns and secondary structure for each of the 12 mRNA:miR-34a pairs, we used the RABS technique (Banijamali et al., 2023) with 1M7 as a chemical probe. All individual reactivity traces are shown in Figure S9. Reactivity of each of the 22 miR-34a nucleotides was assessed upon binding to each of the 12 mRNA targets within a CIS construct, containing both miR-34a and the mRNA target site separated by a non-interacting 10-nucleotide loop. The two RNAs can therefore bind and release freely within the CIS construct and reactivity information is collected from both RNA strands."

      In the first sentence of the abstract, Argonaute 2 should be replaced by Argonaute only since other members bind to miRNAs as well.

      Thank you for recognising this. It has now been corrected.

      Significance

      This is an interesting manuscript that contributes to our mechanistic understanding of the miRNA-target pairing rules. The combination of affinity measurements, structural probing and luciferase reporters allow for a broad correlation of target binding and repression strength, which is a well-thought and highly conclusive approach. However, there are a number of shortcomings.

      We thank the reviewer for recognising the approach and impact of our work. In addition we thank the reviewer for identifying the need for further data to support our conclusions from the luciferase assays, which is something that we plan to address, as described in section 2.



      Reviewer #2

      Evidence, reproducibility and clarity

      Summary: Sweetapple et al. took the approaches of EMSA, SHAPE, and MD simulations to investigate target recognition by miR-34a in the presence and absence of AGO2. Surprisingly, their EMSA showed that guide unloading occurred even with seed-unpaired targets. Although previous studies reported guide unloading, they used perfectly complementary guide and target sets. The authors of this study concluded that the base-pairing pattern of miR-34a with target RNAs, even without AGO2, can be applicable to understanding target recognition by miR-34a-bound AGO2.

      Major comments:

      (Page 11 and Figure S4) The authors pre-loaded miR-34a into AGO2 and subsequently equilibrated the RISC with a 5' modified Cy5 target mRNA. Since properly loaded miR-34a is never released from AGO2, it is impossible for the miR-34a loaded into AGO2 to form the binary complex (mRNA:miR-34a) in the EMSA (guide unloading has been a long-standing controversy). However, they observed bands of the binary complex in Figure S4. The authors did not use ion-exchange chromatography. AGOs are known to bind RNAs nonspecifically on their positively charged surface. Is it possible that most miR-34a was actually bound to the surface of AGO2 instead of being loaded into the central cleft? This could explain why they observed the bands of the binary complex in EMSA.

      Thank you for mentioning this crucial point which has been a focus of our controls. We have addressed this point in four ways:

      1. Salt wash during reverse IMAC purification.
      2. Separation of unbound RNA and proteins via SEC.
      3. Blocking non-specific interactions using polyuridine.
      4. Observing both the presence and absence of duplex release among different targets using the same AGO2 preparation and conditions.

      Firstly, although we did not use a specific ion exchange column for purification, we believe the ionic strength used in our IMAC wash step was sufficient to remove non-specific interactions. We used a linear gradient with buffer A (50 mM Tris-HCl, 300 mM NaCl, 10 mM imidazole, 1 mM TCEP, 5% glycerol v/v) and buffer B (50 mM Tris-HCl, 500 mM NaCl, 300 mM imidazole, 1 mM TCEP, 5% glycerol) at pH 8. The protocol followed the recommendation by BioRad for their Profinity IMAC resins, which states that 300 mM NaCl should be included in buffers to deter nonspecific protein binding due to ionic interactions. The protein itself has a higher affinity for the resin than nucleic acids.

      A commonly used protocol for RISC purification follows the method by Flores-Jasso et al. (RNA 2013). Here, the authors use ion exchange chromatography to remove competitor oligonucleotides. After loading, they washed the column with lysis buffer (30 mM HEPES-KOH at pH 7.4, 100 mM potassium acetate, 2 mM magnesium acetate and 2 mM DTT). AGO was eluted with lysis buffer containing 500 mM potassium acetate. Competing oligonucleotides were eluted in the wash.

      As ionic strength is independent of the identity or chemical nature of the ions involved (Jeremy M. Berg, John L. Tymoczko, Gregory J. Gatto Jr., Biochemistry 2015), we reasoned that our Tris-HCl/NaCl/imidazole buffer wash should have an ionic strength comparable to that of the Flores-Jasso protocol.

      Our total ionic contributions were: 500 mM Na+, 550 mM Cl-, 50 mM Tris and 300 mM imidazole. We recognise that Tris and imidazole are both partially ionized according to the pH of the buffer (pH 8) and their respective pKa values, but even when considering only the sodium and chloride ions, the ionic strength should be comparable to that of the Flores-Jasso protocol.
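
      The comparison above can be made concrete with the standard definition of ionic strength, I = 1/2 Σ c_i z_i². The following is a minimal sketch under stated assumptions, not a calculation taken from the manuscript: only fully dissociated salts are counted, the partially ionised buffer species (Tris, imidazole, HEPES) are left out, and the Flores-Jasso elution buffer is assumed to contain 500 mM potassium acetate and 2 mM magnesium acetate. The function and variable names are illustrative.

```python
def ionic_strength(ions):
    """Return ionic strength in mol/L from (concentration in mol/L, charge) pairs.

    Implements I = 1/2 * sum(c_i * z_i**2).
    """
    return 0.5 * sum(conc * charge ** 2 for conc, charge in ions)


# IMAC buffer B, counting only Na+ and Cl- as in the response text.
# Tris and imidazole are omitted because their ionised fractions are not given.
imac_buffer_b = [
    (0.500, +1),  # Na+ from 500 mM NaCl
    (0.550, -1),  # Cl- from 500 mM NaCl plus 50 mM Tris-HCl
]

# Flores-Jasso elution buffer, salts only (assumption: HEPES-KOH and DTT ignored).
flores_jasso_elution = [
    (0.500, +1),  # K+ from 500 mM potassium acetate
    (0.002, +2),  # Mg2+ from 2 mM magnesium acetate
    (0.504, -1),  # acetate from both salts (500 mM + 2 x 2 mM)
]

i_imac = ionic_strength(imac_buffer_b)
i_fj = ionic_strength(flores_jasso_elution)

print(f"IMAC buffer B (Na+/Cl- only):      {i_imac:.3f} M")
print(f"Flores-Jasso elution (salts only): {i_fj:.3f} M")
```

      Under these simplifications the two values fall within a few percent of each other, which is consistent with the argument that the IMAC wash reaches a similar ionic strength. Including the ionised fraction of Tris and imidazole would raise the IMAC value further.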

      We have restated the buffer compositions and rewritten the methods section more explicitly to describe this:

      "Following dialysis, any precipitate was removed by centrifugation, and the resulting supernatant was loaded onto an IMAC buffer A-equilibrated HisTrap-Ni2+ column to remove TEV protease, other proteins, and non-specifically bound RNA. A linear gradient was employed using IMAC buffers A and B."

      Secondly, after reverse HisTrap purification, AGO2 was run through size exclusion chromatography to remove any remaining impurities (shown in Figure S2B).

      Thirdly, knowing that AGO2 has many positively charged surface patches and can bind nucleic acid nonspecifically (Nakanishi, 2022; O'Geen et al., 2018), we tested various blocking backgrounds to eliminate nonspecific binding effects in our EMSA ternary binding assays. We were able to address this issue by adding either non-homogenous RNA extract or homogenous polyuridine (pU) in our EMSA buffer during equilibration background experiments. This allowed us to eliminate non-specific binding of our target mRNAs, as shown previously in Supplementary Figure S6. We appreciate that the reviewer finds this technical detail important and have moved the panel C of figure S6 into the main results in Figure 2C, to highlight the novel conditions used and important controls needed to be performed. If miR-34a were non-specifically bound to the surface of AGO2 after washing, this blocking step would render any impact of surface-bound miR-34a negligible due to the excess of competing polyuridine (pU).

      Our EMSA results show that pU reduces non-specific interactions between AGO2 and the RNAs present, yet duplex release still occurs despite the blocking step. It is therefore less likely that duplex release is caused by surface-bound miR-34a.

      Finally, the observation of distinct duplex release for certain targets, but not for others (e.g. MTA2, which bound tightly to miR-34a-AGO2 but did not exhibit duplex release; see Figure 2), argues against the possibility that the phenomenon was solely due to non-specifically bound RNA releasing from AGO2.

      In response to the reviewer's statement "Since properly loaded miR-34a is never released from AGO2, it is impossible for the miR-34a loaded into AGO2 to form the binary complex (mRNA:miR-34a)", we would like to refer to three papers, De et al. (2013), Jo MH et al. (2015), and Park JH et al. (2017), which have previously reported duplex release and collectively provide considerable evidence that miRNA can be unloaded from AGO in order to promote turnover and recycling of AGO. It is known that AGO recycling must occur; therefore, there must be some mechanism that enables release of miRNA from AGO2. It is possible that AGO recycling proceeds via miRNA degradation (TDMD) in the cell, but in the absence of enzymes responsible for oligouridylation and degradation, the miRNA duplex may be released. As TDMD-competent mRNA targets have been observed to release the miRNA 3' tail from AGO2 (Sheu-Gruttadauria et al., 2019; Willkomm et al., 2022), there is a possible mechanistic similarity between the two processes; however, we do not have sufficient data to make any statement on this.

      (Page 18 and Figure S5) Previous studies (De et al., Jo MH et al., Park JH et al.) reported guide unloading when they incubated a RISC with a fully complementary target. However, neither MTA2, CCND1, CD44, nor NOTCH2 can be perfectly paired with miR-34a (Figure 1A). Therefore, the unloading reported in this study is quite different from the previously reported works and thus cannot be explained by the previously reported logic. The authors need to explain the guide unloading mechanism that they observed. Otherwise, they might misinterpret the results of their EMSA and RABS of the ternary complex.

      The three aforementioned studies have reported unloading/duplex release. However, they did not only report fully complementary targets in this process.

      De et al. (2013) reported that "highly complementary target RNAs promote release of guide RNAs from human Argonaute2".

      Subsequently, Park et al. (2017) reported: "Strikingly, we showed that miRNA destabilization is dramatically enhanced by an interaction with seedless, non-canonical targets."

      A figure extracted from Figure 5 of Park et al. is shown below illustrating the occurrence of unloading in the presence of seed mismatches in positions 2 and 3 (mm 2-3). Jo et al. (2015) also reported that binding lifetime was not affected by the number of base pairs in the RNA duplex.

      In addition to these three reports, a methodology paper focusing on miRNA duplex release was published recently titled "Detection of MicroRNAs Released from Argonautes" (Min et al., 2020).

      Therefore, we do believe that the previously observed microRNA release is similar to our observation. Here we also correlate it to structure and stability of the complex.

      (Page 20) The authors reported, "it is notable that the seed region binding does not appear to be necessary for duplex release." The crystal structures of AGO2 visualize that the seed of the guide RNA is recognized, whereas the rest is not, except for the 3' end captured by the PAZ domain. How do the authors explain the discrepancy?

      In this manuscript, we intend to present our observations of duplex release. There are many potential relationships between duplex release and AGO2 activity, which we do not have data to speculate upon. Previous studies, such as Park et al. (2017), have also observed non-canonical and seedless targets leading to duplex release, supporting our findings. Additionally, McGeary et al. (2019) report 3'-only miRNA targets, Lal et al. (2009) have documented seedless binding by miRNA and its downstream biological effects, and Duan et al. (2022) show that a large number of let-7a targets are regulated through 3′ non-seed pairing.

      It is also possible that duplex release is not coupled to classical repression outcomes, and does not need to proceed by the seed, but instead regulates AGO2 recycling before AGO2 enters the quality control mode of recognising the formed seed.

      (Pages 22) The authors mentioned, "It follows that the structure imparted via direct RNA:RNA interaction remains intact within AGO2, highlighting the role of RNA as the structural determinant." A free guide and a target can start their annealing from any nucleotide position. In contrast, a guide loaded into AGO needs to start annealing with targets through the seed region. Additionally, the Zamore group reported that the loaded guide RNA behaves quite differently from its free state (Wee et al., Cell 2012). How do the authors explain the discrepancy?

      The key point we would like to emphasise is that AGO does not seem to alter the underlying RNA:RNA interactions. The bound state in the ternary complex reflects the structure established in the binary complex. We do not aim to claim a specific sequence of events, as this interpretation is not possible from our equilibrium data. Our data indicate that the protein is flexible enough to accommodate the RNA structure that is favoured in the binary complex. This hypothesis is further supported by our MD simulation, which demonstrates the accommodation of a miRNA-bulge structure within AGO2.

      Targets lacking seeds have been identified previously (McGeary et al. 2019, Park et al. 2017, Lal et al. 2009) and can bind to miRNA within AGO. Therefore, there must be a mechanism by which these targets can anneal within AGO, such as via sequence-independent interactions (as discussed in question 3).

      Wee et al. (2012) studied fly and mouse AGO2 and found considerable differences between the thermodynamic and kinetic properties of the two AGO2 species. Furthermore, they found different average affinities between the two species, with the fly AGO2 binding more tightly than the mouse AGO2. Following this logic, it is not unexpected that human AGO2 would have unique properties compared to those of fly and mouse.

      Below is an extract from Wee et al. (2012):

      "Our KM data and published Argonaute structures (Wang et al., 2009) suggest that 16-17 base pairs form between the guide and the target RNAs, yet the binding affinity of fly Ago2-RISC (KD = 3.7 ± 0.9 pM, mean ± S.D.) and mouse AGO2-RISC (KD = 20 ± 10 pM, mean ± S.D.) for a fully complementary target was comparable to that of a 10 bp RNA:RNA helix. Thus, Argonaute functions to weaken the binding of the 21 nt siRNA to its fully complementary target: without the protein, the siRNA, base paired from positions g2 to g17, is predicted to have a KD ∼3.0 × 10⁻¹¹ pM (ΔG at 25 °C = −30.7 kcal mol⁻¹). Argonaute raises the KD of the 16 bp RNA:RNA hybrid by a factor of > 10¹¹."

      In the Wee et al. (2012) paper, affinity data on mouse and fly AGO2 was collected via filter binding assays, using a phosphorothioate linkage flanked by 2′-O-methyl ribose at positions 10 and 11 of the target to prevent cleavage. They then compared the experimentally determined mean KD and ΔG values for each species to predicted values of an RNA:RNA helix of 16-17 base-pairs. No comparison was made between individual targets, and no experimental data was collected for the RNA:RNA binding. The calculated energy values were made based on a simple helix without taking into account any possible secondary structure features. Considering the different AGO species, alternative experimental setup, modified nucleotides in the tested RNA, and the computationally predicted RNA values compared to the averaged experimental values, we believe there is considerable reason to observe differences compared to our findings.
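
      The predicted helix affinity quoted in the extract can be related to its free energy through the standard relation KD = exp(ΔG/RT), taking a 1 M standard state. The sketch below is an illustrative consistency check only, not part of the analysis in Wee et al. or in the manuscript; the function name and the constants chosen are assumptions made for the example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1
T_KELVIN = 298.15     # 25 degrees Celsius


def kd_from_delta_g(delta_g_kcal, temperature=T_KELVIN):
    """Dissociation constant in mol/L for a binding free energy in kcal/mol.

    Uses KD = exp(dG / RT) with a 1 M standard state.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature))


# Predicted free energy of the g2-g17 RNA:RNA helix quoted from Wee et al. (2012).
kd_helix_pm = kd_from_delta_g(-30.7) * 1e12  # convert mol/L to pM

# Measured fly Ago2-RISC affinity quoted in the same extract (3.7 pM).
fold_weakening = 3.7 / kd_helix_pm

print(f"Predicted helix KD: {kd_helix_pm:.2e} pM")
print(f"Fold weakening by fly Ago2-RISC: {fold_weakening:.2e}")
```

      The result is on the order of 10⁻¹¹ pM for the bare helix and a weakening factor above 10¹¹, in line with the figures in the extract. This makes explicit that the comparison rests on a computed helix energy rather than on a measured RNA:RNA affinity.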

      We have expanded our discussion on page 27 to the following:

      "An earlier examination of mRNA:miRNA binding thermodynamics by Wee and colleagues (2012) found that mouse and fly AGO2 reduce the affinity of a guide RNA for its target61. Our data indicate that the range of miR-34a binary complex affinities is instead constricted by human AGO2 in the ternary complex - strengthening weak binders while weakening strong binders. The 2012 study reported different average affinities between the two AGO2 species, with the fly protein binding more tightly than the mouse protein. Following this logic, it is not unexpected that human AGO2 would have unique properties compared to those of fly and mouse."

      The authors concluded that the range of binary complex affinities is constricted by human AGO2 in the ternary complex - strengthening weak binders while weakening strong binders. This may hold true for miR-34a, but it cannot be generalized. Other miRNAs need to be tested.

      That is true. We have now adjusted the wording to reflect this more clearly, as shown below. Testing of further miRNAs is likely to be the subject of future work from us and others.

      "Our data indicate that the range of miR-34a binary complex affinities is instead constricted by human AGO2 in the ternary complex - strengthening weak binders while weakening strong binders."

      Minor comments:

      (Figure S2) Why was the 34-nt 3'Cy3-labeled miR34a complementary probe shifted up in the presence of AGO?

      We believe this observation is also indicative of duplex release. At the time that these activity assays were collected, we were not as aware of the presence of duplex release so did not test it further, assuming it may be due to transient interactions. We plan to investigate this via EMSA and have included this in the planned revisions (section 2).

      2. (Page 17) Does the Cy3 affect the interaction of the 3' end of miR-34 with AGO2?

      miR-34a-3'Cy5 was used for binary experiments only and the reverse experiment was conducted as a control (where Cy5 was located on the mRNA) (Figure S3b), showing no change in affinity/interaction when the probe was switched to the target. For ternary experiments the mRNA target was labelled on the 5' terminus, to make sure there was no interference with loading miR-34a into AGO2.

      A Cy3 labelled RNA probe (fully complementary to miR-34a) was used to detect miR-34a in northern blots, but AGO2 interaction is not relevant here under denaturing conditions.

      Otherwise, the 34-nt slicing probe had Cy3 on the 5 nt 3' overhang and should therefore not interact with AGO.

      1. Several groups reported that overproduced AGOs loaded endogenous small RNAs. The authors should mention that their purified AGO2 was not as pure as a RISC with miR-34a. Otherwise, readers might think that the authors used a specific RISC.

      We have now improved our explanation of the loading efficiency to make it more clear to the reader that our AGO2 sample was not fully bound by miR-34a, and that all concentrations refer to the miR-34a-loaded portion of AGO2. The following text can be found in the results on page 18:

      "The mRNA:miR-34a-AGO2 assay had a limited titration range, reaching a maximum miR-34a-AGO2 concentration of 268 nM due to a 5% loading efficiency (see Figure S2D for loading efficiency quantification). The total AGO2 concentration was thus 20-fold higher than the miR-34a-loaded portion. Further increase in protein concentration was prevented by precipitation. Weaker mRNA targets (CD44, CCND1, and NOTCH2) did not reach a saturated binding plateau within this range, leading to larger errors in their estimated KD,app values. However, reasonable estimation of the KD,app was possible by monitoring the disappearance of the free mRNA probe. Note that we refer to the miR-34a-loaded portion of AGO2 when discussing concentration values for all titration ranges. To ensure AGO2 binding specificity despite low loading efficiency, a scrambled control was used (SCRall; lacking stable base pairing with miR-34a or other human miRNAs according to the miRBase database57). SCRall showed no interaction with miR-34a-AGO2 (Figure 2B)."

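      The relation between the miR-34a-loaded and total AGO2 concentrations stated in the quoted text can be checked with a short calculation. This is a minimal sketch using only the two figures given there (268 nM loaded AGO2 and a 5% loading efficiency); the function name is illustrative and not taken from the manuscript.

```python
import math


def total_from_loaded(loaded_nm, loading_efficiency):
    """Total protein concentration (nM) given the loaded portion and the loaded fraction."""
    if not 0 < loading_efficiency <= 1:
        raise ValueError("loading_efficiency must be a fraction in (0, 1]")
    return loaded_nm / loading_efficiency


loaded_ago2_nm = 268.0     # maximum miR-34a-AGO2 concentration in the titration
loading_efficiency = 0.05  # 5% of AGO2 carries miR-34a

total_ago2_nm = total_from_loaded(loaded_ago2_nm, loading_efficiency)
fold_excess = total_ago2_nm / loaded_ago2_nm

print(f"Total AGO2: {total_ago2_nm:.0f} nM ({total_ago2_nm / 1000:.2f} uM)")
print(f"Total AGO2 is {fold_excess:.0f}-fold higher than the loaded portion")
```

      The total AGO2 concentration at the top of the titration is therefore in the low micromolar range, which is relevant when judging the precipitation limit and the need for a blocking agent such as pU.
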
      (Figure legend of Figure S5) Binding was assessed "by."

      Thank you for pointing this out, it is now fixed.

      (Page 17) It would be great if the authors could even briefly describe the mechanism by which the sodium phosphate buffer with magnesium does not disturb weaker interactions by citing reference papers.

      We have now added a supplementary methods section to our manuscript and included the description below on page 10:

      "We found that a more traditional Tris-borate-EDTA (TBE) buffer disrupted weaker RNA:RNA binding interactions (Supplementary Methods Figure M1). Borate anions form stable adducts with carbohydrate hydroxyl groups (James et al., 1996) and can form complexes with nucleic acids, likely through amino groups in nucleic bases or oxygen in phosphate groups (Stellwagen et al., 2000). This makes TBE unsuitable for assessment of RNA binding, particularly involving small RNA molecules, which typically have weaker affinities. We therefore adapted our buffer system to a sodium phosphate buffer supplemented with magnesium. Magnesium acts as a counterion to reduce electrostatic repulsion between the two negatively charged backbones by neutralisation (Misra et al., 1998)."

      We have also clarified the buffer adaptations in our results section on page 17:

      The protocol is loosely based on Bak et al. (2014)36, with major differences being use of a sodium phosphate buffering system so as not to disturb weaker interactions (James et al., 1996; Stellwagen et al., 2000), supplemented with Mg2+ as a counterion to reduce electrostatic repulsion between the two negatively charged RNAs (Misra & Draper, 1998), and fluorescently labelled probes. Original gel images and quantification are shown in supplementary Figures S3 and S4. All KD,app values are shown in Supplementary Table 1, and represent the mean of three independent replicates.

      Figure M1. Comparison of Tris-borate EDTA (TBE) and sodium phosphate with magnesium (NaP-Mg2+) buffer systems for EMSA. Cy5-labelled miR-34a and unlabelled CD44 were equilibrated in the two different buffer systems, using the same titration range. No mobility shifts were observed in the TBE system, while clear binding shifts were observed in the NaP-Mg2+ system.

      6.(Page 22) The authors cited Figure 4C in the sentence, "Comparison between CIS and TRANS ..." Is this supposed to be Figure 4D?

      The reviewer was correct in their assumption, and this has now been corrected.

      7.(Figure 6) Readers would appreciate it if the guide and target were colored in red and blue. The color codes have been used in most papers reporting AGO structures. The current color codes are opposite.

      We have now adjusted the colour schemes throughout the manuscript, and Figure 6 has been modified to the following:

      "Figure 6. The miRNA-bulge structure is readily accommodated by AGO2 as shown by molecular dynamics simulation. Panel (A) displays a snapshot of the all-atom MD simulation of miR-34a (red) and NOTCH1 (blue) in AGO2. The NOTCH1:miR-34a duplex is shown with AGO2 removed for clarity and is rotated 90° to show the miRNA bulge and bend in the duplex. This NOTCH1:miR-34a-AGO2 structure is compared with (B), which shows the crystal structure of miR-122 (orange) paired with its target (purple) via the seed and four nucleotides in the supplementary region (PDB-ID 6N4O; ref. 17), and (C), which shows the crystal structure of miR-122 (orange) and its target (green) with extended 3' pairing, necessary for the TDMD-competent state (PDB-ID 6NIT; ref. 19). AGO2 is depicted in grey, with the PAZ domain in green, and the N-terminal domain marked with N. The miRNA duplexes in (B) and (C) feature symmetrical 4-nucleotide internal loops, whereas the NOTCH1 structure in (A) has an asymmetrical miRNA bulge with five unpaired nucleotides on the miRNA side and a 3-nucleotide asymmetry."

      Significance

      This paper will have a significant impact on the field if seed-unpaired targets can indeed unload guide RNAs. The authors may want to validate their results very carefully.

      We thank the reviewer for recognising the significance of duplex release (or guide unloading) from AGO2. We agree that the observations should be tested rigorously and have outlined the actions we took to ensure validity in our AGO2 preparation.

      Reviewer #3

      Evidence, reproducibility and clarity (Required):

      In this manuscript, the authors use a combination of biochemical, biophysical, and computational approaches to investigate the structure-function relationship of miRNA binding sites. Interestingly, they find that AGO2 weakens tight RNA:RNA binding interactions, and strengthens weaker interactions.

      Given this antagonistic role, I wonder: shouldn't there be an 'average' final binding affinity? Furthermore, if I understand correctly, not many trends were observed to correlate binding affinity with repression, etc.

      Overall, there was no 'average' final binding affinity observed, as the binary assays had a much higher maximum (the NOTCH2 binary affinity was within the micromolar range), skewing the mean of the binary affinities to 657 nM, versus 111 nM for the ternary affinities. We also compared the variances of the binary and ternary affinity datasets using the F-test and found that F > F(critical one-tail); thus the variances of the two populations are unequal (the binary variance is significantly larger than the ternary variance).

      F-Test Two-Sample for Variances

                              binary affinity    ternary affinity
      Mean                    657.3              110.971667
      Variance                2971596.1          24406.4012
      Observations            12                 12
      df                      11                 11
      F                       121.754784
      P(F<=f) one-tail        7.559E-10
      F(critical one-tail)    2.81793047
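
      The F statistic reported in the table is the ratio of the two sample variances, so it can be checked by hand. The following is a minimal sketch in Python that uses only the values reported above; the variable names are illustrative, and the critical value is copied from the table rather than recomputed.

```python
# Values copied from the F-test table (KD,app in nM).
binary_variance = 2971596.1
ternary_variance = 24406.4012
f_critical_one_tail = 2.81793047  # as reported for df = 11 and 11

# Ratio of the binary to the ternary variance (here, larger over smaller).
f_statistic = binary_variance / ternary_variance

# Equal variances are rejected when F exceeds the one-tailed critical value.
variances_differ = f_statistic > f_critical_one_tail

print(f"F = {f_statistic:.2f}, variances differ: {variances_differ}")
```

      The ratio reproduces the tabulated F value to within the rounding of the reported variances.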

      We agree that the overall correlation between affinity and repression was not strong, although we found a stronger correlation within the miRNA-bulge group (Figure 5C and S7C). A larger sample size of miRNA bulge-forming duplexes would be needed to test the generalizability of this observation.

      Given the context of the study - whereby structure is being investigated as a contributing factor to the interaction between the miRNA and mRNA, I find it interesting that the authors chose to use MC-fold to predict the structures of the mRNA, rather than using an experimental approach to assess / validate the structures. Thirty-seven RNAs were assessed; I think even for a subset (the 12 that were focused on in the study), the secondary structure should be validated experimentally (e.g., by chemical probing experiments, which the research group has demonstrated expertise in over the last several years). The validation should follow the in silico folding approach used to narrow down the region of interest. It is necessary to know whether an energy barrier (associated with the mRNA unfolding) has to occur prior to miRNA binding; this could help explain some of the unexplained results in the study. Indeed, the authors mention that there are many variables that influence miRNA regulation.

      Indeed, experimentally validated structures offer valuable insights that cannot be obtained solely through sequence-based predictions. This is why we opted to employ our RABS method to experimentally evaluate the binary and ternary complex binding of our 12 selected targets (as depicted in Figures 4 and S9 and discussed in the text on pages 23-24). While we (in silico) assessed all 37 RNA targets that were experimentally confirmed at the time, selecting 12 to represent both biological and predicted structural diversity, it would have been impractical to experimentally pre-assess all the targets not included in the final selection. Our in-silico assessment was designed to narrow down the regions of interest and evaluate predicted secondary structures present. The pipeline is shown in Figure 1. Details of the code used in the in-silico analysis are provided in Supplementary File 1.

      Regarding the energy of unfolding of mRNA, our constructs considered the isolated binding sites; thus the effects of surrounding mRNA interactions were removed. We compared our affinities to ΔG as well as MFE and have now included this analysis in Figure S8A. Additionally, we have included the following text on pages 27-28 of the discussion:

      "Gibbs free energy (ΔG), which is often included in targeting prediction models as a measure of stability of the miRNA:mRNA pair12,62, correlated with the log of our binary KD,app values, using ΔG values predicted by RNAcofold (R2 = 0.61). There was a weaker correlation with the free energy values derived from the minimum free energy (MFE) structures predicted by RNAcofold (R2 = 0.41) (Figure S8A). This result highlights the contribution of unfolding (in ΔG) as being important in predicting KD. The differences between ΔG and KD,app are likely primarily due to inaccurately predicted structures used for energy calculations."

      Additionally, we assessed the free form of all mRNA targets via RABS (Figure S9) and observed that the seed of each free mRNA was available for miRNA binding (seeds of the free mRNA were not stably bound).
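
      The comparison of ΔG with the log of KD,app quoted above rests on the standard relation ΔG° = RT ln(KD), with KD expressed relative to a 1 M standard state. The following is a minimal sketch of that conversion; the function name is hypothetical, and the default temperature of 25 °C is an assumption, since the assay temperature is not stated here.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)

def binding_free_energy(kd_nanomolar, temperature_kelvin=298.15):
    """Return the standard binding free energy (kcal/mol) for a given KD.

    Assumes a 1 M standard state; the 298.15 K default is an assumption.
    """
    kd_molar = kd_nanomolar * 1e-9
    return GAS_CONSTANT_KCAL * temperature_kelvin * math.log(kd_molar)

# Example input: a KD,app of 165 nM (the CD44 direct-assay value in this response).
print(round(binding_free_energy(165.0), 2))
```

      Under these assumptions each tenfold decrease in KD corresponds to roughly 1.4 kcal/mol, which is why a linear relationship is sought between ΔG and log KD,app rather than KD,app itself.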

      Finally, when designing our luciferase plasmids we used RNAstructure (Reuter & Mathews, 2010) to check for self-folding effects which could interfere with target site binding and ensured that all plasmids were void of such effects.

      In the methods, T7 is italicized by accident in the T7 in vitro transcription section. Bacmid is sometimes written with a capital B and other times with a lower-cased b. The authors should be consistent. The concentration of TEV protease that was added (as opposed to the volume) should be described for reproducibility.

      Thank you for pointing out these overlooked points. They have now been corrected.

      In figure S2D, what is the second species in the gel on the right-hand side of the gel in the miR-34a:AGO lanes? The authors should mention this.

      We believe that the faint upper band corresponds to other longer RNA species loaded into AGO2. As AGO2 is loaded with a diversity of RNA species, it is likely that some of them may have a weak affinity for the miR-34a-complementary probe, and therefore show up on the northern blot.

      Figure S3B and S3A are referenced out of order in the text. In regard to S3A, what are the anticipated or hypothesized alternative conformations for NOTCH1, DLL1, and MTA2? There are really interesting things going on in the gels, also for HNF4a and NOTCH2. Can the authors offer some explanation for why the free RNA bands don't seem to disappear, but rather migrate slowly? Is this a new species?

      The order of the figure references has now been updated; thank you for alerting us to this.

      Figure S3A: For MTA2, the two alternative conformations are shown in Figure S9 and S10 (and shown below here, miR-34a seed marked in pink). It appears that a single conformation is favoured at high concentration (> 1 µM) while the two conformations are present at ≤ 1 µM. The RABS data for MTA2 also indicated multiple binding conformations, as the reactivity traces were inconsistent. We expect that the conformation shown on the left was most dominant within AGO2, based on the reactivity of the TRANS + AGO assays. However, we cannot exclude a possible G-quadruplex formation due to the high G content of MTA2 (shown below right).

      Regarding NOTCH1 and DLL1, a faint fluorescent shadow was observed beneath the miR-34a bound band. The RABS reactivity traces indicated a single dominant conformation for these targets, so it is possible that the lower shadow observed was due to more subtle differences in conformation, such as the opening/closing of one or a few base pairs at the terminus or bulge (i.e. end fraying). HNF4α and NOTCH2 appear to never fully saturate the miR-34a, so a small unbound population remains visible on the gel. For NOTCH2 this free miR-34a band appears to migrate upwards, possibly due to overloading the gel lane with excess NOTCH2 (which is not observed in the Cy5 fluorescence image).

      In the EMSA for Perfect, why does the band intensity for the bound complex increase then decrease? How many replicates were run for this? This needs to be reconciled.

      As for all EMSAs, three replicates were carried out for each mRNA target and all gels are shown in Supplementary Files 2 and 3, for the binary and ternary assays respectively.

      Uneven heat distribution across the gel can lead to bleaching of the Cy5 fluorophore. To address this, we used a circulating cooler in our electrophoresis tank, as outlined in our methods (page 10). However, the aforementioned gel for one of the PERFECT sample replicates appears to have been unevenly cooled. As the binding ratio (rather than total band volume) was used for quantification, the binding curve was unaffected, and this did not influence KD,app.
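
      The argument above relies on quantifying the fraction bound as a ratio within each lane. The following is a minimal sketch with made-up band volumes (not data from the gels), showing that a uniform loss of signal in a lane leaves the ratio unchanged.

```python
def fraction_bound(bound_volume, free_volume):
    """Fraction of labelled probe in the shifted (bound) band of one lane."""
    return bound_volume / (bound_volume + free_volume)

# Hypothetical band volumes for a single lane (arbitrary units).
bound, free = 600.0, 400.0

# Model bleaching as the same fractional signal loss for both bands.
bleach_factor = 0.5
original = fraction_bound(bound, free)
bleached = fraction_bound(bound * bleach_factor, free * bleach_factor)

print(original, bleached)
```

      The sketch assumes that bleaching affects the bound and free bands of a lane equally; if the two bands were bleached to different extents, the ratio would shift.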

      We have now replaced the exemplary gel for PERFECT in Figure S3 with a more representative and evenly labelled gel from our replicates (Cy5 fluorescence image shown below). The binding curve for PERFECT is also shown here:

      The authors list that the RNA concentration was held constant at 10 nM; in EMSAs, the RNA concentration should be less than the binding affinity; what is the lowest concentration of protein used in the assays shown in S3A? Is this a serial dilution? It seems to me like the binding assays for MTA2, Perfect, and SCRseed might have too high of an RNA concentration. (Actually, now I see in the supplement the concentrations of proteins, and the RNA concentration is too high). Also, why is the intensity of bands for bound complex for SCRseed more intense than the free RNA?

      Why are the binding affinity error bars so large (e.g., for NOTCH2 with mir-34a) - 6 uM +/- 3 uM?

      No protein was used in the binary assays shown in Figure S3A. For the ternary assays in Figure S4, the maximum concentration of miR-34a-loaded AGO2 (miR-34a-AGO2) was 268 nM, with a serial dilution down to a minimum of 0.06 nM.

      Optimal EMSA conditions require a constant RNA concentration that is lower than the binding affinity to accurately estimate high-affinity interactions.

      For our tightest binders, such as SIRT1, we can confidently state that the KD,app is less than 10 nM, estimated at 0.4 ± 1.1 nM. Because this value lies below the fixed RNA concentration, the accuracy of the estimate is reduced, and the standard deviation is larger than the estimated KD,app. As NOTCH2 bound miR-34a very weakly and did not reach a fully bound plateau, the resulting high error was expected. Consequently, we do not have the same level of certainty for extremely tight or weak binders. In this study, the relative affinities were of primary importance.

      We have included on page 18:

      As the Cy5-miR-34a concentration was fixed to 10 nM to give sufficient signal during detection, KD,app values below 10 nM have a lower confidence.
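
      The reduced confidence below 10 nM can be illustrated with the standard quadratic solution for 1:1 binding with depletion of the titrant. This is a minimal sketch under the assumption of simple 1:1 binding; the function name and the example KD values are illustrative, not fitted results.

```python
import math

def fraction_probe_bound(probe_total, titrant_total, kd):
    """Fraction of labelled probe bound for 1:1 binding with depletion.

    All concentrations share one unit (nM here). The complex concentration
    is the smaller root of the binding quadratic, divided by total probe.
    """
    total = probe_total + titrant_total + kd
    discriminant = total**2 - 4.0 * probe_total * titrant_total
    complex_conc = (total - math.sqrt(discriminant)) / 2.0
    return complex_conc / probe_total

PROBE_NM = 10.0  # fixed Cy5-miR-34a concentration stated in the text

# Half-saturation occurs at a total titrant concentration of KD + probe/2.
for kd in (0.1, 0.4, 1.5):
    midpoint = kd + PROBE_NM / 2.0
    bound = fraction_probe_bound(PROBE_NM, midpoint, kd)
    print(kd, midpoint, round(bound, 3))
```

      With a 10 nM probe, a fifteen-fold range of KD values shifts the midpoint of the titration only from 5.1 to 6.5 nM, so sub-10 nM affinities are difficult to distinguish from one another.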

      Regarding the control samples PERFECT and SCRseed, our focus was not on determining the exact KD,app of these artificial constructs. Instead, we were primarily interested in whether they exhibited binding and under which conditions. For SCRseed, we neither adjusted the titration range nor calculated KD,app. For PERFECT, the concentration was adjusted to a lower range of 30 nM - 0.001 nM to give a relative comparison with the other tight binder SIRT1. However, further reduction in RNA concentration was not pursued, as it already fell well below the 10 nM sensitivity threshold.

      Regarding the intensity of the bound SCRseed band, we observed that the bound fluorophore often resulted in stronger intensity than for the free probe. This was observed for a number of the samples (PERFECT, BCL2, SCRseed). A previous publication reported that Cy5 fluorescence is sequence dependent in DNA, that the effect is more sensitive to double-stranded DNA, and that the fluorophore is sensitive to the surrounding 5 base pairs (Kretschy, Sack and Somoza, 2016). It is likely that the same phenomenon exists in RNA.

      For MTA2, the two alternative conformations (shown in Figure S9 and S10) make assessment of KD,app more difficult. As the higher affinity conformation did not reach a fully-bound plateau before the weaker affinity conformation appeared, the binding curve plateau (where all miR-34a was bound) reflected the weaker conformation KD,app. We increased the range of titration tested by using a three-fold serial dilution, but further reduction in RNA concentration would not have been fruitful as it already dropped well below the 10 nM sensitivity range. Therefore the MTA2 binary complex had a higher error (944 ± 274 nM) and lower confidence.

      We then decided to run a competition assay to detect the weaker KD,app of MTA2. The assay was set up using the known binding affinity of CD44, which was labelled with Cy5 to track the reaction. MTA2 was titrated against a constant concentration of Cy5-CD44:miR-34a, and disruption of the CD44 and miR-34a binding was monitored. We fitted the data to a quadratic for competitive binding (Cheng and Prusoff, 1973) to calculate the KD,app for competitive binding, or KC,app.
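
      For orientation, the classical Cheng-Prusoff relation converts a competitor's IC50 into an inhibition constant, Ki = IC50 / (1 + [L]/KD). The sketch below implements only this simple relation, not the quadratic fit applied to the data, and it assumes negligible depletion of the labelled partner; the function name and numerical inputs are hypothetical.

```python
def cheng_prusoff(ic50, labelled_conc, labelled_kd):
    """Competitor affinity from an IC50 via the Cheng-Prusoff relation.

    All arguments must share one concentration unit (nM here).
    """
    return ic50 / (1.0 + labelled_conc / labelled_kd)

# Hypothetical example: an IC50 of 400 nM measured with the labelled
# partner present at a concentration equal to its own KD (165 nM).
print(cheng_prusoff(ic50=400.0, labelled_conc=165.0, labelled_kd=165.0))
```

      When the labelled partner is present at its KD, the competitor's affinity is half of the measured IC50, which shows why the labelled concentration has to be accounted for when comparing competition and direct assays.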

      We validated our competition assay by comparing it with our direct binding assays, specifically assessing CD44 in a self-competition assay. The CD44 KC,app (168 ± 24 nM; mean and SD of three replicates) was found to be consistent with the KD,app obtained from the direct assay (165 ± 21 nM).

      As we wanted all affinity data to be directly comparable (using the same methodology), we compared the KD,app values obtained via direct assay in the manuscript. It appears that the competitive EMSA assay for MTA2 reflects the weaker affinity conformation observed in the direct assay.

      It would be very helpful if the authors wrote in the Kds in Figure 2A in green and blue (in the extra space in the plots). This would help the reader to better understand what's going on, and for me, as a reviewer, to better consider the analysis/conclusions presented by the authors.

      KD,app values are now written in green and blue in what is now Figure 2D (originally Figure 2A).

      The authors state on page 18 that 'Interestingly, however, we did not observe a correlation between binary or ternary complex affinity and seed type.' They should elaborate on why this is interesting.

      The prevailing view is that the miRNA seed type significantly influences affinity within AGO2. The largest biochemical studies of miRNA-target interactions to date, conducted by McGeary et al. (2019, 2022), used AGO-RBNS (RNA Bind-n-Seq) to reveal relative binding affinities. These studies demonstrated strong correlations between the canonical seed types and binding affinity. Therefore, we find it interesting that no such correlation was observed in our dataset (despite its small size).

      We have now added to the manuscript (page 20):

      "The largest biochemical studies of miRNA-target interactions to date (McGeary et al., 2019, 2022) used AGO-RBNS (RNA Bind-n-Seq) to extract relative binding affinities, demonstrating strong correlations between the canonical seed types and binding affinity. Therefore, it is intriguing that our dataset, despite its small size, showed no such correlation."

      Figure 2C is not referenced in the text (the authors should go back through the text to make sure everything is referenced and in order). The Kds should be listed alongside the gels in Figure 2C.

      Figure 2 has now been rearranged and updated, with KD,app values listed in what is now Figure 2D.

      Figure 3B is rather confusing to understand.

      We have now adapted Figure 3 to simplify readability. Panel B has now been moved to C, and we have introduced panel A (moved from Figure 2B). In Figure 3C (originally 3B) we have added arrows to indicate the direction of affinity change from binary to ternary complex, and moved the duplex release information to panel A. We thank the reviewer and think that the data is now much clearer.

      Figure 3. AGO2 moderates affinity by strengthening weak binders and weakening strong binders. (A) Correlation of relative mRNA:miR-34a with mRNA:miR-34a-AGO2 binding affinities. No seed type correlation is observed. Seeds are coloured as follows: 8mer is pink, 7mer-m8 is turquoise, and 7mer-A1 is mauve. The slope of the linear fit is 0.48, and the intercept on the (log y)-axis is 7.11. The occurrence of miRNA duplex release from AGO2 is marked with diamonds. (B) miR-34a-mediated repression of dual luciferase reporters fused to the 12 mRNA targeting sites. Luciferase activity from HEK293T cells co-transfected with each reporter construct and miR-34a was measured 24 hours following transfection and normalised to the miR-34a-negative transfection control. Each datapoint represents the R/F ratio for an independent experiment (n=3) with standard deviations indicated. SCRseed is a scrambled seed control, SCRall is a fully scrambled control, and PERFECT is the perfect complement of miR-34a. Dotted horizontal lines represent the repression values for the 22-nucleotide seed-only controls6 for the respective seed types, in the absence of any other WC base pairing. (C) Comparison of relative target repression with relative affinity assessed by EMSA. Blue represents mRNA:miR-34a affinity (binary complex), while green represents mRNA:miR-34a-AGO2 affinity (ternary complex). Arrows indicate the direction of change in affinity upon binding within AGO2 compared to the binary complex. It is seen that AGO2 moderates affinity bi-directionally by strengthening weak binders and weakening strong binders.

      Page 20: Perfect should be italicized.

      Thank you for bringing this to our attention; this has now been adjusted.

      Have the authors considered using NMR to assess the base pair pattern formed between the miRNA:mRNA complexes (with / without AGO)? As a validation for results obtained by RABS? This could be helpful for the Asymmetric target binding section, the Ago increases flexibility section, and the three distinct structural groups section in the results. It is widely accepted that while chemical probing is insightful, results should be validated using alternative approaches. Distinguishing structural changes and protected reactivity in the presence of protein is challenging.

      NMR provides high-resolution information on RNA base-pairing patterns, allowing us to compare our RABS results for SIRT1 with those obtained via NMR (Banijamali et al., 2022) for the binary complex. For SIRT1, the RNA:RNA structures identified were consistent between both methods. However, using NMR to measure RNA:RNA binding within AGO2 is challenging due to the protein's large size. Currently, there are no published complete NMR structures of RNA within AGO2. The largest solution-state NMR structures published that include AGO consist solely of the PAZ domain. Our group has been working on method development using DNP-enhanced solid-state NMR to obtain structural information within the complete AGO2 protein, but the current resolution does not allow us to fully reconstruct a complete NMR structure. We hope that in the coming years, this will be a method to evaluate RNA within AGO. This limitation highlights the advantage of RABS in providing RNA base-pairing information within the ternary complex in solution.

      Reviewer #3 (Significance (Required)):

      The work is helpful for understanding how microRNAs recognize and bind their mRNA targets, and the impact Ago has on this interaction. I think for therapeutic studies, this will be helpful for structure-based design. Especially given the three types of structures identified to be a part of the interaction.

      We thank the reviewer for their detailed remarks, especially concerning the importance of the technical details of the binding assays. We further thank the reviewer for recognising the potential impact of our work for rational design.

      4. Description of analyses that authors prefer not to carry out

      In response to Reviewer 2 - major comment 1, we prefer to not run an additional ion exchange purification on the AGO2 protein due to the reasoning discussed above, which is repeated here:

      Thank you for mentioning this crucial point, which has been a focus of our controls. We have addressed this point in four ways:

      1. Salt wash during reverse IMAC purification.
      2. Separation of unbound RNA and proteins via SEC.
      3. Blocking non-specific interactions using polyuridine.
      4. Observing both the presence and absence of duplex release among different targets using the same AGO2 preparation and conditions.

      Firstly, although we did not use a specific ion exchange column for purification, we believe the ionic strength used in our IMAC wash step was sufficient to remove non-specific interactions. We used a linear gradient of buffer A (50 mM Tris-HCl, 300 mM NaCl, 10 mM Imidazole, 1 mM TCEP, 5% glycerol v/v) and buffer B (50 mM Tris-HCl, 500 mM NaCl, 300 mM Imidazole, 1 mM TCEP, 5% glycerol) at pH 8. The protocol followed the recommendation by Bio-Rad for their Profinity IMAC resins, which states that 300 mM NaCl should be included in buffers to deter nonspecific protein binding due to ionic interactions. The protein itself has a higher affinity for the resin than nucleic acids.

      A commonly used protocol for RISC purification follows the method by Flores-Jasso et al. (RNA 2013). Here, the authors use ion exchange chromatography to remove competitor oligonucleotides. After loading, they washed the column with lysis buffer (30 mM HEPES-KOH at pH 7.4, 100 mM potassium acetate, 2 mM magnesium acetate and 2 mM DTT). AGO was eluted with lysis buffer containing 500 mM potassium acetate. Competing oligonucleotides were eluted in the wash.

      As ionic strength is independent of the identity or chemical nature of the ions involved (Jeremy M. Berg, John L. Tymoczko, Gregory J. Gatto Jr., Biochemistry 2015), we reasoned that our Tris-HCl/NaCl/imidazole buffer wash should have a comparable ionic strength to the Flores-Jasso protocol.

      Our total ionic contributions were: 500 mM Na+, 550 mM Cl-, 50 mM Tris and 300 mM imidazole. We recognise that Tris and imidazole are both partially ionized according to the pH of the buffer (pH 8) and their respective pKa values, but even if only considering the sodium and chloride it should be comparable to the Flores-Jasso protocol.
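
      The comparison can be made explicit with the usual definition of ionic strength, I = 0.5 Σ c_i z_i². The following is a minimal sketch that counts only the fully ionised salts; the partially ionised buffer components (Tris, imidazole, HEPES) and the 2 mM magnesium acetate are left out, and representing the Flores-Jasso elution buffer by 500 mM potassium acetate alone is a simplifying assumption.

```python
def ionic_strength(ions):
    """Ionic strength in mol/L from (molar concentration, charge) pairs."""
    return 0.5 * sum(conc * charge**2 for conc, charge in ions)

# Fully ionised species only; buffer components are deliberately ignored.
imac_buffer_b = [(0.500, +1), (0.550, -1)]         # Na+, Cl-
flores_jasso_elution = [(0.500, +1), (0.500, -1)]  # K+, acetate

print(ionic_strength(imac_buffer_b))
print(ionic_strength(flores_jasso_elution))
```

      On this simplified accounting the two buffers come out at about 0.53 M and 0.50 M, in line with the reasoning that they are comparable in ionic strength.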

      Secondly, after reverse HisTrap purification, AGO2 was run through size exclusion chromatography to remove any remaining impurities (shown in Figure S2B).

      Thirdly, knowing that AGO2 has many positively charged surface patches and can bind nucleic acid nonspecifically (Nakanishi, 2022; O'Geen et al., 2018), we tested various blocking backgrounds to eliminate nonspecific binding effects in our EMSA ternary binding assays. We were able to address this issue by adding either non-homogenous RNA extract or homogenous polyuridine (pU) in our EMSA buffer during equilibration background experiments. This allowed us to eliminate non-specific binding of our target mRNAs, as shown previously in Supplementary Figure S6. We appreciate that the reviewer finds this technical detail important and have moved the panel C of figure S6 into the main results in Figure 2C, to highlight the novel conditions used and important controls needed to be performed. If miR-34a were non-specifically bound to the surface of AGO2 after washing, this blocking step would render any impact of surface-bound miR-34a negligible due to the excess of competing polyuridine (pU).

      Our EMSA results show that polyuridine reduces non-specific interactions between AGO2 and the RNAs present, yet duplex release still occurs despite the blocking step. It is therefore less likely that duplex release is caused by surface-bound miR-34a.

      Finally, the observation of distinct duplex release for certain targets, but not for others (e.g. MTA2, which bound tightly to miR-34a-AGO2 but did not exhibit duplex release; see Figure 2), argues against the possibility that the phenomenon was solely due to non-specifically bound RNA releasing from AGO2.

      In response to the reviewer's statement "Since properly loaded miR-34a is never released from AGO2, it is impossible for the miR-34a loaded into AGO2 to form the binary complex (mRNA:miR-34a)" we would like to refer to three papers, De et al. (2013), Jo MH et al. (2015), and Park JH et al. (2017), which have previously reported duplex release and collectively provide considerable evidence that miRNA can be unloaded from AGO in order to promote turnover and recycling of AGO. It is known that AGO recycling must occur; therefore there must be some mechanism that enables release of miRNA from AGO2. It is possible that AGO recycling proceeds via miRNA degradation (TDMD) in the cell, but in the absence of enzymes responsible for oligouridylation and degradation, the miRNA duplex may be released. As TDMD-competent mRNA targets have been observed to release the miRNA 3' tail from AGO2 (Sheu-Gruttadauria et al., 2019; Willkomm et al., 2022), there is a possible mechanistic similarity between the two processes; however, we do not have sufficient data to make any statement on this.

    1. Reviewer #3 (Public Review):

      Wu et al. present cryo-EM structures of the potassium channel Kv1.2 in open, C-type inactivated, toxin-blocked and presumably sodium-bound states at 3.2 Å, 2.5 Å, 2.8 Å, and 2.9 Å. The work builds on a large body of structural work on Kv1.2 and related voltage-gated potassium channels. The manuscript presents a plethora of structural work, and the authors are commended on the breadth of the studies. The structural studies are well-executed. Although the findings are mostly confirmatory, they do add to the body of work on this and related channels. Notably, the authors present structures of DTx-bound Kv1.2 and of Kv1.2 in a low concentration of potassium (which may contain sodium ions bound within the selectivity filter). These two structures add considerable new information. The DTx structure has been markedly improved in the revised version and the authors arrive at well-founded conclusions regarding its mechanism of block. Regarding the Na+ structure, the authors claim that the structure with sodium has "zero" potassium - I caution them against making this claim. It is likely that some K+ persists in their sample and that some of the density in the "zero potassium" structure may be due to K+ rather than Na+. This can be clarified by revisions to the text and discussion. I do not think that any additional experiments are needed. Overall, the manuscript is well-written, a nice addition to the field, and a crowning achievement for the Sigworth lab.

      Most of this reviewer's initial comments have been addressed in the revised manuscript. Some comments remain that could be addressed by revisions of the text.

      Specific comments on the revised version:
      Quotations indicate text in the manuscript.

      (1) "While the VSD helices in Kv1.2s and the inactivated Kv1.2s-W17'F superimpose very well at the top (including the S4-S5 interface described above), there is a general twist of the helix bundle that yields an overall rotation of about 3° at the bottom of the VSD."

      Comment: This seemed a bit confusing. I assume the authors aligned the complete structures - the differences they indicate seem to be slight VSD repositioning relative to the pore rather than differences between the VSD conformations themselves. The authors may wish to clarify. As they point out in the subsequent paragraph, the VSDs are known to be loosely associated with the pore.

      (2) Comment: The modeling of DTx into the density is a major improvement in the revision. Figure 3 displays some interactions between the toxin and Kv1.2 - additional side views of the toxin and the channel might allow the reader to appreciate the interactions more fully. The overall fit of the toxin structure into the density is somewhat difficult to assess from the figure. (The authors might consider using ChimeraX to display density and model in this figure.)

      (3) "We obtained the structure of Kv1.2s in a zero K+ solution, with all potassium replaced with sodium, and were surprised to find that it is little changed from the K+ bound structure, with an essentially identical selectivity filter conformation (Figure 4B and Figure 4-figure supplement 1)."

      Comment: It should be noted in the manuscript that K+ and Na+ ions cannot be distinguished by the cryo-EM studies - the densities are indistinguishable. The authors are inferring that the observed density corresponds to Na+ because the protein was exchanged from K+ into Na+ on a gel filtration (SEC) column. It is likely that a small amount of K+ remains in the protein sample following SEC. I caution the authors against claiming that there is zero K+ in solution without measuring the K+ content of the protein sample. Additionally, it should be considered that K+ may be present in the blotting paper used for cryo-EM grid preparation (our laboratory has noted, for example, a substantial amount of Ca2+ in blotting paper). The affinity of Kv1.2 for K+ has not been determined, to my knowledge - the authors note in the Discussion that the Shaker channel has "tight" binding for K+. It seems possible that some portion of the density in the selectivity filter could be due to residual K+. This caveat should be clearly stated in the main text and discussion. More extensive exchange into Na+, such as performing the entire protein purification in NaCl, or by dialysis (as performed for obtaining the structure of KcsA in low K+ by Y. Zhou et al. & Mackinnon 2001), would provide more convincing removal of K+, but I suspect that the Kv1.2 protein would not have sufficient biochemical stability without K+ to endure this treatment. One might argue that reduced biochemical stability in NaCl could be an indication that there was a meaningful amount of K+ in the final sample used for cryo-EM (or in the particles that were selected to yield the final high-resolution structure).

      (4) Referring to the structure obtained in NaCl: "The ion occupancy is also similar, and we presume that Kv1.2 is a conducting channel in sodium solution."

      Comment: Stating that "Kv1.2 is a conducting channel in sodium solution" and implying that conduction of Na+ is achieved by an analogous distribution of ion binding sites as observed for K+ are strong statements to make - and not justified by the experiments provided. Electrophysiology would be required to demonstrate that the channel conducts sodium in the absence of K+. More complete ionic exchange, better control of the ionic conditions (Na+ vs K+), and affinity measurements for K+ would be needed to determine the distribution of Na+ in the filter (as mentioned above). At minimum, the authors should revise the statement "we presume that Kv1.2 is a conducting channel in sodium solution" and clarify its intended meaning. As mentioned above, it seems possible/likely that a portion of the density in the filter may be due to K+.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      In this manuscript by Wu et al., the authors present the high-resolution cryoEM structures of the WT Kv1.2 voltage-gated potassium channel. Along with this structure, the authors have solved several structures of mutants or experimental conditions relevant to the slow inactivation process that these channels undergo and which is not yet completely understood. 

      One of the main findings is the determination of the structure of a mutant (W366F) that is thought to correspond to the slow inactivated state. These experiments confirm results in similar mutants in different channels from Kv1.2 that indicate that inactivation is associated with an enlarged selectivity filter. 

      Another interesting structure is the complex of Kv1.2 with the pore-blocking toxin Dendrotoxin 1. The results show that the mechanism of the block is different from similar toxins, in which a lysine residue penetrates the pore deep enough to empty most external potassium binding sites. 

      The quality of the structural data presented in this manuscript is very high and allows for the unambiguous assignment of side chains. The conclusions are supported by the data. This is an important contribution that should further our understanding of voltage-dependent potassium channel gating. Specific comments are appended below.

      (1) In the main text's reference to Figure 2d, residues W18' and S22' are mentioned but are not labeled in the insets.

      Now labeled in Fig. 2D

      (2) On page 8 there is a discussion of how the two remaining K+ ions in binding sites S3 and S4 prevent K+ permeation in molecular dynamics. However, in Shaker, inactivated W434F channels can sporadically allow K+ permeation with normal single-channel conductance but very reduced open times and open probability at not very high voltages.

      Addressed in the Discussion, lines 480-490.

      (3) The structures of WT in the absence of K+ show a narrower selectivity filter, however, Figure 4 does not convey this finding. In fact, the structure in Figure 4B is constructed at such an angle that it looks as if the carbonyl distances are increased, perhaps this should be fixed. Also, it is not clear how the distances between carbonyls given in the text on page 12 are measured. Is it between adjacent or kitty-corner subunits? 

      We decided to remove mention of carbonyl distances, because at our resolutions the atoms are not resolved.

      (4) It would be really interesting to know the authors' opinions on the driving forces behind slow inactivation. For example, potassium flux seems to be necessary for channels to inactivate, which might indicate a local conformational change is the trigger for the main twisting events proposed here. 

      We cite Sauer et al. (2011) for the idea that the intact selectivity filter is a strained conformation, and its relaxation yields the wide vestibule seen in NaK2K and Kv channels.  Lines 434-439.

      Reviewer #2 (Public Review): 

      There are four Kv1.2 channel structures reported: the open state, the C-type inactivated state, a dendrotoxin-bound state, and a structure in Na+. 

      A high-resolution crystal structure of the open state for a chimeric Kv1.2 channel was reported in 2007 and there is no new information provided by the cryoEM structure reported in this study. 

      The cryo-EM structure of the C-type inactivated state of the Kv1.2 channel was determined for a channel with the W to F substitution in the pore helix. A cryo-EM structure of the Shaker channel and a crystal structure of a chimeric Kv1.2 channel with an equivalent W to F mutation were reported in 2022. Cryo-EM structures of the C-type inactivated Kv1.3 channel are also available. All these previous structures have provided a relatively consistent structural view of the C-type inactivated state and there is no significant new information that is provided by the structure reported in this study. 

      A structure of the Kv1.2 channel blocked by dendrotoxin is reported. A crystal structure of charybdotoxin and the chimeric Kv1.2 channel was reported in 2013. Density for dendrotoxin could not be clearly resolved due to symmetry issues and so the definitive information from the structure is that dendrotoxin binds, similarly to charybdotoxin, at the mouth of the pore. A potential new finding is that there is a deeper penetration of the blocking Lys residue in dendrotoxin compared to charybdotoxin. It will however be necessary to use approaches to break the symmetry and resolve the electron density for the dendrotoxin molecule to support this claim and to make this structure significant.  

      We have now succeeded in breaking the symmetry and present in Fig. 3 a C1 structure of the toxin-channel complex. In the improved map we now see that our previous conclusion was wrong: the penetration of Lys5 cannot be much deeper than that seen in CTx and ShK structures. However for some reason the pattern of ion-site occupancies in the blocked state is different in this structure than in the others. Fig. 3, Fig. 4E; text lines 559-568.

      The final structure reported is the structure of the Kv1.2 channel in K+ free conditions and with Na+ present. The structure of the KcsA channel by the MacKinnon group in 2001 showed a constricted filter and since then it has been falsely assumed by the K channel community that the lowering of K concentration leads to a constriction of the selectivity filter. There have been structural studies on the MthK and the NaK2K channels showing a lack of constriction in the selectivity filter in the absence of K+. These results have been generally ignored and the misconception of filter constriction/collapse in the absence of K+ still persists. The structure of the Kv1.2 channel in Na+ provided a clear example that loss of K+ does not necessarily lead to filter constriction.

      We are grateful to the reviewer for pointing out this serious omission. We now cite other work, including from the Y. Jiang and C. Nichols labs, showing examples of outer pore expansion and destabilization. See p. 4, lines 90-104; lines 421-439.

      The structure in Na+ is significant while the other structures are either merely reproductions of previous reports or are not resolved well enough to make any substantial claims. 

      We now state more clearly the confirmatory nature of our Kv1.2 open structure (lines 71-74) and the similarities of the inactivated-channel structures (lines 193-196).

      Reviewer #3 (Public Review): 

      Wu et al. present cryo-EM structures of the potassium channel Kv1.2 in open, C-type inactivated, toxin-blocked and presumably sodium-bound states at 3.2 Å, 2.5 Å, 2.8 Å, and 2.9 Å. The work builds on a large body of structural work on Kv1.2 and related voltage-gated potassium channels. The manuscript presents a large quantity of structural work on the Kv1.2 channel, and the authors should be commended on the breadth of the studies. The structural studies seem well-executed (this is hard to fully evaluate because the current manuscript is missing a data collection and refinement statistics table). The findings are mostly confirmatory, but they do add to the body of work on this and related channels. Notably, the authors present structures of DTX-bound Kv1.2 and of Kv1.2 in a low concentration of potassium (with presumably sodium ions bound within the selectivity filter). These two structures add new information, but the studies seem somewhat underdeveloped - they would be strengthened by accompanying functional studies and further structural analyses. Overall, the manuscript is well-written and a nice addition to the field.

      The data collection and refinement table has been added (Fig. 4 supplement 3.)

      We agree and regret the lack of functional studies. We have not been able to carry them out because work in our laboratory is winding down and the lab soon will be closing.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): 

      (1) It is not obvious from the data shown how well the side chain positions in the inactivated state are defined by the electron density. These figures should be redone. Maybe the use of stereo would be useful. This will be particularly useful for the reader to decide if the small changes in, for example, the positioning of the carbonyl oxygens are believable. 

      Figure 2 – figure supplement 4 shows the stereo views.

      (2) The authors note the changes observed (though small) in the VSD which were not observed in other structures. The relevance of this observation is not described. Do these changes arise due to the different environments of detergents versus nanodisc etc. in the different structures?

      We’ve now inserted a note about the variety of environments and how this might be a cause of the difference: lines 280-285.

      Are there changes in the pore-VSD interface in the inactivated and the open channel structures and if yes, then do mutations at these residues affect inactivation?

      There is surprisingly little movement at the S4-S5 interface residues identified by Bassetto et al. (2022) as having effects on inactivation. Lines 262-267.

      (3) For the structures in Na+, it is important to provide analytical data showing the biochemical behavior of the channel. This is also true for the wild type and the W to F mutant channel. Size exclusion profiles should be included. 

      The SEC profile (noisy, but showing a clear peak) of the channel in Na+ is now shown in Fig. 4 supplement 1. Low expression of the W366F mutant produced even worse SEC results, but we include a representative micrograph of W366F in Na+ (Figure 5 – figure supplement 1) to show the monodispersed protein prep.

      Reviewer #3 (Recommendations For The Authors): 

      Portions of text from the manuscript are indicated by quotations. 

      Introduction: "One goal of the current study was to examine the structure of the native Kv1.2 channel." 

      Comment, minor points: The authors refer to the Kv1.2 construct used for the structural studies as "native Kv1.2". I found this somewhat confusing because the word "native" suggests derived from a native source. The phrasing above also gives the impression that the structure by Wu et al is the first structure of Kv1.2. The Kv1.2 construct is essentially identical to the one used by Long et al in 2005 to determine the initial structure of Kv1.2 (PDB 2A79). The authors discuss a subsequent paddle-chimera Kv1.2-2.1 structure from 2007 (PDB 2R9R) in the introduction, but it would be prudent to mention the 2005 one of Kv1.2 as well. The open structure determined by Wu et al. is an improvement on the 2A79 structure in that the 2A79 structure was modeled as a poly-alanine model within the voltage sensor domain. Nevertheless, the Kv1.2-2.1 structure (2R9R) is highly similar to the 2A79 structure of Kv1.2. The 2007 structure indicated that Kv1.2-2.1 recapitulates structural features of Kv1.2. It is therefore not surprising that the open structure presented here is highly similar to that of both PDB 2A79 (Kv1.2) and PDB 2R9R (Kv1.2-2.1).  

      We failed to point out the high quality of the original Long et al. 2005 structure and its comparisons with the chimeric structure in Long et al. 2007. We now have tried to correct this: lines 70-74.

      Comment: The cryo-EM analyses suggest that a large percentage (most?) of the particles are missing the beta subunit. This should be commented on somewhere.      

      Now noted on lines 120-132: we pooled particles with and without beta subunits.

      Regarding ions in the selectivity filter, one-dimensional plots of the density would strengthen the analysis.

      Now included in Fig. 4.
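
      A one-dimensional density plot of the kind requested can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the map has already been loaded into a NumPy array indexed (z, y, x), with the pore axis running along z through the centre of the box. The function name `axial_density_profile` and the `half_width` parameter are hypothetical.

```python
import numpy as np

def axial_density_profile(volume, half_width=1):
    """Mean density per z slice in a small square column centred on the z axis.

    Assumes `volume` is indexed (z, y, x) and that the symmetry axis passes
    through the centre of the x-y plane (an assumption of this sketch).
    """
    _, ny, nx = volume.shape
    cy, cx = ny // 2, nx // 2
    column = volume[:,
                    cy - half_width:cy + half_width + 1,
                    cx - half_width:cx + half_width + 1]
    return column.mean(axis=(1, 2))

# Synthetic example: two point-like "ions" on the axis of an otherwise empty map.
demo = np.zeros((32, 16, 16))
demo[10, 8, 8] = 1.0
demo[20, 8, 8] = 1.0
profile = axial_density_profile(demo, half_width=0)
peaks = np.flatnonzero(profile > 0.5)  # z slices where the density exceeds 0.5
```

      Plotting `profile` against z for each structure would allow the relative strength of the density at each ion site to be compared directly.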

      Also, one should mention caveats associated with identifying ions in cryo-EM maps and the added difficulty/uncertainty when the density is located along a symmetry axis (C4 axis, due to the possible build-up of noise). C1 reconstructions, showing density within the filter, if possible, would strengthen the analyses.

      You are correct. However, local resolution is highest in the selectivity filter region, and since the CTF-based filtering is constant over the whole structure, I think the SNR will be good on axis.

      Comment: The section on channel inactivation could be simplified by stating that the structure is highly similar to W17'F structures of other Kv channels. (And then discussing possible differences).  

      We now note, “overall conformational difference is identical…” p. 7, lines 193-196.

      "Salt bridges involving the S4 Arg and Lys residues are shifted slightly (Figure 2-figure supplement 3A-D). Arg300 (R3) is in close proximity to Glu226 on the S2 helix for the open channel, while R3 is closer to Glu183 in the S2 helix. The Glu226 side chain adopts a visible interaction with R4 in the inactivated state." 

      Comment: The density for these acidic amino acids seems weak, especially in the inactivated state. It seems like a stretch to make much of their possible conformational changes. 

      We’ve included stereo pairs in Fig. 2 – figure supplement 4.

      "By adding 100 nM α-DTx to detergent solubilized Kv1.2 protein we obtained a cryo-EM structure at 2.8 Å resolution of the complex." 

      Comment: 100 nM might be lower than the Kv concentration. The current methods are ambiguous on the concentration of Kv channel used for the DTx sample. From the methods, it seems possible that 100 nM DTX is a sub-stoichiometric amount relative to the channel. Regardless, the cryo-EM data seems to suggest that a large percentage of particles do not have DTx bound. This surely complicates the interpretation of density within the filter (which has partly been ascribed to a lysine side chain from DTx).

      The reviewer correctly points out a potentially serious problem. It turns out that the 100 nM figure we quoted was incorrect, and the actual concentration of toxin, >400 nM, was substantially greater than the protein concentration. This is confirmed by the small fraction (<1%) of 3D class particles that do not show the toxin density (lines 303-306).

      Comment: The methods on atomic structure building/refinement (Protein model building, refinement, and structural analysis) are sparse. A table is needed showing data collection and refinement statistics for each of the structures. This data should also provide average B factors for the ions in the filter. An example can be found in PMID 36224384. 

      Data collection and statistics are now in Fig. 4 – figure supplement 3.

      "In the selectivity filter of the toxin-bound channel (Figure 3E) a continuous density is seen to extend downward from the external site IS0 through to the boundary between IS1 and IS2. This density is well modeled by an extended Lys side chain from the bound toxin, with the terminal amine coordinated by the carbonyls of G27”.

      Comment: While there seems to be extra density in site IS0 from the figures, the density ascribed to lysine in the filter doesn't seem that distinct from those of ions in the open structure. 1-dimensional density plots and some degree of caution may be prudent. Could there, for example, be a mixture of toxin-bound and free channels in the dataset?

      Could the lysine penetrate to different depths? If the toxin binds with nM affinity, why are any channels missing the toxin? Have the authors modeled an atomic structure of the entire toxin bound to the channel to evaluate how plausible the proposed binding of the lysine is? Can the toxin be docked onto Kv1.2 with the deep positioning of the lysine and not clash with the extracellular surface of Kv1.2? 

      We also were concerned about these issues. We have been able to obtain a C1 reconstruction of the toxin-channel complex. In building the atomic model we found that indeed the Lys5 side chain could not penetrate as far as we had thought, and appears to be coordinated by the first carbonyl pair. Fig. 3; text lines 331-332. 

      "Toxin binding shrinks the distances between opposing carbonyl oxygens in the selectivity filter, forming a narrower tunnel into which the Lys side chain fits (Figure 3F). The second and fourth carbonyl oxygen distances are substantially reduced from 4.7 Å and 4.6 Å in an open state to 3.7 Å and 3.9 Å, respectively (Figure 4E). In a superposition of Kv1.2 open-state and α-DTX-bound P-loop structures, there is also an upward shift of the first three carbonyl groups by 0.7~1.0 Å (Figure 4F). " 

      Comment: I suspect the authors intend to refer to Figure 3F rather than 4. I would be cautious here. The refined positions of the carbonyl oxygens are almost certainly affected by the presence or absence of ions in the atomic model during refinement. The density and the resolution of the map may not be able to distinguish small changes to the positions of the carbonyl oxygens (and these differences/uncertainties are compounded by the C4 symmetry). 

      "On the other hand, the terminal amine of lysine in α-DTX is deeply wedged at the second set of carbonyls, narrowing both IS1 and IS2 while displacing ions from the sites (Figure 3-figure supplement 2A). CTX does not cause narrowing of the selectivity filter or displacements of the carbonyls (Figure 3-figure supplement 2B). "

      Comment: Again, caution would be prudent here.  

      We are very grateful to the reviewer for pointing out these problems. We have removed these statements that are weakly supported at our resolution level.

      "Shaker channels are able to conduct Na+ in the absence of K+ (Melishchuk et al., 1998)." 

      Comment: How about the Kv1.2 channel? Is Kv1.2 able to conduct Na+ in the absence of K+ ? This would certainly be relevant for interpreting the conformation of the filter and the density ascribed to Na+ for the structure in sodium.  

      We agree wholeheartedly, but unfortunately we are no longer capable of doing the measurements as our lab will soon close.

      "Ion densities are seen in the IS1, IS3, and IS4 ion binding sites, but the selectivity filter shows a general narrowing as would be expected for binding of sodium ions. The second, third, and fourth carbonyl oxygen distances are reduced from 4.7 Å, 4.7 Å, and 4.6 Å in the open state to 4.4 Å, 3.9 Å, and 4.5 Å, respectively. The rest of the channel structure is very little perturbed. " 

      Comment: The density for IS4 seems weak. To me, it looks like IS1 and IS3 are occupied, whereas IS2 and IS4 are much weaker. 1-dimensional density plots would be helpful. I would suggest caution in commenting too strongly on the "general narrowing" since the resolution of the maps, the local density, and the atomic structure refinement would be consistent with coordinate errors of 0.5 Å or more - and would be compounded (~ doubled) by measuring between symmetry-related atoms.  

      We present 1D plots in Fig. 4E. We no longer comment on “narrowing”.
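
      The reviewer's point about compounded coordinate error can be made concrete with a short calculation. The sketch below is illustrative only: it assumes each carbonyl oxygen carries a radial coordinate error `sigma` (in Å), and it compares the case where imposed symmetry moves two opposing atoms together with the case of independent errors. The function name is hypothetical.

```python
import math

def opposing_distance_error(sigma, symmetry_imposed=True):
    """Error in the distance between two opposing atoms, each with radial error sigma.

    With symmetry imposed, both atoms shift radially by the same amount, so the
    errors add linearly (2 * sigma). Independent errors add in quadrature.
    """
    if symmetry_imposed:
        return 2.0 * sigma
    return math.sqrt(2.0) * sigma

err_c4 = opposing_distance_error(0.5)            # symmetry-related atoms
err_indep = opposing_distance_error(0.5, False)  # independent atoms
```

      With a 0.5 Å coordinate error, the uncertainty in a cross-pore distance between symmetry-related atoms is about 1 Å, which is larger than the 0.1 to 0.8 Å changes in carbonyl distances quoted above.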

      "Finally, the snake toxin a-Dendrotoxin (DTx) studied here is seen to block Kv1.2 by insertion of a lysine residue into the pore." 

      Comment: Discussion (and references) should be given regarding what was known prior to this study on the mode of inhibition by DTx. 

      Discussion and references now added, lines 287-301.

      "On the other hand, a lengthy molecular-dynamics simulation of deactivation in the Kv1.2-2.1..." 

      Comment: I don't think mentioning this personal communication adds to the manuscript. 

      Actually, the original “personal communication” reference was there because the situation is complicated. The movie S3 accompanying the Jensen et al. paper shows deactivation and dewetting of the channel during a 250 µs simulation. In the movie there are ions visible in the selectivity filter for the first 50 µs, but after that the SF appears empty. Puzzled by this, we contacted Dr. Jensen, who explained that the movie was in error: ions remain in the SF throughout the entire 250 µs. We now cite Jensen (2012) along with the personal communication.

      "The difference between the open and inactivated Kv1.2 structures, like the difference in Kv1.2-2.1 (Reddi et al., 2022) and Shaker (Tan et al., 2022) can be imagined as resulting from a two-step process." 

      Comment: Confusing phrasing because the authors mean to compare their structure to inactivated structures of Kv1.2-2.1 and Shaker.

      Fixed, lines 220-222.

      "Molecular dynamics simulations by Tan et al. based on the Shaker-W17'F structure show that IS3 and IS4 are simultaneously occupied by K+ ions in the inactivated state." 

      Comment: I think that the word "show" is too strong. Perhaps "suggest" 

      The MD result seems to us to be unequivocal, that most of the time the two sites are occupied by ions.

      References are needed for the following statements:  

      -  "as well as the charge-transfer center phenylalanine"

      Now citing Tao et al. 2010, line 156.

      - "total gating charge movement in Shaker channels is larger, about 13 elementary charges per channel" 

      Now citing the review by Islas, 2015 (lines 166-169).

      "The selectivity filter of potassium channels consists of an array of four copies of the extended loop (the P-loop) formed by a highly conserved sequence, in this case, TTVGYGD. Two residues anchor the outer half of the selectivity filter and are particularly important in inactivation mechanisms (Figure 2B, right panels). Normally, the tyrosine Y28' (Y377 in Kv1.2) is constrained by hydrogen bonds to residues in the pore helix and helix S6 and is key to the conformation of the selectivity filter. The final aspartate of the P-loop, D30' (D379 in Kv1.2) is normally located near the extracellular surface and has a side chain that also participates in H-bonds with W17' (W366 in Kv1.2) on the pore helix." 

      Citations added (Pless 2013, Sauer 2011) lines 211-214.

      - "During normal conduction, ion binding sites in the selectivity filter are usually occupied by K+ and water molecules in alternation." 

      Added Morais-Cabral et al. 2001, p. 17, lines 463-465.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      We would like to thank the reviewers for their thoughtful evaluation of our work. Our point-by-point responses to reviewer critiques follow below. Please note that any referenced changes to the manuscript are highlighted in yellow in the revised manuscript text.

      Response to Common Critiques

      1. Reviewers 1 and 2 state that some elements of this study confirm previously published results (many in murine systems). However, the reviewers also acknowledge that the mouse and human rDNA repeats may be subject to quite distinct regulation because of the much denser CG content of the human rDNA promoter (26 CpGs) vs. the mouse rDNA promoter (only 2 CpGs); these potential differences in regulation motivated this study in human cells. We evaluate the functions of rDNA methylation in human cells, which is directly relevant to understanding the regulation of rDNA function in human aging, and to understanding the functional implications of DNA methylation "aging clocks" more generally. We also apply a recently developed technology (dCas9-mediated epigenome editing) to directly test the function of rDNA methylation. Novel findings reported in this study include:
      - Pol I-engaged rDNA repeats are hypomethylated at sites both in the promoter and the gene body; this contrasts with Pol II transcription, which is coincident with gene body methylation.
      - rDNA copy number remains stable with age in mammals, in striking contrast to findings in other eukaryotes. rDNA copy number instability has been proposed to be a universal feature of the aging genome, and this finding refutes that possibility.
      - Induction of DNA methylation by an average of ~20% along 7-11 of the 26 CpGs in the human rDNA repeat does not measurably inhibit rDNA transcription.
      - Human Pol I and UBTF remain bound to rDNA promoters in the presence of elevated CpG methylation, in contrast to the murine Pol I machinery.

      Reviewers 1 and 2 questioned our strategy of mapping sequencing data to the consensus ribosomal DNA (rDNA) repeat alone. We followed the approach of Wang & Lemos Genome Research 2019, who initially described the rDNA methylation clock. Wang & Lemos also mapped genomic data to rDNA consensus sequences alone due to the computational efficiency of this approach, and describe a head-to-head comparison of mapping performance outcomes in their Methods section. Importantly, their analysis indicated that the vast majority (>98%) of sequencing reads can be mapped uniquely to the consensus human rDNA repeat (U13369.1). When we launched our study, we also initially compared the performance of mapping to the rDNA repeat consensus sequence alone versus to the whole human genome. We noted very similar performance in both cases, with the possible exception of a modest increase in simple repeat sequences being erroneously mapped to the intergenic spacer (IGS) region of the rDNA when we mapped to the rDNA repeat alone. As the reviewers pointed out, the IGS contains simple repeat sequences that are also found at numerous other non-rDNA sites in the genome. However, the minor mis-mapping of simple repeats to the IGS did not affect our analyses of non-IGS sequences, which were the focus of this study. We therefore proceeded with mapping to the rDNA consensus sequence only.

      Reviewers 1 and 2 pointed out that our dCas9-DNMT strategy induced only a 15-20% increase in rDNA methylation and questioned whether we could expect to detect downstream effects in rDNA transcription. While Reviewer 2 suggested that multiple sgRNAs could enhance methylation efficiency, this has already been tested for other target genes, and multiple sgRNAs were shown not to increase the efficiency of CpG methylation by dCas9-DNMTs (Stepper et al., Nucleic Acids Research 2017). Separately, the goal of this study was to model the effects of age-linked rDNA hypermethylation, which increases by 15-20% over mammalian lifespan (Wang & Lemos 2019; see also our Figure 1). Importantly for interpreting these data, induction of promoter methylation to a similar extent on the mouse rDNA repeat was able to direct detectable repression of rDNA transcription (Santoro et al., 2011). Further, dCas9-DNMT has been previously shown to induce a ~20% increase in CpG methylation of the Pol II target gene EpCAM and cause measurable transcriptional repression that was detectable by qPCR (Stepper et al., 2017). In contrast, we were able to induce rDNA methylation to a similar extent and observed no change in the levels of either pre-rRNA or mature rRNA. Because we see that UBF and Pol I remain bound to rDNA in spite of higher CpG methylation (Fig. 7 and Fig. S4), we interpret these data together to indicate that the human Pol I machinery can continue to engage with rDNA in the presence of intermediate levels of CpG methylation.

      Reviewer 1

      1. Inactivation of rDNA transcription per se does not affect chromatin accessibility. To date, only depletion or deletion of UBTF has been found to do this, and even this does not enhance CpG methylation. These published findings should be referenced.

      Our analyses in Figure 2 focus on defining the relationships between chromatin accessibility, transcriptional activity, and CpG methylation throughout the human rDNA repeat. We cannot determine causation from this analysis - meaning whether chromatin accessibility influences CpG methylation or vice versa - and this point is beyond the scope of our study. Our major goal was to test whether induced CpG methylation affects transcription output.

      The authors overstate their results by writing "actively transcribed rDNA repeats are hypomethylated at their promoter", even though only one SmaI site but many CpG sites exist in the human promoter, and the latter have not been assayed.

      We analyzed several pieces of data to come to this conclusion. First, ATAC-Me indicates that ATAC-accessible rDNA repeats are completely devoid of methylation both in their promoter and throughout the gene body; as UBTF binding controls rDNA accessibility (Sanij et al., JCB 2008; Hamdane et al., PLoS Genet 2014), we infer that ATAC-accessible repeats are engaged with the Pol I transcription machinery and hypomethylated. To more directly probe this question, we evaluated the methylation status of Pol I-bound rDNA repeats at five separate sites by ChIP-chop: two sites in the 5' regulatory region (5' ETS and core promoter, pooled together as "promoter" in Figure 2F) and three sites within the gene body (18S, 5.8S, and 28S, pooled together as "gene body" in Figure 2F). These data clearly indicate that Pol I preferentially binds to these regions when they are hypomethylated, as the extent of CpG methylation at these same sites is higher in input DNA and lower in Pol I-ChIPped DNA. While we do not comprehensively profile CpG methylation status of Pol I-bound DNA, these ChIP-chop analyses are consistent with our interpretation that "actively transcribed (that is, Pol I-engaged) rDNA repeats are hypomethylated at their promoter".

      Pol I's preference for binding hypomethylated promoters has been previously described in mouse cells (Santoro & Grummt 2001) and human cells (Brown & Szyf Mol Cell Biol 2007). We confirm this and also report the novel finding that rDNA gene bodies bound by Pol I are hypomethylated. This contrasts with known relationships between Pol II and CpG methylation, where genes actively transcribed by Pol II often have dense gene body CpG methylation.

      While we think it is reasonable to infer from ATAC-Me data and ChIP-chop data together that accessible and hypomethylated rDNA repeats reflect transcriptionally active repeats, we appreciate the reviewer's point that we analyzed only a select few CpG sites by Pol I ChIP-chop. We have adjusted the text to make our interpretation more parsimonious (see highlights).

      The human rDNA promoter contains many CpGs which may not affect transcription when methylated. RRBS and WGBS data can't tell us much if we don't understand which sites, when methylated, affect transcription.

      We agree, and this ambiguity is what motivated us to induce methylation and evaluate the consequences. In plasmid reporter experiments where the human rDNA promoter was fused to a luciferase reporter, it was shown that in vitro methylation of the plasmid potently inhibited transcription in human cells (Ghoshal et al., J Biol Chem 2004). In this study, methylation of 7/26 CpGs was sufficient to induce >75% inhibition of reporter plasmid transcription, while methylation at single sites could induce ~50% inhibition. We neglected to cite this relevant study and have included a reference to it in the revised manuscript. Importantly, this plasmid reporter assay does not assess the effects of CpG methylation on the full rDNA repeat in its endogenous genomic context. We were able to induce significant CpG hypermethylation on 11/26 promoter CpGs with one guide (P+G) and on 7/26 CpGs with a second guide (P+A) (Figure 3D). This level of methylation did not induce detectable silencing of rRNA transcription. Instead, we found that both UBF (Fig. 7) and Pol I (Fig. S4) remained bound to rDNA in the presence of CpG hypermethylation.

      The argument that the mouse rDNA Pol I machinery is "exquisitely sensitive" to CpG methylation is a little misleading as there are only two CpGs in the mouse rDNA promoter. Which of the 26 human CpGs are the critical ones?

      Immediately following this statement in the Discussion, we state that "the human rDNA promoter is significantly more CG-rich than the mouse rDNA promoter". We have revised this section to emphasize the difference (26 CpGs in human vs. only 2 in the mouse) and discuss this point raised by the reviewer: which are the critical CpGs in the human rDNA? Here again it is relevant to cite the human rDNA promoter reporter assays performed by Ghoshal et al., J Biol Chem 2004. These data indicate that CpG methylation of 7/26 promoter CpGs interferes with transcription from an rDNA reporter plasmid. Notably, it is unclear how generalizable findings from reporter assays are to the genomic context of the endogenous full length rDNA sequence. Our data indicate that partial methylation of 7-11 CpGs in the human rDNA promoter causes no detectable rDNA inhibition, and indeed does not displace UBF or Pol I (Fig. 7; Fig. S4).

      Antibody SC13125 used for UBF ChIP sees nearly exclusively the shorter transcriptionally inactive UBF2 variant. These data need to be repeated with an antibody that detects both UBF forms.

      We thank the reviewer for raising the important issue of UBTF splice isoforms. Relevant citations demonstrating that the SC13125 antibody recognizes only UBF2 would have been very helpful. The human UBTF gene is alternatively spliced into full-length UBF1 (exon 8 retained) and UBF2 (exon 8 spliced out). The deletion of exon 8 results in a 37 amino acid deletion in UBF2 corresponding to residues 221-268 in HMG box 2 of UBF1 (see Ensembl entry ENSG00000108312.16). The truncation of HMG box 2 makes UBF2 a far less potent transcriptional activator than UBF1. Because of the small molecular weight difference between these two isoforms, preference of an antibody for one vs. another isoform is not readily apparent by Western blotting. However, according to the manufacturer of the UBTF antibody used in this study, the immunogen corresponds to residues 1-220 of UBTF1, which is immediately N-terminal to the residues deleted in UBF2 (AAs 221-268, encoded by exon 8). The antibody's immunogen is thus entirely sequence that is shared between UBF1 and UBF2. Further, a previous study performed immunoprecipitation followed by mass spectrometry using this antibody and reported detection of UBF1-specific peptides (Drakas et al., PNAS 2004). Therefore, absent our knowledge of any evidence to the contrary, we conclude that this antibody recognizes UBF1 and possibly also UBF2.

      We thank the reviewer for raising this point and have adjusted the text to avoid the misleading implication that we are unambiguously detecting only the UBF1 isoform; all mentions of "UBF1" in the revised text have been replaced with "UBTF".

      Setting aside the question about the UBTF antibody reagent used, we observe consistent results by evaluating both UBTF (Figure 7) and Pol I (Figure S4) binding to rDNA in spite of CpG methylation; therefore, we conclude that the human Pol I machinery is not displaced from the human rDNA promoter by intermediate levels of CpG methylation.

      Reviewer 2

      1. There is very little discussion concerning the methylation status of the IGS...the Kobayashi lab has convincingly demonstrated that rDNA repeats fall into 2 classes. Those in which the supposedly active repeats lack methylation on promoters and coding regions and those in which both promoters and coding regions are heavily methylated. In both cases the IGS is fully methylated.

      We cite this study in the Discussion (reference 18 in bibliography) and agree that this work is relevant to ours; we have adjusted the text to emphasize this point. Notably, this previous analysis of CpG methylation patterns by long-read sequencing implied that active repeats may be entirely hypomethylated along their coding sequence; our data more directly demonstrate this both by ATAC-Me and by Pol I ChIP-chop (Fig. 2).

      There is no description of how rRNA levels were assessed. I suggest this could be further complemented by in vivo incorporation studies such as EU labeling.

      We apologize for this lack of clarity. rRNA levels were assessed by qPCR of the 45S pre-rRNA (Fig. 3A) and of mature 28S rRNA (Fig. 3B), and these data are presented as the fold change for each rDNA-targeting sgRNA relative to a non-targeting control sgRNA. The primer sets used are listed in Supplementary Table 1.

      While we agree that EU labeling could be useful for detecting nucleolar transcription, qPCR detection of the 45S rRNA also sensitively reports nascent transcription and we think is sufficient to address this question.
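      As an illustration of the fold-change presentation described in this response, the sketch below computes a relative expression value with the common 2^-ΔΔCt approach. The response does not state which quantification method or reference gene was used, so the ΔΔCt method, the reference-gene Ct values, the assumption of 100% amplification efficiency, and all numbers here are hypothetical.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method (assumes ~100% primer efficiency).

    ct_target_*: Ct of the transcript of interest (e.g. 45S pre-rRNA).
    ct_ref_*:    Ct of a reference transcript (hypothetical; not named in the source).
    *_sample:    cells with an rDNA-targeting sgRNA.
    *_control:   cells with the non-targeting control sgRNA.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: identical to the control, so the fold change is 1.
unchanged = fold_change(20.0, 18.0, 20.0, 18.0)

# Hypothetical Ct values: the target crosses threshold one cycle later than in
# the control, which corresponds to half the starting template.
halved = fold_change(21.0, 18.0, 20.0, 18.0)
```

      Under these assumptions, a fold change near 1 for every rDNA-targeting sgRNA would correspond to no detectable change in rRNA levels relative to the non-targeting control.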

      Reviewer 3

      1. The study points to differences between mouse and human rDNA and the effect of DNA methylation on transcriptional output. Did the mouse rDNA dataset also measure transcription output to correlate with DNA methylation age differences?

      The original study that defined the rDNA methylation clock (Wang & Lemos Genome Research 2019) did not evaluate rDNA transcription in parallel. More generally, the relationship of age-linked "clock" CpG methylation sites to expression / function of CpG methylated loci is very unclear, and testing the potential relationship between age-linked rDNA methylation and function was the major goal of this study.

      Did the spacer promoter also get methylated and did that affect UBF and Pol I binding?

      While the existence and function of a spacer promoter has been more clearly defined in the mouse rDNA repeat, recent evidence indicates that the Pol I transcription machinery also binds a second location about 800 bp upstream of the core promoter in the human rDNA repeat (Mars et al G3 2018). The guides that we used to direct CpG methylation recognize single unique sites in the core rDNA promoter and do not recognize sequences in this putative spacer promoter, and we did not analyze methylation at the spacer promoter. Analysis of the spacer promoter is generally beyond the scope of this study, as it is unknown whether there is any relationship between spacer promoter methylation and aging progression.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors present evidence suggesting that MDA5 can substitute as a sensor for triphosphate RNA in a species that naturally lacks RIG-I. The key findings are potentially important for our understanding of the evolution of innate immune responses, but the evidence is incomplete, as additional biochemical and functional experiments are needed to unambiguously assign MDA5 as a bona fide sensor of triphosphate RNA in this model. This also leaves the title as overstating its case.

      We would like to thank the editorial team for these positive comments on our manuscript and the constructive suggestions to improve it. Following the suggestions and valuable comments of the referees, we have added substantial amounts of new data and analysis to substantiate our claims, and the manuscript, including the title, has been carefully revised to better reflect our conclusions. We are now happy to send you our revised manuscript, and we hope that it addresses your and the reviewers’ concerns satisfactorily and is now suitable for publication in eLife.

      Reviewer #1 (Public Review):

      This study offers valuable insights into host-virus interactions, emphasizing the adaptability of the immune system. Readers should recognize the significance of MDA5 in potentially replacing RIG-I and the adversarial strategy employed by 5'ppp-RNA SCRV in degrading MDA5 mediated by m6A modification in different species, further indicating that m6A is a conservational process in the antiviral immune response.

      However, caution is warranted in extrapolating these findings universally, given the dynamic nature of host-virus dynamics. The study provides a snapshot into the complexity of these interactions, but further research is needed to validate and extend these insights, considering potential variations across viral species and environmental contexts.

      We concur with the viewpoint that virus-host coevolution complicates the derivation of universal conclusions. To address this challenge, we incorporated additional experiments and data based on the suggestions of the reviewers. These experiments were carried out across diverse models, including two distinct vertebrate species (M. miiuy and G. gallus), two different viruses (SCRV and VSV), and the synthesis of corresponding 5’ppp-RNA probes. We believe that these supplementary data bolster the evidence supporting the immune replacement role of MDA5 in the recognition of 5'ppp-RNA in RIG-I deficient species (Figure 1C-1E, Figure 2O and 2P, Figure 4). Moreover, we have duly incorporated references in both the introduction and discussion sections to further support our conclusion that MDA5 in T. belangeri, a mammal lacking RIG-I, possesses the ability to detect RNA viruses posed as RIG-I agonists (doi: 10.1073/pnas.1604939113). Lastly, meticulous revisions have been undertaken in the manuscript, including adjustments to the title, to ensure harmonization with our research outcomes.

      Reviewer #2 (Public Review):

      This manuscript by Geng et al. aims to demonstrate that MDA5 compensates for the loss of RIG-I in certain species, such as teleost fish miiuy croaker. The authors use Siniperca chuatsi rhabdovirus (SCRV) and poly(I:C) to demonstrate that these RNA ligands induce an IFN response in an MDA5-dependent manner in M. miiuy derived cells. Furthermore, they show that MDA5 requires its RD domain to directly bind to SCRV RNA and to induce an IFN response. They use in vitro synthesized RNA with a 5'triphosphate (or lacking a 5'triphosphate as a control) to demonstrate that MDA5 can directly bind to 5'-triphosphorylated RNA. The second part of the paper is devoted to m6A modification of MDA5 transcripts by SCRV as an immune evasion strategy. The authors demonstrate that the modification of MDA5 with m6A is increased upon infection and that this causes increased decay of MDA5 and consequently a decreased IFN response.

      The key message of this paper, i.e. MDA5 can sense 5'-triphosphorylated RNA and thereby compensate for the loss of RIG-I, is novel and interesting, yet there is insufficient evidence provided to prove this hypothesis. Most importantly, it is crucial to test the capacity of in vitro synthesized 5'-triphosphorylated RNA to induce an IFN response in MDA5-sufficient and -deficient cells. In addition, a number of important controls are missing, as detailed below.

      To further support the notion that MDA5 is capable of detecting 5'ppp-RNA in species lacking RIG-I, we conducted additional experiments. Initially, we isolated the RNA from SCRV and VSV viruses. Subsequently, we synthesized 5'ppp-RNA probes that corresponded to the genome termini of SCRV and VSV in vitro. Then, these RNAs were treated with Calf intestinal phosphatase (CIAP) to generate dephosphorylated derivatives. Next, we separately tested the activation ability of various RNAs on IRF3 dimer and IFN response in MKC (M. miiuy kidney cell line) and DF-1 (G. gallus fibroblast cell line) cells, and determined that the immune activation ability of SCRV/VSV viruses depends on their triphosphate structure (Figure 1C-1E, Figure 4C and 4J). In addition, the knockdown of MDA5 inhibited the immune response mediated by SCRV RNA (Figure 2P and 2Q). Finally, we incorporated essential experimental controls (Figure 4B and 4I). We think that the inclusion of these supplementary experimental data significantly enhances the credibility and further substantiates our hypothesis.

      The authors describe an interaction between MDA5 and STING which, if true, is very interesting. However, the functional implications of this interaction are not further investigated in the manuscript. Is STING required to relay signaling downstream of MDA5?

      To better explore the role of STING in MDA5 signal transduction, we constructed a STING expression plasmid and synthesized specific siRNA targeting STING. We found that co-expression of STING and MDA5 significantly enhanced the MDA5-mediated IFN-1 response during SCRV virus infection (Figure 2N). Conversely, silencing of STING expression restored the MDA5-mediated IFN-1 response (Figure 2O). These findings provide important evidence for the critical involvement of STING in the immune signaling cascade mediated by MDA5 in response to 5'ppp-RNA viruses.

      The second part of the paper is quite distinct from the first part. The fact that MDA5 is an interferon-stimulated gene is not mentioned and complicates the analyses (i.e. is there truly more m6A modification of MDA5 on a per molecule basis, or is there simply more total MDA5 and therefore more total m6A modification of MDA5).

      For the experimental data analysis in Figure 5E and 5F, we first compared the m6A-IP group to the input group, and then normalized the control group (IgG group of 5E and Mock group of 5F) to a value of “1”. Given the observed variability in MDA5 expression levels within the input group of Mock and SCRV virus-infected cells, our analysis represents the actual m6A content of each MDA5 molecule. To enhance clarity, we have updated the label on the Y-axis in Figure 5E and 5F.
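      The two-step normalization described above (m6A-IP relative to input, then scaling the control group to 1) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the signal values are hypothetical, and the numbers are chosen only to show why dividing by input separates a per-transcript change in m6A from a change in total MDA5 abundance.

```python
def ip_over_input(ip_signal, input_signal):
    """m6A-IP signal relative to input: a proxy for m6A content per transcript."""
    return ip_signal / input_signal


def normalize_to_control(enrichment_by_group, control_group):
    """Scale every group so that the control group equals 1."""
    reference = enrichment_by_group[control_group]
    return {group: value / reference for group, value in enrichment_by_group.items()}


# Hypothetical (IP, input) signals in arbitrary units. Input MDA5 is 3-fold
# higher after infection (MDA5 is interferon-inducible), while the IP signal
# is 6-fold higher.
signals = {"Mock": (2.0, 10.0), "SCRV": (12.0, 30.0)}

enrichment = {
    group: ip_over_input(ip, inp) for group, (ip, inp) in signals.items()
}
relative_m6a = normalize_to_control(enrichment, "Mock")
```

      With these hypothetical values the raw IP signal rises 6-fold, but the input-corrected value for SCRV is only 2-fold above Mock, which is the quantity that reflects m6A content per MDA5 transcript.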

      Finally, it should be pointed out that several figures require additional labels, markings, or information in the figure itself or in the accompanying legend to increase the overall clarity of the manuscript. There are frequently details missing from figures that make them difficult to interpret and not self-explanatory. These details are sometimes not even found in the legend, only in the materials and methods section. The manuscript also requires extensive language editing by the editorial team or the authors.

      We acknowledge the valuable feedback from the reviewer and have made significant improvements to our manuscript based on the recommendations provided in the "Recommendation for the authors" section. Furthermore, we have conducted a thorough review of the entire article, resulting in substantial enhancements to the format, clarity, and overall readability of our manuscript.

      Reviewer #3 (Public Review):

      Summary: In this manuscript, the authors investigated the interaction between the pattern recognition receptor MDA5 and 5'ppp-RNA in a teleost fish called Miiuy croaker. They claimed that MDA5 can replace RIG-I in sensing 5'ppp-RNA of Siniperca chuatsi rhabdovirus (SCRV) in the absence of RIG-I in Miiuy croaker. The recognition of MDA5 to 5'ppp-RNA was also observed in the chicken (Gallus gallus), a bird species that lacks RIG-I. Additionally, they reported that the function of MDA5 can be impaired through m6A-mediated methylation and degradation of MDA5 mRNA by the METTL3/14-YTHDF2/3 regulatory network in Miiuy croaker under SCRV infection. This impairment weakens the innate antiviral immunity of fish and promotes the immune evasion of SCRV.

      Strengths:

      These findings provide insights into the adaptation and functional diversity of innate antiviral activity in vertebrates.

      Weaknesses:

      However, there are some major and minor concerns that need to be further addressed. Addressing these concerns will help the authors improve the quality of their manuscript. One significant issue with the manuscript is that the authors claim to be investigating the role of MDA5 as a substitute for RIG-I in recognizing 5'ppp-RNA, but their study extends beyond this specific scenario. Based on my understanding, it appears that sections 2.2, 2.3, 2.5, 2.6, and 2.7 do not strictly adhere to this particular scenario. Instead, these sections tend to investigate the functional involvement of Miiuy croaker MDA5 in the innate immune response to viral infection. Furthermore, the majority of the data is focused on Miiuy croaker MDA5, with only a limited and insufficient study on chicken MDA5. Consequently, the authors cannot make broad claims that their research represents events in all RIG-I deficient species, considering the limited scope of the species studied.

      We agree with the reviewer's perspective that functional analysis of MDA5 in M. miiuy may not adequately represent all species lacking RIG-I. To address this concern, we have incorporated additional experimental data utilizing different model systems, including two different vertebrate species (M. miiuy and G. gallus), two distinct viruses (SCRV and VSV), and the synthesis of two corresponding 5’ppp-RNA probes. While the functional characterization of G. gallus MDA5 remains relatively limited compared to M. miiuy, our current experimental findings provide support for two key observations. Firstly, the triphosphate structure of the VSV virus is pivotal in activating the innate immune response in G. gallus against the virus (Figure 1D and 4J). Secondly, G. gallus MDA5 can recognize 5’ppp-RNA (Figure 4I, 4K and 4L). Consequently, although we cannot definitively establish the immune surrogate function of MDA5 in all RIG-I-deficient species, our research data further substantiates this hypothesis. Moreover, we have adopted a more cautious attitude in summarizing our experimental conclusions, thereby enhancing the rigor of our manuscript language.

      The current title of the article does not align well with its actual content. It is recommended that the focus of the research be redirected to the recognition function and molecular mechanism of MDA5 in the absence of RIG-I concerning 5'ppp-RNA. This can be achieved through bolstering experimental analysis in the fields of biochemistry and molecular biology, as well as enhancing theoretical research on the molecular evolution of MDA5. It is advisable to decrease or eliminate content related to m6A modification.

      Following the reviewer's recommendations, we have revised the title to emphasize that our main research focus is a teleost fish devoid of RIG-I. Furthermore, we have conducted additional molecular experiments to further elucidate the 5'ppp-RNA recognition function of MDA5 in RIG-I-deficient species. In an attempt to analyze the potential molecular evolution of MDA5 resulting from RIG-I deficiency, we collected MDA5 coding sequences from diverse vertebrates. However, due to multiple independent loss events of RIG-I in fish, fish with or without RIG-I genes in the phylogenetic tree cannot be effectively clustered separately, making it extremely difficult to perform this aspect of analysis. Consequently, we have regrettably opted to forgo the molecular evolution analysis of MDA5.

      The topic of our article is an antagonistic relationship between a fish receptor and RNA viruses. The MDA5 of fish that have lost RIG-I has evolved the ability to recognize 5’ppp-RNA viruses and mediate an IFN response to resist SCRV infection. Conversely, the m6A methylation mechanism gives SCRV a means to weaken the immune capacity of MDA5. Therefore, we believe that the latter part is an important part of the arms race between the virus and its host, and should be retained.

      Additionally, the main body of the writing contains several aspects that lack rigor and tend to exaggerate, necessitating significant improvement.

      We appreciate the reviewer’s comment and have improved the manuscript addressing the points raised in the “Recommendation for the authors”. We have added corresponding experiments to strengthen the verification of the conclusions, and in addition, we are more cautious in summarizing the language of the conclusions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The evidential foundation within the Result 1 section appears somewhat tenuous.

      Firstly, the author derives conclusions regarding the phenomenon of RIG-I loss in lower vertebrates by referencing external literature and conducting bioinformatics analyses. It is pertinent to inquire whether the author considered fortifying these findings through additional WB/PCR experiments, particularly for evaluating RIG-I expression levels across diverse vertebrates, encompassing both lower and higher orders.

      Firstly, the species we analyzed are mostly model species with excellent genomic sequence information in the database. Secondly, the RIG-I protein sequences (at least some domain sequences) are relatively conserved in vertebrates. Therefore, the credibility of evaluating the existence of RIG-I in these species through homology comparison is high. Therefore, we do not intend to conduct additional PCR/WB experiments to confirm this.

      Additionally, following the identification of RIG-I loss, the author postulates MDA5 as a substitute of RIG-I, grounding this speculation in the analysis of MDA5 and LGP2 protein structures. It is imperative to address whether the author could enhance the manuscript by supplying expression data for MDA5 and LGP2 across different vertebrates and elucidating further why MDA5 is posited as the compensatory mechanism for RIG-I loss.

      Like MDA5, LGP2 is also an interferon-stimulated gene, so both are likely to be highly sensitive to viral infections. Therefore, we think that comparing the expression data of these two genes would make it difficult to evaluate their function. In mammals, the mechanisms by which LGP2 regulates RIG-I and MDA5 are complicated and ambiguous. To evaluate the potential function of LGP2 in M. miiuy, we further constructed an LGP2 plasmid and synthesized siRNA targeting LGP2. Our results indicate that mmiLGP2 can enhance the antiviral immune response mediated by mmiMDA5 (Figure 1H and 1I), further indicating that mmiLGP2 plays a regulatory role in RLR signaling rather than acting as a compensatory receptor for RIG-I.

      Also, is it conceivable that other receptors contribute to this compensatory effect in lower vertebrates?

      Short, blunt-ended double-stranded RNA bearing a 5’ triphosphate, as contained in the panhandle of negative-strand viral genomes, is the ligand of RIG-I. We mainly focus on the immune recognition and compensatory effects of other receptors on RIG-I loss, and MDA5, as the protein with the most similar structure, first attracted our attention. In addition, IFIT proteins have been reported to recognize triphosphate single-stranded RNA (doi: 10.1038/nature11783). However, we used SCRV and VSV RNA as viral models, both of which have negative-stranded genomes and meet the ligand criteria of RIG-I, rather than IFIT. Therefore, we excluded the IFIT protein from our research scope.

      (2) The article exclusively employs a singular type of 5'PPP-RNA virus and one specific lower vertebrate species, thereby potentially compromising the robustness of the assertion that this phenomenon is prevalent in lower vertebrates. To bolster this claim, could the author consider incorporating data from an alternative 5'PPP-RNA virus and a different lower vertebrate species?

      To address this concern, we have incorporated additional experimental data utilizing different model systems, including two different vertebrate species (M. miiuy and G. gallus) and two distinct viruses (SCRV and VSV). While the functional characterization of G. gallus MDA5 remains relatively limited compared to M. miiuy, our current experimental findings provide support for two key observations. Firstly, the triphosphate structure of the VSV virus is pivotal in activating the innate immune response in G. gallus against the virus (Figure 1D and 4J). Secondly, G. gallus MDA5 can recognize 5’ppp-RNA (Figure 4I, 4K and 4L). Consequently, these experimental results further support the conservation of this immune compensation mechanism.

      (3) A nuanced consideration of the statement in Result 5 is warranted. Examination of the results under SCRV infection conditions suggests dynamic fluctuations in MDA5 expression levels, challenging the veracity of the statement implying "increased expression", which contradicts the proposed working model of this article.

      Because MDA5 acts as a receptor and plays a recognition immune role in the early stages of virus infection, the expression of MDA5 in the early stage of SCRV infection rapidly increases. In the later stage of infection, the expression of MDA5 may gradually decrease again due to the negative feedback mechanism in the host body to prevent excessive inflammation. However, compared to the uninfected group, the expression of MDA5 was significantly increased in the SCRV-infected group, so we believe that the term "increased expression" is not a problem. In addition, the m6A mechanism can weaken the function of MDA5, but it still cannot prevent the overall increase of MDA5 expression, which is not contradictory to the working model in this article.

      Additionally, the alterations in m6A levels in miiuy croaker under SCRV infection conditions warrant clarification. Could the author employ m6A dot blotting to supplement the findings related to total m6A levels?

      Our previous studies (doi: 10.4049/jimmunol.2200618) have suggested that the total m6A level is increased after SCRV infection in miiuy croaker. We cited this conclusion in the discussion of our manuscript.

      (4) It would be beneficial if the editors could assist the author in enhancing the language of the manuscript.

      We have carefully checked the full article and revised it with the Grammarly tool, and we believe that the grammar, format, and readability of our article have been greatly improved.

      Reviewer #2 (Recommendations For The Authors):

      Figure 1

      (1) Figure 1B - some clarification needs to be added about this figure in the text. It is unclear what the main point is that the authors would like to convey.

      What we want to emphasize is that some species with RIG-I, such as zebrafish, have also experienced RIG-I loss events, but underwent whole-genome duplication events before the loss, thus preserving a copy of RIG-I. This indicates that loss events of RIG-I are very common in vertebrates and do not occur randomly. We have elaborated on this point in the results and discussion.

      (2) Figure 1C - is not very informative other than showing Mm MDA5 and LGP2 side-by-side. It would be more useful to show a comparison of human RIG-I/MDA5 alongside Mm and Gg MDA5. Are there any conserved/shared key residues between hRIG-I/hMDA5 versus mmMDA5?

      Homologous proteins are often known to adopt the same or similar structure and function. We have added human RIG-I domain information to this figure (Figure 1F). A comparison of the domain information of human RIG-I with that of M. miiuy MDA5 and LGP2 shows that M. miiuy MDA5 has a similar structure to human RIG-I, making it the most likely candidate to compensate for the missing RIG-I. M. miiuy LGP2, in contrast, lacks the CARD domain, which is crucial for signal transduction, so we focused on M. miiuy MDA5. In addition, we collected protein sequences of MDA5 and RIG-I from various vertebrates to identify key residues that evolved in M. miiuy MDA5 for recognizing 5'ppp-RNA. Unfortunately, no candidate residues were found in this comparison.

      Figure 2

      (1) Figure 2B - It would be important to demonstrate MDA5-Flag expression by immunoblot and compare MDA5-Flag overexpression to endogenous MDA5 expression using the anti-MDA5 antibody from panel 2A. If IF is used, more cells need to be visible in the field.

      After transfecting the MDA5 plasmid into MKC, endogenous MDA5 expression was detected using MDA5 antibodies. The results showed a significant increase in MDA5 protein levels, indicating that MDA5 antibodies can specifically recognize MDA5 protein. In addition, we retained the original immunofluorescence images to better demonstrate the subcellular localization of MDA5.

      (2) Figure 2C - The 1:1 stoichiometry of MDA5:MAVS (in the absence of any stimulus) is quite surprising. How does the interaction between MDA5 and MAVS change upon stimulation with an RNA ligand (SCRV, poly(I:C))?

      We do not believe that the actual stoichiometry between MDA5 and MAVS is 1:1 as described. In fact, the apparent proportion of proteins in a complex in Co-IP results depends on many factors. Firstly, the MDA5 plasmid in this study has a 3× Flag tag, while MAVS has only a 1× Myc tag, which makes antibody detection more sensitive for MDA5-Flag. In addition, the Co-IP results are also affected by multiple factors such as the type of antibody and the number of recoveries, making it difficult to estimate the actual ratio of MDA5 to MAVS. For these reasons, and because measuring the strength of the interaction between MDA5 and MAVS after infection seems to be off-topic, we did not continue to explore this point.

      (3) Figure 2D - The interaction between MDA5 and STING is a very interesting finding but is not elaborated on in the paper (even though the interaction between MDA5 and STING is mentioned in the abstract). The manuscript would be strengthened if the interaction between MDA5 and STING is further investigated. For example, does the IFN response that is reported in panels 2E to 2H require the presence of STING? Does mmMDA5 signal via STING in response to a DNA ligand?

      We appreciate the referee's suggestion to study the mutual influence between MDA5 and STING. We found that co-expression of STING and MDA5 can enhance MDA5-mediated IFN-1 response during SCRV virus infection, while knocking down STING can restore MDA5-mediated IFN-1 (Figure 2N and 2O). This indicates that STING plays an important signaling role in the immune response of MDA5 to RNA viruses. We understand the importance of cGAS/STING pathways in identifying exogenous DNA, so exploring the MDA5 pathway for DNA ligand recognition is an interesting and meaningful perspective. But this seems to be detached from the theme of our article, so we didn't continue to explore this point.

      (4) Figures 2F and 2H - the authors demonstrate that SCRV induces a type I IFN response in an MDA5-dependent manner. While SCRV is a single-stranded negative-sense RNA virus that contains 5'ppp-RNA, it cannot be excluded that MDA5 is activated here in response to a double-stranded RNA intermediate of viral origin or even a host-derived RNA whose expression or modification is altered during infection. To demonstrate in an unambiguous manner that MDA5 senses 5'ppp-RNA, it is crucial to use the in vitro synthesized 5'ppp-RNA (and its dephosphorylated derivative as a control) from Fig. 4 in these experiments.

      We transfected 5'ppp-SCRV and 5'ppp-VSV (and their dephosphorylated derivatives) synthesized in vitro into MKC cells and DF-1 cells, respectively. The results showed that 5’ppp-RNAs significantly promoted the formation of IRF3 dimers, while their dephosphorylated derivatives did not (Figure 4C and 4J). In addition, we extracted viral RNA from SCRV and VSV and dephosphorylated it with Calf intestinal phosphatase (CIAP). We transfected these RNAs into MKC and DF-1 cells and found that the immune response mediated by the viral RNAs was much higher than that mediated by their dephosphorylated forms (Figure 1C-1E). The above results indicate that the immune response activated by SCRV and VSV is indeed dependent on their triphosphate structure. Finally, the IRF3 dimer and IFN induction activated by SCRV RNA can be inhibited by si-MDA5 (Figure 2P and 2Q), further demonstrating the involvement of MDA5 in the immune response mediated by 5’ppp-RNA ligands.

      (5) In mice and humans, MDA5 is known to collaborate with LGP2 to jointly induce an IFN response. Does M.miiuy express LGP2? If so, it would be informative to include a siRNA targeting LGP2 in the experiments in panel F. In mammals, LGP2 potentiates the response via MDA5 while it may inhibit RIG-I activation.

      M. miiuy does express LGP2. We constructed an LGP2 plasmid and synthesized si-LGP2 to investigate the impact of LGP2 on MDA5-mediated immune processes (Figure 1G-1I). The results showed that LGP2 can enhance the IFN response mediated by MDA5 during SCRV infection, similar to its role in mammals.

      (6) Minor comment - Is the poly(I:C) used in this figure high or low molecular weight poly(I:C)? HMW poly(I:C) preferentially stimulates MDA5, while LMW poly(I:C) preferentially stimulates RIG-I.

      We used poly(I:C)-HMW as a positive control for activating MDA5. We have modified the relevant information in Figure 2 and its legend.

      Figure 3

      (1) Figure 3F/G - The normalization in this Figure is difficult to interpret. It would be better to split Figure 3G into 4 separate graphs and include the mock-infected cells alongside the infected samples (as done in Figure 2).

      To better demonstrate the function of the RD domain of MDA5 in M. miiuy, we have changed the experimental plan, as shown in Figure 3F. We measured the induction of antiviral factors by overexpression of MDA5 and MDA5-ΔRD under poly(I:C)-HMW stimulation. This indicates that the RD domain of MDA5 has a conserved function in the recognition of poly(I:C)-HMW in M. miiuy, and it can serve as a positive control for the recognition of SCRV by the RD domain.

      Figure 4

      (1) Figure 4B - A number of important controls are missing. Was the immunoprecipitation of RNA successful? This could be shown by running a fraction of the immunoprecipitated material on an RNA gel and/or by showing that the input RNA was depleted after IP. In addition, a control IP (Streptavidin beads without biotinylated RNA) is missing to ensure that MDA5 does not stick non-specifically to the Streptavidin resin.

      We appreciate the referee's suggestions. We reran this experiment and added a non-biotin-labeled RNA IP control group; the results showed that MDA5 did not adsorb nonspecifically onto the beads (Figure 4B). In addition, following the referee's suggestion, we tested the depletion of RNA before and after immunoprecipitation. The results showed that biotin-labeled RNA, but not non-biotin-labeled RNA, was captured by the beads, indicating that the RNA precipitation was successful. However, we do not think this is necessary for the final presentation of the experimental results, so we did not show it in the figure.

      (2) Figure 4B - It is unclear why there is such a large molecular weight difference between endogenous MDA5 and MDA5-Flag (110 kDa versus 130/140 kDa). Why is there less MDA5-Flag retrieved than endogenous MDA5?

      After careful analysis, we believe that the significant difference in molecular weight between endogenous MDA5 and MDA5-Flag may be due to three reasons. Firstly, MDA5-Flag carries a 3× Flag tag. Secondly, as shown in the primer table, we cloned MDA5 between the NotI and XbaI cleavage sites of the pcDNA3.1 vector, which are located toward the rear of the cloning region. This means that the Flag tag lies at some distance from the start codon of MDA5, and the intervening vector sequences are also translated, increasing the molecular weight of the exogenous MDA5 protein. Finally, in order to facilitate amplification, the F-terminal primers of MDA5 contain a small portion of the 3'UTR sequence (excluding the stop codon). Together, these factors may have led to the significant difference in molecular weight. In addition, in order to supply the important experimental controls, we have conducted a new RNA pull-down experiment, as shown in Figure 4B.

      (3) Minor point: Figure 4B - please clarify in the figure whether RNA or protein is immunoprecipitated and via which tags.

      We have conducted a new RNA pull-down experiment, as shown in Figure 4B, and we have clearly labeled the relevant information in the figure.

      (4) Figure 4E - the fraction of MDA5 that binds 5'ppp-RNA seems incredibly minor. And why is this experiment done using 5'OH-RNA as a competitor, rather than simply incubating MDA5 and 5'OH-RNA together and demonstrating that these do not form a complex?

      The proportion of MDA5 bound to 5’ppp-RNA is influenced by many conditions, including the concentration and purity of the probe and of the purified protein. In addition, the ratio between the RNA probe and the MDA5 protein in the EMSA experiment can also have a significant impact on the results. Therefore, the EMSA cannot accurately determine the actual binding strength between MDA5 and RNA. In the EMSA procedure, both cold probes (5’ppp-RNA) and mutated cold probes (5’OH-RNA and 5’pppGG-RNA) are crucial for demonstrating the specific binding between MDA5 and 5’ppp-RNA, as they can exclude false positives caused by factors such as the presence of biotin in the purified MDA5 protein itself.

      (5) Figure 4B/4C/4F - These experiments would be strengthened by including an MDA5 mutant that cannot bind to RNA. These mutants are well-described in mammals. If these residues are conserved, it is straightforward to generate this mutant.

      As shown in Figure 3, the MDA5 of M. miiuy has an RD domain that can recognize SCRV. We constructed MDA5-ΔRD mutant plasmids with 6× His tags and purified the protein for EMSA experiments (Figure 4E). The experimental results indicate that MDA5, but not MDA5-ΔRD, can bind to 5’ppp-SCRV (Figure 4G). This further confirms the crucial role of the RD domain in recognizing 5'ppp-RNA viruses.

      (6) Minor point: Figure 4E: please clarify in which lanes MDA5 has been added.

      Thank you for the referee's suggestion. We have synthesized new 5'ppp-RNA probes (5’ppp-SCRV and its dephosphorylated derivative) and reran this experiment, and the relevant information has been added to the figure (Figure 4F).

      Figure 5

      (1) Figure 5C - As MDA5 is an interferon-stimulated gene (as shown in panel G/H/I), the increased MDA5 expression could simply explain the increase in the amount of m6A-MDA5 that is immunoprecipitated after infection. Could this figure be improved by doing a fold change between input vs m6A-IP OR uninfected vs SCRV-infected conditions? This would reveal whether the modification of MDA5 with m6A is really increased after infection.

      As shown in Figure 5F, our data indicate that the proportion of m6A-modified MDA5 does indeed increase after SCRV infection, so the increase is not solely due to the elevated expression of MDA5 itself.

      (2) Figure 5E/F - The y-axis is unclear: relative MDA5 m6A levels. Relative to what? Input? Mock infected?

      For experiments in Figure 5E/F, we first compared the m6A-IP group with the input group, and then normalized the control group (IgG group of 5E and Mock group of 5F) to “1”. We have replaced the Y-axis name with a clearer one (Figure 5E and 5F).
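      The two-step normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `relative_m6a_levels` and all numeric values are hypothetical and are not data from the study. It shows why dividing the m6A-IP signal by the input signal first removes the effect of a general rise in MDA5 expression before the control group is set to 1.

```python
def relative_m6a_levels(ip_signal, input_signal, control):
    """Hypothetical helper: IP/input ratio per group, scaled so control == 1."""
    # Step 1: compare the m6A-IP group with the input group for each sample.
    ratios = {group: ip_signal[group] / input_signal[group] for group in ip_signal}
    # Step 2: normalize every ratio to the control group (IgG or Mock).
    baseline = ratios[control]
    return {group: ratio / baseline for group, ratio in ratios.items()}


# Made-up qPCR-style quantities, chosen only to illustrate the arithmetic.
example = relative_m6a_levels(
    ip_signal={"Mock": 2.0, "SCRV": 12.0},
    input_signal={"Mock": 8.0, "SCRV": 16.0},
    control="Mock",
)
print(example)
```

      In this made-up example the input signal doubles after infection while the IP signal rises sixfold, so the relative m6A level of the infected group is three times that of the control rather than six.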

      (3) General comment - It is not mentioned in the text that MDA5 is an interferon-stimulated gene. This would account for the increase in expression (qPCR) after viral infection or poly(I:C) transfection, hence there is no novelty in this finding. In addition, the authors suggest that MDA5 increases at the protein level (by immunoblot) but the increase on these blots is not convincing (figure 5H/5I).

      We understand that the increase in expression of MDA5 as an interferon-stimulated gene after viral infection is a common phenomenon. We present it to further validate the m6A sequencing transcriptome data and to demonstrate that, although m6A modification interferes with MDA5 expression during viral infection, it cannot prevent the increase in the MDA5 mRNA level. In addition, we reran the experiment, and the results showed that the expression of MDA5 protein can indeed be specifically induced by SCRV and poly(I:C)-HMW.

      Figure 6

      (1) Figure 6E - What was the MOI of the virus used in this experiment? It is not mentioned in the figure legend.

      The MOI was 5; we have added this information to the figure legend.

      Figure 7

      (1) Figure 7J - This graphic is somewhat misleading and should be altered to better reflect the conclusions that are drawn in the manuscript. The graphic suggests that MAVS and STING interact, but this is not demonstrated in the paper. In addition, the paper does not demonstrate whether MAVS or STING (or both) are needed downstream of MDA5 to relay signalling. Finally, please draw an arrow from type I IFNs to increased expression of MDA5 to illustrate that MDA5 is an ISG.

      Thank you for the referee's suggestion. We have revised the graphic to match the conclusions of the manuscript more accurately (Figure 7J). Firstly, we have separated the STING protein from the MAVS protein. Secondly, arrows have been used to indicate that MDA5 is an IFN-stimulated gene. Finally, we have added relevant experiments to demonstrate the importance of the MITA protein in the signaling process of the MDA5-activated IFN response. In addition, the function of MAVS in binding to MDA5 and promoting its signal transduction is highly conserved, and it is well documented even in fish with RIG-I deficiency (10.1016/j.dci.2021.104235). Therefore, in Figure 7J, we still chose to show MAVS bound to MDA5 and acting as its downstream signal transducer.

      Discussion

      (1) There is very little discussion about METTL and YTHDF proteins in the discussion despite the fact that the last 2 figures are entirely devoted to these proteins.

      Based on the referee's suggestion, we have added relevant content about METTL and YTHDF proteins in the discussion. In addition, the basic mechanism and function of METTL and YTHDF proteins were briefly described in the introduction.

      Reviewer #3 (Recommendations For The Authors):

      Please refer to the specific suggestions and recommendations. They include proposals for experimental additions, improved methodologies, and suggestions to resolve writing-related concerns.

      Major concerns

      (1) I suggest changing the article title to "Functional Replacement of RIG-I with MDA5 in Fish Miiuy Croaker", or a similar title, to make it more focused and closely aligned with the content of the article.

      Following the reviewer's recommendations, we have revised the title to emphasize that our primary research subject is a teleost fish that lacks RIG-I. In addition, we have changed “5’ppp-RNA” to “5’ppp-RNA virus” to emphasize the interaction between the virus and the receptor. We believe that the revised title is more in line with the content of the article.

      (2) Due to the inherent limitations in genome sequencing, assembly, and annotation for the Miiuy croaker, comprehensive annotation of immune-related genes remains incomplete. To address this critical gap, it is recommended that authors establish experimental protocols, such as Fluorescence In Situ Hybridization (FISH), to confirm the absence of RIG-I in the Miiuy croaker. They should simultaneously employ MDA5 probes as a positive control for validation purposes.

      The miiuy croaker has a high-quality genome assembly at the chromosomal level (doi: 10.1016/j.aaf.2021.06.001). In addition, studies have shown that RIG-I is absent in the order Perciformes (doi: 10.1016/j.fsirep.2021.100012), to which the miiuy croaker belongs, so this species does indeed lack the RIG-I gene. Therefore, we do not intend to use FISH to prove this.

      (3) Similarly, it is recommended that the authors first provide evidence of the presence of 5'ppp at the 5' terminus of the genome RNA of SCRV, as demonstrated in the study by Goubau et al. (doi: 10.1038/nature13590, Supplementary figure 1). This evidence is crucial before drawing conclusions about the compensatory role of MDA5 in recognizing 5'ppp RNA viruses, using SCRV as the viral model.

      As suggested by the referee, we extracted SCRV RNA from SCRV virus particles and assessed the 5’-phosphate dependence of stimulation by SCRV RNA. Calf intestinal phosphatase (CIAP) treatment substantially reduced the stimulatory activity of SCRV RNA in MKC cells of M. miiuy (Figure 1C and 1E). In addition, similar results were obtained by transfecting VSV RNA isolated from VSV into DF-1 cells of G. gallus (Figure 1D). This evidence confirms the presence of triphosphate features in the RNA of SCRV and VSV, and indicates that birds and fish lacking RIG-I have other receptors that can recognize 5’ppp-RNA.

      (4) The 62-nucleotide (nt) 5'ppp-RNA utilized in this study was obtained from Vesicular Stomatitis Virus (VSV). In order to provide direct evidence, it is necessary to include a 62-nt 5'ppp-RNA that is directly derived from SCRV itself.

      We adopted this suggestion and synthesized a 67-nucleotide 5’ppp-SCRV RNA probe. We found that 5’ppp-SCRV activates dimerization of IRF3 and binds to MDA5 of M. miiuy in a 5’-triphosphate-dependent manner (Figure 4A-4F).

      (5) Given that RNAs with uncapped diphosphate (PP) groups at the 5′ end also activate RIG-I, similar to RNAs with 5′-PPP moieties, and the 5′-terminal nucleotide must remain unmethylated at its 2′-O position to allow RNA recognition by RIG-I, it is necessary for the authors to conduct additional experiments to supplement and validate these two distinguishing features of RIG-I in RNA recognition. This will provide more reliable evidence for the replacement of RIG-I by MDA5 in RNA recognition.

      Thank you for the reviewer's professional suggestions. We understand that exploring the binding of 5’pp-RNA and 2′-O-methylated RNA to MDA5 could further demonstrate the alternative function of MDA5. However, we think that the use of 5’ppp-RNAs and their dephosphorylated derivatives sufficiently demonstrates that the MDA5 of M. miiuy and G. gallus has evolved to recognize the 5’-triphosphate structure, as human RIG-I does. Therefore, we do not intend to conduct any additional experiments.

      (6) In section 2.3, the authors assert that Miiuy croaker recognizes SCRV through its RD domain. This claim is supported by their data showing that cells overexpressed with the MDA5 ΔRD mutant lost the ability to inhibit SCRV replication. As a result, the authors draw the conclusion that "these findings provide evidence that MDA5 may recognize 5'-triphosphate-dependent RNA (5'ppp-RNA) through its RD domain." However, to strengthen their argument, the authors should first demonstrate that during SCRV infection, MDA5-mediated antiviral immune response is indeed initiated by recognizing the 5'ppp part of the SCRV RNA, rather than the double-strand part (which can exist in ssRNA virus) of the viral RNA, as this is naturally a ligand for MDA5. Additionally, the authors should treat the isolated SCRV RNA with CIP to remove the phosphate group and examine the binding of MDA5 with SCRV RNA before and after treatment. They should also transfect CIP-treated or untreated SCRV RNA into MDA5 knockdown and wild-type MKC cells to investigate the induction of antiviral signaling and levels of viral replication. Finally, the authors should verify the binding ability of the mutants with isolated SCRV RNA, with or without CIP treatment, to determine which domain of MDA5 is responsible for SCRV 5'ppp-RNA recognition.

      We understand the reviewer's concern that MDA5 may recognize SCRV by binding to dsRNA present during infection. Following the reviewer's suggestion, we extracted SCRV RNA and obtained its dephosphorylated form using calf intestinal phosphatase (CIAP). Next, we transfected both into MDA5-knockdown and wild-type MKC cells and measured IRF3 dimerization and the IFN response. The results indicate that SCRV RNA does indeed activate immunity in a triphosphate-dependent manner, and that knockdown of MDA5 prevents immune activation by SCRV RNA (Figure 1C and 1E, Figure 2P and 2Q). Finally, we synthesized a 5'ppp-SCRV RNA probe and demonstrated that MDA5 binds to 5'ppp-SCRV through the RD domain (Figure 4E-4G). We believe that these results better demonstrate that MDA5 recognizes 5’ppp-RNA through its RD domain and address the reviewer's concerns.

      (7) Similarly, merely presenting Co-IP data demonstrating the interaction between Miiuy croaker MDA5 and STING in overexpressed EPC cells does not justify the claim that "in vertebrates lacking RIG-I, MDA5 can utilize STING to facilitate signal transduction in the antiviral response". This is because interactions observed through overexpression may not accurately reflect the events occurring during viral infection or their actual antiviral functions. To provide more robust evidence, it is essential to conduct functional experiments after STING knockout (or at least knockdown). Furthermore, it is important to note that Miiuy Croaker alone cannot adequately represent all "vertebrates lacking RIG-I".

      We found that co-expression of STING and MDA5 can enhance MDA5-mediated IFN-1 response during SCRV virus infection, while knocking down STING can restore MDA5-mediated IFN-1 response (Figure 2N and 2O). This indicates that STING plays an important signaling role in the immune response of MDA5 to RNA viruses. In addition, loss of RIG-I is a common phenomenon in vertebrates, and STING of birds such as chickens (doi: 10.4049/jimmunol.1500638) and mammalian tree shrews (doi: 10.1073/pnas.1604939113) can also bind to MDA5, indicating that STING can indeed play a crucial role in MDA5 signaling in species with RIG-I deficiency. We have added this section to our discussion and elaborated on our observations in more cautious language.

      (8) In the manuscript, a series of experiments were conducted using an antibody (Beyotime Cat# AF7164) against endogenous MDA5. The corresponding immunogen for this MDA5 antibody is a recombinant fusion protein containing amino acids 1-205 of human IFIH1/MDA5 (NP_071451.2). However, the amino acid sequences of IFIH1/MDA5 differ substantially between humans and Miiuy croaker, which could introduce errors in the results. Therefore, it is essential to employ antibodies specifically designed for targeting Miiuy croaker's own MDA5 in the experiments.

      As shown in Figure 2B, the antibody against endogenous MDA5 can detect the MDA5 protein overexpressed from plasmids, suggesting that this antibody can indeed specifically recognize the MDA5 protein of M. miiuy.

      (9) It is recommended to investigate the phosphorylation of IRF3 in order to confirm the downstream signaling pathway during viral infection when MDA5 is knocked down or overexpressed.

      Due to the lack of available phospho-specific antibodies for fish IRF3, we used IRF3 dimerization assays to detect downstream signaling (Figure 1C and 1D, Figure 2P, Figure 4C and 4J).

      (10) The use of poly I:C as a mimic for dsRNA to investigate MDA5's recognition of 5'ppp-RNA in hosts lacking RIG-I, as well as the examination of the regulatory role of MDA5 m6A methylation upon activation by 5'ppp-RNA, may be inappropriate. Poly I:C does not possess 5'ppp, and while it has been identified as a ligand for MDA5 in various studies, MDA5 cannot serve as a substitute for RIG-I in recognizing poly (I:C). Therefore, the authors should utilize 5'ppp-dsRNA as the mimic and include the corresponding 5'ppp-dsRNA control without a 5'triphosphate as the negative control (both available from InvivoGen). This approach will specifically elucidate the mechanisms involved when MDA5 functions similarly to RIG-I in the recognition of 5'ppp-RNA.

      In our study, we used poly(I:C)-HMW, a known dsRNA mimetic that can be preferentially recognized by MDA5 rather than RIG-I, as a positive control for activating MDA5. What we want to demonstrate is that, like poly(I:C)-HMW (positive control), SCRV can also promote MDA5-mediated IFN immunity, further indicating the important role of MDA5 in 5’ppp-RNA virus invasion. We have clearly labeled the type of poly(I:C) in the figures and legends to avoid misunderstandings for readers.

      (11) In Figure 2, Figure 3, and Figure 6, the appearance of virus plaques is not readily apparent, and it is necessary to replace these images with clearer photographs. It appears that MKC or MPC cells are not appropriate for conducting plaque assays. To accurately assess viral proliferation, the authors should measure key indicators throughout the process, such as the production of positive-strand RNAs (+RNAs), replication intermediates (RF), and transcription of subgenomic RNAs. This approach is preferable to solely measuring the M and G protein genes from the virus genome as positive results can still be observed in contaminated cells.

      As pointed out by the reviewer, we agree that the virus plaque images in Figure 2K and Figure 3D were not clear enough, so we have replaced them with new, clearer images (Figure 2J and Figure 3D). However, we think that the other images clearly display the proliferation of SCRV, so we did not replace them. In addition, the primers we currently use do measure +RNA, so the replication level of SCRV can be accurately evaluated without being affected by virus contamination. Because the regions targeted by the two primer pairs lie within the SCRV-M and SCRV-G protein genes, we label them as SCRV-M and SCRV-G to distinguish between the two primer pairs. To avoid reader misunderstanding, we have modified the Y-axis labels in the figures (Figure 2I and 2K, Figure 3E, Figure 6E and 6O).

      (12) There is a substantial disparity in the molecular size of M. miiuy MDA5 between endogenous and exogenously expressed proteins, as shown in Figure 2A and 2C-D. Please provide clarification.

      Please refer to the response to Reviewer 2's question regarding Figure 4B above.

      (13) The manuscript incorporates the evolutionary perspective, but lacks specific evolutionary analysis. Thus, it is essential to include relevant analysis to comprehend the evolutionary dynamics and positive selection on MDA5 and LGP2 in the absence of RIG-I in Miiuy croaker. This can be achieved through theoretical calculations using appropriate algorithms, such as the branch models and branch-site models based on the maximum-likelihood method implemented in the phylogenetic analysis by maximum likelihood (PAML) package.

      In fact, we have analyzed the molecular evolution of MDA5 and LGP2. Unfortunately, even when analyzing only the MDA5/LGP2 CDS sequences in fish, we found that the topologies of the MDA5/LGP2 gene trees were largely consistent with the species tree. Thus, species with and without RIG-I do not form separate clusters in the gene trees, making it extremely difficult to analyze the molecular evolution of MDA5/LGP2 caused by RIG-I deficiency. Consequently, we did not pursue this aspect of the analysis.

      (14) If the narrative regarding m6A methylation goes beyond the activation of MDA5 through recognition of 5'ppp-RNA and represents a regulatory mechanism for all MDA5 activation events, it is not relevant to the theme of "An arms race under RIG-I loss: 5'ppp-RNA and its alternative recognition receptor MDA5." Therefore, all investigations in this paper should focus solely on events when MDA5 recognizes 5'ppp-RNA. Any data associated with the broader regulatory mechanisms and m6A methylation of MDA5 should be excluded from this manuscript and instead be included in a separate study dedicated to exploring this specific topic.

      Our theme aims to showcase the interaction between 5'ppp-RNA viruses and host viral receptors, rather than an interaction involving 5'ppp-RNA alone, which our original title could not accurately express. Therefore, we made two main changes. Firstly, we limited the study species to M. miiuy, although some of our experiments on the functional substitution of MDA5 for RIG-I involved birds. Secondly, we changed “5’ppp-RNA” to “5’ppp-RNA virus”. We believe that the revised title is more in line with our current research content.

      (15) The running title appears to be hastily done.

      We modified it to “MDA5 recognizes 5’ppp-RNA virus in species lacking RIG-I”.

      (16) There are many descriptions that are not strongly related to the main theme of the article in the introduction section, making it lengthy and fragmented. Please focus on the research background of RIG-I and MDA5, including their structures, functions, and regulatory mechanisms, as well as the research progress on the compensatory effect of MDA5 in the absence of RIG-I and its evolutionary adaptation mechanism in other species.

      Based on the suggestions of the reviewers, we have removed some of the less relevant content in the introduction and added research progress on the compensatory effect of MDA5 in the evolutionary adaptation mechanism of tree shrews in the absence of RIG-I.

      (17) Lines 149-156 in the "Results" section include content that resembles an "Introduction". It is important to avoid duplicating information in the results section. Therefore, the authors are encouraged to revise this paragraph to ensure conciseness in the article.

      We have streamlined this section to enhance the article's conciseness and clarity.

      (18) In the "Results" section, at line 177, the authors assert, "As depicted in Figure 1F-1H," which should be corrected to Figure 2F-2H. Furthermore, the y-axis of the two figures on the right-hand side of Figure 2H represents the ISG15 genes. At line 182, "as demonstrated in Figure 1I-1L," should be revised as "as illustrated in Figure 2I-2L". The authors demonstrated a lack of attention to detail.

      We thank the reviewer for pointing out our errors; we have made the necessary corrections.

      (19) In lines 197-198, the authors stated that "MDA5-ΔRD showed an inability to interact with SCRV." However, Figure 3D did not reveal any significant difference, thus it is advisable to repeat this experiment at least once.

      We have replaced this virus plaque image with a new one (Figure 3D).

      (20) In lines 200-201 of the "2.3 RD domain is required for MDA5 to recognize SCRV" section, the authors report that the expression of antiviral genes was induced by the overexpression of both MDA5 and MDA5-ΔRD, even in the absence of infection (Figure 3F). Why does the expression of antiviral genes increase in the absence of viral RNA stimulation? Please provide a reasonable explanation.

      In the absence of viral infection, overexpression of viral receptor proteins may still trigger spurious signaling and affect the host's immunity. We speculate that because both MDA5 and MDA5-ΔRD retain the CARD domain, they can still induce the expression of antiviral factors without ligands, although this induction is much weaker than that caused by viral infection. However, in order to better demonstrate the function of the RD domain of MDA5 in M. miiuy, we have changed the experimental plan, as shown in Figure 3F. We measured the induction of antiviral factors by overexpression of MDA5 and MDA5-ΔRD under poly(I:C)-HMW stimulation. This indicates that the RD domain has a conserved function in the recognition of poly(I:C)-HMW in M. miiuy, and it can serve as a positive control for the recognition of SCRV by the RD domain of MDA5.

      (21) Please provide the GeneBank accession number of M. miiuy MDA5.

      The GenBank accession number of M. miiuy MDA5 has been added to section 4.5 (plasmids construction).

      (22) The content of lines 228-233 in the "Results" section bears resemblance to that of the "Introduction." To ensure the avoidance of information duplication, it is recommended to remove this paragraph from the results section.

      This section has been streamlined.

      (23) The bands of mmiMDA5 in the 5'ppp-RNA and dsRNA lanes in Figure 4B are weak and almost unobservable. Please replace them with clear images.

      We have rerun this experiment and replaced the images (Figure 4B).

      (24) In Figure 5G and at line 253, there are only results presented for the SCRV infection group, while no results are shown for the control group. This raises the question of why the control group results are missing. It is necessary to provide a reasonable explanation or correction for this issue.

      The "0 h" infection time point of the SCRV virus is the control group, and we have replaced it with a more intuitive image (Figure 5G).

      (25) In Figure 7C, it would be necessary to include the western blot result of YTHDF protein expression in order to verify the efficiency of YTHDF siRNA.

      In fact, we have attempted to detect the endogenous expression of YTHDF protein using available commercial antibodies. Unfortunately, only the YTHDF2 antibody can specifically recognize the endogenous protein expression of YTHDF2 in M. miiuy. In addition, the knockdown effect of si-YTHDF2 has been validated by YTHDF2 antibody (doi: 10.4049/jimmunol.2200618).

      (26) In line 422 of the "4.3 Cell culture and treatment" section, the paragraph raises a question regarding the nature of Miiuy croaker kidney cells (MKCs) and spleen cells (MPCs) - whether they are cell lines or freshly isolated cells (or primary cultures) derived from kidney and spleen tissues. If these cells are indeed cell lines, it is requested to provide detailed information about the sources and properties of the cells (such as whether they are epithelial cells or other mixed cell types) and the generations of propagation. Alternatively, if the cells were freshly isolated or primary cultures obtained from fish, the method for cell isolation should be provided. The source and stability of cells are extremely important for ensuring the repeatability and reliability of experimental outcomes.

      M. miiuy kidney cells (MKCs) and spleen cells (MPCs) are cell lines derived from the kidney and spleen tissues of M. miiuy, and they were used between passages 20 and 40. These details have been incorporated into section 4.3.

      (27) There are many inaccurate descriptions in the text, which employ concepts that are too broad. These descriptions need to be narrowed down to specific species or objects. Here are a few examples, along with the necessary revisions. Other similar instances should also be revised accordingly. For instance, in line 119, "fish MDA5" should be changed to "Miiuy croaker MDA5." Similarly, in line 166, "fish MDA5-mediated signaling pathway" should be changed to "Miiuy croaker MDA5-mediated signaling pathway." In line 174, "fish MDA5" should be revised to "Miiuy croaker MDA5." Additionally, in line 185, "antiviral responses of teleost" should be changed to "antiviral responses of Miiuy croaker." In line 197, "interact with SCRV" should be revised to "interact with 5'ppp-RNA of SCRV." In line 337, "loss of RIG-I in the vertebrate" should be modified to "loss of RIG-I in Miiuy croaker and chicken." Similarly, in line 338, "MDA5 of fish" should be changed to "MDA5 of Miiuy croaker." Lastly, in line 348, "RIG-I deficient vertebrates" should be revised to "RIG-I deficient Miichthys miiuy and Gallus gallus."

      Thank you for the reviewer's suggestions. We have made revisions to these inaccurate descriptions and reviewed the entire manuscript to address similar statements with broad concepts.

      (28) Finally, it should be noted that a similar discovery has already been reported in tree shrews (Ling Xu, et al., Proc Natl Acad Sci., 2016, 113(39):10950-10955). This article shares similarities with that research report, therefore it is necessary to discuss in detail the relationship between the two in the discussion and compare and analyze the evolutionary patterns of MDA5 from it.

      Based on the reviewer's suggestions, we have compared the similarities and differences between these two reports during the discussion and analyzed the evolutionary dynamics of MDA5 in these vertebrates lacking RIG-I.

      Minor concerns:

      We thank the reviewer for their meticulous examination of our manuscript; we have made revisions in response to the following suggestions.

      (1) At line 120, the sentence "SCRV(one 5'ppp-RNA virus)" should have a space between "SCRV" and "(one 5'ppp-RNA virus)". Please make this correction.

      Corrected.

      (2) At lines 147-148, the sentence "However, the downstream gene of TOPORSa is missing a RIG-I" is not accurate and needs modification.

      We have modified this sentence.

      (3) At line 184, "findings indicate" should be corrected to "findings indicated".

      Corrected.

      (4) At line 189, "a 5'ppp-RNA virus" should be deleted and the text seems redundant.

      Deleted.

      (5) At line 198, "replication. (Figure 3C-3E)", please remove the punctuation between "replication" and "(Figure 3C-3E)".

      Corrected.

      (6) At line 416 in "Materials and methods" section, "4.2 Sample and challenge" should be corrected to "4.2 Fish and challenge".

      Corrected.

      (7) At line 419, the authors state that "The experimental procedure for SCRV infection was performed as described", please briefly describe the SCRV infection method and the infectious dose.

      Based on the reviewer's suggestions, we have added relevant descriptions of SCRV infection in section 4.2.

      (8) There are several formatting issues in the "Materials and Methods" section. For instance, in line 424, there is no space between the number and letter in "100 μg/ml" and "26 ℃" should be corrected to "26℃". Additionally, in line 430, "Cells" should be corrected to "cells".

      Corrected.

      (9) At line 446, "50 ng/ul" and "100 mU/ul" should be corrected to "50 ng/μl" and "100 mU/μl".

      Corrected.

      (10) At line 459, "primers 1)" should be corrected to "primers".

      Corrected.

      (11) At lines 461-464, the description "For protein purification, MDA5 plasmids with 6× His tag was constructed based on pcDNA3" seems to be no direct logical connection between protein purification and the plasmid construction. Please make the necessary corrections.

      Corrected.

      (12) At line 548, "cytoplasmic" should be corrected to "Cytoplasmic".

      Corrected.

      (13) At line 549, "5× 10⁷" should be corrected to "5 × 10⁷".

      Corrected.

      (14) At line 557, "MgCl2" should be corrected to "MgCl₂".

      Corrected.

      (15) At line 558, "6 %" should be corrected to "6%".

      Corrected.

      (16) At line 565, "50μg" should be corrected to "50 μg".

      Corrected.

      (17) At line 571, "300±50 bp." should be corrected to "300 ± 50 bp."

      Corrected.

      (18) At lines 592-593, the sentence "After several incubations, the m6A level was quantified colorimetrically at a wavelength of 450 nm" does not read smoothly, please improve it.

      Revised.

      (19) At line 786, "MDA5 recognize" should be corrected to "MDA5 recognized".

      Corrected.

      (20) At lines 788 and 798, "Pulldown" should be corrected to "Pull-down".

      Corrected.

      (21) At lines 790 and 796, "bluestaining" should be corrected to "blue staining".

      Deleted.

      (22) At line 825, "SCRV and infection" should be corrected to "SCRV infection".

      Corrected.

      (23) At lines 826-827, "SCRV (H) and poly(I:C) (I) infection" should be corrected to "SCRV infection (H) and poly(I:C) stimulation (I)".

      Corrected.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We are thankful for the comments and suggestions from the Editor and Reviewers about our manuscript submitted to eLife. We have addressed all the comments, and we think these modifications will help bring clarity to our message and be helpful to your readership. Here we include an outline of the corrections performed, as well as a detailed response to each of the reviewers' comments.

      Outline of corrections made per the Editor's and Reviewers' suggestions:

      · The title of the manuscript has been changed to reflect a more conservative conclusion.

      · Changes in the main manuscript text were made to enhance clarity, including the use of genetic terminology and naming.

      · Specific responses to some comments from the reviewers are included in this document. We combined some comments that would be better addressed together.

      Accompanying this letter is an updated version of our manuscript with the track changes feature enabled. Again, we are thankful for the comments and suggestions we received, and we hope this revised version of our manuscript will be accompanied by an updated assessment and public reviews and a final eLife Version of Record.

      Response to the public review and minor recommendations.

      From Reviewer #1:

      The major inference of the work is that SIV infection of gorillas drove the observed diversity in gorilla CD4. This is supported by the majority of SNPs being localized to the CD4 D1, which directly interacts with the envelope, and the demonstrated functional consequences of that diversity for viral entry. However, SIVgor (to the best of my knowledge) only infects Western lowland gorillas (Gorilla gorilla gorilla), and one Gorilla gorilla diehli and three Gorilla beringei graueri individuals were included in the haplotype and allele frequency analyses. The presence of these haplotypes or the presence of similar allele frequencies in Eastern lowland and mountain gorillas would impact this conclusion. It would be helpful for the authors to clarify this point.

      From Reviewer #1 (minor comment):

      Which subspecies of gorilla are the nsSNPs coming from? Gorilla gorilla diehli [n =1]; Gorilla beringei graueri [n = 3]) are not extant reservoirs of SIV and to my knowledge are not thought to have been, and so it's important to point out where the diversity is coming from if the authors are asserting that SIVgor drove this population-level diversity in gorilla CD4.

      We initially included genomic data from all the gorilla individuals available to maximize sensitivity to identify allelic variants. Although evidence points to eastern gorillas not being currently infected with SIV, our results show that all allelic variants identified have differential susceptibility to the HIV-1 and SIVcpz strains tested. The allelic variants we identified with this genomic data set match the variants identified by Russell et al (doi.org/10.1073/pnas.2025914118), including the ones found in eastern gorillas, and recapitulate that those variants have differential susceptibility to lentiviral entry, similar to the variants of western populations. Whether eastern gorillas have been exposed to lentiviruses in the past remains unknown.

      From Reviewer #1:

      The authors appear to use a somewhat atypical approach to assess intra-population selection to compensate for relatively small numbers of NHP sequences (Fig. 6). However, they do not cite precedence for the robustness of the approach or the practice of grouping sequences from multiple species for the endemic vs other comparison. They also state in the methods that some genes encoded in the locus were removed from the analysis "because they have previously been shown to directly interact with a viral protein." This seems to undercut the analysis and prevents alternative explanations for the observed diversity in CD4 (e.g., passenger mutations from selection at a neighboring locus).

      Given the nature of our samples, to detect any influence of natural selection acting on CD4, we chose to compare patterns of molecular evolution of CD4 to those of its neighboring loci. Comparisons of molecular evolution signatures across genomic regions are the basis of methods to detect positive selection (e.g., Sabeti DOI: 10.1038/nature01140). For our comparison, the neighboring loci represent our neutral standard for the genomic region in which CD4 resides. Our rationale is that demographic and neutral influences on the number and frequency of polymorphic sites in a region would equally affect all loci in that genomic region. Because these neighboring loci are our neutral benchmark, we excluded before analysis other genes in this genomic region that interact with viruses. The logic is that these loci may be evolving under the influence of positive selection and would decrease the power of our comparison. None of the excluded loci are direct neighbors of CD4. This, together with the average recombination rate of the CD4 genomic region in humans, reduces the possibility that what we are observing at CD4 is due to selection acting at a neighboring locus. In addition, the classic population genetic method to detect positive selection, the McDonald-Kreitman test (McDonald DOI: 10.1038/351652a0), was originally presented combining polymorphism data across species. We assume that any effect on levels of diversity created by combining variability between species would equally affect all loci included in the study, not just CD4.

      From Reviewer #1:

      Data in Figure 5 is graphed as % infected cells instead of virus titer (TDU/mL). It's unclear why this is the case, and prevents a comparison to data in Figure 2 and Figure 4.

      From Reviewer #1 (minor comment):

      Figure 5: the data presentation is now shown as % infected cells instead of viral titer. This makes it difficult to compare data from Figure 5 to other figures. Can the authors please either justify this change, display data consistently or provide matched data displays as a Supplemental Figure?

      For the experiments presented in Figures 2 and 4, we used different volumes of infecting pseudoviruses, which allowed us to identify the linear range of infection. Then, based on the number of cells plated per experimental replicate, we calculated a virus titer. In follow-up experiments (Fig. 5), we used fixed volumes of virus that would infect ~10-20% of control (wild-type; wt) CD4-expressing cells. Comparisons were then made between wt and mutated CD4s, and these data are best presented in their raw form as percent cells infected. Although this change in method prevents direct comparison between the figures, we focused on the differences observed between the experimental conditions within each experimental panel.
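
      As an illustration of the titer calculation described above, the following is a minimal sketch of one common way to convert a percent-infected readout into transducing units per mL. It is not the authors' analysis code: the function name, the formula's use here, and the example numbers (15% infected cells, 50,000 cells plated, 10 µL of virus) are assumptions chosen for illustration, and the calculation is only meaningful for wells within the linear range of infection.

```python
def titer_tdu_per_ml(fraction_infected, cells_plated, virus_volume_ml):
    """Estimate transducing units (TDU) per mL from a single well.

    Assumes the well lies in the linear range of infection, so that the
    number of infected cells approximates the number of infectious units added.
    """
    if not 0.0 <= fraction_infected <= 1.0:
        raise ValueError("fraction_infected must be between 0 and 1")
    if virus_volume_ml <= 0:
        raise ValueError("virus_volume_ml must be positive")
    infected_cells = fraction_infected * cells_plated
    return infected_cells / virus_volume_ml


# Hypothetical well: 15% GFP-positive cells, 50,000 cells plated, 10 µL of virus.
example_titer = titer_tdu_per_ml(0.15, 50_000, 0.010)
print(f"{example_titer:.3g} TDU/mL")
```

      With a fixed virus volume, as in the Figure 5 experiments, the titer is simply a constant multiple of the percent infected, which is why the raw percentages carry the same comparative information within a panel.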

      From Reviewer #1:

      The lack of pseudotyping with SIVgor envelope is a surprising omission from this study, that would help to contextualize the findings.

      From Reviewer #2 (minor comment):

      The inclusion of HIV-1 but not SIVgor strains in Figures 2D/E is somewhat conspicuous since chimpanzee alleles certainly differ in susceptibility to SIVcpz (and SIVgor) strains per Russell et al. 2021. The authors should either test some SIVgor infections, cite published data on at least extant human/chimpanzee/gorilla CD4 susceptibility to SIVgor, or address why they did not include it.

      We agree that host susceptibility to SIVgor strains would have been an interesting question to explore. However, we opted to focus on the transmission of SIVcpz strains into gorilla populations for this study. It is worth mentioning that we have cloned SIVgor envelope genes from some strains into our expression system, but we were unable to recover infectious pseudoviruses using an HIV-1ΔEnv-GFP backbone. This suggests that the HIV-1 backbone may be incompatible with incorporating SIVgor Env into virus particles. Recently, Russell et al (DOI: 10.1073/pnas.2025914118) managed to generate SIVgor Env pseudotyped virions using a different backbone (SIVcpzΔEnv-GFP) that was unavailable to us at the time of this study.

      From Reviewer #1:

      Similarly, building gorilla CD4 haplotype SNPs onto the hominin ancestor (as opposed to extant human CD4) may provide additional insights that are meaningful toward understanding the evolutionary trajectory of gorilla CD4.

      We decided to use the extant human CD4 as a backbone to test the effects of the individual amino acid variants found in the allelic diversity of the gorilla population, since the human protein is highly susceptible to all the HIV-1 and SIV strains tested and the expected phenotype is a loss of function. Since the D1 domains of the human and ancestral CD4 sequences are almost identical (except for a change that is fixed in gorillas), and they showed similar levels of susceptibility to lentivirus entry, we expect that the phenotypes found would be the same if the gorilla SNPs were built into the ancestral CD4 backbone.

      From Reviewer #2:

      To bolster the argument that lentiviruses are indeed the causative driver of this diversification, which seems likely from a logical perspective but is difficult to prove, Warren et al. pursue two novel lines of evidence. First, the authors reconstruct ancestral CD4 genes that predate lentiviral infection of hominid populations. They then demonstrate that resistance to lentiviral infection is a derived trait in chimpanzees and gorillas, which have been co-evolving with endemic lentiviruses, but not in humans, which only recently acquired HIV. Nevertheless, the derived resistance could be stochastic or due to drift. This argument would be strengthened by demonstrating that bonobo and orangutan CD4, which also do not have endemic lentiviruses, resemble the ancestral and human susceptibility to great-ape-infecting lentiviruses.

      From Reviewer #2 (minor comment):

      The data presented in Figure 2, showing that chimp and gorilla (but not human) CD4 resistance to lentiviral infection is a derived trait, is very intriguing for suggesting that endemic lentiviruses are the causative driver of CD4 evolution. Nevertheless, this could be stochastic or due to genetic drift. Given the later emphasis on several other non-endemically infected species, the authors should at the very least include the sequences for bonobo and orangutan CD4 in the presented alignment (Fig 2B). Ideally, they would also test these orthologs to demonstrate that they are not resistant to lentiviruses infecting great apes (SIVcpz / HIV-1 / SIVgor). If they have also derived resistance, this would suggest a possible other evolutionary driver or genetic drift.

      Based on our analysis on polymorphic sites using available data from populations of apes, we strongly believe the accumulation of resistant polymorphisms in CD4 did not arise in a stochastic manner. The frequency and accumulation of these changes strongly correlate with the function of CD4 as a receptor for lentivirus entry. We agree that experimentally testing the CD4 protein from bonobo and orangutan would strengthen our conclusions; however, based on our genomic analyses, we decided to focus on the species that would present a higher level of variability of susceptibility to the lentivirus tested, namely gorillas and chimpanzees.

      From Reviewer #2:

      Warren et al. provide a population genetic argument that only endemically infected primates exhibit diversifying selection, again arguing for endemic lentiviruses being the evolutionary driver. The authors compare SNP occurrence in CD4 to neighboring genes, demonstrating that non-synonymous SNP frequency is only elevated in endemically infected species. Moreover, these amino-acid-coding changes are significantly concentrated in the CD4 domain that binds the lentiviral envelope. This is a creative analysis to overcome the problem of very small sample sizes, with very few great ape individuals sequenced. The additional small number of species compared (2-3 in each group) also limits the power of the analysis; the authors could consider expanding their analysis to Old World Monkey species that do or do not have endemic lentiviruses, as well as great apes.

      The scope of this project was to evaluate the differential phenotype of the accumulated polymorphisms found in the ape branch of the primates. Although evaluating the accumulation of polymorphisms in a broader range of primates would generate interesting observations, this would likely require increasing the total number of primate species to include sampling along the speciation tree, many of which lack population level data.

      From Reviewer #1 (minor comment):

      Ancestral reconstruction methods and associated data tables should be included to indicate statistical support for assigned codons. A comment on ambiguity at relevant positions is needed. Similarly, given the polymorphic nature of gorilla and chimpanzee CD4, how confident are the authors in their ancestral reconstructions based on a single representative genome per species? Does this change when you include the broader panel of gorilla sequences? Is the ancestral reconstruction robust to other methods besides PAML?

      We used the PAML software package to reconstruct the ancestral hominin and hominid sequences of CD4 because it is a standard and well-recognized method for this purpose. For this analysis, we used the set of primate sequences selected for positive selection analyses (see methods), namely the longest isoform sequences for each of the available species that best aligned with human CD4. We feel that the best way to perform the ancestral state reconstruction was to use only these curated sequences instead of the population-level sequences, removing potential biases introduced by having different numbers of variants per species.

      From Reviewer #1 (minor comment):

      Page 10: "It seems that allele 2, which doesn't have this glycan, would be at a fitness disadvantage. In support of this, allele 2 is one of the least frequent alleles in the gorilla population that we surveyed (Figure 3B)." - this inference depends on the gorilla species that encode allele 2 and allele frequencies. There are statistical tests to address this inference.

      Population genetic statistics that test for skews in sample allele frequencies are not appropriate here due to the nature of the samples in this study. However, the reviewer is correct that our inference about allele frequency depends on the gorilla species in which we find this allele. Allele 2 is found in the Gorilla beringei graueri subspecies of gorilla included in this study. We only have data for three individuals (six alleles) from this subspecies, compared to 51 individuals (102 alleles) from Gorilla gorilla gorilla. As such, genetic subdivision between the gorilla subspecies could also produce the low frequency of allele 2 observed in our sample.

      From Reviewer #1 (minor comment):

      Page 11: "These results imply that the resistance to SIVcpz found in gorilla individuals is not dependent on single amino acids, but rather the cumulative effect of multiple SNPs." Would it be more relevant (or relevant in other ways) to test this statement by putting those mutations into the hominid ancestor? Testing individual residues in the context of human CD4 may be subject to epistasis or several other factors.

      We agree that constructing multiple of the resistant SNPs in the susceptible human background would have strengthened our hypothesis, as all these amino acid changes are associated with increased resistance to at least one of the lentiviruses tested. However, the number of CD4 variants to test would increase significantly and we feel that this approach was out of the scope of this manuscript.

      From Reviewer #1 (minor comment):

      Figure 6: If you perform this analysis on chimpanzee CD4 alone do you get the same result? Just gorillas? If you remove eastern/mountain gorillas? The very small numbers of non-human non-SIV-reservoir great apes may preclude a strong conclusion.

      We agree that our study is limited by the small number of available sequences from individuals of the studied species. If we remove a whole species or subspecies, the statistical power would be greatly reduced. Removing all chimpanzees or all gorillas (or a subspecies) would still show that the remaining endemically infected species accumulates SNPs in the D1 region of CD4, although with less statistical significance.

      From Reviewer #2 (minor comment):

      Related to Figure 2: It would strengthen the argument that resistance is a derived trait if the authors mapped the causative mutations from gorilla CD4 onto the ancestral hominin CD4. However, this experiment is not particularly critical, merely a suggestion.

      We appreciate this suggestion. We decided to use the human CD4 backbone as it is widely susceptible to lentiviral entry. The hominid and hominin ancestral sequences are almost identical to the human sequence in domain 1, except for a fixed mutation shared with the gorilla CD4. We expect that the SNPs observed in the gorilla population would also reduce susceptibility to lentivirus entry in the ancestral CD4 reconstructions.

      From Reviewer #2 (minor comment):

      Related to Figure 3B: It is difficult to make much of the allele frequency for 8 alleles in 32 individuals. Can the authors collate this with allele frequency for the referenced 100 individuals from Russell et al. 2021, to give a better sense of population frequency? This may allow the authors to better correlate allele frequency with SIVcpz resistance patterns in Figure 4, strengthening their argument that more resistant alleles should be over-represented in the population.

      At the time of our analysis, the data from Russell et al (DOI: 10.1073/pnas.2025914118) were not available to collate or compare. When those data became available, we immediately checked for the alleles we had found and confirmed that they were also detected in the samples used in that study.

      From Reviewer #2 (minor comment):

      Related to Figure 6: As written, several methodological details should be clarified. How were human genomes selected to limit the sample size to 50?

      We selected a total of 50 human individuals in order to match the sample size of the largest group in Fig 6B (chimpanzee, n=50). We randomly selected 10 individuals from each of the 5 superpopulations [Africans (AFR), Admixed Americans (AMR), East Asians (EAS), Europeans (EUR) and South Asians (SAS)] defined by the 1000 Genomes Project.
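
      The selection scheme described above (10 individuals drawn at random from each of five superpopulations) can be sketched as follows. This is a minimal illustration, not the authors' script: the sample identifiers, the fixed seed, and the function name are hypothetical, and the response does not state which random number generator or seed was actually used.

```python
import random

SUPERPOPULATIONS = ("AFR", "AMR", "EAS", "EUR", "SAS")


def sample_per_superpopulation(ids_by_group, per_group=10, seed=0):
    """Draw `per_group` individuals without replacement from each superpopulation."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the draw is reproducible
    chosen = {}
    for group in SUPERPOPULATIONS:
        candidates = sorted(ids_by_group[group])  # sort so the result does not depend on input order
        chosen[group] = rng.sample(candidates, per_group)
    return chosen


# Placeholder identifiers standing in for real sample IDs.
panel = {g: [f"{g}_{i:03d}" for i in range(100)] for g in SUPERPOPULATIONS}
selected = sample_per_superpopulation(panel)
total_selected = sum(len(ids) for ids in selected.values())
print(total_selected)
```

      Sampling within each superpopulation, rather than from the pooled panel, guarantees equal representation of the five groups in the 50-individual set.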

      From Reviewer #2 (minor comment):

      Related to Figure 6: What comparison is being reported for the Mann-Whitney U test (CD4 vs. which gene)? Are the means shown in A an average of 2 (endemic) or 3 (non-endemic) species - if so, the authors should show the individual data points to give a clearer depiction of the data spread. In addition, it is not clear that a statistical test with sample sizes of 2 is meaningful, since Mann Whitney typically assumes n > 5. To strengthen this statistical argument, it may be necessary to include additional species that have (a) multiple genomes (or at least this locus) sequenced, and (b) have or lack lentiviral sequences. This may necessitate expanding the analysis to include Old World Monkeys (e.g. Rhesus Macaque Genome Project).

      In Figure 6, we use the Mann-Whitney U test to compare variation between CD4 and the neighboring loci. The average and SEM are for two endemic and four non-endemic species (the two orangutan datasets are from two distinct species, unlike the gorilla subspecies). It is true that our sample size is small for any statistical testing. For the Mann-Whitney U test, it is generally preferred to have n > 5 in each group. We therefore do run into problems with the endemically infected comparisons, as we only have two data points (chimpanzee and gorilla) for the CD4 group. For the uninfected species, CD4 has four data points.
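
      The sample-size concern can be made concrete with a short calculation. For an exact Mann-Whitney U test without ties, the smallest attainable two-sided p-value depends only on the two group sizes, because the most extreme ranking is one of C(n1 + n2, n1) equally likely arrangements under the null hypothesis. The sketch below assumes an exact, tie-free test; the group sizes 2 and 4 are taken from the species counts mentioned above purely for illustration and may not correspond to the exact groups compared in Figure 6.

```python
from math import comb


def min_two_sided_p(n1, n2):
    """Smallest two-sided p-value of an exact, tie-free Mann-Whitney U test.

    The most extreme ranking has probability 1 / C(n1 + n2, n1) under the
    null hypothesis, and the two-sided test doubles that probability.
    """
    arrangements = comb(n1 + n2, n1)
    return min(1.0, 2 / arrangements)


print(round(min_two_sided_p(2, 4), 3))  # 2/15, about 0.133
print(round(min_two_sided_p(5, 5), 4))  # 2/252, about 0.0079
```

      Under these assumptions, a comparison of two values against four cannot reach p < 0.05 regardless of how large the difference is, whereas groups of five each can.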

      From Reviewer #1 (minor comment):

      Page 6. "This suggests that the ancestral versions of CD4 in apes were susceptible to primate lentivirus entry" - The data show that tested virus pseudotyped with SIV/HIV envs can engage ancestral CD4 in the context of a canine cell line expressing human CCR5, but not necessarily that this interaction was sufficient for the process of entry per se, especially in the context of a gorilla (or hominid) cell. Some additional context would be useful for a broad readership.

      From Reviewer #1 (minor comment):

      Page 6: "but that selective pressures exerted by SIVs in the chimpanzee and gorilla lineages have led to the retention of mutations that confer resistance to primate lentivirus infection. This has not happened in humans where selective pressure by HIV-1 is too new" - this cannot be concluded from the data in Figure 1. It would be more appropriate as a Discussion point.

      From Reviewer #1 (minor comment):

      Page 14: "Natural tolerance is often required before a virus can establish itself long term in a host reservoir, and thus understanding it is key to understanding virus reservoirs in nature" - please provide a reference. This is one among several theories of long-term host-virus evolution dynamics/outcomes, and further discussion may benefit the broad readership of eLife.

      From Reviewer #1 (minor comment):

      Page 15: "There is a surprising outcome of virus-driven host evolution in that the divergence and diversity of these host genes ultimately comes at a detriment to the very viruses that drove this evolution." - it is not clear to this reviewer why this is surprising.

      From Reviewer #2 (minor comment):

      Related to Figure 5A: The authors suggest that the gorilla glycosylation site provides resistance to SIVcpz, based on TAN1.910, but in fact the glycosylated allele is no more resistant than the un-glycosylated allele to most SIVcpz strains (in Figure 4). The authors should acknowledge this more clearly in the text.

      From Reviewer #2 (minor comment):

      The title of this article (that infection "has driven selection") is somewhat overstated - though it seems very likely that lentiviruses are driving CD4 diversification, this is difficult to prove. The arguments presented here rely on very few data points: modern chimp and gorilla compared to ancestral CD4, and a population genetic analysis relying on 2 or 3 species with 10-50 individuals each. The authors should either bolster these arguments (see the above suggestions) and/or soften the claim in the title.

      Modifications to the main text of the manuscript have been made to enhance clarity on the subjects stated above.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We thank both reviewers for their reviews of our work and suggestions for improvement. Changes to the manuscript are captured with the Track Changes feature, and our point-by-point responses are included below in bold/italic text.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary Bell et al. overexpress Prom1 or Ttyh1 and test its effect on EV formation from cell lines. They find that Ttyh1 expression leads to an increase in small EVs as well as tubulated EVs, while Prom1 expression leads to a milder increase in small EVs. EV induction by Prom1 is dependent on cholesterol and the authors show that Prom1 makes the cholesterol in EVs more resistant to detergent. The authors show no connection between Ttyh1 EV induction and cholesterol, although they claim it is important. They also show that a disease mutation in Prom1 decreases Prom1 trafficking to the plasma membrane and increases cholesterol resistance to detergent in EVs. The authors also find that the disease mutation decreases the size of the Prom1-induced EVs.

      Major Comments

      Results - line 99-106 - The EV isolation protocol would remove large EVs like the Prom1+ midbody remnants. It is important to explicitly specify that this study focused on small EVs.

      We agree with the reviewers and appreciate the suggestion to make this distinction. We have clarified the Results text (lines 104-105) to specify that our method specifically reconstitutes and isolates small EVs.

      Statistics - The t tests appear to have been performed without correction for multiple comparisons (Figure 2C-D, Fig. 4D). Given that >10 comparisons were made, this can alter the biological significance of p

      We agree with the reviewers that multiple test correction is appropriate for these figures. We have applied Bonferroni correction to the t-tests in Figs 2C, 2D, and 4D by adjusting our significance thresholds (alpha), and included additional text in the figure legend to indicate how and why the correction was performed.
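
      The Bonferroni adjustment of the significance threshold mentioned above amounts to dividing alpha by the number of comparisons. The following is a minimal sketch of that rule, not the authors' analysis: the p-values and the choice of ten comparisons are hypothetical and serve only to show how the threshold changes.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance threshold after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value that falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p-values from ten pairwise t-tests.
p_values = [0.001, 0.004, 0.006, 0.02, 0.03, 0.04, 0.2, 0.5, 0.7, 0.9]
flags = significant_after_bonferroni(p_values)
print(sum(flags), "of", len(p_values), "comparisons remain significant")
```

      In this made-up example, six comparisons fall below the uncorrected threshold of 0.05, but only two fall below the corrected threshold of 0.005.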

      The DLS data does not appear to give any insight into EV size (unlike the EM data) and could be removed from the whole manuscript (or moved to supplemental). The authors should also remove any conclusions based on the DLS data.

      We appreciate the reviewers raising this point and agree that the DLS is less informative than our other measurements of EV size and morphology. We have moved all DLS figure panels where EV size is characterized by another method to the Supplement.

      Discussion - line 382-383 "Because Prom1 EVs arise directly from blebbing of the plasma membrane23, this finding suggests that Prom1 and Ttyh1 traffic to similar regions of the plasma membrane." The authors have not examined where Prom1 or Ttyh1 localize in the plasma membrane and can not draw this conclusion. That both proteins promote plasma membrane budding would only suggest that both proteins localize to the plasma membrane, not subregions of the plasma membrane. However, the authors have not demonstrated that Ttyh1 specifically induces plasma membrane budding. The different size of Ttyh1 EVs could be due to different biogenesis mechanisms (i.e. derived from intracellular organelles instead of the plasma membrane), making this statement an over-interpretation on both parts.

      This is a fair point. We have removed this sentence from the Discussion (lines 402-403) as the reviewer requests.

      Discussion - line 398-400 "Membrane cholesterol is necessary for Prom1-mediated remodeling20,21 and is present at similar levels in purified Prom1 and Ttyh1 EVs (Fig 5E), indicating that it is undoubtedly important for EV formation by both proteins." & line 415-417 "We find that conservative mutations in several of these adjacent aromatic residues impair EV formation by Prom1, but do not mimic the stable cholesterol binding of W795R (Figs 2C, 4D). " The author's data suggests that cholesterol is not important for Ttyh1 to induce EV formation. The authors show that cholesterol depletion does not alter Ttyh1 EV production. Similarly, they find separable effects on cholesterol binding and EV formation with Prom1 mutants, which suggest that there is more to Prom1-mediated EV formation than cholesterol. That cholesterol is present at similar levels can reflect that overexpression of these proteins does not alter the amount of cholesterol in the EV source membrane (i.e. plasma membrane). Also, wouldn't molecular crowding of a membrane protein be predicted to influence how easy it is to extract lipids?

      We thank the reviewer for highlighting this imprecisely phrased sentence. We only meant to indicate that cholesterol is present in both sets of EVs and contributes globally to membrane fluidity. We have removed this sentence from the Discussion (lines 419-421) to avoid over-interpretation or confusion.

      The reviewer is also correct to point out that molecular crowding could alter how extractable lipids are from EVs. We have included additional explanatory text in the Discussion (lines 421-426) addressing this point.

      Discussion - line 431-433 "Our findings suggest that the dynamic interaction of Prom1 with cholesterol may promote efficient maturation and trafficking of Prom1 between the endomembrane system and the plasma membrane." The authors did not investigate whether depleting cholesterol improved Prom1(W795R) trafficking to the plasma membrane, making this inference untested. Soften interpretation or test experimentally.

      We appreciate the reviewer raising this point. We have altered the text in this paragraph (lines 449-459) to soften our interpretation of these results, as suggested by the reviewer.

      Minor Comments Abstract - "the EVs produced are biophysically similar" The authors don't perform any typical biophysical characterization (beyond size and perhaps density), so do they mean physically similar? Given the Prom1 and Ttyh1 EVs can have different shapes and are significantly different sizes, this statement feels misleading.

      We thank the reviewer for pointing out the ambiguity around this word. We agree that "physically similar" is a more precise and accurate term, and have revised all instances of this language in the manuscript.

      Intro - line 59-60 - "Large Prom1 EVs (500-700 nm in diameter) appear to form from bulk release of membrane from the cell midbody" Midbody remnants are well defined (if variously named, i.e. flemmingsome) large EVs derived from the spindle midbody, intercellular bridge, and cytokinetic ring. I'm not sure what the authors are trying to express by "bulk release of membrane". Midbody remnants are also a site of membrane tubulation.

      The reviewer is correct to point out that midbody remnant release is a well-defined process. We originally included this statement to avoid implying that we are studying the only known class of Prominin EVs, but now recognize that including it creates more confusion than it alleviates. To improve clarity, and in line with the changes referenced above emphasizing that we are specifically studying small EVs, we have removed this reference to the larger class of EVs from the Introduction (lines 61-63).

      The effect on total numbers of EVs is buried in the y-axes of the EM graphs, making it difficult to distinguish where a higher n of images was examined vs. where there is an increase in EVs. This is especially hard to interpret given the high difference in n values.

      The reviewers raise a valid critique of these figure panels. To improve clarity, we have adjusted the y-axes to represent the fraction of EVs rather than the absolute value of EVs, and listed the n values in figure legends.
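
      The axis change described above can be illustrated with a minimal sketch. The bin counts and the function name `to_fractions` are hypothetical and are not taken from the manuscript; the sketch only shows why plotting fractions separates the shape of a distribution from the number of images examined.

```python
def to_fractions(counts):
    """Convert per-bin EV counts into fractions of the total EVs counted."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no EVs were counted")
    return [c / total for c in counts]

# Hypothetical size-bin counts (illustrative only, not data from the study).
# The second condition has 10x more EVs counted but the same distribution.
low_n_counts = [5, 20, 15, 10]        # n = 50
high_n_counts = [50, 200, 150, 100]   # n = 500

low_n_fractions = to_fractions(low_n_counts)
high_n_fractions = to_fractions(high_n_counts)
```

      With fractions on the y-axis the two hypothetical conditions overlap exactly, so a difference in n no longer reads as a difference in EV abundance; the n values then need to be reported separately, as in the figure legends.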

      Fig. 2C - Missing WT error bars

      We appreciate the reviewer's concern for the WT error bars in these figures. The measurements underlying these plots are derived from quantification of Western blots. Because the blots have a limited number of lanes, the WT sample was run as a normalization control on each of several sets of blots. By employing this approach, we could make quantitative comparisons within each blot without needing to make direct comparisons between blots, eliminating confounding variables such as blotting times, positions of blots on rotary shakers, developer incubation time, exposure times, etc. Because WT lanes were used for normalization, each "WT" blot condition has its own set of error bars that was used for t-test comparison with the samples that share a blot. For this purely technical reason, we can represent the data either normalized against WT values or with three separate WT measurements for each plot. In the interest of clarity and transparency, we elected to report the values normalized to WT and to include all raw blot images in Supplementary Fig. S4. We understand that we could have made this more transparent, so to clarify this decision for readers, we now explicitly reference the raw blot images in both the Results text (line 185) and in the Figure 2 legend.
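
      The within-blot normalization described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the intensities, sample names, and the function `normalize_to_wt` are hypothetical, and dividing by the mean WT intensity of the same blot is one plausible reading of "normalized to WT".

```python
from statistics import mean

def normalize_to_wt(blot, wt_key="WT"):
    """Divide every lane intensity on one blot by the mean WT intensity on that blot."""
    wt_mean = mean(blot[wt_key])
    return {sample: [value / wt_mean for value in values]
            for sample, values in blot.items()}

# Hypothetical band intensities (arbitrary units) from two separate blots.
# Blot B is twice as bright overall, e.g. because of a longer exposure.
blot_a = {"WT": [100.0, 110.0, 90.0], "mutant_1": [50.0, 55.0, 45.0]}
blot_b = {"WT": [200.0, 220.0, 180.0], "mutant_2": [300.0, 330.0, 270.0]}

norm_a = normalize_to_wt(blot_a)
norm_b = normalize_to_wt(blot_b)
```

      After normalization the WT mean is 1 on every blot, so blot-to-blot differences in exposure drop out, while the WT replicates on each blot keep their own spread. That per-blot spread is why each set of samples has its own WT error bars.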

      Fig. 3H, 5C - Why not show raw numbers on the y-axes of the inset graphs like the main graph? Also, if it is only showing a subset of roundness ranges, then the x-axis should not go to 1 (i.e. axis range 0-0.8 would be clearer). I had a hard time figuring out what these insets were trying to show me, so please think about presenting this data more clearly (and larger).

      For clarity, we have moved the inset graphs to separate panels alongside the main panel and implemented the requested changes to the axes (see Figs. 3G, 5B).

      Discussion - line 377 - "Though we do not claim that Ttyh1 endogenously induces EV formation" This statement could be misinterpreted to say that you do not think endogenous Ttyh1 regulates EV formation. Rephrase as "although we have not examined whether..."

      We thank the reviewer for pointing out this unclear sentence and have applied the requested change (line 397).

      Discussion - line 400-402 "Our results do not indicate that Ttyh1 does not bind cholesterol, merely that it does not form an interaction that is sufficiently kinetically stable to be co-immunoprecipitated." The phrasing here is confusing with multiple "not". It is better to leave things open than to say what you have not shown. Rephrase suggestion: "Although Ttyh1 was not able to form a kinetically stable interaction for co-immunoprecipitation, it remains to be determined whether Ttyh1 is able to bind cholesterol."

      We thank the reviewer for their suggestion and have modified the sentence to avoid double-negative phrasing (lines 422-426).

      Movies - I'm not sure what the two videos add. It's difficult to convince myself that I see plasma membrane labeling in either movie, especially in comparison to the over-exposed WGA staining. Also, why are there ~5 sec of empty movie at the end of each?

      We appreciate the reviewer's feedback and have removed the movies from the manuscript.

      Reviewer #1 (Significance (Required)):

      The data is interesting and well presented, but overinterpreted in the discussion. The data on Ttyh1 expression inducing EVs is novel, but limited to overexpression studies. This study will be of interest to the EV, membrane curvature, and Prom1/Ttyh1 fields. My expertise is in basic research on membrane trafficking (including EV formation) and lipids.

      We thank the reviewer for their favorable review and helpful suggestions.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this study, the authors investigated the role of Prom1 and Ttyh1 proteins in EV formation. They showed that both proteins can induce EV formation, while the mechanisms by which they do it might differ slightly. Ttyh1 binding to cholesterol is not as pronounced as Prom1. Surprisingly, cholesterol binding efficiency inversely correlates with EV formation. Also, EVs induced by Ttyh1 and Prom1 are structurally different.

      My suggestions to improve the manuscript are below.

      • Figure 2E is not very convincing. As the authors mentioned, the signal is too low to have a concrete conclusion. The line scans somehow show that WT is more membrane-localized than mutant, but colocalization of Prom1 and WGA seems very similar in both cases. Is it certain that the addition of fluorophore did not change the trafficking? Does endogenous Prom-1 staining look like this? Also, why is WGA staining brighter in mutant sample, just a usual variation or biologically important?

      We understand the reviewer's concern about low signal, but respectfully disagree that the signal is too low to draw a meaningful conclusion. The only point we conclusively make in Fig. 2E is that WT Prom1 is more efficiently trafficked to the plasma membrane than W795R Prom1. We feel that this effect is sufficiently well evidenced by the line scan analysis in Supp. Fig. S5, where Prom1 peaks are cleanly visible for WT but not for W795R protein.

      We observe somewhat variable WGA staining in our experiments, and the differences we show in this figure panel are representative of typical staining variation. We do not draw any biological conclusions from the level of WGA present, only from its localization. Because both the plasma membrane and late endosomes are WGA+, we suspect that the W795R Prom1 is failing to traffic from endosomes to the plasma membrane. However, given the limitations of our fluorescence assay, we have removed any claim beyond the change in plasma membrane trafficking efficiency from discussion of this experiment.

      We cannot conclude whether the mStayGold fluorophore alters trafficking of Prom1 to the plasma membrane. In response to the reviewer's comment, we attempted to use immunofluorescence to measure membrane localization of untagged Prom1 with the AC133-1 antibody. Unfortunately, we were unable to optimize this protocol to achieve sufficient membrane staining for quantification. We have softened our interpretation of Fig. 2E in the Results and Discussion (lines 203-204, 450) to acknowledge that the effects we observe are only measured with fluorophore-tagged Prom1.

      • I also recommend showing the localization of Ttyh1 on cells.

      We appreciate the reviewer's suggestion here, and it is an experiment we considered. One of the challenges we faced in this assay was quantitatively measuring fluorescent signal along cell-boundary plasma membranes without saturating signal from the very bright WGA+ endosomes. Because Ttyh1 globally expresses at higher levels than Prom1 (see Figs. 3C, 3I), direct comparison of membrane-localized Prom1 and Ttyh1 is technically challenging in these cells. However, Ttyh membrane localization has been widely reported in other papers (Matthews et al., J. Neurochem, 2007; Jung et al., J. Neurosci., 2017; Sukalskaia et al., Nat. Commun., 2021; Melvin et al., Comm. Biol., 2022) that we now explicitly mention and cite for reader clarity in both the Introduction and Results (lines 69-71, 224-225).

      • A graph directly showing cholesterol binding vs EV formation efficiency would be very useful.

      We agree with the reviewer that this would be an interesting and useful addition to the paper. We now include this panel in the revised manuscript as Fig. 4F.

      • "Prominin and Tweety homology proteins are homologous and functionally analogous" involves speculation and authors should clearly mention this. Revealing that they are both contributing to EV formation does not make them definitely functionally analogous.

      We agree with the reviewer that this sentence is indeed ambiguous and somewhat speculative. We have revised the section heading to "Prominin and Tweety homology proteins are homologous proteins that both promote EV formation" (lines 461-462) to indicate the specific analogous function we observe.

      Reviewer #2 (Significance (Required)):

      Overall, it is a useful addition to the field of cell biology, particularly EV field. EV formation and efficiency are both important topics, and this manuscript might give insights.

      We thank the reviewer for their favorable review and helpful suggestions.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Bell et al. overexpress Prom1 or Ttyh1 and test its effect on EV formation from cell lines. They find that Ttyh1 expression leads to an increase in small EVs as well as tubulated EVs, while Prom1 expression leads to a milder increase in small EVs. EV induction by Prom1 is dependent on cholesterol and the authors show that Prom1 makes the cholesterol in EVs more resistant to detergent. The authors show no connection between Ttyh1 EV induction and cholesterol, although they claim it is important. They also show that a disease mutation in Prom1 decreases Prom1 trafficking to the plasma membrane and increases cholesterol resistance to detergent in EVs. The authors also find that the disease mutation decreases the size of the Prom1-induced EVs.

      Major Comments

      Results - line 99-106 - The EV isolation protocol would remove large EVs like the Prom1+ midbody remnants. It is important to explicitly specify that this study focused on small EVs.

      Statistics - The t tests appear to have been performed without correction for multiple comparisons (Figure 2C-D, Fig. 4D). Given that >10 comparisons were made, this can alter the biological significance of p<0.05 (1 incorrect in 20 comparisons). Please reanalyze with a more appropriate statistical test for multiple comparisons (i.e. ANOVA) or apply a correction to the t test values (i.e. Bonferroni).
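
      As an illustration of the correction requested above, a Bonferroni adjustment can be sketched as follows. The p-values and the function name `bonferroni` are hypothetical and do not come from the manuscript; the sketch only shows how the threshold changes when 10 comparisons are made.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and whether each stays below alpha."""
    m = len(p_values)
    # Multiply each p-value by the number of comparisons, capped at 1.
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj < alpha for p_adj in adjusted]
    return adjusted, reject

# Hypothetical p-values from 10 pairwise t tests (illustrative only).
example_p = [0.001, 0.004, 0.012, 0.03, 0.04, 0.049, 0.2, 0.35, 0.6, 0.9]
adjusted_p, still_significant = bonferroni(example_p)
```

      In this made-up example six comparisons fall below p<0.05 before correction, but only the two smallest remain below 0.05 after adjustment. Bonferroni is conservative; a one-way ANOVA with a post hoc test, as the comment also suggests, is an alternative when all groups are compared against one another.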

      The DLS data does not appear to give any insight into EV size (unlike the EM data) and could be removed from the whole manuscript (or moved to supplemental). The authors should also remove any conclusions based on the DLS data.

      Discussion - line 382-383 "Because Prom1 EVs arise directly from blebbing of the plasma membrane23, this finding suggests that Prom1 and Ttyh1 traffic to similar regions of the plasma membrane." The authors have not examined where Prom1 or Ttyh1 localize in the plasma membrane and can not draw this conclusion. That both proteins promote plasma membrane budding would only suggest that both proteins localize to the plasma membrane, not subregions of the plasma membrane. However, the authors have not demonstrated that Ttyh1 specifically induces plasma membrane budding. The different size of Ttyh1 EVs could be due to different biogenesis mechanisms (i.e. derived from intracellular organelles instead of the plasma membrane), making this statement an over-interpretation on both parts.

      Discussion - line 398-400 "Membrane cholesterol is necessary for Prom1-mediated remodeling20,21 and is present at similar levels in purified Prom1 and Ttyh1 EVs (Fig 5E), indicating that it is undoubtedly important for EV formation by both proteins." & line 415-417 "We find that conservative mutations in several of these adjacent aromatic residues impair EV formation by Prom1, but do not mimic the stable cholesterol binding of W795R (Figs 2C, 4D)." The authors' data suggest that cholesterol is not important for Ttyh1 to induce EV formation. The authors show that cholesterol depletion does not alter Ttyh1 EV production. Similarly, they find separable effects on cholesterol binding and EV formation with Prom1 mutants, which suggest that there is more to Prom1-mediated EV formation than cholesterol. That cholesterol is present at similar levels can reflect that overexpression of these proteins does not alter the amount of cholesterol in the EV source membrane (i.e. plasma membrane). Also, wouldn't molecular crowding of a membrane protein be predicted to influence how easy it is to extract lipids?

      Discussion - line 431-433 "Our findings suggest that the dynamic interaction of Prom1 with cholesterol may promote efficient maturation and trafficking of Prom1 between the endomembrane system and the plasma membrane." The authors did not investigate whether depleting cholesterol improved Prom1(W795R) trafficking to the plasma membrane, making this inference untested. Soften interpretation or test experimentally.

      Minor Comments

      Abstract - "the EVs produced are biophysically similar" The authors don't perform any typical biophysical characterization (beyond size and perhaps density), so do they mean physically similar? Given the Prom1 and Ttyh1 EVs can have different shapes and are significantly different sizes, this statement feels misleading.

      Intro - line 59-60 - "Large Prom1 EVs (500-700 nm in diameter) appear to form from bulk release of membrane from the cell midbody" Midbody remnants are well defined (if variously named, i.e. flemmingsome) large EVs derived from the spindle midbody, intercellular bridge, and cytokinetic ring. I'm not sure what the authors are trying to express by "bulk release of membrane". Midbody remnants are also a site of membrane tubulation.

      The effect on total numbers of EVs is buried in the y-axes of the EM graphs, making it difficult to distinguish where a higher n of images was examined vs. where there is an increase in EVs. This is especially hard to interpret given the high difference in n values.

      Fig. 2C - Missing WT error bars

      Fig. 3H, 5C - Why not show raw numbers on the y-axes of the inset graphs like the main graph? Also, if it is only showing a subset of roundness ranges, then the x-axis should not go to 1 (i.e. axis range 0-0.8 would be clearer). I had a hard time figuring out what these insets were trying to show me, so please think about presenting this data more clearly (and larger).

      Discussion - line 377 - "Though we do not claim that Ttyh1 endogenously induces EV formation" This statement could be misinterpreted to say that you do not think endogenous Ttyh1 regulates EV formation. Rephrase as "although we have not examined whether..."

      Discussion - line 400-402 "Our results do not indicate that Ttyh1 does not bind cholesterol, merely that it does not form an interaction that is sufficiently kinetically stable to be co-immunoprecipitated." The phrasing here is confusing with multiple "not". It is better to leave things open than to say what you have not shown. Rephrase suggestion: "Although Ttyh1 was not able to form a kinetically stable interaction for co-immunoprecipitation, it remains to be determined whether Ttyh1 is able to bind cholesterol."

      Movies - I'm not sure what the two videos add. It's difficult to convince myself that I see plasma membrane labeling in either movie, especially in comparison to the over-exposed WGA staining. Also, why are there ~5 sec of empty movie at the end of each?

      Significance

      The data is interesting and well presented, but overinterpreted in the discussion. The data on Ttyh1 expression inducing EVs is novel, but limited to overexpression studies. This study will be of interest to the EV, membrane curvature, and Prom1/Ttyh1 fields. My expertise is in basic research on membrane trafficking (including EV formation) and lipids.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors report a mass spectrometry (MS)-based interactomics technique, time-resolved interactome profiling (TRIP), which allows for tracking temporal changes in the interactome of protein of interest. To show that TRIP can successfully deconvolute interactomes over time, they pulsed thyroid cells with homopropargylglycine (Hpg), immunoprecipitated the Hpg incorporated thyroglobulin (Tg) and its interacting proteins at different time points, and subjected the samples to tandem mass tag (TMT)-based quantitative MS analysis. The MS results show that WT and variant Tg proteins indeed associate with different proteostasis network factors in a differential manner over the course of time. In addition, they utilized an siRNA-based luciferase fusion assay to evaluate whether silencing each proteostasis network component changes the levels of Tg in both lysate and media. From the combination of the TRIP and siRNA-based assays, they found many hits, including hits implicated in protein degradation, VCP and TEX264, which they validated with multiple experiments.

      I am overall quite positive and think this is an important study. But there are some meaningful points to consider.

      Our Response: We thank Reviewer #1 for their positive outlook on our manuscript and their constructive feedback. We have addressed the comments below.

      Significant comments:

      Reviewer #1, Comment #1: Having only two replicates of the main data (the TRIP-MS experiments) for this paper is problematic, especially since the manuscript is supposed to be demonstrating and validating the new technique. Consistent with this concern, the relative enrichment profiles for some of the results were surprising. For instance, interaction with CCDC47 was tapering off but then at 3 h it suddenly reaches the maximum level of engagement. Is this a real finding or the variability in the method? Impossible to tell with two replicates. Presenting heat maps based on biological duplicates is also very problematic. It masks the error, which is large as can be seen in some of the panels showing individual proteins. In my view, triplicates and a clear understanding of the error in the technique should be required.

      Our Response: The TRIP dataset for WT Tg contains 5 biological replicates, while the A2234D and C1264R Tg datasets each contain 6 biological replicates. Two replicates are typically included in a TMTpro 16plex mass spectrometry run, and each analysis consists of 3 MS runs. We apologize that the number of replicates and layout of the MS runs were not clearly explained. Data for individual replicates are found in Dataset EV1 and Dataset EV3, and a newly added Table EV3 delineates the sample layout across the TMT channels and MS runs. We clarified the text as follows:

      "Subsequently, two sets of TRIP time course samples (0, 0.5, 1, 1.5, 2, and 3 hr) could be pooled using the 16plex TMTpro and analyzed by LC-MS/MS (Fig 2A). In total, 5 biological replicates were analyzed for WT and 6 biological replicates were analyzed for A2234D and C1264R, respectively (Table EV3)."

      Reviewer #1, Comment #2: The same concern arises for the high-throughput siRNA screen, which was performed only in duplicate for WT and A2234D.

      Our Response: While the initial screen was performed in duplicate for WT and A2234D, which is common for larger screens due to resource constraints, we would like to point out that we followed up on observed hits using thyroid cell lines with many more replicates. Furthermore, most hits came from the C1264R Tg variant, which had three replicates in the initial screen. Hits were also extensively followed up.

      Reviewer #1, Comment #3: There are issues with some of the immunoprecipitation experiments: In Figure 1C, a negative control for FLAG IP is missing.

      -In Figure 2B, I am curious why the band (Hpg -, chase time 0 h) is so faint for the first WB (IB for FLAG) - is Hpg treatment indeed leading to much more Tg present at 0 h? If so, that is a concern.

      -Also, a negative control must be included (either plain cells or cells expressing fluorescent protein or a different epitope-tagged WT Tg).

      -In this same figure, I am puzzled why the bands for 1.5-3 timepoints in Biotin PD elution, probed for Rhodamine, are very faint especially considering that in Figure 1D, the corresponding bands, which are 4 h after the pulse, look fine. It seems like the IP failed here?

      Our Response: In Fig 2B, we have updated this figure with higher-quality images that are more representative of the results found when performing this experiment. Furthermore, to address the missing negative controls in Fig. 1C, we have added a separate figure (Fig EV2) where (-) FLAG-tagged Tg is included in this panel. We updated the text as follows:

      "Furthermore, the C-terminal FLAG-tag and Hpg labeling are necessary for this two-stage enrichment strategy, and DSP crosslinking is necessary to capture these interactions after stringent wash steps (Fig 1D, Fig EV2)."

      Regarding the Biotin PD rhodamine/TAMRA signal in Fig 2B: The blots in this figure panel represent the time-resolved Tg fractions from cell lysate, corresponding only to intracellular thyroglobulin. The decrease in band intensity for 1.5-3 hr time points is expected due to continued secretion and/or degradation dynamics taking place that decrease the intracellular population of labeled thyroglobulin that is able to be captured. For comparison, please note the C1264R panel (Fig 2C), where the rhodamine/TAMRA signal in the Biotin PD elutions is more stable compared to WT, indicating the cellular retention of C1264R while WT Tg is efficiently secreted and the signal is lost more rapidly. Fig 1D contains samples derived from a 4 hr Hpg pulse (without chase), explaining why the overall fluorescent Tg signal is more intense.

      Suggestion to consider:

      Reviewer #1, Comment #4: This manuscript, supported by the title and abstract, mainly focuses on the presentation of the development and application of TRIP, which is highly significant. The story becomes less coherent and harder to follow as significant amounts of text/figures are dedicated to siRNA-based high throughput screening and follow-up. In addition, although the discovery of TEX264 as one of the hits is very interesting and exciting, TEX264 apparently was not a hit in the TRIP experiment and is pretty distracting from the main point of the paper highlighted in the abstract and title, therefore. The siRNA-based assay and follow-up studies could be a separate scientific story of their own. Especially considering my concerns on the number of replicates for both the TRIP and siRNA-based assay, it could be beneficial to actually split the manuscript into two and conduct more replicates of the -omic work, which should corroborate the exciting discoveries the authors have made.

      Our Response: We have edited the manuscript to hopefully provide a more cohesive presentation of all data, findings, and conclusions within the paper. Given the generally positive outlook on the manuscript from other reviewers and our responses to significant comments from Reviewer #1 we opted to keep the manuscript as a single piece and address all reviewer comments.

      Minor comments:

      Reviewer #1, Comment #5: Throughout the manuscript, the authors have not defined what FT is; presumably it means FLAG tag.

      Our Response: Reviewer #1 is correct that FT stands for FLAG tag. We have now edited the manuscript text to clarify this as follows:

      "Thyroglobulin was chosen as a model secretory client protein, and we generated isogenic Fischer rat thyroid (FRT) cells that stably expressed FLAG-tagged Tg (Tg-FT), including WT or mutant variants (A2234D and C1264R)."

      Reviewer #1, Comment #6: The authors might discuss their rationale for choosing 0-3 hrs for their TRIP studies. That includes any relevant information about the half-life of WT versus variant Tg, whether the Hpg pulse time is short enough to avoid missing key features of the temporal interactome, and discussion of what would happen if the TRIP were performed at prolonged time points (e.g. 6-10 h).

      Our Response: Apologies that we omitted this important point, which is indeed related to the secretion and degradation half-life. We edited the manuscript text to discuss the rationale for 0-3 hr, length of the Hpg pulse and the impact on capturing interactions, and performing TRIP at prolonged time points as follows:

      "Our previous study indicated that ~70% of WT Tg-FT was secreted after 4 hours, while approximately 50% of A2234D and 15% of C1264R was degraded after the same time period (Wright et al, 2021). Therefore, we reasoned that a 3-hr chase period would be enough time to capture the majority of Tg interactions throughout processing, secretion, cellular retention, and degradation, while still being able to capture an appreciable amount of sample for analysis."

      We explain the labeling timeline and limitations further in the discussion:

      "To address this, we utilized a labeling time of 1 hr which allows us to generate a large enough labeled population of Tg-FT for TRIP analysis, but some early interactions are likely missed within the TRIP workflow. In the case of mutant Tg, performing the TRIP analysis for much longer chase periods (6-8 hrs) may provide insightful details to the iterative binding process of PN components that is thought to facilitate protein retention within the secretory pathway."

      Reviewer #1, Comment #7: Lines 68-69: the two citations should probably come one sentence earlier (at least Coscia et al 2020 is a structure paper).

      Our Response: We agree. We have edited the manuscript as follows to correct this:

      "In earlier work, we mapped the interactome of the secreted thyroid prohormone thyroglobulin (Tg) comparing the WT protein to secretion-defective mutations implicated in congenital hypothyroidism (CH) (Wright et al, 2021). Tg is a heavily post-translationally modified, 330 kDa prohormone that is necessary to produce triiodothyronine (T3) and thyroxine (T4) thyroid specific hormones (Citterio et al, 2019; Coscia et al, 2020). Tg biogenesis relies extensively on distinct interactions with the PN to facilitate folding and eventual secretion."

      Reviewer #1, Comment #8: Line 91: "(Figure 1A)" should follow the sentence "To develop the time-resolved..." to help readers better understand the system.

      Our Response: We agree. We have edited the manuscript to add the Fig 1A reference. Furthermore, we redesigned the schematic in Fig 1A to better explain the experimental system (see also Reviewer #2, comment 10).

      "To develop the time-resolved interactome profiling method, we envisioned a two-stage enrichment strategy utilizing epitope-tagged immunoprecipitation coupled with pulsed bioorthogonal unnatural amino acid labeling and functionalization (Fig 1A). Cells can be pulse labeled with homopropargylglycine (Hpg) to synchronize newly synthesized populations of protein. After pulsed labeling with Hpg, samples can then be collected across time points throughout a chase period (Fig 1A, Box 1) (Kiick et al, 2001; Beatty et al, 2006). The Hpg alkyne incorporated into the newly synthesized population of protein can be conjugated to biotin using copper-catalyzed alkyne-azide cycloaddition (CuAAC) (Fig 1A, Box 2). Subsequently, the first stage of the enrichment strategy can take place where the client protein of interest is globally captured and enriched using epitope-tagged immunoprecipitation, followed by elution (Fig 1A, Box 3)."

      Reviewer #1, Comment #9: Line 101: Fisher should be Fischer

      Our Response: Thank you. We have edited the manuscript text to correct this.

      Reviewer #1, Comment #10: Line 131: Should be 1.5 hrs instead of 2 hrs.

      Our Response: We edited this point (see below in comment #11)

      Reviewer #1, Comment #11: Lines 135-136: I do not agree with the claim that HSPA5 profile looked similar for MS and WB. I do not see a peak for HSPA5 at 2 hrs in Figure 2D.

      Our Response: We replaced the mass spectrometry quantification in Fig 2D-E with the scaled, relative enrichments. This provides a more meaningful comparison, as all interactions are scaled in the same way. Unfortunately, it is still difficult to directly compare the Western blot results in Fig 2B-C to the mass spectrometry quantifications in Fig 2D-E because the WB intensities are not normalized to the amount of Tg bait protein, which changes over time. At the 2-3 hr time points, little WT Tg is pulled down as most of it is secreted. Therefore, the HSPA5 interactions are no longer detectable by Western blot. MS, on the other hand, is much more sensitive and still captures these interactions. We modified the text as follows:

      "For C1264R, interactions with HSPA5 were highly abundant at the 0 hr time point and remained mostly steady throughout the first 1.5 hours (Fig 2C). A similar temporal profile was also observed for HSP90B1. Additionally, interactions with PDIA4 were detectable for C1264R and were found to gradually increase throughout the first 1.5 hr of the chase period, before rapidly declining (Fig 2C). We noticed similar temporal profiles for PDIA4 and HSPA5 to our western blot analysis, when measured via TMTpro LC-MS/MS as further outlined below (Fig 2D-E). In particular, the HSPA5 WT Tg interaction declined within the first hours, yet for C1264R Tg, the HSPA5 interactions remained mostly steady over the 3-hour chase period (Fig 2E)."

      Reviewer #1, Comment #12: Line 186: The cited paper Shurtleff et al 2018 is missing in the reference list.

      Our Response: Thank you. We have corrected this in the citation management system and it is now available in the reference list.

      Reviewer #1, Comment #13: Line 188: I disagree with the authors' claim here because, at least for CCDC47, interactions with C1264R seem to come back at the 3 hr time point.

      Our Response: We have removed the discussion of EMC and PAT complex components from the text. The implications of these interactions for Tg biogenesis remain unclear and were therefore a distraction from the discussion of other core proteostasis network components pertinent to Tg processing. Nonetheless, the full dataset - including these interactions - remains available to readers in Appendix Fig S1 for further perusal.

      Reviewer #1, Comment #14: Line 203: I am not sure if P4HA1 can be included in the examples for showing distinct patterns for mutants compared to the WT according to their data in Figure 3H.

      Our Response: We agree. We have edited the text to remove the discussion of prolyl hydroxylation and isomerization family members and elected to discuss the new clustering analysis and the robustness of the TRIP method in more detail. The full TRIP data is nonetheless available to interested readers in Appendix Fig S1.

      Reviewer #1, Comment #15: Line 216: The authors should add citations about the functions of STT3A and STT3B proteins.

      Our Response: We've edited the manuscript text to include a reference to the primary literature for STT3A and STT3B functions, as follows:

      "Previously, we showed that A2234D and C1264R differ in interactions with N-glycosylation components, particularly the oligosaccharyltransferase (OST) complex. Efficient A2234D degradation required both STT3A and STT3B isoforms of the OST, which mediate co-translational or post-translational N-glycosylation, respectively (Kelleher et al, 2003; Cherepanova & Gilmore, 2016)."

      Reviewer #1, Comment #16: Lines 248-251, "We found that interactions with these components...": this sentence should refer to Figure 3 - Figure Supplement 3 instead of Figure 3L and S4.

      Our Response: Thank you. This section of the manuscript was significantly rewritten and the figure references updated.

      Reviewer #1, Comment #17: Lines 258-260, "Another striking observation was that the temporal profile of EMC interactions for C1264R correlated with RTN3, PGRMC1, CTSB, and CTSD interactions.": Please provide more evidence to support the potential correlation between different interaction profiles. Or the authors should move this sentence to the discussion section as it sounds speculative. This highlights the issue of only having duplicates, as well.

      Our Response: We agree that this point was highly speculative and we removed discussion of the EMC interactions.

      To further investigate the correlation of interaction profiles across the dataset, we performed unbiased k-means clustering. This led to the identification of 7 and 6 unique clusters of interactors for WT and C1264R Tg-FT, respectively. These data are represented in Fig 3F and Fig EV5. Unique clusters highlight similar temporal interaction profiles for Tg-FT interactors, and provide a quantitative representation of correlative interactions that take place during Tg-FT processing.

      "To assess temporal interaction changes in an unbiased fashion and identify protein groups exhibiting comparable behavior, we carried out k-means clustering of the temporal profiles for WT and C1264R. This analysis revealed a large divergence in the interaction profiles. For WT Tg, only one cluster exhibited steadily decreasing interactions (cluster 4), while others increased with time, or showed peaks at intermediate times (Fig 3F, Fig EV5A). On the other hand, C1264R largely exhibited clusters with decreasing interactions over time (Fig 3F, Fig EV5B). Cluster 2 for WT with bimodal interactions at early and late time points contains many Hsp70/90 chaperoning components. For C1264R Tg, many Hsp70/90 chaperoning components and disulfide/redox-processing components are instead part of cluster 2', which exhibited an initial rise in interaction strength before plateauing (Fig 3F, Fig EV5A,B). This divergent temporal engagement between WT Tg and the destabilized C1264R mutant is aligned with the patterns observed in the manual grouping (Fig 3B,C), highlighting that the unbiased temporal clustering can reveal broader patterns in the reorganization of the proteostasis dynamics."

      One of the clusters of the C1264R Tg interactions contained autophagy interactors along with glycosylation components. We therefore postulate that this could point to a coordination of these processes. We discuss this new point in the updated manuscript:

      "In the k-means clustered profiles, autophagy interactions largely group together in the same cluster, showing stronger interactions at earlier time points. In the same cluster are glycosylation components (UGGT1, STT3B, and MLEC), further supporting a possible coordination for C1264R Tg between lectin-dependent protein quality control and targeting to autophagy (Fig EV5B,C)."
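      The clustering step described in this response can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `kmeans`, the toy profiles, and the choice of plain Lloyd's algorithm with squared Euclidean distance are all assumptions made for illustration, and the values are invented rather than taken from the TRIP dataset.

```python
import random

def kmeans(profiles, k, n_iter=100, seed=0):
    """Cluster equal-length time courses with Lloyd's k-means algorithm."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen profiles.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean profile of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable, so the run has converged
            break
        labels = new_labels
    return labels, centroids

# Hypothetical scaled enrichments at 0, 0.5, 1, 1.5, 2 and 3 hr (invented values).
names = ["decreasing_a", "decreasing_b", "increasing_a", "increasing_b"]
profiles = [
    [1.0, 0.8, 0.6, 0.4, 0.2, 0.0],
    [0.9, 0.8, 0.5, 0.4, 0.1, 0.0],
    [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    [0.1, 0.2, 0.5, 0.6, 0.9, 1.0],
]
labels, centroids = kmeans(profiles, k=2)
clusters = dict(zip(names, labels))
```

      In this toy case the two decreasing profiles end up in one cluster and the two increasing profiles in the other, which is the kind of grouping by temporal shape that the response describes. The number of clusters is a free parameter here and would need to be chosen and justified for real data.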

      Reviewer #1, Comment #18: Line 340: As written, should cite more than one paper

      Our Response: Thank you. We reworded the manuscript to correct this, as follows:

      "The discovery of several protein degradation components as hits for rescuing mutant Tg secretion may suggest that the blockage of degradation pathways can broadly rescue the secretion of A2234D and C1264R mutant Tg, a phenomenon similarly found for destabilized CFTR implicated in the protein folding disease cystic fibrosis (Vij et al, 2006; Pankow et al, 2015; McDonald et al, 2022)."

      Reviewer #1, Comment #19: Line 371: Should be Figure 4 - figure supplement 2

      Our Response: We edited the manuscript to correct this error.

      Reviewer #1, Comment #20: Line 1231: "Zhang et al 2018" needs to be removed

      Our Response: We have removed this citation.

      Reviewer #1, Comment #21: Line 1286: FRTR should be FRT

      Our Response: Thank you. We have corrected this within the text.

      Reviewer #1, Comment #22: Figure 3E: Color used to highlight the three proteins (CCDC47, EMC1, EMC4) should match the color used in Figure 3 - Figure Supplement 3

      Our Response: We have edited Figure 3 to remove the section related to membrane protein biogenesis. This data is still available in Appendix Fig S1 with consistent color coding.

      Reviewer #1, Comment #23: Figure 4A: The bottom figure where lysate signal is inversely proportional to time is misleading because the authors are assessing steady-state level of proteins in this assay.

      Our Response: We agree. We updated the schematic in Fig 4A to better explain the workflow and differentiate the steady-state protein level being measured within the lysate.

      Reviewer #1, Comment #24: Figure 4 - Figure Supplement 1 caption: in (C), (F) should be (B). (K) should be (G) and I am not sure what the authors mean when they refer to (J) in caption of (G).

      Our Response: We have corrected this lettering mistake to match the figure properly. Please note that this figure is now Fig EV6, and it includes some new and reorganized panels.

      Reviewer #1, Comment #25: Figure 5 caption for (C and D): Need to specify the time that the samples were collected (8 hrs), as it seems different from A and B according to the main text.

      Our Response: We have specified the collection time within the caption for these data in Fig 5C and 5D.

      Reviewer #1, Comment #26: Figure 5 - Figure Supplement 1: Data for HERPUD1 and P3H1 should be included.

      Our Response: We have now included data to confirm the knockdown for HERPUD1 and LEPRE1 (P3H1) in Fig EV7F-G.

      Reviewer #1, Comment #27: Figure 5 - Figure Supplement 2B: Please mention in the caption how degradation is defined.

      Our Response: We have updated the Fig EV7H caption to include how "degradation" is defined within these experiments:

      "% Degradation is defined as . Where is the fraction of Tg-FT detected in the lysate at a given timepoint n, and is the fraction of Tg-FT detected in the media at a given timepoint n."

      Reviewer #1 (Significance (Required)):

      Reviewer #1, Comment #28: This manuscript is highly significant because the authors (1) designed and validated a new methodology for time-resolved interactomics study, (2) presented the dynamic changes in Tg interactome for WT and variants, and (3) discovered how proteins implicated in degradation pathways (e.g. VCP, TEX264, RTN3) can change the secretion profile of WT and mutant Tg proteins. With TRIP, the authors demonstrated that they could obtain valuable data that were previously not captured from steady-state interactomics studies (Wright et al. 2021; Figure 3M and Figure 3 - Figure supplement 4D-4I). Furthermore, the authors treated cells with VCP inhibitors and performed both 35S pulse-chase analyses and TRIP. These experiments provide valuable information to the field by (1) presenting a new method to rescue Tg secretion defect, and (2) demonstrating a broader applicability of TRIP. If the major comments above can be addressed I believe this is a tremendous contribution to the field.

      Our Response: We thank Reviewer #1 for their review comments and praise for the work presented within this manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Reviewer #2: In the manuscript 'Time-Resolved Interactome Profiling Deconvolutes Secretory Protein Quality Control Dynamics' Wright et al. developed an approach for time-resolved protein-protein interaction mapping relying on pulsed unnatural amino acid incorporation, protein cross linking, sequential affinity purification, and quantitative mass spectrometry named time-resolved interactome profiling (TRIP). The authors applied the TRIP method to compare the interactions of the secreted thyroid prohormone thyroglobulin (Tg) comparing the WT protein to secretion-defective mutations implicated in congenital hypothyroidism. They further employed an RNA interference screening platform to (1) investigate if interactors identified via TRIP are functionally relevant for Tg protein quality control and (2) identify factors that can rescue mutant Tg secretion. The screen was initially performed in HEK293 cells, but selected hits with a phenotype in HEK cells were then followed up in Fischer rat thyroid cells. Further functional validation was performed by pharmacologic inhibition of VCP, a hit from the RNAi screen with an effect on Tg lysate abundance and Tg secretion. While the authors present a comprehensive study including identification of protein-protein interactions using proteomics followed up by an RNA interference screen for functional validation, major comments need to be addressed for both the proteomics as well as the functional genomics aspects of the study (see comments below).

      Our Response: We thank Reviewer #2 for their constructive feedback. We address all comments in detail below.

      Major comments:

      Reviewer #2, Comment #1: The authors describe a new method for quantitative, temporal interaction mapping. The protocol involves two enrichment steps as well as several reactions including cross-linking of the samples as well as functionalization of the unnatural amino acids. Given all these steps, the authors should rigorously characterize the quantitative reproducibility of the experiment when performed in independent biological replicates. This is important because in the final quantitative MS experiment, the authors only use two biological replicates, which is too low especially for such an involved sample preparation procedure, which would be expected to have a high variability between replicates. Given the low number of replicates and the unknown reproducibility of the quantification for this protocol, it is questionable at this point how reliable the quantification over the time course is.

      Our Response: We apologize that the number of replicates and robustness of the analysis was not entirely clear in our manuscript. We thank the reviewer for the feedback, as this is an important point to clarify. We included several additional analyses to further explain the robustness and quantitative reproducibility of our results:

      • We clarified the number of replicates. For quantitative MS experiments, five biological replicates were analyzed for WT, while six biological replicates each were analyzed for A2234D and C1264R Tg-FT, not two as Reviewer #2 presumed. These data are available in Dataset EV1 and Table EV3. Two biological replicates were used in only one place: the TRIP analysis of C1264R Tg-FT FRT cells treated with ML-240. We have further clarified the number of biological replicates within the manuscript text as follows (see also Reviewer #1, Comment 1):

      "Subsequently, two sets of TRIP time course samples (0, 0.5, 1, 1.5, 2, and 3 hr) could be pooled using the 16plex TMTpro and analyzed by LC-MS/MS (Fig 2A). In total, 5 biological replicates were analyzed for WT and 6 biological replicates were analyzed for A2234D and C1264R, respectively (Table EV3)."

      • We displayed the reproducibility of TRIP time profiles for several individual proteins in Fig EV3 and in Fig 3K (VCP). We included shading to indicate the standard error of the mean (SEM) for the individual protein time courses to provide further assessment of the quantitative reproducibility. We updated the text as follows: "To benchmark the TRIP methodology, we chose to monitor a set of well-validated Tg interactors and compare the time-resolved PN interactome changes to our previously published steady-state interactomics dataset (Wright et al, 2021). Previously, we found that CALR, CANX, ERP29 (PDIA9), ERP44, and P4HB interactions with mutants A2234D or C1264R Tg exhibited little to no change when compared to WT under steady state conditions (Fig EV4A). However, in our TRIP dataset we were able to uncover distinct temporal changes in engagement that were previously masked within the steady-state data. Our time-resolved data deconvolutes these aggregate measurements, revealing prolonged CALR, ERP29, and P4HB engagements for both A2234D and C1264R Tg mutants compared to WT (Fig EV4B-F). We found that these measurements for key interactors and PN pathways exhibited robust reproducibility, as exemplified by the standard error of the mean for the TRIP data (Fig EV4B-I, Appendix Figure S1B)."

      • For full transparency, we also include the SEM of all TRIP profiles in the heatmap in Appendix Fig S1B.

      • Furthermore, we included 25-75% quartile ranges for the pathway-aggregated time courses (Fig 3B,C,J,K) and the k-means clustering analysis (Fig 3F, Fig EV5). These clustering data in particular allow for the visualization and analysis of temporal protein interactions that are correlated with one another, while the accompanying quartile ranges provide further context for the reproducibility of these measurements and cluster profiles (see Reviewer #1, Comment 17 above for further explanation of the k-means clustering).
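      The two spread measures named in the list above, the SEM for individual time courses and the 25-75% quartile range for aggregated profiles, can be computed as in the following sketch. The function names and the replicate values are hypothetical, and the use of the sample standard deviation and of the "inclusive" quantile method are assumptions, since the response does not state which conventions were applied.

```python
import statistics

def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / len(values) ** 0.5

def quartile_range(values):
    """Return the 25th and 75th percentiles of the values."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3

# Hypothetical scaled enrichments for one interactor: one row per biological
# replicate, one column per time point (0, 0.5, 1, 1.5, 2, 3 hr).
replicates = [
    [1.00, 0.90, 0.70, 0.50, 0.30, 0.10],
    [0.95, 0.85, 0.75, 0.45, 0.35, 0.15],
    [1.00, 0.80, 0.65, 0.55, 0.25, 0.05],
    [0.90, 0.95, 0.70, 0.50, 0.30, 0.10],
    [1.00, 0.90, 0.60, 0.40, 0.20, 0.10],
]

# Transpose so that each entry holds all replicate values at one time point.
per_time_point = list(zip(*replicates))
means = [statistics.mean(tp) for tp in per_time_point]
sems = [sem(tp) for tp in per_time_point]
iqrs = [quartile_range(tp) for tp in per_time_point]
```

      Plotting `means` with a band of plus or minus `sems` would correspond to the shaded time courses described for individual proteins, while `iqrs` corresponds to the quartile band used for the pathway-level and cluster-level profiles.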

        Reviewer #2, Comment #2: Compared to the previous dataset published last year, the authors discover an overlap in interactors, but also a huge discrepancy, with 96 previously identified interactors not detected in the current study, but 198 additional interactors identified. How do the authors explain the big differences between these datasets?

      Our Response: We can only speculate here, but the limited overlap in interactors may stem from several factors, including but not limited to cell line, instrumentation, LC-MS/MS methodology, and sample processing workflows. Our previous dataset was acquired using transiently transfected HEK293 cells expressing FLAG-tagged constructs of Tg. HEK293 is a robust cell line used across many biological investigations, but it is not representative of the native cellular environment in which Tg is expressed. Moreover, transient transfection can lead to high protein expression that may not always represent what is found within the native cellular environment and proteome. Here, we used Fischer rat thyroid (FRT) cells engineered to stably express FLAG-tagged constructs of Tg. This cell line model should more accurately represent the native cellular environment in which Tg is expressed, as Tg is found exclusively within thyroid tissue. Our previous dataset was collected across two different instruments with similar LC-MS/MS methodology. Here, the dataset was collected on a single instrument after further optimization of the methodology used to acquire the first dataset. In line with our LC-MS/MS methodology development, the sample processing workflows here are also quite different. Our previous dataset utilized 6plex TMT labeling with globally immunoprecipitated samples from various Tg constructs. Global immunoprecipitation of Tg leads to much larger protein sample amounts than the TRIP methodology presented here, which we coupled with 16plex TMTpro labeling. This is also one of the reasons we chose to deploy a booster/carrier channel within our experimental labeling schemes.

      Reviewer #2, Comment #3: For the temporal interaction analysis the authors describe differences in the temporal profiles of selected interactions comparing wt and mutant, however no statistical analysis is performed comparing wt and mutant interaction profiles across the time course. Furthermore the variability between the replicates for the temporal profiles is not shown and some of the temporal profiles appear to be noisy. A more rigorous statistical analysis should be performed including additional biological replicates to evaluate the changes over the time course, especially as the temporal interaction analysis is the novelty of this study.

      Our Response: Please also see our response to Reviewer #2, comment 1 above. We previously presented an analysis of the variability of the TRIP measurements (SEM) (now in Appendix Fig S1B). We have since provided further statistical analysis found in the updated Fig 3B,C,J, which include 25-75% quartile ranges for respective proteostasis network pathways. We also included SEM for the time profiles of individual interactors in Fig EV4.

      To assess the divergence in time profiles in an unbiased way, we added a k-means clustering analysis (Fig 3F, Fig EV5). These clustering data allow for the visualization and analysis of temporal protein interaction profiles that are similar to one another and how groups of interactors shift between different clusters for WT Tg and the C1264R mutant.

      Reviewer #2, Comment #4: To functionally validate interactors derived from the TRIP analysis as well as to identify factors that can rescue mutant Tg secretion the authors developed an RNA interference screen. There are a number of aspects that need to be addressed/clarified for this part of the study.

      Our Response: We have added some clarifying changes to the text and the figure panels associated with the siRNA screening and follow-up experiments on the trafficking and degradation factors that rescue Tg secretion. We have addressed other comments from Reviewers #3 and #4 related to these portions of the paper and hope that Reviewer #2 finds them satisfactory.

      Reviewer #2, Comment #5: While the authors validate the stable cell lines expressing the nanoluciferase tagged Tg and the linearity of luminescence signal in lysate and media carefully, they do not validate their platform in combination with the RNAi knockdown strategy. The authors should select genes as positive controls that are expected to modulate Tg secretion and demonstrate that the knockout of these positive controls indeed results in changes in Tg secretion in their system.

      Our Response: This is an excellent suggestion and certainly something we would have done had there been prior knowledge of control genes that positively or negatively regulate Tg secretion. The purpose of developing the siRNA screening platform was to investigate and hopefully discover genes that are able to positively or negatively regulate Tg processing. We have done so to the best of our ability, identifying, for example, NAPA, which positively regulates WT Tg secretion, as seen by the decrease in WT Tg secretion upon treatment with NAPA siRNA. Conversely, we found that VCP may negatively regulate C1264R Tg secretion, as discovered by the increase in secretion with VCP siRNA or ML-240 treatment. We included a standard "TOX" siRNA control, which we knew would likely negatively affect WT Tg secretion, and this was indeed the case. As we stated within the manuscript:

      "This is the first study to broadly investigate the functional implications of Tg interactors and other PQC network components on Tg processing."

      Reviewer #2, Comment #6: For the screen the authors select 167 Tg interactors and PN (Proteostasis network) related factors. This statement is very vague and the authors should clarify which genes were knocked down and which criteria were applied to narrow down the list of interactors and to select PN factors. The authors should therefore provide a supplementary table including all genes included in the screen, their source (were these derived from the initial study by Wright et al, from the current study or compiled from prior knowledge about PN), as well as their results from the screen based on luminescence in media and lysate. It is unclear how many of the selected factors are actually coming from the TRIP analysis.

      Our Response: The list of genes included within the siRNA screen, as well as the results were previously included, and are now included in Appendix Fig S2. We have further provided the information requested by Reviewer #2 within Dataset EV5 indicating whether a gene was included in the siRNA screen due to its identification within our previous proteomics dataset (Wright et al, 2021), the proteomics dataset presented here, or based upon primary literature. We added a comment in the text:

      "Moreover, we were interested in identifying factors whose modulation may act to rescue mutant Tg secretion. HEK293 cells were engineered to stably express nanoluciferase-tagged Tg constructs (Tg-NLuc) and screened against 167 Tg interactors and related PN components (see Dataset EV5 for the list of genes)."

      Reviewer #2, Comment #7: Only a small number of the 167 selected genes shows an effect on Tg abundance/secretion. How do the authors explain this result? Would we not expect Tg interactors, especially those from the TRIP method, which interact with newly synthesized Tg, to be more enriched for functionally relevant genes?

      Our Response: The proteostasis network contains genes and proteins of high redundancy in structure and function, and many single-gene knockdowns are likely insufficient to have a large impact on Tg abundance or secretion. In fact, these results are in line with what we would have expected when designing these experiments. Our goal here was to identify the key players that control Tg protein quality control.

      We explain the proteostasis network redundancy in the manuscript:

      "The functional implications of protein-protein interactions can be difficult to deduce, especially in the case of PQC mechanisms containing several layers of redundancy across stress response pathways, paralogs, and multiple unique proteins sharing similar functions (Wright & Plate, 2021; Bludau & Aebersold, 2020; Karagöz et al, 2019; Braakman & Hebert, 2013)."

      Reviewer #2, Comment #8: The authors initially performed the screen in HEK293 cells and as a second step wanted to validate the hits from the HEK cells in more relevant Fisher rat thyroid cells. Indeed they could show that knockdown of NAPA increased WT TG in lysate and decreased WT Tg secretion. Furthermore, they further validated genes to modulate mutant Tg lysate and media abundance. The authors should perform a rescue experiment to demonstrate that the observed phenotype can be reversed through re-introduction of NAPA.

      Our Response: We have now performed the requested NAPA complementation experiments and provided the data within Fig EV7I. Overexpression of a human, siRNA-resistant NAPA construct partially reversed the increase in WT Tg lysate retention. These results further support the identification of NAPA as a pro-trafficking factor for WT Tg. We updated the manuscript text to include these data as follows:

      "To understand if these results were directly attributable to NAPA function, we performed complementation experiments where FRT cells treated with NAPA siRNAs were co-transfected with a human NAPA plasmid. WT Tg lysate abundance decreased when NAPA expression was complemented, confirming that the observed retention phenotype could be attributed to NAPA silencing (Fig EV7I). These results established that NAPA acts as a pro-secretion factor for WT Tg."

      Reviewer #2, Comment #9: One hit from this analysis was the ER-phagy receptor TEX264. While TEX264 was not identified in the TRIP data, it selectively increased C1264R secretion, but not that of WT or the other Tg mutant. Subsequent Co-IP data, however, revealed some interaction with the C1264R and, to a lesser extent, the A2234D mutant. How do the authors explain that TEX264 was missed in the TRIP dataset?

      Our Response: The TRIP samples are of much lower protein abundance compared to globally purified samples used for the Co-IP analysis. While the interaction is seen with the globally purified Co-IP samples, this interaction is likely much more difficult to capture with the low abundance, time-resolved samples that are acquired through the TRIP workflow, especially if this interaction is transient or requires the coordination of other accessory proteins as has been detailed in the literature and discussed within the manuscript presented here:

      "While A2234D and C1264R Tg were preferentially enriched with TEX264 compared to WT, it remains unclear what other accessory proteins may be necessary for the recognition of TEX264 clients (Chino et al, 2019; An et al, 2019). Furthermore, TEX264 function in both protein degradation and DNA damage repair further complicates siRNA-based investigations (Fielden et al, 2022). Further investigation is needed to fully elucidate 1) if Tg degradation takes place via ER-phagy and 2) by which mechanisms this targeting is mediated."

      Minor comments:

      Reviewer #2, Comment #10: The workflow needs to be described more clearly. For example, it should be better explained why the authors selected a two-stage enrichment strategy. I assume that the first step, based on the Flag affinity tag, is to purify the protein of interest and the second step, based on the incorporation and functionalization of the unnatural amino acids, is to enrich for the newly synthesized fraction at specific time points after protein synthesis? These are critical steps for the method but the rationales are not well explained; neither the text nor the figures capture all these steps of the method very clearly, which makes it really difficult for the reader to understand the individual steps of the method. Moreover, the structures in Figure 1 workflow are not clearly labeled, so that it is confusing which part represents which protein/molecule.

      Our Response: Thank you for this feedback. We have updated Fig 1 with more detail to improve clarity for readers. Furthermore, we have edited the text to more clearly describe the workflow:

      "To develop the time-resolved interactome profiling method, we envisioned a two-stage enrichment strategy utilizing epitope-tagged immunoprecipitation coupled with pulsed bioorthogonal unnatural amino acid labeling and functionalization (Fig 1A). Cells can be pulse labeled with homopropargylglycine (Hpg) to synchronize newly synthesized populations of protein. After pulsed labeling with Hpg, samples can then be collected across time points throughout a chase period (Fig 1A, Box 1) (Kiick et al, 2001; Beatty et al, 2006). The Hpg alkyne incorporated into the newly synthesized population of protein can be conjugated to biotin using copper-catalyzed alkyne-azide cycloaddition (CuAAC) (Fig 1A, Box 2). Subsequently, the first stage of the enrichment strategy can take place where the client protein of interest is globally captured and enriched using epitope-tagged immunoprecipitation, followed by elution (Fig 1A, Box 3). The second enrichment step can then utilize a biotin-streptavidin pulldown to capture the Hpg pulse-labeled, and CuAAC conjugated population, enriching samples into time-resolved fractions (Fig 1A, Box 4) (Li et al, 2020; Thompson et al, 2019)."

      Reviewer #2, Comment #11: Except for the general workflow shown in Figure 1, a more detailed workflow showing the experimental steps, such as the sample fractions with the following steps could be added so that the design of the method is clearer. Also the style of the workflows including Figure 1, Figure 2A, and Figure 3A are different. It would be helpful to make them the same style and make the Figure 2A as a zoom in or more detailed illustration on part of Figure 1.

      Our Response: Thank you for this feedback. In addition to updating Fig 1, we also expanded Fig 2A to more clearly outline the experimental steps in the TRIP workflow. Assuming the term "style" refers to the color palettes and figure schematics used, these have been updated to ensure they are visually consistent across the manuscript figures.

      Reviewer #2, Comment #12: A summary of proteomics results of time course labeling after all enrichment steps, including the total number of identified proteins at different conditions and control would be helpful for having an overview impression on the proteomics results

      Our Response: We have included an updated Dataset EV1 that provides a summary of the proteomics data, including which runs a given protein was identified in, the % of TMT channels quantified, the % of Hpg pulse channels quantified, and the number of proteins quantified across runs for each construct.

      Reviewer #2, Comment #13: In Figure 2B, the WB for PDIA4 in the Biotin PD elution is missing. Why was the PDIA4 interaction missing for the time course analysis, but the interaction was captured in the initial test for Wt Tg (Figure 1D). Additionally, in this panel the Rhodamine Probe Gel shows inconsistencies at the time points 1.5 - 3h. Does this mean that the labeling did not work well for these conditions? As we would expect a consistent Rhodamine Probe signal at every time point.

      Our Response: Please also see our response to Reviewer #1, comments 3 & 11. Fig 1D features continuous Hpg labeling for 4 hours to ensure that most intracellular Tg is labeled for this proof-of-concept experiment for the two-stage enrichment strategy. Fig 2B features a shorter 60 minute pulse of Hpg labeling, prior to the full chase period and two-stage enrichment strategy. PDIA4 interactions were detectable throughout Fig 1D because those measurements captured a larger population of labeled Tg, whereas in Fig 2B Tg bait protein amounts were much smaller after the two-stage enrichment procedure to capture the time-synchronized population.

      The Rhodamine/TAMRA Probe Gel in Fig 2B does not have inconsistencies in Tg abundance, but highlights the fact that pulse-labeled WT Tg is being secreted or degraded in FRT cells. As expected, the intracellular WT Tg signal decreases over the chase period as secretion and degradation take place. Constant Rhodamine/TAMRA probe signal would not be expected here. Consistent with this, the C1264R Tg signal remains more stable for the initial time course. This is expected, as the C1264R Tg variant is retained intracellularly and undergoes increased interactions with the proteostasis network. We have removed the PDIA4 panel for WT Tg because there was no signal above the detection limit. This is now explained as follows:

      "For WT Tg, interactions with HSPA5 peaked within the first 30 minutes of the chase period and rapidly declined, in line with previous observations, but PDIA4 interactions were not detectable by western blot analysis (Fig 2B) (Menon et al, 2007; Kim & Arvan, 1995)."

      Reviewer #2, Comment #14: In Figure 2, why was there no WB results for the A2234D? In Figure 2D and 2E, at which time point are the changes significant compared to WT?

      Our Response: We did not perform the WB experiments with A2234D. We used WT and C1264R Tg in our proof of concept experiments via WB and decided to move forward with analyzing A2234D Tg by LC-MS/MS. Please see our response above to Reviewer #2, comment 3 for information on the statistical analysis.

      Reviewer #2, Comment #15: All figure legends should indicate how many biological replicates were performed for each experiment represented in the figure.

      Our Response: We have updated the figure captions to include this information where applicable.

      Reviewer #2, Comment #16: The heatmaps shown in Figure 3, Figure 3 - Figure Supplement 3, and Figure 7 are in the current form incomprehensible. The heatmaps depict the relative enrichment vs the control sample, which was scaled between 1 and -1. The color coding with 5 different colors from 1 to -1 is very confusing and should be changed to just two colors, one for positive and one for negative relative enrichment. I would also suggest changing the visualization of the heatmap showing the wt and mutants side by side, instead of stacked on top of each other for each individual protein.

      Our Response: Thank you for this feedback, and we apologize for the confusion. We adjusted our data analysis approach by removing previous negative enrichment values. As these served only as "background" within the dataset, they did not carry much meaning. The TRIP enrichment is now scaled from 0 to 1, where a value of 1 represents the time point at which the enrichment is greatest, while 0 represents the background intensity in the (-) Hpg control sample. The associated figures have been updated accordingly, and we feel they are now more comprehensible and aesthetically pleasing.

      We opted to keep the Viridis color scheme in the heatmap to allow for more nuanced differentiation of the enrichment values.

      Reviewer #2, Comment #17: The data analysis method for generating relative enrichment shown in the heatmap is not explained. This should be described in the method section for a better understanding of the data analysis.

      Our Response: We have edited the methods section as follows to better explain the analysis:

      "For time resolved analysis, data were processed in R with custom scripts. Briefly, TMT abundances across chase samples were normalized to Tg TMT abundance as described previously and compared to (-) Hpg samples for enrichment analysis (Wright et al, 2021). For relative enrichment analysis, the means of log2 interaction differences were scaled to values from 0 to 1, where a value of 1 represented the time point at which the enrichment reached the maximum, and 0 represented the background intensity in the (-) Hpg channel. Negative log2 enrichment values were set to 0 as the enrichment fell below the background."
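
The scaling step described above can be sketched in a few lines. This is only an illustrative Python sketch, not the authors' R scripts: the function name `scale_relative_enrichment`, the protein labels, and the numeric values are hypothetical, and the input is assumed to be the mean log2 enrichment (chase sample versus the (-) Hpg control) for one interactor at each time point.

```python
from typing import Dict, List


def scale_relative_enrichment(mean_log2_enrichment: List[float]) -> List[float]:
    """Scale one interactor's time course to the range 0 to 1.

    Assumption: each input value is the mean log2 enrichment over the
    (-) Hpg control at one chase time point.
    """
    # Negative values fall below background, so they are set to 0.
    clipped = [max(value, 0.0) for value in mean_log2_enrichment]
    peak = max(clipped, default=0.0)
    if peak == 0.0:
        # Never enriched above background: return a flat profile of zeros.
        return [0.0 for _ in clipped]
    # Divide by the peak so the time point of maximum enrichment equals 1.
    return [value / peak for value in clipped]


# Hypothetical example values, not data from the study.
profiles: Dict[str, List[float]] = {
    "protein_A": [1.0, 2.0, -0.5, 0.5],
    "protein_B": [-0.2, -0.1, -0.4, -0.3],
}
scaled = {name: scale_relative_enrichment(values) for name, values in profiles.items()}
```

In this sketch, an interactor that never rises above background is returned as all zeros rather than being rescaled, which avoids dividing by zero; whether the original analysis handled that case the same way is not stated in the text.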

      Reviewer #2, Comment #18: There are no legends of flowcharts in Figure 2A and Figure 3A and it is difficult to understand which are the key components in the complex and what are the differences among different periods of labeling.

      Our Response: We have now consolidated Fig 2A and Fig 3A into a single panel found in Fig 2A, which is significantly reorganized to better explain the TRIP workflow. The caption has additionally been updated to highlight key steps within the workflow with numbering to allow readers to follow and visualize the steps more easily. The figure caption now reads as follows:

      "(A) Workflow for TRIP protocol utilizing western blot or mass spectrometric analysis of time-resolved interactomes. (1) Cells are pulse-labeled with Hpg (200μM final concentration) for 1 hr, chased in regular media for specified time points, and cross-linked with DSP (0.5mM) for 10 minutes to capture transient proteostasis network interactions; (2) Lysates are functionalized with a TAMRA-Azide-PEG-Desthiobiotin probe using copper CuAAC Click reaction; (3) Lysates undergo the first stage of the enrichment strategy where the Tg-FT is globally captured and enriched using immunoprecipitation; (4) Eluted Tg-FT populations from the global immunoprecipitation undergo biotin-streptavidin pulldown to capture the pulse Hpg-labeled, and CuAAC conjugated population of Tg-FT, enriching samples into time-resolved fractions; (5) Time-resolved fractions may then undergo western blot analysis or (6) quantitative liquid chromatography - tandem mass spectrometry (LC-MS/MS) analysis with tandem mass tag (TMTpro) multiplexing or analysis. The (-) Hpg control channel is used to identify enriched interactors and a (-) Biotin pulldown channel to act as a booster (or carrier)."

      Reviewer #2, Comment #19: Why did only one of the VCP inhibitors (ML-240) exhibit a phenotype in Tg abundance and secretion, but not the other VCP inhibitors?

      Our Response: Please also see our response to Reviewer #3, comment 2 below. This could be due to a number of reasons, but we added a brief discussion on the mechanisms of action for the inhibitors that may at least partially explain the differences in phenotype seen with the VCP inhibitors. We updated the text as follows:

      "ML-240 and CB-5083 are ATP-competitive inhibitors that preferentially target the D2 domain of VCP subunits, whereas NMS-873 is a non-ATP-competitive allosteric inhibitor which binds at the D1-D2 interface of VCP subunits (Chou et al, 2013, 2014; Anderson et al, 2015; le Moigne et al, 2017; Tang et al, 2019). ML-240 and NMS-873 have been shown to decrease both proteasomal degradation and autophagy, in line with VCP playing a role in both processes (Chou et al, 2013, 2014; Her et al, 2016). Conversely, while CB-5083 is known to decrease proteasomal degradation, it has been shown to increase autophagy (Anderson et al, 2015; le Moigne et al, 2017; Tang et al, 2019)."

      Reviewer #2 (Significance (Required)):

      Reviewer #2, Comment #20: The authors describe a novel and elegant method to map time-resolved protein interactions of newly synthesized proteins, which allows monitoring of proteins regulating protein quality control.

      Authors describe it as a general method, however, they only demonstrate the applicability to one protein and do not systematically evaluate the quantitative nature of their approach by determining quantitative reproducibility, which would be necessary to be able to claim that this is a method with broad applicability.

      Given my expertise in quantitative proteomics, I can mainly comment on the technological aspects of the proteomics part of the manuscript, but do not feel qualified to evaluate the significance of this study in terms of novel biology. Nevertheless, it feels that there is a stronger emphasis on the biology in the current form of the manuscript which will raise interest of scientists with a focus on protein quality control and Tg biology.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). Please place your comments about significance in section 2.

      In this manuscript, the authors describe their efforts to develop a methodology for determining time-resolved protein-protein interactions using quantitative mass spectrometry. With TRIP (time-resolved interactome profiling), they combine a pulsed bio-orthogonal unnatural amino acid labelling (homopropargylglycine, Hpg), CuAAC conjugation and biotin-streptavidin pulldowns to enrich at different timepoints and time-resolve by combining TMT labelling and LC-MS/MS (Figure 1). This technique is then applied to the maturation of the secreted WT and mutant thyroglobulin (Tg-WT, Tg-C1264R, Tg-A2234D) expressed in HEK293 and rat thyroid cells (FRT) and linked to hyperthyroidism. There, they identify a collection of ER resident proteins involved in protein folding/processing (e.g. chaperones, redox, glycans, hydroxylation) as well as degradation (e.g. autophagy, ERAD/proteasomes) (Fig. 2). Here the authors effectively use the pulse-labelled form of TRIP to highlight the different interactions formed with Tg-WT vs. Tg-mutants during biogenesis and secretion (or retention). The analysis found ~200 new interactions compared to previous studies along with about 40% of those identified previously. Differences in interactions were observed for mutants, which showed extended interaction with chaperones and redox processing pathways. While many interactions appeared as might be expected, the identification of membrane protein processing elements (e.g. EMC, PAT) was puzzling and raised some questions about the specificity within the protocol. Mutants enriched for CANX, CALR, and UGGT, suggesting prolonged association with glyco-processing factors. Interaction of C1264R with the ER-phagy factors CCPG1 and RTN3 was greater than WT. The authors note that their interaction correlated with that of EMC1 & 4, but it is not clear why that might be.

      With interactors in hand, the authors complemented the TRIP protocol with siRNA KD of identified factors, to investigate any changes to secreted vs intracellular Tg upon loss. KD of NAPA (a-SNAP) and LMAN1 increased WT lysate (intracellular) Tg but not mutants. NAPA also reduced Tg-WT secretion. In contrast, KD of NAPA increased A2234D secretion while LEPRE1 increased C1264R (but not A2234D or WT), suggesting mutants have differential processing paths and requirements. KD of VCP increased secretion of both mutants. Some ER-phagy receptors were found among interactors (e.g. RTN3 in Tg-C1264R only) but often their KD had no impact on secretion (CCPG1, SEC62, FAM134B). NAPA observations were recapitulated in a thyroid-derived cell line (FRT). KD of TEX264 and VCP increased Tg-C1264R secretion while RTN3 KD in FRTs decreased Tg-C1264R secretion. This was in contrast to data from HEK293s for reasons that are not clear. Co-IP with TEX264 enriched for all Tg forms but more so for C1264R and A2234D - motivating the authors to propose selective targeting of Tg to TEX264 and the consideration of ER-phagy as a "major" degradative pathway during Tg processing.

      Given the observations with siRNAs to VCP, the authors next use a selection of VCP inhibitors to ask whether secretion can be rescued upon pharmacological impairment of the AAA ATPase. They observed that ML-240, but interestingly not the more conventionally used CB-5083 or NMS-873, increased secretion of Tg-C1264R but not lysate. Inhibitors increased lysate but decreased the secreted fraction for Tg-WT (Fig 7). Finally, the authors used TRIP again in ML-240 treated Tg-C1264R expressing cells to look for changes to interactome with treatment - observed decreases to glycan and chaperone interactions, CANX and UGGT1, decreased interaction with DNAJB11 and C10, like that of WT. There was no apparent change to the UPR, although activation was not directly measured.

      Major comments:

      Reviewer #3, Comment #1: __Are the key conclusions convincing?__ The TRIP methodology appears to be quite robust and should be a powerful strategy for this field and others going forward. The drawback is that the length of pulse required will limit the number/type of proteins to be monitored to ones with longer t1/2's. There were interesting interactions found with Tg and the mutants linked to hyperthyroidism, but cut and dry differences did not appear as obvious, even though strong "trends" appear to be present. The path from identifying interactors in a time-resolved manner to then following them up with targeted KD does provide some clarity, which is important.

      Our Response: We thank Reviewer #3 for their time in reviewing our manuscript and providing this positive feedback. We have enhanced our analysis of the TRIP data to more clearly highlight differences in time profiles between WT and mutant variants. Please see our response to Reviewer #2, comments 1 & 3. We also highlight the limitations of the time resolution in the discussion (see also Reviewer #2, comment 6):

      "To address this, we utilized a labeling time of 1 hr which allows us to generate a large enough labeled population of Tg-FT for TRIP analysis, but some early interactions are likely missed within the TRIP workflow. In the case of mutant Tg, performing the TRIP analysis for much longer chase periods (6-8 hrs) may provide insightful details to the iterative binding process of PN components that is thought to facilitate protein retention within the secretory pathway."

      We have addressed all further comments below.

      __Reviewer #3, Comment #2:__ Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? The data regarding VCP silencing and pharmacological impairment appear clear but leave some questions outstanding in this reviewer's opinion. The lack of effect with the 2 highly selective inhibitors suggests that the underlying mechanism for switching fate of intracellularly retained Tg-C1264R towards secreted forms is not at all clear. ML-240 is an early derivative of DBeQ and reportedly impairs both ERAD and autophagic pathways, similarly to DBeQ. The differences between the VCP inhibitors' mechanism of action were not discussed, but perhaps should be elaborated upon, particularly in the matter of how ERAD and ER-phagy pathways might be being differentially affected. At the risk of asking for too many additional experiments, this reviewer would just prefer to see this fleshed out in a bit more detail.

      Our response: We agree with Reviewer #3 that the underlying mechanism for switching the fate of intracellularly retained Tg-C1264R towards secreted forms remains unclear. We have added text that further discusses the inhibitors used and the general manner in which ERAD and ER-phagy pathways can be affected. This added text reads as follows:

      "ML-240 and CB-5083 are ATP-competitive inhibitors that preferentially target the D2 domain of VCP subunits, whereas NMS-873 is a non-ATP-competitive allosteric inhibitor which binds at the D1-D2 interface of VCP subunits (Chou et al, 2013, 2014; Anderson et al, 2015; le Moigne et al, 2017; Tang et al, 2019). ML-240 and NMS-873 have been shown to decrease both proteasomal degradation and autophagy, in line with VCP playing a role in both processes (Chou et al, 2013, 2014; Her et al, 2016). Conversely, while CB-5083 is known to decrease proteasomal degradation, it has been shown to increase autophagy (Anderson et al, 2015; le Moigne et al, 2017; Tang et al, 2019)."

      "As we discovered that pharmacological VCP inhibition with ML-240 can rescue C1264R Tg secretion yet is detrimental for WT Tg processing, it is unclear whether VCP may exhibit distinct functions for WT and mutant Tg PQC. Finally, as ML-240 is shown to block both the proteasomal and autophagic functions of VCP it is unclear which of these pathways may be playing a role in the rescue of C1264R, or detrimental WT processing (Chou et al, 2013, 2014)."

      __Reviewer #3, Comment #3:__ Q1. The degree (if any) of Tg-C1264R aggregation during and/or detergent solubility do not appear to have been considered as a potential source of the increase in released secreted material (Figure 4, 5). Do Tg mutants partition into RIPA-insoluble fractions at all? That is to say, is the total population of synthesized Tg being considered? A full accounting? Could the authors address this and, if biochemical extraction data (via urea or high SDS) are available, include them to answer this concern?

      Our response: The transient aggregation of Tg has been investigated in some detail previously (Kim et al, 1992, 1993). The transient aggregates can partition into RIPA-insoluble fractions. Of note, these aggregates are shown to be made up, at least in part, of mixed disulfide linkages that require a reducing agent to fully resolubilize. That said, these aggregates represent a minority of the overall Tg population. In our prior manuscript (Wright et al, 2021), we quantified the RIPA-insoluble fraction found in the pellet (see Supplemental Info Fig. 5). As the majority of Tg remains soluble during processing, it should be captured by our TRIP methodology. That is to say, we are capturing most of the Tg that is available for analysis while understanding that some smaller population of Tg remains in RIPA-insoluble fractions.

      __Reviewer #3, Comment #4:__ Q2. Along the same lines, what does Tg-WT and mutant expression look like by microscopy? Is Tg-WT uniformly distributed while Tg-mutants appear in puncta... more aggregated - perhaps reflecting the increased engagement of chaperones and redox machinery? Changes in the pattern of Tg-C1264R mutant (e.g. w/ VCP KD or inhibition) would add additional support for the authors' interpretation of improved secretion. If these data are at hand, including them might be worth consideration.

      Our response: Thank you for this suggestion. The subcellular localization of Tg and any changes from proteostasis modulation is an ongoing area of follow-up work in our lab. We have some preliminary results indicating that the localization of WT and C1264R Tg indeed differs. However, given that this manuscript is already dense in information, we opted to reserve these data for a future manuscript where we plan to further elucidate the targeting mechanism of mutant Tg to VCP or TEX264. We direct the reviewer to work published by Zhang et al, 2022 (https://doi.org/10.1016/j.jbc.2022.102066), showing a stark difference between WT and mutant Tg in localization, from an intracellular to a secreted population, in rat tissue. While almost all WT Tg is found in the follicular lumen (secreted), mutant Tg heavily co-localizes with the ER resident chaperone BiP. While this paper does not go into detail on the differences in subcellular localization, it further highlights the drastic changes in Tg processing and how these manifest in distinct differences in localization within tissue.

      __Reviewer #3, Comment #5:__ Q3. Does the level of Tg mutant expression in the FRT clones impact the profiles obtained by TRIP? (Figure 3). This is a question of gauging the relative saturation of QC machinery and how that might impact profiles from TRIP. Were clones expressing at different levels tested? Perhaps a brief discussion of this.

      Our response: We do not foresee an impact from the level of Tg expression on the profiles obtained by TRIP. We were able to identify distinct profiles because we processed the data and normalized it based on the relative Tg amount. For example, while WT and A2234D Tg are expressed at similar levels intracellularly, we were able to identify distinct differences in the interaction profiles across the two constructs. When developing FRT clones, we selected those that expressed Tg at similar levels and, therefore, did not have the capability to directly test differences, if any, in observed profiles that may result from different expression levels of the same Tg construct. Furthermore, Tg can make up 50% of all protein content within thyroid tissue (Di Jeso & Arvan, 2016). As such, thyroid cells are adept at maintaining the balance of QC machinery to process Tg. Therefore, we do not anticipate that the amount of Tg expressed in TRIP experiments would have a significant impact on the profiles that we were able to observe.

      __Reviewer #3, Comment #6:__ Q4. For Figure 3, the hour-long labelling period seems a bit long, compared with 3 hr of chase. Perhaps this reviewer missed this but how long does Tg take to mature and/or mutants to misfold and degrade? Is there any possibility to shorten this so that the profiles of labelled Tg could be more synchronized? If not, perhaps this could just be discussed.

      Our response: While the 1-hour labeling period may seem long, we had to balance the labeling time to 1) label a large enough population of Tg for it to remain detectable throughout the chase period, and 2) keep the chase period long enough to capture the large majority of Tg processing. In our hands, we found that by 4 hours WT Tg was ~63% secreted, with ~25% retained intracellularly (Fig EV7H). Conversely, we found that C1264R remains very stable over this period, with most protein being retained intracellularly and little degradation taking place (Fig EV9A). Hence, we opted for the overall ~4 hour total for sample processing (1 hr pulse labeling + 3 hour chase period for time point collections). The literature suggests that WT Tg takes ~2 hours to be processed within the ER and reach the medial Golgi. This is exemplified by the EndoH-resistant population that appears at this ~2 hour time point (Menon et al. JBC. 2007). Please also see our response to Reviewer #1, comment 6. We updated the text as follows:

      "We pulse labeled WT Tg FRT cells with Hpg for 1 hr, followed by a 3 hr chase in regular media capturing time points in 30-minute intervals and analyzing via western blot or TMTpro LC-MS/MS (Fig 2A). Our previous study indicated that ~70% of WT Tg-FT was secreted after 4 hours, while approximately 50% of A2234D and 15% of C1264R was degraded after the same time period (Wright et al, 2021). Therefore, we reasoned that a 3-hr chase period would be enough time to capture the majority of Tg interactions throughout processing, secretion, cellular retention, and degradation, while still being able to capture an appreciable amount of sample for analysis."

      We anticipate that this labeling period can be decreased with future iterations of this methodology. This will also be bolstered by the continued improvements that come about within quantitative proteomics in increased instrument sensitivity and improved sample preparation methods that have the ability to decrease sample loss.

      We explain the labeling timeline and limitations further in the discussion:

      "To address this, we utilized a labeling time of 1 hr which allows us to generate a large enough labeled population of Tg-FT for TRIP analysis, but some early interactions are likely missed within the TRIP workflow. In the case of mutant Tg, performing the TRIP analysis for much longer chase periods (6-8 hrs) may provide insightful details to the iterative binding process of PN components that is thought to facilitate protein retention within the secretory pathway."

      __Reviewer #3, Comment #7:__ Q5. It is curious that only ML-240, and not the other well-characterized inhibitors of VCP/p97, has an effect, as both are used far more often than ML-240. The authors do not really address this in detail, but does it suggest that the ML-240 effect on VCP/p97 could be affecting different pathways, given the nature of this compound? Is this compound acting on Tg-C1264R maturation at the level of translation or post-translationally? If the latter, through what means?

      Our Response: We thank Reviewer #3 for appreciating this surprising finding. We were similarly curious as to how or why ML-240 was able to elicit this effect compared to other VCP inhibitors. We elaborated in the manuscript text on these compounds and on how the ERAD and ER-phagy pathways, utilizing VCP, may be differentially regulated (see response to __Reviewer #3, Comment 2__). While speculative, we believe that ML-240 acts on C1264R Tg maturation post-translationally. This is supported by the fact that ML-240 does not seem to affect the translational velocity of C1264R Tg, as Fig EV9A shows similar levels of 35S-labeled C1264R in DMSO or ML-240 treated cells. It may be the case that acute treatment with ML-240 alters the folding vs degradation balance of the ER proteostasis network in such a way that some population of C1264R that is usually degraded is able to be secreted. Another Tg mutation, G2320R, was shown to be degraded via the proteasome in PLCCL3 thyrocytes, as MG-132 treatment slowed mutant Tg degradation (Menon et al. JBC. 2007), although G2320R degradation was not exclusively proteasomal. The L2284P Tg mutation showed similar results to G2320R, where MG-132 slowed degradation. Furthermore, L2284P Tg was not affected by autophagic/lysosomal inhibitors chloroquine and E64 (Tokunaga et al. JBC. 2000), suggesting ERAD more exclusively degrades L2284P. It is unclear which degradation pathway, ERAD or ER-phagy, may be the predominant pathway for C1264R Tg degradation. Furthermore, we do not exclude the possibility that both may be at play and affected by treatment with ML-240.

      We utilized our HEK293 Tg-NLuc cells and screened two further degradation inhibitors, the proteasomal inhibitor bortezomib and the lysosomal inhibitor bafilomycin. Neither of these compounds was able to rescue A2234D or C1264R secretion, highlighting that the effect is specific to ML-240 treatment. These new data are now shown in __Fig EV10A,B__ and described in the text:

      "To understand whether this rescue in secretion was uniquely linked to VCP inhibition or could be more broadly attributed to blocking Tg degradation, we tested the proteasomal inhibitor bortezomib, and lysosomal inhibitor bafilomycin. Bafilomycin increased WT Tg lysate abundance, and bortezomib significantly increased A2234D lysate abundance, consistent with a role of these degradation processes in Tg PQC (Fig EV10A). When monitoring Tg-NLuc media abundance, neither bafilomycin nor bortezomib significantly altered WT, A2234D, or C1264R abundance (Fig EV10B), confirming that general inhibition of proteasomal or lysosomal degradation does not rescue mutant Tg secretion."

      __Reviewer #3, Comment #8:__ Q6. Continuing from Q5: at what point and where is VCP/p97 able to affect mutant Tg processing? In line 317, the authors seem to correlate increased VCP association with mutants to their increased secretion. It is not clear how this would result, as engagement with VCP would be in a compartment different to that which supports trafficking and secretion. Could the authors expand on how this might come about? This is also relevant to the ML-240 data in Figure 7. Moreover, VCP is associated with ERAD (as is HerpUD1) rather than ER-phagy, and at least in the siRNA raw data, there are also effects from Derlin3 and FAF2 KDs, both ERAD factors. Some clarity here would be appreciated.

      Our Response: This line of discussion in the text was meant to suggest that, since VCP showed a higher enrichment for mutant Tg, particularly C1264R, it would make sense that inhibiting VCP would have a larger effect on mutant Tg processing as compared to WT Tg. As we saw with the siRNA screening data, suppression of VCP resulted in increased C1264R secretion, while not affecting WT Tg processing. This passage was not intended to suggest that increased VCP association with mutant Tg found within the TRIP dataset was the reason for rescued secretion. These are two different sets of experiments and environments in which these data are captured. We were simply looking for the opportunity to bridge the findings from the two sets of experiments into a single discussion point. Of note, we understand that VCP is associated with ERAD and acts to regulate autophagy. Given that core autophagy machinery is relevant for both bulk autophagy and ER-phagy, we did not want to rule out the possibility that VCP inhibition via ML-240 could affect autophagic flux in these experiments (Chou et al. ChemMedChem. 2013; Khaminets et al. Nature. 2015; Hill et al. Nat. Chem. Bio. 2021).

      It is great that the reviewer also noted that DERL3 and FAF2 knockdown increased C1264R Tg secretion. Since these ERAD factors did not reach the defined threshold in the screen, we did not include further discussion, but this data remains available in Appendix Fig S3. We have updated the manuscript text to clarify the previous points we aimed to make. The text now reads as follows:

      "VCP silencing exclusively affecting mutant Tg corroborates our TRIP dataset, and suggests a more prominent role for VCP in mutant Tg PQC compared to WT. VCP interactions were sparse for WT Tg while they remained more steady throughout the chase period for the mutants (Fig 3H,K)."

      __Reviewer #3, Comment #9:__ Q7. There does not appear to be a direct demonstration of Tg-C1264R turnover by ER-phagy (via TEX264). Given the inconsistency with it not being detected by TRIP, while another receptor, RTN3, was detected but has no impact on Tg-C1264R secretion, perhaps including that data would go some way to demonstrating a fate of ER-phagy (at least partly) for this mutant.

      Our response: We performed follow-up experiments to test interactions with Tg and the wider panel of ER-phagy receptors. We transiently expressed FLAG-tagged CCPG1, RTN3L, and TEX264 in HEK293 cells stably expressing Tg-NLuc and performed FLAG IPs followed by western blot analysis. We found that WT and C1264R Tg were enriched, albeit modestly, in the RTN3L Co-IP compared to control samples expressing GFP. Additionally, we found that WT, A2234D, and C1264R Tg were all enriched with CCPG1 compared to control samples expressing GFP. CCPG1 was found to be a C1264R Tg interactor within our mass spectrometry datasets, along with RTN3. We have now integrated these data into the manuscript as Fig EV8, and updated the manuscript text as follows:

      "Additionally, we monitored Tg enrichment with ER-phagy receptors CCPG1 and RTN3 via Western blot as both were found to be C1264R Tg interactors within our TRIP dataset. RTN3L is found to be the only RTN3 isoform involved in ER turnover via ER-phagy (Grumati et al, 2017). WT and C1264R Tg-NLuc were modestly enriched with RTN3L compared to control samples expressing GFP. Conversely, we found that all Tg variants exhibited modest interactions with CCPG1 compared to control samples expressing GFP, although less than with TEX264 (Fig EV8).

      Together, these data suggest that TEX264, CCPG1, or RTN3L engage with Tg during processing, and CH-associated Tg mutants may be selectively targeted to TEX264. Furthermore, ER-phagy may be considered as a degradative pathway in Tg processing, as other studies have mainly focused on Tg degradation through ERAD (Tokunaga et al, 2000; Menon et al, 2007)."

      Whether the TEX264 recruitment of mutant Tg leads to degradation remains to be tested. When we monitored C1264R Tg degradation by pulse-chase assay (Fig. EV9A), only a small fraction (

      __Reviewer #3, Comment #10:__ Q9. The authors provide data that the UPR was not induced by ML-240 at 3hrs (10µM) (Figure 7, supplemental 1). This is in stark contrast to the results of Chou et al (2013) which the authors reference, reporting that ML-240 induced ATF4 and CHOP by 2 hrs at concentrations lower than used here (albeit a different cell type). While not exclusively UPR, could the authors address the potential activation of the integrated stress response (eIF2a phosphorylation, ATF4 and CHOP) in the FRT cells due to ML-240 treatment? If present, is there some link that could provide an explanation for increased Tg-C1264R secretion? [Basal PERK/UPR activation with mutants.]

      Our Response: Thank you for bringing up this important point. As the reviewer acknowledges, the difference in UPR activation could stem from the different cell lines. Additionally, we measured activation via qPCR, whereas Chou et al. measured via immunoblot. We would like to point out that while we did not observe the upregulation of HSPA5 or ASNS (markers of ATF6 and PERK/ISR activation, respectively) in the presence of short ML-240 treatment (2-3 hr), we did observe the upregulation of DNAJB9 (a marker of IRE1/XBP1s activation).

      To address Reviewer #3's point, we performed further experiments monitoring the potential activation of the ISR in FRT cells due to ML-240 treatment. We treated C1264R Tg-FT FRT cells with ML-240 (10μM) for 2 hours, and monitored eIF2a phosphorylation via immunoblot. Indeed, we observed that ML-240 induced eIF2a phosphorylation compared to cells treated with DMSO. Tunicamycin (1mg/mL) was used as a positive control, and showed similar results to ML-240. We have integrated these results into the manuscript, available in Fig EV10C.

      However, we would like to point out that all of these markers represent signs of early UPR induction. Importantly, our result that HSPA5 transcript levels are not induced suggests that there is only very modest upregulation of ER chaperone levels occurring. Typically, ER proteostasis network remodeling requires a longer time than the acute 2-4 hr treatment with ML-240. We have updated the manuscript text as follows:

      "Finally, we monitored activation of the unfolded protein response (UPR) in the presence of ML-240 in FRT cells expressing C1264R Tg-FT. Phosphorylation of eIF2a, an activation marker for the PERK arm of the UPR, was induced within 2 hr of ML-240 treatment (Fig EV10C). We further investigated the induction of UPR targets via qRT-PCR. HSPA5 and ASNS transcripts, markers of ATF6 and PERK UPR activation respectively, remained unchanged or slightly decreased after 3 hr treatment with ML-240 in C1264R Tg cells (Fig EV10D). Only DNAJB9 transcript expression showed a significant increase in both WT Tg and C1264R Tg FRT cells (Fig EV10D). Moreover, ML-240 did not significantly alter cell viability after 3 hr, as measured by propidium iodide staining (Fig EV10E). Overall, these results highlight that the short ML-240 treatment induces early UPR markers, but the selective rescue of C1264R Tg secretion via ML-240 treatment is unlikely to be the result of global remodeling of the ER PN due to UPR activation."

      __Reviewer #3, Comment #11: __Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Any of the suggested experiments above all use reagents reported in the manuscript and so would presumably incur minimal cost and hopefully time. This reviewer is sympathetic to time and financial constraints and so discussion of the issue could suffice.

      Our response: We have addressed follow-up experiments whenever possible or provided further discussion details where applicable. We are appreciative of Reviewer #3's sympathy for the time and financial constraints that go into this work and addressing manuscript revisions. Unfortunately, the 1st and 2nd authors both left the lab immediately after the reviews were received. Hence, many of the experiments had to be addressed by other lab members joining the project, which took considerably longer than anticipated. We apologize for the long delay with our revisions.

      __Reviewer #3, Comment #12: __Are the data and the methods presented in such a way that they can be reproduced? Yes. The methodology is explained in detail.

      Our Response: Thank you.

      __Reviewer #3, Comment #13: __Are the experiments adequately replicated and statistical analysis adequate? Yes. Relevant information is either in the figure legends or is provided in the source data.

      Our Response: Thank you.

      Minor comments:

      __Reviewer #3, Comment #14: __Are prior studies referenced appropriately? The references are generally appropriate, with a few exceptions of more general references used

      Our Response: Thank you.

      __Reviewer #3, Comment #15: __Are the text and figures clear and accurate? The text is clearly written, and the figures are clear.

      Our Response: Thank you.

      __Reviewer #3, Comment #16: __Do you have suggestions that would help the authors improve the presentation of their data and conclusions? A summary figure comparing the changing profiles of WT and C1264R and the factors implicated for them could be helpful.

      Our Response: We opted not to include a summary figure because the paper and figures are already dense in information.

      __Reviewer #3, Comment #17: __Perhaps include common nomenclature for proteins as well (e.g. HSPA5 - BiP, HSP90B1 - Grp94, etc.)

      Our Response: We updated the manuscript throughout to reference common nomenclature or other protein names where applicable at their first mention.

      __Reviewer #3, Comment #18: __Line 317 - our is misspelled

      Our Response: Thank you. We have made this correction.

      __Reviewer #3, Comment #19: __Figure 4 - Supplemental Figure 1 - Legend has text referring to panels J and K, but Figure only goes up to F.

      Our Response: Thank you. This was an error in references to Figure panel lettering and we have since corrected this. Please note that this Figure is now Fig EV6.

      Reviewer #3 (Significance (Required)):

      __Reviewer #3, Comment #20: __

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      Protein-protein interactions are often used to illustrate complexes and functionality, but these provide only snapshots, rather than "movies". There are many datasets out there exploring P-P interactions, but most if not all lack any temporal resolution for the interactions they report. The TRIP method described approaches this from the dynamic perspective - identifying the transient interactions formed by folding nascent chains with proteins that aid in their maturation and trafficking, or degradation. This represents an important technical advance in our ability to dynamically monitor protein interactions. The use of Tg mutants is valuable and perhaps this will lead to new perspectives on how to rescue it or other pathophysiological mutants with loss of function phenotypes.

      • State what audience might be interested in and influenced by the reported findings.

      This work should appeal to a broad audience within cell biology, particularly as the TRIP technique is attempting to address a fundamental question - what interactions form during the biogenesis/lifetime of a protein. Moreover, the effort to try to understand the different interactions formed with pathologically relevant mutant proteins as a strategy to try to rescue functionality, is a valuable exercise of this approach.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      ER quality control

      Our Response: We thank reviewer #3 for this positive endorsement.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this manuscript, Wright et al. developed an approach (termed TRIP) that allows mapping of the temporal changes in the interaction landscape of a newly synthesized protein of interest. Using their TRIP approach, the authors found that the extensive interactions of thyroglobulin (Tg) with the proteostasis network (PN) during its passage through the secretory pathway were profoundly altered in response to disease-causing mutations (e.g. C1264R). The authors cross-validated their findings with a focused RNAi screen monitoring the cellular and secreted abundance of Tg variants upon depletion of PN components. In subsequent experiments the authors focused on two hits, VCP and TEX264, for which they confirmed their inhibitory effect on the secretion of Tg C1264R. Importantly, the authors found that TEX264 increasingly interacts with the Tg mutant and that pharmacological inhibition of VCP yielded the same phenotype as depletion of VCP. Overall, Wright and colleagues __established an elegant method to map protein interactions in a time-resolved manner and demonstrated its value by the analysis of disease-related Tg mutants__. Hence, this work has the potential to serve as a rich resource for Tg-related research and as a powerful new tool to examine protein interactions. However, several concerns remain.

      Our response: Thank you to reviewer #4 for their valuable feedback and positive assessment. We addressed all comments in detail below.

      Major points:

      __Reviewer #4, Comment #1: __Overall, the TRIP workflow is quite difficult to understand at a first glance - even for a reader with a background in proteomics, biochemistry and cell biology. The authors may want to improve the description of the TRIP methodology and explain in more detail what the individual components and steps are good for. Along the same line, from the main text and the figure legend it was not clear that Tg was actually Flag-tagged. However, without this information it is difficult to follow the workflow. While Figure 1A is certainly helpful, the bulky graphics are deflecting the reader's attention. A more schematic version might be more informative.

      Our Response: Thank you for this feedback, which was also mirrored by Reviewer #2 (comment 10). We have made significant updates to Fig 1 to provide more detail and eliminate some of the unnecessary bulky graphics. We also expanded the schematic for the TRIP workflow in Fig 2A and aligned all symbols used. Furthermore, we have edited the text to describe the workflow more clearly:

      "To develop the time-resolved interactome profiling method, we envisioned a two-stage enrichment strategy utilizing epitope-tagged immunoprecipitation coupled with pulsed bioorthogonal unnatural amino acid labeling and functionalization (Fig 1A). Cells can be pulse labeled with homopropargylglycine (Hpg) to synchronize newly synthesized populations of protein. After pulsed labeling with Hpg, samples can then be collected across time points throughout a chase period (Fig 1A, Box 1) (Kiick et al, 2001; Beatty et al, 2006). The Hpg alkyne incorporated into the newly synthesized population of protein can be conjugated to biotin using copper-catalyzed alkyne-azide cycloaddition (CuAAC) (Fig 1A, Box 2). Subsequently, the first stage of the enrichment strategy can take place where the client protein of interest is globally captured and enriched using epitope-tagged immunoprecipitation, followed by elution (Fig 1A, Box 3). The second enrichment step can then utilize a biotin-streptavidin pulldown to capture the Hpg pulse-labeled and CuAAC-conjugated population, enriching samples into time-resolved fractions (Fig 1A, Box 4) (Li et al, 2020; Thompson et al, 2019)."

      Additionally, we have improved the text to state very clearly that for the TRIP experiments Tg is FLAG-tagged and that this epitope tag is required for the two-stage enrichment strategy. As one small example:

      "Thyroglobulin was chosen as the model secretory client protein. We generated isogenic Fischer rat thyroid (FRT) cells that stably expressed FLAG-tagged Tg (Tg-FT), including WT or mutant variants (A2234D and C1264R) (Fig EV1)."

      "Furthermore, the C-terminal FLAG-tag and Hpg labeling are necessary for this two-stage enrichment strategy, and DSP crosslinking is necessary to capture these interactions after stringent wash steps (Fig 1D, Fig EV2)."

      __Reviewer #4, Comment #2: __To what extent does the difference in protein abundance between Tg WT and Tg C1264R contribute to the increased binding of their interactors (e.g., HSPA5 and PDIA4)? The authors should perform a TRIP-coupled immunoblot analysis where WT and mutant are loaded side-by-side on the SDS-PAGE.

      Our Response: As Reviewer #3 (comment 5) had a similar inquiry, we provide the same response as listed above:

      We do not foresee an impact from the level of Tg expression on the profiles obtained by TRIP. We were able to identify distinct profiles because we processed the data and normalized it based on the relative Tg amount. For example, while WT and A2234D Tg are expressed at similar levels intracellularly, we were able to identify distinct differences in the interaction profiles across the two constructs. When developing FRT clones, we selected those that were expressed at similar levels and, therefore, did not have the capability to directly test differences, if any, in observed profiles that may be the result of different expression levels of the same Tg construct. Furthermore, Tg can make up 50% of all protein content within thyroid tissue (Di Jeso & Arvan, 2016). As such, thyroid cells are adept at maintaining the balance of QC machinery to process thyroid. Therefore, we do not anticipate that the amount of Tg expressed in TRIP experiments would have a significant impact on the profiles that we were able to observe.
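      As a purely illustrative sketch of the normalization argument above, the Python snippet below divides each interactor signal by the bait (Tg) signal from the same sample, so that two constructs expressed at different levels can be compared on a per-Tg basis. The function name `normalize_to_bait`, the protein keys, and all intensity values are hypothetical, and a simple ratio is an assumption made for illustration; it is not the processing pipeline used for the TRIP data.

```python
def normalize_to_bait(intensities, bait="Tg"):
    """Return interactor intensities expressed relative to the bait intensity.

    `intensities` maps protein names to raw signal in one sample.
    A simple ratio is assumed here for illustration only.
    """
    bait_signal = intensities[bait]
    if bait_signal <= 0:
        raise ValueError("bait signal must be positive to normalize")
    return {
        protein: value / bait_signal
        for protein, value in intensities.items()
        if protein != bait
    }


# Hypothetical raw intensities (arbitrary units), not measured data.
wt_sample = {"Tg": 200.0, "HSPA5": 50.0, "PDIA4": 20.0}
mutant_sample = {"Tg": 100.0, "HSPA5": 50.0, "PDIA4": 10.0}

wt_norm = normalize_to_bait(wt_sample)
mutant_norm = normalize_to_bait(mutant_sample)
```

      In this toy case the raw HSPA5 signal is identical in both samples, yet the per-Tg value is twice as high for the mutant, whereas PDIA4 scales with the bait and is unchanged after normalization.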

      __Reviewer #4, Comment #3: __While the RNAi screen was done with pooled siRNA, it is not clear what was used for the RNAi validation experiments shown in Figure 5. This should be done by individual siRNA and not the same pooled reagents as used for the screen.

      Our Response: Similarly, pooled siRNAs were initially utilized for the data shown in Figure 5. The RNAi screen utilized siRNAs optimized for human cells, whereas those used for Figure 5 were for rat cells. For the revisions, we performed control experiments with individual siRNAs, which are now shown in Fig EV7J,K. While we did not find that any one single siRNA recapitulated the full phenotype, we did find that several single siRNAs for VCP and TEX264 at least partially restored the observed phenotype of increased C1264R Tg secretion. This result is expected, as we reasoned that the pooled siRNAs likely act additively to produce the observed phenotypes. We provided these single siRNA control experiments in Fig EV7J,K, and updated the manuscript text as follows:

      "Several individual VCP and TEX264 siRNAs were able to partially recapitulate this increased secretion phenotype on C1264R Tg-FT, confirming that the effect is mediated by the respective gene silencing (Fig EV7J,K)."

      __Reviewer #4, Comment #4: __In Figure 5A it is not clear which band was used to quantify the effect of NAPA reduction. Also, this analysis lacks normalization to an unrelated protein or loading control. Moreover, the authors should also examine the effect of the siRNA targets shown in Figure 5C for Tg WT and not only the mutant.

      Our Response: The uppermost band in Fig 5A was used for quantification. We added a red asterisk similar to that found in Fig 5C to denote the lower band in the lysate panel(s) as a non-specific background band found within the Western blot. These data are the result of immunoprecipitations of both cell lysate and medium content; as such, there is no applicable loading control that can be used within the western blots. For experiments, cell amounts were normalized by seeding and subsequently culturing the same amount of cells, as denoted within the Materials and Methods - FRT siRNA validation studies section of the manuscript. Furthermore, there are no loading controls that are easily utilized for analyzing cell culture medium. We have further clarified the Fig 5 caption to provide clearer experimental detail:

      "(A and B) Western blot analysis (A) and quantification (B) of WT Tg-FT secretion from FRT cells transfected with select siRNA hits from the initial screening data set. Red asterisk denotes a non-specific background band within the western blot. Cells were transfected with 25nM siRNAs for 36 hrs, media exchanged and conditioned for 4 hrs, Tg-FT was immunoprecipitated from lysate and media samples, and Tg-FT amounts were analyzed via immunoblotting. N = 6.

      (C and D) Western blot analysis (C) and quantification (D) of C1264R Tg-FT secretion from FRT cells transfected with select siRNA hits from the initial screening data set. Red asterisk denotes a non-specific background band within the western blot. Cells were transfected with 25nM siRNAs for 36 hrs, media exchanged and conditioned for 8 hrs, Tg-FT was immunoprecipitated from lysate and media samples, and Tg-FT amounts were analyzed via immunoblotting. All statistical testing performed using an unpaired Student's t-test with Welch's correction. *p

      Finally, as the siRNA targets shown in Fig 5C were shown to be hits exclusively for C1264R Tg-FT, we did not believe it was necessary to follow up on these with WT Tg-FT. Similarly, we did not follow up on hits that were exclusive to WT Tg-FT with C1264R and A2234D Tg-FT.

      __Reviewer #4, Comment #5: __The authors should also test for the binding of RTN3 to Tg WT and mutant - in particular in comparison to TEX264. This would be important in the context that only RTN3 but not TEX264 was detected in the TRIP approach. Do the authors also detect VCP and LC3B in their pulldowns?

      Our response: Please also see Reviewer #3, comment 9, who made a similar point.

      We performed follow-up experiments to test interactions with Tg and the wider panel of ER-phagy receptors. We transiently expressed FLAG-tagged CCPG1, RTN3L, and TEX264 in HEK293 cells stably expressing Tg-NLuc and performed FLAG IPs followed by western blot analysis. We found that WT and C1264R Tg were enriched, albeit modestly, in the RTN3L Co-IP compared to control samples expressing GFP. Additionally, we found that WT, A2234D, and C1264R Tg were all enriched with CCPG1 compared to control samples expressing GFP. CCPG1 was found to be a C1264R Tg interactor within our mass spectrometry datasets, along with RTN3. We have now integrated these data into the manuscript as Fig EV8, and updated the manuscript text as follows:

      "Additionally, we monitored Tg enrichment with ER-phagy receptors CCPG1 and RTN3 via Western blot as both were found to be C1264R Tg interactors within our TRIP dataset. RTN3L is found to be the only RTN3 isoform involved in ER turnover via ER-phagy (Grumati et al, 2017). WT and C1264R Tg-NLuc were modestly enriched with RTN3L compared to control samples expressing GFP. Conversely, we found that all Tg variants exhibited modest interactions with CCPG1 compared to control samples expressing GFP, although less than with TEX264 (Fig EV8).

      Together, these data suggest that TEX264, CCPG1, or RTN3L engage with Tg during processing, and CH-associated Tg mutants may be selectively targeted to TEX264. Furthermore, ER-phagy may be considered as a degradative pathway in Tg processing, as other studies have mainly focused on Tg degradation through ERAD (Tokunaga et al, 2000; Menon et al, 2007)."

      Regarding VCP, we can detect it routinely in our AP-MS experiments as presented previously (Wright et al. 2021), and here in Fig 3 and Appendix Fig S1. However, we have not been able to detect interactions via western blot, which may be attributed to the increased sensitivity that LC-MS offers. We have not probed for LC3 interactions via western blot as we did not detect it by LC-MS either, but we identified several lysosomal and other autophagy-related components previously (Wright et al. 2021), and here shown in Appendix Fig S1 and Fig EV5C.

      __Reviewer #4, Comment #6: __The effect of TEX264 depletion on Tg secretion should be confirmed by TEX264 KO experiments. Do the authors observe a similar increase in secreted Tg C1264R in BafA1- or SAR405-treated cells? Moreover, the authors should show that Tg C1264R is actually delivered to lysosomes using biochemical assays such as LysoIP or colocalization experiments.

      Our response: To address this concern, we generated stable TEX264 knockout FRT cell lines by CRISPR, and probed several clones for their impact on Tg secretion. We found that TEX264 knockout did not recapitulate the increase in C1264R Tg secretion observed with transient siRNA knockdown. While disappointing, these results are not necessarily surprising, considering that prolonged loss of TEX264 may lead the cell to adopt compensatory mechanisms by modulating other proteostasis factors and/or autophagy machinery.

      We performed experiments utilizing the autophagy inhibitor Bafilomycin A1, and have now included these results in the manuscript in Fig EV10A,B. We found that BafA1 treatment led to the accumulation of WT Tg in the lysate but not of C1264R Tg. We updated the manuscript text to accompany these data as follows:

      "To understand whether this rescue in secretion was uniquely linked to VCP inhibition or could be more broadly attributed to blocking Tg degradation, we tested the proteasomal inhibitor bortezomib, and lysosomal inhibitor bafilomycin. Bafilomycin increased WT Tg lysate abundance, and bortezomib significantly increased A2234D lysate abundance, consistent with a role of these degradation processes in Tg PQC (Fig EV10A). When monitoring Tg-NLuc media abundance, neither bafilomycin nor bortezomib significantly altered WT, A2234D, or C1264R abundance (Fig EV10B), confirming that general inhibition of proteasomal or lysosomal degradation does not rescue mutant Tg secretion."

      These results raise the possibility that the mutant Tg interaction with TEX264 may not lead to active autophagic degradation of mutant Tg. This is also consistent with the slow degradation of C1264R Tg observed in the pulse-chase experiment in Fig EV9A. Whether the TEX264 recruitment of mutant Tg leads to degradation or assumes an alternative function, for example, intracellular sequestration, remains to be tested. Importantly, we have refrained from making claims in the manuscript that C1264R Tg is delivered to the lysosome but have presented data showing that it interacts with ER-phagy-related components and have further speculated on how autophagy could play a role in Tg processing.

      Thank you for the LysoIP suggestion. Ongoing work in the lab is addressing this question and experiments suggested by the reviewer, but this is better reserved for a follow-up manuscript.

      __Reviewer #4, Comment #7: __Figure 7A and 7C lack loading controls. The quantification shown in Figure 7B and 7D should be normalized to this control. Since VCP activity is often coupled to that of the proteasome, the authors should check whether blocking the proteasome yields a similar effect to ML-240.

      Our Response: Like Fig 5A discussed above (Reviewer #4, comment 4), these data are the result of immunoprecipitations from cell lysate and medium. As a result, there is no applicable loading control that can be used within the western blots. For experiments, cell amounts were normalized by seeding and subsequently culturing the same amount of cells, as denoted within the Materials and Methods - FRT siRNA validation studies section of the manuscript and Materials and Methods - VCP pharmacological inhibition studies.

      Regarding the effect of proteasome inhibition, we tested whether bortezomib treatment can increase C1264R Tg secretion. We found that bortezomib led to a small but significant increase in A2234D Tg accumulation in the lysate, but did not increase secretion of Tg for WT or any of the mutant variants. These new data are shown in Fig EV10A,B. We updated the text as follows:

      "To understand whether this rescue in secretion was uniquely linked to VCP inhibition or could be more broadly attributed to blocking Tg degradation, we tested the proteasomal inhibitor bortezomib, and lysosomal inhibitor bafilomycin. Bafilomycin increased WT Tg lysate abundance, and bortezomib significantly increased A2234D lysate abundance, consistent with a role of these degradation processes in Tg PQC (Fig EV10A). When monitoring Tg-NLuc media abundance, neither bafilomycin nor bortezomib significantly altered WT, A2234D, or C1264R abundance (Fig EV10B), confirming that general inhibition of proteasomal or lysosomal degradation does not rescue mutant Tg secretion."

      __Reviewer #4, Comment #8: __With regard to Figure 7 - Figure supplement 1: The authors should monitor the effect of ML-240 on Tg secretion such that WT and C1264R mutants are directly compared (side-by-side on the same immunoblot). Otherwise, it is difficult to claim that ML-240 rescues the secretion of the mutant.

      Our response: The reviewer is referring to the 35S pulse-chase experiments now shown in Fig EV9. We would like to clarify that these images are not immunoblots but autoradiographs. Even though the samples for WT and C1264R Tg were loaded onto separate gels, the gels were imaged at the same time and are therefore directly comparable. Regardless, the more meaningful information that can be gleaned from these experiments is the absolute rates of protein secretion and degradation and how they change in response to ML-240 treatment. The scale in the quantifications (0 - 100%) is the same and corresponds to the total amount of WT or C1264R Tg that is labeled with 35S during the 30 min pulse. Importantly, we found that C1264R Tg-FT secretion is significantly increased in the presence of ML-240, changing from

      __Reviewer #4, Comment #9: __How did ML-240 affect the ER-phagy components (in particular RTN3) in the TRIP analysis of Tg C1264R (Figure 7G-L)?

      Our response: This is a great discussion point raised by reviewer #4. We have updated the manuscript text to discuss in more detail changes in interactions with degradation components, especially with proteasomal degradation machinery (Fig 7M,N). The manuscript text now reads as follows:

      "The most striking interaction changes occurred with proteasomal degradation components, which remained steady until 1.5 hr, but then abruptly declined with ML-240 treatment at later time points (Fig 7M,N). This decline tracks with changes to the glycan processing machinery, highlighting that the coordination between N-glycosylation and diverting Tg away from ERAD may be a key to the rescue mechanism."

      Minor points:

      __Reviewer #4, Comment #10: __The candidate labeling in Figure 3 - Figure supplement 2 and 3 is too small and unreadable. The authors should provide a higher resolution of these figures or increase the font.

      Our response: These figures are now in the Appendix and we have edited this figure to provide higher resolution.

      Reviewer #4 (Significance (Required)):

      Please see above

    1. Author response:

      We would like to thank the reviewers for their constructive feedback. We have thoroughly considered their concerns and comments and we aim to include some additional results in an updated version of this manuscript. In addition, we would like to address some of the comments, with which we respectfully disagree. Below is our point-by-point reply.

      Reviewer 1:

      Summary:

      This paper is focused on the role of Cadherin Flamingo (Fmi) - also called Starry night (stan) - in cell competition in developing Drosophila tissues. A primary genetic tool is monitoring tissue overgrowths caused by making clones in the eye disc that express activated Ras (RasV12) and that are depleted for the polarity gene scribble (scrib). The main system that they use is ey-flp, which makes continuous clones in the developing eye-antennal disc beginning at the earliest stages of disc development. It should be noted that RasV12, scrib-i (or lgl-i) clones only lead to tumors/overgrowths when generated by continuous clones, which presumably creates a privileged environment that insulates them from competition. Discrete (hs-flp) RasV12, lgl-i clones are in fact out-competed (PMID: 20679206), which is something to bear in mind. 

      We think it is unlikely that the outcome of RasV12, scrib (or lgl) competition depends on discrete vs. continuous clones or on creation of a privileged environment. As shown in the same reference mentioned by the reviewer, the outcome of RasV12, scrib (or lgl) tumors greatly depends on the clone being able to grow to a certain size. The authors show instances of discrete clones where larger RasV12, lgl clones outcompete the surrounding tissue and eliminate WT cells by apoptosis, whereas smaller clones behave more like losers. It is not clear what aspect of the environment determines the ability of some clones to grow larger than others, but in neither case are the clones prevented from competition. Other studies show that in mammalian cells, RasV12, scrib clones are capable of outcompeting the surrounding tissue, such as in Kohashi et al (2021), where cells carrying both mutations actively eliminate their neighbors.

      The authors show that clonal loss of Fmi by an allele or by RNAi in the RasV12, scrib-i tumors suppresses their growth in both the eye disc (continuous clones) and wing disc (discrete clones). The authors attributed this result to less killing of WT neighbors when Myc over-expressing clones lack Fmi, but another interpretation (that Fmi regulates clonal growth) is equally plausible with the current results.

      See point (1) for a discussion on this.

      Next, the authors show that scrib-RNAi clones that are normally out-competed by WT cells prior to adult stages are present in higher numbers when WT cells are depleted for Fmi. They then examine death in RasV12, scrib-i ey-FLP clones, or in discrete hs-FLP UAS-Myc clones. They state that they see death in WT cells neighboring RasV12, scrib-i clones in the eye disc (Figures 4A-C). Next, they write that RasV12, scrib-i cells become losers (i.e., have apoptosis markers) when Fmi is removed. Neither of these results is quantified and thus they are not compelling. They state that a similar result is observed for Myc over-expression clones that lack Fmi, but the image was not compelling, the results are not quantified and the controls are missing (Myc over-expressing clones alone and Fmi clones alone).

      We assayed apoptosis in UAS-Myc clones in eye discs but neglected to include the results in Figure 4. We will include them in the updated manuscript. Regarding Fmi clones alone, we direct the reviewer’s attention to Fig. 2 Supplement 1 where we showed that fminull clones cause no competition. Dcp-1 staining showed low levels of apoptosis unrelated to the fminull clones or twin-spots, and we will comment on this in the revised manuscript.

      Regarding the quantification of apoptosis, we did not provide a quantification, in part because we observe a very clear visual difference between groups (Fig. 4A-K), and in part because it is challenging to come up with a rigorous quantification method. For example, how far from a winner clone can an apoptotic cell be and still be considered responsive to the clone? For UAS-Myc winner clones, we observe a modest amount of cell death both inside and outside the clones, consistent with prior observations. For fminull UAS-Myc clones, we observe vastly more cell death within the fminull UAS-Myc clones and modest death in nearby wildtype cells, and consequently a much higher ratio of cell death inside vs outside the clone. Because of the somewhat arbitrary nature of quantification, and the dramatic difference, we initially chose not to provide a quantification. However, given the request, we chose an arbitrary distance from the clone boundary in which to consider dying cells and counted the numbers for each condition. We view this as a very soft quantification, but will report it in a way that captures the phenomenon in the revised manuscript.
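      The distance-based counting described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure we used: the function name `classify_dying_cells`, the pixel-grid representation of a clone, the coordinates, and the cutoff `max_dist` are all hypothetical. Dying cells are scored as inside the clone, or as nearby if they lie outside the clone but within the chosen cutoff of its nearest pixel; cells further away are ignored.

```python
import math


def classify_dying_cells(clone_pixels, dying_cells, max_dist):
    """Count dying cells inside a clone and outside but within max_dist of it.

    clone_pixels: iterable of (x, y) grid positions belonging to the clone.
    dying_cells:  iterable of (x, y) positions of apoptotic cells.
    max_dist:     arbitrary cutoff distance, in the same units as the grid.
    """
    clone = set(clone_pixels)
    if not clone:
        raise ValueError("clone must contain at least one pixel")
    inside = 0
    nearby = 0
    for cell in dying_cells:
        if cell in clone:
            inside += 1
            continue
        # Euclidean distance to the closest clone pixel (brute force).
        nearest = min(math.dist(cell, pixel) for pixel in clone)
        if nearest <= max_dist:
            nearby += 1
    return inside, nearby


# Hypothetical toy data: a 3x3 clone and four dying cells.
clone_pixels = [(x, y) for x in range(3) for y in range(3)]
dying_cells = [(1, 1), (0, 2), (4, 1), (10, 10)]
inside, nearby = classify_dying_cells(clone_pixels, dying_cells, max_dist=2.0)
```

      Because the outcome depends directly on `max_dist`, repeating the count over a few cutoffs is one way to check that a reported inside-versus-outside difference does not hinge on a single arbitrary threshold.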

      They then want to test whether Myc over-expressing clones have more proliferation. They show an image of a wing disc that has many small Myc overexpressing clones with and without Fmi. The pHH3 results support their conclusion that Myc overexpressing clones have more pHH3, but I have reservations about the many clones in these panels (Figures 5L-N).

      As the reviewer’s reservations are not specified, we have no specific response.

      They show that the cell competition roles of Fmi are not shared by another PCP component and are not due to the Cadherin domain of Fmi. The authors appear to interpret their results as Fmi is required for winner status. Overall, some of these results are potentially interesting and at least partially supported by the data, but others are not supported by the data.

      Strengths: 

      Fmi has been studied for its role in planar cell polarity, and its potential role in competition is interesting.

      Weaknesses:

      (1) In the Myc over-expression experiments, the increased size of the Myc clones could be because they divide faster (but don't outcompete WT neighbors). If the authors want to conclude that the bigger size of the Myc clones is due to out-competition of WT neighbors, they should measure cell death across many discs with these clones. They should also assess if reducing apoptosis (like using one copy of the H99 deficiency that removes hid, rpr, and grim) suppresses winner clone size. If cell death is not addressed experimentally and quantified rigorously, then their results could be explained by faster division of Myc over-expressing clones (and not death of neighbors). This could also apply to the RasV12, scrib-i results.

      Indeed, Myc clones have been shown to divide faster than WT neighbors, but that is not the only reason clones are bigger. As shown in (de la Cova et al, 2004), Myc-overexpressing cells induce apoptosis in WT neighbors, and blocking this apoptosis results in larger wings due to increased presence of WT cells. Also, (Moreno and Basler, 2004) showed that Myc-overexpressing clones cause a reduction in WT clone size, as WT twin spots adjacent to 4xMyc clones are significantly smaller than WT twin spots adjacent to WT clones. In the same work, they show complete elimination of WT clones generated in a tub-Myc background. Since then, multiple papers have shown these same results. It is well established then that increased cell proliferation transforms Myc clones into supercompetitors and that in the absence of cell competition, Myc-overexpressing discs produce instead wings larger than usual.

      In (de la Cova et al, 2004) the authors already showed that blocking apoptosis with H99 hinders competition and causes wings with Myc clones to be larger than those where apoptosis wasn’t blocked. As these results are well established from prior literature, there is no need to repeat them here.

      (2) This same comment about Fmi affecting clone growth should be considered in the scrib RNAi clones in Figure 3.

      In later stages, scrib RNAi clones in the eye are eliminated by WT cells. While scrib RNAi clones are not substantially smaller in third instar when competing against fmi cells (Fig 3M), by adulthood we see that WT clones lacking Fmi have failed to remove scrib clones, unlike WT clones that have completely eliminated the scrib RNAi clones by this time. We therefore disagree that the only effect of Fmi could be related to rate of cell division.

      (3) I don't understand why the quantifications of clone areas in Figures 2D, 2H, 6D are log values. The simple ratio of GFP/RFP should be shown. Additionally, in some of the samples (e.g., fmiE59 >> Myc, only 5 discs and fmiE59 vs >Myc only 4 discs are quantified but other samples have more than 10 discs). I suggest that the authors increase the number of discs that they count in each genotype to at least 20 and then standardize this number.

      Log(ratio) values are easier to interpret than a linear scale. If represented linearly, 1 means equal ratios of A and B, while 2A/B is 2 and A/2B is 0.5. And the higher the ratio difference between A and B, the starker this effect becomes, making a linear scale deceiving to the eye, especially when decreased ratios are shown. Using log(ratios), a value of 0 means equal ratios, and increased and decreased ratios deviate equally from 0.
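
      The symmetry argument can be checked numerically. The sketch below is illustrative only: the helper name `log_ratio` and the choice of base 2 are assumptions made for this example, since the response does not state which logarithm base the figures use.

```python
import math


def log_ratio(area_a: float, area_b: float) -> float:
    """Return log2(area_a / area_b); 0 means the two areas are equal.

    Base 2 is an assumption for illustration, not necessarily the base
    used in the figures.
    """
    if area_a <= 0 or area_b <= 0:
        raise ValueError("areas must be positive")
    return math.log2(area_a / area_b)


# Compare a doubling of A relative to B, equality, and a halving.
for a, b in [(2.0, 1.0), (1.0, 1.0), (1.0, 2.0)]:
    linear = a / b
    print(
        f"A/B = {linear:4.2f}  distance from 1 = {abs(linear - 1):4.2f}  "
        f"log2(A/B) = {log_ratio(a, b):+5.2f}"
    )
```

      On the linear scale a doubling sits 1.0 away from equality while a halving sits only 0.5 away; on the log scale the two changes are equally far from 0, in opposite directions.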

      Statistically, either analyzing a standardized number of discs for all conditions or a variable number not determined beforehand has no effect on the p-value, as long as the variable n number is not manipulated by p-hacking techniques, such as increasing the n of samples until a significant p-value has been obtained. While some of our groups have lower numbers, all statistical analyses were performed after all samples were collected. For all results obtained by cell counts, all samples had a minimum of 10 discs due to the inherent though modest variability of our automated cell counts, and we analyzed all the discs that we obtained from a given experiment, never “cherry-picking” examples. For the sake of transparency, all our graphs show individual values in addition to the distributions so that the reader knows the n values at a glance.
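
      The distinction between an unequal n and a manipulated n can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the analysis used in the manuscript: both groups are drawn from a null distribution with known unit variance, a two-sided z-test stands in for whatever test was actually applied, and the function names and sample sizes are hypothetical.

```python
import math
import random


def p_value(sample):
    """Two-sided z-test p-value for mean 0, assuming known unit variance."""
    z = sum(sample) / math.sqrt(len(sample))
    return math.erfc(abs(z) / math.sqrt(2.0))


def fixed_n_trial(rng, n=10, alpha=0.05):
    """Collect all n samples first, then test once."""
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return p_value(sample) < alpha


def optional_stopping_trial(rng, n_start=4, n_max=30, alpha=0.05):
    """Test after every added sample and stop as soon as p < alpha."""
    sample = [rng.gauss(0.0, 1.0) for _ in range(n_start)]
    while True:
        if p_value(sample) < alpha:
            return True
        if len(sample) >= n_max:
            return False
        sample.append(rng.gauss(0.0, 1.0))


def false_positive_rate(trial, n_sims=2000, seed=0):
    """Fraction of null experiments that are declared significant."""
    rng = random.Random(seed)
    return sum(trial(rng) for _ in range(n_sims)) / n_sims


fixed_rate = false_positive_rate(fixed_n_trial)
peeking_rate = false_positive_rate(optional_stopping_trial)
print(f"test once after collection: {fixed_rate:.3f}")
print(f"add samples until p < 0.05: {peeking_rate:.3f}")
```

      Because there is no true effect in the simulated data, every significant result is a false positive. Testing once after collection keeps that rate near the nominal 5%, whereas repeatedly testing while adding samples inflates it.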

      (5) Figure 4 - shows examples of cell death. Cas3 is written on the figure but Dcp-1 is written in the results. Which antibody was used? The authors need to quantify these results. They also need to show that the death of cells is part of the phenotype, like an H99 deficiency, etc (see above).

      Thank you for flagging this error. We used cleaved Dcp-1 staining to detect cell death, not Cas3 (Drice in Drosophila). We will update all panels replacing Cas3 by Dcp-1.

      As described above, cell death is a well-established consequence of Myc overexpression, and we feel there is no need to repeat that result. To what extent loss of Fmi induces excess cell death or reduces proliferation in “would-be” winners, and to what extent it reduces “would-be” winners’ ability to eliminate competitors are interesting mechanistic questions that are beyond the scope of the current manuscript.

      (6) It is well established that clones overexpressing Myc have increased cell death. The authors should consider this when interpreting their results.

      We are aware that Myc-overexpressing clones have increased cell death, but it has also been demonstrated that despite that fact, they behave as winners and eliminate WT neighboring cells. And as mentioned in comment (1), WT clones generated in a 3x and 4x Myc background are eliminated and removed from the tissue, and blocking cell death increases the size of WT “losers” clones adjacent to Myc overexpressing clones.

      (7) A better characterization of discrete Fmi clones would also be helpful. I suggest inducing hs-flp clones in the eye or wing disc and then determining clone size vs twin spot size and also examining cell death etc. If such experiments have already been done and published, the authors should include a description of such work in the preprint.

      We have already analyzed the size of discrete Fmi clones and showed that they did not cause any competition, with fmi-null clones having the same size as WT clones in both eye and wing discs. We direct the reviewer’s attention to Figure 2 Supplement 1.

      (8) We need more information about the expression pattern of Fmi. Is it expressed in all cells in imaginal discs? Are there any patterns of expression during larval and pupal development?

      Fmi is equally expressed by all cells in all imaginal discs in Drosophila larva and pupa. We will include this information in the updated manuscript.

      (9) Overall, the paper is written for specialists who work in cell competition and is fairly difficult to follow, and I suggest re-writing the results to make it accessible to a broader audience.

      We have endeavored to both provide an accessible narrative and also describe in sufficient detail the data from multiple models of competition and complex genetic systems. We hope that most readers will be able, at a minimum, to follow our interpretations and the key takeaways, while those wishing to examine the nuts and bolts of the argument will find what they need presented as simply as possible.

      Reviewer 2:

      Summary:

      In this manuscript, Bosch et al. reveal Flamingo (Fmi), a planar cell polarity (PCP) protein, is essential for maintaining 'winner' cells in cell competition, using Drosophila imaginal epithelia as a model. They argue that tumor growth induced by scrib-RNAi and RasV12 competition is slowed by Fmi depletion. This effect is unique to Fmi, not seen with other PCP proteins. Additional cell competition models are applied to further confirm Fmi's role in 'winner' cells. The authors also show that Fmi's role in cell competition is separate from its function in PCP formation.

      We would like to thank the reviewer for their thoughtful and positive review.

      Strengths:

      (1) The identification of Fmi as a potential regulator of cell competition under various conditions is interesting.

      (2) The authors demonstrate that the involvement of Fmi in cell competition is distinct from its role in planar cell polarity (PCP) development.

      Weaknesses:

      (1) The authors provide a superficial description of the related phenotypes, lacking a comprehensive mechanistic understanding. Induction of apoptosis and JNK activation are general outcomes, but it is important to determine how they are specifically induced in Fmi-depleted clones. The authors should take advantage of the power of fly genetics and conduct a series of genetic epistasis analyses.

      We appreciate that this manuscript does not address the mechanism by which Fmi participates in cell competition. Our intent here is to demonstrate that Fmi is a key contributor to competition. We do aim to delve into mechanism and are currently directing our efforts to exploring how Fmi regulates competition, but the size of the project and the required experiments are outside the scope of this manuscript. We feel that our current findings are sufficiently valuable to merit sharing while we continue to investigate the mechanism linking Fmi to competition.

      (2) The depletion of Fmi may not have had a significant impact on cell competition; instead, it is more likely to have solely facilitated the induction of apoptosis.

      We respectfully disagree for several reasons. First, loss of Fmi is specific to winners; loss of Fmi has no effect on its own or in losers when confronting winners in competition. And in the Ras V12 tumor model, loss of Fmi did not perturb whole eye tumors – it only impaired tumor growth when tumors were confronted with competitors. We agree that induction of apoptosis is affected, but so too is proliferation, and only when in winners in competition.

      (3) To make a solid conclusion for Figure 1, the authors should investigate whether complete removal of Fmi by a mutant allele affects tumor growth induced by expressing RasV12 and scrib RNAi throughout the eye.

      We agree with the reviewer that this is a worthwhile experiment, given that RNAi has its limitations. However, as fmi is homozygous lethal at the embryo stage, one cannot create whole disc tumors mutant for fmi. As an approximation to this condition, we have introduced the GMR-Hid, cell-lethal combination to eliminate non-tumor tissue in the eye disc. Following elimination of non-tumor cells, there remains essentially a whole disc harboring fminull tumor. Indeed, this shows that whole fminull tumors overgrow similar to control tumors, confirming that the lack of Fmi only affects clonal tumors. We will provide those results in the updated manuscript.

      (4) The authors should test whether the expression level of Fmi (both mRNA and protein) changes during tumorigenesis and cell competition.

      This is an intriguing point that we would like to validate. We are currently performing immunostaining for Fmi in clones to confirm whether its levels change during competition. We will provide these results in the updated manuscript.

      Reviewer 3:

      Summary:

      In this manuscript, Bosch and colleagues describe an unexpected function of Flamingo, a core component of the planar cell polarity pathway, in cell competition in the Drosophila wing and eye disc. While Flamingo depletion has no impact on tumour growth (upon induction of Ras and depletion of Scribble throughout the eye disc), and no impact when depleted in WT cells, it specifically tunes down winner clone expansion in various genetic contexts, including the overexpression of Myc, the combination of Scribble depletion with activation of Ras in clones or the early clonal depletion of Scribble in eye disc. Flamingo depletion reduces the proliferation rate and increases the rate of apoptosis in the winner clones, hence reducing their competitiveness up to forcing their full elimination (hence becoming now "loser"). This function of Flamingo in cell competition is specific to Flamingo as it cannot be recapitulated with other components of the PCP pathway, and does not rely on the interaction of Flamingo in trans, nor on the presence of its cadherin domain. Thus, this function is likely to rely on a non-canonical function of Flamingo which may rely on downstream GPCR signaling.

      This unexpected function of Flamingo is by itself very interesting. In the framework of cell competition, these results are also important as they describe, to my knowledge, one of the only genetic conditions that specifically affect the winner cells without any impact when depleted in the loser cells. Moreover, Flamingo does not just suppress the competitive advantage of winner clones, but even turns them into putative losers. This specificity, while not clearly understood at this stage, opens a lot of exciting mechanistic questions, but also a very interesting long-term avenue for therapeutic purposes as targeting Flamingo should then affect very specifically the putative winner/oncogenic clones without any impact in WT cells.

      The data and the demonstration are very clean and compelling, with all the appropriate controls, proper quantification, and backed-up by observations in various tissues and genetic backgrounds. I don't see any weakness in the demonstration and all the points raised and claimed by the authors are all very well substantiated by the data. As such, I don't have any suggestions to reinforce the demonstration.

      While not necessary for the demonstration, documenting the subcellular localisation and levels of Flamingo in these different competition scenarios may have been relevant and provided some hints on the putative mechanism (specifically by comparing its localisation in winner and loser cells). 

      Also, on a more interpretative note, the absence of the impact of Flamingo depletion on JNK activation does not exclude some interesting genetic interactions. JNK output can be very contextual (for instance depending on Hippo pathway status), and it would be interesting in the future to check if Flamingo depletion could somehow alter the effect of JNK in the winner cells and promote downstream activation of apoptosis (which might normally be suppressed). It would be interesting to check if Flamingo depletion could have an impact in other contexts involving JNK activation or upon mild activation of JNK in clones.

      We would like to thank the reviewer for their thorough and positive review.

      Strengths: 

      - A clean and compelling demonstration of the function of Flamingo in winner cells during cell competition.

      - One of the rare genetic conditions that affects very specifically winner cells without any impact on losers, and then can completely switch the outcome of competition (which opens an interesting therapeutic perspective in the long term)

      Weaknesses: 

      - The mechanistic understanding obviously remains quite limited at this stage especially since the signaling does not go through the PCP pathway.

      Reviewer 2 made the same comment in their weakness (1), and we refer to that response. In future work, we are excited to better understand the pathways linking Fmi and competition.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Weaknesses:

      The authors demonstrate that ASGR1 is degraded in response to RSPO2RA-antibody treatment through both the proteasomal and the lysosomal pathway, suggesting that this is due to the RSPO2RA-mediated recruitment of ZNRF3/RNF43, which have E3 ubiquitin ligase activity. The paper doesn't show, however, if ASGR1 is indeed ubiquitinated.

      We thank the reviewer for this comment. We have now conducted ASGR1 ubiquitination assays by immunoprecipitation (IP) of ubiquitin in the membrane protein extract, and immunoblotting (IB) ASGR1 after treating HepG2 cells with our SWEETS molecules or controls. The new data demonstrated ubiquitination of ASGR1 with SWEETS treatment (new Fig. S3A and S3B). Additionally, we blocked the potential ubiquitination of ASGR1 by mutating the two lysine residues in the cytoplasmic domain and compared the ASGR1 degradation after SWEETS treatment. The new data show that removing the potential ubiquitylation Lys sites prevented ASGR1 degradation post SWEETS treatment (new Fig. S3C). These new results provide direct evidence that ASGR1 is ubiquitinated to undergo lysosome or proteasome degradation.

      The authors conclude that the RSPO2A-Ab fusions can act as a targeted protein degradation platform, because they can degrade ASGR. While I agree with this statement, I would argue that the goal of these Abs would not be to degrade ASGR per se. The argumentation is a bit confusing here. This holds for both the results and the discussion section: The authors focus on the dual role of their agents, i.e. on promoting both WNT signaling AND on degrading ASGR1. They might want to reconsider how they present their data (e.g. it may be interesting to target ASGR1, but one would presumably then like to do this without also increasing WNT responsiveness?).

      We thank the reviewer for this comment. As the reviewer states, the initial goal of the RSPO2RA-ab fusions was to generate tissue-specific RSPO mimetics that focus on elimination of E3. As an unintended consequence, we observed enhanced elimination of ASGR as well. While this was unintended, the results did provide POC that when an E3 ligase is brought into proximity of another protein, ubiquitination and degradation of this protein may occur. Additionally, our results highlight that one needs to be careful in fully assessing the impact of bispecific molecules on the intended target as well as unintended targets to understand the potential side effects of such bispecific molecules. We have revised the manuscript to make this more clear, both in the Results and Discussion sections.

      Lines 326-331: The authors use a lot of abbreviations for all of the different protein targeting technologies, but since they are hinting at specific mechanisms, it would be better to actually describe the biological activity of LYTAC versus AbTAC/PROTAB/REULR so non-experts can follow.

      We thank the reviewer for this suggestion. We have added more details in the Discussion to highlight the different mechanisms of the various systems described.

      Can the authors comment on how 8M24 and 8G8 compare to 4F3? The latter seems a bit more specific (ie. lower background activity in the absence of ASGR1 in 5C)? Are there any differences/advances between 8M24 and 8G8 over 4F3? This remains unclear.

      These three antibodies bind different regions/epitopes on ASGR. 8M24 and 8G8 bind non-overlapping epitopes on the carbohydrate recognition domain (CRD), while 4F3 binds the stalk region outside of the CRD. This information is in the Results section of the manuscript. We do not believe that the difference in the ASGR binding epitopes contributes to the slight differences in the background activity. The slight differences may be due to differences in the conformation of the antibodies resulting from the differences in their primary sequences, and these differences may not be significant. We have now repeated the experiments in Fig. 5C and 5D to address the reviewer’s next comment on the axis. These new data (new Fig. 5C and 5D) show less background differences between the molecules.

      Can the authors ensure that the axes are labelled/numbered similarly for Fig 5B-D? This will make it easier to compare 5C and 5D.

      We thank the reviewer for this suggestion. The y-axes in Fig. 5B–D now have the same scale and number format. For Figs. 5C and 5D, we focus on the potency increases of the SWEETS molecules post ASGR1 overexpression.

      Reviewer #2 (Public Review):

      Weaknesses:

      The authors show crystal structures for binding of these antibodies to ASGR1/2, and hypothesize about why specificity is mediated through specific residues. They do not test these hypotheses.

      We thank the reviewer for this comment. We did not further test the residue contributions to binding and specificity as this is not the main focus of the current manuscript. We have revised the section and tuned down the claims for specificity.

      The authors demonstrate in hepatocyte cell lines that these function as mimetics, and that they do not function in HEK cells, which do not express ASGR1. They do not perform an exhaustive screen of all non-hepatocyte cells, nor do they test these molecules in vivo.

      We agree with the reviewer. For the 4F3-based SWEETS molecule, additional in vitro and in vivo specificity characterizations were performed and described in Zhang et al., Sci Rep, 2020. Since 8M24 is human specific and 8G8 only weakly interacts with mouse receptors, in vivo experiments in mouse were not performed. While we did not extensively test the 8M24- and 8G8-based SWEETS on additional cell lines or in vivo, we do believe the data presented strongly support the hepatocyte-specific effects of these molecules.

      Surprisingly, these molecules also induced loss of ASGR1, which the authors hypothesize is due to ubiquitination and degradation, initiated by the E3 ligases recruited to ASGR1. They demonstrate that inhibition of either the proteasome or lysosome abrogates this effect and that it is dependent on E1 ubiquitin ligases. They do not demonstrate direct ubiquitination of ASGR1 by ZNRF3/RNF43.

      We thank the reviewer for this comment. We have now conducted ASGR1 ubiquitination assays by immunoprecipitation (IP) of ubiquitin in the membrane protein extract, and immunoblotting (IB) ASGR1 after treating HepG2 cells with our SWEETS molecules or controls. The new data demonstrate ubiquitination of ASGR1 with SWEETS treatment (new Figs. S3A and S3B). Additionally, we blocked the potential ubiquitination of ASGR1 by mutating the two lysine residues in the cytoplasmic domain and compared the ASGR1 degradation after SWEETS treatment. The new data show that removing the potential ubiquitylation Lys sites prevented ASGR1 degradation post SWEETS treatment (new Fig. S3C). These new results provide direct evidence that ASGR1 is ubiquitinated to undergo lysosome or proteasome degradation.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      There are multiple instances where articles (i.e. the use of "the") are missing.

      We thank the reviewer for this comment. Following the suggestion, the manuscript has gone through a detailed review by an editorial service, and these and other grammatical errors have been corrected.

      Reviewer #2 (Recommendations For The Authors):

      The best I can think of is to inject these into Wnt reporter mice (or maybe humanized mice) and see if the liver lights up while other tissues do not.

      We thank the reviewer for this suggestion. The liver specificity was demonstrated in vivo in our earlier publication (SciRep, 10:13951, 2020) with the 4F3-RSPO2RA molecule. Unfortunately, as the results in this manuscript show, the new ASGR binders 8M24 and 8G8 either do not bind or only weakly interact with mouse receptors. Therefore, the in vivo experiments were not performed here.

      You could also consider addressing some of the statements in the manuscript that are currently hypothetical experimentally.

      We thank the reviewer for this comment. We did not further test the residues’ contribution to binding and specificity as this is not the main focus of the current manuscript. We have revised the section and tuned down the claims for specificity.

      It would be easier to compare the graphs in 5B-D if all Y-axes were the same scale, with the same scientific notation.

      We thank the reviewer for this suggestion. The y-axes in Fig. 5B-D now have the same scale and number format. For Figs. 5C and 5D, we focus on the potency increases of the SWEETS molecules post ASGR1 overexpression.

      Some of the western blots in Figure 6 do not have antibody/target labels, making them harder to interpret.

      The antibody/target labels for all Western blots are on the right side of the blots in each panel. We have now made the text bold so that it is easier to identify.

      Figure 6 and Supplementary Figure 2 are the same I think.

      Figure 6 and Supplementary Figure 2 show the same experimental set-up performed on two different cell lines: Fig. 6 is on Huh7 cells and Supplementary Fig. 2 is on HepG2 cells. The results from these two cell lines are quite consistent, making their appearance very similar.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This is a valuable study that develops a new model of the way muscle responds to perturbations, synthesizing models of how it responds to small and large perturbations, both of which are used to predict how muscles function for stability but also how they can be injured, and which tend to be predicted poorly by classic Hill-type models. The evidence presented to support the model is solid, since it outperforms Hill-type models in a variety of conditions. Although the combination of phenomenological and mechanistic aspects of the model may sometimes make it challenging to interpret the output, the work will be of interest to those developing realistic models of the stability and control of movement in humans or other animals.

      Reviewer #1 (Public Review):

      Muscle models are important tools in the fields of biomechanics and physiology. Muscle models serve a wide variety of functions, including validating existing theories, testing new hypotheses, and predicting forces produced by humans and animals in health and disease. This paper attempts to provide an alternative to Hill-type muscle models that includes contributions of titin to force enhancement over multiple time scales. Due to the significant limitations of Hill-type models, alternative models are needed and therefore the work is important and timely.

      The effort to include a role for titin in muscle models is a major strength of the methods and results. The results clearly demonstrate the weaknesses of Hill models and the advantages of incorporating titin into theoretical treatments of muscle mechanics. Another strength is to address muscle mechanics over a large range of time scales.

      The authors succeed in demonstrating the need to incorporate titin in muscle models, and further show that the model accurately predicts in situ force of cat soleus (Kirsch et al. 1994; Herzog & Leonard, 2002) and rabbit psoas myofibrils (Leonard et al. 2010). However, it remains unclear whether the model will be practical for use with data from different muscles or preparations. Several ad hoc modifications were described in the paper, and the degree to which the model requires parameter optimization for different muscles, preparations and experiment types remains unclear.

      I think the authors should state how many parameters require fitting to the data vs the total number of model parameters. It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      (1) I think the authors should state how many parameters require fitting to the data vs the total number of model parameters.

      The total number of model parameters are listed in Table 1. Each parameter has, in addition, references listed for the source of data (if one exists) along with how the data were used (’C’ calculate, ’F’ fit, ’E’ estimated, or ’S’ for scaled) for the specific simulations that appear in this paper. While this is a daunting number of parameters, only a few of these parameters must be updated when modeling a new musculotendon.

      Similar to a Hill-type muscle model, at least 5 parameters are needed to fit the VEXAT model to a specific musculotendon: maximum isometric force (fiso), optimal contractile element (CE) length, pennation angle, maximum shortening velocity, and tendon slack length. However, similar to a Hill model, it is only possible to use this minimal set of parameters by making use of default values for the remaining set of parameters. The defaults we have used have been extracted from mammalian muscle (see Table 1) and may not be appropriate for modeling muscle tissue that differs widely in terms of the ratio of fast/slow twitch fibers, titin isoform, temperature, and scale.

      Even when these defaults are appropriate, variation is the rule for biological data rather than the exception. It will always be the case that the best fit can only be obtained by fitting more of the model’s parameters to additional data. Standard measurements of the active force-length relation, passive force-length relation, and force-velocity relations are quite helpful to improve the accuracy of the model to a specific muscle. It is challenging to improve the fit of the model’s cross-bridge (XE) and titin models because the data required are so rare. The experiments of Kirsch et al., Prado et al., and Trombitás et al. are unique to our knowledge. However, if more data become available, it is relatively straightforward to update the model’s parameters using the methods described in Appendix B or the code that appears online (https://github.com/mjhmilla/Millard2023VexatMuscle).

      We have modified the manuscript to make it clear that, in some circumstances, the burden of parameter identification for the VEXAT model can be as low as a Hill model:

      - Section 3: last two sentences of the 2nd paragraph, found at: Page 10, column 2, lines 1-12 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      - Table 1: last two sentences of the caption, found at: Page 11 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      (2) It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      All of the experiments simulated in this work are in-situ or ex-vivo. So far the main challenges of simulating any experiment have been quite consistent across both in-situ and ex-vivo datasets: there are insufficient data to fit most model parameters to a specific specimen and, instead, defaults from the literature must be used. In an ideal case, a specimen would have roughly ten extra trials collected so that the maximum isometric force, optimal fiber length, active force-length relation, passive force-length relation (up to $\approx 0.6 f_o^M$), and the force-velocity relations could be identified from measurements rather than relying on literature values. Since most lab specimens are viable for a small number of trials (with the exception of cat soleus), we don’t expect this situation to change in future.

      However, if data are available, the fitting process is straightforward for either in-situ or ex-vivo data: use a standard numerical method (for example non-linear least squares, or the bisection method) to adjust the model parameters to reduce the errors between simulation and experiment. The main difficulty, as described in the previous paragraph, is the availability of data to fit as many parameters as possible for a specific specimen. As such, the fitting process really varies from experiment to experiment and depends mainly on the richness of measurements taken from a specific specimen, and from the literature in general.
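
      As a minimal sketch of this kind of fit, the example below uses the bisection method to identify a single parameter, the optimal CE length, from one force measurement. Everything in it is an assumption made for illustration: the bell-shaped `active_force_length` curve, its `width`, the synthetic measurement, and the bracket are hypothetical and are neither the VEXAT model nor the code in the repository.

```python
import math


def active_force_length(ce_length, optimal_length, width=0.45):
    """Toy bell-shaped normalized active force-length curve (an assumption)."""
    x = ce_length / optimal_length - 1.0
    return math.exp(-((x / width) ** 2))


def bisect(residual, lo, hi, tol=1e-9, max_iter=200):
    """Find a root of residual() in [lo, hi]; the endpoints must bracket it."""
    r_lo, r_hi = residual(lo), residual(hi)
    if r_lo == 0.0:
        return lo
    if r_hi == 0.0:
        return hi
    if r_lo * r_hi > 0.0:
        raise ValueError("root is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        r_mid = residual(mid)
        if r_mid == 0.0 or (hi - lo) < tol:
            return mid
        if r_lo * r_mid < 0.0:
            hi = mid  # sign change lies in the lower half
        else:
            lo, r_lo = mid, r_mid  # sign change lies in the upper half
    return 0.5 * (lo + hi)


# Synthetic "experiment": a normalized force recorded at a known CE length.
true_optimal_length = 0.050  # metres, hypothetical
test_length = 0.040  # metres, hypothetical
measured_force = active_force_length(test_length, true_optimal_length)


def residual(optimal_length):
    """Simulation minus experiment for a candidate optimal CE length."""
    return active_force_length(test_length, optimal_length) - measured_force


# The bracket is chosen so the toy curve is monotonic in the parameter.
fitted_optimal_length = bisect(residual, 0.040, 0.080)
print(f"fitted optimal CE length: {fitted_optimal_length:.6f} m")
```

      Bisection suits a single parameter with a known bracket. When several parameters must be fitted to many measurements at once, a non-linear least-squares routine plays the same role.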

      Working from in-vivo data presents an entirely different set of challenges. When working with human data, for example, it’s just not possible to directly measure muscle force with tendon buckles, and so it is never completely clear how force is distributed across the many muscles that typically actuate a joint. Further, there is also uncertainty in the boundary condition of the muscle because optical motion capture markers will move with respect to the skeleton. Video fluoroscopy offers a method of improving the accuracy of measured boundary conditions, though only for a few labs due to its great expense. A final boundary condition remains impossible to measure in any case: the geometry and forces that act at the boundaries as muscle wraps over other muscles and bones. Fitting to in-vivo data is very difficult.

      While this is an interesting topic, it is tangential to our already lengthy manuscript. Since these reviews are public, we’ll leave it to the motivated reader to find this text here.

      Reviewer #2 (Public Review):

      This model of skeletal muscle includes springs and dampers which aim to capture the effect of crossbridge and titin stiffness during the stretch of active muscle. While both crossbridge and titin stiffness have previously been incorporated, in some form, into models, this model is the first to simultaneously include both. The authors suggest that this will allow for the prediction of muscle force in response to short-, mid- and long-range stretches. All these types of stretch are likely to be experienced by muscle during in vivo perturbations, and are known to elicit different muscle responses. Hence, it is valuable to have a single model which can predict muscle force under all these physiologically relevant conditions. In addition, this model dramatically simplifies sarcomere structure to enable this muscle model to be used in multi-muscle simulations of whole-body movement.

      In order to test this model, its force predictions are compared to 3 sets of experimental data which focus on short-, mid- and long-range perturbations, and to the predictions of a Hill-type muscle model. The choice of data sets is excellent and provides a robust test of the model’s ability to predict forces over a range of length perturbations. However, I find the comparison to a Hill-type muscle model to be somewhat limiting. It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch. Hence, that the model proposed here represents an improvement over such a model is not a surprise. Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      The paper begins by outlining the phenomenological vs mechanistic approaches taken to muscle modelling, historically. It appears, although is not directly specified, that this model combines these approaches. A somewhat mechanistic model of the response of the crossbridges and titin to active stretch is combined with a phenomenological implementation of force-length and force-velocity relationships. This combination of approaches may be useful for improving the accuracy of predictions of muscle models and whole-body simulations, which is certainly a worthy goal. However, it also may limit the insight that can be gained. For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening. In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      (1) It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch.

      While many muscle physiologists are aware of the limitations of the Hill model, these limitations are not so well known among computational biomechanists. There are at least two reasons for this gap: there are few comprehensive evaluations of Hill models against several experiments, and some of the differences are quite nuanced. For example, active lengthening experiments can be replicated reasonably well using a Hill model if the lengthening is done on the ascending limb of the force-length curve. Clearly the story is quite different on the descending limb as shown in Figure 9. Similarly, as Figure 8 shows, by choosing the right combination of tendon model and perturbation bandwidth it is possible to get reasonably accurate responses from the Hill model to stochastic length changes. Yet when a wide variety of perturbation bandwidths, magnitudes, and tendon models are tested it is clear that the Hill model cannot, in general, replicate the response of muscle to stochastic perturbations. For these reasons we think many of the Hill model’s drawbacks have not been clearly understood by computational biomechanists for many years now.

      (2) Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      We agree that it will be valuable to benchmark other models in the literature using the same set of experiments. Hopefully we, or perhaps others, will have the good fortune to secure research funding to continue this benchmarking work. This will, however, be quite challenging: few muscle models are accompanied by a professional-quality open-source implementation. Without such an implementation it is often impossible to reproduce published results let alone provide a fair and objective evaluation of a model.

      (3) For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening.

      The titin model described in the paper will provide an enhancement of force during a stretch-shortening cycle. This certainly would be an interesting next experiment to simulate in a future paper.

      (4) In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      We can only respond to what drives the frequency-dependent stiffness in the model, though we’re quite interested in what happens physiologically. Hopefully some new experiments will be done to examine this phenomenon in the future. In the case of the model, the reasons are pretty straightforward: the formulation of Eqn. 16 is responsible for this shift.

      Equation 16 has been formulated so that the acceleration of the attachment point of the XE is driven by the force difference between the XE and a reference Hill model (numerator of the first term in Eqn. 16) which is then low pass filtered (denominator of the first term in Eqn. 16). Due to this formulation the attachment point moves less when the numerator is small, or when the differences in the numerator change rapidly and effectively become filtered out. When the attachment point moves less, more of the CE’s force output is determined by variations in the length of the XE and its stiffness.

      On the other hand, the attachment point will move when the numerator of the first term in Eqn. 16 is large, or when those differences are not short lived. When the attachment point moves to reduce the strain in the XE, the force produced by the XE’s spring-damper is reduced. As a result, the CE’s force output is less influenced by variations of the length of the XE and its stiffness.
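The filtering behaviour described in the two paragraphs above can be illustrated with a generic first-order low-pass filter. This is only a sketch and not Eqn. 16 itself: the function `low_pass`, the time constant, the step size, and the two test signals are all assumptions chosen for illustration. It shows that a rapidly changing input is largely filtered out, while a sustained input passes through almost unchanged.

```python
import math


def low_pass(signal, dt, time_constant):
    """First-order low-pass filter integrated with explicit Euler steps."""
    filtered = [0.0]
    for value in signal[:-1]:
        previous = filtered[-1]
        filtered.append(previous + dt * (value - previous) / time_constant)
    return filtered


dt = 0.001            # integration step in seconds (assumed)
time_constant = 0.05  # filter time constant in seconds (assumed)
times = [i * dt for i in range(2000)]

# A short-lived, rapidly changing difference (90 Hz sinusoid of unit amplitude).
fast = [math.sin(2.0 * math.pi * 90.0 * t) for t in times]
# A sustained difference (unit step).
sustained = [1.0 for _ in times]

# Peak of the filtered fast signal once the start-up transient has decayed.
fast_peak = max(abs(v) for v in low_pass(fast, dt, time_constant)[1000:])
# Final value of the filtered sustained signal.
sustained_final = low_pass(sustained, dt, time_constant)[-1]
```

In this toy setting the fast input drives the filtered quantity only weakly, which is analogous to the attachment point moving little during short-range perturbations, whereas the sustained input is passed through and would move it.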

      Reviewer #2 (Recommendations for the Authors):

      I find the clarity of the manuscript to be much improved following revision. While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction, the revised description of small length changes makes the interpretation much less confusing.

      Similarly, while I agree that Hill-type models are widely used their limitations have been addressed extensively and are very well established. Hence, moving forward I think it would be much more valuable to start to compare these newer models to one another rather than just showing an improvement over a Hill model under (very biologically important) conditions which that model has no capacity to predict forces.

      (1) While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction ...

      We have had to abstract some of the details of reality to have a model that can be used to simulate hundreds of muscles. In contrast, FiberSim produced by Kenneth Campbell’s group uses much less abstraction and might be of greater interest to you. FiberSim’s models include individual cross-bridges, titin molecules, and an explicit representation of the spatial geometry of a sarcomere. While this model is a great tool for testing muscle physiology questions through simulation, it is computationally expensive to use this model to simulate hundreds of muscles simultaneously.

      Kosta S, Colli D, Ye Q, Campbell KS. FiberSim: A flexible open-source model of myofilament-level contraction. Biophysical journal. 2022 Jan 18;121(2):175-82. https://campbell-muscle-lab.github.io/FiberSim/

      (2) Similarly, while I agree that Hill-type models are widely used their limitations have been addressed extensively and are very well established.

      Please see our response 1 to Reviewer #1.

      (3) Hence, moving forward I think it would be much more valuable to start to compare these newer models to one another rather than just showing an improvement over a Hill model under (very biologically important) conditions which that model has no capacity to predict forces.

      Please see our response 2 to Reviewer #1.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      In the paper by Choi et al., the authors aimed to develop base editing strategies to convert CAG repeats to CAA repeats in the huntingtin gene (HTT), which causes Huntington's disease (HD). They hypothesized that this conversion would delay disease onset by shortening the uninterrupted CAG repeat. Using HEK-293T cells as a model, the researchers employed cytosine base editors and guide RNAs (gRNAs) to efficiently convert CAG to CAA at various sites within the CAG repeat. No significant indels, off-target edits, transcriptome alterations, or changes in HTT protein levels were detected. Interestingly, somatic CAG repeat expansion was completely abolished in HD knock-in mice carrying CAA-interrupted repeats. 

      Correction of factual errors

      We analyzed HEK293 cells, not "HEK-293T".

      Strengths: 

      This study represents the first proof-of-concept exploration of the cytosine base editing technique as a potential treatment for HD and other repeat expansion disorders with similar mechanisms. 

      Weaknesses: 

      Given that HD is a neurodegenerative disorder, it is crucial to determine the efficiency of the base editing strategies tested in this manuscript and their feasibility in relevant cells affected by HD and the brain, which needed to be improved in this manuscript. 

      We appreciate the reviewer's constructive recommendations. Our genetic investigation focused on understanding observations in HD patients to develop genetic-based treatment strategies and test their feasibility. We agree with the reviewer regarding the importance of data from relevant cell types. Unfortunately, the levels of CAG-to-CAA conversion in the patient-derived neurons were modest, as described in our manuscript (approximately 2%). In addition, AAV did not produce detectable conversions in the brain of HD knock-in mice (data not shown), which was somewhat expected from the literature (PMID: 31937940). We believe some technical hurdles can be overcome by developing efficient delivery methods. Nonetheless, it will be an important follow-up study to perform preclinical studies employing optimized base editing strategies and efficient brain delivery methods to fully demonstrate the therapeutic potential of BE strategies. 

      Reviewer #2 (Public Review):

      Summary: 

      In a proof-of-concept study with the aspiration of developing a treatment to delay HD onset, Choi et al. design and test an A>G DNA base editing strategy to exploit the recently established inverse relationship between the number of uninterrupted CAG repeats in polyglutamine repeat expansions and the age-of-onset of Huntington's Disease (HD). Most of the study is devoted to optimizing a base editing strategy typified by BE4max and gRNA2. The base editing is performed in human HEK293 cells engineered with a 51 CAG canonical repeat and in HD knock-in mice harboring 105+ CAG repeats. 

      Correction of factual errors

      We tested base editing strategies aimed at C > T conversion, not A > G DNA base editing. In addition to HEK293 and knock-in mice, we tested base editing strategies in patient-derived iPSC and neurons.

      Weaknesses: 

      Genotypic data on DNA editing are not portrayed in a clear manner consistent with the study's goal, namely reducing the number of uninterrupted CAG repeats by a clinically relevant amount according to the authors' least square approximated mean age-at-onset. No phenotypic data are presented to show that editing performed in either model would lead to reduced hallmarks of HD onset. 

      More evidence is needed to support the central claims and therapeutic potential needs to be more adequate. 

      Our strategies for converting CAG to CAA in model systems resulted in quantitative DNA modification in a population of cells. Consequently, individual cells may carry different genotypes, some harboring CAA and others CAG at the same genomic location. Therefore, using a standard genotype format for DNA to present base editing outcomes may not be ideal. Instead, we presented the resulting genotype data in a quantitative fashion to provide the percentage of conversion at each site. This approach allows for an intuitive interpretation of both the extent of repeat length reduction and the proportion of such modifications.
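A per-site percentage of this kind can be sketched as follows. The function name `conversion_percent_by_site` and the four short reads are hypothetical and serve only to illustrate the calculation; they are not data from the study, and real reads would first need alignment and quality filtering.

```python
def conversion_percent_by_site(reads):
    """Percentage of reads carrying CAA at each codon position of the repeat."""
    n_sites = len(reads[0]) // 3
    counts = [0] * n_sites
    for read in reads:
        for site in range(n_sites):
            if read[3 * site:3 * site + 3] == "CAA":
                counts[site] += 1
    return [100.0 * count / len(reads) for count in counts]


# Hypothetical reads from a mixed population of edited and unedited cells.
reads = [
    "CAGCAGCAGCAG",
    "CAGCAACAGCAG",
    "CAGCAACAACAG",
    "CAGCAGCAGCAG",
]
percentages = conversion_percent_by_site(reads)
```

Here the second codon is converted in half of the reads and the third in a quarter, which is the kind of site-by-site summary a single genotype string cannot express.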

      Currently, genetically precise HD mouse models with robust motor and behavioral phenotypes are unavailable. While some HD mouse models, such as the BAC and YAC models, feature pronounced behavioral phenotypes, they consist of interrupted CAG repeat sequences, making them unsuitable for base conversion studies due to their inherently short uninterrupted repeats. Although genetically precise HD knock-in mouse models exist, they do not manifest motor symptom-like phenotypes. Given that CAG repeat expansion is the primary driver of the disease and knock-in mice recapitulate this phenomenon, our genetic investigation focused on assessing the effects of base conversion on CAG repeat instability in knock-in mice. However, as emphasized by the reviewer, subsequent preclinical studies to evaluate the therapeutic efficacy of CAG-to-CAA conversion strategies using mouse models harboring uninterrupted adult-onset CAG repeats and robust HD-like phenotypes remain crucial.

      Reviewer #3 (Public Review):

      Summary: 

      In human patients with Huntington's disease (HD), caused by a CAG repeat expansion mutation, the number of uninterrupted CAG repeats at the genomic level influences age-at-onset of clinical signs independent of the number of polyglutamine repeats at the protein level. In most patients, the CAG repeat terminates with a CAACAG doublet. However, CAG repeat variants exist that either do not have that doublet or have two doublets. These variants consequently differ in their number of uninterrupted CAG repeats, while the number of glutamine repeats is the same as both CAA and CAG codes for glutamine. The authors first confirm that a shorter uninterrupted CAG repeat number in human HD patients is associated with developing the first clinical signs of HD later. They predict that introducing a further CAA-CAG doublet will result in years of delay of clinical onset. Based on this observation, the authors tested the hypothesis that turning CAG to CAA within a CAG repeat sequence using base editing techniques will benefit HD biology. They show that, indeed, in HD cell models (HEK293 cells expressing 16/17 CAG repeats; a single human stem cell line carrying a CAG repeat expansion in the fully penetrant range with 42 CAG repeats), their base editing strategies do induce the desired CAG-CAA conversion. The efficiency of conversion differed depending on the strategy used. In stem cells, delivery posed a problem, so to test allele specificity, the authors then used a HEK 293 cell line with 51 CAG repeats on the expanded allele. Conversion occurred in both alleles with huntingtin protein and mRNA levels; transcriptomics data was unchanged. In knock-in mice carrying 110 CAG repeats, however, base editing did not work as well for different, mainly technical, reasons. 
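The distinction drawn above between uninterrupted CAG length and polyglutamine length can be made concrete with a short sketch. The helper names, the 42-repeat example allele, and the edited codon position are assumptions chosen for illustration only.

```python
import re


def longest_uninterrupted_cag(sequence):
    """Number of repeat units in the longest run of consecutive CAG codons."""
    runs = re.findall(r"(?:CAG)+", sequence)
    return max((len(run) // 3 for run in runs), default=0)


def polyglutamine_length(sequence):
    """Number of codons in the longest run of glutamine codons (CAG or CAA)."""
    runs = re.findall(r"(?:CAG|CAA)+", sequence)
    return max((len(run) // 3 for run in runs), default=0)


def convert_codon(sequence, codon_index):
    """Replace the CAG at a 0-based codon index with CAA."""
    start = 3 * codon_index
    if sequence[start:start + 3] != "CAG":
        raise ValueError("codon at this index is not CAG")
    return sequence[:start] + "CAA" + sequence[start + 3:]


# Canonical allele: an uninterrupted CAG tract followed by a CAACAG doublet.
canonical = "CAG" * 42 + "CAACAG"
# Hypothetical single CAG-to-CAA conversion near the middle of the tract.
edited = convert_codon(canonical, 20)
```

For this example, the conversion leaves the polyglutamine length at 44 but reduces the longest uninterrupted CAG run from 42 to 21, which is the quantity the base editing strategy aims to shorten.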

      Correction of factual errors

      "HD cell models (HEK293 cells expressing 16/17 CAG repeats" is an incorrect description. It should be "HD cell models (HEK293 cells expressing 51/17 CAG repeats".

      Strengths: 

      The authors use state-of-the-art methods and carefully and thoroughly designed experiments. The data support the conclusions drawn. This work is a very valuable translation from the insight gained from large GWAS studies into HD pathogenesis. It rightly emphasises the potential this has as a causal treatment in HD, while the authors also acknowledge important limitations. 

      Weaknesses: 

      They could dedicate a little more to discussing several of the mentioned challenges. The reader will better understand where base editing is in HD currently and what needs to be done before it can be considered a treatment option. For instance, 

      - It is important to clarify what can be gained by examining again the relationship between uninterrupted CAG repeat length and age-at-onset. Could the authors clarify why they do this and what it adds to their already published GWAS findings? What is the n of datasets? 

      Published HD GWAS (PMID: 31398342) compared the onset age of duplicated interruption and loss of interruption to that of canonical repeats to determine whether uninterrupted CAG repeat or polyglutamine determines age at onset. However, the GWAS findings did not quantify the magnitude of the remaining unexplained variance in age at onset in duplicated interruption and loss of interruption. Our study investigated this further to estimate the additional impact of duplicated interruption and, from it, the maximum clinical benefit of base editing strategies for CAG-to-CAA conversion. Since the purpose of this genetic analysis is already described in the Results section, we added the following sentence to the Introduction to highlight what is unknown.

      "Still, age at onset of loss of interruption and duplicated interruption was not fully accounted for by uninterrupted CAG repeat, suggesting additional effects of non-canonical repeats."

      We added sample size for the least square approximation analysis in the text and corresponding figure legend. Sample sizes for molecular and animal experiments can be found in the corresponding figure legend.

      - What do they think an ideal conversion rate would be, and how that could be achieved? 

      It is a very important question. However, speculating on the ideal conversion levels is outside the scope of this genetic investigation. A series of preclinical studies using relevant models may generate data that shed light on the conversion rate levels that are required to produce meaningful clinical benefits. In the discussion section, we added the following sentence.

      "Currently, the ideal levels of CAG-to-CAA conversion that produce significant clinical benefits are unknown. A series of preclinical studies using relevant model systems may generate data that may shed light on the optimal conversion rate levels that are required to produce significant clinical benefits."

      - Is there a dose-effect relationship for base editing, and would it be realistic to achieve the ideal conversion rate in target cells, given the difficulties described by the authors in differentiated neurons from stem cells? 

      We observed a clear dose-response relationship between the amount of BE reagents and the levels of conversion in non-neuronal cells. Unfortunately, the conversion rate was low in neuronal cells, potentially due to limited delivery, as speculated in the result section. As described in the discussion sections, we predict that efficient delivery methods will be crucial to produce significant CAG-to-CAA conversion to achieve therapeutic benefits.

      - The liver is a good tool for in-vivo experiments examining repeat instability in mouse models. However, the authors could comment on why they did not examine the brain.

      We focused on liver instability because of 1) the expectation that delivery/targeting efficiency is significantly lower in the brain (PMID: 31937940) and 2) shared underlying mechanisms between the brain and liver (described in the result section). The following sentence was added in the method section to provide a rationale for liver analysis. 

      "Since significantly lower delivery/targeting efficiency was expected in the brain 34, we focused on analyzing liver instability."

      - Is there a limit to judging the effects of base editing on somatic instability with longer repeats, given the difficulties in measuring long CAG repeat expansions? 

      Determining the levels of base conversion using sequencing technologies gets harder as repeats become longer. Fragment analysis can overcome this technical difficulty if conversion efficiency is high. As pointed out, measuring repeat expansion is also challenging because amplification is biased toward shorter alleles. However, if repeat sizes are relatively similar, the levels of repeat expansion as a function of base conversion can be determined relatively precisely without a significant bias by a standard fragment analysis approach.

      - Given the methodological challenges for assessing HTT fragments, are there other ways to measure the downstream effects of base editing rather than extrapolate what it will likely be?

      Our CAG-to-CAA conversion strategies are not expected to directly generate fragments of huntingtin DNA, RNA, or protein. In contrast, immediate downstream effects of CAG-to-CAA conversion include sequence changes (DNA and RNA) and alteration of repeat instability, which are presented in the manuscript. If repeat instability is associated with HTT exon 1A fragment, base conversion strategies may indirectly alter the levels of such putative toxic species, which remains to be determined.  

      - Sequencing errors could mask low-level, but biologically still relevant, off-target effects (such as gRNA-dependent and gRNA-independent DNA off-targets, RNA off-targets, bystander editing). How likely is that?

      We agree with the reviewer that increased editing efficiency is expected to increase the levels of off-target editing. However, the field is actively developing base editors with minimal off-target effect (PMID: 35941130), which will increase the safety aspects of this technology for clinical use. We added the following sentence.

      "In addition, developing base editors with high level on-target gene specificity and minimal off-target effects is a critical aspect to address 100."

      - How worried are the authors about immune responses following base editing? How could this be assessed? 

      We added the following sentence in the discussion section as the reviewer raised an important safety issue.  

      "Thorough assessments of immune responses against base editing strategies (e.g., development of antibody, B cell, and T cell-specific immune responses) and subsequent modification (e.g., immunosilencing) 101 will be critical to address immune response-associated safety issues of BE strategies."

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The following points could be considered to improve the overall quality of the manuscript: 

      (1) The authors mentioned that the reason for checking repeat instability in the non-neuronal cells was due to the availability of specific types of AAV; there are other subtypes of AAVs available to infect neurons and iPSCs.

      Our pilot experiments testing several AAV serotypes in patient-derived iPSC and HD knock-in mice showed that only AAV9 converted CAG to CAA at detectable levels in the liver, not in the brain or neurons. We also speculate that difficulties in targeting the CAG repeat region due to GC-rich sequence contributed to low conversion efficiency. Therefore, subsequent optimization of base editor and delivery may improve BE strategies for HD, permitting robust conversion at the challenging locus. 

      (2) Despite its bold nature, minimal data in the manuscript demonstrate that this gene editing strategy is disease-modifying.

      Resources required to demonstrate the therapeutic benefits of CAG-to-CAA conversion strategies are not fully available. In particular, relevant HD mouse models that carry uninterrupted adult-onset CAG repeats and permit measuring the level of disease modification are lacking, as described in our response to the second reviewer. Given that CAG repeat expansion is the primary driver of the disease, this genetic investigation focused on determining the impacts of base editing strategies on CAG repeat expansion. Still, as indicated by the reviewer, follow-up preclinical studies to evaluate the disease-modifying effects of CAG-to-CAA conversion strategies using relevant mouse models represent important next steps.

      (3) Off-target analysis at the DNA level was limited to "predicted" off-target sites. What about possible translocations that can result from co-nicking on different chromosomes, as a large number of potential targets exist? 

      Among the gRNAs we tested, we focused on gRNAs 1 and 2, for which small numbers of off-targets were predicted. Therefore, our off-target analysis at the DNA level was focused on validating those predicted off-targets. As pointed out, thoroughly evaluating off-target effects will be necessary when candidate BE strategies take the next steps for therapeutic development.

      Genomic translocation caused by double-strand breaks can produce negative consequences, such as cancer. Importantly, although paired nicks efficiently induced translocations, translocations were not detected when a single nick was introduced on each chromosome (PMID: 25201414). Therefore, it is predicted that BE strategies using nickase confer little risk of translocation.

      (4) For in vivo work, somatic repeat expansion was analyzed only in peripheral tissue samples. Since the main affected cellular population in HD is the brain, the outcome of this treatment on a disease-relevant organ still needs to be determined. 

      Because of challenges in delivery to the brain, we determined instability in the liver, since many mechanistic components of somatic CAG repeat instability are shared between the liver and striatum, as rationalized in the manuscript. However, we agree with the reviewer regarding the importance of determining the effects of base conversion on brain instability. We added the following sentence in the method section to provide a rationale.

      "Since significantly lower delivery/targeting efficiency was expected in brain 34, we focused on analyzing liver instability."

      Reviewer #2 (Recommendations For The Authors):

      Throughout the manuscript, the authors apologize for techniques that do not work when workarounds seem readily apparent to an expert in the field. In its current form, the manuscript reads verbose, speculative, apologetic, and preliminary. 

      Drug development programs that are supported by human genetics data show increased success rates in clinical trials (PMID: 26121088, 31827124, 31830040). This is why this genetic study focused on 1) investigating observations in HD subjects and 2) subsequently developing treatment strategies that are supported by patient genetics. As the first illustration of base editing in HD, the main scope of our manuscript is to justify the genetic rationale of CAG-to-CAA conversion and demonstrate the feasibility of therapeutic strategies rooted in patient genetics. As our study was not aimed at entirely demonstrating the clinical benefits of base editing strategies in HD, some of our data were based on tools and approaches that were not fully optimal. We agree with the reviewer that it will be an important next step to employ optimized approaches to evaluate the efficacy of base editing strategies in model systems. Nevertheless, our novel base conversion strategies derived from HD patient genetics represent a significant advancement as they may contribute to developing effective treatments for this devastating disorder. 

      Reviewer#3 (Recommendations For The Authors):

      It would make for an easier read if abbreviations were kept to a minimum. 

      As recommended, we decreased the use of abbreviations. The following has been spelled out throughout the manuscript: CR (canonical repeat), LI (loss of interruption), DI (duplicated interruption), and CBE (cytosine base editor). Other abbreviations with infrequent usage (e.g., ABE, SS, QC) were also spelled out in the text.

    1. Reviewer #2 (Public Review):

      Summary:

      The authors had two aims: First, to decompose the attentional blink (AB) deficit into the two components of signal detection theory: sensitivity and bias. Second, the authors aimed to assess the two subcomponents of sensitivity: detection and discrimination. They observed that the AB is only expressed in sensitivity. Furthermore, detection and discrimination were doubly dissociated. Detection modulated N2p and P3 ERP amplitude, but not frontoparietal beta-band coherence, whereas this pattern was reversed for discrimination.
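For readers less familiar with the two signal detection components mentioned above, the following sketch computes sensitivity (d′) and criterion (bias) for the standard one-dimensional yes/no case. It is not the multidimensional model used in the paper; the function name and the example hit and false-alarm rates are assumptions for illustration.

```python
from statistics import NormalDist


def sensitivity_and_criterion(hit_rate, false_alarm_rate):
    """Return (d-prime, criterion) under the equal-variance Gaussian yes/no model."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa
    criterion = -0.5 * (z_hit + z_fa)
    return d_prime, criterion


# Hypothetical rates: symmetric performance, hence no bias toward either response.
d_prime, criterion = sensitivity_and_criterion(0.8, 0.2)
```

A drop in d′ with an unchanged criterion would correspond to a pure sensitivity deficit, whereas a shift in the criterion alone would correspond to a change in bias.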

      Strengths:

      The experiment is elegantly designed, and the data - both behavioral and electrophysiological - are aptly analyzed. The outcomes, in particular the dissociation between detection and discrimination blinks, are consistently and clearly supported by the results. The discussion of the results is also appropriately balanced.

      Weaknesses:

      The lack of an effect of stimulus contrast does not seem very surprising from what we know of the nature of AB already. Low-level perceptual factors are not thought to cause AB. This is fine, as there are also other, novel findings reported, but perhaps the authors could bolster the importance of these (null) findings by referring to AB-specific papers, if there are indeed any, that would have predicted different outcomes in this regard.

      On an analytical note, the ERP analysis could be fine-tuned a little more. The task design does not allow measurement of the N2pc or N400 components, which are also relevant to the AB, but the N1 component could additionally be analyzed. In doing so, I would furthermore recommend selecting more lateral electrode sites for both the N1 and the P1. Both P1 and N1 are likely not maximal near the midline, where the authors currently focused their P1 analysis.

      Impact & Context:

      The results of this study will likely influence how we think about selective attention in the context of the AB phenomenon. However, I think its impact could be further improved by extending its theoretical framing. In particular, there has been some recent work on the nature of the AB deficit, showing that it can be discrete (all-or-none) and gradual (Sy et al., 2021; Karabay et al., 2022, both in JEP: General). These different faces of target awareness in the AB may be linked directly to the detection and discrimination subcomponents that are analyzed in the present paper. I would encourage the authors to discuss this potential link and comment on the bearing of the present work on these previous behavioral findings.

    2. Author response:

      Reviewer #1: 

      Summary:

      In this study, the authors used a multi-alternative decision task and a multidimensional signal-detection model to gain further insight into the cause of perceptual impairments during the attentional blink. The model-based analyses of behavioural and EEG data show that such perceptual failures can be unpacked into distinct deficits in visual detection and discrimination, with visual detection being linked to the amplitude of late ERP components (N2P and P3) and discrimination being linked to the coherence of fronto-parietal brain activity.

      Strengths:

      The main strength of this paper lies in the fact that it presents a novel perspective on the cause of perceptual failures during the attentional blink. The multidimensional signal-detection modelling approach is explained clearly, and the results of the study show that this approach offers a powerful method to unpack behavioural and EEG data into distinct processes of detection and discrimination.

      Weaknesses:

      (1.1) While the model-based analyses are compelling, the paper also features some analyses that seem misguided, or, at least, insufficiently motivated and explained. Specifically, in the introduction, the authors raise the suggestion that the attentional blink could be due to a reduction in sensitivity or a response bias. The suggestion that a response bias could play a role seems misguided, as any response bias would be expected to be constant across lags, while the attentional blink effect is only observed at short lags. Thus, it is difficult to understand why the authors would think that a response bias could explain the attentional blink.

      A deficit in T2 identification accuracy could arise from either sensitivity or criterion effects; the criterion effect may manifest as a choice bias. For example, in short T1-T2 lag trials, when T2 closely follows T1, participants may adopt a more conservative choice criterion for reporting the presence of T2. Moreover, criterion effects need not be uniform across lags: A participant could infer the T1-T2 lag interval based on various factors, including trial length, thereby permitting them to adjust their choice criterion variably across different lags. We will provide a more detailed illustration of this claim in the revision.
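
      The distinction between sensitivity and criterion effects drawn above can be illustrated with a minimal sketch. It assumes a one-dimensional, equal-variance yes/no signal detection model, which is a simplification of the multidimensional model used in the study. The function name and the hit and false-alarm rates are hypothetical values chosen for illustration, not data from the manuscript.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, fa_rate):
    """Equal-variance yes/no SDT: return (d', criterion c)."""
    z = NormalDist().inv_cdf           # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical rates: the short-lag condition has fewer hits AND fewer
# false alarms, which is the signature of a more conservative criterion.
d_long, c_long = dprime_and_criterion(hit_rate=0.84, fa_rate=0.16)
d_short, c_short = dprime_and_criterion(hit_rate=0.69, fa_rate=0.07)

print(f"long lag:  d' = {d_long:.2f}, c = {c_long:.2f}")
print(f"short lag: d' = {d_short:.2f}, c = {c_short:.2f}")
```

      In this illustration the hit rate drops at the short lag while d' stays nearly unchanged, so raw accuracy alone cannot tell a criterion shift apart from a loss of sensitivity.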

      (1.2) A second point of concern regards the way in which the measures for detection and discrimination accuracy were computed. If I understand the paper correctly, a correct detection was defined as either correctly identifying T2 (i.e., reporting CW or CCW if T2 was CW or CCW, respectively, see Figure 2B), or correctly reporting T2's absence (a correct rejection). Here, it seems that one should also count a misidentification (i.e., incorrect choice of CW or CCW when T2 was present) as a correct detection, because participants apparently did detect T2, but failed to judge/remember its orientation properly in case of a misidentification. Conversely, the manner in which discrimination performance is computed also raises questions. Here, the authors appear to compute accuracy as the average proportion of T2-present trials on which participants selected the correct response option for T2, thus including trials in which participants missed T2 entirely. Thus, a failure to detect T2 is now counted as a failure to discriminate T2. Wouldn't a more proper measure of discrimination accuracy be to compute the proportion of correct discriminations for trials in which participants detected T2?

      Detection and discrimination accuracies were computed with precisely the same procedure, and under the same conditions, as described by the Reviewer (underlined text, above). We regret our poor description; we will improve upon it in the revised manuscript.
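
      The two accuracy measures described in comment (1.2) can be sketched as follows. This is an illustration of the Reviewer's verbal definitions only: the trial encoding (`"CW"`, `"CCW"`, `"absent"`) and the function name are assumptions, and the example trials are made up.

```python
def detection_and_discrimination(trials):
    """trials: list of (stimulus, response), each 'CW', 'CCW' or 'absent'.

    Detection is correct when presence/absence is reported correctly,
    so a misidentified orientation still counts as a correct detection.
    Discrimination is scored only on T2-present trials that were detected.
    """
    detect_correct = 0
    detected_present = 0
    discrim_correct = 0
    for stim, resp in trials:
        stim_present = stim != "absent"
        resp_present = resp != "absent"
        if stim_present == resp_present:
            detect_correct += 1
        if stim_present and resp_present:
            detected_present += 1
            if stim == resp:
                discrim_correct += 1
    detection_acc = detect_correct / len(trials)
    discrimination_acc = (
        discrim_correct / detected_present if detected_present else float("nan")
    )
    return detection_acc, discrimination_acc


# Made-up trials: one misidentification, one miss, one false alarm.
example_trials = [
    ("CW", "CW"), ("CW", "CCW"), ("CCW", "absent"),
    ("absent", "absent"), ("CCW", "CCW"), ("absent", "CW"),
]
det_acc, disc_acc = detection_and_discrimination(example_trials)
print(det_acc, disc_acc)
```

      Scoring discrimination only on detected trials keeps a miss from being counted twice, once as a detection failure and once as a discrimination failure.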

      (1.3) My last point of critique is that the paper offers little if any guidance on how the inferred distinction between detection and discrimination can be linked to existing theories of the attentional blink. The discussion mostly focuses on comparisons to previous EEG studies, but it would be interesting to know how the authors connect their findings to extant, mechanistic accounts of the attentional blink. A key question here is whether the finding of dissociable processes of detection and discrimination would also hold with more meaningful stimuli in an identification task (e.g., the canonical AB task of identifying two letters shown amongst digits). There is evidence to suggest that meaningful stimuli are categorized just as quickly as they are detected (Grill-Spector & Kanwisher, 2005; Grill-Spector K, Kanwisher N. Visual recognition: as soon as you know it is there, you know what it is. Psychol Sci. 2005 Feb;16(2):152-60. doi: 10.1111/j.0956-7976.2005.00796.x. PMID: 15686582.). Does that mean that the observed distinction between detection and discrimination would only apply to tasks in which the targets consist of otherwise meaningless visual elements, such as lines of different orientations?

      Our results are consistent with previous literature suggested by the Reviewer. Specifically, we do not claim that detection and discrimination are sequential processes; in fact, we modeled them as concurrent computations (Figs. 3A-B). Yet, our results suggest that these processes possess distinct neural bases. We have discussed this idea briefly in the Discussion section (e.g., “Yet, we found no evidence for these two computations being sequential…”). We will discuss this further in the revised manuscript in the context of previous literature.

      Reviewer #2:

      Summary:

      The authors had two aims: First, to decompose the attentional blink (AB) deficit into the two components of signal detection theory; sensitivity and bias. Second, the authors aimed to assess the two subcomponents of sensitivity; detection and discrimination. They observed that the AB is only expressed in sensitivity. Furthermore, detection and discrimination were doubly dissociated. Detection modulated N2p and P3 ERP amplitude, but not frontoparietal beta-band coherence, whereas this pattern was reversed for discrimination.

      Strengths:

      The experiment is elegantly designed, and the data - both behavioral and electrophysiological - are aptly analyzed. The outcomes, in particular the dissociation between detection and discrimination blinks, are consistently and clearly supported by the results. The discussion of the results is also appropriately balanced.

      Weaknesses:

      (2.1) The lack of an effect of stimulus contrast does not seem very surprising from what we know of the nature of AB already. Low-level perceptual factors are not thought to cause AB. This is fine, as there are also other, novel findings reported, but perhaps the authors could bolster the importance of these (null) findings by referring to AB-specific papers, if there are indeed any, that would have predicted different outcomes in this regard.

      While there is broad consensus that low-level perceptual factors are not affected by the attentional blink, some studies suggest the contrary (e.g., Chua et al., Percept. Psychophys., 2005). In the revised manuscript, we will highlight the significance of our findings in the context of this conflicting evidence in the literature.

      (2.2) On an analytical note, the ERP analysis could be finetuned a little more. The task design does not allow measurement of the N2pc or N400 components, which are also relevant to the AB, but the N1 component could additionally be analyzed. In doing so, I would furthermore recommend selecting more lateral electrode sites for both the N1, as well as the P1. Both P1 and N1 are likely not maximal near the midline, where the authors currently focused their P1 analysis.

      We will incorporate these additional analyses in the revised manuscript.

      (2.3) Impact & Context:

      The results of this study will likely influence how we think about selective attention in the context of the AB phenomenon. However, I think its impact could be further improved by extending its theoretical framing. In particular, there has been some recent work on the nature of the AB deficit, showing that it can be discrete (all-or-none) and gradual (Sy et al., 2021; Karabay et al., 2022, both in JEP: General). These different faces of target awareness in the AB may be linked directly to the detection and discrimination subcomponents that are analyzed in the present paper. I would encourage the authors to discuss this potential link and comment on the bearing of the present work on these behavioural findings.

      Thank you. We will discuss our findings in the context of these recent studies.

      Reviewer #3:

      Summary:

      In the present study, the authors aimed to achieve a better understanding of the mechanisms underlying the attentional blink, that is, a deficit in processing the second of two target stimuli when they appear in rapid succession. Specifically, they used a concurrent detection and identification task in- and outside of the attentional blink and decoupled effects of perceptual sensitivity and response bias using a novel signal detection model. They conclude that the attentional blink selectively impairs perceptual sensitivity but not response bias, and link established EEG markers of the attentional blink to deficits in stimulus detection (N2p, P3) and discrimination (fronto-parietal high-beta coherence), respectively. Taken together, their study suggests distinct mechanisms mediating detection and discrimination deficits in the attentional blink.

      Strengths:

      Major strengths of the present study include its innovative approach to investigating the mechanisms underlying the attentional blink, an elegant, carefully calibrated experimental paradigm, a novel signal detection model, and multifaceted data analyses using state-of-the-art model comparisons and robust statistical tests. The study appears to have been carefully conducted and the overall conclusions seem warranted given the results. In my opinion, the manuscript is a valuable contribution to the current literature on the attentional blink. Moreover, the novel paradigm and signal detection model are likely to stimulate future research.

      Weaknesses:

      Weaknesses of the present manuscript mainly concern the negligence of some relevant literature, unclear hypotheses, potentially data-driven analyses, relatively low statistical power, potential flaws in the EEG methods, and the absence of a discussion of limitations. In the following, I will list some major and minor concerns in detail.

      Major points

      (3.1) Hypotheses:

      I appreciate the multifaceted, in-depth analysis of the given dataset including its high amount of different statistical tests. However, neither the Introduction nor the Methods contain specific statistical hypotheses. Moreover, many of the tests (e.g., correlations) rely on selected results of previous tests. It is unclear how many of the tests were planned a priori, how many more were performed, and how exactly corrections for multiple tests were implemented. Thus, I find it difficult to assess the robustness of the results.

      As outlined in the Introduction, we hypothesized that neural computations associated with target detection would be characterized by regional neuronal markers (e.g., parietal or occipital ERPs), whereas computations linked to feature discrimination may involve neural coordination across multiple brain regions (e.g. fronto-parietal coherence). We planned and conducted our statistical tests based on this hypothesis. All multiple comparison corrections (e.g., Bonferroni-Holm correction, see Methods) were performed separately for each class of analyses. We will clarify these hypotheses and provide further details in the revised manuscript.
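
      For readers unfamiliar with the Bonferroni-Holm procedure mentioned above, a minimal sketch of the standard step-down rule follows. It is a generic illustration, not the authors' analysis code; the function name and the p-values are hypothetical, and the correction is applied to one class of analyses at a time, as the response describes.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null is rejected."""
    m = len(p_values)
    # Visit the tests from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared to alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


# Hypothetical p-values from a single class of analyses.
decisions = holm_bonferroni([0.01, 0.04, 0.03, 0.005])
print(decisions)
```

      With these inputs the two smallest p-values survive correction and the other two do not, even though all four are below 0.05 before correction.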

      (3.2) Power:

      Some important null findings may result from the rather small sample sizes of N = 24 for behavioral and N = 18 for ERP analyses. For example, the correlation between detection and discrimination d' deficits across participants (r=0.39, p=0.059) (p. 12, l. 263) and the attentional blink effect on the P1 component (p=0.050, no test statistic) (p. 14, 301) could each have been significant with one more participant. In my opinion, such results should not be interpreted as evidence for the absence of effects.

      We agree and will revise the manuscript accordingly. We will also report Bayes factor (BF) values, where relevant, to further evaluate these claims.
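
      As a rough check on the borderline correlation raised in comment (3.2), the sketch below converts the reported r = 0.39 with N = 24 into a t statistic and a two-sided p-value. It assumes a standard Pearson correlation test with N - 2 degrees of freedom, which the source does not confirm, and it integrates the t density numerically so that only the Python standard library is needed. The function names are illustrative.

```python
import math


def t_from_r(r, n):
    """t statistic for a Pearson correlation r from n observations."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * (h / 3) * total   # P(-|t| < T < |t|)
    return 1 - central_mass


t_stat = t_from_r(0.39, 24)
p_val = two_sided_p(t_stat, df=22)
print(f"t(22) = {t_stat:.2f}, two-sided p = {p_val:.3f}")
```

      Under these assumptions the t statistic is about 1.99 and the p-value falls just above 0.05, consistent with the Reviewer's point that the result sits close to the conventional threshold and should not be read as evidence of absence.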

      (3.3) Neural basis of the attentional blink:

      The introduction (e.g., p. 4, l. 56-76) and discussion (e.g., p. 19, 427-447) do not incorporate the insights from the highly relevant recent review by Zivony & Lamy (2022), which is only cited once (p. 19, l. 428). Moreover, the sections do not mention some relevant ERP studies of the attentional blink (e.g., Batterink et al., 2012; Craston et al., 2009; Dell'Acqua et al., 2015; Dellert et al., 2022; Eiserbeck et al., 2022; Meijs et al., 2018).

      We will motivate and discuss our study in the context of these previous studies. 

      (3.4) Detection versus discrimination:

      Concerning the neural basis of detection versus discrimination (e.g., p. 6, l. 98-110; p. 18, l. 399-412), relevant existing literature (e.g., Broadbent & Broadbent, 1987; Hillis & Brainard, 2007; Koivisto et al., 2017; Straube & Fahle, 2011; Wiens et al., 2023) is not included.

      Thank you for these suggestions. We will include these important studies in our discussion.

      (3.5) Pooling of lags and lag-1 sparing:

      I wonder why the authors chose to include 5 different lags when they later pooled early (100, 300 ms) and late (700, 900 ms) lags, and whether this pooling is justified. This is important because T2 at lag 1 (100 ms) is typically "spared" (high accuracy) while T2 at lag 3 (300 ms) shows the maximum AB (for reviews, see, e.g., Dux & Marois, 2009; Martens & Wyble, 2010). Interestingly, this sparing was not observed here (p. 43, Figure 2). Nevertheless, considering the literature and the research questions at hand, it is questionable whether lag 1 and 3 should be pooled.

      Lag-1 sparing is not always observed in attentional blink studies; there are notable exceptions that do not report such sparing (Hommel et al., Q. J. Exp. Psychol., 2005; Livesay et al., Attention, Percept. Psychophys., 2011). Our statistical tests revealed no significant difference in accuracies between short lag (100 and 300 ms) trials or between long lag (700 and 900 ms) trials but did reveal significant differences between the short and long lag trials (ANOVA, followed by post-hoc tests). To simplify the presentation of the findings, we pooled together the short lag (100 and 300 ms) and, separately, the long lag (700 and 900 ms) trials. We will present these analyses, and clarify the motivation for pooling in the revised manuscript. 

      (3.6) Discrimination in the attentional blink

      Concerning the claims that previous attentional blink studies conflated detection and discrimination (p. 6, l. 111-114; p. 18, l. 416), there is a recent ERP study (Dellert et al., 2022) in which participants did not perform a discrimination task for the T2 stimuli. Moreover, since the relevance of all stimuli except T1 was uncertain in this study, irrelevant distractors could not be filtered out (cf. p. 19, l. 437). Under these conditions, the attentional blink was still associated with reduced negativities in the N2 range (cf. p. 19, l. 427-437) but not with a reduced P3 (cf. p. 19, l 439-447).

      We will address the difference between our findings and those of Dellert et al (2022) in the revised manuscript.

      (3.7) General EEG methods:

      While most of the description of the EEG preprocessing and analysis (p. 31/32) is appropriate, it also lacks some important information (see, e.g., Keil et al., 2014). For example, it does not include the length of the segments, the type and proportion of artifacts rejected, the number of trials used for averaging in each condition, specific hypotheses, and the test statistics (in addition to p-values).

      We regret the oversight. We will include these details in the revised Methods.

      (3.8) EEG filters:

      P. 31, l. 728: "The data were (...) bandpass filtered between 0.5 to 18 Hz (...). Next, a bandstop filter from 9-11 Hz was applied to remove the 10 Hz oscillations evoked by the RSVP presentation." These filter settings do not follow common recommendations and could potentially induce filter distortions (e.g., Luck, 2014; Zhang et al., 2024). For example, the 0.5 Hz high-pass filter could distort the slow P3 wave. Mostly, I am concerned about the bandstop filter. Since the authors commendably corrected for RSVP-evoked responses by subtracting T2-absent from T2-present ERPs (p. 31, l. 746), I wonder why the additional filter was necessary, and whether it might have removed relevant peaks in the ERPs of interest.

      Thank you for this suggestion. We will repeat this analysis by removing these additional filters.

      (3.9) Coherence analysis:

      P. 33, l. 786: "For subsequent, partial correlation analyses of coherence with behavioral metrics and neural distances (...), we focused on a 300 ms time period (0-300 ms following T2 onset) and high-beta frequency band (20-30 Hz) identified by the cluster-based permutation test (Fig. 5A-C)." I wonder whether there were any a priori criteria for the definition and selection of such successive analyses. Given the many factors (frequency bands, hemispheres) in the analyses and the particular shape of the cluster (p. 49, Fig 5C), this focus seems largely data-driven. It remains unclear how many such tests were performed and whether the results (e.g., the resulting weak correlation of r = 0.22 in one frequency band and one hemisphere in one part of a complexly shaped cluster; p. 15, l. 327) can be considered robust.

      Please see responses to comments #3.1 and #3.2 (above). In addition to reporting further details regarding statistical tests and multiple comparisons corrections, we will compute and report Bayes factors to quantify the strength of the evidence for correlations, as appropriate.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      The current manuscript provides an extensive in vivo analysis of two guidance pathways identifying multiple mechanisms that shape the bifurcation of DRG axons when forming the dorsal funiculus in the DREZ. 

      Strengths: 

      Multiple mouse mutant lines were used, together with complementary techniques; the results are very clear and compelling. 

      The findings are very significant and clearly move forward our understanding of the regulation of axonal development at the DREZ. 

      Weaknesses: 

      No major weaknesses were found. As it is I have no recommendations that would increase the clarity or quality of the manuscript. 

      Reviewer #2 (Public Review):

      Summary: 

      In this manuscript, the authors conduct a detailed analysis of the molecular cues that control the guidance of bifurcated dorsal root ganglion axons in a key region of the spinal cord called the dorsal funiculus. This is a specific case of axon guidance that occurs in a precise way. The authors knew that Slit was important but many axons still target correctly in Slit knockouts, suggesting a role for other guidance factors. Netrin1 is also expressed in this region, so they looked at netrin mutants. The authors found axons outside the DREZ in the Ntn1 mutants, and they show by single-neuron genetic labeling that many of these come from DRG neurons. Quantified axonal tracing studies in Slit1/2, Ntn1, or triple mutant embryos support the idea that Slit and Ntn1 have distinct functions in guidance and that the effect of their loss is additive. Interestingly, none of these knockouts affect bifurcation itself but rather the guidance of one or both of the bifurcated axon terminals. Knockout of the Slit receptors (Robo1/2) or the Netrin 1 receptor (DCC) in embryos causes similar guidance defects to loss of the ligands, providing additional confirmation of the requirement for both guidance pathways.

      Strengths: 

      This study expands understanding of the role of the axon guidance factors Ntn1/DCC and Slit/Robo in a specific axon guidance decision. The strength of the study is the careful axonal labeling and quantification, which allows the authors to establish precise consequences of the loss of each guidance factor or receptor.

      Weaknesses: 

      There are some places in the text where the discussion of these data is compared with other studies and models, but additional details would help clarify the arguments. 

      Details were added to the first section of the Discussion in the revision to address this weakness. Also see the response to the recommendations below.

      Reviewer #3 (Public Review):

      Summary: 

      In this paper, Curran et al. investigate the role of Ntn, Slit1, and Slit2 in the axon patterning of DRG neurons. The paper uses mouse genetics to perturb each guidance molecule and its corresponding receptor. Cre-based approaches and immunostaining of DRG neurons are used to assess the phenotypes. Overall, the study uses the strength of mouse genetics and imaging to reveal new genetic modifiers of DRG axons. The conclusions of the experiments match the presented results. The paper is an important contribution to the field, as evidence that dorsal funiculus formation is impacted by Ntn and Slit signaling. However, there are some potential areas of the manuscript that should be edited to better match the results with the conclusions of the work.

      Strengths: 

      The manuscript uses the advantage of mouse genetics to investigate the axon patterning of DRG neurons. The work does a great job of assessing individual phenotypes in single and double mutants. This reveals an intriguing cooperative and independent function of Ntn, Slit1, and Slit2 in DRG axon patterning. The sophisticated triple mutant analysis is lauded and provides important insight. 

      Weaknesses: 

      Overall, the manuscript is sound in technique and analysis. However, the majority of the manuscript is about the dorsal funiculus and not the bifurcation of the axons, as the title would make a reader believe. Further, the manuscript would benefit from a more scholarly discussion of the current knowledge of DRG axon patterning and how this work fits into that knowledge.

      We revised the title as suggested.  Additional discussion of DRG axon growth at the DREZ is added to the last section of the Discussion in the revision.  Also see the response to the recommendations below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Given the reasons stated above, I have no specific recommendations for the authors. 

      There is a typo in the Abstract (... mice with triple deletion of Ntn1, Slit2, and Slit2....). 

      Corrected in the revision.

      Reviewer #2 (Recommendations For The Authors):

      (1) The authors twice repeated that their data on DRG guidance defects in the Ntn1 mutants differ from studies previously published in references 19 and 26. However it is unclear to me, without having read those other studies, what is actually different between this study and those, and why there would be differences between the results from two groups. If the authors think this is an important point to make they need to more clearly say what the other group saw and offer an explanation of why the data may be different. 

      In the revision, we added a detailed comparison of the defects reported in the different studies to the first section of the Discussion and suggested multiple roles for Ntn1 in controlling sensory axon growth at the DREZ.

      (2) In the final section of the discussion it says, "The guidance regulation of DRG axon bifurcation by Slit and Ntn1 may be similar to but overshadowed by their function in midline guidance [43]." The meaning of this sentence was unclear to me. I had been thinking that since there are total knockout embryos (not conditional) there could be patterning effects that happen before the DRG branching that influence the formation of the DREZ. Is this what the authors mean to say here? How can the authors show that the guidance factors they have knocked out are actually functioning in the DRG neurons? 

      We agree with the reviewer that the first sentence is vague, so we edited the paragraph and included the discussion of the regulation of DRG axons at the DREZ, which was the main theme of this last section.  In addition, we agree with the reviewer’s suggestion of the possible indirect role of Ntn1 on DRG axons via the control of interneuron migration.  This possibility was included in the last paragraph of the Discussion.

      (3) In several of the figures (3T, 5I, 5J) there are distance measurements that are presumably averages of multiple axons in 3 or 4 embryos because 3-4 points are shown per graph. However, the figure and methods do not say how many axons were measured per embryo and I could not find if it says these numbers are averages. Clarifying the details of these panels would be useful. 

      The n is the number of animals analyzed and is now given in the figure legends. From each animal, multiple sections (2-4) were analyzed for the various parameters in Figs. 3 and 5. This information was added to the Methods section of the revision.

      Reviewer #3 (Recommendations For The Authors):

      Overall the data matches the conclusions in the paper. However, to this reviewer, the title suggests that Ntn and Slit will have defects in bifurcation. This is not the presented phenotype. I recommend the authors change the title to better reflect the findings of the work. 

      We edited the title of the revised manuscript to reflect the control of growth direction in the context of bifurcation.  

      The introduction of the work clearly outlines what is known about DREZ formation in mice but could extend its discussion to other systems like chick and zebrafish (Jaeda Coutinho-Budd et al. 2008, Wang and Scott 2000, Golding et al 1997, Nichols and Smith 2019, Kikel-Coury et al 2021). These studies are particularly important given that pioneer events, including bifurcation, can be visualized. Acknowledging the contribution of other model systems to the understanding of DRG axon patterning is important to improve the scholarly discussion of the paper. 

      We added a more detailed discussion of the current knowledge of DRG axon growth at the DREZ, drawing on several relevant studies in rodent and zebrafish models, to the last section of the Discussion.

      In the data presented, the authors see defects in the axon patterning of DRG neurons and conclude it is a defect in the dorsal funiculus formation. Another interpretation is that a subset of axons cannot invade the spinal cord boundary properly. This phenotype was observed in zebrafish with timelapse imaging (Kikel-Coury et al 2021). It may not be necessary to specifically test the axons' ability to enter the spinal cord in this paper, but the possibility that this could drive the presented phenotypes should be more clearly stated in the results. Entry is not thoroughly addressed in this paper and would need to be confirmed by labeling the edge of the spinal cord with a second reporter. No entry would obviously impact axon targeting. However, delayed entry could place the axon in a navigation environment that is atypical, causing it to navigate aberrantly and present as a funiculus phenotype. 

      We thank the reviewer for raising this very interesting point. In our present view, dorsal funiculus formation is related to DRG axon patterning, which involves growth, guidance, and bifurcation of the incoming afferents at the dorsal spinal cord. We believe that these events are highly coordinated by various environmental cues to generate the DREZ and the dorsal funiculus. The defects we observed could result from a disruption of this coordination that leads to misregulation of DRG axon entry at the dorsal spinal cord, as suggested by the reviewer. We propose that further analysis by time-lapse imaging, as done in zebrafish, would provide a better understanding of this coordination. This discussion was included in the last section of the Discussion.

      The authors should clarify that their approach does not knock out molecules in a cell-specific way. This would specifically impact the interpretation of the Dcc phenotypes. It is possible that UNC-40/DCC is guiding cells that are not labeled. The non-autonomous role of UNC-40/DCC should be clearly stated as a possibility. 

      This discussion was added to the last paragraph of the Discussion section.

    1. Reviewer #2 (Public Review):

      Summary:

      Zylberberg and colleagues show that food choice outcomes and BOLD signal in the vmPFC are better explained by algorithms that update subjective values during the sequence of choices compared to algorithms based on static values acquired before the decision phase. This study presents a valuable means of reducing the apparent stochasticity of choices in common laboratory experiment designs. The evidence supporting the claims of the authors is solid, although currently limited to choices between food items because no other goods were examined. The work will be of interest to researchers examining decision-making across various social and biological sciences.

      Strengths:

      The paper analyses multiple food choice datasets to check the robustness of its findings in that domain.

      The paper presents simulations and robustness checks to back up its core claims.

      Weaknesses:

      To avoid potential misunderstandings of their work, I think it would be useful for the authors to clarify their statements and implications regarding the utility of item ratings/bids (e-values) in explaining choice behavior. Currently, the paper emphasizes that e-values have limited power to predict choices without explicitly stating the likely reason for this limitation given its own results or pointing out that this limitation is not unique to e-values and would apply to choice outcomes or any other preference elicitation measure too. The core of the paper rests on the argument that the subjective values of the food items are not stored as a relatively constant value, but instead are constructed at the time of choice based on the individual's current state. That is, a food's subjective value is a dynamic creation, and any measure of subjective value will become less accurate with time or new inputs (see Figure 3 regarding choice outcomes, for example). The e-values will change with time, choice deliberation, or other experiences to reflect the change in subjective value. Indeed, most previous studies of choice-induced preference change, including those cited in this manuscript, use multiple elicitations of e-values to detect these changes. It is important to clearly state that this paper provides no data on whether e-values are more or less limited than any other measure of eliciting subjective value. Rather, the paper shows that a static estimate of a food's subjective value at a single point in time has limited power to predict future choices. Thus, a more accurate label for the e-values would be static values because stationarity is the key assumption rather than the means by which the values are elicited or inferred.

      There is a puzzling discrepancy between the fits of a DDM using e-values in Figure 1 versus Figure 5. In Figure 1, the DDM using e-values provides a rather good fit to the empirical data, while in Figure 5 its match to the same empirical data appears to be substantially worse. I suspect that this is because the value difference on the x-axis in Figure 1 is based on the e-values, while in Figure 5 it is based on the r-values from the Reval algorithm. However, the computation of the value difference measure on the two x-axes is not explicitly described in the figures or methods section and these details should be added to the manuscript. If my guess is correct, then I think it is misleading to plot the DDM fit to e-values against choice and RT curves derived from r-values. Comparing Figures 1 and 5, it seems that changing the axes creates an artificial impression that the DDM using e-values is much worse than the one fit using r-values.

      Relatedly, do model comparison metrics favor a DDM using r-values over one using e-values in any of the datasets tested? Such tests, which use the full distribution of response times without dividing the continuum of decision difficulty into arbitrary hard and easy bins, would be more convincing than the tests of RT differences between the categorical divisions of hard versus easy.

      Revaluation and reduction in the imprecision of subjective value representations during (or after) a choice are not mutually exclusive. The fact that applying Reval in the forward trial order leads to lower deviance than applying it in the backwards order (Figure 7) suggests that revaluation does occur. It doesn't tell us if there is also a reduction in imprecision. A comparison of backwards Reval versus no Reval would indicate whether there is a reduction in imprecision in addition to revaluation. Model comparison metrics and plots of the deviance from the logistic regression fit using e-values against backward and forward Reval models would be useful to show the relative improvement for both forms of Reval.

      Did the analyses of BOLD activity shown in Figure 9 orthogonalize between the various e-value- and r-value-based regressors? I assume they were not because the idea was to let the two types of regressors compete for variance, but orthogonalization is common in fMRI analyses so it would be good to clarify that this was not used in this case. Assuming no orthogonalization, the unique variance for the r-value of the chosen option in a model that also includes the e-value of the chosen option is the delta term that distinguishes the r and e-values. The delta term is a scaled count of how often the food item was chosen and rejected in previous trials. It would be useful to know if the vmPFC BOLD activity correlates directly with this count or the entire r-value (e-value + delta). That is easily tested using two additional models that include only the r-value or only the delta term for each trial.
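
      The decomposition described in this comment (r-value = e-value + delta) can be sketched as follows. This is a minimal illustration and not the authors' Reval code: it assumes a single fixed step `delta` that is added to the chosen item and subtracted from the rejected item after every trial, and the function name `reval` and the data layout are hypothetical.

```python
def reval(e_values, trials, delta):
    """Sketch of a sequential revaluation pass (assumed form, not the authors' code).

    e_values: dict mapping item -> static value (the e-value)
    trials:   list of (chosen, rejected) item pairs in trial order
    delta:    assumed fixed revaluation step (hypothetical parameter)

    Returns (per_trial, offset) where per_trial holds the r-values of the
    chosen and rejected items as they stood when each decision was made,
    and offset holds the accumulated delta term for every item.
    """
    offset = {item: 0.0 for item in e_values}
    per_trial = []
    for chosen, rejected in trials:
        # r-value = e-value + accumulated delta term, read before the update
        per_trial.append((e_values[chosen] + offset[chosen],
                          e_values[rejected] + offset[rejected]))
        # the delta term is a scaled count of past choices minus past rejections
        offset[chosen] += delta
        offset[rejected] -= delta
    return per_trial, offset


e_values = {"a": 1.0, "b": 0.5, "c": 0.0}
trials = [("a", "b"), ("b", "c"), ("a", "c")]
per_trial, offset = reval(e_values, trials, delta=0.25)
```

      Under these assumptions, the two additional fMRI models suggested above would use either the full per-trial r-value of the chosen item or only its accumulated offset as the parametric regressor.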

      Please confirm that the correlation coefficients shown in Figure 11 B are autocorrelations in the MCMC chains at various lags. If this interpretation is incorrect, please give more detail on how these coefficients were computed and what they represent.
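
      If the coefficients are indeed lag autocorrelations, they would be computed along the lines of the sketch below. This only illustrates the standard sample autocorrelation of a single chain; the function name `autocorrelation` is hypothetical and nothing here is taken from the authors' code.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation of one MCMC chain at a given non-negative lag."""
    n = len(chain)
    if lag < 0 or lag >= n:
        raise ValueError("lag must satisfy 0 <= lag < len(chain)")
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain)
    if var == 0:
        raise ValueError("chain is constant; autocorrelation is undefined")
    # covariance between the chain and a copy of itself shifted by `lag`
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return cov / var


# A perfectly alternating chain is strongly anti-correlated at lag 1.
chain = [1.0, -1.0] * 4
lag0 = autocorrelation(chain, 0)
lag1 = autocorrelation(chain, 1)
```

      Values that fall quickly towards zero as the lag increases would indicate good mixing of the chain.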

      The paper presents the ceDDM as a proof-of-principle type model that can reproduce certain features of the empirical data. There are other plausible modifications to bounded evidence accumulation (BEA) models that may also reproduce these features as well or better than the ceDDM. For example, a DDM in which the starting point bias is a function of how often the two items were chosen or rejected in previous trials. My point is not that I think other BEA models would be better than the ceDDM, but rather that we don't know because the tests have not been run. Naturally, no paper can test all potential models and I am not suggesting that this paper should compare the ceDDM to other BEA processes. However, it should clearly state what we can and cannot conclude from the results it presents.

      This work has important practical implications for many studies in the decision sciences that seek to understand how various factors influence choice outcomes. By better accounting for the context-specific nature of value construction, studies can gain more precise estimates of the effects of treatments of interest on decision processes. That said, there are limitations to the generalizability of these findings that should be noted.

      These limitations stem from the fact that the paper only analyzes choices between food items and the outcomes of the choices are not realized until the end of the study (i.e., participants do not eat the chosen item before making the next choice). This creates at least two important limitations. First, preferences over food items may be particularly sensitive to mindsets/bodily states. We don't yet know how large the choice deltas may be for other types of goods whose value is less sensitive to satiety and other dynamic bodily states. Second, the somewhat artificial situation of making numerous choices between different pairs of items without receiving or consuming anything may eliminate potential decreases in the preference for the chosen item that would occur in the wild outside the lab setting. It seems quite probable that in many real-world decisions, the value of a chosen good is reduced in future choices because the individual does not need or want multiples of that item. Naturally, this depends on the durability of the good and the time between choices. A decrease in the value of chosen goods is still an example of dynamic value construction, but I don't see how such a decrease could be produced by the ceDDM.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript investigates the dynamics of GC-content patterns in the 5'end of the transcription start sites (TSS) of protein-coding genes (pc-genes). The manuscript introduces a quite careful and comprehensive analysis of GC content in pc-genes in humans and other vertebrates, especially around the TSS. The result of this investigation states that "GC-content surrounding the TSS is largely influenced by patterns of recombination." (from end of Introduction)

      My main concern with this manuscript is one of causal reasoning, whether intended or not. I hope the authors can follow my reasoning below on how the logic sometimes seems to fail, and that they introduce changes to clarify their suggested mechanisms of action.

      The above quoted sentence from the end of the Intro is in conflict with this other sentence that appears at the end of the Abstract "the dynamics of GC-content in mammals are largely shaped by patterns of recombination". The sentence in the Intro seems to indicate that the effect is specific to TSSs, but the one in the abstract seems to indicate the opposite, that is, that the effect is ubiquitous.

      We are sorry about the lack of clarity. We have now rewritten the abstract and intro to emphasize that our results are restricted to the 5' end of genes, and that by "patterns of recombination" we mean "historic patterns of recombination".

      The observations as stated in the abstract are: "We observe that in primates and rodents, where recombination is directed away from TSSs by PRDM9, GC-content at protein-coding gene TSSs is currently undergoing mutational decay."

      If I understand the measurements described in the manuscript correctly, and the arguments around them, you seem to show that the mutational decay of GC-content in humans is independent of location (TSS or not), as noted here (also from the abstract) "These patterns extend into the open reading frame affecting protein-coding regions, and we show that changes in GC-content due to recombination affect synonymous codon position choices at the start of the open reading frame."

      Again, we have rewritten this section to clarify these points.

      There is one more result described in the manuscript, that in my mind is very important, but it is not given the relevance that it appears to me that it has. That is presented in Figure S3G. "we concluded that GC-content at the TSS of protein-coding genes is not at equilibrium, but in decay in primates and rodents. This decay rate is similar to the decay seen in intergenic regions that have the same GC-content (Figure S3G)"

      Thus, if the decaying effect happens everywhere, how can it be related to "recombination being directed away from TSSs by PRDM9" as it is stated in the abstract and in the model described in Figure 7?

      We make the argument that the GC-peak was likely caused by past recombination events. This is based on:

      1) The change in GC-content at the TSS in dogs and foxes, coupled to the fact that they perform recombination at the TSS

      2) That the TSS can act as a default recombination site in mice when PRDM9 is knocked out

      3) That some forms of PRDM9 allow for recombination at TSS (see Schield et al., 2020, Hoge et al. 2023, and Joseph et al., 2023) and that this is expected to cause an increase in GC-content

      We thus speculate that the GC-peak in humans and rodents was caused by past recombination at TSSs that were permitted by ancient variants of PRDM9. We further point out that PRDM9 is undergoing rapid evolution, and some of the past versions of the protein may have had this property.

      We have tried to clarify these points in the latest version of the text.

      The fact that the decay rate is similar to any other region with similar GC-content should be an indication that the effect is not related to anything having to do with TSS or recombination being directed away from TSSs by PRDM9.

      We are sorry about the lack of clarity. TSSs in humans, chimpanzees, mice and rats are experiencing GC-decay at the same rate as non-functional DNA regions with high GC-content. Thus the GC-peak is not being maintained by selection. This is surprising, given the role that GC-content plays in gene expression. This is a critical point, and we added it to the "conclusion" section of the abstract.

      I hope these paragraphs show my confusion about the relationship between the results presented which I think are very comprehensive and their interpretation and suggested model for GC-content dynamics around TSSs in human.

      On another note, can you provide a bit more background on recombination and its mechanisms?

      We have done our best to clarify these issues.

      You seem to have confident sets of genes under high/low/med recombination. How are those determined?

      We used the recombination rates per gene provided in Pouyet et al 2017 to identify the sets of genes under low/med/high recombination. Those rates were estimated from the HapMap genetic map (Frazer et al., 2007). This is now all specified in the methods section.

      You also seem to concentrate the cause of recombination on PRDM9, please explain. Is PRDM9 the unique indicator of recombination?

      PRDM9 has been shown to be the primary determinant of where recombination occurs in the genome (Grey et al., 2011, Brick et al., 2012). This is very well established. We now reword some of the introduction to make this clear.

      specific comments


      Figure 1, it is very hard to understand the differences between the three rows. Please explain more clearly in the legend, and add more information to the figure itself.

      We altered the axis titles to make this clearer. We also label "Upstream", "Exon 1" and "Part of Intron 1" in Figure 1C, F and I, and in Figure 2C. We now spell this out in the Figure Legend.

      Figure 7, express somewhere in the figure that the y axis measures GC content.

      We now added "GC Content" to the left of the first "graph" in Figure 7.

      Figure seems to introduce a 'causal' model of GC-content dismissing (diminishing?) based on recombination being directed away from TSSs. How about the diminishing of GC-content on any other genomic regions as you have shown in Figure S3G?

      Our focus in this model, and manuscript, is on TSSs. I think that to add the dynamics of other GC-rich regions is distracting. We do not know what caused these intergenic genomic regions to be high in GC-content prior to decay. After excluding known recombination sites and TSSs, these regions are very rare in the human genome. They may be ancient recombination sites that are decaying in GC-content. However, unlike TSSs, which have some connection to recombination (i.e. data from PRDM9 knockout mice and dogs and fox), we do not have any direct or indirect evidence that these other sites were used for recombination in the past. Alternatively, there could have been some other pressure on these sites in the past to increase GC-content that we are not aware of.

      -- The title is too selective, as to the results, and it has the implication that the decay is exclusive to the surrounding of the TSSs.

      Decay of GC-content towards equilibrium is the default state for non-functional DNA. That it is occurring at the TSS is surprising, as it indicates that the GC-peak is not maintained by selection. We now state this in the paper and include this in the "conclusion" portion of the abstract.

      Reviewer #1 (Significance (Required)):

      The statistical analysis is comprehensive and robust.

      We thank the reviewer for this.

      Their model interpretation, as described, induces confusion and needs to be clarified.

      We are sorry about this. Hopefully our revised text will clear up the confusion.

      I am an expert computational biologist, I do not have a deep knowledge of sequence implications of recombination, and it would be good if the manuscript could add some more background on that.

      We thank the reviewer for their perspective, and we hope that our text changes better explain to the non-expert why our findings are so surprising. We further clarify how recombination affects DNA sequence by gBGC and some of these changes are detailed in our response to the other reviewers.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this work, the authors present various analyses suggesting that GC-content in TSS of coding genes is affected by recombination. The article's findings are interesting and novel and are important to our understanding of how various non-adaptive evolutionary forces shape vertebrate genome evolutionary history.

      We thank the reviewer for these kind words.

      The Methods section includes most needed details (see comments below for missing information), and the scripts and data provided online help in transparency and usability of these analyses.

      I have several comments, mostly regarding clarifications in the text and several suggestions:

      1. In introduction: CpG islands, have been shown to activate transcription (Fenouil et al., 2012) - what is known about CpG Islands is somewhat inaccurately described. It should be rephrased more accurately, e.g. - CpG Islands found near TSS are associated with robust and high expression level of genes, including genes expressed in many tissues, such as housekeeping genes.

      We thank the reviewer for that. We have rewritten this part of the introduction.

      1. The following claim (in Introduction), regarding retrogenes and their GC content is not in agreement with recent analyses: "Indeed, it has been observed that these genes have elevated GC-content at their 5' ends in comparison to their intron-containing counterparts, suggesting that elevation of GC-content can be driven by positive selection to drive their efficient export (Mordstein et al., 2020). Moreover, retrogenes tend to arise from parental genes that have high GC-content at their 5'ends (Kaessmann et al.,2009)." Recent work showed that retrogenes in mouse and human are significantly depleted of CpG islands in their promoters (PMID: 37055747). This follows the notion that young genes, such as these retrogenes, have simple promoters (PMID: 30395322) with few TF binding sites and without CpGs. The two reported trends should both be mentioned with some suggestions regarding why they seem to be contrasting each other and how they can be reconciled.

      We thank the reviewer for this information. The previous report (Mordstein et al., 2020) indicated that the increase in GC-content occurs downstream of the TSS in retrogenes. Since sequences upstream of the TSS are not part of the retro-insertion, it is not surprising that GC-content may differ between the retrogene and the parental gene. That retrogenes have lower numbers of CpGs upstream of the TSS bolsters the idea that GC-content is not required for transcription and that the GC-peak is not being maintained in most genes by purging selection.

      1. In "Thus GC-content is expected, and is indeed observed to be higher near recombination hotspots due to gBGC (REF)." I think you forgot the reference...

      We thank the reviewer for catching this.

      1. In Results, regarding average GC content (Fig 2X): "Interestingly, this pattern is different in the nonamniotes examined, including anole lizard, coelacanth, shark and lamprey." - in lizard, it seems that the genomic average is lower (and lizards are amniotes)

      You are absolutely right. We now fix this.

      1. In Discussion, the statement: "This model is supported by findings in a recent preprint, which documents the equilibrium state of GC-content in TSS regions from numerous organisms" seems to contrast with the findings of the mentioned preprint. If "most mammals have a high GC-content equilibrium state" but still have a functional PRDM9, in the lack of evidence for functional differences between ortholog PRDM9 proteins (such as signatures for positive selection or functional assays), the authors' findings regarding the relationship between a lack of PRDM9 in canids and the trends observed in their TSS, are weakened.

      We are sorry about the confusion. We were not exactly sure what points were being commented on. 1) whether GC-content is at equilibrium for most mammals or 2) that the equilibrium state is high for most mammals despite containing PRDM9. We rewrote this sentence to clarify both issues (especially given that these concepts may not be clear to non-experts, such as the first reviewer). To answer the first potential concern, the paper in question (Joseph et al., 2023), does not show that GC-content at the TSS in mammals is at equilibrium, rather, it calculates what the equilibrium state is given the nucleotide substitution rates. In most organisms, the TSS is not at equilibrium. To answer both 1 and 2, Joseph et al., show that the equilibrium GC-content at the TSS for canids is much higher than for other mammals. They and others infer that the diversity between other mammals (where the equilibrium state is higher than humans and rodents but lower than canids) has to do with the variation between PRDM9 orthologues, however this has yet to be tested. Although the action of PRDM9 has not been evaluated in most mammals, we do point out that in snakes PRDM9 allows for some recombination at the TSS.
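
      The distinction drawn here between the current GC-content and the equilibrium GC-content implied by substitution rates can be illustrated with a deliberately simplified two-state model. This sketch is an assumption made for illustration, not the method of Joseph et al. (2023): it treats every site as either AT or GC, with a per-site rate `u` for AT to GC substitutions and `v` for GC to AT substitutions, and the function names are hypothetical.

```python
def equilibrium_gc(u, v):
    """Equilibrium GC fraction of a two-state model.

    u: rate of AT -> GC substitutions per AT site (assumed constant)
    v: rate of GC -> AT substitutions per GC site (assumed constant)
    """
    if u < 0 or v < 0 or u + v == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return u / (u + v)


def evolve_gc(gc, u, v, steps):
    """Deterministically iterate the expected GC fraction for `steps` time units."""
    for _ in range(steps):
        # gain from AT sites turning into GC, loss from GC sites turning into AT
        gc = gc + u * (1.0 - gc) - v * gc
    return gc


# A GC-rich region (70% GC) whose rates imply a 25% equilibrium decays towards it.
target = equilibrium_gc(0.01, 0.03)
after_10 = evolve_gc(0.7, 0.01, 0.03, 10)
after_2000 = evolve_gc(0.7, 0.01, 0.03, 2000)
```

      In this toy model a region is "in decay" whenever its current GC fraction lies above the equilibrium value, which is the sense in which a TSS can have a high GC-content today without being at equilibrium.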

      1. In Methods, the ENSEMBL version (in addition of the per-species genome version) should be mentioned.

      This has been fixed.

      1. In Fig 1, it is worth clarifying in the legend that the differences between the first and second rows of panels is in the length of the plotted region.

      We have now indicated this in the figure legend.

      Reviewer #2 (Significance (Required)):

      The manuscript provides a rigorous analysis of the possible processes that have impacted the TSS GC-content during evolution. It should be of interest to a diverse set of investigators in the genomics community, since it touches on different topics including genome evolution, transcription and gene structures.

      Thank you.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This study analyzes the distribution of GC-content along genes in humans and vertebrates, and particularly the higher GC-content in the 5'-end than in the 3'-end of genes. The results suggest that this pattern is ancient in vertebrates, currently decaying in mouse and humans, and probably driven by recombination and GC-biased gene conversion. It is proposed that the 5'-3' gradient was generated during evolution when PRDM9 was less active (in which case recombination occurs mostly near transcription start sites), and decays when PRDM9 is very active, as it is currently in humans and mouse. This is a very interesting hypothesis, also corroborated by a recent, similar analysis in mammals (Joseph et al. 2023). These two preprints, which appeared around the same time, are, I think, quite novel and important. The analyses performed here are thorough and convincing. Source code and raw data sets are openly distributed. I only have a couple of minor comments and suggestions, which I hope might help improve the manuscript.

      Thank you very much for the kind words.

      A1. There has been quite some work on the 5'-3' GC-content gradient in plants (e.g. Clément et al. 2014 GBE, Ressayre et al. 2015 GBE, Brazier & Glemin 2023 biorxiv), which you might like to cite.

      Thank you for pointing out these very interesting papers, we have incorporated them into the latest version.

      A2. CpG-content and GC-content are related in various ways (e.g. see Galtier & Duret 2000 MBE, Fryxell & Moon 2005 MBE) that you might like to discuss; currently the manuscript discusses the CpG hypermutation rate as a driver of GC-content but the picture might be a bit more complex.

      Thank you for this, we have incorporated these citations.

      A3. The model introduced by this manuscript (figure 7) is dependent on the evolution of recombination determination in vertebrates and the role of PRDM9. A recent preprint by Raynaud et al (biorxiv) seems relevant to this issue.

      Thank you for pointing out this pre-print. We have added a paragraph to the discussion that mentions this work. This also initiated a conversation with the authors, and we include some "personal communications" that illuminate what is going on in teleost fish.

      Line-by-line comments

      B1. "First, highly spliced mRNAs tend to have high GC-content at their 5' ends despite the fact that it is not required for export and does not affect expression levels (Mordstein et al., 2020)" -> I do not totally understand this sentence, which seems to imply some link between splicing and export/expression, could you please clarify?

      We rewrote that sentence to make it clearer.

      B2. "mismatches will form in the heteroduplex which are typically corrected in favor of Gs and Cs over As and Ts by about 70%" -> This 70% figure is human-specific, and varies a lot among species; I know in this introduction you're mainly reviewing the human literature but since this part of the text introduces gBGC as a process maybe clarify by adding "in humans" or refrain from giving this figure?

      Thank you. This is a good point. We fixed this.

      B3. "Thus GC-content is expected, and is indeed observed to be higher near recombination hotspots due to gBGC (REF)." -> reference missing here; actually I'm not sure you will find a good reference for this because PRDM9-dependent hotspots are so short-lived that GC-content would only respond weakly; maybe rather refer to the equilibrium GC-content (and cite, for instance, Pratto et al 2014 Science), or to high-recombining regions instead of hotspots (and you have plenty of papers to cite)?

      Thanks for this.

      B4. Paragraph starting: "PRDM9 and recombination hotspots also experience accelerated rates of evolution..." -> I would suggest removing the word "also" and moving this paragraph up, just before the sentence I'm commenting above (the one starting "Thus GC-content..."). This will justify my suggestion in comment B3 of mentioning high-recombining regions instead of hotspots, while also avoiding to have the important paragraph on recombination at TSS (the one starting "There are interesting connections...") being sandwiched between two sections on PRDM9.

      We did not move this paragraph, although we did adjust the wording slightly.

      B5. Paragraph starting "There are interesting connections..." is crucial to your discussion and might be emphasized a bit more in introduction, in my opinion. For instance, what about adding a sentence like "Although not directly relevant to humans, these observations suggest that gBGC might have played a role in shaping the observed 5'-3' GC-content gradient."

      We did not alter the structure of this paragraph but we did reword sections of it.

      1. "Interestingly, this pattern is different in the non-amniotes examined, including anole lizard, coelacanth, shark and lamprey. These organisms had clear differences in GC-content between their first exon and surrounding sequences (upstream and intronic sequences), which came close to the overall genomic GC-content." -> I'm not sure I got the point the authors are intending to make here. Also please note that lizards are amniotes.

      We thank the reviewer for catching this error, we have fixed this.

      Reviewer #3 (Significance (Required)):

      This is one of two preprints having appeared ~at the same time (the other one being the cited Joseph et al 2023), which I think are quite important and convincing regarding the role of PRDM9-dependent and PRDM9-independent recombination on GC-content evolution in vertebrates. I support publication of this preprint in a molecular evolutionary journal.

      We thank the reviewer for their kind assessment!

    1. Reviewer #1 (Public Review):

      Rubin et al. study chondrocyte columns in the prenatal and postnatal growth plate in 3D for the first time, using a novel analysis pipeline in which Confetti clones in the murine growth plate are analysed morphometrically. Prenatal chondrocytes were found not to be organised in columns parallel to the main orientation of the long bone, but rather, prenatal chondrocytes were commonly organised perpendicular to the main direction of growth. In the postnatal (P40) growth plate there was a diverse arrangement of columns, but more of the columns were vertically aligned.

      I enjoyed reading the work and the analysis is rigorous. However, I think that it is not valid to state that columns do not form in the embryo. The data only support the finding that strictly vertical columns do not form in the embryo, as the cells are still organised into columns, albeit with a range of orientations. I do not like the term "typically" aligned, as how can we know what is "typical" when orientation has never before been assessed in 3D... And the authors' data demonstrate that it is certainly not "typical" for chondrocytes to organise into vertical columns prenatally.

      It would be very interesting to delve deeper into the reason for the change in orientation of columns between pre- and post-natal. For example, does more circumferential growth happen prenatally as compared to postnatally? Is the rate of circumferential vs longitudinal growth different between prenatal and postnatal, and could the change in column orientation be responsible for a (possible) shift in the balance between longitudinal vs circumferential growth before vs after birth? The first sentence of the Discussion refers to the role of chondrocyte columns in driving bone elongation, but aren't they also involved in driving bone morphology?

      I feel that describing the activity of the cells as "mis-rotations" implies that the orientations are not intentional. It is likely not accidental or mistaken that the chondrocytes align in the ways they do: the diaphysis is largely for longitudinal growth, while lateral expansion of the epiphyses and of the joint is also important. I find the data in Figure 4 fascinating, especially the variation in orientations between the regions of the growth plate (from proximal to distal), with the most lateral orientation at the most proximal and distal ends. It would be nice to see more discussion of these variations and what they may be contributing to.

      The abstract focuses solely on the analysis of columns prenatally and would benefit from the inclusion of the data from the postnatal growth plate and from the chondrocyte rotations.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Compared to our initial submission to Review Commons, we have addressed all the reviewers' comments. We have extensively re-written the manuscript to make it clearer to a larger audience. In particular, we have transferred Figure EV1 to Figure 1 with more complete panels and included a scheme (Figure EV3) on the steps of D2R internalization which we measure with live cell imaging. We have added a new paragraph to the start of the Discussion to summarize our main conclusions and reordered the discussion on the possible mechanisms of membrane PUFA enrichment on D2R endocytosis. All the changes in the text are in red for easier comparison with the previous version.

      As suggested by reviewer 1, we have performed additional experiments to test the specificity of the effects of PUFA treatments on D2R endocytosis, reinforcing the results shown in Figure 4 using feeding assays. We show with live cell TIRF imaging and the ppH assay that TfR-SEP endocytosis is not affected (Figure EV5) and that SEP-β2AR endocytosis and βarr2-mCherry recruitment to the plasma membrane are not affected (Figure EV6).

      Reviewer #1

      Evidence, reproducibility and clarity

      The manuscript, using different live and fixed cell trafficking assays, demonstrates that incorporation of poly-unsaturated, but not saturated, free fatty acids in the membrane phospholipids reduces agonist-induced internalization of the D2 dopamine receptor but not of the adrenergic beta2 receptor or the transferrin receptor. Pulsed pH (ppH) live microscopy further demonstrated that the reduced internalization by incorporation of free fatty acid was accompanied by a blunted recruitment of Beta-arrestin for the D2R.

      I believe said claims put forward in the manuscript are overall well supported by the data and as such I do not believe that further experiments are necessarily needed to uphold these key claims. Also, the methodology is satisfactorily reported, and statistics are robust, although a two-way ANOVA, as used in Fig 1, seems appropriate for Fig 2 and 3.

      We thank the reviewer for his/her positive assessment of our work. We have checked the statistical tests used for all our measures. For Figures 2 and 3 (now 3 and 4) we test for only one factor (PUFA treatment or not), so we ran an ordinary one-way ANOVA using GraphPad Prism.

      That said, I suggest that the fixed cell internalization experiments (Fig 2 and 3), which relate the effect on the D2R to B2AR and transferrin, are revised. This is important for judging whether the effect is a general or a selective molecular mechanism, since this is one of the three assays on which this comparison relies. Alternatively, I suggest omitting this data and including the B2AR in the Live DERET assay and both B2AR and TfR in the ppH assay. Specifically, my concerns with the fixed cell internalization are:

      • The analysis is based on counting the number of endosomes, which is not necessarily equivalent to the number of receptors internalized

      The number of puncta, as well as their fluorescence, is reported by the analysis program (written in Matlab 2021 and available upon request). We chose to show the number of puncta because it reflects more directly the number of labelled endosomes (in Figures 3 and 4). As shown in the figure below, we found slight but significant differences between groups for FLAG-D2R (88.6 % and 87.6 % of average fluorescence in DHA- and DPA-treated cells, respectively, compared to control cells; panel A), and no differences for FLAG-β2AR (panel B). We found a significant decrease in puncta fluorescence for transferrin uptake in cells incubated with DHA (but not DPA) relative to control cells (panel C). However, because we did not detect differences in the number of puncta or in the frequency and amplitude of endocytic vesicle creation events (see below), we still conclude that enrichment with exogenous PUFAs does not affect clathrin-mediated endocytosis.

      In conclusion, the most robust measure of endocytosis for this assay is the number of detected puncta per cell rather than their fluorescence.

      • The analysis relies on fully effective stripping of the surface pool of receptors - i.e clustered surface receptors not stripped by the protocol will be assessed as internalized. It is often very difficult to obtain full efficiency of the Flag-tag stripping and this is somewhat expression dependent.

      • The protocol for the constitutive and agonist induced internalization is different and yet shown on the same absolute graph. Although I take it the microscope gain setting are unaltered between the constitutive and agonist induced internalization I don't believe the quantification can be directly related. This is confusing at the very least. More critically however, the membrane signal from the non-stripped condition of constitutive internalization will likely fully shield internalized receptors in the Rab4 membrane proximal recycling pathway leading to under-estimation of the in the constitutive endocytosis. I believe this methodological limitation underlies the massive relative difference in the constitutive endocytosis between panel 2A,B and 2C,D. For comparison, by a quantitative dual color FACS endocytosis assay, we have previously demonstrated the ligand endocytosis a ~4 fold increased over constitutive (in concert with Fig 2A,B here) (Schmidt et al 20XX). Importantly, high relative variability by this methodology could well shield an actual effect of incorporation of FFAs on the constitutive endocytosis.

      We thank the reviewer for pointing out this difference in the protocol. As a matter of fact, we did not use acid stripping in any of the conditions used for the uptake assays (Figures 3 and 4). We apologize for the confusion and we have clarified this point in the Methods section. In early experiments we compared conditions with or without stripping, but we concluded from these experiments that, indeed, the stripping was not complete. Moreover, we noticed early on that many cells treated with DHA or DPA did not have any detectable cluster (13 out of 58 quantified cells treated with DHA after addition of QPL, 12/56 cells treated with DPA, and 0/68 cells treated with vehicle). Stripping the antibody would have made these cells undetectable, biasing the analysis. Therefore, to make our results more consistent, we decided to use non-stripping conditions. To detect endosomes specifically, we used a segmentation tool developed earlier (see Rosendale et al. 2019). This tool is based on wavelet transforms, which recognize dot-like structures. In addition, we excluded the labelled plasma membrane from the cell mask by mask erosion.
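
      To make the potential bias concrete, the cell counts quoted above can be converted into percentages. This is a small Python sketch that uses only the counts given in the response; the dictionary layout and the function name are illustrative assumptions and are not part of the authors' Matlab analysis.

```python
# Cells without any detectable cluster after QPL addition, stored as
# (cells without clusters, cells quantified). Counts are taken from the
# response text; the data layout itself is a hypothetical choice.
COUNTS = {"vehicle": (0, 68), "DHA": (13, 58), "DPA": (12, 56)}


def percent_without_clusters(counts):
    """Return, per treatment, the percentage of quantified cells lacking clusters."""
    return {
        treatment: round(100.0 * without / total, 1)
        for treatment, (without, total) in counts.items()
    }


for treatment, pct in percent_without_clusters(COUNTS).items():
    print(f"{treatment}: {pct} % of cells without detectable clusters")
```

      Roughly one in five PUFA-treated cells (about 22 % for DHA and 21 % for DPA) had no detectable cluster, against none of the vehicle-treated cells, so any step that made such cells undetectable would remove them selectively from the treated groups.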

      We agree that the design of our experiments was not aimed at comparing the effect of PUFA treatment on low levels of constitutive D2R endocytosis. This would require more sensitive assays and will be addressed in subsequent studies.

      'Optional' Also, it would be informative to see the ppH Beta-arrestin experiments with the B2AR to assess, whether the putative discrepancy between D2R and B2AR is upstream or downstream of the blunted Beta-arrestin recruitment. To the same point, it would be very informative to assess how the incorporation of the free fatty acids affect receptor signalling, which would also help relate the effect of incorporation of the FFA's in the phospholipids to previous experiment using short term incubation with FFA's

      We have now performed live imaging experiments in HEK293 cells expressing SEP-β2AR, GRK2 and βarr2-mCherry and stimulated with isoproterenol (Figure EV6). We show that the clustering of SEP-β2AR and of βarr2-mCherry, as well as endocytosis, is not affected by treatment with DHA or DPA. In this study, we focused on the early trafficking steps of D2R internalization. It will be interesting in a future study to address the consequences for G protein-dependent and -independent signaling. Moreover, for good measure, we performed experiments to assess TfR-SEP endocytosis with the ppH assay. Again, we found no difference between cells treated or not with PUFAs (Figure EV5).

      *References overall seem appropriate although Schmidt et al would be relevant for reference of the constitutive vs agonist induced endocytosis of D2R and B2AR. *

      We have now cited Schmidt et al. 2020 doi 10.1111/bcpt.13274 in the discussion with the following sentences: "D2R also shows constitutive endocytosis (Schmidt et al, 2020) which may be modulated by PUFAs although we did not detect any significant difference in our measures (see Figure 3) which were aimed at detecting high levels of internalization induced by agonists. Further work will be required to specifically examine the effect of PUFAs on constitutive GPCR internalization."

      Overall, the figures are well composed and convey the messages fairly well. Specific point that would strengthen the rigor include: • Chosing actual representative pictures of the quantitative data in Fig 2 and 3 (e.g. hard to see 25 endocytic events in Fig 2A constitutive endo, EtOH)

      We apologize for the confusion. We employ a normalization procedure to account for cell size. In addition, all numbers have been normalized to the condition stimulated with agonist without PUFA treatment. In fact, we detect very few puncta in unstimulated cells (on average 0.6, range 0-5) compared to 27.3 clusters (range 2-87) in cells stimulated with QPL.

      • Showing actual p values for the statistical comparisons

      For easier reading, we have kept the stars convention for the figures but added two tables with all statistical tests and the p values for both main figures and EV figures.

      Moreover, for ease of reading the figures (without consulting the legend repeatedly) it would be very helpful to headline individual panel with what the experiments assesses. Figure 1a and 1b for example can't be distinguished at all before reading the figure legend. Also, y-axis could be more informative on what I measured rather than just giving the unit.

      We have added titles to panels (in particular for Figure 2A,B which correspond to former Figure 1A,B) and we have given new titles to Y axes to make them clearer. We hope that the reading of our figures will now be easier.

      Finally, the figure presentation and description of S1 is very hard to follow. I cannot really make out what is assessed in the different panels.

      We have substantially changed Figure EV1 (now Figure 1), with a new presentation of the data: all 4 conditions (control, treated with DHA, DPA or BA) are systematically presented in the same graph, with clearer titles for the parameters displayed on the Y axes. We hope that this figure is now easier to follow.

      Significance

      *The strength of the manuscript is the use and validation of incorporation of FFA's in the plasma membrane, which more closely mimics the physiological situation than brief application of FFAs as often done. Is addition, the blunted recruitment of beta-arrestin as assessed by the ppH protocol is quite intriguing mechanistically. The limitation are the relative narrow focus on the D2 receptor (and not multiple GPCRs) that does not really speak to as or assess the physiological, pathophysiological or therapeutic role of the observations (except from referring the relation between FFAs and disease). Also, despite the putative role of Beta-arrestin recruitment in the process, the actual causation in the process is not clear. This shortcoming is underscored by the putative effect on the constitutive internalization described above.

      My specific expertise for assessing the paper is within general trafficking processes (including the trafficking methodology applied), trafficking of GPCRs and function of the dopamine system including the role of D2 receptors.*

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The only conclusion that I was able to understand from the study was that enrichment of cell membranes with polyunsaturated fatty acids specifically inhibited agonist-induced internalization of D2 receptors. However, I think that the experiments used to conclude that PUFAs do not alter D2R clustering but reduce the recruitment of β-arrestin2 and D2R endocytosis need some clarification (i.e. data depicted in Fig. 2-5). This lack of clarity might be due to the fact I am not familiar enough with the employed technologies or to the unclear writing style of the paper. There was an overuse of acronyms, initialisms and abbreviations, which are difficult to understand for researchers outside of the specific lipid field. I think that the manuscript should be written in a way to be legible also for researchers not working in the immediate filed.

      The paper was not written in a manner that a general audience of cell biologists or those interested in GPCR biology could understand and judge. It is indeed interesting that polyunsaturated fatty acids specifically inhibit D2R internalization in HEK293 cells, and it could be significant. But, it is difficult to judge the significance of the observation without more in vivo data.

      I would suggest the following. Remove all acronyms and abbreviations. Significantly, expand the Materials and Methods section, either in the manuscript or in the Supplemental section. I suggest clearly explaining each construct used, and the function of each module in the construct, with diagrams. In addition, provide a comprehensive step by step description of each experimental protocol, providing the reader with the rationale for each step in the protocol with explanatory diagrams. The authors should also more clearly explain the rationale and logic that was utilized to make the conclusions that they did from the depicted observations. Only then can a broader audience determine if the authors' conclusions are justified.

      We thank the reviewer for his/her comments. Indeed, our main message was that two types of PUFAs (DHA and DPA) specifically alter D2R endocytosis by reducing the recruitment of β-arrestin2 without changing D2R clustering at the plasma membrane. We are sorry that our writing was not clear enough. We also found that, in the last steps of the submission to Review Commons, the first paragraph of the Discussion was inadvertently erased. This made our main conclusions, summarized in this first paragraph, less clear. We have now restored this important paragraph. Moreover, we have extensively rewritten the manuscript, striving to make it as clear as possible to a large audience. We have reduced the use of acronyms to keep only the most used ones [e.g. PUFA (used 99 times), DHA (37 times), GPCR (34 times), D2R (126 times), GRK (17 times)] and made them consistent throughout the manuscript. Following the reviewer's suggestion, we have also added a scheme of the steps following D2R activation by agonist leading to its internalization (Figure EV3).

      We understand that by "in vivo data" the reviewer means results obtained in the brain of animals. As written in the Introduction and in the Discussion, the current work follows up on two recently published manuscripts by a subset of the authors, namely (i) Ducrocq et al. 2020 (doi 10.1016/j.cmet.2020.02.012), in which we show that deficits in motivation in animals deprived of ω3-PUFAs can be restored specifically by conditional expression of a fatty acid desaturase from C. elegans (FAT1) that allows restoring PUFA levels specifically in D2R-expressing striatal projection neurons (which mediate the so-called indirect pathway), and (ii) Jobin et al. 2023 (doi: 10.1038/s41380-022-01928-6), which combines in cellulo (HEK 293 cells) and in vivo data to show that PUFAs affect the ligand binding of the dopamine D2 receptor and its signaling in a lipid context that reflects patient lipid profiles regarding poly-unsaturation levels.

      Reviewer #2 (Significance (Required)):

      In summary, I will reiterate that the reported experiments need to be much better explained to make the study understandable to a broader audience and for that audience to determine whether the conclusions are justified.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The authors investigate the role of lipid polyunsaturation in endocytic uptake of the dopamine D2 receptor (D2R). To modulate the degree of unsaturation in live cell plasma membranes, the authors incubate cell lines with pure fatty acid that is metabolized and incorporated into the cellular membranes. To quantify the internalization of D2R in these live cells, the authors utilized quantitative fluorescence assays such as DERET and endosome analysis to determine the degree and rate of D2R internalization in the presence of two model agonists - dopamine and quinpirole. The authors conclude that when the PUFA content of the plasma membrane is increased (i.e., via ω3 or ω6 fatty acids), both the quantity and rate of D2R internalization decrease substantially. The authors confirmed that these phenomena are specific to D2R as caveolar endocytosis and clathrin-mediated endocytosis were unaffected when these same experimental techniques were utilized for β2 adrenergic receptor and transferrin. Additionally, the authors conclude that the clustering ability of D2R is unaffected by lipid unsaturation but that the ability of D2R clusters to interact with β-arrestin2 is inhibited in the presence of excess PUFA. Based on these findings, the authors propose several hypothetical mechanisms for lipid-D2R interactions on the plasma membrane, which will likely be the scope of future work.

      Overall, this is a highly thorough and rigorous body of work that convincingly illustrates the connection between PUFA levels and D2R activity. However, I do not agree with the authors' conclusions pertaining to how their results should be interpreted in the context of fatty acid-related disorders. Additionally, this manuscript could benefit from some reorganization which would present the work more clearly. Please see the comments below.

      We thank the reviewer for the positive appreciation of our work, qualified as a "thorough and rigorous body of work that convincingly illustrates the connection between PUFA levels and D2R activity". We will address the specific points raised by the reviewer with our answers below.

      Comments:

      • A recurring motivation for this study that is brought up by the authors is that dietary deficiency of ω3 fatty acids is tied to D2R dysfunction. This would indicate that PUFA reduction in the plasma membrane results in D2R dysfunction. However, the experiments emphasized in this manuscript investigate the condition where PUFA content is INCREASED in the plasma membrane and D2R function is compromised. It seems inappropriate for the authors to cite dietary deficiency of ω3 as a motivation when they experimentally test a condition that is tied to ω3 surplus.

      Regarding the general comment of the reviewer, we agree that direct conclusions cannot be drawn on the etiology of psychiatric disorders by looking at the effect of membrane fatty acid levels on D2R in HEK 293 cells. Nevertheless, we mention in the Introduction the intriguing occurrence of low PUFA levels in psychiatric disorders as a starting point to look at D2R as an important target for psychoactive drugs prescribed for these disorders. In the Discussion, we propose that manipulating fatty acid levels might potentiate the efficacy of D2R ligands used as treatments. We felt that raising these aspects did not put too much emphasis on psychiatric disorders. However, in accordance with the reviewer's comment, we toned down these descriptions in the revised manuscript.

      The goal of increasing the levels of fatty acids at the membrane in HEK 293 cells, the most widely used cellular system to study GPCR trafficking, was to try to emulate the levels of lipids in brain cells. Indeed, the levels of PUFAs in our culture conditions are much lower (~8 %, Figure 1B) than in brain extracts (~30 %). Therefore, the "control" condition in HEK 293 cells would correspond to PUFA deficiency, while after our enrichment protocol these levels are closer to those found in brain cells. Our results could therefore be interpreted as showing that D2R endocytosis is augmented when membrane PUFA levels decrease. Importantly, increased receptor internalization often correlates with decreased signaling. Therefore, membrane PUFA enrichment in our conditions would rather potentiate D2R signaling.

      • Following up on the first comment, the authors' results seem to indicate that excess ω3's are detrimental to D2R function. This result would be at odds with the conventional view that ω3's are essential and that excessive ω3 may not be harmful. The authors should rationalize their findings in the context of what is known about excess dietary ω3.

      The Reviewer is right that the conventional view is that excessive ω3 PUFA may not be harmful. However, this rather applies to dietary consumption, which might have a limited effect on brain fatty acid contents since their accretion is highly regulated. Moreover, the majority of studies looking at ω3 supplementation have been performed in young adults, and the effects on the developing brain - as might be happening in pathological conditions in which D2R is involved - remain poorly understood. Furthermore, as mentioned above, blunted internalization of D2R under membrane PUFA enrichment does not indicate a "detrimental" effect on D2R function. Nor do we argue that membrane enrichment corresponds to excess PUFAs.

      • I would argue that the control experiments with saturated fatty acids (i.e., Behenic Acid in figure 1), represent a scenario mimicking ω3 deficiency as the enrichment of Behenic Acid causes an overall reduction in PUFAs (Figure EV1C - an increase in SFA must correspond to a decrease in PUFA). These Behenic acid results are the only experiments presented by the authors that mimic a scenario resembling ω3 deficiency and the results show that the D2R internalization is unaffected (Figure 1G-H). Therefore, I would further argue that if anything, the authors results suggest that ω3 deficiency is NOT correlated to D2R internalization. Again, the authors must rationalize these findings in the context of what is known about dietary intake of ω3's.

      The Reviewer must refer to the fact that nutrients rich in SFAs are usually poor in PUFAs and vice-versa. Based on our lipidomic analysis, we now present in Figure 1 the effect of treatments (DHA, DPA, BA) on the levels of PUFAs (Figure 1B) and saturated fatty acids (Figure 1C). In cells treated with behenic acid (BA), PUFA levels are not significantly changed relative to control, untreated cells, while saturated fatty acid levels are increased. BA was used here to determine whether the effects observed with PUFAs were related to the enrichment in unsaturations or to carbon chain length (C22). The effects are not due to chain length, because BA treatment, unlike DHA or DPA treatment, does not affect D2R endocytosis (Figure 2G,H).

      • It's not clear why the authors decided to include an ω6 fatty acid in this study. The authors built up a detailed rationale for investigating ω3's as they are dietarily essential and tied to disease when deficient. To my knowledge, ω6's are considered much less beneficial than ω3's in a dietary sense. The inclusion of an ω6 almost seems coerced as the ω6-related results don't provide any interesting additional insights. It would benefit the manuscript if the authors provided some additional discussion explaining why ω6's are being investigated in addition to ω3's.

      We agree that we could have made the rationale clearer. The goal in comparing ω3-DHA and ω6-DPA was to assess whether the position of the first unsaturation (n-3 vs n-6), with the same carbon chain length (C22) might differentially impact D2R endocytosis.

      • In Figure EV1D, the AHA and DPA percentages each increase by ~6%. The corresponding Figure EV1B indicates that the overall PUFA% in the plasma membrane also increases by 6%. This makes sense as the total change in PUFA content is consistent with the amount of AHA or DPA being internalized to cells. However, this consistency was not observed with BA and SFAs. In Figure EV1E, the BA percentage increases only ~1% while the total SFA percentage in Figure EV1C increases by ~6%. How can something undergoing a 1% change (relative to total lipid content) result in a 6% overall change in SFA content?

      The reviewer is correct: the level of SFAs is increased by 5.2 % (from 34.5 % of total FAs in control cells to 39.7 % in BA-treated cells), more than the increase in BA alone (1.18 %, from 0.35 % to 1.53 %). A close look at our lipidomics data showed that many of the 10 saturated fatty acids quantified are increased. In particular, the two most abundant ones, palmitic acid (16:0) and stearic acid (18:0), are increased from 21.37 % to 22.28 % and from 8.47 % to 11.17 %, respectively. The reasons for these apparent discrepancies may involve lipid metabolic pathways which convert the rare and long BA into more common and shorter SFAs to preserve lipid contents and thus membrane properties.
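
      The arithmetic behind this answer can be laid out explicitly. The following is a minimal Python sketch that uses only the percentages quoted above; the function name and the idea of reporting an unaccounted remainder are illustrative assumptions rather than part of the original lipidomics analysis, and the remainder is only approximate because the total SFA values are quoted to one decimal place.

```python
# Percent of total fatty acids, as quoted in the response above.
CONTROL = {
    "behenic acid (22:0)": 0.35,
    "palmitic acid (16:0)": 21.37,
    "stearic acid (18:0)": 8.47,
}
BA_TREATED = {
    "behenic acid (22:0)": 1.53,
    "palmitic acid (16:0)": 22.28,
    "stearic acid (18:0)": 11.17,
}


def sfa_increase_breakdown(control, treated, total_control, total_treated):
    """Split the rise in total SFA into per-species contributions.

    Returns (per-species deltas, total delta, sum of listed deltas,
    remainder not accounted for by the listed species), all in
    percentage points of total fatty acids.
    """
    deltas = {name: round(treated[name] - control[name], 2) for name in control}
    total_delta = round(total_treated - total_control, 2)
    explained = round(sum(deltas.values()), 2)
    remainder = round(total_delta - explained, 2)
    return deltas, total_delta, explained, remainder


deltas, total_delta, explained, remainder = sfa_increase_breakdown(
    CONTROL, BA_TREATED, total_control=34.5, total_treated=39.7
)
for name, delta in deltas.items():
    print(f"{name}: +{delta} points")
print(f"total SFA: +{total_delta} points, of which +{explained} from the species above")
print(f"left for the other quantified SFAs: about +{remainder} points")
```

      With these inputs, behenic, palmitic and stearic acid together account for about 4.8 of the 5.2 percentage-point rise in SFAs, with stearic acid alone contributing 2.7 points, which leaves roughly 0.4 points for the other quantified saturated fatty acids.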

      • In Figure 4, the discussion of kinetics does not make sense. How exactly are kinetics being monitored in this figure? (Recruitment kinetics are discussed in panels D and G)

      We wanted to convey the impression that the time to reach the peak βarr2-mCherry recruitment was shorter in PUFA-treated cells than in control cells. However, after analyzing the kinetics in individual cells, we did not find a statistically significant difference in the time to maximum fluorescence. Therefore, we removed this reference to the kinetics of recruitment.

      We now write: "However, treatment with DHA or DPA significantly decreased peak βarr2-mCherry fluorescence (Figure 5F-G)."

      • In Figure 5, What is the purpose of panel D? Would it be more helpful to include additional, overlaid "cumulative N" plots for scenarios in which PUFAs were enriched? This would work well in conjunction with panel F.

      The purpose of this panel is to show the kinetics of increase in the frequency of endocytic vesicle formation upon agonist addition, and the decrease in frequency when the agonist is removed. We have now added examples of cells treated with DHA and DPA of similar surface for direct comparison with control (EtOH) cells.

      • For the readers who are new to this area or unfamiliar with the assays used, Figure 1 is not intuitive and initially difficult to interpret. It would greatly benefit the flow of the manuscript if Figures EV1A-C and EV2A were included in the main text and "Normalized R" was clearly defined in the main text, prior to discussion of Figure 1.

      We have now moved Figure EV1 to Figure 1. We have adapted the scheme of the DERET assay and its legend (now in Figure EV1A) to make it clearer. We did not place this scheme in Figure 2 because that figure is already very large. We have changed "Normalized R" to "Ratio 620/520 (% max)" to be clearer and more consistent with the scheme.

      Reviewer #3 (Significance (Required)):

      General assessment: The work, for the most part, is rigorous and scientifically sound. The authors utilize impressive, quantitative assays to expand our understanding of protein-lipid interactions. However, the authors need to improve their discussion of the actual physiological conditions that correspond to their experimental results.

      Advance: This work may fill a gap in our understanding of disorders related to the dopamine D2 receptor. However, some of the results may be at odds with what is currently known/understood about dietary ω3 fatty acids.

      Audience: This work will be of broad interest to researchers in the biophysics field, with particular emphasis on researchers who study protein and membrane biophysics. This work will also be of interest to researchers who study membrane molecular biology.

      Reviewer Expertise: quantitative fluorescence spectroscopy and microscopy; membrane biophysics; protein-lipid interactions

    2. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Compared to our initial submission to Review Commons, we have addressed all the reviewers' comments. We have extensively re-written the manuscript to make it clearer to a larger audience. In particular, we have transferred Figure EV1 to Figure 1 with more complete panels and included a scheme (Figure EV3) on the steps of D2R internalization which we measure with live cell imaging. We have added a new paragraph to the start of the Discussion to summarize our main conclusions and reordered the discussion on the possible mechanisms of membrane PUFA enrichment on D2R endocytosis. All the changes in the text are in red for easier comparison with the previous version.

      As suggested by reviewer 1, we have performed additional experiments to test the specificity of the effects of PUFA treatments on D2R endocytosis, reinforcing the results shown in Figure 4 using feeding assays. We show with live cell TIRF imaging and the ppH assay that TfR-SEP endocytosis is not affected (Figure EV5) and that SEP-β2AR endocytosis and βarr2-mCherry recruitment to the plasma membrane are not affected (Figure EV6).

      Reviewer #1

      Evidence, reproducibility and clarity

      *The manuscript, using different live and fixed cell trafficking assays, demonstrates that incorporation of poly-unsaturated, but not saturated, free fatty acids in the membrane phospholipids reduce agonist induced internalization of the D2 dopamine receptor but not the adrenergic beta2 receptors or the transferrin receptor. Pulsed pH (ppH) live microscopy further demonstrated that the reduced internalization by incorporation of free fatty acid was accompanied by a blunted recruitment of Beta-arrestin for the D2R.

      I believe said claims put forward in the manuscript are overall well supported by the data and as such I do not believe that further experiments are necessarily needed to uphold these key claims. Also, the methodology is satisfactorily reported, and statistics are robust, although two-way Anova like used in Fig 1 seems appropriate for Fig 2 and 3*

      We thank the reviewer for his/her positive assessment of our work. We have checked the statistical tests used for all our measures. For Figures 2 and 3 (now Figures 3 and 4), we tested only one factor (PUFA treatment or not), so we ran an ordinary one-way ANOVA using GraphPad Prism.

      That said, I suggest that the fixed cell internalization experiments (Fig 2 and 3), which relate the effect on the D2R to B2AR and transferrin are revised. This is important since this is relevant to judge whether the effect is a general or a selective molecular mechanism since this is the one of the three assay which this comparison relies on. Alternatively, I suggest omitting this data and include the B2AR in the Live DERET assay and both B2AR and TfR in the ppH assay. Specifically, my concerns with the fixed cell internalization are: • The analysis is based on counting the number of endosomes, which is not necessarily equivalent to the number of receptors internalized

      The number of puncta, as well as their fluorescence, is reported by the analysis program (written in Matlab 2021 and available upon request). We chose to show the number of puncta because it reflects more directly the number of labelled endosomes (in Figures 3 and 4). As shown in the figure below, we found slight but significant differences between groups for FLAG-D2R (88.6 % and 87.6 % of average fluorescence in DHA- and DPA-treated cells, respectively, compared to control cells; panel A), and no differences for FLAG-β2AR (panel B). We found a significant decrease in puncta fluorescence for transferrin uptake in cells incubated with DHA (but not DPA) relative to control cells (panel C). However, because we did not detect differences in the number of puncta or in the frequency and amplitude of endocytic vesicle creation events (see below), we still conclude that enrichment with exogenous PUFAs does not affect clathrin-mediated endocytosis.

      In conclusion, the most robust measure of endocytosis for this assay is the number of detected puncta per cell rather than their fluorescence.

      • The analysis relies on fully effective stripping of the surface pool of receptors - i.e clustered surface receptors not stripped by the protocol will be assessed as internalized. It is often very difficult to obtain full efficiency of the Flag-tag stripping and this is somewhat expression dependent.

      • The protocol for the constitutive and agonist induced internalization is different and yet shown on the same absolute graph. Although I take it the microscope gain setting are unaltered between the constitutive and agonist induced internalization I don't believe the quantification can be directly related. This is confusing at the very least. More critically however, the membrane signal from the non-stripped condition of constitutive internalization will likely fully shield internalized receptors in the Rab4 membrane proximal recycling pathway leading to under-estimation of the in the constitutive endocytosis. I believe this methodological limitation underlies the massive relative difference in the constitutive endocytosis between panel 2A,B and 2C,D. For comparison, by a quantitative dual color FACS endocytosis assay, we have previously demonstrated the ligand endocytosis a ~4 fold increased over constitutive (in concert with Fig 2A,B here) (Schmidt et al 20XX). Importantly, high relative variability by this methodology could well shield an actual effect of incorporation of FFAs on the constitutive endocytosis.

      We thank the reviewer for pointing out this difference in the protocol. As a matter of fact, we did not use acid stripping in any of the conditions used for the uptake assays (Figures 3 and 4). We apologize for the confusion and we have clarified this point in the Methods section. In early experiments we compared conditions with or without stripping, but we concluded from these experiments that, indeed, the stripping was not complete. Moreover, we noticed early on that many cells treated with DHA or DPA did not have any detectable cluster (13 out of 58 quantified cells treated with DHA after addition of QPL, 12/56 cells treated with DPA, and 0/68 cells treated with vehicle). Stripping the antibody would have made these cells undetectable, biasing the analysis. Therefore, to make our results more consistent, we decided to use non-stripping conditions. To detect endosomes specifically, we used a segmentation tool developed earlier (see Rosendale et al. 2019). This tool is based on wavelet transforms, which recognize dot-like structures. In addition, we excluded the labelled plasma membrane from the cell mask by mask erosion.

      We agree that the design of the experiments was not aimed at comparing the effect of PUFA treatment on low levels of constitutive D2R endocytosis. This would require more sensitive assays and will be addressed in subsequent studies.

      'Optional' Also, it would be informative to see the ppH Beta-arrestin experiments with the B2AR to assess, whether the putative discrepancy between D2R and B2AR is upstream or downstream of the blunted Beta-arrestin recruitment. To the same point, it would be very informative to assess how the incorporation of the free fatty acids affect receptor signalling, which would also help relate the effect of incorporation of the FFA's in the phospholipids to previous experiment using short term incubation with FFA's

      We have now performed live imaging experiments in HEK293 cells expressing SEP-β2AR, GRK2 and βarr2-mCherry and stimulated with isoproterenol (Figure EV6). We show that the clustering of SEP-β2AR, of βarr2-mCherry, as well as endocytosis, are not affected by treatments with DHA or DPA. In this study, we focused on the early trafficking steps of D2R internalization. It will be interesting in a future study to address its consequences on G protein dependent and independent signaling. Moreover, and for good measure, we performed experiments to assess TfR-SEP endocytosis with the ppH assay. Again, we found no difference between cells treated or not with PUFAs (Figure EV5).

      References overall seem appropriate although Schmidt et al would be relevant for reference of the constitutive vs agonist induced endocytosis of D2R and B2AR.

      We have now cited Schmidt et al. 2020 doi 10.1111/bcpt.13274 in the discussion with the following sentences: "D2R also shows constitutive endocytosis (Schmidt et al, 2020) which may be modulated by PUFAs although we did not detect any significant difference in our measures (see Figure 3) which were aimed at detecting high levels of internalization induced by agonists. Further work will be required to specifically examine the effect of PUFAs on constitutive GPCR internalization."

      Overall, the figures are well composed and convey the messages fairly well. Specific points that would strengthen the rigor include: • Choosing actual representative pictures of the quantitative data in Fig 2 and 3 (e.g. hard to see 25 endocytic events in Fig 2A constitutive endo, EtOH)

      We apologize for the confusion. We employ a normalization procedure to account for cell size. In addition, all numbers have been normalized to the condition stimulated with agonist with no PUFA treatment. In fact, we detect in unstimulated cells very few puncta (on average 0.6, range 0-5) compared to 27.3 clusters (range 2-87) in cells stimulated with QPL.
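      The two-step normalization described above can be illustrated with a minimal sketch. It assumes that the cell-size correction is a simple division of the puncta count by cell area, and that values are then expressed as a percentage of the mean of the agonist-stimulated, vehicle-treated condition. The function names, the cell areas and the counts are hypothetical and are not taken from the manuscript.

```python
from statistics import mean

def density(puncta, area_um2):
    """Puncta per unit cell area (assumed form of the cell-size correction)."""
    if area_um2 <= 0:
        raise ValueError("cell area must be positive")
    return puncta / area_um2

def normalize_to_reference(densities, reference_densities):
    """Express each density as a percentage of the mean reference density."""
    ref = mean(reference_densities)
    if ref == 0:
        raise ValueError("reference condition has zero mean density")
    return [100.0 * d / ref for d in densities]

# Hypothetical cells: (puncta count, cell area in square micrometres).
agonist_vehicle = [(30, 600.0), (20, 400.0), (45, 900.0)]  # reference condition
unstimulated = [(0, 500.0), (1, 400.0), (5, 1000.0)]

ref = [density(n, a) for n, a in agonist_vehicle]
unstim = [density(n, a) for n, a in unstimulated]
unstim_pct = normalize_to_reference(unstim, ref)
print(unstim_pct)
```

      With this kind of normalization, a representative image of an unstimulated cell can contain almost no puncta even when the plotted value is not zero, because the plotted value is relative to the reference condition rather than a raw count.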

      • Showing actual p values for the statistical comparisons

      For easier reading, we have kept the stars convention for the figures but added two tables with all statistical tests and the p values for both main figures and EV figures.

      Moreover, for ease of reading the figures (without consulting the legend repeatedly) it would be very helpful to headline individual panels with what the experiment assesses. Figure 1a and 1b for example can't be distinguished at all before reading the figure legend. Also, the y-axis could be more informative on what is measured rather than just giving the unit.

      We have added titles to panels (in particular for Figure 2A,B which correspond to former Figure 1A,B) and we have given new titles to Y axes to make them clearer. We hope that the reading of our figures will now be easier.

      Finally, the figure presentation and description of S1 is very hard to follow. I cannot really make out what is assessed in the different panels.

      We have substantially changed Figure EV1 (now Figure 1) with a new presentation of the data: all 4 conditions (control, treated with DHA, DPA or BA) are systematically presented in the same graph, with clearer titles for the parameter displayed on the Y axes. We hope that this figure is now easier to follow.

      Significance

      *The strength of the manuscript is the use and validation of incorporation of FFA's in the plasma membrane, which more closely mimics the physiological situation than brief application of FFAs as often done. In addition, the blunted recruitment of beta-arrestin as assessed by the ppH protocol is quite intriguing mechanistically. The limitations are the relatively narrow focus on the D2 receptor (and not multiple GPCRs), which does not really speak to or assess the physiological, pathophysiological or therapeutic role of the observations (except for referring to the relation between FFAs and disease). Also, despite the putative role of Beta-arrestin recruitment in the process, the actual causation in the process is not clear. This shortcoming is underscored by the putative effect on the constitutive internalization described above.

      My specific expertise for assessing the paper is within general trafficking processes (including the trafficking methodology applied), trafficking of GPCRs and function of the dopamine system including the role of D2 receptors.*

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The only conclusion that I was able to understand from the study was that enrichment of cell membranes with polyunsaturated fatty acids specifically inhibited agonist-induced internalization of D2 receptors. However, I think that the experiments used to conclude that PUFAs do not alter D2R clustering but reduce the recruitment of β-arrestin2 and D2R endocytosis need some clarification (i.e. data depicted in Fig. 2-5). This lack of clarity might be due to the fact that I am not familiar enough with the employed technologies or to the unclear writing style of the paper. There was an overuse of acronyms, initialisms and abbreviations, which are difficult to understand for researchers outside of the specific lipid field. I think that the manuscript should be written in a way to be legible also for researchers not working in the immediate field.

      The paper was not written in a manner that a general audience of cell biologists or those interested in GPCR biology could understand and judge. It is indeed interesting that polyunsaturated fatty acids specifically inhibit D2R internalization in HEK293 cells, and it could be significant. But, it is difficult to judge the significance of the observation without more in vivo data.

      I would suggest the following. Remove all acronyms and abbreviations. Significantly expand the Materials and Methods section, either in the manuscript or in the Supplemental section. I suggest clearly explaining each construct used, and the function of each module in the construct, with diagrams. In addition, provide a comprehensive step by step description of each experimental protocol, providing the reader with the rationale for each step in the protocol with explanatory diagrams. The authors should also more clearly explain the rationale and logic that was utilized to make the conclusions that they did from the depicted observations. Only then can a broader audience determine if the authors' conclusions are justified.

      We thank the reviewer for his/her comments. Indeed, our main message was that two types of PUFAs (DHA and DPA) specifically alter D2R endocytosis by reducing the recruitment of β-arrestin2 without changing D2R clustering at the plasma membrane. We are sorry that our writing was not clear enough. We also found out that in the last steps of the submission to Review Commons, the first paragraph of the Discussion was inadvertently erased. This made our main conclusions, summarized in this first paragraph, less clear. We have now put back this important paragraph. Moreover, we have extensively rewritten the manuscript, striving to make it as clear as possible to a large audience. We have reduced the use of acronyms to keep only the most used ones [e.g. PUFA (used 99 times), DHA (37 times), GPCR (34 times), D2R (126 times), GRK (17 times)] and made them consistent throughout the manuscript. Following the reviewer's suggestion, we have also added a scheme of the steps following D2R activation by agonist leading to its internalization (Figure EV3).

      We understand that the reviewer means by "in vivo data" results obtained in the brain of animals. As written in the Introduction and in the Discussion, the current work follows up on recently published manuscripts by a subset of the authors, namely (i) Ducrocq et al. 2020 (doi 10.1016/j.cmet.2020.02.012), in which we show that deficits in motivation in animals deprived of ω3-PUFAs can be restored specifically by conditional expression of a fatty acid desaturase from C. elegans (FAT1) that allows restoring PUFA levels specifically in D2R-expressing striatal projection neurons (which mediate the so-called indirect pathway), and (ii) Jobin et al. 2023 (doi: 10.1038/s41380-022-01928-6), which combines in cellulo (HEK 293 cells) and in vivo data to show that PUFAs affect the ligand binding of the dopamine D2 receptor and its signaling in a lipid context that reflects patient lipid profiles regarding poly-unsaturation levels.

      Reviewer #2 (Significance (Required)):

      In summary, I will reiterate that the reported experiments need to be much better explained to make the study understandable to a broader audience and for that audience to determine whether the conclusions are justified.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The authors investigate the role of lipid polyunsaturation in endocytic uptake of the dopamine D2 receptor (D2R). To modulate the degree of unsaturation in live cell plasma membranes, the authors incubate cell lines with pure fatty acid that is metabolized and incorporated into the cellular membranes. To quantify the internalization of D2R in these live cells, the authors utilized quantitative fluorescence assays such as DERET and endosome analysis to determine the degree and rate of D2R internalization in the presence of two model agonists - dopamine and quinpirole. The authors conclude that when the PUFA content of the plasma membrane is increased (i.e., via ω3 or ω6 fatty acids), both the quantity and rate of D2R internalization decrease substantially. The authors confirmed that these phenomena are specific to D2R as caveolar endocytosis and clathrin-mediated endocytosis were unaffected when these same experimental techniques were utilized for β2 adrenergic receptor and transferrin. Additionally, the authors conclude that the clustering ability of D2R is unaffected by lipid unsaturation but that the ability of D2R clusters to interact with β-arrestin2 is inhibited in the presence of excess PUFA. Based on these findings, the authors propose several hypothetical mechanisms for lipid-D2R interactions on the plasma membrane, which will likely be the scope of future work.

      Overall, this is a highly thorough and rigorous body of work that convincingly illustrates the connection between PUFA levels and D2R activity. However, I do not agree with the authors' conclusions pertaining to how their results should be interpreted in the context of fatty acid-related disorders. Additionally, this manuscript could benefit from some reorganization which would present the work more clearly. Please see the comments below.

      We thank the reviewer for the positive appreciation of our work, qualified as a "thorough and rigorous body of work that convincingly illustrates the connection between PUFA levels and D2R activity". We will address the specific points raised by the reviewer with our answers below.

      Comments:

        • A recurring motivation for this study that is brought up by the authors is that dietary deficiency of ω3 fatty acids is tied to D2R dysfunction. This would indicate that PUFA reduction in the plasma membrane results in D2R dysfunction. However, the experiments emphasized in this manuscript investigate the condition where PUFA content is INCREASED in the plasma membrane and D2R function is compromised. It seems inappropriate for the authors to cite dietary deficiency of ω3 as a motivation when they experimentally test a condition that is tied to ω3 surplus.

      Regarding the general comment of the reviewer, we agree that direct conclusions cannot be drawn on the etiology of psychiatric disorders by looking at the effect of membrane fatty acid levels on D2R in HEK 293 cells. Nevertheless, we mention in the Introduction the intriguing occurrence of low PUFA levels in psychiatric disorders as a starting point to look at D2R as an important target for psychoactive drugs prescribed for these disorders. In the Discussion, we propose that manipulating fatty acid levels might potentiate the efficacy of D2R ligands used as treatments. We felt raising these aspects was not putting too much emphasis on psychiatric disorders. However, in accordance with the reviewer's comment, we toned down these descriptions in the revised manuscript.

      The goal of increasing the levels of fatty acids at the membrane in HEK 293, the most widely used cellular system to study GPCR trafficking, was to try to emulate the levels of lipids in brain cells. Indeed, the levels of PUFAs in our culture conditions are much lower (~8 %, Figure 1B) than in brain extracts (~30 %). Therefore, the "control" condition in HEK 293 cells would correspond to PUFA deficiency while after our enrichment protocol these levels are closer to those found in brain cells. Our results could therefore be interpreted as endocytosis of D2R being augmented under membrane PUFA decrease. Importantly, increased receptor internalization often correlates with decreased signaling. Therefore, membrane PUFA enrichment in our conditions would rather potentiate D2R signaling.

      • Following up on the first comment, the authors' results seem to indicate that excess ω3's are detrimental to D2R function. This result would be at odds with the conventional view that ω3's are essential and that excessive ω3 may not be harmful. The authors should rationalize their findings in the context of what is known about excess dietary ω3.

      The Reviewer is right that the conventional view is that excessive ω3 PUFA may not be harmful. However, this rather applies to dietary consumption, which might have a limited effect on brain fatty acid contents since their accretion is highly regulated. Moreover, the majority of studies looking at ω3 supplementation have been performed in young adults and the effects on the developing brain - as it might be happening in pathological conditions in which D2R is involved - remain poorly understood. Furthermore, as mentioned above, blunted internalization of D2R under membrane PUFA enrichment is not an indication of a "detrimental" effect on D2R function. Nor do we argue that membrane enrichment corresponds to excess PUFAs.

      • I would argue that the control experiments with saturated fatty acids (i.e., Behenic Acid in figure 1), represent a scenario mimicking ω3 deficiency as the enrichment of Behenic Acid causes an overall reduction in PUFAs (Figure EV1C - an increase in SFA must correspond to a decrease in PUFA). These Behenic acid results are the only experiments presented by the authors that mimic a scenario resembling ω3 deficiency and the results show that the D2R internalization is unaffected (Figure 1G-H). Therefore, I would further argue that if anything, the authors' results suggest that ω3 deficiency is NOT correlated to D2R internalization. Again, the authors must rationalize these findings in the context of what is known about dietary intake of ω3's.

      The Reviewer is probably referring to the fact that nutrients rich in SFAs are usually poor in PUFAs and vice-versa. Based on our lipidomic analysis, we now present the effect of treatments (DHA, DPA, BA) on the levels of PUFAs (Figure 1B) and saturated fatty acids (Figure 1C). In cells treated with behenic acid (BA), PUFA levels are not significantly changed relative to control, untreated cells, while saturated fatty acid levels are increased. BA was used here to determine whether the effects observed with PUFAs were related to the enrichment in unsaturations or due to carbon chain length (C22). The latter is not the case because BA treatment, unlike DHA or DPA treatment, does not affect D2R endocytosis (Figure 2G,H).

      • It's not clear why the authors decided to include an ω6 fatty acid in this study. The authors built up a detailed rationale for investigating ω3's as they are dietarily essential and tied to disease when deficient. To my knowledge, ω6's are considered much less beneficial than ω3's in a dietary sense. The inclusion of an ω6 almost seems coerced as the ω6-related results don't provide any interesting additional insights. It would benefit the manuscript if the authors provided some additional discussion explaining why ω6's are being investigated in addition to ω3's.

      We agree that we could have made the rationale clearer. The goal in comparing ω3-DHA and ω6-DPA was to assess whether the position of the first unsaturation (n-3 vs n-6), with the same carbon chain length (C22) might differentially impact D2R endocytosis.

      • In Figure EV1D, the DHA and DPA percentages each increase by ~6%. The corresponding Figure EV1B indicates that the overall PUFA% in the plasma membrane also increases by 6%. This makes sense as the total change in PUFA content is consistent with the amount of DHA or DPA being internalized to cells. However, this consistency was not observed with BA and SFAs. In Figure EV1E, the BA percentage increases only ~1% while the total SFA percentage in Figure EV1C increases by ~6%. How can something undergoing a 1% change (relative to total lipid content) result in a 6% overall change in SFA content?

      The reviewer is correct: the level of SFAs is increased by 5.2% (34.5 % of total FAs in control cells to 39.7 % in BA treated cells), more than the increase in BA alone (1.18% from 0.35 % to 1.53 %). A close look at our lipidomics data showed that many of the 10 saturated fatty acids quantified are enhanced. In particular, the two most abundant ones, palmitic acid (16:0) and stearic acid (18:0) are increased, from 21.37 % to 22.28 % and 8.47 % to 11.17%, respectively. The reasons for these apparent discrepancies may involve lipid metabolic pathways which convert the rare and long BA into more common and shorter SFAs to preserve lipid contents and thus membrane properties.
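      The arithmetic behind this answer can be checked with a short sketch that uses only the percentages quoted above. It shows how much of the total SFA increase is accounted for by the three species named in the response; the remainder would have to come from the other quantified saturated fatty acids, whose values are not given here. The variable and function names are illustrative.

```python
# Percentages of total fatty acids quoted in the response: (control, BA-treated).
total_sfa = (34.5, 39.7)
species = {
    "behenic acid (22:0)": (0.35, 1.53),
    "palmitic acid (16:0)": (21.37, 22.28),
    "stearic acid (18:0)": (8.47, 11.17),
}

def change(pair):
    """Absolute change in percentage points, rounded to two decimals."""
    control, treated = pair
    return round(treated - control, 2)

total_change = change(total_sfa)
changes = {name: change(pair) for name, pair in species.items()}
explained = round(sum(changes.values()), 2)
remainder = round(total_change - explained, 2)

for name, delta in changes.items():
    print(f"{name}: +{delta:.2f} points")
print(f"total SFA: +{total_change:.2f} points")
print(f"named species: +{explained:.2f} points, other SFAs: +{remainder:.2f} points")
```

      On these numbers, stearic acid alone contributes more to the SFA increase than behenic acid itself, which is consistent with the interpretation that BA is partly converted into more common SFAs.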

      • In Figure 4, the discussion of kinetics does not make sense. How exactly are kinetics being monitored in this figure? (Recruitment kinetics are discussed in panels D and G)

      We wanted to convey the impression that the time to reach the peak βarr2-mCherry recruitment was shorter in PUFA-treated cells than in control cells. However, after analyzing the kinetics in individual cells, we did not find a statistically significant difference in the time to maximum fluorescence. Therefore, we removed this reference to the kinetics of recruitment.

      We now write: "However, treatment with DHA or DPA significantly decreased peak βarr2-mCherry fluorescence (Figure 5F-G)."

      • In Figure 5, what is the purpose of panel D? Would it be more helpful to include additional, overlaid "cumulative N" plots for scenarios in which PUFAs were enriched? This would work well in conjunction with panel F.

      The purpose of this panel is to show the kinetics of increase in the frequency of endocytic vesicle formation upon agonist addition, and the decrease in frequency when the agonist is removed. We have now added examples of cells treated with DHA and DPA of similar surface for direct comparison with control (EtOH) cells.

      • For the readers who are new to this area or unfamiliar with the assays used, Figure 1 is not intuitive and initially difficult to interpret. It would greatly benefit the flow of the manuscript if Figures EV1A-C and EV2A were included in the main text and "Normalized R" was clearly defined in the main text, prior to discussion of Figure 1.

      We have now transferred Figure EV1 as Figure 1. We have adapted the scheme of the DERET assay and its legend (now in Figure EV1A) to make it clearer. We did not put it in Figure 2 because this figure is already very big. We have changed "Normalized R" to "Ratio 620/520 (% max)" to be clearer and more consistent with the scheme.

      Reviewer #3 (Significance (Required)):

      General assessment: The work, for the most part, is rigorous and scientifically sound. The authors utilize impressive, quantitative assays to expand our understanding of protein-lipid interactions. However, the authors need to improve their discussion of the actual physiological conditions that correspond to their experimental results.

      Advance: This work may fill a gap in our understanding of disorders related to the dopamine D2 receptor. However, some of the results may be at odds with what is currently known/understood about dietary ω3 fatty acids.

      Audience: This work will be of broad interest to researchers in the biophysics field, with particular emphasis on researchers who study protein and membrane biophysics. This work will also be of interest to researchers who study membrane molecular biology.

      Reviewer Expertise: quantitative fluorescence spectroscopy and microscopy; membrane biophysics; protein-lipid interactions

    1. The child may feel shame (they might not be developmentally able to separate their identity from the momentary rejection)

      I think the nuance in meaning of guilt and shame reflects the belief we hold as a society that ppl aren’t inherently bad or good. Moreover, the kind of behavior and action that you do don’t define your moral compass nor does it say anything directly about yourself as an individual. In my own experience I find it very hard to unascribe myself to the criticism that my actions receive. For instance, when I get peer reviews, I know that objectively the comments are directed toward my writing but it’s difficult not to also put yourself under that lens of criticism when you are the one who procured the work.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      General Statements

      We thank all three reviewers for their time and care in reviewing our manuscript, in particular Reviewer 3 for providing a detailed critique that was very useful for planning revisions. We are grateful that all three reviewers indicate that the new genome resources presented in this work are of high-quality and address an existing knowledge gap. We are also grateful for general assessments that the manuscript is 'well-written', and the analyses 'well performed' and 'thorough'.

      We acknowledge Reviewer 3’s legitimate criticism that the assembly and annotation data is not already publicly available and would like to assure the reviewing team that we have been pressing NCBI to progress the submission status since before the preprint was submitted. We regret the delay but hope that we can resolve this issue promptly. Furthermore, as some additional fields in the REAT genome annotation are lost during the NCBI submission process, we will ensure that comprehensive annotation files are also added to Zenodo.

      Reviewer 3 also made the general comment that 'the manuscript could greatly benefit from merging the result and discussion sections' and we would naturally be happy to make this adjustment if the journal in question uses that format.

      Description of the planned revisions

      • We will follow suggestions by Reviewer 3 to improve clarity of two figures:

      Figure S9: Please use a more appropriate colour palette. It is difficult to know the copy number based on the colour gradient.

      Figure 5: Consider changing panel B for a similar version of Fig S12. I think it gives a cleaner and more general perspective of the presence of starship elements.

      • We will address the choice of LOESS versus linear regression for investigating the relationship between candidate secreted effector protein (CSEP) density and transposable element (TE) density, as queried by Reviewer 3:

      Lines 140-144: LOESS smoothing functions are based on local regressions and usually find correlations when there are very weak associations. The authors have to justify the use of this model versus a simpler and more straightforward linear regression. My suspicion is that the latter would fail to find an association. Also, there is no significance of Kendall's Tau estimate (p-value).

      We agree with the reviewer: as we did not find an association with the more sensitive LOESS, we expect that linear regression would also not find an association, supporting our current conclusions. We will add this negative result to the text.
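      The reviewer's request for a significance value for Kendall's tau could be met in several ways. The sketch below shows one of them, a permutation test on Kendall's tau-b, written in plain Python. It is an illustration only and not the analysis code of the study; the per-window densities are invented, and `kendall_tau_b` and `permutation_p_value` are hypothetical helper names.

```python
import random

def kendall_tau_b(x, y):
    """Kendall's tau-b for paired samples (O(n^2), adequate for small n)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both factors
            if dx == 0:
                ties_x += 1  # tied in x only
            elif dy == 0:
                ties_y += 1  # tied in y only
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = ((untied + ties_x) * (untied + ties_y)) ** 0.5
    return (concordant - discordant) / denom if denom else 0.0

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of label shuffles with |tau| >= observed |tau|."""
    rng = random.Random(seed)
    observed = abs(kendall_tau_b(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(kendall_tau_b(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Invented example: TE density and CSEP count in eight genomic windows.
te_density = [0.10, 0.25, 0.05, 0.40, 0.30, 0.15, 0.20, 0.35]
csep_count = [3, 1, 2, 2, 4, 1, 3, 2]
tau = kendall_tau_b(te_density, csep_count)
p = permutation_p_value(te_density, csep_count)
print(f"tau-b = {tau:.3f}, permutation p = {p:.3f}")
```

      A permutation test makes no distributional assumption, which may suit windowed genomic densities, although it does not account for spatial autocorrelation between neighbouring windows.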

      • We will check for other features associated with the distribution of CSEPs, as queried by Reviewer 3:

      Lines 157-163: Was there any other feature associated with the CSEP enrichment? GC content? Repetitive content? Centromere likely localisation?

      • We will integrate TE variation into the PERMANOVA lifestyle testing, as suggested by Reviewer 3:

      Line 186: Why not to test the variation content of TEs as a factor for the PERMANOVA?

      In reviewing this suggestion, we also spotted an error in our data plotting code, and the PERMANOVA lifestyle result for all genes will be corrected from 17% to 15% in Fig. 4a. Correcting this error does not impact our ultimate results or interpretation.
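      For readers unfamiliar with the method, the percentage reported for a PERMANOVA factor is usually the share of total variation in a distance matrix that is attributed to that factor (R²). The sketch below is a minimal one-way PERMANOVA in plain Python, assuming that reading of the 17% and 15% figures. It is not the code used in the study, and the distance matrix and group labels are invented.

```python
import random
from itertools import combinations

def permanova_stats(dist, labels):
    """Return (R2, pseudo-F) for a one-way design from a square distance matrix."""
    n = len(labels)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    # Within-group sum of squares, one term per group.
    ss_within = sum(
        sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    ss_among = ss_total - ss_within
    a = len(groups)
    pseudo_f = (ss_among / (a - 1)) / (ss_within / (n - a))
    return ss_among / ss_total, pseudo_f

def permanova(dist, labels, n_perm=999, seed=0):
    """R2, pseudo-F and a permutation p-value obtained by shuffling labels."""
    rng = random.Random(seed)
    r2, f_obs = permanova_stats(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if permanova_stats(dist, shuffled)[1] >= f_obs:
            hits += 1
    return r2, f_obs, (hits + 1) / (n_perm + 1)

# Invented example: six strains placed on one axis, two hypothetical lifestyles.
values = [0, 1, 2, 10, 11, 12]
lifestyle = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(u - v) for v in values] for u in values]
r2, f_stat, p_value = permanova(dist, lifestyle)
print(f"R2 = {r2:.3f}, pseudo-F = {f_stat:.1f}, p = {p_value:.3f}")
```

      Adding TE content as a factor, as suggested by the reviewer, would amount to running the same partition with TE-based groupings or covariates in place of, or alongside, the lifestyle labels.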

      • To complement the current graphical-based assessment of approximate data normality, we will include additional tests (Shapiro-Wilk for sample sizes

      Line 743: Q-Q plots are not a formal statistical test for normality.

      • One of the main critiques from Reviewer 3 was that, although we already acknowledged low sample sizes being a limitation of this work, the manuscript could benefit from reframing with greater consideration of this factor. They also highlighted a few specific places in the text that could be rephrased in consideration of this:

      Line 267: "Multiple strains" can be misleading about the magnitude.

      Lines 305-307: The fact that there is significant copy number variation between the two GtA strains suggests that the variation in the GtA lineage has not been fully captured and that there may be an unsampled substructure. Although the authors acknowledge the need for pangenomic references, they should recognize this limitation in the sample size of their own study, especially when expressing its size as "multiple strains" (line 267).

      Lines 314-317: Again, the sample size is still very small and likely not representative. It suggests UNSAMPLED substructure even for the UK populations.

      Line 164 (and whole section): I would invite the authors to cautiously revisit the use of the terms "core", "soft core". The sample size is very low, as they themselves acknowledge, and probably not representative of the diversity of Gaeumannomyces.

      We intend to edit the text to address this, including removal of both text and figure references to ‘soft-core’ genes, as we agree the term is likely not meaningful in this case, and removing it has no bearing on the results or interpretation.

      Description of the revisions that have already been incorporated in the transferred manuscript

      • We have amended the text in a number of places for clarity/fluency as suggested by Reviewer 3:

      ii) There need to be an explicit conclusion about the differences between pathogenic Gt and non-pathogenic Gh. Somehow, this is not entirely clear and is probably only a matter of rephrasing.

      Please see new lines 477-478: ‘Regarding differences between pathogenic Gt and non-pathogenic Gh, we found that Gh has a larger overall genome size and greater number of genes.’

      Lines 309-314: The message seems a bit out of context in the paragraph.

      This is valid, these lines have now been removed.

      Lines 392-395: The idea that crop pathogenic fungi are under pressure that favours heterothallism does not take into account the multiple cases of successful pathogenic clonal lineages in which sexual reproduction is absent. This paragraph seems very speculative to me. Please rephrase it.

      Our intention here was the exact reverse, that crop pathogens are under pressure to favour homothallism (as Reviewer 3 points out, anecdotally this often seems to play out in nature). We have rephrased lines 386-390 to hopefully make our stance more explicit: 'Together, this could suggest a selective pressure towards homothallism for crop fungal pathogens, and a switch from heterothallism in Gh to homothallism in Gt and Ga may, therefore, have been a key innovation underlying lifestyle divergence between non-pathogenic Gh and pathogenic Gt and Ga.'

      Lines 463-464: Please refer to the analyses when discussing the genetic divergence.

      We have rephrased this sentence to make our intended point clearer, please see new lines 459-461: ‘If we compare Ga and Gt in terms of synteny, genome size and gene content, the magnitude of differences does not appear to be more pronounced than those between GtA and GtB.’

      • We have also fixed the following typographic errors highlighted by Reviewer 3:

      Line 399: You mean, Fig 4C?

      Line 722: You missed "trimAI"

      Lines 723-727: Missing citations for "AMAS" and RAxML-NG, "AHDR" and "OrthoFinder"

      • We have added genome-wide RIP estimates to Supplementary Table S1 as requested by Reviewer 3:

      Lines 416-422: Please provide the data related to the genome-wide estimates of RIP.

      • We have added a note clarifying that differences in overall genome size between lineages are not fully explained by differences in gene copy-number (lines 406-408: 'We should note that the total length of HCN genes was not sufficiently large to account for the overall greater genome size of GtB compared to GtA (Supplemental Table S1).') in response to a comment from Reviewer 3:

      Line 396: The difference in duplicated genes raises the question of whether there are differences in overall genome size between lineages and, if so, whether they can be explained by the presence of genes.

      • We have made an alteration to the author order and added equal second-author contributions.

      Description of analyses that authors prefer not to carry out

      • In response to our analysis regarding the absence of TE-effector compartmentalisation in this system, Reviewer 1 requested additional analyses:

      While TE enrichment is typically associated with accessory compartments, it is not a defining feature. To bolster the authors' claim, it is essential to demonstrate that there is no bias in the ratio of conserved and non-conserved genes across the genomes.

      We believe that there are two slightly different compartmentalisation concepts being somewhat conflated here – (1) the idea of compartments where TEs and virulence proteins such as effectors are significantly colocalised in comparison with the rest of the genome, and (2) the idea of compartments containing gene content that is not shared in all strains (i.e. accessory). The two may overlap – as Reviewer 2 states, accessory compartments may also be enriched with TEs – but not necessarily. We specifically address the first concept in our text, and we appreciate Reviewer 3’s response on this subject:

      There is a clear answer for the compartmentalisation question. The authors favour the idea of "one-compartment" with compelling analyses.

      We believe that the second concept of accessory compartments is shown to be irrelevant in this case from our GENESPACE results (see Fig. 2), which demonstrate that gene content is conserved, broadly syntenic even, across strains, with no clear evidence of accessory compartments or chromosomes regarding gene content. We have already acknowledged that other mechanisms of compartmentalisation beyond TE-effector colocalisation may be at play (as seen from our exploration of effector distributions biased towards telomeres, see section from line 156: ‘Although CSEPs were not broadly colocalised with TEs, we did observe that they appeared to be non-randomly distributed in some pseudochromosomes (Fig. 3a)…’).

      • Reviewer 1 questioned the statement that higher level of genome-wide RIP is consistent with lower levels of gene duplication:

      L422: Is the highest RIP rate in GtA consistent with its low levels of gene duplication? Does this suggest that duplicated sequences in GtA are no longer recognizable due to RIP mutations? This seems counterintuitive, as RIP is primarily triggered by gene duplication.

      Our understanding is that, while RIP can directly mutate coding regions, it predominantly acts on duplicated sequences within repetitive regions such as TEs (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02060-w), which has a knock-on effect of reducing TE-mediated gene duplication. In Neurospora crassa, where RIP was first discovered and thus the model species for much of our understanding of the process, a low number of gene duplicates has been linked to the activity of RIP (https://www.nature.com/articles/nature01554). We therefore believe the current text is reasonable.

      • Reviewer 2 stated that experimental validation of gene function is required to make clear links to lifestyle or pathogenicity:

      In my eyes, the study has two main limitations. First of all, the research only concerns genomics analyses, and therefore is rather descriptive and observational, and as such does not provide further mechanistic details into the pathogen biology and/or into pathogenesis. This is further enhanced by the lack of clear observations that discriminate particular species/lineages or life styles from others in the study. Some observations are made with respect to variations in candidate secreted effector proteins and biosynthetic gene clusters, but clear links to life style or pathogenicity are missing. To further substantiate such links, lab-based experimental work would be required.

      We agree that, in an ideal world, supporting wet-lab experimental evidence of gene function would be included. Unfortunately, a transformation method has not yet been successfully developed for this system (see lines 33-35: ‘There have also been considerable difficulties in producing a reliable transformation system for Gt, preventing gene disruption experiments to elucidate function (Freeman and Ward 2004).’), and not for lack of trying: after 18 months of effort using all available transformation techniques and selectable markers, neither Gt nor Gh was transformable. Undertaking that challenge has proven to be far beyond the scope of this paper, the purpose of which was to generate and analyse high-quality genomic data, a major task in itself. We again appreciate Reviewer 3’s response to this point, agreeing that it is out of scope for this work:

      I just want to respectfully disagree with reviewer #2 about the need for more experimental laboratory work, as in my opinion it clearly goes beyond the intention and scope of the submitted work. This could be a limitation that would depend on the chosen journal and its specific format and requirements. Finally, I think it would suffice for the authors to discuss on the lack of in-depth experimental work as part of the limitations of their overall approach.

      As per the suggestion by Reviewer 3, we will add text to address the absence of in-depth experimental work within the scope of this study.

      • Reviewer 3 suggested we might 'consider including formal population differentiation estimators', however, as they previously highlighted above, our sample sizes are too small to produce reliable population-level statistics.

      • Reviewer 3 raised the disparity in the appearance of branches at the root of phylogenetic trees in various figures:

      Figure 4a (and Figs S5, S13): The depicted tree has a trichotomy at the basal node. Please correct it so Magnaporthiopsis poae is resolved as an outgroup, as in Fig. S17.

      All the trees were rooted with M. poae as the outgroup. Although it may seem counterintuitive, a trifurcation at the root is the correct outcome when rerooting a bifurcating tree; please see this discussion, which includes the developers of the two leading phylogeny visualisation tools, ggtree and phytools (https://www.biostars.org/p/332030/). Although it is possible to force a bifurcating tree after rooting by positioning the root along an edge, the resulting branch lengths in the tree can be misleading, and so in cases where we wanted to include meaningful branch lengths in the figure (i.e. estimated from DNA substitution rates, in Figures 4a, S5 and S13) we have not circumvented the trifurcation. In Fig S17 meaningful branch lengths have not been included and the tree only represents the topology, resulting in the appearance of bifurcation at the root.
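
      The rooting behaviour described above can be illustrated with a minimal sketch. It is not taken from the manuscript or from ggtree/phytools; the tree, the taxon labels and the helper `root_at` are hypothetical. In an unrooted bifurcating tree every internal node has three neighbours, so orienting the tree from the internal node to which the outgroup attaches leaves that node with three descendants.

```python
def root_at(adjacency, root):
    """Orient an unrooted tree away from `root`; return {node: [children]}."""
    children = {}
    stack = [(root, None)]
    while stack:
        node, parent = stack.pop()
        # Every neighbour except the node we arrived from becomes a child.
        children[node] = [nb for nb in adjacency[node] if nb != parent]
        for child in children[node]:
            stack.append((child, node))
    return children


# Hypothetical unrooted bifurcating tree with four tips.
# Internal nodes n1 and n2 each have exactly three neighbours.
adjacency = {
    "tipA": ["n1"],
    "tipB": ["n1"],
    "tipC": ["n2"],
    "outgroup": ["n2"],
    "n1": ["tipA", "tipB", "n2"],
    "n2": ["n1", "tipC", "outgroup"],
}

# Root at the internal node to which the outgroup attaches.
rooted = root_at(adjacency, "n2")
root_degree = len(rooted["n2"])        # three descendants: a basal trifurcation
inner_degree = len(rooted["n1"])       # other internal nodes stay bifurcating
```

      Forcing a bifurcating root instead would mean inserting a new node somewhere along the outgroup edge, which requires an arbitrary choice of how to split that branch length.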

      • Reviewer 3 suggested that the discussion on giant Starship TEs resembled more of a review:

      Lines 434-451: This section resembles more a review than a discussion of the results of the present work. This also highlights the lack of analysis on the genetic composition and putative function of the identified starship-like elements.

      The reviewer has a valid point. However, Starships are a recently discovered and thus underexplored genetic feature that readers from the wider mycology/plant pathology community may not yet be aware of. We believe it is warranted to include some additional exposition to give context for why their discovery here is novel, interesting and unexpected. We are naturally keen to investigate the make-up of the elements we have found in this lineage; however, that will require a substantial amount of further work. Analysis of Starships is not trivial: for example, the starfish tool is still under development and only a limited number of species have been used to train it. How best to compare elements is also an active area of investigation – they are dynamic in their structure and may include genes originating from the host genome or a previous host – and for this reason we believe it is out of scope to interrogate them alongside the other foundational genomic data presented in this paper.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript "Evolutionary genomics reveals variation in structure and genetic content implicated in virulence and lifestyle in the genus Gaeumannomyces" by Rowena Hill and collaborators is a thorough, well-planned and designed work. They have described 9 almost complete new assemblages, from their most general characteristics to their genetic content and implications. I am very pleased with the quality and completeness of this work and agree that it provides a very useful resource and framework for further research on this important organism.

      The three main motivations of the present study were:

      1) Are there genomic signatures distinguishing Gt A/B virulence lineages?;

      2) How do gene repertoires differ between pathogenic Gt and non-pathogenic Gh?; and

      3) Is there evidence of genome compartmentalisation in Gaeumannomyces?

      a) The authors themselves recognise the low number of samples in their work (Lines 453-454) and this limitation hampers the establishment of a clear association between lineage-specific virulence and genomic signatures. I would argue that the present work needs to be reframed factoring this main limitation.

      b) There needs to be an explicit conclusion about the differences between pathogenic Gt and non-pathogenic Gh. Somehow, this is not entirely clear and is probably only a matter of rephrasing.

      c) There is a clear answer for the compartmentalisation question. The authors favour the idea of "one-compartment" with compelling analyses.

      Major comments:

      The authors have not published the genomic data. Therefore, it is impossible to audit the quality of the assemblies and impedes its reproducibility. It is also bad practice by current scientific standards.

      I strongly believe that the manuscript could greatly benefit from merging the result and discussion sections. It would be easier for the reader to follow their entire logic. This is of course something optional and contingent on the journal format.

      Minor and specific comments:

      RESULTS

      • Lines 140-144: LOESS smoothing functions are based on local regressions and usually find correlations when there are very weak associations. The authors have to justify the use of this model versus a simpler and more straightforward linear regression. My suspicion is that the latter would fail to find an association. Also, there is no significance of Kendall's Tau estimate (p-value).

      • Lines 157-163: Was there any other feature associated with the CSEP enrichment? GC content? Repetitive content? Centromere likely localisation?

      • Line 164 (and whole section): I would invite the authors to cautiously revisit the use of the terms "core", "soft core". The sample size is very low, as they themselves acknowledge, and probably not representative of the diversity of Gaeumannomyces.

      • Figure 4a (and Figs S5, S13): The depicted tree has a trichotomy at the basal node. Please correct it so Magnaporthiopsis poae is resolved as an outgroup, as in Fig. S17.

      • Line 186: Why not to test the variation content of TEs as a factor for the PERMANOVA?

      • Figure S9: Please use a more appropriate colour palette. It is difficult to know the copy number based on the colour gradient.

      • Figure 5: Consider changing panel B for a similar version of Fig S12. I think it gives a cleaner and more general perspective of the presence of starship elements.

      DISCUSSION

      • Line 267: "Multiple strains" can be misleading about the magnitude.

      • Lines 305-307: The fact that there is significant copy number variation between the two GtA strains suggests that the variation in the GtA lineage has not been fully captured and that there may be an unsampled substructure. Although the authors acknowledge the need for pangenomic references, they should recognize this limitation in the sample size of their own study, especially when expressing its size as "multiple strains" (line 267).

      • Lines 309-314: The message seems a bit out of context in the paragraph.

      • Lines 314-317: Again, the sample size is still very small and likely not representative. It suggests UNSAMPLED substructure even for the UK populations.

      • Lines 392-395: The idea that crop pathogenic fungi are under pressure that favours heterothallism does not take into account the multiple cases of successful pathogenic clonal lineages in which sexual reproduction is absent. This paragraph seems very speculative to me. Please rephrase it.

      • Line 396: The difference in duplicated genes raises the question of whether there are differences in overall genome size between lineages and, if so, whether they can be explained by the presence of genes.

      • Line 399: You mean, Fig 4C?

      • Lines 416-422: Please provide the data related to the genome-wide estimates of RIP.

      • Lines 434-451: This section resembles more a review than a discussion of the results of the present work. This also highlights the lack of analysis on the genetic composition and putative function of the identified starship-like elements.

      • Lines 463-464: Please refer to the analyses when discussing the genetic divergence. Consider including formal population differentiation estimators.

      METHODS

      • Line 722: You missed "trimAI"

      • Lines 723-727: Missing citations for "AMAS" and RAxML-NG, "AHDR" and "OrthoFinder"

      • Line 743: Q-Q plots are not a formal statistical test for normality.

      Referees cross-commenting

      I agree with my peer reviewers and appreciate that we have shared common concerns and suggestions. I also agree with their comments.

      I just want to respectfully disagree with reviewer #2 about the need for more experimental laboratory work, as in my opinion it clearly goes beyond the intention and scope of the submitted work. This could be a limitation that would depend on the chosen journal and its specific format and requirements. Finally, I think it would suffice for the authors to discuss on the lack of in-depth experimental work as part of the limitations of their overall approach.

      Significance

      The work presented by Hill and co-workers contributes to the understanding of the genetic basis of host-pathogen interactions and evolutionary dynamics in the important fungus responsible for wheat "take-all-disease", Gaeumannomyces tritici. By providing 9 new near-complete assemblages, this work will provide a valuable resource for research on this agronomically important organism. This work sets the stage for developing a global pangenome of G. tritici that can expand analyses of its population structure and specific genetic elements that are associated with its virulence.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the referees for their careful reading of the manuscript and their valuable suggestions for improvements.

      General Statements:

      Existing SMC-based loop extrusion models successfully predict and characterize mesoscale genome spatial organization in vertebrate organisms, providing a valuable computational tool to the genome organization and chromatin biology fields. However, to date this approach is highly limited in its application beyond vertebrate organisms. This limitation arises because existing models require knowledge of CTCF binding sites, which act as effective boundary elements, blocking loop-extruding SMC complexes and thus defining TAD boundaries. However, CTCF is the predominant boundary element only in vertebrates. On the other hand, vertebrates only contain a small proportion of species in the tree of life, while TADs are nearly universal and SMC complexes are largely conserved. Thus, there is a pressing need for loop extrusion models capable of predicting Hi-C maps in organisms beyond vertebrates.

      The conserved-current loop extrusion (CCLE) model, introduced in this manuscript, extends the quantitative application of loop extrusion models in principle to any organism by liberating the model from the lack of knowledge regarding the identities and functions of specific boundary elements. By converting the genomic distribution of loop extruding cohesin into an ensemble of dynamic loop configurations via a physics-based approach, CCLE outputs three-dimensional (3D) chromatin spatial configurations that can be manifested in simulated Hi-C maps. We demonstrate that CCLE-generated maps well describe experimental Hi-C data at the TAD-scale. Importantly, CCLE achieves high accuracy by considering cohesin-dependent loop extrusion alone, consequently both validating the loop extrusion model in general (as opposed to diffusion-capture-like models proposed as alternatives to loop extrusion) and providing evidence that cohesin-dependent loop extrusion plays a dominant role in shaping chromatin organization beyond vertebrates.

      The success of CCLE unambiguously demonstrates that knowledge of the cohesin distribution is sufficient to reconstruct TAD-scale 3D chromatin organization. Further, CCLE signifies a shifted paradigm from the concept of localized, well-defined boundary elements, manifested in the existing CTCF-based loop extrusion models, to a concept also encompassing a continuous distribution of position-dependent loop extrusion rates. This new paradigm offers greater flexibility in recapitulating diverse features in Hi-C data than strictly localized loop extrusion barriers.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript presents a mathematical model for loop extrusion called the conserved-current loop extrusion model (CCLE). The model uses cohesin ChIP-Seq data to predict the Hi-C map and shows broad agreement between experimental Hi-C maps and simulated Hi-C maps. They test the model on Hi-C data from interphase fission yeast and meiotic budding yeast. The conclusion drawn by the authors is that peaks of cohesin represent loop boundaries in these situations, which they also propose extends to other organism/situations where Ctcf is absent.

      Response:

      We would like to point out that the referee's interpretation of our results, namely that, "The conclusion drawn by the authors is that peaks of cohesin represent loop boundaries in these situations, ...", is an oversimplification, that we do not subscribe to. The referee's interpretation of our model is correct when there are strong, localized barriers to loop extrusion; however, the CCLE model allows for loop extrusion rates that are position-dependent and take on a range of values. The CCLE model also allows the loop extrusion model to be applied to organisms without known boundary elements. Thus, the strict interpretation of the positions of cohesin peaks to be loop boundaries overlooks a key idea to emerge from the CCLE model.

      Major comments:

      1. More recent micro-C/Hi-C maps, particularly for budding yeast mitotic cells and meiotic cells, show clear puncta, representative of anchored loops, which are not well recapitulated in the simulated data from this study. However, such puncta are cohesin-dependent as they disappear in the absence of cohesin and are enhanced in the absence of the cohesin release factor, Wapl. For example, see the two studies below. The model is therefore missing some key elements of the loop organisation. How do the authors explain this discrepancy? It would also be very useful to test whether the model can predict the increased strength of loop anchors when Wapl1 is removed and cohesin levels increase.

      Costantino L, Hsieh TS, Lamothe R, Darzacq X, Koshland D. Cohesin residency determines chromatin loop patterns. Elife. 2020 Nov 10;9:e59889. doi: 10.7554/eLife.59889. PMID: 33170773; PMCID: PMC7655110.

      Barton RE, Massari LF, Robertson D, Marston AL. Eco1-dependent cohesin acetylation anchors chromatin loops and cohesion to define functional meiotic chromosome domains. Elife. 2022 Feb 1;11:e74447. doi: 10.7554/eLife.74447. Epub ahead of print. PMID: 35103590; PMCID: PMC8856730.

      Response:

      We are perplexed by this referee comment. While we agree that puncta representing loop anchors are a feature of Hi-C maps, as noted by the referee, we would reinforce that our CCLE simulations of meiotic budding yeast (Figs. 5A and 5B of the original manuscript) demonstrate an overall excellent description of the experimental meiotic budding yeast Hi-C map, including puncta arising from loop anchors. This CCLE model-experiment agreement for meiotic budding yeast is described and discussed in detail in the original manuscript and the revised manuscript (lines 336-401).

      To further emphasize and extend this point, we now also address the Hi-C of mitotic budding yeast, which was not included in the original manuscript. We have now added an entire new section of the revised manuscript entitled "CCLE Describes TADs and Loop Configurations in Mitotic S. cerevisiae" including the new Figure 6, which presents a comparison between a portion of the mitotic budding yeast Hi-C map from Costantino et al. and the corresponding CCLE simulation at 500 bp-resolution. In this case too, the CCLE model describes the data well, including the puncta, further addressing the referee's concern that the CCLE model is missing some key elements of loop organization.

      Concerning the referee's specific comment about the role of Wapl, we note that in order to apply CCLE when Wapl is removed, the corresponding cohesin ChIP-seq in the absence of Wapl should be available. To our knowledge, such data is not currently available and therefore we have not pursued this explicitly. However, we would reinforce that as Wapl is a factor that promotes cohesin unloading, its role is already effectively represented in the optimized value for LEF processivity, which encompasses LEF lifetime. In other words, if Wapl has a substantial effect it will be captured already in this model parameter.

      2. Related to the point above, the simulated data has much higher resolution than the experimental data (1kb vs 10kb in the fission yeast dataset). Given that loop size is in the 20-30kb range, a good resolution is important to see the structural features of the chromosomes. Can the model observe these details that are averaged out when the resolution is increased?

      Response:

      We agree with the referee that higher resolution is preferable to low resolution. In practice, however, there is a trade-off between resolution and noise. The first experimental interphase fission yeast Hi-C data of Mizuguchi et al 2014 corresponds to 10 kb resolution. To compare our CCLE simulations to these published experimental data, as described in the original manuscript, we bin our 1-kb-resolution simulations to match the 10 kb experimental measurements. Nevertheless, CCLE can readily predict the interphase fission yeast Hi-C map at higher resolution by reducing the bin size (or, if necessary, reducing the lattice site size of the simulations themselves). In the revised manuscript, we have added comparisons between CCLE's predicted Hi-C maps and newer Micro-C data for S. pombe from Hsieh et al. (Ref. [50]) in the new Supplementary Figures 5-9. We have chosen to present these comparisons at 2 kb resolution, which is the same resolution for our meiotic budding yeast comparisons. Also included in Supplementary Figures 5-9 are comparisons between the original Hi-C maps of Mizuguchi et al. and the newer maps of Hsieh et al., binned to 10 kb resolution. Inspection of these figures shows that CCLE provides a good description of Hsieh et al.'s experimental Hi-C maps and does not reveal any major new features in the interphase fission yeast Hi-C map on the 10-100 kb scale, that were not already apparent from the Hi-C maps of Mizuguchi et al 2014. Thus, the CCLE model performs well across this range of effective resolutions.
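
      The binning step mentioned above (coarse-graining a higher-resolution simulated map to match a lower-resolution experimental one) can be sketched as follows. This is only a minimal illustration, not the code used for the manuscript: the function name `bin_contact_map`, the toy matrix, and the choice to sum (rather than average) the contacts within each block are all assumptions.

```python
def bin_contact_map(contacts, factor):
    """Coarse-grain a square contact map by summing factor x factor blocks."""
    n = len(contacts)
    if n % factor != 0:
        raise ValueError("map size must be a multiple of the binning factor")
    m = n // factor
    binned = [[0.0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            # Each fine-grained pixel contributes to exactly one coarse pixel.
            binned[i // factor][j // factor] += contacts[i][j]
    return binned


# Toy symmetric 4 x 4 map standing in for a "1 kb" simulation,
# binned by a factor of 2 to a "2 kb" map.
toy_map = [
    [4, 2, 1, 0],
    [2, 4, 2, 1],
    [1, 2, 4, 2],
    [0, 1, 2, 4],
]
coarse_map = bin_contact_map(toy_map, 2)
```

      Binning 1 kb simulations to 10 kb would correspond to `factor=10` in this sketch; larger bins pool more contacts per pixel, which is one way to see the trade-off between resolution and noise.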

      3. Transcription, particularly convergent transcription, has been proposed to confer boundaries to loop extrusion. Can the authors recapitulate this in their model?

      Response:

      In response to the suggestion of the reviewer we have now calculated the correlation between cohesin ChIP-seq and the locations of convergent gene pairs, which is now presented in Supplementary Figures 17 and 18. Accordingly, in the revised manuscript, we have added the following text to the Discussion (lines 482-498):

      "In vertebrates, CTCF defines the locations of most TAD boundaries. It is interesting to ask what might play that role in interphase S. pombe as well as in meiotic and mitotic S. cerevisiae. A number of papers have suggested that convergent gene pairs are correlated with cohesin ChIP-seq in both S. pombe [65, 66] and S. cerevisiae [66-71]. Because CCLE ties TADs to cohesin ChIP-seq, a strong correlation between cohesin ChIP-seq and convergent gene pairs would be an important clue to the mechanism of TAD formation in yeasts. To investigate this correlation, we introduce a convergent-gene variable that has a nonzero value between convergent genes and an integrated weight of unity for each convergent gene pair. Supplementary Figure 17A shows the convergent gene variable, so-defined, alongside the corresponding cohesin ChIP-seq for meiotic and mitotic S. cerevisiae. It is apparent from this figure that a peak in the ChIP-seq data is accompanied by a non-zero value of the convergent-gene variable in about 80% of cases, suggesting that chromatin looping in meiotic and mitotic S. cerevisiae may indeed be tied to convergent genes. Conversely, about 50% of convergent genes match peaks in cohesin ChIP-seq. The cross-correlation between the convergent-gene variable and the ChIP-seq of meiotic and mitotic S. cerevisiae is quantified in Supplementary Figures 17B and C. By contrast, in interphase S. pombe, cross-correlation between convergent genes and cohesin ChIP-seq in each of five considered regions is unobservably small (Supplementary Figure 18A), suggesting that convergent genes per se do not have a role in defining TAD boundaries in interphase S. pombe."

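      One way to construct a convergent-gene variable of the kind quoted above is sketched below. This is a hedged illustration rather than the authors' implementation: the function names, the gene tuple format, the binning onto a lattice, the even spreading of each pair's unit weight, and the toy ChIP-seq values are all assumptions. The manuscript also reports a cross-correlation; only the zero-lag Pearson correlation is shown here.

```python
def convergent_gene_track(genes, n_bins, bin_size):
    """Per-bin track with a total weight of one for each convergent gene pair.

    genes: list of (start_bp, end_bp, strand) tuples, strand '+' or '-'.
    """
    track = [0.0] * n_bins
    ordered = sorted(genes)
    for left, right in zip(ordered, ordered[1:]):
        # Convergent pair: left gene on '+', right neighbour on '-'.
        if left[2] == "+" and right[2] == "-":
            first = min(left[1] // bin_size, n_bins - 1)
            last = max(min(right[0] // bin_size, n_bins - 1), first)
            weight = 1.0 / (last - first + 1)  # spread unit weight evenly
            for b in range(first, last + 1):
                track[b] += weight
    return track


def pearson(x, y):
    """Zero-lag Pearson correlation between two equal-length tracks."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


# Hypothetical genes: one convergent pair, then a tandem and a divergent pair.
genes = [(0, 900, "+"), (2100, 3000, "-"), (4000, 5000, "-"), (6000, 7000, "+")]
track = convergent_gene_track(genes, n_bins=8, bin_size=1000)

# Made-up ChIP-seq signal that is high where the convergent pair sits.
toy_chip = [5, 6, 7, 1, 1, 1, 1, 1]
r = pearson(track, toy_chip)
```

      In this toy example the single convergent pair contributes a total weight of one spread over bins 0-2, and the correlation with the made-up ChIP-seq track is strongly positive by construction.
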
      Minor comments:

      1. In the discussion, the authors cite the fact that Mis4 binding sites do not give good prediction of the Hi-C maps as evidence that Mis4 is not important for loop extrusion. This can only be true if the position of Mis4 measured by ChIP is a true reflection of Mis4 position. However, Mis4 binding to cohesin/chromatin is very dynamic and it is likely that this is too short a time scale to be efficiently cross-linked for ChIP. Conversely, extensive experimental data in vivo and in vitro suggest that stimulation of cohesin's ATPase by Mis4-Ssl3 is important for loop extrusion activity.

      Response:

      We apologize for the confusion on this point. We actually intended to convey that the absence of Mis4-Psc3 correlations in S. pombe suggests, from the point of view of CCLE, that Mis4 is not an integral component of loop-extruding cohesin, during the loop extrusion process itself. We agree completely that Mis4/Ssl3 is surely important for cohesin loading, and (given that cohesin is required for loop extrusion) Mis4/Ssl3 is therefore important for loop extrusion. Evidently, this part of our Discussion was lacking sufficient clarity. In response to both referees' comments, we have re-written the discussion of Mis4 and Pds5 to more carefully explain our reasoning and be more circumspect in our inferences. The re-written discussion is described below in response to Referee #2's comments.

      Nevertheless, on the topic of whether Nipbl-cohesin binding is too transient to be detected in ChIP-seq, the FRAP analysis presented by Rhodes et al. eLife 6:e30000 "Scc2/Nipbl hops between chromosomal cohesin rings after loading" indicates that, in HeLa cells, Nipbl has a residence time bound to cohesin of about 50 seconds. As shown in the bottom panel of Supplementary Fig. 7 in the original manuscript (and the bottom panel of Supplementary Fig. 20 in the revised manuscript), there is a significant cross-correlation (~0.2) between the Nipbl ChIP-seq and Smc1 ChIP-seq in humans, indicating that a transient association between Nipbl and cohesin can be (and in fact is) detected by ChIP-seq.

      2. Inclusion of a comparison of this model with previous models (for example, bottom-up models) would be extremely useful. What is the improvement of this model over existing models?

      Response:

      As stated in the original manuscript, as far as we are aware, "bottom up" models, that quantitatively describe the Hi-C maps of interphase fission yeast or meiotic budding yeast or, indeed, of eukaryotes other than vertebrates, do not exist. Bottom-up models would require knowledge of the relevant boundary elements (e.g. CTCF sites), which, as stated in the submitted manuscript, are generally unknown for fission yeast, budding yeast, and other non-vertebrate eukaryotes. The absence of such models is the reason that CCLE fills an important need. Since bottom-up models for cohesin loop extrusion in yeast do not exist, we cannot compare CCLE to the results of such models.

      In the revised manuscript we now explicitly compare the CCLE model to the only bottom-up type of model describing the Hi-C maps of non-vertebrate eukaryotes by Schalbetter et al. Nat. Commun. 10:4795 2019, which we did cite extensively in our original manuscript. Schalbetter et al. use cohesin ChIP-seq peaks to define the positions of loop extrusion barriers in meiotic S. cerevisiae, for which the relevant boundary elements are unknown. In their model, specifically, when a loop-extruding cohesin anchor encounters such a boundary element, it either passes through with a certain probability, as if no boundary element is present, or stops extruding completely until the cohesin unbinds and rebinds.

      In the revised manuscript we refer to this model as the "explicit barrier" model and have applied it to interphase S. pombe, using cohesin ChIP-seq peaks to define the positions of loop extrusion barriers. The corresponding simulated Hi-C map is presented in Supplementary Fig. 19 in comparison with the experimental Hi-C. It is evident that the explicit barrier model provides a poorer description of the Hi-C data of interphase S. pombe compared to the CCLE model, as indicated by the MPR and Pearson correlation scores. While the explicit barrier model appears capable of accurately reproducing Hi-C data with punctate patterns, typically accompanied by strong peaks in the corresponding cohesin ChIP-seq, it seems less effective in several conditions including interphase S. pombe, where the Hi-C data lacks punctate patterns and sharp TAD boundaries, and the corresponding cohesin ChIP-seq shows low-contrast peaks. The success of the CCLE model in describing the Hi-C data of both S. pombe and S. cerevisiae, which exhibit very different features, suggests that the current paradigm of localized, well-defined boundary elements may not be the only approach to understanding loop extrusion. By contrast, CCLE allows for a concept of continuous distribution of position-dependent loop extrusion rates, arising from the aggregate effect of multiple interactions between loop extrusion complexes and chromatin. This paradigm offers greater flexibility in recapitulating diverse features in Hi-C data than strictly localized loop extrusion barriers.

      We have also added the following paragraph in the Discussion section of the manuscript to elaborate on this point (lines 499-521):

      "Although 'bottom-up' models which incorporate explicit boundary elements do not exist for non-vertebrate eukaryotes, one may wonder how well such LEF models, if properly modified and applied, would perform in describing Hi-C maps with diverse features. To this end, we examined the performance of the model described in Ref. [49] in describing the Hi-C map of interphase S. cerevisiae. Reference [49] uses cohesin ChIP-seq peaks in meiotic S. cerevisiae to define the positions of loop extrusion barriers which either completely stall an encountering LEF anchor with a certain probability or let it pass. We apply this 'explicit barrier' model to interphase S. pombe, using its cohesin ChIP-seq peaks to define the positions of loop extrusion barriers, and using Ref. [49]'s best-fit value of 0.05 for the pass-through probability. Supplementary Figure 19A presents the corresponding simulated Hi-C map for the 0.3-1.3 kb region of Chr 2 of interphase S. pombe in comparison with the corresponding Hi-C data. It is evident that the explicit barrier model provides a poorer description of the Hi-C data of interphase S. pombe compared to the CCLE model, as indicated by the MPR and Pearson correlation scores of 1.6489 and 0.2267, respectively. While the explicit barrier model appears capable of accurately reproducing Hi-C data with punctate patterns, typically accompanied by strong peaks in the corresponding cohesin ChIP-seq, it seems less effective in cases such as in interphase S. pombe, where the Hi-C data lacks punctate patterns and sharp TAD boundaries, and the corresponding cohesin ChIP-seq shows low-contrast peaks. The success of the CCLE model in describing the Hi-C data of both S. pombe and S. cerevisiae, which exhibit very different features, suggests that the current paradigm of localized, well-defined boundary elements may not be the only approach to understanding loop extrusion. By contrast, CCLE allows for a concept of continuous distribution of position-dependent loop extrusion rates, arising from the aggregate effect of multiple interactions between loop extrusion complexes and chromatin. This paradigm offers greater flexibility in recapitulating diverse features in Hi-C data than strictly localized loop extrusion barriers."

      Reviewer #1 (Significance (Required)):

      This simple model is useful to confirm that cohesin positions dictate the position of loops, which was predicted already and proposed in many studies. However, it should be considered a starting point as it does not faithfully predict all the features of chromatin organisation, particularly at better resolution.

      Response:

      As described in more detail above, we do not agree with the assertion of the referee that the CCLE model "does not faithfully predict all the features of chromatin organization, particularly at better resolution" and provide additional new data to support the conclusion that the CCLE model provides a much-needed approach to model non-vertebrate contact maps and outperforms the single prior attempt to predict budding yeast Hi-C data using information from cohesin ChIP-seq.

      It will mostly be of interest to those in the chromosome organisation field, working in organisms or systems that do not have CTCF.

      Response:

      We agree that this work will be of special interest to researchers working on chromatin organization of non-vertebrate organisms. We would reinforce that yeast are frequently used models for the study of cohesin, condensin, and chromatin folding more generally. Indeed, in the last two months alone there are two Molecular Cell papers, one Nature Genetics paper, and one Cell Reports paper where loop extrusion in yeast models is directly relevant. We also believe, however, that the model will be of interest for the field in general as it simultaneously encompasses various scenarios that may lead to slowing down or stalling of LEFs.

      This reviewer is a cell biologist working in the chromosome organisation field, but does not have modelling experience and therefore does not have the expertise to determine if the modelling part is mathematically sound and has assumed that it is.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Yuan et al. report on their development of an analytical model ("CCLE") for loop extrusion with genomic-position-dependent speed, with the idea of accounting for barriers to loop extrusion. They write down master equations for the probabilities of cohesin occupancy at each genomic site and obtain approximate steady-state solutions. Probabilities are governed by cohesin translocation, loading, and unloading. Using ChIP-seq data as an experimental measurement of these probabilities, they numerically fit the model parameters, among which are extruder density and processivity. Gillespie simulations with these parameters combined with a 3D Gaussian polymer model were integrated to generate simulated Hi-C maps and cohesin ChIP-seq tracks, which show generally good agreement with the experimental data. The authors argue that their modeling provides evidence that loop extrusion is the primary mechanism of chromatin organization on ~10-100 kb scales in S. pombe and S. cerevisiae.

      Major comments:

      1. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. How is the agreement of CCLE with experiments more demonstrative of loop extrusion than previous modeling?

      Response:

      We agree with the referee's statement that "extrusion is widely accepted, even if not universally so". We disagree with the referee that this state of affairs means that "the need to demonstrate this (i.e. loop extrusion) is questionable". On the contrary, studies that provide further compelling evidence that cohesin-based loop extrusion is the primary organizer of chromatin, such as ours, must surely be welcomed, first, in order to persuade those who remain unconvinced by the loop extrusion mechanism in general, and, secondly, because, until the present work, quantitative models of loop extrusion, capable of reproducing Hi-C maps quantitatively, in yeasts and other non-vertebrate eukaryotes have been lacking, leaving open the question of whether loop extrusion can describe Hi-C maps beyond vertebrates. CCLE has now answered that question in the affirmative. Moreover, the existence of a robust model to predict contact maps in non-vertebrate models, which are extensively used in the pursuit of research questions in chromatin biology, will be broadly enabling to the field.

      It is fundamental that if a simple, physically-plausible model/hypothesis is able to describe experimental data quantitatively, it is indeed appropriate to ascribe considerable weight to that model/hypothesis (until additional data become available to refute the model).

      How is the agreement of CCLE with experiments more demonstrative of loop extrusion than previous modeling?

      Response:

      As noted above and in the original manuscript, we are unaware of previous quantitative modeling of cohesin-based loop extrusion and the resultant Hi-C maps in organisms that lack CTCF, namely non-vertebrate eukaryotic models such as fission yeast or budding yeast, as we apply here. As noted in the original manuscript, previous quantitative modeling of Hi-C maps based on cohesin loop extrusion and CTCF boundary elements has been convincing that loop extrusion is indeed relevant in vertebrates, but the restriction to vertebrates excludes most of the tree of life.

      Below, the referee cites two examples of loop extrusion outside of vertebrates. The one that is suggested to correspond to yeast cells (Dequeker et al. Nature 606:197 2022) actually corresponds to mouse cells, which are vertebrate cells. The other one models the Hi-C map of the prokaryote, Bacillus subtilis, based on loop extrusion of the bacterial SMC complex thought to most resemble condensin (not cohesin), subject to barriers to loop extrusion that are related to genes or involving prokaryote-specific Par proteins (Brandao et al. PNAS 116:20489 2019). We have referenced this work in the revised manuscript but would reinforce that it lacks utility in predicting the contact maps for non-vertebrate eukaryotes.

      Relatedly, similar best fit values for S. pombe and S. cerevisiae might not point to a mechanistic conclusion (same "underlying mechanism" of loop extrusion), but rather to similar properties for loop-extruding cohesins in the two species.

      Response:

      In the revised manuscript, we have replaced "suggesting that the underlying mechanism that governs loop extrusion by cohesin is identical in both species" with "suggesting loop-extruding cohesins possess similar properties in both species" (lines 367-368).

      As an alternative, could a model with variable binding probability given by ChIP-seq and an exponential loop-size distribution work equally well? The stated lack of a dependence on extrusion timescale suggests that a static looping model might succeed. If not, why not?

      Response:

      A hypothetical mechanism that generates the same instantaneous loop distributions and correlations as loop extrusion would lead to the same Hi-C map as does loop extrusion. This circumstance is not confined to CCLE, but is equally applicable to previous CTCF-based loop extrusion models. It holds because Hi-C and ChIP-seq, and therefore models that seek to describe these measurements, provide a snapshot of the chromatin configuration at one instant of time.

      We would reinforce that there is no physical basis for a diffusion capture model with an approximately exponential loop size distribution. Nevertheless, one can reasonably ask whether a physically-sensible diffusion capture model can simultaneously match cohesin ChIP-seq and Hi-C. Motivated by the referee's comment we have addressed this question and, accordingly, in the revised manuscript, we have added (1) an entire subsection entitled "Diffusion capture does not reproduce experimental interphase S. pombe Hi-C maps" (lines 303-335) and (2) Supplementary Figure 15. As we now demonstrate, the CCLE model vastly outperforms an equilibrium binding model in reproducing the experimental Hi-C maps and measured P(s).

      2. I do not understand how the loop extrusion residence time drops out. As I understand it, Eq 9 converts ChIP-seq to lattice site probability (involving N_{LEF}, which is related to \rho, and \rho_c). Then, Eqs. 3-4 derive site velocities V_n and U_n if we choose \rho, L, and \tau, with the latter being the residence time. This parameter is not specified anywhere and is claimed to be unimportant. It may be true that the choice of timescale is arbitrary in this procedure, but can the authors please clarify?

      Response:

      As noted above, Hi-C and ChIP-seq both capture chromatin configuration at one instant in time. Therefore, such measurements cannot and do not provide any time-scale information, such as the loop extrusion residence time (LEF lifetime) or the mean loop extrusion rate. For this reason, neither our CCLE simulations, nor other researchers' previous simulations of loop extrusion in vertebrates with CTCF boundary elements, provide any time-scale information, because the experiments they seek to describe do not contain time-scale information. The Hi-C map simulations can and do provide information concerning the loop size, which is the product of the loop lifetime and the loop extrusion rate. Lines 304-305 of the revised manuscript include the text: "Because Hi-C and ChIP-seq both characterize chromatin configuration at a single instant of time, and do not provide any direct time-scale information, ..."

      In practice, we set the LEF lifetime to be some explicit value with arbitrary time-unit. We have added a sentence in the Methods that reads, "In practice, however, we set the LEF dissociation rate to 5e-4 time-unit^{-1} (equivalent to a lifetime of 2000 time-units), and the nominal LEF extrusion rate (aka \rho*L/\tau, see Supplementary Methods) can be determined from the given processivity" (lines 599-602), to clarify this point. We have also changed the terminology from "timesteps" to "LEF events" in the manuscript as the latter is more accurate for our purpose.
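
The arithmetic behind the quoted Methods sentence can be sketched as follows. This is a minimal illustration, not the simulation code: it assumes that processivity is the product of the nominal extrusion rate and the LEF lifetime, and the processivity value used below is a hypothetical placeholder rather than a fitted parameter.

```python
def lef_lifetime(dissociation_rate):
    """Mean LEF lifetime (in time-units) for a constant dissociation rate."""
    return 1.0 / dissociation_rate


def nominal_extrusion_rate(processivity, dissociation_rate):
    """Nominal extrusion rate, ASSUMING processivity = rate * lifetime."""
    return processivity * dissociation_rate


dissociation_rate = 5e-4   # per time-unit, the value quoted from the Methods
processivity = 20.0        # hypothetical placeholder, in arbitrary length units

lifetime = lef_lifetime(dissociation_rate)
rate = nominal_extrusion_rate(processivity, dissociation_rate)
print(f"lifetime = {lifetime:.0f} time-units, nominal rate = {rate:.3g} length units per time-unit")
```

Because the time-unit is arbitrary, rescaling the dissociation rate rescales the extrusion rate by the same factor and leaves the processivity, and hence the simulated loop sizes, unchanged.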

      3. The assumptions in the solution and application of the CCLE model are potentially constraining to a limited number of scenarios. In particular the authors specify that current due to binding/unbinding, A_n - D_n, is small. This assumption could be problematic near loading sites (centromeres, enhancers in higher eukaryotes, etc.) (where current might be dominated by A_n and V_n), unloading sites (D_n and V_{n-1}), or strong boundaries (D_n and V_{n-1}). The latter scenario is particularly concerning because the manuscript seems to be concerned with the presence of unidentified boundaries. This is partially mitigated by the fact that the model seems to work well in the chosen examples, but the authors should discuss the limitations due to their assumptions and/or possible methods to get around these limitations.

      4. Related to the above concern, low cohesin occupancy is interpreted as a fast extrusion region and high cohesin occupancy is interpreted as a slow region. But this might not be true near cohesin loading and unloading sites.

      Response:

      Our response to Referee 2's Comments 3. and 4. is that both in the original manuscript and in the revised manuscript we clearly delineate the assumptions underlying CCLE and we carefully assess the extent to which these assumptions are violated (lines 123-126 and 263-279 in the revised manuscript). For example, Supplementary Figure 12 shows that across the S. pombe genome as a whole, violations of the CCLE assumptions are small. Supplementary Figure 13 shows that violations are similarly small for meiotic S. cerevisiae. However, to explicitly address the concern of the referee, we have added the following sentences to the revised manuscript:

      Lines 277-279:

      "While loop extrusion in interphase S. pombe seems to well satisfy the assumptions underlying CCLE, this may not always be the case in other organisms."

      Lines 359-361:

      "In addition, the three quantities, given by Eqs. 6, 7, and 8, are distributed around zero with relatively small fluctuations (Supplementary Fig. 13), indicating that the CCLE model is self-consistent in this case also."

      In the case of mitotic S. cerevisiae, Supplementary Figure 14 shows that these quantities are small for most genomic locations, except near the cohesin ChIP-seq peaks. We ascribe these greater violations of CCLE's assumptions at the locations of cohesin peaks in part to the low processivity of mitotic cohesin in S. cerevisiae, compared to that of meiotic S. cerevisiae and interphase S. pombe, and in part to the low CCLE loop extrusion rate at the cohesin peaks. We have added a paragraph at the end of the Section "CCLE Describes TADs and Loop Configurations in Mitotic S. cerevisiae" to reflect these observations (lines 447-461).

      5. The mechanistic insight attempted in the discussion, specifically with regard to Mis4/Scc2/NIPBL and Pds5, is problematic. First, it is not clear how the discussion of Nipbl and Pds5 is connected to the CCLE method; the justification is that CCLE shows cohesin distribution is linked to cohesin looping, which is already a questionable statement (point 1) and doesn't really explain how the model offers new insight into existing Nipbl and Pds5 data.

      Furthermore, I believe that the conclusions drawn on this point are flawed, or at least, stated with too much confidence. The authors raise the curious point that Nipbl ChIP-seq does not correlate well with cohesin ChIP-seq, and use this as evidence that Nipbl is not a part of the loop-extruding complex in S. pombe, and it is not essential in humans. Aside from the molecular evidence in human Nipbl/cohesin (acknowledged by authors), there are other reasons to doubt this conclusion. First, depletion of Nipbl (rather than binding partner Mau2 as in ref 55) in mouse cells strongly inhibits TAD formation (Schwarzer et al. Nature 551:51 2017). Second, at least two studies have raised concerns about Nipbl ChIP-seq results: 1) Hu et al. Nucleic Acids Res 43:e132 2015, which shows that uncalibrated ChIP-seq can obscure the signal of protein localization throughout the genome due to the inability to distinguish from background and 2) Rhodes et al. eLife 6:e30000, which uses FRAP to show that Nipbl binds and unbinds to cohesin rapidly in human cells, which could go undetected in ChIP-seq, especially when uncalibrated. It has not been shown that these dynamics are present in yeast, but there is no reason to rule it out yet.

      Similar types of critiques could be applied to the discussion of Pds5. There is cross-correlation between Psc3 and Pds5 in S. pombe, but the authors are unable to account for whether Pds5 binding is transient and/or necessary to loop extrusion itself or, more importantly, whether Pds5 ChIP is associated with extrusive or cohesive cohesins; cross-correlation peaks at about 0.6, but note that by the authors' own estimates, cohesive cohesins are approximately half of all cohesins in S. pombe (Table 3).

      Due to the above issues, I suggest that the authors heavily revise this discussion to better reflect the current experimental understanding and the limited ability to draw such conclusions based on the current CCLE model.

      Response:

      As stated above, our study demonstrates that the CCLE approach is able to take as input cohesin (Psc3) ChIP-seq data and produce as output simulated Hi-C maps that well reproduce the experimental Hi-C maps of interphase S. pombe and meiotic S. cerevisiae. This result is evident from the multiple Hi-C comparison figures in both the original and the revised manuscripts. In light of this circumstance, the referee's statement that it is "questionable", that CCLE shows that cohesin distribution (as quantified by cohesin ChIP-seq) is linked to cohesin looping (as quantified by Hi-C), is demonstrably incorrect.

      However, we did not intend to suggest that Nipbl and Pds5 are not crucial for cohesin loading, as the reviewer states. Rather, our inquiries relate to a more nuanced question of whether these factors only reside at loading sites or, instead, remain as a more long-lived constituent component of the loop extrusion complex. We regret any confusion and have endeavored to clarify this point in the revised manuscript in response to Referee 2's Comment 5. as well as Referee 1's Minor Comment 1. We have now better explained how the CCLE model may offer new insight from existing ChIP-seq data in general and from Mis4/Nipbl and Pds5 ChIP-seq, in particular. Accordingly, we have followed Referee 2's advice to heavily revise the relevant section of the Discussion.

      To this end, we have removed the following text from the original manuscript:

      "The fact that the cohesin distribution along the chromatin is strongly linked to chromatin looping, as evident by the success of the CCLE model, allows for new insights into in vivo LEF composition and function. For example, recently, two single-molecule studies [37, 38] independently found that Nipbl, which is the mammalian analogue of Mis4, is an obligate component of the loop-extruding human cohesin complex. Ref. [37] also found that cohesin complexes containing Pds5, instead of Nipbl, are unable to extrude loops. On this basis, Ref. [32] proposed that, while Nipbl-containing cohesin is responsible for loop extrusion, Pds5-containing cohesin is responsible for sister chromatid cohesion, neatly separating cohesin's two functions according to composition. However, the success of CCLE in interphase S. pombe, together with the observation that the Mis4 ChIP-seq signal is uncorrelated with the Psc3 ChIP-seq signal (Supplementary Fig. 7) allows us to infer that Mis4 cannot be a component of loop-extruding cohesin in S. pombe. On the other hand, Pds5 is correlated with Psc3 in S. pombe (Supplementary Fig. 7) suggesting that both proteins are involved in loop-extruding cohesin, contradicting a hypothesis that Pds5 is a marker for cohesive cohesin in S. pombe. In contrast to the absence of Mis4-Psc3 correlation in S. pombe, in humans, Nipbl ChIP-seq and Smc1 ChIP-seq are correlated (Supplementary Fig. 7), consistent with Ref. [32]'s hypothesis that Nipbl can be involved in loop-extruding cohesin in humans. However, Ref. [55] showed that human Hi-C contact maps in the absence of Nipbl's binding partner, Mau2 (Ssl3 in S. pombe [56]) show clear TADs, consistent with loop extrusion, albeit with reduced long-range contacts in comparison to wild-type maps, indicating that significant loop extrusion continues in live human cells in the absence of Nipbl-Mau2 complexes. 
These collected observations suggest the existence of two populations of loop-extruding cohesin complexes in vivo, one that involves Nipbl-Mau2 and one that does not. Both types are present in mammals, but only Mis4-Ssl3-independent loop-extruding cohesin is present in S. pombe."

      And we have replaced it by the following text in the revised manuscript (lines 533-568):

      "As noted above, the input for our CCLE simulations of chromatin organization in S. pombe, was the ChIP-seq of Psc3, which is a component of the cohesin core complex [75]. Accordingly, Psc3 ChIP-seq represents how the cohesin core complex is distributed along the genome. In S. pombe, the other components of the cohesin core complex are Psm1, Psm3, and Rad21. Because these proteins are components of the cohesin core complex, we expect that the ChIP-seq of any of these proteins would closely match the ChIP-seq of Psc3, and would equally well serve as input for CCLE simulations of S. pombe genome organization. Supplementary Figure 20C confirms significant correlations between Psc3 and Rad21. In light of this observation, we then reason that the CCLE approach offers the opportunity to investigate whether other proteins beyond the cohesin core are constitutive components of the loop extrusion complex during the extrusion process (as opposed to cohesin loading or unloading). To elaborate, if the ChIP-seq of a non-cohesin-core protein is highly correlated with the ChIP-seq of a cohesin core protein, we can infer that the protein in question is associated with the cohesin core and therefore is a likely participant in loop-extruding cohesin, alongside the cohesin core. Conversely, if the ChIP-seq of a putative component of the loop-extruding cohesin complex is uncorrelated with the ChIP-seq of a cohesin core protein, then we can infer that the protein in question is unlikely to be a component of loop-extruding cohesin, or at most is transiently associated with it.

      For example, in S. pombe, the ChIP-seq of the cohesin regulatory protein, Pds5 [74], is correlated with the ChIP-seq of Psc3 (Supplementary Fig. 20B) and with that of Rad21 (Supplementary Fig. 20D), suggesting that Pds5 can be involved in loop-extruding cohesin in S. pombe, alongside the cohesin core proteins. Interestingly, this inference concerning fission yeast cohesin subunit, Pds5, stands in contrast to the conclusion from a recent single-molecule study [38] concerning cohesin in vertebrates. Specifically, Reference [38] found that cohesin complexes containing Pds5, instead of Nipbl, are unable to extrude loops.

      Additionally, as noted above, in S. pombe the ChIP-seq signal of the cohesin loader, Mis4, is uncorrelated with the Psc3 ChIP-seq signal (Supplementary Fig. 20A), suggesting that Mis4 is, at most, a very transient component of loop-extruding cohesin in S. pombe, consistent with its designation as a "cohesin loader". However, both References [38] and [39] found that Nipbl (counterpart of S. pombe's Mis4) is an obligate component of the loop-extruding human cohesin complex, more than just a mere cohesin loader. Although CCLE has not yet been applied to vertebrates, from a CCLE perspective, the possibility that Nipbl may be required for the loop extrusion process in humans is bolstered by the observation that in humans Nipbl ChIP-seq and Smc1 ChIP-seq show significant correlations (Supplementary Fig. 20G), consistent with Ref. [32]'s hypothesis that Nipbl is involved in loop-extruding cohesin in vertebrates. A recent theoretical model of the molecular mechanism of loop extrusion by cohesin hypothesizes that transient binding by Mis4/Nipbl is essential for permitting directional reversals and therefore for two-sided loop extrusion [41]. Surprisingly, there are significant correlations between Mis4 and Pds5 in S. pombe (Supplementary Fig. 20E), indicating Pds5-Mis4 association, outside of the cohesin core complex."

      In response to Referee 2's specific comment that "at least two studies have raised concerns about Nipbl ChIP-seq results", we note (1) that, while Hu et al. Nucleic Acids Res 43:e132 2015 present a general method for calibrating ChIP-seq results, they do not measure Mis4/Nipbl ChIP-seq, nor do they raise any specific concerns about Mis4/Nipbl ChIP-seq, and (2) that (as noted above, in response to Referee 1's comment) while the FRAP analysis presented by Rhodes et al. eLife 6:e30000 indicates that, in HeLa cells, Nipbl has a residence time bound to cohesin of about 50 seconds, nevertheless, as shown in Supplementary Fig. 20G in the revised manuscript, there is a significant cross-correlation between the Nipbl ChIP-seq and Smc1 ChIP-seq in humans, indicating that a transient association between Nipbl and cohesin is detected by ChIP-seq, the referee's concerns notwithstanding.

      We thank the referee for pointing out Schwarzer et al. Nature 551:51 2017. However, our interpretation of these data is different from the referee's. As noted in our original manuscript, Nipbl has traditionally been considered to be a cohesin loading factor. If the role of Nipbl were solely to load cohesin, then we would expect that depleting Nipbl would have a major effect on the Hi-C map, because fewer cohesins are loaded onto the chromatin. Figure 2 of Schwarzer et al. Nature 551:51 2017 shows the effect of depleting Nipbl on a vertebrate Hi-C map. Even in this case, when Nipbl is absent, this figure shows that TADs persist, albeit considerably attenuated. According to the authors' own analysis associated with Fig. 2 of their paper, these attenuated TADs correspond to a smaller number of loop-extruding cohesin complexes than in the presence of Nipbl. Since Nipbl is depleted, these loop-extruding cohesins necessarily cannot contain Nipbl. Thus, the data and analysis of Schwarzer et al. Nature 551:51 2017 actually seem consistent with the existence of a population of loop-extruding cohesin complexes that do not contain Nipbl.

      Concerning the referee's comment that we cannot be sure whether Pds5 ChIP is associated with extrusive or cohesive cohesin, we note that, as explained in the manuscript, we assume that the cohesive cohesins are uniformly distributed across the genome, and therefore that peaks in the cohesin ChIP-seq are associated with loop-extruding cohesins. The success of CCLE in describing Hi-C maps justifies this assumption a posteriori. Supplementary Figure 20B shows that the ChIP-seq of Pds5 is correlated with the ChIP-seq of Psc3 in S. pombe, that is, that peaks in the ChIP-seq of Psc3, assumed to derive from loop-extruding cohesin, are accompanied by peaks in the ChIP-seq of Pds5. This is the reasoning allowing us to associate Pds5 with loop-extruding cohesin in S. pombe.
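
The kind of ChIP-seq cross-correlation invoked in this reasoning can be sketched as follows. This is an illustrative sketch only: the function name, the lag convention, and the assumption that both tracks are binned on the same genomic grid are ours, and it is not necessarily how the supplementary figures were computed.

```python
import numpy as np


def cross_correlation(track_a, track_b, max_lag):
    """Pearson correlation of track_a[n + lag] with track_b[n] for each lag (in bins).

    Both tracks are assumed to be sampled on the same genomic bins.
    """
    a = np.asarray(track_a, dtype=float)
    b = np.asarray(track_b, dtype=float)
    if a.shape != b.shape or a.ndim != 1:
        raise ValueError("tracks must be 1D arrays of equal length")
    lags = np.arange(-max_lag, max_lag + 1)
    corr = np.empty(lags.size)
    for i, lag in enumerate(lags):
        if lag >= 0:
            x, y = a[lag:], b[: b.size - lag]
        else:
            x, y = a[: a.size + lag], b[-lag:]
        # Pearson coefficient of the overlapping parts of the two tracks
        corr[i] = np.corrcoef(x, y)[0, 1]
    return lags, corr


# Toy usage with synthetic tracks (not real ChIP-seq data)
rng = np.random.default_rng(0)
core = rng.gamma(shape=2.0, scale=1.0, size=500)    # stand-in for a cohesin core track
partner = core + rng.normal(scale=1.0, size=500)    # partially correlated track
lags, corr = cross_correlation(core, partner, max_lag=10)
print(f"zero-lag correlation: {corr[lags == 0][0]:.2f}")
```

A peak at zero lag with a sizeable coefficient is the signature taken above to indicate co-localization of two proteins; a flat profile near zero indicates no detectable association.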

      6. I suggest that the authors recalculate correlations for Hi-C maps using maps that are rescaled by the P(s) curves. As currently computed, most of the correlation between maps could arise from the characteristic decay of P(s) rather than smaller scale features of the contact maps. This could reduce the surprising observed correlation between distinct genomic regions in pombe (which, problematically, is higher than the observed correlation between simulation and experiment in cerevisiae).

      Response:

      We thank the referee for this advice. Following this advice, throughout the revised manuscript, we have replaced our original calculation of the Pearson correlation coefficient of unscaled Hi-C maps with a calculation of the Pearson correlation coefficient of rescaled Hi-C maps. Since the MPR is formed from ratios of simulated to experimental Hi-C maps, this metric is unchanged by the proposed rescaling.
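
The rescaling described here can be sketched as follows. This is a minimal illustration under our own assumptions: each map is divided by its own mean contact value at each genomic separation, and the Pearson coefficient is computed over the upper triangle excluding the main diagonal. The function names are hypothetical and the manuscript's exact procedure may differ.

```python
import numpy as np


def contact_probability_by_separation(hic):
    """Mean contact value P(s) along each diagonal (separation s, in bins)."""
    n = hic.shape[0]
    return np.array([np.nanmean(np.diagonal(hic, offset=s)) for s in range(n)])


def rescale_by_ps(hic):
    """Divide every entry (i, j) of a square map by P(|i - j|)."""
    hic = np.asarray(hic, dtype=float)
    ps = contact_probability_by_separation(hic)
    i, j = np.indices(hic.shape)
    with np.errstate(divide="ignore", invalid="ignore"):
        return hic / ps[np.abs(i - j)]


def rescaled_pearson(sim, exp, min_sep=1):
    """Pearson correlation of two P(s)-rescaled maps over separations >= min_sep."""
    rs, re = rescale_by_ps(sim), rescale_by_ps(exp)
    iu = np.triu_indices(rs.shape[0], k=min_sep)
    x, y = rs[iu], re[iu]
    ok = np.isfinite(x) & np.isfinite(y)   # drop bins with no data
    return np.corrcoef(x[ok], y[ok])[0, 1]
```

With this rescaling, two maps that share only the overall decay of P(s) but have unrelated fine-scale features yield a correlation near zero, whereas the unscaled maps would still appear highly correlated.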

      As explained in the original manuscript, we attribute the lower experiment-simulation correlation in the meiotic budding yeast Hi-C maps to the larger statistical errors of the meiotic budding yeast dataset, which arise because of its higher genomic resolution: all else being equal, we can expect 25 times the counts in a 10 kb x 10 kb bin as in a 2 kb x 2 kb bin. For the same reason, we expect larger statistical errors in the mitotic budding yeast dataset as well. Lower correlations for noisier data are to be expected in general.

      7. Please explain why the difference between right and left currents at any particular site, (R_n - L_n)/(R_n + L_n), should be small. It seems easy to imagine scenarios where this might not be true, such as directional barriers like CTCF or transcribed genes.

      Response:

      For simplicity, the present version of CCLE sets the site-dependent loop extrusion rates by assuming that the cohesin ChIP-seq signal has equal contributions from left and right anchors. Then, we carry out our simulations which subsequently allow us to examine the simulated left and right currents and their difference at every site. The distributions of normalized left-right difference currents are shown in Supplementary Figures 12B, 13B, and 14D, for interphase S. pombe, meiotic S. cerevisiae, and mitotic S. cerevisiae, respectively. They are all centered at zero with standard deviations of 0.12, 0.16, and 0.33. Thus, it emerges from our simulations that the difference current is indeed generally small.
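
The quantity examined here can be sketched as follows, assuming per-site arrays of simulated right and left currents are available. The array names and toy values are hypothetical; only the formula (R_n - L_n)/(R_n + L_n) is taken from the referee's comment.

```python
import numpy as np


def normalized_difference_current(right, left):
    """(R_n - L_n) / (R_n + L_n) per site; NaN where the total current is zero."""
    r = np.asarray(right, dtype=float)
    l = np.asarray(left, dtype=float)
    total = r + l
    out = np.full(r.shape, np.nan)
    np.divide(r - l, total, out=out, where=total > 0)
    return out


# Toy usage with hypothetical per-site currents
right = np.array([1.0, 2.0, 0.0, 3.0])
left = np.array([1.0, 0.0, 0.0, 5.0])
diff = normalized_difference_current(right, left)
print("mean:", np.nanmean(diff), "std:", np.nanstd(diff))
```

A distribution of this quantity centered at zero with a small standard deviation corresponds to the self-consistency check reported above.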

      8. Optional, but I think this would greatly improve the manuscript. Can the authors: a) analyze regions of high cohesin occupancy (assumed to be slow extrusion regions) to determine if there's anything special in these regions, such as more transcriptional activity?

      Response:

      In response to Referee 1's similar comment, we have calculated the correlation between the locations of convergent genes and cohesin ChIP-seq. Supplementary Figure 18A in the revised manuscript shows that for interphase S. pombe no correlations are evident, whereas for both meiotic and mitotic S. cerevisiae, there are significant correlations between these two quantities (Supplementary Fig. 17).

      b) apply this methodology to vertebrate cell data?

      Response:

      The application of CCLE to vertebrate data is outside the scope of this paper which, as we have emphasized, has the goal of developing a model that can be robustly applied to non-vertebrate eukaryotic genomes. Nevertheless, CCLE is, in principle, applicable to all organisms in which loop extrusion by SMC complexes is the primary mechanism for chromatin spatial organization.

      9. A GitHub link is provided but the code is not currently available.

      Response:

      The code is now available.

      Minor Comments:

      1. Please state the simulated LEF lifetime, since the statement in the methods that 15000 timesteps are needed for equilibration of the LEF model is otherwise not meaningful. Additionally, please note that backbone length is not necessarily a good measure of steady state, since the backbone can be compacted to its steady-state value while the loop distribution continues to evolve toward its steady state.

      Response:

      The terminology "timesteps" used in the original manuscript in fact should mean "the number of LEF events performed" in the simulation. Therefore, we have changed the terminology from "timesteps" to "LEF events".

      The choice of 15000 LEF events is empirically determined to ensure that loop extrusion steady state is achieved, for the range of parameters considered. To address the referee's concern regarding the uncertainty of achieving steady state after 15000 LEF events, we compared two loop size distributions: each distribution encompasses 1000 data points, equally separated in time, one between LEF event 15000 and 35000, and the other between LEF event 80000 and 100000. The two distributions are within-errors identical, suggesting that the loop extrusion steady state is well achieved within 15000 LEF events.
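
One way to make such a comparison concrete is sketched below. The response does not specify how "within-errors identical" was assessed, so this is only an assumed procedure: both samples are histogrammed on common bins, and the difference in each bin is compared with its Poisson counting error through a reduced chi-square. The function name and bin count are hypothetical.

```python
import numpy as np


def reduced_chi_square(early, late, bins=20):
    """Reduced chi-square between two loop-size samples binned on common edges.

    Values near 1 indicate that the two histograms agree within Poisson errors.
    """
    early = np.asarray(early, dtype=float)
    late = np.asarray(late, dtype=float)
    edges = np.histogram_bin_edges(np.concatenate([early, late]), bins=bins)
    h1, _ = np.histogram(early, bins=edges)
    h2, _ = np.histogram(late, bins=edges)
    n1, n2 = h1.sum(), h2.sum()
    f1, f2 = h1 / n1, h2 / n2                  # fraction of loops per bin
    err = np.sqrt(h1 / n1**2 + h2 / n2**2)     # Poisson error on the difference
    used = err > 0                             # skip bins empty in both samples
    z = (f1[used] - f2[used]) / err[used]
    return np.sum(z**2) / used.sum()


# Toy usage with synthetic loop sizes (not simulation output)
rng = np.random.default_rng(3)
early = rng.exponential(scale=10.0, size=1000)
late = rng.exponential(scale=10.0, size=1000)
print(f"reduced chi-square: {reduced_chi_square(early, late):.2f}")
```

A value well above 1 would suggest that the loop size distribution was still evolving between the two windows.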

      2. How important is the cohesive cohesin parameter in the model, e.g., how good are fits with \rho_c = 0?

      Response:

      As stated in the original manuscript, the errors on \rho_c are on the order of 10%-20% (for S. pombe). Thus, fits with \rho_c=0 are significantly poorer than with the best-fit values of \rho_c.

      3. A nice (but non-essential) supplemental visualization might be to show a scatter of sim cohesin occupancy vs. experiment ChIP.

      Response:

      We have chosen not to do this, because we judge that the manuscript is already long enough. Figures 3A, 5D, and 6C already compare the experimental and simulated ChIP-seq, and these figures already contain more information than the figures proposed by the referee.

      4. A similar calculation of Hi-C contacts based on simulated loop extruder positions using the Gaussian chain model was previously presented in Banigan et al. eLife 9:e53558 2020, which should be cited.

      __Response: __

      We thank the referee for pointing out this citation. We have added it to the revised manuscript.

      5. It is stated that simulation agreement with experiments for cerevisiae is worse in part due to variability in the experiments, with MPR and Pearson numbers for cerevisiae replicates computed for reference. But these numbers are difficult to interpret without, for example, similar numbers for duplicate pombe experiments. Again, these numbers should be generated using Hi-C maps scaled by P(s), especially in case there are systematic errors in one replicate vs. another.

      __Response: __

      As noted above, throughout the revised manuscript, we now give the Pearson correlation coefficients of scaled-by-P(s) Hi-C maps.

      6. *In the model section, it is stated that LEF binding probabilities are uniformly distributed. Did the authors mean the probability is uniform across the genome or that the probability at each site is a uniformly distributed random number? Please clarify, and if the latter, explain why this unconventional assumption was made. *

      __Response: __

      It is the former. We have modified the manuscript to clarify that LEFs "initially bind to empty, adjacent chromatin lattice sites with a binding probability, that is uniformly distributed across the genome." (lines 587-588).

      *7. Supplement p4 line 86 - what is meant by "processivity of loops extruded by isolated LEFs"? "size of loops extruded by..." or "processivity of isolated LEFs"? *

      __Response: __

      Here "processivity of isolated LEFs" is defined as the processivity of one LEF without the interference (blocking) from other LEFs. We have changed "processivity of loops extruded by isolated LEFs" to "processivity of isolated LEFs" for clarity.

      8. The use of parentheticals in the caption to Table 2 is a little confusing; adding a few extra words would help.

      __Response: __

      In the revised manuscript, we have added an additional sentence, and have removed the offending parentheses.

      9. *Page 12 sentence line 315-318 is difficult to understand. The barrier parameter is apparently something from ref 47 not previously described in the manuscript. *

      __Response: __

      In the revised manuscript, we have removed mention of the "barrier parameter" from the discussion.

      10. *Statement on p14 line 393-4 is false: prior LEF models have not been limited to vertebrates, and the authors have cited some of them here. There are also non-vertebrate examples with extrusion barriers: genes as boundaries to condensin in bacteria (Brandao et al. PNAS 116:20489 2019) and MCM complexes as boundaries to cohesin in yeast (Dequeker et al. Nature 606:197 2022). *

      __Response: __

      In fact, Dequeker et al. Nature 606:197 2022 concerns the role of MCM complexes in blocking cohesin loop extrusion in mouse zygotes. Mouse is a vertebrate. The sole aspect of this paper that is associated with yeast is the observation of cohesin blocking by the yeast MCM bound to the ARS1 replication origin site, which is inserted on a piece of lambda phage DNA. No yeast genome is used in the experiment. Therefore, the referee is mistaken to suggest that this paper models yeast genome organization.

      We thank the referee for pointing out Brandao et al. PNAS 116:20489 2019, which includes the development of a tour-de-force model of condensin-based loop extrusion in the prokaryote, Bacillus subtilis, in the presence of gene barriers to loop extrusion. To acknowledge this paper, we have changed the objectionable sentence to now read (lines 571-575):

      "... prior LEF models have been overwhelmingly limited to vertebrates, which express CTCF and where CTCF is the principal boundary element. Two exceptions, in which the LEF model was applied to non-vertebrates, are Ref. [49], discussed above, and Ref. [76] (Brandao et al.), which models the Hi-C map of the prokaryote, Bacillus subtilis, on the basis of condensin loop extrusion with gene-dependent barriers."

      *Referees cross-commenting *

      I agree with the comments of Reviewer 1, which are interesting and important points that should be addressed.

      *Reviewer #2 (Significance (Required)):

      Analytically approaching extrusion by treating cohesin translocation as a conserved current is an interesting approach to modeling and analysis of extrusion-based chromatin organization. It appears to work well as a descriptive model. But I think there are major questions concerning the mechanistic value of this model, possible applications of the model, the provided interpretations of the model and experiments, and the limitations of the model under the current assumptions. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. It is also unclear that the minimal approach of the CCLE necessarily offers an improved physical basis for modeling extrusion, as compared to previous efforts such as ref 47, as claimed by the authors. There are also questions about significance due to possible limitations of the model (detailed above). Applying the CCLE model to identify barriers would be interesting, but is not attempted. Overall, the work presents a reasonable analytical model and numerical method, but until the major comments above are addressed and some reasonable application or mechanistic value or interpretation is presented, the overall significance is somewhat limited.*

      __Response: __

      We agree with the referee that analytically approaching extrusion by treating cohesin translocation as a conserved current is an interesting approach to modeling and analysis of extrusion-based chromatin organization. We also agree with the referee that it works well as a descriptive model (of Hi-C maps in S. pombe and S. cerevisiae). Obviously, we disagree with the referee's other comments. For us, being able to describe the different-appearing Hi-C maps of interphase S. pombe (Fig. 1 and Supplementary Figures 1-9), meiotic S. cerevisiae (Fig. 5) and mitotic S. cerevisiae (Fig. 6), all with a common model with just a few fitting parameters that differ between these examples, is significant and novel. The reviewer prematurely ignores the fact that there are still debates about whether a "diffusion-capture"-like model is the more dominant mechanism that shapes chromatin spatial organization at the TAD scale. Many works have argued that such models could describe TAD-scale chromatin organization, as cited in the revised manuscript (Refs. [11, 14, 15, 17, 20, 22-24, 55]). However, in contrast to the poor description of the Hi-C map using the diffusion-capture model (as demonstrated in the revised manuscript and Supplementary Fig. 15), the excellent experiment-simulation agreement achieved by CCLE provides compelling evidence that cohesin-based loop extrusion is indeed the primary organizer of TAD-scale chromatin.

      Importantly, CCLE provides a theoretical basis for how loop extrusion models can be generalized and applied to organisms without known loop extrusion barriers. Our model also highlights that distributed barriers that impede but do not strictly block LEFs could also impact chromatin configurations, and it provides a means to account for them. This case might be of importance to organisms with CTCF motifs that infrequently coincide with TAD boundaries, for instance, Drosophila melanogaster. Moreover, CCLE promises theoretical descriptions of the Hi-C maps of other non-vertebrates in the future, extending the quantitative application of the LEF model across the tree of life. This too would be highly significant if successful.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Yuan et al. report on their development of an analytical model ("CCLE") for loop extrusion with genomic-position-dependent speed, with the idea of accounting for barriers to loop extrusion. They write down master equations for the probabilities of cohesin occupancy at each genomic site and obtain approximate steady-state solutions. Probabilities are governed by cohesin translocation, loading, and unloading. Using ChIP-seq data as an experimental measurement of these probabilities, they numerically fit the model parameters, among which are extruder density and processivity. Gillespie simulations with these parameters combined with a 3D Gaussian polymer model were integrated to generate simulated Hi-C maps and cohesin ChIP-seq tracks, which show generally good agreement with the experimental data. The authors argue that their modeling provides evidence that loop extrusion is the primary mechanism of chromatin organization on ~10-100 kb scales in S. pombe and S. cerevisiae.

      Major comments:

      1. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. How is the agreement of CCLE with experiments more demonstrative of loop extrusion than previous modeling? Relatedly, similar best fit values for S. pombe and S. cerevisiae might not point to a mechanistic conclusion (same "underlying mechanism" of loop extrusion), but rather to similar properties for loop-extruding cohesins in the two species. As an alternative, could a model with variable binding probability given by ChIP-seq and an exponential loop-size distribution work equally well? The stated lack of a dependence on extrusion timescale suggests that a static looping model might succeed. If not, why not?
      2. I do not understand how the loop extrusion residence time drops out. As I understand it, Eq 9 converts ChIP-seq to lattice site probability (involving N_{LEF}, which is related to \rho, and \rho_c). Then, Eqs. 3-4 derive site velocities V_n and U_n if we choose rho, L, and \tau, with the latter being the residence time. This parameter is not specified anywhere and is claimed to be unimportant. It may be true that the choice of timescale is arbitrary in this procedure, but can the authors please clarify?
      3. The assumptions in the solution and application of the CCLE model are potentially constraining to a limited number of scenarios. In particular the authors specify that current due to binding/unbinding, A_n - D_n, is small. This assumption could be problematic near loading sites (centromeres, enhancers in higher eukaryotes, etc.) (where current might be dominated by A_n and V_n), unloading sites (D_n and V_{n-1}), or strong boundaries (D_n and V_{n-1}). The latter scenario is particularly concerning because the manuscript seems to be concerned with the presence of unidentified boundaries. This is partially mitigated by the fact that the model seems to work well in the chosen examples, but the authors should discuss the limitations due to their assumptions and/or possible methods to get around these limitations.
      4. Related to the above concern, low cohesin occupancy is interpreted as a fast extrusion region and high cohesin occupancy is interpreted as a slow region. But this might not be true near cohesin loading and unloading sites.
      5. The mechanistic insight attempted in the discussion, specifically with regard to Mis4/Scc2/NIPBL and Pds5, is problematic. First, it is not clear how the discussion of Nipbl and Pds5 is connected to the CCLE method; the justification is that CCLE shows cohesin distribution is linked to cohesin looping, which is already a questionable statement (point 1) and doesn't really explain how the model offers new insight into existing Nipbl and Pds5 data.

      Furthermore, I believe that the conclusions drawn on this point are flawed, or at least, stated with too much confidence. The authors raise the curious point that Nipbl ChIP-seq does not correlate well with cohesin ChIP-seq, and use this as evidence that Nipbl is not a part of the loop-extruding complex in S. pombe, and it is not essential in humans. Aside from the molecular evidence in human Nipbl/cohesin (acknowledged by authors), there are other reasons to doubt this conclusion. First, depletion of Nipbl (rather than binding partner Mau2 as in ref 55) in mouse cells strongly inhibits TAD formation (Schwarzer et al. Nature 551:51 2017). Second, at least two studies have raised concerns about Nipbl ChIP-seq results: 1) Hu et al. Nucleic Acids Res 43:e132 2015, which shows that uncalibrated ChIP-seq can obscure the signal of protein localization throughout the genome due to the inability to distinguish from background and 2) Rhodes et al. eLife 6:e30000, which uses FRAP to show that Nipbl binds and unbinds to cohesin rapidly in human cells, which could go undetected in ChIP-seq, especially when uncalibrated. It has not been shown that these dynamics are present in yeast, but there is no reason to rule it out yet.

      Similar types of critiques could be applied to the discussion of Pds5. There is cross-correlation between Psc3 and Pds5 in S. pombe, but the authors are unable to account for whether Pds5 binding is transient and/or necessary to loop extrusion itself or, more importantly, whether Pds5 ChIP is associated with extrusive or cohesive cohesins; cross-correlation peaks at about 0.6, but note that by the authors' own estimates, cohesive cohesins are approximately half of all cohesins in S. pombe (Table 3).

      Due to the above issues, I suggest that the authors heavily revise this discussion to better reflect the current experimental understanding and the limited ability to draw such conclusions based on the current CCLE model.
      6. I suggest that the authors recalculate correlations for Hi-C maps using maps that are rescaled by the P(s) curves. As currently computed, most of the correlation between maps could arise from the characteristic decay of P(s) rather than smaller scale features of the contact maps. This could reduce the surprising observed correlation between distinct genomic regions in pombe (which, problematically, is higher than the observed correlation between simulation and experiment in cerevisiae).
      7. Please explain why the normalized difference between right and left currents at any particular site, (R_n - L_n)/(R_n + L_n), should be small. It seems easy to imagine scenarios where this might not be true, such as directional barriers like CTCF or transcribed genes.
      8. Optional, but I think this would greatly improve the manuscript: can the authors
         a) analyze regions of high cohesin occupancy (assumed to be slow extrusion regions) to determine if there's anything special in these regions, such as more transcriptional activity
         b) apply this methodology to vertebrate cell data
      9. A GitHub link is provided but the code is not currently available.
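
      To illustrate the rescaling suggested in comment 6, here is a minimal sketch that divides each contact map by its own P(s) (the mean contact value at each genomic separation) before computing a Pearson correlation. The toy maps, the domain positions, and all function names are assumptions made for illustration; this is not the authors' pipeline.

```python
from statistics import mean


def contact_probability_by_separation(hic):
    """P(s): mean contact value on each diagonal s = |i - j| of a symmetric map."""
    n = len(hic)
    return [mean(hic[i][i + s] for i in range(n - s)) for s in range(n)]


def rescale_by_ps(hic):
    """Divide every entry by P(|i - j|), i.e. observed over expected."""
    n = len(hic)
    ps = contact_probability_by_separation(hic)
    return [[hic[i][j] / ps[abs(i - j)] if ps[abs(i - j)] > 0 else 0.0
             for j in range(n)] for i in range(n)]


def upper_triangle(matrix, min_sep=1):
    """Flatten the entries with j - i >= min_sep into a list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + min_sep, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def synthetic_map(n, domain):
    """Toy map: 1/(1 + s) decay with contacts doubled inside one domain."""
    start, stop = domain
    return [[(2.0 if start <= i < stop and start <= j < stop else 1.0)
             / (1 + abs(i - j)) for j in range(n)] for i in range(n)]


# Two toy regions that share the same decay but have different domains.
map_a = synthetic_map(30, (5, 15))
map_b = synthetic_map(30, (18, 28))

raw_r = pearson(upper_triangle(map_a), upper_triangle(map_b))
scaled_r = pearson(upper_triangle(rescale_by_ps(map_a)),
                   upper_triangle(rescale_by_ps(map_b)))
print(f"raw Pearson = {raw_r:.3f}, P(s)-rescaled Pearson = {scaled_r:.3f}")
```

      Comparing the two printed values shows how much of the raw correlation is carried by the shared distance decay rather than by map-specific features.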

      Minor Comments:

      1. Please state the simulated LEF lifetime, since the statement in the methods that 15000 timesteps are needed for equilibration of the LEF model is otherwise not meaningful. Additionally, please note that backbone length is not necessarily a good measure of steady state, since the backbone can be compacted to its steady-state value while the loop distribution continues to evolve toward its steady state.
      2. How important is the cohesive cohesin parameter in the model, e.g., how good are fits with \rho_c = 0?
      3. A nice (but non-essential) supplemental visualization might be to show a scatter of sim cohesin occupancy vs. experiment ChIP.
      4. A similar calculation of Hi-C contacts based on simulated loop extruder positions using the Gaussian chain model was previously presented in Banigan et al. eLife 9:e53558 2020, which should be cited.
      5. It is stated that simulation agreement with experiments for cerevisiae is worse in part due to variability in the experiments, with MPR and Pearson numbers for cerevisiae replicates computed for reference. But these numbers are difficult to interpret without, for example, similar numbers for duplicate pombe experiments. Again, these numbers should be generated using Hi-C maps scaled by P(s), especially in case there are systematic errors in one replicate vs. another.
      6. In the model section, it is stated that LEF binding probabilities are uniformly distributed. Did the authors mean the probability is uniform across the genome or that the probability at each site is a uniformly distributed random number? Please clarify, and if the latter, explain why this unconventional assumption was made.
      7. Supplement p4 line 86 - what is meant by "processivity of loops extruded by isolated LEFs"? "size of loops extruded by..." or "processivity of isolated LEFs"?
      8. The use of parentheticals in the caption to Table 2 is a little confusing; adding a few extra words would help.
      9. Page 12 sentence line 315-318 is difficult to understand. The barrier parameter is apparently something from ref 47 not previously described in the manuscript.
      10. Statement on p14 line 393-4 is false: prior LEF models have not been limited to vertebrates, and the authors have cited some of them here. There are also non-vertebrate examples with extrusion barriers: genes as boundaries to condensin in bacteria (Brandao et al. PNAS 116:20489 2019) and MCM complexes as boundaries to cohesin in yeast (Dequeker et al. Nature 606:197 2022).

      Referees cross-commenting

      I agree with the comments of Reviewer 1, which are interesting and important points that should be addressed.

      Significance

      Analytically approaching extrusion by treating cohesin translocation as a conserved current is an interesting approach to modeling and analysis of extrusion-based chromatin organization. It appears to work well as a descriptive model. But I think there are major questions concerning the mechanistic value of this model, possible applications of the model, the provided interpretations of the model and experiments, and the limitations of the model under the current assumptions. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. It is also unclear that the minimal approach of the CCLE necessarily offers an improved physical basis for modeling extrusion, as compared to previous efforts such as ref 47, as claimed by the authors. There are also questions about significance due to possible limitations of the model (detailed above). Applying the CCLE model to identify barriers would be interesting, but is not attempted. Overall, the work presents a reasonable analytical model and numerical method, but until the major comments above are addressed and some reasonable application or mechanistic value or interpretation is presented, the overall significance is somewhat limited.

    1. Author response:

      [The following is the authors’ response to the current reviews.]

      In response to Reviewer #2, we agree with the reviewer that it needs to be noted that not all forms of recognition are the same and have added the following: "However, we note that not all forms of recognition are the same; researchers may prefer to have their work featured instead of personal stories or critiques of the scientific environment."


      [The following is the authors’ response to the previous reviews.]

      We thank both reviewers for their detailed comments and insightful suggestions. Below we summarize our responses to each concern in addition to the edits within the manuscript.

      We would also like to add a clarification to the eLife assessment, which states: “This important bibliometric analysis shows that authors of scientific papers whose names suggest they are female or East Asian get quoted less often in news stories about their work.” We show that individuals with names predicted to be from women or of East Asian name origin are less likely to be quoted or mentioned in Nature’s scientific news stories than expected from publication demographics. In this study, we did not compare the level of coverage of a scientific article by the demographics of the authors of the article.

      Reviewer #1

      The article is not so clearly structured, which makes it hard to follow. A better framing, contextualization, and conceptualization of their analysis would help the readers to better understand the results. There are some unclear definitions and wrong wording of key concepts.

      We have adapted our wording in the text and added a more detailed discussion, which we hope makes the paper easier to comprehend. These changes are described in the context of the reviewer's suggestions and addressed in the next section.

      Language use: Male/Female refers to sex, not to gender.

      We have now updated the language throughout the text. Thank you for pointing this out.

      Regional disparities are not the same as names' origin. While the first might relate to the academic origin of authors, inferred from their institutional belonging, the latter reflects the authors' inferred identity. Ethnic identities and the construction of prejudice against specific populations need proper contextualization.

      We have added better contextualization in the manuscript and reworded the section in our results and discussion to clarify that we are analyzing disparities related to perceived ethnicity and not regions. We also added the following text to the results section “In our analysis, we use name origin as an estimate for the perceived ethnicity of a primary source by a journalist. Our prediction is not intended to assign ethnicity to an individual, but to be used broadly as a tool to quantify representational differences in a journalist's sociologically constructed perception of a primary source's ethnicity.” We also added the following text to our Discussion: “Our use of name origins is a proxy for a journalist's or referring scholarly peer’s potential perceptions of the ethnicity of a primary source as signaled by an individual's name. We do not intend to assign an identity to an individual, but to generate a broad metric to measure possible bias for particular ethnicities during journalists' primary source gathering.”

      It would be helpful to have a clear definition of what are quotes, mentions, and citations. For me, it was not so clear and made understanding the results more difficult.

      We added the following text to the results section Extracted Data Used for Analysis: “Quoted names are any names that were attached to a quote within the article. Mentioned names are any names that were stated within the article. Cited names are all author names of a scientific paper that was cited in the news article.”

      The comparison against Nature published research articles is not perfect because journalists will also cover articles not published in Nature. If for example, the gender representation in the quoted articles is not the same between Nature journals and other journals, then this source of inequality would be missing (e.g. if the journalists are biased against women, but not as much when they published in Nature, because they are also biased towards Nature articles). Also, the gender representation among Nature authors could not be the same as in general. Nevertheless, this seems to be a fair benchmark, especially if the authors did not have access to other more comprehensive databases. But a statement of limitations including these potential issues would be good to have.

      To add better context to the generalizability of our work, we added the following text to our discussion: “Furthermore, the news articles present on "www.nature.com" are intended for a very specific readership that may not be reflective of more broad scientific news outlets. In a separate analysis, we took a cursory look into a comparison with The Guardian and found similar disparities in gender and name origin. However, it is not clear which publications should be used as a comparator for science-related articles in The Guardian, and difficult to compare relative rates of representation. While other science news outlets may not have a direct comparator, it would be useful to take a broad comparison across multiple science news outlets to compare against one another. Our existing pipeline could be easily applied to other science news outlets and identify if there exists a consistent pattern of disparity regardless of the intended readership.”

      "we select the highest probability origin for each name as the resultant assignment". Threshold based approaches for race/ethnicity name-based inference have been criticized by the literature as they might reproduce biases (see Kozlowski, D., Murray, D. S., Bell, A., Hulsey, W., Larivière, V., Monroe-White, T., & Sugimoto, C. R. (2022). Avoiding bias when inferring race using name-based approaches. Plos one, 17(3), e0264270.). The authors could use the full distribution of probabilities over names instead of selecting one. The formulae proposed (3-5) could be easily adapted to this change.

      We thank the reviewer for pointing this out. We have updated our analysis to use the probabilities instead of hard assignments. Figure 3 and formulae 3-5 have been updated. While we observe a slight shift in the calculated values, the overall trends are unchanged.
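
      The difference between hard assignment and probability-weighted counting can be sketched as follows. This is an illustrative example only: the origin labels "A" and "B", the probabilities, and the function names are hypothetical, and the code does not reproduce formulae 3-5 of the manuscript.

```python
from collections import defaultdict


def hard_counts(name_probs):
    """Count each name once, under its single most probable origin."""
    counts = defaultdict(float)
    for probs in name_probs:
        top_origin = max(probs, key=probs.get)
        counts[top_origin] += 1.0
    return dict(counts)


def soft_counts(name_probs):
    """Spread each name across all origins in proportion to its probabilities."""
    counts = defaultdict(float)
    for probs in name_probs:
        for origin, p in probs.items():
            counts[origin] += p
    return dict(counts)


def proportions(counts):
    """Normalize counts so that they sum to one."""
    total = sum(counts.values())
    return {origin: value / total for origin, value in counts.items()}


# Hypothetical predicted origin probabilities for three quoted names.
predictions = [
    {"A": 0.60, "B": 0.40},
    {"A": 0.55, "B": 0.45},
    {"A": 0.20, "B": 0.80},
]

print("hard:", proportions(hard_counts(predictions)))
print("soft:", proportions(soft_counts(predictions)))
```

      In this toy input the hard assignment gives origin "A" two of three names, whereas the probability-weighted tally gives "B" the larger share, which is the kind of thresholding effect the reviewer describes.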

      Is it possible to make an analysis that intersects both name origin and gender? I am not sure if the sample size would allow for this, but if some other dimensions were collapsed, it would be very important to show what happens at the intersection of these two dimensions of discrimination.

      We agree that identifying any differences in quotation patterns at the intersection of gender and name origin would be very useful. To address this, we added supplemental table 5. This table identifies the number of quotes per predicted name origin and gender over all years and article types. In this table, we don’t see a significant difference in gender distribution across predicted name origins.

      Given a larger sample size, we would be able to better identify more subtle differences, but at this sample size, we cannot make more detailed inferences. Additionally, this also addresses a QC-issue, where predicted gender accuracy varies by name origin, specifically East Asian name origin. From our data, we don’t see a large difference in proportions across any name origin. We added the following text to the results section to incorporate this analysis:

      “However, it should be noted that the error rate varies by name origin with the largest decrease in performance on names with an Asian origin [@doi:10.7717/peerj-cs.156;@doi:10.5195/jmla.2021.1252]. In our analysis, we did not observe a large difference in names predicted to come from a man or woman between predicted East Asian and other name origins (Table 5).”

      The use of vocabulary should be more homogeneous. For example, in page 13 the authors start to use the concepts of over/under enrichment, which appeared before in a title but was not used.

      The text has been updated to replace all mentions of “over/under enrichment” with “over/under representation”.

      In the discussions section, it would be important to see as a statement of limitations the problems that automatic origin and gender inference have.

      We thank the reviewer for this suggestion. We have added the following paragraph to our discussion.

      Computational tools enabled us to automatically analyze thousands of articles to identify existing disparities by gender and name origin, but these tools are not without limitations. Our tools are unable to identify non-binary people and rely on gender predictors that are known to have region-specific biases, with the largest decrease in performance on names of an Asian origin [@doi:10.7717/peerj-cs.156;@doi:10.5195/jmla.2021.1252]. Furthermore, name origin is only a proxy for externally perceived racial or ethnic origins of a source or author and is not as accurate as self-identified race or ethnicity. Self-identification better captures the lived experience of an individual that computational estimates from a name can not capture. This is highlighted in our inability to distinguish between Black and White people from the US by their names. As the collection of demographic data by publication outlets grows, we believe this will enable a more fine-grained and accurate analysis of disparities in scientific journalism.

      Figures 2a and 3a show that the affiliations of authors and their countries was going to be used in this analysis. Yet, this section is not present in the article. I would encourage the authors to add this to the analysis as it would show important patterns, and to intersect the dimensions of gender, name origin and country.

      We were interested in using this analysis in our work, but unfortunately the sample size of cited works in each country was too small to make inferences. If this work were extended to larger corpora such as The Guardian or New York Times, we think one might be able to make more robust inferences. Since our work only focuses on Nature, we decided not to include this analysis. However, we do include a section in our discussion for future work.

      “As a proxy for measuring possible geographical bias of a journalist, we attempted to identify if there was any geographical bias of cited authors. To do this, we identified the affiliation of each cited author and identified their affiliated country. Unfortunately, we could not robustly extract a large enough number of cited authors from different countries to make any conclusive statements. Expanding our work to other science journalism outlets could help identify possible ways in which geographic region, genders, and perceived ethnicity interact and affect scientific visibility of specific groups. While we are unable to identify that journalists have a specific geographical bias, having reporters explicitly focused on specific regional sources will broaden coverage of international opinions in science.”

      It is not clear at that point what column dependence means.

      The abstract has been updated to state, “Gender disparity in Nature quotes was dependent on the article type.”

      Reviewer #2

      We thank the reviewer for their very detailed and insightful suggestions regarding our analysis and the key caveats that needed better contextualization in our analysis. We went through each major point the reviewer brought up below and included any additional text that was needed.

      In some cases, the manuscript lacks consistency in terminology, and uses word choice that is strange (e.g., "enrichment" and "depletion" when discussing representation).

      We thank the reviewer for pointing this out. We have replaced all instances of depletion/enrichment with over/under-representation.

      Caveats to Claim 1. So while Claim 1 holds, it does not hold for all comparator sets and for all years. I don't think this is critical of the paper (the authors do discuss the trend in Claim 2), but interpretation of this claim should take care of these caveats, and readers should consider the important differences in first and last authorship.

      We thank the reviewer for their detailed feedback on this section. We have added the missing contextualization of our results. In the results section, we changed the figure caption to: “Speakers predicted to be men are sometimes overrepresented in quotes, but this depends on the year and article type.” We also added the following paragraph: “When considering the relative proportion of authors and speakers predicted to be men, we only find a slight over-representation of men. This overrepresentation is dependent on the authorship position and the year. Before 2010, quotes predicted as from men are overrepresented in comparison to both first and last authors, but between 2010 and 2017 quotes predicted from men are only overrepresented in comparison for first authors. In 2020, we find a slight over-representation of quotes predicted to be from women relative to first and last authors, but still severely under-represented when considering the general population. The choice of comparison between first and last authors can reveal different aspects of the current state of academia. While this does not hold in all scientific fields, first authors are typically early career scientists and last authors are more senior scientists. It has also been shown that early career scientists tend to be more diverse than senior scientists [@doi:10.7554/eLife.60829; @doi:10.1096/fj.201800639]. Since we find that quotes are only slightly more likely to come from a last author, it is reasonable to compare the relative rate of predicted quotes from men to either authorship position. Comparison with last authorships may reveal more how gender bias currently exists whereas comparison with early career scientists may reveal bias in comparison to a future, more possibly diverse academic environment. We hope that increased representation and recognition of women in science, even beyond what is observed in authorship, can increase the proportion of women first and last authors such that it better reflects the general population.”

      Generalizability to other contexts of science journalism:

      We thank the reviewer for their feedback on the generalizability of our work. We have now added the following text to our discussion to provide the reader with a better context of our results: “The articles presented on "www.nature.com" are intended for a very specific readership that may not be reflective of broader scientific news outlets. In a separate analysis, we took a cursory look into a comparison with The Guardian and found very similar disparities in gender and name origin. However, it is not clear which publications should be used as a comparator for science-related articles in The Guardian, and it is difficult to compare relative rates of representation. While other science news outlets may not have a direct comparator, it would be useful to take a broad comparison across multiple science news outlets to compare against one another. Our existing pipeline could be easily applied to other science news outlets and identify if there exists a consistent pattern of disparity regardless of the intended readership.”

      Shallow discussion:

      The authors highlight gender parity in career features, but why exactly is there gender parity in this format?

      We thank the reviewer for encouraging us to better contextualize our findings in the broader discourse. We have now added several sections to our Discussion. To address gender parity, we have added the following text: “This finding, coupled with the near equal number of articles written by journalists predicted to be men or women, argues for more diversity in topical coverage. "Career Feature" articles highlight current topics relevant to working scientists and frequently highlight systemic issues with the scientific environment. This column allows space for marginalized people to critique the current state of affairs in science or share their personal stories. This type of content encourages the journalist to seek out a diverse set of primary sources. Including more content that is not primarily focused on recent publications, but all topics surrounding the practice of science, can serve as an additional tool to rapidly achieve gender parity in journalistic recognition.”

      Representation in quotations varies by first and last author, most certainly as a result of the academic division of labor in the life sciences. However, what does it say about the scientific quotation that it appears first authors are more often to be quoted? Does this mean that the division of labor is changing such that the first authors are the lead scientists? Or does it imply that senior authors are being skipped over, or giving away their chance to comment on a study to the first author?

      We thank the reviewer for bringing up these important questions. We have added better context to our first author analysis in our discussion and have included the following two sections to address this. We also want to state that we find last authors to be slightly more quoted than first authors, as depicted in Fig. 2d, with the first author quotation percentage largely appearing below the red line. We included this text in a response above and include it again here for convenience.

      “Before 2010, quotes predicted as from men are overrepresented in comparison to both first and last authors, but between 2010 and 2017 quotes predicted from men are only overrepresented in comparison for first authors. In 2020, we find a slight over-representation of quotes predicted to be from women relative to first and last authors, but still severely under-represented when considering the general population. The choice of comparison between first and last authors can reveal different aspects of the current state of academia. While this does not hold in all scientific fields, first authors are typically early career scientists and last authors are more senior scientists. It has also been shown that early career scientists tend to be more diverse than senior scientists [@doi:10.7554/eLife.60829; @doi:10.1096/fj.201800639]. Since we find that quotes are only slightly more likely to come from a last author, it is reasonable to compare the relative rate of predicted quotes from men to either authorship position. Comparison with last authorships may reveal more how gender bias currently exists whereas comparison with early career scientists may reveal bias in comparison to a future, more possibly diverse academic environment. We hope that increased representation and recognition of women in science, even beyond what is observed in authorship, can increase the proportion of women first and last authors such that it better reflects the general population.”

      “In our analysis, we also find that there are more first authors with predicted East Asian name origin than last authors. This is in contrast to predicted Celtic/English and European name origins.

      Furthermore, we see that the amount of first author people with predicted East Asian name origins is increasing at a much faster rate than quotes are increasing. If this mismatched rate of representation continues, this could lead to an increasingly large erasure of early career scientists with East Asian name origins. As noted before, focusing on increasing engagement with early career scientists can help to reduce the growing disparity of public visibility of scientists with East Asian name origins.”

      What might be the downstream impacts on the public stemming from the under-representation of scientists with East Asian names? According to Figure 3d, not only are East Asian names under-represented in quotations, but they are becoming more under-represented over time as they appear as authors in a greater number of Nature publications; Those with European names are proportionately represented in quotations given their share of authors in Nature. Why might this be, especially seeing as Anglo names are heavily over-represented?

      To address this point, we have added the following text to our discussion: “In our analysis, we also find that there are more first authors with predicted East Asian name origin than last authors. This is in contrast to predicted Celtic/English and European name origins. Furthermore, the amount of first author people with predicted East Asian name origins is increasing at a much faster rate than quotes are increasing. If this mismatched rate of representation continues, this could lead to an increasingly large erasure of early career scientists with East Asian name origins. As noted before, focusing on increasing engagement with early career scientists can help to reduce the growing disparity of public visibility of scientists with East Asian name origins.”

      I am very confused by Figure 1B. It mixes the counts of News-related items with (non-Springer) research articles in a single stacked bar plot which makes determining the quantity of either difficult. I would advise splitting them out

      Figure 1B has been updated, and the News and Research articles have been separated.

      When querying the first 2000 or so results from the SpringerNature API, are the authors certain that they are getting a random sample of papers?

      These papers were the first 200 English language "Journal" papers returned by the Springer Nature API for each month, resulting in 2400 papers per year from 2005 through 2020. These papers are the first 200 papers published each month by a Springer Nature journal, which may not be completely random, but we believe to be a reasonably representative sample. Furthermore, the Springer Nature comparator set is being used as an additional comparator to the complete set of all Nature research papers used in our analyses.

      In all figures: the authors use capital letters to indicate panels in the caption, but lowercase letters in the figure itself and in the main text. This should be made consistent.

      This has been updated.

      In all figures: the authors should make the caption letter bold in the figure captions, which makes it much easier to find descriptions of specific panels

      This has been updated.

      In the section "coreNLP": the authors mention "co-reference resolution" but without really remarking why it is being used. This is an issue throughout the methods-the authors describe what method they are using but either they don't mention why they are using that method until later, or else not at all.

      We have added better reasoning behind our coreNLP selected methods: “We used the standard set of annotators: tokenize, ssplit, pos, lemma, ner, parse, coref, and additionally the quote annotator. These perform text tokenization, sentence splitting, part of speech recognition, lemmatization, named entity recognition, division of sentences into constituent phrases, co-reference resolution, and identification of quoted entities, respectively. We used the "statistical" algorithm to perform coreference resolution for speed. Each of these aspects is required to identify the names of quoted or mentioned speakers and identify any of their associated pronouns. All results were output to json format for further downstream processing.”
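
      The annotator list quoted above can be written down as a small configuration sketch. The annotator names and the "statistical" coreference setting come from the quoted text; the property keys (`annotators`, `coref.algorithm`, `outputFormat`) follow CoreNLP's usual property naming as we understand it, and the helper name `corenlp_properties` is hypothetical. How the authors actually invoked CoreNLP is not stated here.

```python
# Annotator names are taken from the quoted Methods text; the descriptions
# paraphrase the same sentence. Everything else is an illustrative assumption.
ANNOTATORS = [
    ("tokenize", "text tokenization"),
    ("ssplit", "sentence splitting"),
    ("pos", "part of speech recognition"),
    ("lemma", "lemmatization"),
    ("ner", "named entity recognition"),
    ("parse", "division of sentences into constituent phrases"),
    ("coref", "co-reference resolution"),
    ("quote", "identification of quoted entities"),
]


def corenlp_properties():
    """Build a properties mapping for the pipeline (hypothetical helper)."""
    return {
        # Order matters: later annotators depend on earlier ones.
        "annotators": ",".join(name for name, _ in ANNOTATORS),
        # The quoted text says the "statistical" algorithm was chosen for speed.
        "coref.algorithm": "statistical",
        # Results were written as JSON for downstream processing.
        "outputFormat": "json",
    }


if __name__ == "__main__":
    for key, value in corenlp_properties().items():
        print(f"{key} = {value}")
```

      Keeping the names and their purposes side by side makes it easy to check that every step needed for speaker and pronoun identification is present.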

      We included a better description of scrapy: “Scrapy is a tool that applies user-defined rules to follow hyperlinks on webpages and return the information contained on each webpage.

      We used Scrapy to extract all web pages containing news articles and extract the text.”

      We also included our motivation for bootstrapping: “We used the bootstrap method to construct confidence intervals for each of our calculated statistics.”
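
      As a minimal sketch of what a bootstrap confidence interval involves, the following resamples a list of observations with replacement and reads the interval off the percentiles of the resampled statistic. The percentile variant, the number of resamples, the function name `bootstrap_ci`, and the toy data are all assumptions for illustration; the response does not say which bootstrap variant or settings were used.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat` (illustrative sketch only)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Recompute the statistic on n_resamples samples drawn with replacement.
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy data: 1 = quote attributed to a speaker predicted to be a man.
    toy_quotes = [1] * 70 + [0] * 30
    low, high = bootstrap_ci(toy_quotes)
    print(f"proportion = {statistics.mean(toy_quotes):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

      The same function can be applied per year or per article type by passing the corresponding subset of observations.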

      In the section "Name Formatting for Gender Prediction in Quotes or Mentions", genderizeR is mentioned before an introduction to the tool

      We added the following text to provide context: “Even though genderizeR, the computational method used to predict the name's gender, only uses the first name to make the gender prediction, identifying the full name gives us greater confidence that we correctly identified the first name.”

      In the section "Name Formatting for Gender Prediction of Authors", you state that you exclude papers with only one author. How many papers is this? I assume few, in Nature, but if not I can imagine gender differences based on who writes first-authored papers.

      We find that the number excluded is roughly 7% of all papers, which is consistent across Nature and Springer Nature (1113/15013 for cited Springer articles, 2899/42155 for random Springer articles, 955/12459 for Nature research articles). We have added the following text to the manuscript for better context: “Roughly 7% of all papers were estimated to be by a single author and removed from this analysis: 1113/15013 for cited Springer articles, 2899/42155 for random Springer articles, 955/12459 for Nature research articles.”
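
      The "roughly 7%" figure can be checked directly from the counts given above. The snippet below is only a worked-arithmetic sketch; the dictionary labels and the function name `single_author_percentages` are ours.

```python
# (single-author papers, total papers), as reported in the response above.
COUNTS = {
    "cited Springer articles": (1113, 15013),
    "random Springer articles": (2899, 42155),
    "Nature research articles": (955, 12459),
}


def single_author_percentages(counts):
    """Return the percentage of single-author papers for each comparator set."""
    return {label: 100.0 * single / total for label, (single, total) in counts.items()}


if __name__ == "__main__":
    for label, pct in single_author_percentages(COUNTS).items():
        print(f"{label}: {pct:.1f}%")
```

      All three sets fall within about one percentage point of 7%, which supports describing the exclusion rate as consistent across Nature and Springer Nature.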

      In "Name Origin Analysis", for the in-text reference to Equation 3: include the prefix "Eq." or similar to mark this as referencing the equation and not something else

      This has been updated.

      The use of the word "enrichment" in reference to the representation of East Asian authors is strange and does not fit the colloquial definition of the term. I suggest just using a simpler term like "representation" instead

      Similarly, the authors use the word "depletion" to reflect the lower rate of quotes to scientists with East-Asian names, but I feel a simpler word would be more appropriate.

      We thank the reviewer for this suggestion. All instances of “enrichment/depletion” have been replaced with “over/under representation”.

      The authors claim in Figure 2d that there is a steady increase in the rate of first author citations, however, this graph is not convincing. It appears to show much more noise than anything resembling a steady change.

      We have reworded our figure description to state that there is a consistent bias towards quoting last authors. Our figure description now states: “Panel d shows a consistent but slight bias towards quoting the last author of a cited article than the first author over time.”

      Supplemental Figures 1b and 1c do not seem to be mentioned in the main text, and I struggle to see their relevance.

      We thank the reviewer for identifying this error; these subpanels have been removed.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      I have trialled the package on my lab's data and it works as advertised. It was straightforward to use and did not require any special training. I am confident this is a tool that will be approachable even to users with limited computational experience. The use of artificial data to validate the approach - and to provide clear limits on applicability - is particularly helpful.

      The main limitation of the tool is that it requires the user to manually select regions. This somewhat limits the generalisability and is also more subjective - users can easily choose "nice" regions that better match with their hypothesis, rather than quantifying the data in an unbiased manner. However, given the inherent challenges in quantifying biological data, such problems are not easily circumventable.

      I have some comments to clarify the manuscript:

      1. A "straightforward installation" is mentioned. Given this is a Method paper, the means of installation should be clearly laid out.

      This sentence is now modified. In the revised manuscript we now describe how to install the toolset and we give the link to the toolset website if further information is needed. On this website, we provide a full video tutorial and a user manual. The user manual is provided as supplementary material of the manuscript.

      2. It would be helpful if there was an option to generate an output with the regions analysed (i.e., a JPG image with the data and the drawn line(s) on top). There are two reasons for this: i) A major problem with user-driven quantification is accidental double counting of regions (e.g., a user quantifies a part of an image and then later quantifies the same region). ii) Allows other users to independently verify measurements at a later time.

      We agree that it is helpful to save the analyzed regions. To answer this comment and the other two reviewers' comments pointing at a similar feature, we have now included an automatic saving of the regions of interest. The user will be able to reopen saved regions of interest using a new function we included in the new version of PatternJ.

      3. Related to the above point, it is highlighted that each time point would need to be analysed separately (line 361-362). It seems like it should be relatively straightforward to allow a function where the analysis line can be mapped onto the next time point. The user could then adjust slightly for changes in position, but still be starting from near the previous timepoint. Given how prevalent timelapse imaging is, this (or something similar) seems like a clear benefit to add to the software.

      We agree that the analysis of time series images can be a useful addition. We have added the analysis of time-lapse series in the new version of PatternJ. The principles behind the analysis of time-lapse series and an example of such analysis are provided in Figure 1 - figure supplement 3 and Figure 5, with accompanying text lines 140-153 and 360-372. The analysis includes a semi-automated selection of regions of interest, which will make the analysis of such sequences more straightforward than having to draw a selection on each image of the series. The user is required to draw at least two regions of interest in two different frames, and the algorithm will automatically generate regions of interest in frames in which selections were not drawn. The algorithm generates the analysis immediately after selections are drawn by the user, which includes the tracking of the reference channel.
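
      To make the idea of semi-automated region selection concrete, here is one plausible way to fill in regions of interest between user-drawn keyframes: linear interpolation of the vertex coordinates. This is purely an assumption for illustration. The response does not describe how PatternJ generates the missing selections (it also mentions tracking of the reference channel, which is not modelled here), and the function name `interpolate_rois` and the data layout are hypothetical.

```python
def interpolate_rois(keyframes):
    """Fill in ROIs for frames between user-drawn keyframes.

    `keyframes` maps a frame index to a list of (x, y) vertices. All keyframes
    are assumed to have the same number of vertices. Linear interpolation is
    an illustrative assumption, not PatternJ's documented behaviour.
    """
    frames = sorted(keyframes)
    rois = dict(keyframes)  # keep the user-drawn selections unchanged
    for start, end in zip(frames, frames[1:]):
        for frame in range(start + 1, end):
            t = (frame - start) / (end - start)  # fractional position in the gap
            rois[frame] = [
                (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
                for (x0, y0), (x1, y1) in zip(keyframes[start], keyframes[end])
            ]
    return rois


if __name__ == "__main__":
    drawn = {0: [(0.0, 0.0), (10.0, 0.0)], 4: [(4.0, 8.0), (14.0, 8.0)]}
    for frame, roi in sorted(interpolate_rois(drawn).items()):
        print(frame, roi)
```

      With at least two drawn selections, every intermediate frame receives a starting selection that the user could then adjust, which is the workflow the reviewer asked for.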

      4. Line 134-135. The level of accuracy of the searching should be clarified here. This is discussed later in the manuscript, but it would be helpful to give readers an idea at this point what level of tolerance the software has to noise and aperiodicity.

      We agree with the reviewer that a clarification of this part of the algorithm will help the user better understand the manuscript. We have modified the sentence to clarify the range of search used and the resulting limits in aperiodicity (now lines 176-181). Regarding the tolerance to noise, it is difficult to estimate it a priori from the choice made at the algorithm stage, so we prefer to leave it to the validation part of the manuscript. We hope this solution satisfies the reviewer and future users.

      **Referees cross-commenting**

      I think the other reviewer comments are very pertinent. The authors have a fair bit to do, but they are reasonable requests. So, they should be encouraged to do the revisions fully so that the final software tool is as useful as possible.

      Reviewer #1 (Significance (Required)):

      Developing software tools for quantifying biological data that are approachable for a wide range of users remains a longstanding challenge. This challenge is due to: (1) the inherent problem of variability in biological systems; (2) the complexity of defining clearly quantifiable measurables; and (3) the broad spread of computational skills amongst likely users of such software.

      In this work, Blin et al., develop a simple plugin for ImageJ designed to quickly and easily quantify regular repeating units within biological systems - e.g., muscle fibre structure. They clearly and fairly discuss existing tools, with their pros and cons. The motivation for PatternJ is properly justified (which is sadly not always the case with such software tools).

      Overall, the paper is well written and accessible. The tool has limitations but it is clearly useful and easy to use. Therefore, this work is publishable with only minor corrections.

      We thank the reviewer for the positive evaluation of PatternJ and for pointing out its accessibility to the users.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      # Summary

      The authors present an ImageJ Macro GUI tool set for the quantification of one-dimensional repeated patterns that are commonly occurring in microscopy images of muscles.

      # Major comments

      In our view the article and also software could be improved in terms of defining the scope of its applicability and user-ship. In many parts the article and software suggest that general biological patterns can be analysed, but then in other parts very specific muscle actin wordings are used. We are pointing this out in the "Minor comments" sections below. We feel that the authors could improve their work by making a clear choice here. One option would be to clearly limit the scope of the tool to the analysis of actin structures in muscles. In this case we would recommend to also rename the tool, e.g. MusclePatternJ. The other option would be to make the tool about the generic analysis of one-dimensional patterns, maybe calling the tool LinePatternJ. In the latter case we would recommend to remove all actin specific wordings from the macro tool set and also the article should be in parts slightly re-written.

      We agree with the reviewer that our initial manuscript used a mix of general and muscle-oriented vocabulary, which could make the use of PatternJ confusing especially outside of the muscle field. To make PatternJ useful for the largest community, we corrected the manuscript and the PatternJ toolset to provide the general vocabulary needed to make it understandable for every biologist. We modified the manuscript accordingly.

      # Minor/detailed comments

      # Software

      We recommend considering the following suggestions for improving the software.

      ## File and folder selection dialogs

      In general, clicking on many of the buttons just opens up a file-browser dialog without any further information. For new users it is not clear what the tool expects one to select here. It would be very good if the software could be rewritten such that there are always clear instructions displayed about which file or folder one should open for the different buttons.

      We found that on current versions of macOS the file-browser dialog does not display any message; we suspect this is the issue raised by the reviewer. This is a known issue affecting Fiji and all other applications on Mac since 2016. We provided guidelines in the user manual and in the tutorial video to correct this issue by changing a parameter in Fiji. Given the difficulties the reviewer had accessing the material on the PatternJ website, for which we apologize, we understand the concern. We added an extra warning on the PatternJ website pointing to this problem and its solution. Additionally, we have limited the appearance of the file-browser dialog to what we thought was strictly necessary. Thus, the user will experience fewer prompts, speeding up the analysis.

      ## Extract button

      The tool asks one to specify things like whether selections are drawn "M-line-to-M-line"; for users that are not experts in muscle morphology this is not understandable. It would be great to find more generally applicable formulations.

      We agree that this muscle-oriented vocabulary can make the use of PatternJ confusing. We have now corrected the user interface to provide both general and muscle-specific vocabulary ("center-to-center or edge-to-edge (M-line-to-M-line or Z-disc-to-Z-disc)").

      ## Manual selection accuracy

      The first step of the analysis is always a user hand-drawn profile across intensity patterns in the image. However, this step can cause inaccuracy that varies with the shape and curvature of the line profile drawn. If the line is not strictly perpendicular to, for example, the M-line patterns, the distance between intensity peaks will be different. This will be more problematic when dealing with features in the image that are not straight and parallel. If the structure is bent, the line drawn over it also needs to reproduce this curve to precisely capture the intensity pattern. I found this limits the reproducibility and ease of use of the software.

      We understand the concern of the reviewer. On curved selections this will be an issue that is difficult to solve, especially on "S" curved or more complex selections. The user will have to be very careful in these situations. On non-curved samples, the issue may be concerning at first sight, but the error scales with the inverse of the cosine of the misalignment angle and is therefore rather low. For example, if the user creates a selection off by 5 degrees, which is visually obvious, lengths will be affected by an increase of only 0.38%. The point raised by the reviewer is important to discuss, and we therefore added a paragraph to comment on the choice of selection (lines 94-98) and a supplementary figure to help make it clear (Figure 1 - figure supplement 1).
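
      The 0.38% figure follows from simple geometry: a straight selection tilted by an angle θ away from the pattern axis measures every spacing as 1/cos(θ) times its true length. A short sketch of that arithmetic (the function name `length_overestimate` is ours):

```python
import math


def length_overestimate(angle_deg):
    """Relative length increase for a straight selection tilted by angle_deg."""
    return 1.0 / math.cos(math.radians(angle_deg)) - 1.0


if __name__ == "__main__":
    for angle in (1, 2, 5, 10, 15):
        print(f"{angle:>2} degrees off-axis: +{100 * length_overestimate(angle):.2f}%")
```

      Because the error grows roughly with the square of the angle for small tilts, modest misalignments on straight samples have little effect on the measured lengths, whereas strongly curved selections are not covered by this estimate.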

      ### Reproducibility

      Since the line profile drawn on the image is the first step and essential to the entire process, the authors should consider saving it together with the analysis result, for example as ImageJ ROI or ROIset files that can be re-imported, correctly positioned, and visualized in the measured images. This would greatly improve the reproducibility of the proposed workflow. In the manuscript, only the extracted features are being saved (because the save button is also just asking for a folder containing images, so I cannot verify its functionality).

      We agree that this is a very useful and important feature. We have added ROI automatic saving. Additionally, we now provide a simplified import function of all ROIs generated with PatternJ and the automated extraction and analysis of the list of ROIs. This can be done from ROIs generated previously in PatternJ or with ROIs generated from other ImageJ/Fiji algorithms. These new features are described in the manuscript in lines 120-121 and 130-132.

      ## ? button

      It would be great if that button would open up some usage instructions.

      We agree with the reviewer that the "?" button can be used in a better way. We have replaced this button with a Help menu, including a simple tutorial showing a series of images detailing the steps to follow by the user, a link to the user website, and a link to our video tutorial.

      ## Easy improvement of workflow

      I would suggest a reasonable expansion of the current workflow, by fitting and displaying 2D lines to the band or line structure in the image, that form the "patterns" the author aims to address. Thus, it extracts geometry models from the image, and the inter-line distance, and even the curve formed by these sets of lines can be further analyzed and studied. These fitted 2D lines can be also well integrated into ImageJ as Line ROI, and thus be saved, imported back, and checked or being further modified. I think this can largely increase the usefulness and reproducibility of the software.

      We hope that we understood this comment correctly. We had sent a clarification request to the editor, but unfortunately did not receive an answer within the requested 4 weeks of this revision. We understood the following: instead of using our 1D approach, in which we extract positions from a profile, the reviewer suggests extracting the positions of features not as a single point, but as a series of coordinates defining its shape. If this is the case, this is a major modification of the tool that is beyond the scope of PatternJ. We believe that keeping our tool simple makes it robust. This is the major strength of PatternJ. Local fitting would not use line averaging, for instance, which would make the tool less reliable.

      # Manuscript

      We recommend considering the following suggestions for improving the manuscript. Abstract: The abstract suggests that general patterns can be quantified, however the actual tool quantifies specific subtypes of one-dimensional patterns. We recommend adapting the abstract accordingly.

      We modified the abstract to make this point clearer.

      Line 58: Gray-level co-occurrence matrix (GLCM) based feature extraction and analysis approach is not mentioned nor compared. At least there's a relatively recent study on Sarcomeres structure based on GLCM feature extraction: https://github.com/steinjm/SotaTool with publication: https://doi.org/10.1002/cpz1.462

      We thank the reviewer for making us aware of this publication. We cite it now and have added it to our comparison of available approaches.

      Line 75: "...these simple geometrical features will address most quantitative needs..." We feel that this may be an overstatement, e.g. we can imagine that there should be many relevant two-dimensional patterns in biology?!

      We have modified this sentence to avoid potential confusion (lines 76-77).

      Line 83: "After a straightforward installation by the user, ...". We think it would be convenient to add the installation steps at this place into the manuscript.

      This sentence is now modified. We now mention how to install the toolset and we provide the link to the toolset website, if further information is needed (lines 86-88). On the website, we provide a full video tutorial and a user manual.

      Line 87: "Multicolor images will give a graph with one profile per color." The 'Multicolor images' here should be more precisely stated as "multi-channel" images. Multi-color images could be confused with RGB images which will be treated as 8-bit gray value (type conversion first) images by profile plot in ImageJ.

      We agree with the reviewer that this could create some confusion. We modified "multicolor" to "multi-channel".

      Line 92: "...such as individual bands, blocks, or sarcomeric actin...". While bands and blocks are generic pattern terms, the biological term "sarcomeric actin" does not seem to fit in this list. Could a more generic wording be found, such as "block with spike"?

      We agree with the reviewer that "sarcomeric actin" alone will not be clear to all readers. We modified the text to "block with a central band, as often observed in the muscle field for sarcomeric actin" (lines 103-104). The toolset was modified accordingly.

      Line 95: "the algorithm defines one pattern by having the features of highest intensity in its centre". Could this be rephrased? We did not understand what that exactly means.

      We agree with the reviewer that this was not clear. We rewrote this paragraph (lines 101-114) and provided a supplementary figure to illustrate these definitions (Figure 1 - figure supplement 2).

      Line 124 - 147: This part is the only description of the algorithm behind the feature extraction and analysis, but it is not clearly stated. Many details are missing or assumed known by the reader. For example, how it achieved sub-pixel resolution results is not clear. One can only assume that by fitting a Gaussian to the band, the center position (peak) can be calculated from continuous curves rather than pixels.

      Note that the two sentences introducing this description are "Automated feature extraction is the core of the tool. The algorithm takes multiple steps to achieve this (Fig. S2):". We were hoping this statement was clear, but the reviewer may refer to something else. We agree that the description of some of the details of the steps was too quick. We have now expanded the description where needed.

      Line 407: We think the availability of both the tool and the code could be improved. For Fiji tools it is common practice to create an Update Site and to make the code available on GitHub. In addition, downloading the example file (https://drive.google.com/file/d/1eMazyQJlisWPwmozvyb8VPVbfAgaH7Hz/view?usp=drive_link) required a Google login and access request, which is not very convenient; in fact, we asked for access but it was denied. It would be important for the download to be easier, e.g. from GitHub or Zenodo.

      We are sorry for the issues encountered when downloading the tool and additional material. We thank the reviewer for pointing out these issues that limited the accessibility of our tool. We simplified the downloading procedure on the website, which no longer goes through the Google Drive interface or requires a Google account. Additionally, for the coder community, the code, user manual and examples are now available from GitHub at github.com/PierreMangeol/PatternJ, and are provided as supplementary material with the manuscript. To our knowledge, update sites work for plugins but not for macro toolsets. In our experience of sharing our codes with non-specialists, a classical website with a tutorial video is more accessible than more coder-oriented websites, which deter many users.

      Reviewer #2 (Significance (Required)):

      The strength of this study is that a tool for the analysis of one-dimensional repeated patterns occurring in muscle fibres is made available in the accessible open-source platform ImageJ/Fiji. In the introduction to the article the authors provide an extensive review of comparable existing tools. Their new tool fills a gap in terms of providing an easy-to-use software for users without computational skills that enables the analysis of muscle sarcomere patterns. We feel that if the below mentioned limitations could be addressed the tool could indeed be valuable to life scientists interested in muscle patterning without computational skills.

      In our view there are a few limitations, including the accessibility of example data and tutorials at sites.google.com/view/patternj, which we had trouble to access. In addition, we think that the workflow in Fiji, which currently requires pressing several buttons in the correct order, could be further simplified and streamlined by adopting some "wizard" approach, where the user is guided through the steps.

      *As answered above, the links on the PatternJ website are now corrected. Regarding the workflow, we now provide a Help menu with:

      1. __a basic set of instructions to use the tool, __
      2. a direct link to the tutorial video in the PatternJ toolset
      3. a direct link to the website on which both the tutorial video and a detailed user manual can be found. We hope this addresses the issues raised by this reviewer.

      *Another limitation is the reproducibility of the analysis; here we recommend enabling IJ Macro recording as well as saving of the drawn line ROIs. For more detailed suggestions for improvements please see the above sections of our review.*

      We agree that saving ROIs is very useful. It is now implemented in PatternJ.

      We are not sure what this reviewer means by "enabling IJ Macro recording". The ImageJ Macro Recorder is indeed very useful, but to our knowledge, it is limited to built-in functions. Our code is open and we hope this will be sufficient for advanced users to modify the code and make it fit their needs.

      *Reviewer #3 (Evidence, reproducibility and clarity (Required)):*

      *Summary: In this manuscript, the authors present a new toolset for the analysis of repetitive patterns in biological images named PatternJ. One of the main advantages of this new tool over existing ones is that it is simple to install and run and does not require any coding skills whatsoever, since it runs on the ImageJ GUI. Another advantage is that it provides not only the mean length of the pattern unit but also the subpixel localization of each unit and the distribution of lengths, and that it does not require GPU processing to run, unlike other existing tools. The major disadvantage of PatternJ is that it requires heavy, although very simple, user input in both the selection of the region to be analyzed and in the analysis steps. Another limitation is that, at least in its current version, PatternJ is not suitable for time-lapse imaging. The authors clearly explain the algorithm used by the tool to find the localization of pattern features and they thoroughly test the limits of their tool in conditions of varying SNR, periodicity and band intensity. Finally, they also show the performance of PatternJ across several biological models such as different kinds of muscle cells, neurons and fish embryonic somites, as well as different imaging modalities such as brightfield, fluorescence confocal microscopy, STORM and even electron microscopy.*

      *This manuscript is clearly written, and both the sections and the figures are well organized and tell a cohesive story. By testing PatternJ, I can attest to its ease of installation and use. Overall, I consider that PatternJ is a useful tool for the analysis of patterned microscopy images and this article is fit for publication. However, I do have some minor suggestions and questions that I would like the authors to address, as I consider they could improve this manuscript and the tool:*

      We are grateful to this reviewer for this very positive assessment of PatternJ and of our manuscript.

      *Minor suggestions: The methodology section is missing a more detailed description of how the plotted metric was obtained: as normalized intensity or as precision in pixels.*

      We agree with the reviewer that a more detailed description of the plotted metric was missing. We added this information to the Methods section and added information to the figure captions where more details could help to clarify the value displayed.

      * The validation is based mostly on the SNR and patterns. They should include a dataset of real data to validate the algorithm in three of the standard patterns tested. *

      We validated our tool using computer-generated images, in which we know with certainty the localization of patterns. This allowed us to automatically analyze 30 000 images, and with varying settings, we sometimes analyzed 10 times the same image, leading to about 150 000 selections analyzed. From these analyses, we can provide with confidence an unbiased assessment of the tool precision and the tool capacity to extract patterns. We already provided examples of various biological data images in Figures 4-6, showing all possible features that can be extracted with PatternJ. In these examples, we can claim by eye that PatternJ extracts patterns efficiently, but we cannot know how precise these extractions are because of the nature of biological data: "real" positions of features are unknown in biological data. Such validation will be limited to assessing whether a pattern was found or not, which we believe we already provided with the examples in Figures 4-6.

      * The video tutorial available in the PatternJ website is very useful, maybe it would be worth it to include it as supplemental material for this manuscript, if the journal allows it. *

      As the video tutorial may have been missed by other reviewers, we agree it is important to make it more prominent to users. We have now added a Help menu in the toolset that opens the tutorial video. Having the video as supplementary material could indeed be a useful addition if the size of the video is compatible with the journal limits.

      * An example image is provided to test the macro. However, it would be useful to provide further example images for each of the three possible standard patterns suggested: Block, actin sarcomere or individual band.*

      We agree this can help users. We now provide another multi-channel example image on the PatternJ website including blocks and a pattern made of a linear intensity gradient that can be extracted with our simpler "single pattern" algorithm, which were missing in the first example. Additionally, we provide an example to be used with our new time-lapse analysis.

      * Access to both the manual and the sample images in the PatternJ website should be made publicly available. Right now they both sit in a private Drive account. *

      As mentioned above, we apologize for access issues that occurred during the review process. These files can now be downloaded directly on the website without any sort of authentication. Additionally, these files are now also available on GitHub.

      * Some common errors are not properly handled by the macro and could be confusing for the user: When there is no selection and one tries to run a Check or Extraction: "Selection required in line 307 (called from line 14). profile=getProfile( ;". A simple "a line selection is required" message would be useful there. When "band" or "block" is selected for a channel in the "Set parameters" window, yet a 0 value is entered into the corresponding "Number of bands or blocks" section, one gets this error when trying to Extract: "Empty array in line 842 (called from line 113). if ( ( subloc . length == 1 ) & ( subloc [ 0 == 0) ) {". This error is not too rare, since the "Number of bands or blocks" section is populated with a 0 after choosing "sarcomeric actin" (after accepting the settings) and stays that way when one changes back to "blocks" or "bands".*

      We thank the reviewer for pointing out these bugs. These bugs are now corrected in the revised version.

      * The fact that every time one clicks on the most used buttons, the getDirectory window appears is not only quite annoying but also, ultimately a waste of time. Isn't it possible to choose the directory in which to store the files only once, from the "Set parameters" window?*

      We have now found a solution to avoid this step. The user is only prompted to provide the image folder when pressing the "Set parameters" button. We kept the directory prompt only for the cases where the user selects the time-lapse analysis or the analysis of multiple ROIs. The main reason is that it is very easy for the analysis to end up in the wrong folder otherwise.

      * The authors state that the outputs of the workflow are "user friendly text files". However, some of them lack descriptive headers (like the localisations and profiles) or even file names (like colors.txt). If there is something lacking in the manuscript, it is a brief description of all the output files generated during the workflow.*

      PatternJ generates multiple files, several of which are internal to the toolset. They are needed to keep track of which analyses were done and which colors were used in the images, amongst others. From the user's perspective, only the files produced by the analysis, All_localizations.channel_X.txt and sarcomere_lengths.txt, are useful. To improve the user experience, we have now moved all internal files to a folder named "internal", which we think will clarify which outputs are useful for further analysis and which ones are not. We thank the reviewer for raising this point and we now mention it in our Tutorial.

      *I don't really see the point in saving the localizations from the "Extraction" step; they are even named "temp".*

      We thank the reviewer for this comment; saving these files was indeed not necessary. We modified PatternJ to delete these files after they are used.

      * In the same line, I DO see the point of saving the profiles and localizations from the "Extract & Save" step, but I think they should be deleted during the "Analysis" step, since all their information is then grouped in a single file, with descriptive headers. This deleting could be optional and set in the "Set parameters" window.*

      We understand the point raised by the reviewer. However, the analysis depends on the reference channel picked, which is asked for when starting an analysis, and can be augmented with additional selections. If a user chooses to modify the reference channel or to add a new profile to the analysis, deleting all these files would mean that the user will have to start over again, which we believe will create frustration. An optional deletion at the analysis step is simple to implement, but it could create problems for users who do not understand what it means practically.

      * Moreover, I think it would be useful to also save the linear roi used for the "Extract & Save" step, and eventually combine them during the "Analysis step" into a single roi set file so that future re-analysis could be made on the same regions. This could be an optional feature set from the "Set parameters" window. *

      We agree with the reviewer that saving ROIs is very useful. ROIs are now saved into a single file each time the user extracts and saves positions from a selection. Additionally, the user can re-use previous ROIs and analyze an image or image series in a single step.

      *In the "PatternJ workflow" section of the manuscript, the authors state that after the "Extract & Save" step "(...) steps 1, 2, 4, and 5 can be repeated on other selections (...)". However, technically, only steps 1 and 5 are really necessary (alternatively 1, 4 and 5 if the user is unsure of the quality of the patterning). If a user follows this to the letter, I think it can lead to wasted time.*

      We agree with the reviewer and have corrected the manuscript accordingly (lines 119-120).

      *I believe that the "Version Information" button, although important, has the potential to be more useful if used as a "Help" button for the toolset. There could be links to useful sources like the manuscript or the PatternJ website but also some tips like "whenever possible, use a higher linewidth for your line selection".*

      We agree with the reviewer as pointed out in our previous answers to the other reviewers. This button is now replaced by a Help menu, including a simple tutorial in a series of images detailing the steps to follow, a link to the user website, and a link to our video tutorial.

      *It would be interesting to mention to what extent the orientation of the line selection in relation to the patterned structure (i.e. perfectly parallel vs more diagonal) affects pattern length variability.*

      As answered to reviewer 1, we understand this concern, which needs to be clarified for readers. The issue may be concerning at first sight, but the error grows only as the inverse of the cosine of the misalignment angle and is therefore rather low. For example, if the user creates a selection that is off by 3 degrees, which is visually obvious, lengths will be overestimated by only 0.14%. The point raised by the reviewer is important to discuss, and we therefore have added a comment on the choice of selection (lines 94-98) as well as a supplementary figure (Figure 1 - figure supplement 1).
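      The inverse-cosine dependence mentioned above can be checked with a few lines of Python. This is only an illustrative sketch and not part of PatternJ: the function name `relative_length_error` is hypothetical, and the calculation assumes a straight line selection tilted by a fixed angle relative to the pattern axis, so that a true period d is measured as d / cos(angle).

```python
import math


def relative_length_error(misalignment_deg):
    """Fractional overestimate of a period measured along a tilted line.

    Assumes the measured length is true_length / cos(angle), so the
    relative error is 1 / cos(angle) - 1.
    """
    return 1.0 / math.cos(math.radians(misalignment_deg)) - 1.0


# Print the overestimate, in percent, for a few misalignment angles.
for angle in (1, 3, 5, 10):
    print(f"{angle:>2} deg: +{100 * relative_length_error(angle):.2f} %")
```

      For a 3 degree misalignment this gives about 0.14%, in line with the figure quoted in the response; the error stays below 2% up to 10 degrees.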

      * When "the algorithm uses the peak of highest intensity as a starting point and then searches for peak intensity values one spatial period away on each side of this starting point" (line 133-135), does that search have a range? If so, what is the range? *

      We agree that this information is useful to share with the reader. The range is one pattern size. We have modified the sentence to clarify the range of search used and the resulting limits in aperiodicity (now lines 176-181).
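      The search described above can be sketched as follows. This is a hypothetical Python illustration, not the PatternJ macro code: the function name `find_periodic_peaks`, the rounding to whole pixels, and the interpretation of "the range is one pattern size" as a window of one period centred on the expected position are all assumptions.

```python
def find_periodic_peaks(profile, period):
    """Walk outward from the global maximum, picking one peak per period.

    At each step the expected position is one period away from the last
    peak found, and the brightest pixel within half a period on either
    side of that expected position is taken as the next peak.
    """
    if period < 2:
        raise ValueError("period must be at least 2 pixels")
    step = int(round(period))
    half = max(1, int(round(period / 2)))  # assumed half-width of the window
    start = max(range(len(profile)), key=profile.__getitem__)
    peaks = [start]

    for direction in (-1, 1):
        current = start
        while True:
            centre = current + direction * step
            if centre < 0 or centre >= len(profile):
                break  # expected position lies outside the profile
            lo = max(0, centre - half)
            hi = min(len(profile), centre + half + 1)
            current = max(range(lo, hi), key=profile.__getitem__)
            peaks.append(current)
    return sorted(peaks)


# Synthetic profile with slightly aperiodic peaks (hypothetical values).
profile = [0.0] * 40
for position, height in ((5, 8.0), (15, 10.0), (26, 9.0), (35, 7.0)):
    profile[position] = height
peaks = find_periodic_peaks(profile, period=10)
```

      Because each window is centred on the expected position rather than fixed on a grid, a peak displaced by up to half a period is still picked up, which is one way to read the aperiodicity limit mentioned in the response.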

      * Line 144 states that the parameters of the fit are saved and given to the user, yet I could not find such information in the outputs. *

      The parameters of the fits are saved for blocks. We have now clarified this point by modifying the manuscript (lines 186-198) and modifying Figure 1 - figure supplement 5. We realized we made an error in the description of how edges of "block with middle band" are extracted. This is now corrected.

      * In line 286, authors finish by saying "More complex patterns from electron microscopy images may also be used with PatternJ.". Since this statement is not backed by evidence in the manuscript, I suggest deleting it (or at the very least, providing some examples of what more complex patterns the authors refer to). *

      This sentence is now deleted.

      * In the TEM image of the fly wing muscle in fig. 4 there is a subtle but clearly visible white stripe pattern in the original image. Since that pattern consists of 'dips', rather than 'peaks' in the profile of the inverted image, they do not get analyzed. I think it is worth mentioning that if the image of interest contains both "bright" and "dark" patterns, then the analysis should be performed in both the original and the inverted images because the nature of the algorithm does not allow it to detect "dark" patterns. *

      We agree with the reviewer's comment. We now mention this point in lines 337-339.

      *In line 283, the authors mention using background correction. They should state explicitly which method of background correction they used. If they used ImageJ's "subtract background" tool, then they should specify the radius.*

      We now describe this step in the Methods section.

      *Reviewer #3 (Significance (Required)):*

      • *Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. Being a software paper, the advance proposed by the authors is technical in nature. The novelty and significance of this tool is that it offers quick and simple pattern analysis at the single unit level to a broad audience, since it runs on the ImageJ GUI and does not require any programming knowledge. Moreover, all the modules and steps are well described in the paper, which makes the analysis easy to follow.*
      • *Place the work in the context of the existing literature (provide references, where appropriate). The authors themselves provide a good and thorough comparison of their tool with other existing ones, both in terms of ease of use and on the type of information extracted by each method. While PatternJ is not necessarily superior in all aspects, it succeeds at providing precise single pattern unit measurements in a user-friendly manner.*
      • *State what audience might be interested in and influenced by the reported findings. Most researchers working with microscopy images of muscle cells or fibers or any other patterned sample and interested in analyzing changes in that pattern in response to perturbations, time, development, etc. could use this tool to obtain useful, and otherwise laborious, information.*

      We thank the reviewer for these enthusiastic comments about how straightforward for biologists it is to use PatternJ and its broad applicability in the bio community.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The authors present an ImageJ Macro GUI tool set for the quantification of one-dimensional repeated patterns that are commonly occurring in microscopy images of muscles.

      Major comments

      In our view the article and the software could be improved by defining more clearly the scope of applicability and the intended user base. In many parts the article and software suggest that general biological patterns can be analysed, but in other parts very specific muscle actin wordings are used. We point these out in the "Minor comments" sections below. We feel that the authors could improve their work by making a clear choice here. One option would be to clearly limit the scope of the tool to the analysis of actin structures in muscles. In this case we would also recommend renaming the tool, e.g. MusclePatternJ. The other option would be to make the tool about the generic analysis of one-dimensional patterns, maybe calling the tool LinePatternJ. In the latter case we would recommend removing all actin-specific wordings from the macro tool set, and parts of the article should be slightly rewritten.

      Minor/detailed comments

      Software

      We recommend considering the following suggestions for improving the software.

      File and folder selection dialogs

      In general, clicking on many of the buttons just opens up a file-browser dialog without any further information. For novel users it is not clear what the tool expects one to select here. It would be very good if the software could be rewritten such that there are always clear instructions displayed about which file or folder one should open for the different buttons.

      Extract button

      The tool asks one to specify things like whether selections are drawn "M-line-to-M-line"; for users that are not experts in muscle morphology this is not understandable. It would be great to find more generally applicable formulations.

      Manual selection accuracy

      The first step of the analysis is always to start from a user hand-drawn profile across intensity patterns in the image. However, this step can cause inaccuracy that varies with the shape and curvature of the line profile drawn. If the line is not strictly perpendicular to, for example, the M-line patterns, the distance between intensity peaks will be different. This will be more problematic when dealing with non-straight features arranged in parallel in the image. If the structure is curved, the line drawn over it also needs to reproduce this curve to capture the intensity pattern precisely. I found that this limits the reproducibility and ease of use of the software.

      Reproducibility

      Since the line profile drawn on the image is the first step and essential to the entire process, the authors should consider saving it together with the analysis result, for example as ImageJ ROI or ROIset files that can be re-imported, correctly positioned, and visualized in the measured images. This would greatly improve the reproducibility of the proposed workflow. In the manuscript, only the extracted features are being saved (because the save button is also just asking for a folder containing images, I cannot verify its functionality).

      ? button

      It would be great if that button would open up some usage instructions.

      Easy improvement of workflow

      I would suggest a reasonable expansion of the current workflow, by fitting and displaying 2D lines to the band or line structure in the image, that form the "patterns" the author aims to address. Thus, it extracts geometry models from the image, and the inter-line distance, and even the curve formed by these sets of lines can be further analyzed and studied. These fitted 2D lines can be also well integrated into ImageJ as Line ROI, and thus be saved, imported back, and checked or being further modified. I think this can largely increase the usefulness and reproducibility of the software.

      Manuscript

      We recommend considering the following suggestions for improving the manuscript.

      Abstract: The abstract suggests that general patterns can be quantified, however the actual tool quantifies specific subtypes of one-dimensional patterns. We recommend adapting the abstract accordingly.

      Line 58: The gray-level co-occurrence matrix (GLCM) based approach to feature extraction and analysis is neither mentioned nor compared. There is at least one relatively recent study of sarcomere structure based on GLCM feature extraction: https://github.com/steinjm/SotaTool, with publication https://doi.org/10.1002/cpz1.462

      Line 75: "...these simple geometrical features will address most quantitative needs..." We feel that this may be an overstatement, e.g. we can imagine that there should be many relevant two-dimensional patterns in biology?!

      Line 83: "After a straightforward installation by the user, ...". We think it would be convenient to add the installation steps at this place into the manuscript.

      Line 87: "Multicolor images will give a graph with one profile per color." The 'Multicolor images' here should be more precisely stated as "multi-channel" images. Multi-color images could be confused with RGB images which will be treated as 8-bit gray value (type conversion first) images by profile plot in ImageJ.

      Line 92: "...such as individual bands, blocks, or sarcomeric actin...". While bands and blocks are generic pattern terms, the biological term "sarcomeric actin" does not seem to fit in this list. Could a more generic wording be found, such as "block with spike"?

      Line 95: "the algorithm defines one pattern by having the features of highest intensity in its centre". Could this be rephrased? We did not understand what that exactly means.

      Lines 124-147: This part is the only description of the algorithm behind the feature extraction and analysis, but it is not clearly stated. Many details are missing or assumed to be known by the reader. For example, how sub-pixel resolution is achieved is not clear. One can only assume that, by fitting a Gaussian to the band, the center position (peak) can be calculated from a continuous curve rather than from pixels.
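      To make concrete the kind of sub-pixel estimate the comment above is guessing at, here is a minimal Python sketch. It is an assumption for illustration only and is not confirmed to be PatternJ's method: the function name `subpixel_peak` is hypothetical, and it uses a three-point Gaussian interpolation (a parabola fitted to the logarithm of the intensities) rather than a full least-squares fit. It requires strictly positive intensities around the peak.

```python
import math


def subpixel_peak(profile):
    """Estimate the centre of a single band with sub-pixel precision.

    Uses the brightest pixel and its two neighbours. The result is exact
    for a noise-free Gaussian, because the logarithm of a Gaussian is a
    parabola whose vertex is the Gaussian centre.
    """
    i = max(range(len(profile)), key=profile.__getitem__)
    if i == 0 or i == len(profile) - 1:
        return float(i)  # no neighbour on one side: keep the pixel index
    left, mid, right = (math.log(profile[j]) for j in (i - 1, i, i + 1))
    curvature = left - 2.0 * mid + right
    if curvature == 0.0:
        return float(i)  # flat top: no refinement possible
    return i + 0.5 * (left - right) / curvature


# Synthetic band centred between two pixels (hypothetical values).
centre, sigma = 12.3, 2.0
band = [math.exp(-((x - centre) ** 2) / (2 * sigma**2)) for x in range(25)]
estimate = subpixel_peak(band)
```

      On this noise-free band the brightest pixel is at index 12, while the interpolated estimate recovers the true centre of 12.3 to within floating-point error. With real, noisy data a fit over more points would be more robust.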

      Line 407: We think the availability of both the tool and the code could be improved. For Fiji tools it is common practice to create an Update Site and to make the code available on GitHub. In addition, downloading the example file (https://drive.google.com/file/d/1eMazyQJlisWPwmozvyb8VPVbfAgaH7Hz/view?usp=drive_link) required a Google login and access request, which is not very convenient; in fact, we asked for access but it was denied. It would be important for the download to be easier, e.g. from GitHub or Zenodo.

      Significance

      The strength of this study is that a tool for the analysis of one-dimensional repeated patterns occurring in muscle fibres is made available in the accessible open-source platform ImageJ/Fiji. In the introduction to the article the authors provide an extensive review of comparable existing tools. Their new tool fills a gap by providing easy-to-use software that enables users without computational skills to analyse muscle sarcomere patterns. We feel that if the below-mentioned limitations could be addressed, the tool could indeed be valuable to life scientists interested in muscle patterning without computational skills.

      In our view there are a few limitations, including the accessibility of example data and tutorials at sites.google.com/view/patternj, which we had trouble accessing. In addition, we think that the workflow in Fiji, which currently requires pressing several buttons in the correct order, could be further simplified and streamlined by adopting some "wizard" approach, where the user is guided through the steps. Another limitation is the reproducibility of the analysis; here we recommend enabling IJ Macro recording as well as saving of the drawn line ROIs. For more detailed suggestions for improvements please see the above sections of our review.

    1. Reviewer #2 (Public Review):

      The document "Mapping spatial patterns to energetic benefits in groups of flow-coupled swimmers" by Heydari et al. uses several types of simulations and models to address aspects of stability of position and power consumption in few-body groups of pitching foils. I think the work has the potential to be a valuable and timely contribution to an important subject area. The supporting evidence is largely quite convincing, though some details could raise questions, and there is room for improvement in the presentation. My recommendations are focused on clarifying the presentation and perhaps spurring the authors to assess additional aspects:

      (1) Why do the authors choose to set the swimmers free only in the propulsion direction? I can understand constraining all the positions/orientations for investigating the resulting forces and power, and I can also understand the value of allowing the bodies to be fully free in x, y, and their orientation angle to see if possible configurations spontaneously emerge from the flow interactions. But why constrain some degrees of freedom and not others? What's the motivation, and what's the relevance to animals, which are fully free?

      (2) The model description in Eq. (1) and the surrounding text is confusing. Aren't the authors computing forces via CFD or the VS method and then simply driving the propulsive dynamics according to the net horizontal force? It seems then irrelevant to decompose things into thrust and drag, and it seems irrelevant to claim that the thrust comes from pressure and the drag from viscous effects. The latter claim may in fact be incorrect since the body has a shape and the normal and tangential components of the surface stress along the body may be complex.

      (3) The parameter tau_diss in the VS simulations takes on unusual values such as 2.45T, making it seem like this value is somehow very special, and perhaps 2.44 or 2.46 would lead to significantly different results. If the value is special, the authors should discuss and assess it. Otherwise, I recommend picking a round value, like 2 or 3, which would avoid distraction.

      (4) Some of the COT plots/information were difficult to interpret because the mathematical sign that corresponds to a benefit changes from one quantity to another. For example, DeltaCOT as introduced on p. 5 is such that negative indicates bad energetics as compared to a solo swimmer. But elsewhere, lower or more negative COT is good in terms of savings. Given the many plots, large amounts of data, and many quantities being assessed, the paper needs a highly uniform presentation to aid the reader.

      (5) I didn't understand the value of the "flow agreement parameter," and I didn't understand the authors' interpretation of its significance. Firstly, it would help if this and all other quantities were given explicit definitions as complete equations (including normalization). As I understand it, the quantity indicates the match of the flow velocity at some location with the flapping velocity of a "ghost swimmer" at that location. This does not seem to be exactly relevant to the equilibrium locations. In particular, if the match were perfect, then the swimmer would generate no relative flow and thus no thrust, meaning such a location could not be an equilibrium. So, some degree of mismatch seems necessary. I believe such a mismatch is indeed present, but the plots such as those in Figure 4 may disguise the effect. The color bar is saturated to the point of essentially being three tones (blue, white, red), so we cannot see that the observed equilibria are likely between the max and min values of this parameter.

      (6) More generally, and related to the above, I am favorable towards the authors' attempts to find approximate flow metrics that could be used to predict the equilibrium positions and their stability, but I think the reasoning needs to be more solid. It seems the authors are seeking a parameter that can indicate equilibrium and another that can indicate stability. Can they clearly lay out the motivation behind any proposed metrics, and clearly present complete equations for their definitions? Further, is there a related power metric that can be appropriately defined and which proves to be useful?

      (7) Why do the authors not carry out CFD simulations on the larger groups? Some explanations should be given, or some corresponding CFD simulations should be carried out. It would be interesting if CFD simulations were done and included, especially for the in-line case of many swimmers. This is because the results seem to be quite nuanced and dependent on many-body effects beyond nearest-neighbor interactions. It would certainly be comforting to see something similar happen in CFD.

      (8) Related to the above, the authors should discuss seemingly significant differences in their results for long in-line formations as compared to the CFD work of Peng et al. [48]. That work showed apparently stable groups for numbers of swimmers considerably larger than those studied here. Why such a qualitatively different result, and how should we interpret these differences regarding the more general issue of the stability of tandem groups?

      (9) The authors seem to have all the tools needed to address the general question about how dynamically stable configurations relate to those that are energetically optimal. Are stable solutions optimal, or not? This would seem to have very important implications for animal groups, and the work addresses closely related topics but seems to miss the opportunity to give a definitive answer to this big question.

      (10) Time-delay particle model: This model seems to construct a simplified wake flow. But does the constructed flow satisfy basic properties that we demand of any flow, such as being divergence-free? If not, then the formulation may be troublesome.

    1. each person is seen as being rich in potential; as having power, dignity, and many, varied strengths.

      This is why I think it's important we recognize all kinds of strengths and celebrate the wins of every student, no matter how "small". Disabled or not, we all have different strengths that should be valued equally, but this is especially important for those with disabilities that may make traditional schooling or certain subjects more difficult. Students like that may thrive in more niche areas such as art, and I think it's especially important we uplift these students so they feel valued in a society that often uplifts certain skills over others.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1: 

      This is my first review of the article entitled "The canonical stopping network: Revisiting the role of the subcortex in response inhibition" by Isherwood and colleagues. This study is one in a series of excellent papers by the Forstmann group focusing on the ability of fMRI to reliably detect activity in small subcortical nuclei - in this case, specifically those purportedly involved in the hyper- and indirect inhibitory basal ganglia pathways. I have been very fond of this work for a long time, beginning with the demonstration by De Hollander, Forstmann et al. (HBM 2017) that 3T fMRI (as well as many 7T imaging sequences) does not afford a sufficient signal-to-noise ratio to reliably image these small subcortical nuclei. This work has done a lot to reshape my view of seminal past studies of subcortical activity during inhibitory control, including some that have several thousand citations.

      In the current study, the authors compiled five datasets that aimed to investigate neural activity associated with stopping an already initiated action, as operationalized in the classic stop-signal paradigm. Three of these datasets are taken from their own 7T investigations, and two are datasets from the Poldrack group, which used 3T fMRI.

      The authors make six chief points: 

      (1) There does not seem to be a measurable BOLD response in the purportedly critical subcortical areas in contrasts of successful stopping (SS) vs. going (GO), neither across datasets nor within each individual dataset. This includes the STN but also any other areas of the indirect and hyperdirect pathways.

      (2) The failed-stop (FS) vs. GO contrast is the only contrast showing substantial differences in those nodes.

      (3) The positive findings of STN (and other subcortical) activation during the SS vs. GO contrast could be due to the usage of inappropriate smoothing kernels.

      (4) The study demonstrates the utility of aggregating publicly available fMRI data from similar cognitive tasks. 

      (5) From the abstract: "The findings challenge previous functional magnetic resonance (fMRI) of the stop-signal task" 

      (6) and further: "suggest the need to ascribe a separate function to these networks." 

      I strongly and emphatically agree with points 1-5. However, I vehemently disagree with point 6, which appears to be the main thrust of the current paper, based on the discussion, abstract, and - not least - the title.

      To me, this paper essentially shows that fMRI is ill-suited to study the subcortex in the specific context of the stop-signal task. That is not just because of the issues of subcortical small-volume SNR (the main topic of this and related works by this outstanding group), but also because of its limited temporal resolution (which is unacknowledged, but especially impactful in the context of the stop-signal task). I'll expand on what I mean in the following.

      First, the authors are underrepresenting the non-fMRI evidence in favor of the involvement of the subthalamic nucleus (STN) and the basal ganglia more generally in stopping actions. 

      - There are many more intracranial local field potential recording studies that show increased STN LFP (or even single-unit) activity in the SS vs. FS and SS vs. GO contrast than listed, which come from at least seven different labs. Here's a (likely non-exhaustive) list of studies that come to mind:

      Ray et al., NeuroImage 2012
      Alegre et al., Experimental Brain Research 2013
      Benis et al., NeuroImage 2014
      Wessel et al., Movement Disorders 2016
      Benis et al., Cortex 2016
      Fischer et al., eLife 2017
      Ghahremani et al., Brain and Language 2018
      Chen et al., Neuron 2020
      Mosher et al., Neuron 2021
      Diesburg et al., eLife 2021

      - Similarly, there is much more evidence than cited that causally influencing STN via deep-brain stimulation also influences action-stopping. Again, the following list is probably incomplete: 

      Van den Wildenberg et al., JoCN 2006
      Ray et al., Neuropsychologia 2009
      Hershey et al., Brain 2010
      Swann et al., JNeuro 2011
      Mirabella et al., Cerebral Cortex 2012
      Obeso et al., Exp. Brain Res. 2013
      Georgiev et al., Exp Br Res 2016
      Lofredi et al., Brain 2021
      van den Wildenberg et al, Behav Brain Res 2021
      Wessel et al., Current Biology 2022

      - Moreover, evidence from non-human animals similarly suggests critical STN involvement in action stopping, e.g.: 

      Eagle et al., Cerebral Cortex 2008
      Schmidt et al., Nature Neuroscience 2013
      Fife et al., eLife 2017
      Anderson et al., Brain Res 2020

      Together, studies like these provide either causal evidence for STN involvement via direct electrical stimulation of the nucleus or provide direct recordings of its local field potential activity during stopping. This is not to mention the extensive evidence for the involvement of the STN - and the indirect and hyperdirect pathways in general - in motor inhibition more broadly, perhaps best illustrated by their damage leading to (hemi)ballism. 

      Hence, I cannot agree with the idea that the current set of findings "suggest the need to ascribe a separate function to these networks", as suggested in the abstract and further explicated in the discussion of the current paper. For this to be the case, we would need to disregard more than a decade's worth of direct recording studies of the STN in favor of a remote measurement of the BOLD response using (provably) sub ideal imaging parameters. There are myriads of explanations of why fMRI may not be able to reveal a potential ground-truth difference in STN activity between the SS and FS/GO conditions, beginning with the simple proposition that it may not afford sufficient SNR, or that perhaps subcortical BOLD is not tightly related to the type of neurophysiological activity that distinguishes these conditions (in the purported case of the stop-signal task, specifically the beta band). But essentially, this paper shows that a specific lens into subcortical activity is likely broken, but then also suggests dismissing existing evidence from superior lenses in favor of the findings from the 'broken' lens. That doesn't make much sense to me.

      Second, there is actually another substantial reason why fMRI may indeed be unsuitable to study STN activity, specifically in the stop-signal paradigm: its limited time resolution. The sequence of subcortical processes on each specific trial type in the stop-signal task is purportedly as follows: at baseline, the basal ganglia exert inhibition on the motor system. During motor initiation, this inhibition is lifted via direct pathway innervation. This is when the three trial types start diverging. When actions then have to be rapidly cancelled (SS and FS), cortical regions signal to STN via the hyperdirect pathway that inhibition has to be rapidly reinstated (see Chen, Starr et al., Neuron 2020 for direct evidence for such a monosynaptic hyperdirect pathway, the speed of which directly predicts SSRT). Hence, inhibition is reinstated (too late in the case of FS trials, but early enough in SS trials, see recordings from the BG in Schmidt, Berke et al., Nature Neuroscience 2013; and Diesburg, Wessel et al., eLife 2021). 

      Hence, according to this prevailing model, all three trial types involve a sequence of STN activation (initial inhibition), STN deactivation (disinhibition during GO), and STN reactivation (reinstantiation of inhibition during the response via the hyperdirect pathway on SS/FS trials, reinstantiation of inhibition via the indirect pathway after the response on GO trials). What distinguishes the trial types during this period is chiefly the relative timing of the inhibitory process (earliest on SS trials, slightly later on FS trials, latest on GO trials). However, these temporal differences play out on a level of hundreds of milliseconds, and in all three cases, processing concludes well under a second overall. To fMRI, given its limited time resolution, these activations are bound to look quite similar. 
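      The temporal-resolution argument above can be illustrated with a small Python sketch. It models one brief neural event per trial type, passes it through a canonical double-gamma haemodynamic response function (HRF), and correlates the predicted BOLD time courses. The HRF parameters (gamma shapes 6 and 16, undershoot ratio 1/6) are a commonly used default, and the event latencies (0.2 s, 0.3 s, 0.6 s) and function names are hypothetical values chosen only for illustration; none of them come from the reviewed paper.

```python
import math

def hrf(t):
    """Canonical double-gamma HRF (assumed default parameters), t in seconds."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)          # positive response
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)  # late undershoot
    return peak - undershoot / 6.0

def predicted_bold(onset, dt=0.1, duration=30.0):
    """BOLD prediction for one impulse-like neural event at `onset` seconds.

    Convolving an impulse with the HRF simply shifts the HRF by the onset.
    """
    n = int(round(duration / dt))
    return [hrf(i * dt - onset) for i in range(n)]

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

# Hypothetical latencies of the inhibitory process on each trial type.
ss = predicted_bold(onset=0.2)  # successful stop: earliest
fs = predicted_bold(onset=0.3)  # failed stop: slightly later
go = predicted_bold(onset=0.6)  # go: latest

print("SS vs FS:", pearson(ss, fs))
print("SS vs GO:", pearson(ss, go))
```

      Under these assumptions the predicted responses are almost perfectly correlated, which is the sense in which latency differences of a few hundred milliseconds are hard to separate in the BOLD signal.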

      Lastly, further building on this logic, it's not surprising that FS trials yield increased activity compared to SS and GO trials. That's because FS trials are errors, which are known to activate the STN (Cavanagh et al., JoCN 2014; Siegert et al. Cortex 2014) and afford additional inhibition of the motor system after their occurrence (Guan et al., JNeuro 2022). Again, fMRI will likely conflate this activity with the abovementioned sequence, resulting in a summation of activity and the highest level of BOLD for FS trials. 

      In sum, I believe this study has a lot of merit in demonstrating that fMRI is ill-suited to study the subcortex during the SST, but I cannot agree that it warrants any reappreciation of the subcortex's role in stopping, which is not chiefly based on fMRI evidence.

      We would like to thank reviewer 1 for their insightful and helpful comments. We have responded point-by-point below and will give an overview of how we reframed the paper here.  

      We agree that there is good evidence from other sources for the presence of the canonical stopping network (indirect and hyperdirect) during action cancellation, and that this should be reflected more in the paper. However, we do not believe that a lack of evidence for this network during the SST makes fMRI ill-suited for studying this task, or other tasks that have neural processes occurring in quick succession. What we believe the fMRI activation patterns reflect during this task is the large amount of activation caused by failed stops. That is, the role of the STN in error processing may be more pronounced than its role in action cancellation. Due to the replicability of fMRI results, especially at higher field strengths, we believe the activation profile of failed stop trials reflects a paramount role for the STN in error processing. Therefore, while we agree we do not provide evidence against the role of the STN in action cancellation, we do provide evidence that our outlook on subcortical activation during different trial types of this task should be revisited. We have reframed the article to reflect this, and discuss points such as fMRI reliability, validity and the complex overlapping of cognitive processes in the SST in the discussion. Please see all changes to the article indicated by red text.

      A few other points: 

      - As I said before, this team's previous work has done a lot to convince me that 3T fMRI is unsuitable to study the STN. As such, it would have been nice to see a combination of the subsamples of the study that DID use imaging protocols and field strengths suitable to actually study this node. This is especially true since the second 3T sample (and arguably, the Isherwood_7T sample) does not afford a lot of trials per subject, to begin with.

      Unfortunately, this study already comprises the only 7T open access datasets available for the SST. Therefore, unless we combined only the deHollander_7T and Miletic_7T subsamples, there is no additional analysis we can do for this right now. While looking at just the subsamples that were 7T and had >300 trials would be interesting, based on the new framing of the paper we do not believe it adds to the study, as the subsamples still lack the temporal resolution seemingly required for looking at the processes in the SST.

      - What was the GLM analysis time-locked to on SS and FS trials? The stop-signal or the GO-signal? 

      SS and FS trials were time-locked to the GO signal as this is standard practice. The main reason for this is that we use contrasts to interpret differences in activation patterns between conditions. By time-locking the FS and SS trials to the stop signal, we are contrasting events at different time points, and therefore different stages of processing, which introduces its own sources of error. We agree with the reviewer, however, that a separate analysis with time-locking on the stop-signal has its own merit, and now include results in the supplementary material where the FS and SS trials are time-locked to the stop signal as well.

      - Why was SSRT calculated using the outdated mean method? 

      We originally calculated SSRT using the mean method as this was how it was reported in the oldest of the aggregated studies. We have now re-calculated the SSRTs using the integration method with go omission replacement and thank the reviewer for pointing this out. Please see response to comment 3.

      - The authors chose 3.1 as a z-score to "ensure conservatism", but since they are essentially trying to prove the null hypothesis that there is no increased STN activity on SS trials, I would suggest erring on the side of a more lenient threshold to avoid type-2 error. 

      We have used minimum FDR-corrected thresholds for each contrast now, instead of using a blanket conservative threshold of 3.1 over all contrasts. The new thresholds for each contrast are shown in text. Please see below (page 12):

      “The thresholds for each contrast are as follows: 3.01 for FS > GO, 2.26 for FS > SS and 3.1 for SS > GO.”
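      As an illustration of how a per-contrast FDR threshold of this kind can be derived, the Python sketch below applies the Benjamini-Hochberg procedure to a list of z-values and returns the smallest z that still survives correction. The response does not state which FDR procedure or software was used, so the procedure, the one-sided test, q = 0.05, the toy z-values, and the function names are all assumptions.

```python
import math

def one_sided_p(z):
    """Upper-tail p-value of a standard normal z-score."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def fdr_z_threshold(z_values, q=0.05):
    """Smallest z that survives Benjamini-Hochberg FDR control at level q.

    Returns None when no value survives.
    """
    m = len(z_values)
    z_sorted = sorted(z_values, reverse=True)  # largest z = smallest p first
    threshold = None
    for k, z in enumerate(z_sorted, start=1):
        # BH rule: keep the largest rank k with p_(k) <= q * k / m.
        if one_sided_p(z) <= q * k / m:
            threshold = z
    return threshold

# Toy statistical map with ten voxels (hypothetical values).
z_map = [4.0, 3.5, 3.0, 1.0, 0.5, 0.0, -0.5, -1.0, 0.2, 0.1]
print(fdr_z_threshold(z_map))
```

      Because the cut-off depends on the distribution of values in each map, different contrasts can end up with different thresholds, as in the values quoted above.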

      - The authors state that "The results presented here add to a growing literature exposing inconsistencies in our understanding of the networks underlying successful response inhibition". It would be helpful if the authors cited these studies and what those inconsistencies are. 

      We thank reviewer 1 for their detailed and thorough evaluation of our paper. Overall, we agree that there is substantial direct and indirect evidence for the involvement of the cortico-basal-ganglia pathways in response inhibition. We have taken the vast constructive criticism on board and agree with the reviewer that the paper should be reframed. We would like to thank the reviewer for the thoroughness of their helpful comments aiding the revising of the paper.

      (1) I would suggest reframing the study, abstract, discussion, and title to reflect the fact that the study shows that fMRI is unsuitable to study subcortical activity in the SST, rather than the fact that we need to question the subcortical model of inhibition, given the reasons in my public review.

      We agree with the reviewer that the article should be reframed and not taken as direct evidence against the large sum of literature pointing towards the involvement of the cortico-basal-ganglia pathway in response inhibition. We have significantly rewritten the article in light of this.

      (2) I suggest combining the datasets that provide the best imaging parameters and then analyzing the subcortical ROIs with a more lenient threshold and with regressors time-locked to the stop-signals (if that's not already the case). This would make the claim of a null finding much more impactful. Some sort of power analysis and/or Bayes factor analysis of evidence for the null would also be appreciated. 

      Instead of using a blanket conservative threshold of 3.1, we instead used only FDR-corrected thresholds. The threshold level is therefore different for each contrast and noted in the figures. We have also added supplementary figures including the group-level SPMs and ROI analyses when the FS and SS trials were time-locked to the stop signal instead of the GO signal (Supplementary Figs 4 & 5). But as mentioned above, due to the difference in time points when contrasting, we believe that time-locking to the GO signal for all trial types makes more sense for the main analysis.

      We have now also computed BFs on the first level ROI beta estimates for all contrasts using the BayesFactor package as implemented in R. We add the following section to the methods and updated the results section accordingly (page 8):

      “In addition to the frequentist analysis we also opted to compute Bayes Factors (BFs) for each contrast per ROI per hemisphere. To do this, we extracted the beta weights for each individual trial type from our first level model. We then compared the beta weights from each trial type to one another using the ‘BayesFactor’ package as implemented in R (Morey & Rouder, 2015). We compared the full model comprising trial type, dataset and subject as predictors to the null model comprising only the dataset and subject as predictors. The datasets and subjects were modeled as random factors. We divided the resultant BFs from the full model by the null model to provide evidence for or against a significant difference in beta weights for each trial type. To interpret the BFs, we used a modified version of Jeffreys’ scale (Jeffreys, 1939; Lee & Wagenmakers, 2014).”

      (3) I suggest calculating SSRT using the integration method with the replacement of Go omissions, as per the most recent recommendation (Verbruggen et al., eLife 2019).

      We agree we should have used a more optimal method for SSRT estimation. We have replaced our original estimates with those from the integration method with replacement of go omissions, as suggested, and adapted the results in Table 3.

      We have also replaced text in the methods sections to reflect this (page 5):

      “For each participant, the SSRT was calculated using the mean method, estimated by subtracting the mean SSD from median go RT (Aron & Poldrack, 2006; Logan & Cowan, 1984).”

      Now reads:

      “For each participant, the SSRT was calculated using the integration method with replacement of go omissions (Verbruggen et al., 2019), estimated by integrating the RT distribution and calculating the point at which the integral equals p(respond|signal). The completion time of the stop process aligns with the nth RT, where n equals the number of RTs in the RT distribution of go trials multiplied by the probability of responding to a signal.”
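      The two SSRT estimators described in the quoted passages can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis code: the function names and toy data are hypothetical, go omissions are replaced with the maximum observed go RT as recommended by Verbruggen et al. (2019), and the "mean method" follows the wording quoted above (median go RT minus mean SSD).

```python
import math
import statistics

def ssrt_mean_method(go_rts, ssds):
    """SSRT as worded in the original manuscript: median go RT minus mean SSD."""
    observed = [rt for rt in go_rts if rt is not None]
    return statistics.median(observed) - statistics.mean(ssds)

def ssrt_integration(go_rts, ssds, stop_responded):
    """Integration method with replacement of go omissions.

    go_rts: go-trial RTs in ms, with None marking a go omission.
    ssds: stop-signal delays in ms, one per stop trial.
    stop_responded: one bool per stop trial, True for a failed stop.
    """
    observed = [rt for rt in go_rts if rt is not None]
    max_rt = max(observed)
    # Replace each go omission with the maximum observed go RT, then sort.
    filled = sorted(max_rt if rt is None else rt for rt in go_rts)
    p_respond = sum(stop_responded) / len(stop_responded)  # p(respond|signal)
    # nth RT, where n = number of go RTs * p(respond|signal).
    n = math.ceil(p_respond * len(filled))
    n = min(max(n, 1), len(filled))
    nth_rt = filled[n - 1]
    return nth_rt - statistics.mean(ssds)

# Toy data (hypothetical): ten go trials with two omissions, four stop trials.
go_rts = [400, 420, 450, 480, 500, 520, 550, 600, None, None]
ssds = [200, 250, 200, 250]
stop_responded = [True, False, True, False]

print(ssrt_integration(go_rts, ssds, stop_responded))
print(ssrt_mean_method(go_rts, ssds))
```

      With these toy numbers the integration estimate is 275 ms and the mean-method estimate is 265 ms; real data would of course use many more trials per participant.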

      Reviewer #2:

      This work aggregates data across 5 openly available stopping studies (3 at 7 tesla and 2 at 3 tesla) to evaluate activity patterns across the common contrasts of Failed Stop (FS) > Go, FS > stop success (SS), and SS > Go. Previous work has implicated a set of regions that tend to be positively active in one or more of these contrasts, including the bilateral inferior frontal gyrus, preSMA, and multiple basal ganglia structures. However, the authors argue that upon closer examination, many previous papers have not found subcortical structures to be more active on SS than FS trials, bringing into question whether they play an essential role in (successful) inhibition. In order to evaluate this with more data and power, the authors aggregate across five datasets and find many areas that are *more* active for FS than SS, specifically bilateral preSMA, caudate, GPE, thalamus, and VTA, and unilateral M1, GPi, putamen, SN, and STN. They argue that this brings into question the role of these areas in inhibition, based upon the assumption that areas involved in inhibition should be more active on successful stop than failed stop trials, not the opposite as they observed. 

      As an empirical result, I believe that the results are robust, but this work does not attempt a new theoretical synthesis of the neuro-cognitive mechanisms of stopping. Specifically, if these many areas are more active on failed stop than successful stop trials, and (at least some of) these areas are situated in pathways that are traditionally assumed to instantiate response inhibition like the hyperdirect pathway, then what function are these areas/pathways involved in? I believe that this work would make a larger impact if the author endeavored to synthesize these results into some kind of theoretical framework for how stopping is instantiated in the brain, even if that framework may be preliminary. 

      I also have one main concern about the analysis. The authors use the mean method for computing SSRT, but this has been shown to be more susceptible to distortion from RT slowing (Verbruggen, Chambers & Logan, 2013 Psych Sci), and goes against the consensus recommendation of using the integration with replacement method (Verbruggen et al., 2019). Therefore, I would strongly recommend replacing all mean SSRT estimates with estimates using the integration with replacement method. 

      I found the paper clearly written and empirically strong. As I mentioned in the public review, I believe that the main shortcoming is the lack of theoretical synthesis. I would encourage the authors to attempt to synthesize these results into some form of theoretical explanation. I would also encourage replacing the mean method with the integration with replacement method for computing SSRT. I also have the following specific comments and suggestions (in the approximate order in which they appear in the manuscript) that I hope can improve the manuscript: 

      We would like to thank reviewer 2 for their insightful and interesting comments. We have adapted our paper to reflect these comments. Please see direct responses to your comments below. We agree with the reviewer that some type of theoretical synthesis would help with the interpretability of the article. We have substantially reworked the discussion and included theoretical considerations behind the newer narrative. Please see all changes to the article indicated by red text.

      (1) The authors say "performance on successful stop trials is quantified by the stop signal reaction time". I don't think this is technically accurate. SSRT is a measure of the average latency of the stop process for all trials, not just for the trials in which subjects successfully stop. 

      Thank you for pointing out this technically incorrect statement. We have replaced the above sentence with the following (page 1):

      “Inhibition performance in the SST as a whole is quantified by the stop signal reaction time (SSRT), which estimates the speed of the latent stopping process (Verbruggen et al., 2019).”

      (2) The authors say "few studies have detected differences in the BOLD response between FS and SS trials", but then do not cite any papers that detected differences until several sentences later (de Hollander et al., 2017; Isherwood et al., 2023; Miletic et al., 2020). If these are the only ones, and they only show greater FS than SS, then I think this point could be made more clearly and directly. 

      We have moved the citations to the correct place in the text to be clearer. We have also rephrased this part of the introduction to make the points more direct (page 2).

      “In the subcortex, functional evidence is relatively inconsistent. Some studies have found an increase in BOLD response in the STN in SS > GO contrasts (Aron & Poldrack, 2006; Coxon et al., 2016; Gaillard et al., 2020; Yoon et al., 2019), but others have failed to replicate this (Bloemendaal et al., 2016; Boehler et al., 2010; Chang et al., 2020; B. Xu et al., 2015). Moreover, some studies have actually found higher STN, SN and thalamic activation in failed stop trials, not successful ones (de Hollander et al., 2017; Isherwood et al., 2023; Miletić et al., 2020).”

      (3) Unless I overlooked it, I don't believe that the author specified the criterion that any given subject is excluded based upon. Given some studies have significant exclusions (e.g., Poldrack_3T), I think being clear about how many subjects violated each criterion would be useful. 

      This is indeed interesting and important information to include. We have added the number of participants who were excluded for each criterion. Please see added text below (page 4):

      “Based on these criteria, no subjects were excluded from the Aron_3T dataset. 24 subjects were excluded from the Poldrack_3T dataset (3 based on criterion 1, 9 on criterion 2, 11 on criterion 3, and 8 on criterion 4). Three subjects were excluded from the deHollander_7T dataset (2 based on criterion 1 and 1 on criterion 2). Five subjects were excluded from the Isherwood_7T dataset (2 based on criterion 1, 1 on criterion 2, and 2 on criterion 4). Two subjects were excluded from the Miletic_7T dataset (1 based on criterion 2 and 1 on criterion 4). Note that some participants in the Poldrack_3T study failed to meet multiple inclusion criteria.”

      (4) The Method section included very exhaustive descriptions of the neuroimaging processing pipeline, which was appreciated. However, it seems that much of what is presented is not actually used in any of the analyses. For example, it seems that "functional data preprocessing" section may be fMRIPrep boilerplate, which again is fine, but I think it would help to clarify that much of the preprocessing was not used in any part of the analysis pipeline for any results. For example, at first blush, I thought the authors were using global signal regression, but after a more careful examination, I believe that they are only computing global signals but never using them. Similarly with tCompCor seemingly being computed but not used. If possible, I would recommend that the authors share code that instantiates their behavioral and neuroimaging analysis pipeline so that any confusion about what was actually done could be programmatically verified. At a minimum, I would recommend more clearly distinguishing the pipeline steps that actually went into any presented analyses.

      We thank the reviewer for finding this inconsistency. The methods section indeed uses the fMRIPrep boilerplate text, which we included so as to be as accurate as possible when describing the preprocessing steps taken. While we believe leaving the exact boilerplate text that fMRIPrep gives us is the most accurate way to report our preprocessing, we have adapted some of the text to clarify which computations were not used in the subsequent analysis. As a side note, for future reference, we would like to add that the fMRIPrep authors expressly recommend that users report the boilerplate completely and unaltered, and as such, we believe this may become a recurring issue (page 7).

      “While many regressors were computed in the preprocessing of the fMRI data, not all were used in the subsequent analysis. The exact regressors used for the analysis can be found above. For example, tCompCor and global signals were calculated in our generic preprocessing pipeline but not part of the analysis. The code used for preprocessing and analysis can be found in the data and code availability statement.”

      (5) What does it mean for the Poldrack_3T to have N/A for SSD range? Please clarify. 

      Thank you for pointing out this omission. We had not yet found the possible SSD range for this study. We have replaced this value with the correct value (0 – 1000 ms).

      (6) The SSD range of 0-2000ms for deHollander_7T and Miletic_7T seems very high. Was this limit ever reached or even approached? SSD distributions could be a useful addition to the supplement. 

      Thank you for also bringing this mistake to light. We had accidentally placed the maximum trial duration in these fields instead of the maximum allowable SSD value. We have replaced it with the correct value (0 – 900 ms).

      (7) The author says "In addition, median go RTs did not correlate with mean SSRTs within datasets (Aron_3T: r = .411, p = .10, BF = 1.41; Poldrack_3T: r = .011, p = .91, BF = .23; deHollander_7T: r = -.30, p = .09, BF = 1.30; Isherwood_7T: r = .13, p = .65, BF = .57; Miletic_7T: r = .37, p = .19, BF = 1.02), indicating independence between the stop and go processes, an important assumption of the horse-race model (Logan & Cowan, 1984)." However, the independent race model assumes context independence (the finishing time of the go process is not affected by the presence of the stop process) and stochastic independence (the duration of the go and stop processes are independent on a given trial). This analysis does not seem to evaluate either of these forms of independence, as it correlates RT and SSRT across subjects, so it was unclear how this analysis evaluated either of the types of independence that are assumed by the independent race model. Please clarify or remove. 

      Thank you for this comment. We realize that this analysis indeed does not evaluate either context or stochastic independence and therefore we have removed this from the manuscript.

      (8) The RTs in Isherwood_7T are considerably slower than the other studies, even though the go stimulus+response is the same (very simple) stimulus-response mapping from arrows to button presses. Is there any difference in procedure or stimuli that might explain this difference? It is the only study with a visual stop signal, but to my knowledge, there is no work suggesting visual stop signals encourage more proactive slowing. If possible, I think a brief discussion of the unusually slow RTs in Isherwood_7T would be useful. 

      We have included the following text in the manuscript to reflect this observed difference in RT between the Isherwood_7T dataset and the other datasets (page 9).

      “Longer RTs were found in the Isherwood_7T dataset in comparison to the four other datasets. The only difference in procedure in the Isherwood_7T dataset is the use of a visual stop signal as opposed to an auditory stop signal. This RT difference is consistent with previous research, where auditory stop signals and visual go stimuli have been associated with faster RTs compared to unimodal visual presentation (Carrillo-de-la-Peña et al., 2019; Weber et al., 2024). The mean SSRTs and probability of stopping are within normal range, indicating that participants understood the task and responded in the expected manner.”

      (9) When the authors included both 3T and 7T data, I thought they were preparing to evaluate the effect of magnet strength on stop networks, but they didn't do this analysis. Is this because the authors believe there is insufficient power? It seems that this could be an interesting exploratory analysis that could improve the paper.

      We thank the reviewer for this interesting comment. As our dataset sample contains only two 3T and three 7T datasets we indeed believe there is insufficient power to warrant such an analysis. In addition, we wanted the focus of this paper to be how fMRI examines the SST in general, and not differences between acquisition methods. With a greater number of datasets with different imaging parameters (especially TE or resolution) in addition to field strength, we agree such an analysis would be interesting, although beyond the scope of this article.

      (10) The authors evaluate smoothing and it seems that the conclusion that they want to come to is that with a larger smoothing kernel, the results in the stop networks bleed into surrounding areas, producing false positive activity. However, in the absence of a ground truth of the true contributions of these areas, it seems that an alternative interpretation of the results is that the denser maps when using a larger smoothing kernel could be closer to "true" activation, with the maps using a smaller smoothing kernel missing some true activity. It seems worth entertaining these two possible interpretations for the smoothing results unless there is clear reason to conclude that the smoothed results are producing false positive activity. 

      We agree with the view of the reviewer on the interpretation of the smoothing results. We indeed cannot rule this out as a possible interpretation of the results, due to a lack of ground truth. We have added text to the article to reflect this view and discuss the types of errors we can expect for both smaller and larger smoothing kernels (page 15).

      “In the absence of a ground truth, we are not able to fully justify the use of either larger or smaller kernels to analyse such data. On the one hand, aberrantly large smoothing kernels could lead to false positives in activation profiles, due to bleeding of observed activation into surrounding tissues. On the other hand, too little smoothing could lead to false negatives, missing some true activity in surrounding regions. While we cannot concretely validate either choice, it should be noted that there is lower spatial uncertainty in the subcortex compared to the cortex, due to the lower anatomical variability. False positives from smoothing spatially unmatched signal are more likely than false negatives. It may be more prudent for studies to use a range of smoothing kernels, to assess the robustness of their fMRI activation profiles.”
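      The "bleeding" effect discussed here can be shown with a one-dimensional Python sketch: a small block of active voxels is smoothed with a narrow and a wide Gaussian kernel, and the signal is read out at a voxel outside the active block. The 1 mm voxel size, the FWHM values (1.5 mm and 5 mm), the toy signal, and the function names are hypothetical choices for illustration and are not taken from the paper.

```python
import math

def gaussian_kernel(fwhm_mm, voxel_mm=1.0):
    """Normalised 1D Gaussian kernel for a given FWHM, truncated at 4 sigma."""
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0))) / voxel_mm
    radius = int(math.ceil(4 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, kernel):
    """Convolve a 1D signal with a symmetric kernel (zero padding at the edges)."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - radius
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out

# Toy "activation": voxels 13-16 are active, everything else is silent.
signal = [0.0] * 30
for i in range(13, 17):
    signal[i] = 1.0

narrow = smooth(signal, gaussian_kernel(1.5))  # small kernel
wide = smooth(signal, gaussian_kernel(5.0))    # large kernel

neighbour = 19  # three voxels outside the active block
print("narrow kernel:", narrow[neighbour])
print("wide kernel:", wide[neighbour])
```

      In this toy setting the wide kernel assigns roughly a tenth of the peak amplitude to a voxel that was silent in the input, whereas the narrow kernel leaves it essentially at zero. Whether such spread counts as a false positive or as recovered signal depends on the unknown ground truth, which is the reviewer's point.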

    1. Reviewer #3 (Public Review):

      I remain enthusiastic about this study. The manuscript is well-written, logical, and conceptually clear. To my knowledge, no prior modeling study has tackled the question of 'why prepare before executing, why not just execute?' Prior studies have simply assumed, to emulate empirical findings, that preparatory inputs precede execution. They never asked why. The authors show that, when there are constraints on inputs, preparation becomes a natural strategy. In contrast, with no constraint on inputs, there is no need for preparation as one could get anything one liked just via the inputs during movement. For the sake of tractability, the authors use a simple magnitude constraint: the cost function punishes the integral of the squared inputs. Thus, if small inputs before movement can reduce the size of the inputs needed during movement, preparation is a good strategy. This occurs if (and only if) the network has strong dynamics (otherwise feeding it preparatory activity would not produce anything interesting). All of this is sensible and clarifying.

      As discussed in the prior round of reviews, the central constraint that the authors use is a mathematically tractable stand-in for a range of plausible (but often trickier to define and evaluate) constraints, such as simplicity of inputs (or inputs being things that other areas could provide). The manuscript now embraces this fact more explicitly, and also gives some results showing that other constraints (such as on the derivative of activity, which is one component of complexity) can have the same effect. The manuscript also now discusses and addresses a modest weakness of the previous manuscript: the preparatory activity in their simulations is often overly complex temporally, lacking the (rough) plateau typically seen for data. Depending on your point of view, this is simply 'window dressing', but from my perspective it was important to know that their approach could yield more realistic-looking preparatory activity. Both these additions (the new constraint, and the more realistic temporal profile of preparatory activity) are added simply as supplementary figures rather than in the main text, and are brought up only in the Discussion. At first this struck me as slightly odd, but in the end I think this is appropriate. These are really Discussion-type issues, and dealing with them there makes sense. The 'different constraints' issue in particular is deep, tricky to explore for technical reasons, and could thus support a small research program. I think it is fair to talk about it thoughtfully (as the Discussion now does) and then just mention some simple results.

      My remaining comments largely pertain to some subtle (but to me important) nuances at a few locations in the text. These should be easy for the authors to address, in whatever way they see fit.

      Specific comments:

      (1) The authors state the following on line 56: "For preparatory processes to avoid triggering premature movement, any pre-movement activity in the motor and dorsal pre-motor (PMd) cortices must carefully exclude those pyramidal tract neurons."
      This constraint is overly restrictive. PT neurons absolutely can change their activity during preparation in principle (and appear to do so in practice). The key constraint is looser: those changes should have no net effect on the muscles. E.g., if d is the vector of changes in PT neuron firing rates, and b is the vector of weights, then the constraint is that b'd = 0. d = 0 is one good way of doing this, but only one. Half the d's could go up and half could go down. Or they all go up, but half the b's are negative. Put differently, there is no reason the null space has to be upstream of the PT neurons. It could be partly, or entirely, downstream.
      In the end, this doesn't change the point the authors are making. It is still the case that d has to be structured to avoid causing muscle activity, which raises exactly the point the authors care about: why risk this unless preparation brings benefits? However, this point can be made with a more accurate motivation. This matters, because people often think that a null-space is a tricky thing to engineer, when really it is quite natural. With enough neurons, preparing in the null space is quite simple.

      (2) Line 167: 'near-autonomous internal dynamics in M1'.
      It would be good if such statements, early in the paper, could be modified to reflect the fact that the dynamics observed in M1 may depend on recurrence that is NOT purely internal to M1. A better phrase might be 'near-autonomous dynamics that can be observed in M1'. A similar point applies on line 13. This issue is handled very thoughtfully in the Discussion, starting on line 713. Obviously it is not sensible to also add multiple sentences making the same point early on. However, it is still worth phrasing things carefully, otherwise the reader may have the wrong impression up until the Discussion (i.e. they may think that both the authors, and prior studies, believe that all the relevant dynamics are internal to M1). If possible, it might also be worth adding one sentence, somewhere early, to keep readers from falling into this hole (and then being stuck there till the Discussion digs them out).

      (3) The authors make the point, starting on line 815, that transient (but strong) preparatory activity empirically occurs without a delay. They note that their model will do this but only if 'no delay' means 'no external delay'. For their model to prepare, there still needs to be an internal delay between when the first inputs arrive and when movement generating inputs arrive.

      This is not only a reasonable assumption, but is something that does indeed occur empirically. This can be seen in Figure 8c of Lara et al. Similarly, Kaufman et al. 2016 noted that "the sudden change in the CIS [the movement triggering event] occurred well after (~150 ms) the visual go cue... (~60 ms latency)" Behavioral experiments have also argued that internal movement-triggering events tend to be quite sluggish relative to the earliest they could be, causing RTs to be longer than they should be (Haith et al. Independence of Movement Preparation and Movement Initiation). Given this empirical support, the authors might wish to add a sentence indicating that the data tend to justify their assumption that the internal delay (separating the earliest response to sensory events from the events that actually cause movement to begin) never shrinks to zero.

      While on this topic, the Haith and Krakauer paper mentioned above is good to cite because it does ponder the question of whether preparation is really necessary. By showing that they could get RTs to shrink considerably before behavior became inaccurate, they showed that people normally (when not pressured) use more preparation time than they really need. Given Lara et al., we know that preparation does always occur, but Haith and Krakauer were quite right that it can be very brief. This helped -- along with neural results -- change our view of preparation from something more cognitive that had to occur, to something more mechanical that was simply a good network strategy, which is indeed the authors' current point. Working a discussion of this into the current paper may or may not make sense, but if there is a place where it is easy to cite, it would be appropriate.
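
      The constraint in comment (1), b'd = 0, can be illustrated with a minimal sketch. The weights and rate changes below are made-up numbers chosen only for illustration, and `net_drive` is a hypothetical helper; none of this is taken from the manuscript. It simply shows the two cases mentioned in that comment, in which every PT neuron changes its rate and yet the net effect on the muscle is zero.

```python
def net_drive(b, d):
    """Net effect on one muscle: the weighted sum b'd of rate changes d."""
    return sum(bi * di for bi, di in zip(b, d))


# Hypothetical readout weights from four PT neurons onto one muscle.
b = [1.0, 1.0, 1.0, 1.0]
# Case 1: half of the rate changes go up, half go down.
d_mixed = [2.0, 2.0, -2.0, -2.0]

# Case 2: all rate changes go up, but half of the weights are negative.
b_signed = [1.0, 1.0, -1.0, -1.0]
d_up = [2.0, 2.0, 2.0, 2.0]

print(net_drive(b, d_mixed))    # 0.0
print(net_drive(b_signed, d_up))  # 0.0
```

      In both cases d is far from zero, so d = 0 is only one of many ways to satisfy the constraint.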

    2. Author response:

      The following is the authors’ response to the original reviews.

      General response:

      We thank all the reviewers for their detailed reviews.

      All reviewers made a number of valuable comments, in particular by highlighting several points that would benefit from additional clarifications and discussion. We really appreciate the time and effort that went into the reviews. We have updated the paper to reflect the changes we have made in response to the reviewers' comments (largely by including more discussion regarding the model limitations and the effect of various modeling choices). We have also included several new supplementary figures (S7, S8, S9, S10) that provide further details of the model behavior, and show the effect of changing some of the terms in the cost. Below, we go through the individual comments, and highlight the places in which we have made changes to address the reviewers’ comments.

      Reviewer 1:

      Thank you for your review and pointing out multiple things to be discussed and clarified! Below, we go through the various limitations you pointed out and refer to the places where we have tried to address them.

      (1) It's important to keep in mind that this work involves simplified models of the motor system, and often the terminology for 'motor cortex' and 'models of motor cortex' are used interchangeably, which may mislead some readers. Similarly, the introduction fails in many cases to state what model system is being discussed (e.g. line 14, line 29, line 31), even though these span humans, monkeys, mice, and simulations, which all differ in crucial ways that cannot always be lumped together.

      That is a good point. We have clarified this in the text (Introduction and Discussion), to highlight the fact that our model isn’t necessarily meant to just capture M1. We have also updated the Introduction to make clearer in which species the experiments that motivate our investigation were performed.

      (2) At multiple points in the manuscript thalamic inputs during movement (in mice) are used as a motivation for examining the role of preparation. However, there are other more salient motivations, such as delayed sensory feedback from the limb and vision arriving in the motor cortex, as well as ongoing control signals from other areas such as the premotor cortex.

      Yes – the motivation for thalamic inputs came from the fact that those have specifically been shown to be necessary for accurate movement generation in mice. However, it is true that the inputs in our model are meant to capture any signals external to the dynamical system modeled, and as such are likely to represent a mixture of sensory signals, and feedback from other areas. We have clarified this in the Discussion, and have added this additional motivation in the Introduction.

      (3) Describing the main task in this work as a delayed reaching task is not justified without caveats (by the authors' own admission: line 687), since each network is optimized with a fixed delay period length. Although this is mentioned to the reader, it's not clear enough that the dynamics observed during the delay period will not resemble those in the motor cortex for typical delayed reaching tasks.

      Yes, we completely agree that the terminology might be confusing. While the task we are modeling is a delayed reaching task, it does differ from the usual setting since the network has knowledge of the delay period, and that is indeed a caveat of the model. We have added a brief paragraph just after the description of the optimal control objective to highlight this limitation.

      We have also performed additional simulations using two different variants of a model-predictive control approach that allow us to relax the assumption that the go-cue time is known in advance. We show that these modifications of the optimal controller yield results that remain consistent with our main conclusions, and can in fact in some settings lead to preparatory activity plateaus during the preparation epoch as often found in monkey M1 (e.g in Elsayed et al. 2016). We have modified the Discussion to explain these results and their limitations, which are summarized in a new Supplementary Figure (S9).

      (4) A number of simplifications in the model may have crucial consequences for interpretation.

      a) Even following the toy examples in Figure 4, all the models in Figure 5 are linear, which may limit the generalisability of the findings.

      While we agree that linear models may be too simplistic, many prior analyses of M1 data suggest that they are often good enough to capture key aspects of M1 dynamics; for example, the generative model underlying jPCA is linear, and Sussillo et al. (2015) showed that the internal activity of nonlinear RNN models trained to reproduce EMG data aligned best with M1 activity when heavily regularized; in this regime, the RNN dynamics were close to linear. Nevertheless, this linearity assumption is indeed convenient from a modeling viewpoint: the optimal control problem is more easily solved for linear network dynamics and the optimal trajectories are more consistent across networks. Indeed, we had originally attempted to perform the analyses of Figure 5 in the nonlinear setting, but found that while the results were overall similar to what we report in the linear regime, iLQR was occasionally trapped in local minima, resulting in more variable results, especially for inhibition-stabilized networks at the strongly connected end of the spectrum. Finally, Figure 5 is primarily meant to explore to what extent motor preparation can be predicted from basic linear control-theoretic properties of the Jacobian of the dynamics; in this regard, it made sense to work with linear RNNs (for which the Jacobian is constant).

      b) Crucially, there is no delayed sensory feedback in the model from the plant. Although this simplification is in some ways a strength, this decision allows networks to avoid having to deal with delayed feedback, which is a known component of closed-loop motor control and of motor cortex inputs and will have a large impact on the control policy.

      This comment resonates well with Reviewer 3's remark regarding the autonomous nature (or not) of M1 during movement. Rather than thinking of our RNN models as anatomically confined models of M1 alone, we think of them as models of the dynamics which M1 implements possibly as part of a broader network involving “inter-area loops and (at some latency) sensory feedback”, and whose state appears to be near-fully decodable from M1 activity alone. We have added a paragraph of Discussion on this important point.

      (5) A key feature determining the usefulness of preparation is the direction of the readout dimension. However, all readouts had a similar structure (random Gaussian initialization). Therefore, it would be useful to have more discussion regarding how the structure of the output connectivity would affect preparation, since the motor cortex certainly does not follow this output scheme.

      We agree with this limitation of our model — indeed one key message of Figure 4 is that the degree of reliance on preparatory inputs depends strongly on how the dynamics align with the readout. However, this strong dependence is somewhat specific to low-dimensional models; in higher-dimensional models (most of our paper), one expects that any random readout matrix C will pick out activity dimensions in the RNN that are sufficiently aligned with the most controllable directions of the dynamics to encourage preparation.

      We did consider optimizing C away (which required differentiating through the iLQR optimizer, which is possible but very costly), but the question inevitably arises what exactly should C be optimized for, and under what constraints (e.g fixed norm or not). One possibility is to optimize C with respect to the same control objective that the control inputs are optimized for, and constrain its norm (otherwise, inputs to the M1 model, and its internal activity, could become arbitrarily small as C can grow to compensate). We performed this experiment (new Supplementary Figure S7) and obtained a similar preparation index; there was one notable difference, namely that the optimized readout modes led to greater observability compared to a random readout; thus, the same amount of “muscle energy” required for a given movement could now be produced by a smaller initial condition. In turn, this led to smaller control inputs, consistent with a lower control cost overall.

      Whilst we could have systematically optimized C away, we reasoned that (i) it is computationally expensive, and (ii) the way M1 affects downstream effectors is presumably “optimized” for much richer motor tasks than simple 2D reaching, such that optimizing C for a fixed set of simple reaches could lead to misleading conclusions. We therefore decided to stick with random readouts.

      Additional comments :

      (1) The choice of cost function seems very important. Is it? For example, penalising the square of u(t) may produce very different results than penalising the absolute value.

      Yes, the choice of cost function does affect the results, at least qualitatively. The absolute value of the inputs is a challenging cost to use, as iLQR relies on a local quadratic approximation of the cost function. However, we have included additional experiments in which we penalized the squared derivative of the inputs (Supplementary Figure S8; see also our response to Reviewer 3's suggestion on this topic), and we do see differences in the qualitative behavior of the model (though the main takeaway, i.e. the reliance on preparation, continues to hold). This is now referred to and discussed in the Discussion section.
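
      One way to see why an absolute-value penalty sits awkwardly with a local quadratic approximation is to compare the curvature of the two costs. The sketch below is purely illustrative and is not the optimizer used in the paper: `second_derivative` and `squared` are hypothetical helper names, and the step size is an arbitrary choice. It estimates the second derivative numerically, which is the term a quadratic expansion of the cost would rely on.

```python
def second_derivative(f, u, h=1e-3):
    """Central finite-difference estimate of f''(u)."""
    return (f(u + h) - 2.0 * f(u) + f(u - h)) / (h * h)


def squared(u):
    """Quadratic input penalty u^2."""
    return u * u


# The squared penalty has the same curvature (about 2) everywhere.
curv_sq_away = second_derivative(squared, 1.0)
curv_sq_zero = second_derivative(squared, 0.0)

# The absolute-value penalty has no curvature away from zero, and the
# estimate at the kink grows like 2 / h as the step size h shrinks.
curv_abs_away = second_derivative(abs, 1.0)
curv_abs_zero = second_derivative(abs, 0.0)

print(curv_sq_away, curv_sq_zero, curv_abs_away, curv_abs_zero)
```

      Under these assumptions, the quadratic term is well behaved for the squared penalty, whereas for the absolute value it is either uninformative (zero) or ill-defined (at the kink).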

      (2) In future work it would be useful to consider the role of spinal networks, which are known to contribute to preparation in some cases (e.g. Prut and Fetz, 1999).

      (3) The control signal magnitude is penalised, but not the output torque magnitude, which highlights the fact that control in the model is quite different from muscle control, where co-contraction would be a possibility and therefore a penalty of muscle activation would be necessary. Future work should consider the role of these differences in control policy.

      Thank you for pointing us to this reference! Regarding both of these concerns, we agree that the model could be greatly improved and made more realistic in future work (another avenue for this would be to consider a more realistic biophysical model, e.g. using the MotorNet library). We hope that the current Discussion, which highlights the various limitations of our modeling choices, makes it clear that a lot of these choices could easily be modified depending on the specific assumptions/investigation being performed.

      Reviewer 2:

      Thank you for your positive review! We very much agree with the limitations you pointed out, some of which overlapped with the comments of the other reviewers. We have done our best to address them through additional discussion and new supplementary figures. We briefly highlight below where those changes can be found.

      (1) Though the optimal control theory framework is ideal to determine inputs that minimize output error while regularizing the input norm, it however cannot easily account for some other varied types of objectives especially those that may lead to a complex optimization landscape. For instance, the reusability of parts of the circuit, sparse use of additional neurons when learning many movements, and ease of planning (especially under uncertainty about when to start the movement), may be alternative or additional reasons that could help explain the preparatory activity observed in the brain. It is interesting to note that inputs that optimize the objective chosen by the authors arguably lead to a trade-off in terms of other desirable objectives. Specifically, the inputs the authors derive are time-dependent, so a recurrent network would be needed to produce them and it may not be easy to interpolate between them to drive new movement variants. In addition, these inputs depend on the desired time of output and therefore make it difficult to plan, e.g. in circumstances when timing should be decided depending on sensory signals. Finally, these inputs are specific to the full movement chain that will unfold, so they do not permit reuse of the inputs e.g. in movement sequences of different orders.

      Yes, that is a good point! We have incorporated further Discussion related to this point. We have additionally included a new example in which we regularize the temporal complexity of the inputs (see also our response to Reviewer 3's suggestion on this topic), which leads to more slowly varying inputs, and may indeed represent a more realistic constraint and lead to simpler inputs that can more easily be interpolated between. We also agree that uncertainty about the upcoming go cue may play an important role in the strategy adopted by the animals. While we have not performed an extensive investigation of the topic, we have included a Supplementary Figure (S9) in which we used Model Predictive Control to investigate the effect of planning under uncertainty about the go cue arrival time. We hope that this will give the reader a better sense of what sort of model extensions are possible within our framework.

      (2) Relatedly, if the motor circuits were to balance different types of objectives, the activity and inputs occurring before each movement may be broken down into different categories that may each specialize into one objective. For instance, previous work (Kaufman et al. eNeuron 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021) has suggested that inputs occurring before the movement could be broken down into preparatory inputs 'stricto sensu' - relating to the planned characteristics of the movement - and a trigger signal, relating to the transition from planning to execution - irrespective of whether the movement is internally timed or triggered by an external event. The current work does not address which type(s) of early input may be labeled as 'preparatory' or may be thought of as a part of 'planning' computations.

      Yes, our model does indeed treat inputs in a very general way, and does not distinguish between the different types of processes they may be composed of. This is partly because we do not explicitly model where the inputs come from, such that our inputs likely encompass multiple processes. We have added discussion related to this point.

      (3) While the authors rightly point out some similarities between the inputs that they derive and observed preparatory activity in the brain, notably during motor sequences, there are also some differences. For instance, while both the derived inputs and the data show two peaks during sequences, the data reproduced from Zimnik and Churchland show preparatory inputs that have a very asymmetric shape that really plummets before the start of the next movement, whereas the derived inputs have larger amplitude during the movement period - especially for the second movement of the sequence. In addition, the data show trigger-like signals before each of the two reaches. Finally, while the data show a very high correlation between the pattern of preparatory activity of the second reach in the double reach and compound reach conditions, the derived inputs appear to be more different between the two conditions. Note that the data would be consistent with separate planning of the two reaches even in the compound reach condition, as well as the re-use of the preparatory input between the compound and double reach conditions. Therefore, different motor sequence datasets - notably, those that would show even more coarticulation between submovements - may be more promising to find a tight match between the data and the author's inputs. Further analyses in these datasets could help determine whether the coarticulation could be due to simple filtering by the circuits and muscles downstream of M1, planning of movements with adjusted curvature to mitigate the work performed by the muscles while permitting some amount of re-use across different sequences, or - as suggested by the authors - inputs fully tailored to one specific movement sequence that maximize accuracy and minimize the M1 input magnitude.

      Regarding the exact shape of the occupancy plots, it is important to note that some of the more qualitative aspects (e.g the relative height of the two peaks) will change if we change the parameters of the cost function. Right now, we have chosen the parameters to ensure that both reaches would be performed at roughly the same speed (as a way to very loosely constrain the parameters based on the observed behavior). However, small changes to the hyperparameters can lead to changes in the model output (e.g one of the two consecutive reaches being performed using greater acceleration than the other), and since our biophysical model is fairly simple, changes in the behavior are directly reflected in the network activity. Essentially, what this means is that while the double occupancy is a consistent feature of the model, the exact shape of the peaks is more sensitive to hyperparameters, and we do not wish to draw any strong conclusions from them, given the simplicity of the biophysical model. However, we do agree that our model exhibits some differences with the data. As discussed above, we have included additional discussion regarding the potential existence of separate inputs for planning vs triggering the movement in the context of single reaches.

      Overall, we are excited about the suggestions made by the Reviewer here about using our approach to analyze other motor sequence datasets, but we think that in order to do this properly, one would need to adopt a more realistic musculo-skeletal model (such as one provided by MotorNet).

      (4) Though iLQR is a powerful optimization method to find inputs optimizing the author's cost function, it also has some limitations. First, given that it relies on a linearization of the dynamics at each timestep, it has a limited ability to leverage potential advantages of nonlinearities in the dynamics. Second, the iLQR algorithm is not a biologically plausible learning rule and therefore it might be difficult for the brain to learn to produce the inputs that it finds. It remains unclear whether using alternative algorithms with different limitations - for instance, using variants of BPTT to train a separate RNN to produce the inputs in question - could impact some of the results.

      We agree that our choice of iLQR has limitations: while it offers the advantage of convergence guarantees, it does indeed restrict the choice of cost function and dynamics that we can use. We have now included extensive discussion of how the modeling choices affect our results.

      We do not view the lack of biological plausibility of iLQR as an issue, as the results are agnostic to the algorithm used for optimization. However, we agree that any structure imposed on the inputs (e.g by enforcing them to be the output of a self-contained dynamical system) would likely alter the results. A potentially interesting extension of our model would be to do just what the reviewer suggested, and try to learn a network that can generate the optimal inputs. However, this is outside the scope of our investigation, as it would then lead to new questions (e.g what brain region would that other RNN represent?).

      (5) Under the objective considered by the authors, the amount of input occurring before the movement might be impacted by the presence of online sensory signals for closed-loop control. It is therefore an open question whether the objective and network characteristics suggested by the authors could also explain the presence of preparatory activity before e.g. grasping movements that are thought to be more sensory-driven (Meirhaeghe et al., Cell Reports 2023).

      It is true that we aren’t currently modeling sensory signals explicitly. However, some of the optimal inputs we infer may be capturing upstream information, which could include some sensory information. This is currently unclear, and would likely depend on how exactly the model is specified. We have added new discussion to emphasize that our dynamics should not be understood as just representing M1, but more general circuits whose state can be decoded from M1.

      Reviewer #2 (Recommendations For The Authors):

      Additionally, thank you for pointing out various typos in the manuscript, we have fixed those!

      Reviewer 3:

      Thank you very much for your review, which makes a lot of very insightful points, and raises several interesting questions. In summary, we very much agree with the limitations you pointed out. In particular, the choice of input cost is something we had previously discussed, but we had found it challenging to decide on what a reasonable cost for “complexity” could be. Following your comment, we have however added a first attempt at penalizing “temporal complexity”, which shows promising behavior. We have only included those additional analyses as supplementary figures, and we have included new discussion, which hopefully highlights what we meant by the different model components, and how the model behavior may change as we vary some of our choices. We hope this can be informative for future models that may use a similar approach. Below, we highlight the changes that we have made to address your comments.

      The main limitation of the study is that it focuses exclusively on one specific constraint - magnitude - that could limit motor-cortex inputs. This isn't unreasonable, but other constraints are at least as likely, if less mathematically tractable. The basic results of this study will probably be robust with regard to such issues - generally speaking, any constraint on what can be delivered during execution will favor the strategy of preparing - but this robustness cuts both ways. It isn't clear that the constraint used in the present study - minimizing upstream energy costs - is the one that really matters. Upstream areas are likely to be limited in a variety of ways, including the complexity of inputs they can deliver. Indeed, one generally assumes that there are things that motor cortex can do that upstream areas can't do, which is where the real limitations should come from. Yet in the interest of a tractable cost function, the authors have built a system where motor cortex actually doesn't do anything that couldn't be done equally well by its inputs. The system might actually be better off if motor cortex were removed. About the only thing that motor cortex appears to contribute is some amplification, which is 'good' from the standpoint of the cost function (inputs can be smaller) but hardly satisfying from a scientific standpoint.

      The use of a term that punishes the squared magnitude of control signals has a long history, both because it creates mathematical tractability and because it (somewhat) maps onto the idea that one should minimize the energy expended by muscles and the possibility of damaging them with large inputs. One could make a case that those things apply to neural activity as well, and while that isn't unreasonable, it is far from clear whether this is actually true (and if it were, why punish the square if you are concerned about ATP expenditure?). Even if neural activity magnitude is an important cost, any costs should pertain not just to inputs but to motor cortex activity itself. I don't think the authors really wish to propose that squared input magnitude is the key thing to be regularized. Instead, this is simply an easily imposed constraint that is tractable and acts as a stand-in for other forms of regularization / other types of constraints. Put differently, if one could write down the 'true' cost function, it might contain a term related to squared magnitude, but other regularizing terms would be very likely to dominate. Using only squared magnitude is a reasonable way to get started, but there are also ways in which it appears to be limiting the results (see below).

      I would suggest that the study explore this topic a bit. Is it possible to use other forms of regularization? One appealing option is to constrain the complexity of inputs; a long-standing idea is that the role of motor cortex is to take relatively simple inputs and convert them to complex time-evolving inputs suitable for driving outputs. I realize that exploring this idea is not necessarily trivial. The right cost-function term is not clear (should it relate to low-dimensionality across conditions, or to smoothness across time?) and even if it were, it might not produce a convex cost function. Yet while exploring this possibility might be difficult, I think it is important for two reasons.

      First, this study is an elegant exploration of how preparation emerges due to constraints on inputs, but at present that exploration focuses exclusively on one constraint. Second, at present there are a variety of aspects of the model responses that appear somewhat unrealistic. I suspect most of these flow from the fact that while the magnitude of inputs is constrained, their complexity is not (they can control every motor cortex neuron at both low and high frequencies). Because inputs are not complexity-constrained, preparatory activity appears overly complex and never 'settles' into the plateaus that one often sees in data. To be fair, even in data these plateaus are often imperfect, but they are still a very noticeable feature in the response of many neurons. Furthermore, the top PCs usually contain a nice plateau. Yet we never get to see this in the present study. In part this is because the authors never simulate the situation of an unpredictable delay (more on this below) but it also seems to be because preparatory inputs are themselves strongly time-varying. More realistic forms of regularization would likely remedy this.

      That is a very good point, and it mirrors several concerns that we had in the past. While we did focus on the input norm for the sake of simplicity, and because it represents a very natural way to regularize our control solutions, we agree that a “complexity cost” may be better suited to models of brain circuits. We have addressed this in a supplementary investigation. We chose to focus on a cost that penalizes the temporal complexity of the inputs, as ||u(t+1) - u(t)||^2. Note that this required augmenting the state of the model, making the computations quite a bit slower; while it is doable if we only penalize the first temporal derivative, it would not scale well to higher orders.
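      The state augmentation mentioned above can be illustrated with a minimal sketch. It assumes linear dynamics x(t+1) = A x(t) + B u(t) as a stand-in for the network model, and it uses one common construction (not necessarily the one used in the study): the previous input is appended to the state, and the input increment v(t) = u(t) - u(t-1) becomes the new control, so that a standard quadratic control cost on v penalizes ||u(t+1) - u(t)||^2. The function names are hypothetical, and no iLQR solver is implemented here.

```python
import numpy as np

def augment_for_input_smoothness(A, B):
    """Build augmented dynamics for z = [x; u_prev] driven by v = u - u_prev."""
    n, m = B.shape
    A_aug = np.block([[A, B], [np.zeros((m, n)), np.eye(m)]])
    B_aug = np.vstack([B, np.eye(m)])
    return A_aug, B_aug

def rollout(A, B, x0, u):
    """Simulate x(t+1) = A x(t) + B u(t); returns an array with T+1 rows."""
    x = [x0]
    for u_t in u:
        x.append(A @ x[-1] + B @ u_t)
    return np.array(x)

def smoothness_cost(u):
    """Sum over time of ||u(t+1) - u(t)||^2 for inputs stored row-wise."""
    return float(np.sum(np.diff(u, axis=0) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m, T = 4, 2, 10                      # illustrative sizes only
    A = 0.9 * np.eye(n)
    B = rng.normal(size=(n, m))
    x0 = rng.normal(size=n)
    u = rng.normal(size=(T, m))

    # Increments, taking u(-1) = 0 so that v(0) = u(0).
    v = np.diff(np.vstack([np.zeros((1, m)), u]), axis=0)
    A_aug, B_aug = augment_for_input_smoothness(A, B)
    z = rollout(A_aug, B_aug, np.concatenate([x0, np.zeros(m)]), v)

    # The first n entries of z reproduce the original state trajectory.
    print(np.allclose(z[:, :n], rollout(A, B, x0, u)))
    print(smoothness_cost(u))
```

      The augmented state has dimension n + m rather than n, which is consistent with the remark that the computations become slower; penalizing higher-order differences would require carrying additional past inputs in the state.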

      Interestingly, we did find that the activity in that setting was somewhat more realistic (see new Supplementary Figure S8), with more sustained inputs and plateauing activity. While we have kept the original model for most of the investigations, the somewhat more realistic nature of the results under that setting suggests that further exploration of penalties of that sort could represent a promising avenue to improve the model.

      We also found the idea of a cost that would ensure low-dimensionality of the inputs across conditions very interesting. However, it is challenging to investigate with iLQR as we perform the optimization separately for each condition; nevertheless, it could be investigated using a different optimizer.

      At present, it is also not clear whether preparation always occurs even with no delay. Given only magnitude-based regularization, it wouldn't necessarily have to be. The authors should perform a subspace-based analysis like that in Figure 6, but for different delay durations. I think it is critical to explore whether the model, like monkeys, uses preparation even for zero-delay trials. At present it might or might not. If not, it may be because of the lack of more realistic constraints on inputs. One might then either need to include more realistic constraints to induce zero-delay preparation, or propose that the brain basically never uses a zero delay (it always delays the internal go cue after the preparatory inputs) and that this is a mechanism separate from that being modeled.

      I agree with the authors that the present version of the model, where optimization knows the exact time of movement onset, produces a reasonably realistic timecourse of preparation when compared to data from self-paced movements. At the same time, most readers will want to see that the model can produce realistic looking preparatory activity when presented with an unpredictable delay. I realize this may be an optimization nightmare, but there are probably ways to trick the model into optimizing to move soon, but then forcing it to wait (which is actually what monkeys are probably doing). Doing so would allow the model to produce preparation under the circumstances where most studies have examined it. In some ways this is just window-dressing (showing people something in a format they are used to and can digest) but it is actually more than that, because it would show that the model can produce a reasonable plateau of sustained preparation. At present it isn't clear it can do this, for the reasons noted above. If it can't, regularizing complexity might help (and even if this can't be shown, it could be discussed).

      In summary, I found this to be a very strong study overall, with a conceptually timely message that was well-explained and nicely documented by thorough simulations. I think it is critical to perform the test, noted above, of examining preparatory subspace activity across a range of delay durations (including zero) to see whether preparation endures as it does empirically. I think the issue of a more realistic cost function is also important, both in terms of the conceptual message and in terms of inducing the model to produce more realistic activity. Conceptually it matters because I don't think the central message should be 'preparation reduces upstream ATP usage by allowing motor cortex to be an amplifier'. I think the central message the authors wish to convey is that constraints on inputs make preparation a good strategy. Many of those constraints likely relate to the fact that upstream areas can't do things that motor cortex can do (else you wouldn't need a motor cortex) and it would be good if regularization reflected that assumption. Furthermore, additional forms of regularization would likely improve the realism of model responses, in ways that matter both aesthetically and conceptually. Yet while I think this is an important issue, it is also a deep and tricky one, and I think the authors need considerable leeway in how they address it. Many of the cost-function terms one might want to use may be intractable. The authors may have to do what makes sense given technical limitations. If some things can't be done technically, they may need to be addressed in words or via some other sort of non-optimization-based simulation.

      Specific comments

      As noted above, it would be good to show that preparatory subspace activity occurs similarly across delay durations. It actually might not, at present. For a zero ms delay, the simple magnitude-based regularization may be insufficient to induce preparation. If so, then the authors would either have to argue that a zero delay is actually never used internally (which is a reasonable argument) or show that other forms of regularization can induce zero-delay preparation.

      Yes, that is a very interesting analysis to perform, which we had not considered before! When investigating this, we found that the zero-delay strategy does not rely on preparation in the same way as is seen in the monkeys. This seems to be a reflection of the fact that our “Go cue” corresponds to an “internal” go cue which would likely come after the true, “external go cue” – such that we would indeed never actually be in the zero delay setting. This is not something we had addressed (or really considered) before, although we had tried to ensure we referred to “delta prep” as the duration of the preparatory period but not necessarily the delay period. We have now included more discussion on this topic, as well as a new Supplementary Figure S10.

      I agree with the authors that prior modeling work was limited by assuming the inputs to M1, which meant that prior work couldn't address the deep issue (tackled here) of why there should be any preparatory inputs at all. At the same time, the ability to hand-select inputs did provide some advantages. A strong assumption of prior work is that the inputs are 'simple', such that motor cortex must perform meaningful computations to convert them to outputs. This matters because if inputs can be anything, then they can just be the final outputs themselves, and motor cortex would have no job to do. Thus, prior work tried to assume the simplest inputs possible to motor cortex that could still explain the data. Most likely this went too far in the 'simple' direction, yet aspects of the simplicity were important for endowing responses with realistic properties. One such property is a large condition-invariant response just before movement onset. This is a very robust aspect of the data, and is explained by the assumption of a simple trigger signal that conveys information about when to move but is otherwise invariant to condition. Note that this is an implicit form of regularization, and one very different from that used in the present study: the input is allowed to be large, but constrained to be simple. Preparatory inputs are similarly constrained to be simple in the sense that they carry only information about which condition should be executed, but otherwise have little temporal structure. Arguably this produces slightly too simple preparatory-period responses, but the present study appears to go too far in the opposite direction. I would suggest that the authors do what they can to address these issues via simulations and/or discussion. I think it is fine if the conclusion is that there exist many constraints that tend to favor preparation, and that regularizing magnitude is just one easy way of demonstrating that. Ideally, other constraints would be explored. But even if they can't be, there should be some discussion of what is missing - preparatory plateaus, a realistic condition-invariant signal tied to movement onset - under the present modeling assumptions.

      As described above, we have now included two additional figures. In the first one (S8, already discussed above), we used a temporal smoothness prior, and we indeed get slightly more realistic activity plateaus. In a second supplementary figure (S9), we have also considered using model predictive control (MPC) to optimize the inputs under an uncertain go cue arrival time. There, we found that removing the assumption that the delay period is known came with new challenges: in particular, it requires the specification of a “mental model” of when the Go cue will arrive. While it is reasonable to expect that monkeys will have a prior over the go cue arrival time that will be shaped by the design of the experiment, some assumptions must be made about the utility functions that should be used to weigh this prior. For instance, if we imagine that monkeys carry a model of the possible arrival time of the go cue that is updated online, they could nonetheless act differently based on this information, for instance by either preparing so as to be ready for the earliest go cue possible or alternatively to be ready for the average go cue. This will likely depend on the exact task design and reward/penalty structure. Here, we added simulations with those two cases (making simplifying assumptions to make the problem tractable/solvable using model predictive control), and found that the “earliest preparation” strategy gives rise to more realistic plateauing activity, while the model where planning is done for the “most likely go time” does not. We suspect that more realistic activity patterns could be obtained by, e.g., combining this framework with the temporal smoothness cost. However, the main point we wished to make with this new supplementary figure is that it is possible to model the task in a slightly more realistic way (although here it comes at the cost of additional model assumptions). We have now added more discussion related to those points. Note that we have kept our analyses on these new models to a minimum, as the main takeaway we wish to convey from them is that most components of the model could be modified/made more realistic. This would impact the qualitative behavior of the system and match to data but – in the examples we have so far considered – does not appear to modify the general strategy of networks relying on preparation.

      On line 161, and in a few other places, the authors cite prior work as arguing for "autonomous internal dynamics in M1". I think it is worth being careful here because most of that work specifically stated that the dynamics are likely not internal to M1, and presumably involve inter-area loops and (at some latency) sensory feedback. The real claim of such work is that one can observe most of the key state variables in M1, such that there are periods of time where the dynamics are reasonably approximated as autonomous from a mathematical standpoint. This means that you can estimate the state from M1, and then there is some function that predicts the future state. This formal definition of autonomous shouldn't be conflated with an anatomical definition.

      Yes, that is a good point, thank you for making it so clearly! Indeed, as in previous work, we do not think of our “M1 dynamics” as being internal to M1; they may instead include sensory feedback / inter-area loops, which we summarize into the connectivity, which we chose so that the dynamics qualitatively resemble data. We have now incorporated more discussion regarding what exactly the dynamics in our model represent.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment 

      Dasgupta and colleagues make a valuable contribution to the understanding of how the guidance factor Sema7a promotes connections between mechanosensory hair cells and afferent neurons of the zebrafish lateral line system. The authors provide solid evidence that loss of Sema7a function results in fewer contacts between hair cells and afferents through comprehensive quantitative analysis. Additional work is needed to distinguish the effects of different isoforms of Sema7a to determine whether there are specific roles of secreted and membrane bound forms.

      Public Reviews:

      Reviewer #1 (Public Review):

      Dasgupta et al. have dissected the role of Sema7a in fine tuning of a sensory microcircuit in the posterior lateral line organ of zebrafish. They attempt to also outline the different roles of a secreted versus membrane-bound form of Sema7a in this process. Using genetic perturbations and axonal network analysis, the authors show that loss of both Sema7a isoforms causes abnormal axon terminal structure with more bare terminals and fewer loops in contact with presynaptic sensory hair cells. Further, they show that loss of Sema7a causes decreased number and size of both the pre- and post-synapse. Finally, they show that overexpression of the secreted form of Sema7a specifically can elicit axon terminal outgrowth to an ectopic Sema7a expressing cell. Together, the analysis of Sema7a loss of function and overexpression on axon arbor structure is fairly thorough and revealed a novel role for Sema7a in axon terminal structure. However, the connection between different isoforms of Sema7a and the axon arborization needs to be substantiated. Furthermore, the effect of loss of Sema7a on the presynaptic cell is not ruled out as a contributing factor to the synaptic and axon structure phenotypes. These issues weaken the claims made by the authors including the statement that they have identified dual roles for the GPI-anchored versus secreted forms of Sema7a on synapse formation and as a chemoattractant for axon arborization respectively.

      Reviewer #2 (Public Review):

      In this work, Dasgupta et al. investigate the role of Sema7a in the formation of peripheral sensory circuits in the lateral line system of zebrafish. They show that Sema7a protein is present during neuromast maturation and localized, in part, to the base of hair cells (HCs). This would be consistent with pre-synaptic Sema7a mediating formation and/or stabilization of the synapse. They use a sema7a loss-of-function strain to show that lateral line sensory terminals display abnormal arborization. They provide highly quantitative analysis of the lateral line terminal arborization to show that a number of specific topological parameters are affected in mutants. Next, they ectopically express a secreted form of Sema7a to show that lateral line terminals can be ectopically attracted to the source. Finally, they also demonstrate that the synaptic assembly is impaired in the sema7a mutant. Overall, the data are of high quality and properly controlled. The availability of a Sema7a antibody is a big plus, as it allows one to address the endogenous protein localization as well as to show the signal absence in the sema7a mutant. The quantification of the arbor topology should be useful to people in the field who are looking at the lateral line as well as other axonal terminals. I think some results are overinterpreted though. The authors state: "Our findings demonstrate that Sema7A functions both as a juxtracrine and as a secreted cue to pattern neural circuitry during sensory organ development." However, they have not actually demonstrated which isoform functions in HCs (also see comments below). In addition, they have to be careful in interpreting their topology analysis, as they cannot separate individual axons. Thus, such analysis can generate artifacts. They can perform additional experiments to address these issues or adjust their interpretations.

      Reviewer #3 (Public Review):

      The data reported here demonstrate that Sema7a defines the local behavior of growing axons in the developing zebrafish lateral line. The analysis is sophisticated and convincingly demonstrates effects on axon growth and synapse architecture. Collectively, the findings point to the idea that the diffusible form of sema7a may influence how axons grow within the neuromast and that the GPI-linked form of sema7a may subsequently impact how synapses form, though additional work is needed to strongly link each form to its proposed effect on circuit assembly.

      The revised manuscript is significantly improved. The authors comprehensively and appropriately addressed most of the reviewers' concerns. In particular, they added evidence that hair cells express both Sema7A isoforms, showed that membrane bound Sema7A does not have long range effects on guidance, demonstrated how axons behave close to ectopic Sema7A, and analyzed other features of the hair cells that revealed no strong phenotypes. The authors also softened the language in many, but not all places. Overall, I am satisfied with the study as a whole. 

      Reviewer #4 (Public Review):

      This study provides direct evidence showing that Sema7a plays a role in axon growth during the formation of peripheral sensory circuits in the lateral-line system of zebrafish. This is a valuable finding because the molecules for axon growth in hair-cell sensory systems are not well understood. The majority of the experimental evidence is convincing, and the analysis is rigorous. The evidence supporting Sema7a's juxtacrine vs. secreted role and involvement in synapse formation in hair cells is less conclusive. The study will be of interest to cell, molecular and developmental biologists, and sensory neuroscientists.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      In their revised manuscript, Dasgupta et al. have provided further experiments to address the role of Sema7a (sec and GPI-anchored) in regulating axon guidance in the lateral line system. Specifically, the inclusion of the heat shock controls and FM labeling to show hair cell mechanotransduction were crucial to interpretation of the results. However, there are still concerns about the specificity of the results. My primary concern is if the change in axon patterning is specifically due to loss of Sema7a in the mutant hair cells. These animals are morphologically very abnormal and, in the rebuttal, the authors state that hair cell number is reduced. This is not quantified in the manuscript and should be included. 

      Thank you for this suggestion. We have included the data in the manuscript in lines 137-139, in Figure 2—figure supplement 1B, and in the source data for Figure 2 and Figure 2-figure supplements.

      If there is not a function for Sema7a in hair cells themselves, why is the number reduced? 

      The sema7a-/- homozygous mutants are not viable and they die by 6 dpf. The loss of Sema7A protein produces other developmental defects including brain edema and a curved body axis. We believe a slight but not significant decrease in hair cell number may arise from a minute developmental delay in the morphogenesis of the neuromast. We have accordingly quantified our data at three distinct developmental stages (2 dpf, 3 dpf, and 4 dpf) and have incorporated them in the revised manuscript.

      Additionally, FM data should be quantified and presented in animals without a transgene in the same excitation/emission spectra for clearer interpretation of the staining.

      We have quantified the intensities of labeling with FM 4-64 styryl dye from the control and the sema7a-/- mutant larvae and incorporated the data in lines 139-146, in Figure 2—figure supplement 1D, and in source data for Figure 2 and Figure 2-figure supplements. We kept the transgenes to concurrently show the arborization phenotype, hair cell morphology, and the FM 4-64 incorporation between the genotypes.

      Rescue analysis using the myo6d promoter would allow the authors to ensure that the axon deficits can be rescued by putting Sema7a back into the sensory hair cells. Transient transgenesis could be useful for this approach and would not require the creation of a stable line. This could be done with both forms of Sema7a, allowing a true assessment of whether or not the secreted and GPI-anchored forms have disparate functions as claimed in lines 418-424.

      Although we recognize the importance of the rescue of the sema7a-/- mutant phenotype with the sema7asec and the sema7aGPI transcripts, it is not possible for us to perform that experiment at the moment, for the first author will leave the lab next week.  However, he plans to continue work on this project as an independent investigator to dissect the individual roles of the transcript variants in specifying the pattern of sensory arborization, a project that includes generation of transcript-specific knockout animals and rescue experiments with stable transgenic fish lines. 

      Other concerns:

      (1) The timeline of the heat shock experiment is confusing to me and, therefore, it makes me question the specificity of those results. Based on the speed of axon outgrowth and the time necessary for transcription and translation after heat shock induction of the transgene, it is unclear to me how the axon growth defects could occur in the timeline provided. Imaging two hours after the start of the heat shock is very rapid and speaks to either an indirect effect of the transgenesis on the axon growth or a leaky promoter/induction paradigm. It is possible I am just misunderstanding the setup but, from what I could gather, the imaging is being done 2 hrs after the start of the heat shock. This should be clarified.

      The axons of the zebrafish posterior lateral line migrate relatively fast. The pioneering axons migrate at around 120 μm/hour (Sato et al., 2010) and the follower axons migrate at almost 30-80 μm/hour (Sato et al., 2010). The heat-shock promoter that we have utilized, hsp70l, is highly effective in inducing gene expression and subsequent protein formation within 30 to 60 minutes. We believe an hour of heat shock and an hour of incubation post heat shock is sufficient to induce directed axon migration over a distance that spans from 27 μm to 140 μm.
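      The timing argument above can be checked with a back-of-the-envelope sketch. It uses only the figures quoted in the response (migration speeds of 30 to 120 μm/hour, an expression lag of 30 to 60 minutes, and a two-hour window from the start of heat shock to imaging). It assumes, purely for illustration, that directed growth starts only once the protein is present, that the lag is counted from the start of the heat shock, and that axons then extend at a constant speed. The function name is hypothetical.

```python
def reachable_distance_um(speed_um_per_h, window_h, expression_lag_h):
    """Upper bound on axon extension if growth starts after the expression lag."""
    growth_time_h = max(window_h - expression_lag_h, 0.0)
    return speed_um_per_h * growth_time_h

WINDOW_H = 2.0                 # 1 h heat shock + 1 h incubation before imaging
SLOWEST_UM_PER_H = 30.0        # lower end of the follower-axon range
FASTEST_UM_PER_H = 120.0       # pioneering axons
SHORT_LAG_H, LONG_LAG_H = 0.5, 1.0   # 30 to 60 minutes to protein formation

# Most conservative case: slowest axons, longest lag.
low = reachable_distance_um(SLOWEST_UM_PER_H, WINDOW_H, LONG_LAG_H)
# Most permissive case: fastest axons, shortest lag.
high = reachable_distance_um(FASTEST_UM_PER_H, WINDOW_H, SHORT_LAG_H)

print(f"upper bound, conservative case: {low:.0f} um")   # 30 um
print(f"upper bound, permissive case:   {high:.0f} um")  # 180 um
```

      Under these assumptions the bound ranges from 30 μm to 180 μm, so the longest reported projections (140 μm) would require speeds toward the pioneering-axon end of the quoted range.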

      We strongly believe that the directed arborization of the sensory axons towards the Sema7Asec source is not due to an indirect effect of transgenesis or leaky promoter induction, as in all 18 of the injected but not heat-shocked control larvae we did not observe ectopic Sema7Asec expression, and no aberrant projection was formed from the sensory arbor network. We highlight this observation in lines 297-299 and in Figure 4E.

      Sato et al., 2010: Single-cell analysis of somatotopic map formation in the zebrafish lateral line system. Developmental Dynamics 239:2058–2065, 2010.

      Similarly, it would help to clarify if t(0) in the figure is the onset of the heat shock or onset of imaging two hours after the heat shock is started. 

      The t=0 hour in Figure 4I denotes the onset of imaging two hours after the heat shock began. We have clarified this in the manuscript in lines 1155-1156.

      (2) In the rebuttal, the line numbers cited do not match up with the appropriate text, I believe.

      We have corrected this and updated the manuscript.

      (3) Some of the supplemental figures are not mentioned in the text, or I could not find them. For example: Figure 1 supplement 2J. 

      Thank you for pointing this out. We have corrected the manuscript, and the new information is added in line 114.

      (4) Table 1 statistics: were these adjusted for multiple comparisons using a Bonferroni correction or something similar? This is necessary for statistical significance to be meaningful.

      We did not adjust the p-values for multiple comparisons because the values correspond to only three or four statistical tests per experiment, strongly indicating the unlikelihood of erroneous significance due solely to multiple tests.
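      For readers weighing this exchange, the following minimal sketch shows what a Bonferroni adjustment involves for a family of three or four tests. The p-values are hypothetical placeholders, not values from Table 1, and the family-wise error rate formula assumes independent tests at a nominal threshold of 0.05. The function names are illustrative.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping at 1.0."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

# Hypothetical p-values for illustration only (NOT taken from the manuscript).
example_p = [0.01, 0.02, 0.04, 0.30]

for raw, adj in zip(example_p, bonferroni_adjust(example_p)):
    print(f"raw p = {raw:.2f}  ->  Bonferroni-adjusted p = {adj:.2f}")

for m in (3, 4):
    print(f"{m} tests at alpha = 0.05: FWER = {familywise_error_rate(0.05, m):.3f}")
```

      With three or four independent tests, the uncorrected family-wise error rate is roughly 14% to 19%, and the equivalent per-test threshold under Bonferroni would be 0.05/3 or 0.05/4.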

      (5) Figure 1I and 1-S3 - The legend states a positive correlation between axonal signal and sema7A signal. Correlations are 0.5, 0.6, and 0.4 (2, 3, and 4 dpf). This is not a convincing positive correlation. At best, this ranges from no correlation to a very weak positive correlation.

      In lines 122-126 we mention that the basal association of the sensory arbors shows a positive correlation with Sema7A accumulation. We never emphasize the strength of the correlation. However, a consistent positive correlation at three different developmental stages suggests that progressive Sema7A accumulation at the base of the hair cells may guide the sensory arbors to increasingly associate themselves with the hair cells.

      Reviewer #2 (Recommendations For The Authors):

      I am a bit disappointed that the authors elected not to experimentally address the issue raised by all reviewers: whether the secreted or membrane bound isoform is active in hair cells. They rather decided to change their interpretation in the text. It is fine, given the eLife review structure. However, that would make the manuscript much stronger. Other issues were adequately addressed through textual changes as well. 

      Although we recognize the importance of the rescue of the sema7a-/- mutant phenotype with the sema7asec and the sema7aGPI transcripts, it is not possible for us to perform that experiment at the moment, for the first author will leave the lab next week.  However, he plans to continue work on this project as an independent investigator to dissect the individual roles of the transcript variants in specifying the pattern of sensory arborization, a project that includes generation of transcript-specific knockout animals and rescue experiments with stable transgenic fish lines. 

      Reviewer #3 (Recommendations For The Authors):

      Overall, I am satisfied with the study as a whole and just have a few minor comments that remain to be addressed. 

      (1) Although the authors say that they added appropriate no plasmid/heatshock-only and plasmid-only/no heatshock controls, these results need to be presented more clearly, as they are separated in the paper and only one was quantified (i.e. 100% of embryos showed no defect). Please just make it clear that no defects were observed in either control for either experiment (both secreted and membrane bound ectopic expression). 

      We have clearly stated this information in lines 297-299 and 343-345.

      (2) Please add a compass to Fig. 1A to indicate the orientation of the neuromast. It would also be helpful to add labels for developmental ages to all of the figures, rather than making the reader look it up in the legend. 

      We have updated Figure 1A and the corresponding figure legend in lines 882-883. We have denoted the larval age in the figure legends to keep the individual images uncluttered.

      (3) For the RT-PCR experiments in Figure 1, no negative control was included to show that supporting cell or neuronal genes are not detected in the purified hair cells and, vice versa, that neither isoform is detected in supporting cells or neurons. I ask only because there is a lot of immune-signal outside of the hair cells and I am curious whether that is secreted or might come from other cell types. For neurons and supporting cells, simply demonstrating absence of Sema7a overall would suffice.

      We have utilized the transgenic line Tg(myo6b:actb1-EGFP) that expresses the fluorophore GFP specifically in the hair cells of the neuromast. Unfortunately, we do not possess a transgenic line that reliably and specifically labels the support cells in the neuromast. Hence, in our sorting experiment the GFP-negative cells that are collected from the trunk segments of the larvae contain all the non-hair cells including epidermal cells, neuronal cells, and immune cells etc. Such a mixture of varied cellular identity may not serve as a reliable negative control. 

      In Figure 7, we have plotted the normalized expression values of the sema7a gene in the neuromast. The plot clearly depicts that the source of Sema7A is the young and the mature hair cells, not the support cells. We further confirm this observation by immunostaining, where the Sema7A signal is highly restricted to the hair cells and not in any other cell in the neuromast (Figure 1E). Immunostaining further demonstrates that the lateral line sensory arbors also do not produce the Sema7A protein (Figure 1H; Video 1).

      We agree with the reviewer that there are diverse immune cells, including macrophages, in and around the neuromast. These macrophages are dynamic and possess a highly ramified structure (Denans et al., 2022). In all our Sema7A immunostainings, we never observed structures that resemble macrophages. Although we cannot confirm that Sema7A is not expressed in a distant immune cell, we highly doubt that signal coming from immune cells is impacting hair cell innervation by the sensory arbors during homeostatic development.

      Denans et al., 2022: Nature Communications volume 13, Article number: 5356 (2022).

      (4) In Figure 1, Supplement 4, I do not see the immunogen labeled in blue. 

      We have corrected the figure legend. The immunogenic region of the Sema7A protein is now clearly denoted in the figure legend of Figure 1—figure supplement 4.

      (5) In Figure 2, please add a control image as requested, as that enables direct comparison. There is ample room in the figure. 

      We have updated Figure 2 and made the suggested change.

      (6) In Figure 2, Supplement 1, the FM4-64 data are not presented in a quantified fashion. Please report at least how many embryos showed reliable uptake and preferably how many hair cells per embryo showed reliable uptake. 

      We have quantified the FM 4-64 intensities in control and sema7a-/- mutant larvae. The new data are added to the manuscript in lines 142-146 and 577-579, and in Figure 2—figure supplement 1D.

      (7) In Figure 3, there seems to be a typo in the figure legend: "mutants in the same larvae" does not make sense to me. 

      We have corrected the error. The modified statement is represented in lines 1067-1068.

      (8) The text should refer more explicitly to the statistical tests reported in Table 1, i.e. as the results are presented. 

      In lines 1105 and 1109, we clearly state the statistical tests that were performed.

      (9) In Figure 6, Supplement 1, please show the raw data points not just the bar graphs

      We have updated the Figure 6—figure supplement 1.

      (10) Minor point: the authors state that they addressed the distance over which secreted Sema7A may act, but this was not evident to me in the text. Please make this finding clearer.

      We have clarified this information in lines 310-311.

      (11) Finally, the discussion contains a statement that is not supported by the data: "We have discovered dual modes of Sema7A function in vivo." They have discovered evidence that there are two isoforms, that loss of both disrupts connectivity, and that overexpression of only the secreted form can elicit growth from a distance. However, there is no direct evidence that the membrane-bound form is responsible for local effects. It is formally possible still that the phenotypes are a result of dual roles for the secreted form. It is clear that another manuscript is forthcoming that will expand on the role of the transmembrane form, but for this manuscript, the authors should make firm conclusions only about the data presented herein.

      Thank you for this suggestion. We have modified the manuscript in lines 425-434.

      Reviewer #4 (Recommendations For The Authors):

      The authors have made significant changes to the manuscript based on the comments of the reviewers. It is now suitable for publication.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This manuscript represents a cleanly designed experiment for assessing biological motion processing in children (mean age = 9) with and without ADHD. The group differences concerning accuracy in global and local motion processing abilities are solid, but the analyses suggesting dissociable relationships between global and local processing and social skills, age, and IQ are inconclusive. The results are useful in terms of understanding ADHD and the ontogenesis of different components of the processing of biological motion.

      We thank the editors and reviewers for their valuable feedback and constructive comments. We have carefully considered each point raised by the reviewers and made the necessary revisions to the manuscript. Regarding the relationships between global and local BM processing, the accumulated evidence from previous studies has converged on the dissociation of the two BM components, e.g., while global BM processing is susceptible to learning and practice, local BM processing does not show a learning trend (Chang and Troje, 2009; Grossman et al., 2004), and the brain activations in response to local and global BM cues are different (Chang et al., 2018; Duarte et al., 2022). Nevertheless, we concurred with reviewers that the evidence for such dissociation from the current study by itself is not strong enough. Therefore, we have toned down on this point and no longer claimed the dissociation (including the title). Based on the current results, we focused our discussion on the different aspects of BM processing in children with and without ADHD.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper presents a nice study investigating the impairments of biological motion perception in individuals with ADHD in comparison with neurotypical controls. Motivated by the idea that there is a relationship between biological motion perception and social capabilities, the authors investigated the impairments of local and global (holistic) biological motion perception, the diagnosis status, and several additional behavioral variables that are affected in ADHD (IQ, social responsiveness, and attention / impulsivity). Both local and global biological motion perception are impaired in ADHD individuals. In addition, the study demonstrates a significant correlation between local biological motion perception skills and the social responsiveness score in the ADHD group, but not in controls. A path analysis in the ADHD group suggests that general performance in biological motion perception is influenced mainly by global biological motion perception performance and attentional and perceptual reasoning skills.

      Strengths:

      It is true that there exists not much work on biological motion perception and ADHD. Therefore, the presented study contributes an interesting new result to the biological motion literature, and adds potentially also new behavioral markers for this clinical group. The design of the study is straightforward and technically sound, and the drawn conclusions are supported by the presented results.

      Thanks for this positive assessment of our work.

      Weaknesses:

      Some of the claims about the relationship between genetic factors and ADHD and the components of biological motion processing have to remain speculative at this point because genetic influences were not explicitly tested in this paper. Specifically, the hypothesis that the perception of human social interaction is critically based on a local mechanism for the detection of asymmetry in foot trajectories of walkers (this is what 'BM-local' really measures), or on the detection of live agents in cluttered scenes, seems not very plausible.

      Thanks for these comments. We agree that the relationship between genetic factors and BM perception remains to be further examined, as we did not test the genetic influences in this study. We have deleted the relevant discussion about genetics. Based on our results, we discuss the possible mechanisms behind the relationship between local BM processing and social interaction in the revised manuscript as follows:

      “As mentioned above, we found a significant negative correlation between the SRS total score and the accuracy of local BM processing, specifically in the ADHD group. This could be due to decreased visual input related to atypical local BM processing, which further impairs global BM processing. According to the two-process theory of biological motion processing61, local BM cues guide visual attention towards BM stimuli55,62. Consequently, the visual input of BM stimuli increases, facilitating the development of the ability to process global BM cues through learning21,63. The latter is a prerequisite for attributing intentions to others and facilitating social interactions with other individuals20,64,65. Thus, atypical local BM processing may contribute to impaired social interaction through altered visual inputs. Further empirical studies are required to confirm these hypotheses.” (lines 417 - 428)

      Based on my last comments, now the discussion has been changed in a way that tries to justify the speculative claims by citing a lot of other speculative papers, which does not really address the problem. For example, the fact that chicks walk towards biological motion stimuli is interesting. To derive that this verifies a fundamental mechanism in human biological motion processing is extremely questionable, given that birds do not even have a cortex. Taking the argumentation of the authors seriously, one would have to assume that the 'Local BM' mechanism is probably located in the mesencephalon in humans, and then would have to interact in some way with social perception differences of ADHD children. To me all this seems to make very strong (over-)claims. I suggest providing a much more modest interpretation of the interesting experimental result, based on what has been really experimentally shown by the authors and closely related other data, rather than providing lots of far-reaching speculations.

      In the same direction, in my view, go claims like 'local BM is an intrinsic trait' (L. 448), which is not only imprecise (maybe better 'mechanisms of processing of local BM cues') but also rather questionable. Likely, this 'local processing of BM' is a lower-level mechanism, located probably in early and mid-levels of the visual cortex, with a possible influence of lower structures. It seems not really plausible that this is related to classical trait variables in the sense of psychology, like personality, as seems to be suggested here. Also here I suggest a much more moderate and less speculative interpretation of the results.

      We thank the reviewer for pointing out these issues. According to these comments, we have carefully revised the discussion to avoid strong (over-)claims. We have deleted the example of chicks and substituted more empirical studies to explain our results. We agree that the local BM mechanism is probably located in subcortical regions in humans, as reported by some MRI studies (Chang et al., 2018; Hirai and Senju, 2020; Loula et al., 2005). We have added some evidence that atypical local BM processing may decrease visual inputs related to social information as follows:

      “According to the two-process theory of biological motion processing61, local BM cues guide visual attention towards BM stimuli55,62. Consequently, the visual input of BM stimuli increases, facilitating the development of the ability to process global BM cues through learning21,63. The latter is a prerequisite for attributing intentions to others and facilitating social interactions with other individuals20,64,65. Thus, atypical local BM processing may contribute to impaired social interaction through altered visual inputs.” (lines 421 - 427)

      We have also deleted the claim that 'local BM is an intrinsic trait' (originally L. 448) and the related discussion, as it was not conclusive based on the current study.

      Reviewer #2 (Public Review):

      Summary:

      Tian et al. aimed to assess differences in biological motion (BM) perception between children with and without ADHD, as well as relationships to indices of social functioning and possible predictors of BM perception (including demographics, reasoning ability and inattention). In their study, children with ADHD showed poorer performance relative to typically developing children in three tasks measuring local, global, and general BM perception. The authors further observed that across the whole sample, performance in all three BM tasks was negatively correlated with scores on the social responsiveness scale (SRS), whereas within groups a significant relationship to SRS scores was only observed in the ADHD group and for the local BM task. Local and global BM perception showed a dissociation in that global BM processing was predicted by age, while local BM perception was not. Finally, general (local & global combined) BM processing was predicted by age and global BM processing, while reasoning ability mediated the effect of inattention on BM processing.

      Strengths:

      Overall, the manuscript is presented in a clear fashion and methods and materials are presented with sufficient detail so the study could be reproduced by independent researchers. The study uses an innovative, albeit not novel, paradigm to investigate two independent processes underlying BM perception. The results are novel and have the potential to have wide-reaching impact on multiple fields.

      We appreciate the reviewer’s positive feedback very much.

      Weaknesses:

      The manuscript has greatly improved in clarity and methodological considerations in response to the review. There are only a few minor points which deserve the authors' attention:

      When outlining the motivation for the current study, results from studies in ADHD and ASD are used too interchangeably. The authors use a lack of evidence for contributing (psychological/developmental) factors on BM processing in ASD to motivate the present study and refer to evidence for differences between typical and non-typical BM processing using studies in both ASD and ADHD. While there are certainly overlapping features between the two conditions/neurotypes, they are not to be considered identical and may have distinct etiologies; therefore, the distinction between the two should be made clearer.

      We thank the reviewer for pointing out this issue. We have removed some unnecessary citations about ASD and referred to studies about social cognition in ADHD to elaborate the motivation of this study:

      “Further exploration of a diverse range of social cognitions (e.g., biological motion perception) can provide a fresh perspective on the impaired social function observed in ADHD. Moreover, recent studies have indicated that the social cognition in ADHD may vary depending on different factors at the cognitive, pathological, or developmental levels, such as general cognitive impairment5, symptoms severity8, or age5. Nevertheless, understanding how these factors relate to social cognitive dysfunction in ADHD is still in its infancy. Bridging this gap is crucial as it can help depict the developmental trajectory of social cognition and identify effective interventions for impaired social interaction in individuals with ADHD.” (lines 53 - 62)

      In the first/main analysis, it is unclear to me why in the revised manuscript the authors changed the statistical method from ANOVA/ANCOVA to independent samples t-tests (unless the latter were only used for post-hoc comparisons, then this needs to be stated). Furthermore, although p-values look robust, for this analysis too it should be indicated whether and how multiple comparison problems were accounted for.

      Thanks for the reviewer’s comments. According to the suggestions from reviewer #3, it may be inappropriate to treat gender as a covariate in ANOVA, as this may violate the assumptions of ANCOVA. To ensure that gender does not influence the results, we first separated boys and girls on the plots with different coloured individual data points, and there are no signs of a gender effect in the TD group. Second, we used t-tests to examine the difference between the TD and ADHD groups. Finally, we conducted a subsampling analysis with balanced data, and the results remained consistent.

      In part 1 of the results, we aimed to compare the task accuracies between the TD and ADHD groups in three independent tasks, which assess the participants’ abilities to process three types of BM cues. We assumed that individuals with ADHD show poorer performance in three tasks compared to TD individuals. With regard to that, we consider that multiple comparisons may not be necessary.

      Reviewer #3 (Public Review):

      Strengths:

      The authors present differences between ADHD and TD children in biological motion processing, and this question has not received as much attention as equivalent processing capabilities in autism. They use a task that appears well controlled. They raise some interesting mechanistic possibilities for differences in local and global motion processing, which are distinctions worth exploring. The group differences will therefore be of interest to those studying ADHD, as well as other developmental conditions, and those examining biological motion processing mechanisms in general.

      We appreciate the reviewer’s positive assessment of this work.

      Weaknesses:

      The data are not strong enough to support claims about differences between global and local processing wrt social communication skills and age. The mechanistic possibilities for why these abilities may dissociate in such a way are interesting, but the crucial tests of differences between correlations do not present a clear picture. Further empirical work would be needed to test the authors' claims. Specifics:

      The authors state frequently that it was the local BM task that related to social communication skills (SRS) and not the global tasks. However, the results section shows a correlation between SRS and all three tasks. The only difference is that when looking specifically within the ADHD group, the correlation is only significant for the local task. The supplementary materials demonstrate that tests of differences between correlations present an incomplete picture. Currently they have small samples for correlations, so this is unsurprising.

      Thanks for this comment. We agree with the reviewer that the relationship between local and global processing with social communication and age needs more empirical work. Based on our results, there are only possible dissociable roles of local and global BM processing. The accumulated evidence from previous studies has converged on this dissociation, e.g., while global BM processing is susceptible to learning and practice, local BM processing does not show a learning trend (Chang and Troje, 2009; Grossman et al., 2004), and the brain activations in response to local and global BM cues are different (Chang et al., 2018; Duarte et al., 2022). We concurred with reviewers that the evidence for such dissociation from the current study by itself is not strong enough. Therefore, we have toned down on this point and no longer emphasized the dissociation. Based on the current results, we focused our discussion on the different aspects of BM processing in children with and without ADHD. Future studies with larger sample sizes are needed to confirm this dissociable relationship.

      Theoretical assumptions. The authors make some statements about local vs global biological motion processing that should still be made more tentatively. They assume that local processing is genetically specified whereas global processing is a product of experience. These data in newborn chicks are controversial and confounded - I cannot remember the specifics but I think there is an upper vs lower visual field complexity difference here.

      We appreciate the reviewer’s suggestion. We agree that the relationship between genetic factors and BM perception remains to be further examined as we didn’t perform any genetic analysis in the current study. Some speculative papers have been removed, as has the statement about newborn chicks, given the controversial and confounded results. We have toned down our claims and provided a moderate interpretation of the results:

      “Sensitivity to local BM cues emerges early in life54,55 and involves rapid processing in the subcortical regions16,56-58. As a basic pre-attentive feature23, local BM cues can guide visual attention spontaneously59,60. In contrast, the ability to process global BM cues is related to slow cortical BM processing and is influenced by many factors such as attention25,26 and visual experience21,51. As mentioned above, we found a significant negative correlation between the SRS total score and the accuracy of local BM processing, specifically in the ADHD group. This could be due to decreased visual input related to atypical local BM processing, which further impairs global BM processing. According to the two-process theory of biological motion processing61, local BM cues guide visual attention towards BM stimuli55,62. Consequently, the visual input of BM stimuli increases, facilitating the development of the ability to process global BM cues through learning21,63. The latter is a prerequisite for attributing intentions to others and facilitating social interactions with other individuals20,64,65. Thus, atypical local BM processing may contribute to impaired social interaction through altered visual inputs.” (lines 413 - 427)

      “Few developmental studies have been conducted on local BM processing. The ability to process local BM cues remained stable and did not exhibit a learning trend21,25. A reasonable interpretation may be that local BM processing is a low-level mechanism, probably performed by the primary visual cortex and subcortical regions such as the superior colliculus, pulvinar, and ventral lateral nucleus14,56,61.” (lines 441- 446)

      Readability. The manuscript needs very careful proofreading and correction for grammar. There are grammatical errors throughout.

      We thank the reviewer for this feedback. We have performed thorough proofreading and corrected grammatical errors throughout the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I thank the authors for their revisions that address several of the minor points that I raised in my last review. A number of requests are still not sufficiently answered:

      L. 290 ff.: The model 'BM-local = age + gender etc.' is pretty sloppy notation. I think what is meant is that a GLM was used with the predictors gender etc. times appropriate beta_i values. These formulas should be corrected, or one should just say that a GLM was run with the predictors gender etc.

      The same criticism applies to these other models that follow.

      This was corrected.

      However, the corrected text remains sloppy. Example: 'BM-local = ...'. What exactly is 'BM-local', the accuracy, etc.? Here a precise notation should be given that clearly names which variables are used as predictors and target variables.

      We appreciate the reviewer’s suggestion. We clarified which variables are used in our model and gave them precise notations:

      “Three linear models were built to investigate the contributing factors: (a) ACClocal = β0 + β1 * age + β2 * gender + β3 * FIQ + β4 * QbInattention, (b) ACCglobal = β0 + β1 * age + β2 * gender + β3 * FIQ + β4 * QbInattention, and (c) ACCgeneral = β0 + β1 * age + β2 * gender + β3 * FIQ + β4 * QbInattention + β5 * ACClocal + β6 * ACCglobal. ACClocal, ACCglobal and ACCgeneral refer to the response accuracies of the three tasks in the ADHD group, and QbInattention is the standardised score for sustained attention function.” (lines 337 - 343)
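
      For readers who want to see what model (a) amounts to in practice, the following is a minimal sketch of an ordinary least-squares fit with an intercept and four predictors. It is not the authors' code: the data are synthetic, the coefficient values in TRUE_BETA are invented for illustration, and the helper names (solve_linear_system, fit_ols) are hypothetical. Only the Python standard library is used; a statistics package would normally do this work.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictors, target):
    """Return [b0, b1, ...] from the normal equations; b0 is the intercept."""
    design = [[1.0] + list(row) for row in predictors]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical coefficients: intercept, age, gender, FIQ, QbInattention.
TRUE_BETA = [0.30, 0.02, 0.01, 0.001, -0.03]

rng = random.Random(0)
rows = []
for i in range(40):  # 40 synthetic participants (assumed sample size)
    age = rng.uniform(6.0, 13.0)
    gender = float(i % 2)  # 0/1 coding, alternating so both levels occur
    fiq = rng.gauss(100.0, 15.0)
    qb_inattention = rng.gauss(0.0, 1.0)
    rows.append([age, gender, fiq, qb_inattention])

# Noise-free accuracies, so the fit should recover TRUE_BETA.
acc_local = [
    TRUE_BETA[0] + sum(b * x for b, x in zip(TRUE_BETA[1:], row)) for row in rows
]

beta_hat = fit_ols(rows, acc_local)
print([round(b, 4) for b in beta_hat])
```

      Because the synthetic target contains no noise, the estimated coefficients should match the invented ones up to rounding error, which is a quick sanity check on the fitting routine rather than a statement about the study's results.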

      All these models assume linearity of the combination of the predictors. Was this assumption verified?

      We referred to previous studies of BM perception in children, which found that the main predictor variables, including IQ (Rutherford et al., 2012; Jones et al., 2011) and age (Annaz et al., 2010; van et al., 2016), have a linear relation with the ability of BM processing.

      This answer is insufficient and not convincing. That a variable Y depends linearly on predictors A and B in some other study does not imply that it is also linear in predictor C, or that it shows no interactions with such predictors in the present study.

      What is needed here is the testing of models with interaction terms and verifying that such models are not better predictors. If authors do not want to do this, they need at least to clearly point out that they made the strong assumption of linearity of their model, which might be wrong and thus be a substantial limitation of their analysis.

      Thanks for the suggestion. We compared each candidate model with and without the relevant interaction terms. The results showed that the change in the coefficient of determination (R-squared, R2) between the two models was not statistically significant.
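
      A comparison of this kind can be sketched as a nested-model test on the change in R2. The example below is illustrative only and is not the authors' analysis: the data are synthetic, the two predictors and their effect sizes are invented, and the helper names are hypothetical. It fits a model without and with one interaction term and computes the standard F statistic for the R2 change, F = ((R2_full - R2_reduced) / q) / ((1 - R2_full) / (n - k_full)), where q is the number of added terms and k_full counts all parameters of the full model including the intercept. The statistic would then be compared against an F distribution with (q, n - k_full) degrees of freedom.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(predictors, target):
    """Fit OLS with an intercept and return the coefficient of determination."""
    design = [[1.0] + list(row) for row in predictors]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in design]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - f) ** 2 for y, f in zip(target, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot


def f_for_r2_change(r2_reduced, r2_full, n, k_full, q):
    """F statistic for adding q terms to a nested linear model."""
    return ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / (n - k_full))


# Synthetic, purely additive data (all values are assumptions).
rng = random.Random(1)
n_obs = 60
age = [rng.uniform(6.0, 13.0) for _ in range(n_obs)]
inattention = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
accuracy = [
    0.40 + 0.02 * a - 0.03 * q + rng.gauss(0.0, 0.05)
    for a, q in zip(age, inattention)
]

reduced = [[a, q] for a, q in zip(age, inattention)]
full = [[a, q, a * q] for a, q in zip(age, inattention)]  # adds the interaction

r2_reduced = r_squared(reduced, accuracy)
r2_full = r_squared(full, accuracy)
f_change = f_for_r2_change(r2_reduced, r2_full, n=n_obs, k_full=4, q=1)
print(round(r2_reduced, 3), round(r2_full, 3), round(f_change, 3))
```

      A small F value here would indicate that the interaction term adds little beyond the additive model, which is the pattern described in the response above; it does not by itself establish that the relationship is linear.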

      L. 296ff.: For model (b) it looks like general BM performance is strongly driven by the predictor global BM performance in the ADHD group. Does the same observation also apply to the controls?

      The same phenomenon was not observed in TD children. We have briefly discussed this point in the Discussion section of the revised manuscript (lines 449 - 459).

      Was such a path analysis also done for the TD subjects or not? If yes, was then also predicted that the variable BM-Global largely and directedly influences the variable BM-General? (The answer refers to the general discussion section, where no such analysis is presented, as far as I understand.)

      Thank you for your comment. We also conducted a path analysis in the TD group, similar to that in the ADHD group. There was no statistically significant mediation effect in the TD group. Please see Figure S3 for complete statistics.

      Reviewer #2 (Recommendations For The Authors):

      (1) Please add public access to the data repository so data availability can be assessed.

      The data analyzed during the study is available at https://osf.io/37p5s/.

      (2) Lines 119-115: The differences observed in ADHD participants in the studies referenced here were relative to what group? The last sentence here also refers to two groups, and it is difficult to gather which specific groups are meant, also because the two references relate to both ADHD and ASD samples. Please clarify.

      The suggestion is well taken. We have clarified the expressions accordingly:

      “Specifically, compared with the typically developing (TD) group, children with ADHD showed reduced activity of motion-sensitive components (N200) while watching biological and scrambled motions, although no behavioural differences were observed. Another study found that children with ADHD performed worse in BM detection with moderate noise ratios than the TD group32.” (lines 100 - 105)

      (3) Line 116: I'm not sure what is meant by 'despite initial indications' - please briefly specify/summarise here why the investigation into BM processing in ADHD is warranted.

      We thank the reviewer for pointing out this issue. We have rephrased this part and briefly specified “why the investigation into BM processing in ADHD is warranted”:

      “Despite initial findings about atypical BM perception in ADHD, previous studies on ADHD treated BM perception as a single entity, which may have led to misleading or inconsistent findings28. Hence, it is essential to deconstruct BM processing into multiple components and motion features.” (lines 108 -111)

      (4) Lines 290-293: Please complete the sentence.

      We thank the reviewer for pointing out this issue. The sentence has been completed:

      “For Tasks 2 and 3, where children were asked to detect the presence or discriminate the facing direction of the target walker, the TD group had higher accuracies than the ADHD group (Task 2 - TD: 0.70 ± 0.12, ADHD: 0.59 ± 0.12, t73 = 3.677, p < 0.001, Cohen's d = 0.861; Task 3 - TD: 0.79 ± 0.12, ADHD: 0.63 ± 0.17, t73 = 4.702, p < 0.001, Cohen's d = 1.100).” (lines 284 - 288)

      Reviewer #3 (Recommendations For The Authors):

      (1) Conclusions concerning differences between the local and global tasks wrt SRS and age (see above). I believe the authors need to reword throughout to reflect that the tests of differences between these crucial correlations did not present a clear picture.

      We have reworded throughout the paper to reflect the inconclusiveness with regard to the relationship between local and global processing with social communication based on this study only. Future studies with larger sample sizes are needed to confirm this conclusion. The mechanism for this dissociable relationship should be validated by more psychological tests in future studies.

      (2) I would again tone down the discussion of genetic specification of local processing, given it is highly controversial.

      We thank the reviewer for pointing out the issue. We agree that the genetic specification of local processing remains controversial. The interpretation of results about local BM processing has been rephrased. Please refer to our response to the point #2 mentioned.

      (3) The manuscript needs very careful proofreading and grammatical correction throughout.

      Thanks for the suggestion to check the grammar. We have carefully proofread the manuscript to correct grammatical errors.

    1. Author response:

      Response to Reviewer #1 (Public Review):

      We thank the reviewer for their constructive criticism of our study, their proposed solutions, and for highlighting areas of the methodology and analytical pipeline where explanations were unclear or unsatisfactory. We will take the reviewer’s feedback into account to improve the clarity and readability of the revised manuscript. We acknowledge the importance of ruling out eye movements as a potential confound. We address these concerns briefly below, but a more detailed explanation (and a full breakdown of the relevant analyses, including the corrected and uncorrected results) will be provided in the revised manuscript.

      First, the source of EEG activity recorded from the frontal electrodes is often unclear. Without an external reference, it is challenging to resolve the degree to which frontal EEG activity represents neural or muscular responses1. Thus, as a preventative measure against the potential contribution of eye movement activity, for all our EEG analyses, we only included activity from occipital, temporal, and parietal electrodes (the selected electrodes can be seen in the final inset of Figure 3).

      Second, as suggested by the reviewer, we re-ran our analyses using the activity measured from the frontal electrodes alone. If the source of the nonlinear decoding accuracy in the AV condition was muscular activity produced by eye movements, we would expect to observe better decoding accuracy from sensors closer to the source. Instead, we found that decoding accuracy from the frontal electrodes (peak d' = 0.08) was less than half that of decoding accuracy from the more posterior electrodes (peak d' = 0.18). These results suggest that the source of neural activity containing information about stimulus position was located over occipito-parietal areas, consistent with our topographical analyses (inset of Figure 4).

      Third, we compared the average eye movements between the three main sensory conditions (auditory, visual, and audiovisual). In the visual condition, there was little difference in eye movements corresponding to the five stimulus locations, likely because the visual stimuli were designed to be spatially diffuse. For the auditory and audiovisual conditions, there was more distinction between eye movements corresponding to the stimulus locations. However, these appeared to be the same between auditory and audiovisual conditions. If consistent saccades to audiovisual stimuli had been responsible for the nonlinear decoding we observed, we would expect to find a higher positive correlation between horizontal eye position and stimulus location in the audiovisual condition than in the auditory or visual conditions. Instead, we found no difference in correlation between audiovisual and auditory stimuli, indicating that eye movements were equivalent in these conditions and unlikely to explain better decoding accuracy for audiovisual stimuli.

      Finally, we note that the stricter eye movement criterion acknowledged in the Discussion section of the original manuscript resulted in significantly better audiovisual d' than the MLE prediction, but this difference did not survive cluster correction. This is an important distinction to make as, when combined with the results described above, it seems to support our original interpretation that the stricter criterion combined with our conservative measure of (mass-based) cluster correction2 led to a type 2 error.

      References

      (1) Roy, R. N., Charbonnier, S., & Bonnet, S. (2014). Eye blink characterization from frontal EEG electrodes using source separation and pattern recognition algorithms. Biomedical Signal Processing and Control, 14, 256–264.

      (2) Pernet, C. R., Latinus, M., Nichols, T. E., & Rousselet, G. A. (2015). Cluster-based computational methods for mass univariate analyses of event-related brain potentials/fields: A simulation study. Journal of Neuroscience Methods, 250, 85–93.

      Response to Reviewer #2 (Public Review):

      We thank the reviewer for their insight and constructive feedback. As emphasized in the review, an interesting question that arises from our results is that, if the neural data exceeds the optimal statistical decision (MLE d'), why doesn’t the behavioural data? We agree with the reviewer’s suggestion that more attention should be devoted to this question, and plan to provide a deeper discussion of the relationship between behavioural and neural super-additivity in the revised manuscript. We also note that while this discrepancy remains unexplained, our results are consistent with the literature. That is, both non-linear neural responses (single-cell recordings) and behavioural responses that match MLE are reliable phenomena in multisensory integration1,2,3,4.

      One possible explanation for this puzzling discrepancy is that behavioural responses occur sometime after the initial neural response to sensory input. There are several subsequent neural processes between perception and a behavioural response5, all of which introduce additional noise that may obscure super-additive perceptual sensitivity. In particular, the mismatch between neural and behavioural accuracy may be the result of additional neural processes that translate sensory activity into a motor response to perform the behavioural task.

      Our measure of neural super-additivity (exceeding optimally weighted linear summation) differs from how it is traditionally assessed (exceeding summation of single neuron responses)2. However, neither method has yet fully explained how this neural activity translates to behavioural responses, and we think that more work is needed to resolve the abovementioned discrepancy. However, our method will facilitate this work by providing a reliable method of measuring neural super-additivity in humans, using non-invasive recordings.
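
      To make the benchmark of "optimally weighted linear summation" concrete, the sketch below uses the textbook maximum-likelihood formulation for two independent cues with Gaussian noise. The response does not state how the MLE prediction was computed, so this is an assumption and not the authors' pipeline; the function names and the numeric inputs are hypothetical. Under these assumptions the combined variance is sigma_A^2 * sigma_V^2 / (sigma_A^2 + sigma_V^2), each cue is weighted by its relative reliability, and the predicted audiovisual sensitivity is d'_AV = sqrt(d'_A^2 + d'_V^2).

```python
import math


def mle_weights(sigma_a, sigma_v):
    """Reliability-based weights for the auditory and visual cues."""
    w_a = sigma_v ** 2 / (sigma_a ** 2 + sigma_v ** 2)
    return w_a, 1.0 - w_a


def mle_combined_sigma(sigma_a, sigma_v):
    """Standard deviation of the optimally combined estimate."""
    return math.sqrt(
        (sigma_a ** 2 * sigma_v ** 2) / (sigma_a ** 2 + sigma_v ** 2)
    )


def mle_predicted_dprime(dprime_a, dprime_v):
    """Predicted bimodal sensitivity under independent Gaussian noise."""
    return math.hypot(dprime_a, dprime_v)


def superadditivity(dprime_av_observed, dprime_a, dprime_v):
    """Positive values mean the observed d' exceeds the MLE prediction."""
    return dprime_av_observed - mle_predicted_dprime(dprime_a, dprime_v)


# Hypothetical unimodal sensitivities, chosen only for illustration.
predicted = mle_predicted_dprime(0.3, 0.4)
excess = superadditivity(0.6, 0.3, 0.4)
print(round(predicted, 3), round(excess, 3))
```

      In this framing, a super-additive response is any observed audiovisual d' that remains above the prediction after appropriate statistical correction.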

      References

      (1) Alais, D., & Burr, D. (2004). The ventriloquist effect results from near-optimal bimodal integration. Current Biology, 14(3), 257–262.

      (2) Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415(6870), 429–433.

      (3) Meredith, M. A., & Stein, B. E. (1983). Interactions among converging sensory inputs in the superior colliculus. Science, 221, 389–391.

      (4) Stanford, T. R., & Stein, B. E. (2007). Superadditivity in multisensory integration: putting the computation in context. Neuroreport, 18, 787–792.

      (5) Heekeren, H., Marrett, S., & Ungerleider, L. (2008). The neural systems that mediate human perceptual decision making. Nature Reviews Neuroscience, 9, 467–479.

    1. Reviewer #1 (Public Review):

      In this study, Gonzalez Alam et al. report a series of functional MRI results about the neural processing from the visual cortex to high-order regions in the default-mode network (DMN), compiling evidence from task-based functional MRI, resting-state connectivity, and diffusion-weighted imaging. Their participants were first trained to learn the association between objects and rooms/buildings in a virtual reality experiment; after the training was completed, in the task-based MRI experiment, participants viewed the objects from the earlier training session and judged if the objects were in the semantic category (semantic task) or if they were previously shown in the same spatial context (spatial context task). Based on the task data, the authors utilised resting-state data from their previous studies, visual localiser data also from previous studies, as well as structural connectivity data from the Human Connectome Project, to perform various seed-based connectivity analysis. They found that the semantic task causes more activation of various regions involved in object perception while the spatial context task causes more activation in various regions for place perception, respectively. They further showed that those object perception regions are more connected with the frontotemporal subnetwork of the DMN while those place perception regions are more connected with the medial-temporal subnetwork of the DMN. Based on these results, the authors argue that there are two main pathways connecting the visual system to high-level regions in the DMN, one linking object perception regions (e.g., LOC) leading to semantic regions (e.g., IFG, pMTG), the other linking place perception regions (e.g., parahippocampal gyri) to the entorhinal cortex and hippocampus.

      Below I provide my takes on (1) the significance of the findings and the strength of evidence, (2) my guidance for readers regarding how to interpret the data, as well as several caveats that apply to their results, and finally (3) my suggestions for the authors.

      (1) Significance of the results and strength of the evidence

      I would like to praise the authors for, first of all, trying to associate visual processing with high-order regions in the DMN. While many vision scientists focus specifically on the macroscale organisation of the visual cortex, relatively few efforts are made to unravel how neural processing in the visual system goes on to engage representations in regions higher up in the hierarchy (a nice precedent study that looks at this issue is by Konkle and Caramazza, 2017). We all know that visual processing goes beyond the visual cortex, potentially further into the DMN, but there's no direct evidence. So, in this regard, the authors made a nice try to look at this issue.

      Having said this, the authors' characterisation of the organisation of the visual cortex (object perception/semantics vs. place perception/spatial contexts) does not go beyond what has been known for many decades by vision neuroscience. Specifically, over the past two decades, numerous proposals have been put forward to explain the macroscale organisation of the visual system, particularly the ventrolateral occipitotemporal cortex. A lateral-medial division has been reliably found in numerous studies. For example, some researchers found that the visual cortex is organised along the separation of foveal vision (lateral) vs. peripheral vision (medial), while others found that it is structured according to faces (lateral) vs. places (medial). Such a bipartite division is also found in animate (lateral) vs. inanimate (medial), small objects (lateral) vs. big objects (medial), as well as various cytoarchitectonic and connectomic differences between the medial side and the lateral side of the visual cortex. Some more recent studies even demonstrate a tripartite division (small objects, animals, big objects; see Konkle and Caramazza, 2013). So, in terms of their characterisation of the visual cortex, I think Gonzalez Alam et al. do not add any novel evidence to what the community of neuroscience has already known.

      However, the authors' effort to link visual processing with various regions of the DMN is certainly novel, and their attempt to gather converging evidence with different methodologies is commendable. The authors are able to show that, in an independent sample of resting-state data, object-related regions are more connected with semantic regions in the DMN while place-related regions are more connected with navigation-related regions in the DMN, respectively. Such patterns reveal a consistent spatial overlap with their Kanwisher-type face/house localiser data and also concur with the HCP white-matter tractography data. Overall, I think the two-pathway explanation that the authors seek to argue is backed by converging evidence. The lack of a travelling-wave type of analysis to show the spatiotemporal dynamics across the cortex from the visual cortex to high-level regions is disappointing, though, because I expected that this type of analysis would provide the most convincing evidence of a 'pathway' going from one point to another. Dynamic causal modelling or Granger causality may also buttress the authors' claim of a pathway because many readers, like me, would feel that there is not enough evidence to convincingly prove the existence of a 'pathway'.

      (2) Guidance to the readers about interpretation of the data

      The organisation of the visual cortex and the organisation of the DMN historically have been studied in parallel with little crosstalk between different communities of researchers. Thus, the work by Gonzalez Alam et al. has made a nice attempt to look at how visual processing goes beyond the realm of the visual cortex and continues into different subregions of the DMN.

      While the authors of this study have utilised multiple methods to obtain converging evidence, there are several important caveats in the interpretation of their results:

      (1) While the authors choose to use the term 'pathway' to call the inter-dependence between a set of visual regions and default-mode regions, their results have not convincingly demonstrated a definitive route of neural processing or travelling. Instead, the findings reveal a set of DMN regions are functionally more connected with object-related regions compared to place-related regions. The results are very much dependent on masking and thresholding, and the patterns can change drastically if different masks or thresholds are used.

      (2) Ideally, if the authors could demonstrate the dynamics between the visual cortex and DMN in the primary task data, it would be very convincing evidence for characterising the journey from the visual cortex to DMN. Instead, the current connectivity results are derived from a separate set of resting state data. While the advantage of the authors' approach is that they are able to verify certain visual regions are more connected with certain DMN regions even under a task-free situation, it falls short of explaining how these regions dynamically interact to convert vision into semantic/spatial decision.

      (3) There are several results that are difficult to interpret, such as their psychophysiological interactions (PPI), representational similarity analysis, and gradient analysis. For example, typically for PPI analysis, researchers interrogate the whole brain to look for PPI connectivity. Their use of targeted ROI is unusual, and their use of spatially extensive clusters that encompass fairly large cortical zones in both occipital and temporal lobes as the PPI seeds is also an unusual approach. As for the gradient analysis, the argument that the semantic task is higher on Gradient 1 than the spatial task based on the statistics of p-value = 0.027 is not a very convincing claim (unhelpfully, the figure on the top just shows quite a few blue 'spatial dots' on the hetero-modal end which can make readers wonder if the spatial context task is really closer to the unimodal end or it is simply the authors' statistical luck that they get a p-value under 0.05). While it is statistically significant, it is weak evidence (and it is not pertinent to the main points the authors try to make).

      (3) My suggestion for the authors

      There are several conceptual-level suggestions that I would like to offer to the authors:

      (1) If the pathway explanation is the key argument that you wish to convey to the readers, an effective connectivity type of analysis, such as Granger causality or dynamic causal modelling, would be helpful in revealing that there is a starting point and an end point in the pathway, as well as revealing the directionality of neural processing. While both of these methods have their issues (e.g., Granger causality is not suitable for haemodynamic data, DCM's selection of seeds is susceptible to bias, etc.), they can help you get started in testing whether the path during task performance does exist. Alternatively, a travelling-wave type of analysis (such as the results by Raut et al. 2021 published in Science Advances) can also be useful to support your claims of the pathway.

      (2) I think the thresholding for resting state data needs to be explained - by the look of Figure 2E and 3E, it looks like whole-brain un-thresholded results, and then you went on to compute the conjunction between these un-thresholded maps with network templates of the visual system and DMN. This does not seem statistically acceptable, and I wonder if the conjunction that you found would disappear and reappear if you used different thresholds. Thus, for example, if the left IFG cluster (which you have shown to be connected with the visual object regions) would disappear when you apply a conventional threshold, this means that you need to seriously consider the robustness of the pathway that you seek to claim... it may be just a wild goose that you are chasing.

      (3) There are several analyses that are hard to interpret and you can consider only reporting them in the supplementary materials, such as the PPI results and representational similarity analysis, as none of these are convincing. These analyses do not seem to add much value to make your argument more convincing and may elicit more methodological critiques, such as statistical issues, the set-up of your representational theory matrix, and so on.

    1. Author response:

      We thank the editors for the eLife assessment:

      “This study employed a comprehensive approach to examining how the MT+ region integrates into a complex cognition system in mediating human visuo-spatial intelligence. While the findings are useful, the experimental evidence is incomplete and the study design, hypothesis, analyses, writing, and presentation need to be improved.” We plan to revise the manuscript in line with the comments in the Public Reviews.

      We are grateful for the excellent and very helpful comments, and provide our provisional responses below.

      Reviewer #1 (Public Review):

      Summary:

      The study of human intelligence has been the focus of cognitive neuroscience research, and finding some objective behavioral or neural indicators of intelligence has been an ongoing problem for scientists for many years. Melnick et al, 2013 found for the first time that the phenomenon of spatial suppression in motion perception predicts an individual's IQ score. This is because IQ is likely associated with the ability to suppress irrelevant information. In this study, a high-resolution MRS approach was used to test this theory. In this paper, the phenomenon of spatial suppression in motion perception was found to be correlated with the visuo-spatial subtest of gF, while both variables were also correlated with the GABA concentration of MT+ in the human brain. In addition, there was no significant relationship with the excitatory transmitter Glu. At the same time, SI was also associated with MT+ and several frontal cortex FCs.

      Strengths:

      (1) 7T high-resolution MRS is used.

      (2) This study combines the behavioral tests, MRS, and fMRI.

      Weaknesses:

      (1) In the intro, it seems to me that the multiple-demand (MD) regions are the key in this study. However, I didn't see any results associated with the MD regions. Did I miss something??

      We thank the reviewer for pointing this out. After careful consideration, we agree with this point. According to the results of Melnick et al. (2013), motion surround suppression (SI) and the time thresholds for small and large gratings, which represent hMT+ functionality, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. This suggests that hMT+ has the potential to be a core of the MD system. However, because our results address only “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated by reverberation with frontal cortex”, they are not yet sufficient to show that hMT+ is a core node of the MD system. We will therefore adjust the explanatory logic of the article, emphasizing the redundancy reduction performed by hMT+ in visuo-spatial intelligence and the improvement of information-processing efficiency, while de-emphasizing the significance of hMT+ in the MD system.

      (2) How was the sample size determined? Is it sufficient??

      We thank the reviewer for pointing this out. We used G*Power to determine our sample size. Melnick et al. (2013) reported a medium effect between SI and the Perceptual Reasoning sub-ability (r = 0.47). We used this r value as the correlation coefficient (ρ H1), setting the power at the commonly used threshold of 0.8 and the alpha error probability at 0.05. The required sample size was calculated to be 26, which indicates that our study has adequate power to yield valid statistical results. Furthermore, compared to earlier within-subject studies such as Schallmo et al. (2018), which used 22 datasets to examine GABA levels in MT+ and the early visual cortex (EVC), our study includes a larger dataset.
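
      As a rough cross-check of this kind of calculation, the sketch below uses the Fisher z approximation for the sample size needed to detect a correlation. This is not G*Power's exact routine, so its result may differ from the G*Power figure by a few participants. The response does not state whether a one- or two-tailed test was specified, so both are computed; the function name and its defaults are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.8,
                      two_sided: bool = True) -> int:
    """Approximate sample size to detect a correlation r (Fisher z method)."""
    z = NormalDist()
    # Critical value for the chosen alpha level and sidedness.
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    # Quantile corresponding to the requested power.
    z_beta = z.inv_cdf(power)
    # Fisher z-transform of the expected correlation.
    effect = math.atanh(r)
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


n_two = n_for_correlation(0.47, two_sided=True)
n_one = n_for_correlation(0.47, two_sided=False)
print(n_two, n_one)
```

      Reporting the sidedness of the test alongside r, alpha, and power makes the required sample size reproducible from the response alone.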

      (3) In Schallmo et al. (eLife, 2018), there was no correlation between GABA concentration and SI. How can the different results here be justified?

      We thank the reviewer for pointing this out. There are several differences between the two studies:

      a. While the earlier study by Schallmo et al. (2018) employed 3T MRS, we utilize 7T MRS, enhancing our ability to detect and measure GABA with greater accuracy.

      b. Schallmo et al. (2018) used the bilateral hMT+ as the MRS measurement region, whereas we used the left hMT+. Our reasons for focusing on the left hMT+ are described in our response to Reviewer #1, point (6). Briefly, the use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      c. The MRS voxel in Schallmo et al. (2018) was 3 cm isotropic, whereas ours is 2 cm isotropic. This helps us locate hMT+ more precisely and exclude more white matter signal.

      (4) Basically this study contains the data of SI, BDT, GABA in MT+ and V1, Glu in MT+ and V1-all 6 measurements. There should be 6x5/2 = 15 pairwise correlations. However, not all of these results are included in Figure 1 and supplementary 1-3. I understand that it is not necessary to include all figures. But I suggest reporting all values in one Table.

      We thank the reviewer for the good suggestion. We plan to report all values in a correlation matrix.
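
      The count of 15 in the reviewer's comment follows from n(n - 1)/2 with n = 6 measurements. The short sketch below enumerates the pairs that a full correlation table would need to cover; the variable labels are hypothetical shorthand for the six measurements, not names used in the manuscript.

```python
from itertools import combinations

# Hypothetical shorthand labels for the six measurements in the study.
measures = ["SI", "BDT", "GABA_MT", "GABA_V1", "Glu_MT", "Glu_V1"]

# Every unordered pair of distinct measures: n * (n - 1) / 2 entries.
pairs = list(combinations(measures, 2))
n_pairs = len(pairs)

for first, second in pairs:
    print(f"{first} vs {second}")
print(n_pairs)
```

      Each printed pair corresponds to one cell in the upper triangle of the planned correlation matrix.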

      (5) In Melnick (2013), the IQ scores were measured by the full set of WAIS-III, including all subtests. However, this study only used the visual spatial domain of gF. I wonder why only the visuo-spatial subtest was used not the full WAIS-III?

      We thank the reviewer for pointing this out. The decision was informed by Melnick’s findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well established that hMT+ is a sensory cortical region involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activities. Given this context, the Perceptual Reasoning sub-ability was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.

      (6) In the functional connectivity part, there is no explanation as to why only the left MT+ was set to the seed region. What is the problem with the right MT+?

      We thank the reviewer for pointing this out. The main reason is that our MRS ROI is the left hMT+, and we would like to keep the ROI consistent across the different models. The use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011). In addition, we will check the results of our localizer to confirm whether similar findings are consistently replicated.

      (7) In Melnick (2013), the authors also reported the correlation between IQ and absolute duration thresholds of small and large stimuli. Please include these analyses as well.

      We thank the reviewer for the good advice. Including this result will help researchers compare our results with those of Melnick et al. (2013). We plan to add such a figure in the revised version.

      Reviewer #2 (Public Review):

      Summary:

      Recent studies have identified specific regions within the occipito-temporal cortex as part of a broader fronto-parietal, domain-general, or "multiple-demand" (MD) network that mediates fluid intelligence (gF). According to the abstract, the authors aim to explore the mechanistic roles of these occipito-temporal regions by examining GABA/glutamate concentrations. However, the introduction presents a different rationale: investigating whether area MT+ specifically, could be a core component of the MD network.

      Strengths:

      The authors provide evidence that GABA concentrations in MT+ and its functional connectivity with frontal areas significantly correlate with visuo-spatial intelligence performance. Additionally, serial mediation analysis suggests that inhibitory mechanisms in MT+ contribute to individual differences in a specific subtest of the Wechsler Adult Intelligence Scale, which assesses visuo-spatial aspects of gF.

      Weaknesses:

      (1) While the findings are compelling and the analyses robust, the study's rationale and interpretations need strengthening. For instance, Assem et al. (2020) have previously defined the core and extended MD networks, identifying the occipito-temporal regions as TE1m and TE1p, which are located more rostrally than MT+. Area MT+ might overlap with brain regions identified previously in Fedorenko et al., 2013; however, the authors attribute these activations to attentional enhancement of visual representations in the more difficult conditions of their tasks. For the aforementioned reasons, it is unclear why the authors chose MT+ as their focus. A stronger rationale for this selection is necessary, as is an account of how it fits with the core/extended MD networks.

      We really appreciate the reviewer’s comments. Our reasons for focusing on hMT+ are as follows. According to the results of Melnick et al. (2013), motion surround suppression (SI) and the time thresholds for small and large gratings, which represent hMT+ functionality, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with high correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. In addition, in Fedorenko et al. (2013), the averaged MD activity region appears to overlap with hMT+. Based on these findings, we assume that hMT+ has the potential to be a core of the MD system.

      (2) Moreover, although the study links MT+ inhibitory mechanisms to a visuo-spatial component of gF, this evidence alone may not suffice to position MT+ as a new core of the MD network. The MD network's definition typically encompasses a range of cognitive domains, including working memory, mathematics, language, and relational reasoning. Therefore, the claim that MT+ represents a new core of MD needs to be supported by more comprehensive evidence.

      We thank the reviewer for pointing this out. After careful consideration, we agree with this point. Because our results address only visuo-spatial intelligence, they are not yet sufficient to show that hMT+ is a core node of the MD system. We will therefore adjust the explanatory logic of the article, emphasizing the redundancy reduction performed by hMT+ in visuo-spatial intelligence and the improvement of information-processing efficiency, while de-emphasizing the significance of hMT+ in the MD system.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript aims to understand the role of GABA-ergic inhibition in the human MT+ region in predicting visuo-spatial intelligence through a combination of behavioral measures, fMRI (for functional connectivity measurement), and MRS (for GABA/glutamate concentration measurement). While this is a commendable goal, it becomes apparent that the authors lack fundamental understanding of vision, intelligence, or the relevant literature. As a result, the execution of the research is less coherent, dampening the enthusiasm of the review.

      Strengths:

      (1) Comprehensive Approach: The study adopts a multi-level approach, i.e., neurochemical analysis of GABA levels, functional connectivity, and behavioral measures to provide a holistic understanding of the relationship between GABA-ergic inhibition and visuo-spatial intelligence.

      (2) Sophisticated Techniques: The use of ultra-high field magnetic resonance spectroscopy (MRS) technology for measuring GABA and glutamate concentrations in the MT+ region is a recent development.

      Weaknesses:

      Study Design and Hypothesis

      (1) The central hypothesis of the manuscript posits that "3D visuo-spatial intelligence (the performance of BDT) might be predicted by the inhibitory and/or excitation mechanisms in MT+ and the integrative functions connecting MT+ with the frontal cortex." However, several issues arise:

      (1.1) The Suppression Index depicted in Figure 1a, labeled as the "behavior circle," appears irrelevant to the central hypothesis.

      We thank the reviewer for pointing this out. In our study, the inhibitory mechanisms in hMT+ are conceptualized through two models: the neurotransmitter model and the behavior model. The Suppression Index is essential for elucidating the local inhibitory mechanisms within the behavior model. However, we acknowledge that our initial presentation in the introduction may not have clearly articulated our hypothesis, potentially leading to misunderstandings. We plan to revise the introduction to clarify these connections and make the relevance of the Suppression Index clear.

      (1.2) The construct of 3D visuo-spatial intelligence, operationalized as the performance in the Block Design task, is inconsistently treated as another behavioral task throughout the manuscript, leading to confusion.

      We thank the reviewer for pointing this out. We acknowledge that our manuscript may have inconsistently presented this construct across different sections, causing confusion. To address this, we plan to ensure a consistent description of 3D visuo-spatial intelligence in both the introduction and the discussion sections. However, we would like to retain 'Block Design task score' in the results section so that readers can see which subtest we used.

      (1.3) The schematics in Figure 1a and Figure 6 appear too high-level to be falsifiable. It is suggested that the authors formulate specific and testable hypotheses and preregister them before data collection.

      We thank the reviewer for pointing this out. We plan to revise Figure 1a to make it less abstract and more logical. For Figure 6, the schematic represents our theoretical framework of how hMT+ contributes to 3D visuo-spatial intelligence. We believe the elements within this framework are grounded in related theories and supported by evidence discussed in our results and discussion sections, making them specific and testable.

      (2) Central to the hypothesis and design of the manuscript is a misinterpretation of a prior study by Melnick et al. (2013). While the original study identified a strong correlation between WAIS (IQ) and the Suppression Index (SI), the current manuscript erroneously asserts a specific relationship between the block design test (from WAIS) and SI. It should be noted that in the original paper, WAIS comprises Similarities, Vocabulary, Block design, and Matrix reasoning tests in Study 1, while the complete WAIS is used in Study 2. Did the authors conduct other WAIS subtests other than the block design task?

      We thank the reviewer for pointing this out. Reviewer #1 raised the same question, and we repeat our answer here: “The decision was informed by Melnick’s findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well established that hMT+ is a sensory cortical region involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activities. Given this context, the Perceptual Reasoning sub-ability was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.”

      (3) Additionally, there are numerous misleading references and unsubstantiated claims throughout the manuscript. As an example of misleading reference, "the human MT ... a key region in the multiple representations of sensory flows (including optic, tactile, and auditory flows) (Bedny et al., 2010; Ricciardi et al., 2007); this ideally suits it to be a new MD core." The two references in this sentence are claims about plasticity in the congenitally blind with sensory deprivation from birth, which is not really relevant to the proposal that hMT+ is a new MD core in healthy volunteers.

      We thank the reviewer for pointing this out. We have carefully read the corresponding references, considered the corresponding theories, and agree with these comments. Because our results address only “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated by reverberation with frontal cortex”, they are not yet sufficient to show that hMT+ is a core node of the MD system. We will therefore adjust the explanatory logic of the article, emphasizing the redundancy reduction performed by hMT+ in visuo-spatial intelligence and the improvement of information-processing efficiency, while de-emphasizing the significance of hMT+ in the MD system. In addition, regarding the potential central role of hMT+ in the MD system, we agree with the reviewer's view that research on hMT+ as a multisensory integration hub mainly focuses on developmental processes. Meanwhile, in adults, the MST region of hMT+ is considered a multisensory integration area for visual and vestibular inputs, which potentially supports the role of hMT+ in multitasking multisensory systems (Gu et al., J. Neurosci, 26(1), 73–85, 2006; Fetsch et al., Nat. Neurosci, 15, 146–154, 2012). Further research could explore how other intelligence sub-abilities, such as working memory and language comprehension, are facilitated by hMT+'s features.

      Another example of unsubstantiated claim: the rationale for selecting V1 as the control region is based on the assertion that "it mediates the 2D rather than 3D visual domain (Born & Bradley, 2005)". That's not the point made in the Born & Bradley (2005) paper on MT. It's crucial to note that V1 is where the initial binocular convergence occurs in cortex, i.e., inputs from both the right and left eyes to generate a perception of depth.

      Thank you for pointing this out. We acknowledge the inappropriate citation of "Born & Bradley, 2005," which focuses solely on the structure and function of the visual area MT. However, we believe that choosing hMT+ as the domain for 3D visual analysis and V1 as the control region is justified. Cumming and DeAngelis (Annu Rev Neurosci, 24:203–238, 2001) state that binocular disparity provides the visual system with information about the three-dimensional layout of the environment, and that the link between perception and neuronal activity is stronger in the extrastriate cortex (especially MT) than in the primary visual cortex (V1). This supports our choice and emphasizes the relevance of MT+ in our study. We will correct the citation in the revised version.

      Results & Discussion

      (1) The missing correlation between SI and BDT is crucial to the rest of the analysis. The authors should discuss whether they replicated the pattern of results from Melnick et al. (2013) despite using only one WAIS subtest.

      We thank the reviewer for the suggestion. The correlation result is currently in the supplemental material; we will move it to the main text.

      (2) ROIs: can the authors clarify if the results are based on bilateral MT+/V1 or just those in the left hemisphere? Can the authors plot the MRS scan area in V1? I would be surprised if it's precise to V1 and doesn't spread to V2/3 (which is fine to report as early visual cortex).

      We thank the reviewer for the suggestion. We plan to plot the MRS scanning area of the V1 ROI and use a visual template to check whether the scanning area includes V2/3. If it does, we will refer to it as the early visual cortex rather than specifically V1 in our reporting.

      (3) Did the authors examine V1 FC with either the frontal regions and/or whole brain, as a control analysis? If not, can the author justify why V1 serves as the control region only in the MRS but not in FC (Figure 4) or the mediation analysis (Figure 5)? That seems a little odd given that control analyses are needed to establish the specificity of the claim to MT+

      We thank the reviewer for the suggestion. We plan to run the V1 FC-behavior analysis as a control. For the mediation analysis, since V1 GABA/Glu shows no correlation with BDT score, the precondition for applying a mediation analysis is not met.

      (4) It is not clear how to interpret the similarity or difference between panels a and b in Figure 4.

      We thank the reviewer for pointing this out. We plan to explain the difference between panels a and b further in the revised version. Panel a shows the hMT+-seeded FC that correlates with BDT score, which clearly involves the frontal cortex. Panel b shows the hMT+-seeded FC that correlates with SI, which involves fewer regions. The overlapping region is what we are interested in, as it may explain how local inhibitory mechanisms operate in 3D visuo-spatial intelligence. In addition, we will revise Figure 4 to mark the overlapping region.

      (5) SI is not relevant to the authors' a priori hypothesis, but is included in several mediation analyses. Can the authors do model comparisons between the ones in Figure 5c, d, and Figure S6? In other words, is SI necessary in the mediation model? There seem to be discrepancies between the necessity of SI in Figures 5c/S6 vs. Figure 5d.

      We thank the reviewer for highlighting this point. The relationship between the Suppression Index (SI) and our a priori hypotheses is elaborated in the response to reviewer 3, section (1). SI plays a crucial role in explicating how local inhibitory mechanisms function within the context of the 3D visuo-spatial task. Additionally, Figure 5c illustrates the interaction between the frontal cortex and hMT+, showing how the effects from the frontal cortex (BA46) on the Block Design Task are fully mediated by SI. This further underscores the significance of SI in our model.
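
      For readers less familiar with the terminology, "fully mediated" can be illustrated with a minimal product-of-coefficients sketch. This is not the authors' analysis pipeline: the function name, variable names, and toy data are hypothetical, and significance testing (for example, bootstrapping the indirect effect) is omitted. Full mediation corresponds to a direct effect near zero once the mediator is included.

```python
from statistics import mean


def _centered(values):
    """Subtract the mean so that intercepts drop out of the regressions."""
    m = mean(values)
    return [v - m for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mediation_effects(x, m, y):
    """Product-of-coefficients mediation for X -> M -> Y (hypothetical helper).

    a      : slope of M regressed on X
    b      : slope of M in the regression Y ~ X + M
    direct : slope of X in the regression Y ~ X + M
    total  : slope of Y regressed on X alone (equals direct + a * b)
    """
    xc, mc, yc = _centered(x), _centered(m), _centered(y)
    sxx, smm, sxm = _dot(xc, xc), _dot(mc, mc), _dot(xc, mc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)

    a = sxm / sxx
    total = sxy / sxx

    # Solve the 2x2 normal equations for Y ~ X + M.
    det = sxx * smm - sxm ** 2
    direct = (sxy * smm - smy * sxm) / det
    b = (smy * sxx - sxy * sxm) / det

    return {"a": a, "b": b, "direct": direct, "indirect": a * b, "total": total}


# Synthetic toy data in which Y depends on X only through M.
x = [1, 2, 3, 4, 5]
m = [2, 1, 4, 3, 6]
y = [2 * v for v in m]
effects = mediation_effects(x, m, y)
```

      In this toy case the direct effect vanishes and the total effect equals the indirect effect, which is the pattern the response describes for BA46, SI, and the Block Design Task. A model comparison of the kind the reviewer requests would contrast fits with and without the mediator.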

      (6) The sudden appearance of "efficient information" in Figure 6, referring to the neural efficiency hypothesis, raises concerns. Efficient visual information processing occurs throughout the visual cortex, starting from V1. Thus, it appears somewhat selective to apply the neural efficiency hypothesis to MT+ in this context.

      We thank the reviewer for highlighting this point. There is no doubt that V1 is involved in efficient visual information processing. However, in our results, V1 GABA has no significant correlation with the BDT score, suggesting that efficient processing in V1 might not sufficiently account for the individual differences in 3D visuo-spatial intelligence. Additionally, we will clarify our use of the neural efficiency hypothesis by incorporating it into the introduction of our paper to better frame our argument.

      Transparency Issues:

      (1) I don't think it's acceptable to make the claim that "All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary information". It is the results or visualizations of data analysis, rather than the raw data themselves, that are presented in the paper/supp info.

      We thank the reviewer for pointing this out. We realize that this statement could cause confusion, and we will delete it.

      (2) No GitHub link has been provided in the manuscript to access the source data, which limits the reproducibility and transparency of the study.

      We thank the reviewer for pointing this out. We will include the GitHub link in the revised version.

      Minor:

      "Locates" should be replaced with "located" throughout the paper. For example: "To investigate this issue, this study selects the human MT complex (hMT+), a region located at the occipito-temporal border, which represents multiple sensory flows, as the target brain area."

      We thank the reviewer for pointing this out. We will revise it.

      Use "hMT+" instead of "MT+" to be consistent with the term in the literature.

      We thank the reviewer for pointing this out. We agree to use hMT+, consistent with the literature.

      "Green circle" in Figure 1 should be corrected to match its actual color.

      We thank the reviewer for pointing this out. We will revise it.

      The abbreviation for the Wechsler Adult Intelligence Scale should be "WAIS," not "WASI."

      We thank the reviewer for pointing this out. We will revise it.

    1. Reviewer #2 (Public Review):

      Summary:

      The authors re-analyse MEG data from a speech production and perception study and extend their previous Granger causality analysis to a larger number of cortical-cortical and in particular cortical-subcortical connections. Regions of interest were defined by means of a meta-analysis using Neurosynth.org and connectivity patterns were determined by calculating directed influence asymmetry indices from the Granger causality analysis results for each pair of brain regions. Abbasi et al. report feedforward signals communicated via fast rhythms and feedback signals via slow rhythms below 40 Hz, particularly during speaking. The authors highlight one of these connections between the right cerebellum lobule VI and auditory association area A5, where in addition the connection strength correlates negatively with the strength of speech tracking in the theta band during speaking (significant before multiple comparison correction). Results are interpreted within a framework of active inference by minimising prediction errors.

      While I find investigating the role of cortical-subcortical connections in speech production and perception interesting and relevant to the field, I am not yet convinced that the methods employed are fully suitable to this endeavour or that the results provide sufficient evidence to make the strong claim of dissociation of bottom-up and top-down information flow during speaking in distinct frequency bands.

      Strengths:

      The investigation of electrophysiological cortical-subcortical connections in speech production and perception is interesting and relevant to the field. The authors analyse a valuable dataset, where they spent a considerable amount of effort to correct for speech production-related artefacts. Overall, the manuscript is well-written and clearly structured.

      Weaknesses:

      The description of the multivariate Granger causality analysis did not allow me to fully grasp how the analysis was performed and I hence struggled to evaluate its appropriateness.

      Knowing that (1) filtered Granger causality is prone to false positives and (2) recent work demonstrates that significant Granger causality can simply arise from frequency-specific activity being present in the source but not the target area without functional relevance for communication (Schneider et al. 2021) raises doubts about the validity of the results, in particular with respect to their frequency specificity. These doubts are reinforced by what I perceive as an overemphasis on results that support the assumption of specific frequencies for feedforward and top-down connections, while findings not aligning with this hypothesis appear to be underreported. Furthermore, the authors report some main findings that I found difficult to reconcile with the data presented in the figures. Overall, I feel the conclusions with respect to frequency-specific bottom-up and top-down information flow need to be moderated and that some of the reported findings need to be checked and if necessary corrected.

      Major points

      (1) I think more details on the multivariate GC approach are needed. I found the reference to Schaum et al., 2021 not sufficient to understand what has been done in this paper. Some questions that remained for me are:

      (i) Does multivariate here refer to the use of the authors' three components per parcel or to the conditioning on the remaining twelve sources? I think the latter is implied when citing Schaum et al., but I'm not sure this is what was done here?

      If it was not: how can we account for spurious results based on indirect effects?

      (ii) Did the authors check whether the GC of the source-target pairs was reliably above the bias level (as Schaum et al. did for each condition separately)? If not, can they argue why they think that their results would still be valid? Does it make sense to compute DAIs on connections that were below the bias level? Should the data be re-analysed to take this concern into account?

      (iii) You may consider citing the paper that introduced the non-parametric GC analysis (which Schaum et al. then went on to apply): Dhamala M, Rangarajan G, Ding M. Analyzing Information Flow in Brain Networks with Nonparametric Granger Causality. Neuroimage. 2008; 41(2):354-362. https://doi.org/10.1016/j.neuroimage.2008.02.020
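
      To make the quantity in question (ii) concrete, here is a minimal sketch. It assumes the directed influence asymmetry index takes the normalised-difference form common in the Granger causality literature, DAI = (GC(a→b) − GC(b→a)) / (GC(a→b) + GC(b→a)); the manuscript may define it differently. The function name and the `bias` argument are hypothetical and not taken from the paper.

```python
from typing import Optional


def directed_asymmetry_index(gc_ab: float, gc_ba: float,
                             bias: float = 0.0) -> Optional[float]:
    """Normalised asymmetry between two Granger causality estimates.

    gc_ab : GC from area a to area b at one frequency
    gc_ba : GC from area b to area a at the same frequency
    bias  : assumed bias level (e.g. estimated from surrogate data)

    Returns None when neither direction exceeds the bias level, because
    an asymmetry between two sub-threshold values is not interpretable.
    """
    if gc_ab <= bias and gc_ba <= bias:
        return None
    total = gc_ab + gc_ba
    if total == 0:
        return None
    return (gc_ab - gc_ba) / total


# Positive values mean a -> b dominates; negative values mean b -> a dominates.
strong = directed_asymmetry_index(0.75, 0.25, bias=0.1)
below_bias = directed_asymmetry_index(0.05, 0.01, bias=0.1)
```

      Without the bias check, the second call would return about 0.67, a seemingly strong asymmetry built from two estimates that are indistinguishable from noise, which is the concern behind asking whether DAIs were computed on sub-threshold connections.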

      (2) GC has been discouraged for filtered data as it gives rise to false positives due to phase distortions and the ineffectiveness of filtering in the information-theoretic setting as reducing the power of a signal does not reduce the information contained in it (Florin et al., 2010; Barnett and Seth, 2011; Weber et al. 2017; Pinzuti et al., 2020 - who also suggest an approach that would circumvent those filter-related issues). With this in mind, I am wondering whether the strong frequency-specific claims in this work still hold.

      (3) I found it difficult to reconcile some statements in the manuscript with the data presented in the figures:

      (i) Most notably, the considerable number of feedforward connections from A5 and STS that project to areas further up the hierarchy at slower rhythms (e.g. L-A5 to R-PEF, R-Crus2, L-CB6, L-Tha, L-FOP and L-STS to R-PEF, L-FOP, L-TOPJ or R-A5 as well as R-STS both to R-Crus2, L-CB6, L-Th) contradict the authors' main message that 'feedback signals were communicated via slow rhythms below 40 Hz, whereas feedforward signals were communicated via faster rhythms'. I struggled to recognise a principled approach that determined which connections were highlighted and reported and which ones were not.

      (ii) "Our analysis also revealed robust connectivity between the right cerebellum and the left parietal cortex, evident in both speaking and listening conditions, with stronger connectivity observed during speaking. Notably, Figure 4 depicts a prominent frequency peak in the alpha band, illustrating the specific frequency range through which information flows from the cerebellum to the parietal areas." There are two peaks discernible in Figure 4, one notably lower than the alpha band (rather theta or even delta), the other at around 30 Hz. Nevertheless, the authors report and discuss a peak in the alpha band.

      (iii) In the abstract: "Notably, high-frequency connectivity was absent during the listening condition." and p.9 "In contrast with what we reported for the speaking condition, during listening, there is only a significant connectivity in low frequency to the left temporal area but not a reverse connection in the high frequencies."

      While Fig. 4 shows significant connectivity from R-CB6 to A5 in the gamma frequency range for the speaking, but not for the listening condition, interpreting comparisons between two effects without directly comparing them is a common statistical mistake (Makin and Orban de Xivry). The spectrally-resolved connectivity in the two conditions actually looks remarkably similar and I would thus refrain from highlighting this statement and indicate clearly that there were no significant differences between the two conditions.

      (iv) "This result indicates that in low frequencies, the sensory-motor area and cerebellum predominantly transmit information, while in higher frequencies, they are more involved in receiving it."

      I don't think that this statement holds in its generality: L-CB6 and R-3b both show strong output at high frequencies, particularly in the speaking condition. While they seem to transmit information mainly to areas outside A5 and STS, these effects are strong and should be discussed.

      (4) "However, definitive conclusions should be drawn with caution given recent studies raising concerns about the notion that top-down and bottom-up signals can only be transmitted via separate frequency channels (Ferro et al., 2021; Schneider et al., 2021; Vinck et al., 2023)."

      I appreciate this note of caution and think it would be useful if it were spelled out to the reader why this is the case so that they would be better able to grasp the main concerns here. For example, Schneider et al. make a strong point that we expect to find Granger-causality with a peak in a specific frequency band for areas that are anatomically connected when the sending area shows stronger activity in that band than the receiving one, simply because of the coherence of a signal with its own linear projection onto the other area. The direction of a Granger causal connection would in that case only indicate that one area shows stronger activity than the other in the given frequency band. I am wondering to what degree the reported connectivity pattern can be traced back to regional differences in frequency-specific source strength or to differences in source strength across the two conditions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      • Again, in Figure 5, were FoxP3/CD4+ cells enumerated?

      Author Response: Fig 5 showed that the inflammatory score, and activation of CD4 and CD8 cells, were lower in the intestine of DSS-treated mice transplanted with Jag1Ndr/Ndr lymphocytes than in those transplanted with Jag1+/+ lymphocytes. However, in Figure 5 we had not quantified the number of FoxP3/CD4+ cells (Tregs). We agree that it would be interesting to know whether the dampened intestinal inflammation (in response to a classical inflammatory disease model (DSS-treatment)) is also mediated by excess Tregs. We will therefore now quantify Foxp3+ cells on the intestinal sections of experimental animals used for acquisition of data in Fig 5.

      Description of the revisions that have already been incorporated in the transferred manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Reviewer 1 comment: This is an interesting study that examines defects in the Jag1ndr/ndr mouse model of Alagille syndrome. The novel aspects of this manuscript are the comparisons, at many levels, between the mouse model and ALG patient samples, including an examination of immune profiles. The conclusions that the Jag1ndr/ndr mouse model is an accurate representation of the human ALG syndrome appear valid. However, the reported differences in immune profiles, particularly in the Jag1ndr/ndr mouse model, are difficult to understand. The data presented indicate a reduction in CD4+ cells in the Jag1ndr/ndr mouse at day P3 in both liver and spleen. Additionally, the authors report differences between the Jag1ndr/ndr mouse and controls at day P30 in the relative percentages of DN, DP and SP CD4 and CD8 cells in the thymus. When examining the peripheral lymphoid system, CD4+ numbers are the same in both the Jag1ndr/ndr animals and controls; however, CD8+ numbers are reduced and FoxP3/CD4+ cells are increased in both the spleen and the thymus. FoxP3/CD4+ T cells are usually assumed to be regulatory T cells that dampen the inflammatory responses of T cells. Therefore, the increase in this population in an animal model of what is assumed to be an inflammatory disease is confusing and confounding. The authors do not present a clear analysis of how they feel an increase of Tregs would lead to this disease. One possibility is that this population is not functioning as conventional Tregs and rather are promoting inflammation but this conclusion would require a functional analysis of this population of cells, at the very least in an in vitro analysis of T cell suppression. From an immunologist's point of view, their data are antithetical to what one would expect to find in an inflammatory disease. Perhaps this reviewer is missing an important point but if I am missing it, then others who read this manuscript may also be confused.

      Author Response: We thank the reviewer for carefully assessing our work, and for noting which aspects of the immune analyses should be more thoroughly explained. We apologize for any confusion, which a clearer introduction will help to avoid.

      Alagille syndrome is not thought of as an inflammatory disorder; it is a congenital disorder affecting bile duct development (Kohut et al 2021, Semin Liver Dis). During normal bile duct development, JAG1+ portal fibroblasts signal to NOTCH2+ hepatoblasts to instruct bile duct development. In the context of low JAG1 signaling, hepatoblasts either fail to adopt a cholangiocyte fate, or fail to undergo bile duct morphogenesis, resulting in bile duct paucity and cholestasis. This cholestasis should activate inflammatory processes leading to fibrosis, which is the subject of this study.

      We agree with the reviewer that Tregs would be expected to suppress inflammation, and our data are consistent with Treg suppression of inflammation. We show, for the first time, that Tregs are enriched in Jag1Ndr/Ndr mice (Fig 4) and present evidence that they suppress inflammation (Fig 5) and fibrosis (Fig 6), which could explain the atypical fibrosis seen in patients with ALGS.

      To clarify that ALGS is a genetic liver disease affecting bile duct formation, we:

      1. Modified and extended the following text in the Introduction (Page 2, lines 14-17): “ALGS is mainly caused by mutations in the Notch ligand JAGGED1 (JAG1, 94%) (Mašek & Andersson, 2017; Oda et al, 1997), affecting bile duct development and morphogenesis, resulting in bile duct paucity and cholestasis. Immune dysregulation has also been described (Tilib Shamoun et al, 2015), but how this might interact with liver disease in ALGS to affect fibrosis is not known.”
      2. Introduced the disease, the animal model, and the scientific question in a schematic in new Fig 1A.

      Reviewer 1 comment: Minor points that should be addressed include: • The source of the cells used in the transfer experiments reported in Figure 5 is unclear. Are they using total spleen cells with T, B and myeloid cells or are they using purified T cells? And if it is the latter, have they assessed the ratio of CD4+ versus FoxP3/CD4+ cells in the transferred cells?

      Author Response: Total spleen cells including all lymphocytes were transplanted, as described in Materials and Methods. The constituent T-cell populations are characterized and shown in Fig 4F. To clarify this, we:

      1. Added the text “Adoptive transfer of lymphocytes” to the schematic in Fig 5A, Fig S5A, and Fig 6A, and
      2. Modified the opening paragraph related to results presented in Fig 5 and Fig S5 in the following way (page 8, line 209): “To investigate Jag1Ndr/Ndr T cell function, we performed adoptive transfer of the splenic lymphocytes into Rag1-/- mice, which lack mature B- and T cell populations, but provide a host environment with normal Jag1 (Mombaerts et al, 1992).”

      To acknowledge that B-cells and innate lymphoid cells might contribute to the observed results, we included the following sentence in the Discussion:

      (page 12, lines 369-371) “Finally, our experimental setup does not exclude an additional contribution of other lymphocytes (B-cells or innate lymphoid cells) to the BDL-induced fibrosis, and selective testing of the individual subpopulations would be an intriguing follow up to this study.”

      Reviewer 1 comment: In the DSS experiments in Figure 5, there does not appear to be a no DSS control. What does the architecture look like without DSS?

      Author Response: The intestinal architecture and phenotype of mice transplanted with Jag1+/+ or Jag1Ndr/Ndr lymphocytes, not treated with DSS, are presented in Supplementary Figure 5. In the absence of DSS, Jag1+/+- or Jag1Ndr/Ndr-transplanted mice exhibit no overt differences in survival or weight gain/loss. The intestinal inflammatory score was not different in the two conditions and was 2.29 +/-0.44 and 2.03 +/-0.92 for Jag1+/+- or Jag1Ndr/Ndr-transplanted mice, respectively.

      To compare the results with and without DSS, we added the following text to the results section, when describing the DSS results (Page 9, lines 223-226):

      “As expected, histological scoring of intestinal and colonic inflammation revealed elevated inflammation in Jag1+/+→Rag1-/- mice treated with DSS (Fig. 5C,D) compared to Jag1+/+→Rag1-/- mice not treated with DSS (Fig. S5). However, there was significantly less inflammation in Jag1Ndr/Ndr→Rag1-/- mice than in Jag1+/+→Rag1-/- mice (Fig. 5C,D).”

      Reviewer 1 comment: The authors noted that splenomegaly was observed in the Jag1ndr/ndr mouse model. Again this is antithetical to what one would expect when one sees an increase in FoxP3/CD4+ T regs.

      Author Response: We thank the reviewer for pointing out a possible discrepancy related to Fig 1, in which we report the presence of splenomegaly. Although there can be multiple causes of splenomegaly, it is one of the hallmarks of portal hypertension (as also corroborated by Reviewer 2), which is tightly connected with liver fibrosis and present in patients with ALGS, and we report it as such in the manuscript. To clarify this, we added the following text sections:

      1. Results (page 2, lines 37,38): “Liver fibrosis compresses blood vessels and reduces their blood flow, leading to portal hypertension, a serious consequence of liver disease which can manifest as splenomegaly.”
      2. Discussion (page 13, line 394-401): “Splenomegaly has been described as a consequence of portal hypertension in ALGS (Kamath et al, 2020), but could also be attributed to immune-related pathology. Jag1Ndr/Ndr mice exhibit splenomegaly as early as P10, which is exacerbated at P30 (Fig. 1E,F). Patients with other liver diseases display portal hypertension and cirrhosis, with both splenomegaly and hypersplenism associated with a high CD4+/CD8+ ratio, but a low Treg+/CD4+ ratio (Nomura et al, 2014). However, Jag1Ndr/Ndr mice present with splenomegaly but not hypersplenism. An overactive spleen (hypersplenism) would remove red blood cells which are instead enriched in Jag1Ndr/Ndr mice, and Tregs were enriched in Jag1Ndr/Ndr mice, not depleted as seen in cirrhosis/hypersplenism. These data are thus consistent with portal hypertension-induced splenomegaly rather than hypersplenism.”

      Reviewer #1 (Significance (Required)):

      Reviewer 1 comment: The strengths of this paper are the careful comparisons between the mouse model and the human ALG syndrome. These comparisons are valuable and worth publication.

      Author Response: We thank the reviewer for these comments.

      Reviewer 1 comment: Weaknesses are stated above. Needs a clearer explanation for their immune analysis.

      Author Response: We thank the reviewers for highlighting points requiring clarification and hope the proposed text changes and additional data presented in response to the comments of all three reviewers lead to a significant clarification of the immunological aspect of our study.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Reviewer 2 comment:

      Summary: Masek and colleagues use multi-pronged studies on the Jag1[Ndr/Ndr] mouse model of Alagille syndrome (ALGS) combined with transcriptomic analysis on livers from patients with ALGS to elucidate the potential mechanisms regulating liver fibrosis in this disease. The authors first show that Jag1[Ndr/Ndr] animals develop pericellular and perisinusoidal fibrosis and exhibit evidence for portal hypertension, similar to patients with ALGS. Single-cell RNA-sequencing indicated more hepatoblasts and fewer hepatocytes, relatively speaking, in Jag1[Ndr/Ndr] P3 livers, which suggested hampering of hepatoblast differentiation to hepatocytes. Deconvolution of previously generated bulk RNA-seq data from Jag1[Ndr/Ndr] P10 livers and GSEA on RNAseq data from livers of these mice and patients with ALGS confirmed the P3 scRNA-seq observations and indicated mild pro-inflammatory activation of immature hepatocytes in ALGS livers. GSEA also suggested an inability of Jag1[Ndr/Ndr] livers to attract T cells upon cholestatic injury. Indeed, 25-color flow cytometry on liver and spleen from mutant and control mice indicated a defect in T cell response to cholestasis in this model. The authors then examined the effects of the Ndr mutation on T-cell development and function. They found that the Ndr/Ndr thymi were significantly smaller than control thymi. Moreover, Ndr/Ndr thymi showed an increase in CD4+ T-cells and Tregs at the expense of double-positive T-cells. The authors then performed lymphocyte transplantation studies and concluded that Ndr/Ndr T-cells fail to mount an adequate response to inflammation in a DSS model of ulcerative colitis. The authors tested the contribution of Ndr/Ndr immune cells to liver fibrosis in a model of experimentally induced cholestasis (bile duct ligation; BDL). Ndr/Ndr T-cells did not show any defects in migrating into the liver upon BDL. However, the periportal fibrosis observed in the BDL model was reduced in animals receiving Ndr/Ndr immune cells compared to those receiving Jag1+/+ immune cells. This was accompanied by significantly less aSMA staining in these livers. Finally, reanalysis of bulk RNAseq data from liver samples from ALGS and other liver diseases suggested that the presence of FOXP3+ T-reg cells in the liver is associated with higher liver fibrosis in non-ALGS liver diseases but lower liver fibrosis in ALGS livers. The authors have used an impressive combination of single-cell RNA-sequencing, reanalysis of previous bulk RNA-sequencing data from their group and others, 25-color FACS analysis, and adoptive immune transfer experiments in this manuscript, and systematically provide quantification and statistical analysis for their data. Overall, this is an interesting and important study. Prior studies are referenced appropriately. The text and figures are clear and accurate. I don't think any additional experiments are essential. However, the issues listed under Major comments should be discussed and clarified in the manuscript, especially the first item.

      Author Response: We sincerely thank the reviewer for the comprehensive and insightful assessment of our manuscript. We are particularly gratified to note your acknowledgment of the thoroughness of our experimental approach and the clarity of our presentation. We are pleased that no further experiments are required, and will address the points raised under Major comments, which enhance our study's quality and accessibility.

      Reviewer 2 comment:

      Major comments:

      • Only a small fraction of the cells in scRNA-seq experiments have been assigned to hepatocytes/hepatoblast clusters, with the majority of these cells allocated to Hepato-Ery cluster. This suggests that many hepatocytes and potentially hepatoblasts have been lost during sample preparation. The authors should discuss this issue and its potential implications on the interpretation of the cell ratios and gene expression conclusions of scRNA-seq data.

      Author Response: We agree with the reviewer regarding this aspect of our study. We mentioned this limitation in the supplementary methods section: “Liver parenchymal cells constituted ~6.5% of cells at E16.5, and ~7.5% of cells at P3 and included mesenchymal cells, endothelial cells, hepatoblasts and hepatocytes (Fig. S1D), this parenchymal proportion is lower than in vivo, but consistent with ex vivo liver digest (Guilliams et al, 2022).” We recognize it may be too inaccessible there, and we thus added the following text to the Discussion section of the manuscript: (Pages 11-12, lines 330-337) “A limitation of this study is the underrepresentation of the hepatoblast/cyte parenchymal cells in the scRNA-seq dataset (Fig. 2A-D), which constituted ~6.5% of analyzed cells at E16.5, and ~7.5% of cells at P3 (Fig. S1D). This parenchymal proportion is lower than in vivo, but is consistent with scRNA seq datasets obtained with ex vivo liver digest (Guilliams et al, 2022). One risk is that cell stress as a result of dissociation could result in further loss of injured Jag1Ndr/Ndr hepatocytes, impacting the interpretation of cell type abundance. Nuclear scRNAseq can overcome cell type-dependent dissociation sensitivity bias (Guilliams et al, 2022), and could provide further insights into Jag1Ndr/Ndr livers at the single cell level. Nonetheless, both bulk RNA seq deconvolution and histological analyses confirmed that patients and Jag1Ndr/Ndr mice exhibit hepatoblast enrichment and less differentiated hepatocytes.”

      Reviewer 2 comment: The Jag1[Ndr/Ndr] strain is an excellent model for various aspects of ALGS phenotypes. However, when it comes to linking the effects of this mutation to the function of a specific cell type, it is worth considering that Jag1[Ndr/Ndr] might not recapitulate the effects of loss of one copy of JAG1 observed in most patients with ALGS. This is especially important given the sensitivity of various cellular and organ-level processes to the degree of Notch pathway activation. In the context of the present manuscript, it is possible that what the authors have observed in Jag1[Ndr/Ndr] lymphocytes does not mirror how a JAG1-heterozygous human lymphocyte behaves. This is not a major concern, but it is worth considering.

      Author Response: We agree and thus added the following discussion paragraph (page 11, lines 315-321): “In patients with ALGS, who have a single mutation in either JAG1 or NOTCH2, the remnant healthy allele(s) could be expected to mediate signaling. However, some JAG1 mutations exhibit dominant negative effects (Ponio et al, 2007; Xiao et al, 2013; Guan et al, 2023), which could entail further repression of JAG1/NOTCH2 signaling. In this context, it is important to note that the Jag1Ndr/Ndr mice are homozygous for the missense mutation, but retain some JAG1 activity, and it is not clear to which degree this mimics JAG1 heterozygosity in humans. It would be of interest to test whether Jag1 potency affects hepatoblast differentiation or injury-induced reversion of hepatocytes in patients as a function of their genotype.”

      Reviewer 2 comment: • The basis for the opposite type of correlation between COL1A1 expression and FOXP3 level in ALGS versus non-ALGS liver disease is not clear.

      Author Response: We thank the reviewer for pointing out the unclear interpretation of the patient data. In patients with ALGS, the extent of fibrosis is likely to be highly multifactorial, involving (as we show) hepatocyte immaturity, dampened inflammation, and immune system dysregulation (possibly involving more than T-cells). Since human patients are so heterogeneous, teasing apart the relative contribution of each is currently outside the scope of our study, but will be an important area of future research. Nonetheless we thought it was important and interesting to show these patterns in supplementary Fig 6, now extended with further data and analyses, and described in the following manner:

      Results section: (page 10, lines 267-275) “Liver damage in non-ALGS liver disease (using liver injury marker LGALS3BP) (Yang et al, 2021), was positively correlated with recruitment of lymphocytes (including CD8A+ and FOXP3+ populations of T cells), as well as the extent of fibrosis (COL1A1 abundance) (Fig. S6G). However, in ALGS, the extent of liver damage, lymphocyte recruitment and fibrosis were unlinked (Fig. S6G). These data are in line with the observation that liver stiffness (a proxy for fibrosis) in ALGS is independent of biomarkers of liver disease (Leung et al, 2023). While Treg infiltration in ALGS was independent of liver damage, it exhibited a tendency towards a negative correlation with fibrosis (Fig. S6G), corroborating that elevated levels of Tregs may limit fibrosis in ALGS. Altogether, these data suggest that the liver and lymphocytes may be differentially affected in different patients with ALGS, a disorder that is well known for its heterogeneous presentation.”

      Minor comments:

      • Page 2, last paragraph of Introduction, Page 12 last sentence, and Supplementary Methods: Please use "adoptive immune transfer" instead of "adaptive immune transfer".
      • Pages 3 and 4: Reference is made to Figures 3E-O, which appears to be Figure 2E-O.
      • Figure 3 legend: "Analysis in (E) is one-way ANOVA with Dunnett's multiple comparison test". Panel E compares two means, so ANOVA is not the appropriate statistical analysis for these data. Is this sentence related to panel D?
      • Page 9: Please correct misspelling: "response to intestinal insult (Fig. 5). W therefore".
      • The Science Translational Medicine references lack page numbers.

      Author Response: We thank the reviewer deeply for taking the time to meticulously note and convey these errors, helping us to correct them. The suggested corrections have been implemented. Science Transl Med is an online journal and does not have page numbers; we have added an issue number to facilitate retrieval of these references.

      • *

      Additionally, we noticed that the image of a consecutive liver section with CYP1A2 staining from Jag1Ndr/Ndr liver in Fig 2 L was accidentally flipped along the horizontal axis, which we have now corrected. We also changed the scRNAseq cell cluster naming from Hepatoblasts/cytes, Hepato_Ery, and Kupffer cells, Kuffer cells_Ery to Hepatoblasts/cytes I, and II, and Kupffer cells I and II, respectively, to match the Neutrophil progenitors I and II naming convention. Names were subsequently also changed in Fig S1 and methods.

      **Referees cross-commenting**

      To my knowledge, ALGS is not considered to be an inflammatory disorder. Furthermore, the splenomegaly observed in the mouse model could be due to portal hypertension rather than a primary immune disturbance. Having said that, I agree with the other reviewers that the manuscript will benefit from further discussion and clarification on the immune-related observations.

      Author Response: We thank Reviewer 2 for indicating to Reviewer 1 that ALGS is not considered an inflammatory disorder, which we agree with. It was not our intention to convey this idea. To avoid confusion, we now:

      1. Added a schematic in Fig 1A.
      2. Modified and extended the following text in the Introduction (Page 2, lines 14-17): “ALGS is mainly caused by mutations in the Notch ligand JAGGED1 (JAG1, 94%) (Mašek & Andersson, 2017; Oda et al, 1997), affecting bile duct development and morphogenesis, resulting in bile duct paucity and cholestasis. Immune dysregulation has also been described (Tilib Shamoun et al, 2015), but how this might interact with liver disease in ALGS to affect fibrosis is not known.” Furthermore, we have addressed or will address all comments from reviewer 1 to clarify the immune-related observations.

      Reviewer #2 (Significance (Required)):

      Despite severe cholestasis, ALGS patients do not show as much fibrosis as other cholestatic diseases, including biliary atresia (BA). A previous study had suggested that this phenomenon could be due to the difference in the nature of reactive hepatobiliary cells in ALGS compared to BA (Fabris et al, 2007). Moreover, a number of studies have suggested a role for Notch pathway activation in several cell types in the liver in the development of liver fibrosis (for example, Sawitza et al, Hepatology, 2009; Chen et al, Plos One, 2012; Duan et al, Hepatology, 2018; Yu et al, Science Translational Medicine, 2021). However, although a role for Notch signaling in T-cells is well established, it was not known whether impaired T-cell development/function contributes to reduced fibrosis in ALGS liver disease. Accordingly, the current manuscript provides novel insight into the mechanism of fibrosis in this disease. Moreover, the observation that Jag1-mutant T-cells do not confer as much protection as control T-cells to immunodeficient mice subjected to DSS-induced ulcerative colitis provides strong evidence for impaired T-cell immunity in this ALGS model and might help explain other aspects of ALGS phenotypes.

      The manuscript will be of interest to broad audience (Notch signaling, cholestatic liver disease, mechanisms of liver fibrosis, T-cell development).

      I have expertise in Notch signaling and in using animal models of human developmental disorders.

      Author Response: We thank the reviewer for the balanced assessment of our manuscript in light of the current knowledge, and for highlighting its importance in the context of not only Notch and ALGS, but also other cholestatic and fibrotic liver diseases.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The article entitled "Jag1 Insufficiency Disrupts Neonatal T Cell Differentiation and Impairs Hepatocyte Maturation, Leading to Altered Liver Fibrosis" by Mašek et al described the role of Notch ligand JAGGED1 (JAG1) in the T-cell differentiation contributing to liver fibrosis and immune system development in ALGS. This article is well written and has important preliminary findings that could establish Jag1 and its downstream signaling pathways as potential therapeutic targets to attenuate liver fibrosis.

      Author Response: We thank the reviewer for recognizing our work and pointing out the therapeutic implications of our findings.

      Reviewer 3 comment 1: Minor comments: In page 4, they mentioned that "the hepatoblast marker alpha fetoprotein (AFP) was 3.1-fold enriched (Fig. 3J,K), while the mature hepatocyte marker CYP1A2 protein was 1.7-fold less expressed (Fig. 3L-M)", the figure numbers should be changed to 2J, K, L-M etc.

      Author Response: We thank the reviewer for identifying these errors. The suggested corrections have been implemented.

      Reviewer 3 comment 2: In liver fibrosis the Th17 cells play crucial roles. Please show the IL17A mRNA level in the liver in the Jag1Ndr/Ndr mice compared to the Jag1+/+ mice.

      Author Response: We thank the reviewer for the insightful comments. We did indeed investigate the Th17 vs Treg immune response. However, we detected neither the Th17-expressed Il17, Il17a and Il17f nor Il21 and Il22 mRNA in the bulk RNA data, suggesting that their expression is either masked or that these cells are not present in significant numbers within the liver tissue at P10. This prevents us from drawing any conclusions about this cell population.

      Reviewer 3 comment 3: Also, please show the expression level of pro-inflammatory molecules, for example, TNFα, IL1β, MCP1 etc and the level of MMPs (especially MMP2, MMP8, MMP9) in the livers of the mouse models used.

      Author Response: The expression of Il10, Il1b and Mcp1 (Ccl2) was presented in the manuscript (Fig. 2O), and we attach in the response to reviewers a full list together with the expression levels of Mmp2/8/9, Tnfa, Ifng, the Il17 receptor family and Tgfb1-3. Of these, Mmp8 (0.9 Log2fold change = 1.9-fold), Ccl2 (2.2 Log2fold change = 4.7-fold), and Il17rb (1.1 Log2fold change = 2.1-fold) were significantly upregulated, but they do not indicate any specific leukocyte population’s response. This is in line with data in Fig S2E, demonstrating a dominance of myeloid over adaptive immune response in the GSEA of the immune KEGGs.

      Since lymphocytes are underrepresented in the bulk transcriptomics, and individual genes might report activity of many different cell types, we chose to focus on the list of genes shown to be markers of activated hepatocytes, to avoid over-interpretation of the RNA sequencing data. Instead, the immune analyses were based on flow cytometry data, which we expect should accurately report cell type abundance across organ systems.
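The conversion between log2 fold change and linear fold change quoted in the response to comment 3 can be checked with a short sketch. The function name `log2fc_to_fold` is hypothetical and introduced only for illustration; it assumes the standard relation fold = 2^(log2 fold change). Values recomputed from the rounded log2 figures may differ slightly from those reported, which were presumably derived from unrounded data.

```python
def log2fc_to_fold(log2_fold_change: float) -> float:
    """Convert a log2 fold change into a linear fold change (hypothetical helper)."""
    return 2.0 ** log2_fold_change


# Rounded log2 fold changes quoted in the response to comment 3.
for log2fc in (0.9, 2.2, 1.1):
    print(f"log2FC {log2fc} -> {log2fc_to_fold(log2fc):.2f}-fold")
```

A log2 fold change of 1 therefore corresponds to a doubling, and a value of 0 to no change.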

      Reviewer 3 comment 4: Authors have shown significant alterations in the Treg population in their Jag1Ndr/Ndr mice of ALGS. Please also show the expression of IL10 and TGFβ in the liver and whether they are correlated with the level of Treg populations.

      Author Response: IL10 and Tgfb mRNA levels in liver are shown in the heatmap in the response to reviewers, and were not significantly different between genotypes at P10. They were also not correlated with Foxp3 levels, as shown in the correlation matrices below (Pearson’s R values in top row, significance values in bottom row).

      Reviewer 3 comment 5: It would be interesting to know whether the IFNγ mRNA expression in the livers was altered in the Jag1Ndr/Ndr mice with altered populations of CD8 T cells.

      Author Response: There was no significant difference in IFNγ mRNA expression levels between Jag1+/+ and Jag1Ndr/Ndr livers at P10 (please see the heatmap in response to comment no. 3, above).

      Reviewer #3 (Significance (Required)): Strength: This article is well written and has important preliminary findings that could establish Jag1 and its downstream signaling pathways as potential therapeutic targets to attenuate liver fibrosis.

      Author Response: Thank you for these comments and pointing out the wider implications of our findings.


      Reviewer 3 Limitations: This study lacked the detailed molecular pathways which could explain how the Jag1 altered the T-cell recruitment, development and hepatocyte maturation in the development of liver fibrosis in the ALGS model.

      Author Response: We agree that this study does not focus on molecular pathways. The intention of this study was to identify which cell populations contribute to atypical neonatal fibrosis in ALGS. Because we expected this process to be multifactorial, Jag1Ndr/Ndr mice, carrying a systemic mutation, present both advantages (Jag1 abrogation in all cells --> ALGS-like organ interactions) and limitations (inability to identify contributions of individual cell types). However, by identifying maturing hepatocytes and Tregs as dysregulated, and demonstrating that Jag1Ndr/Ndr lymphocytes behave abnormally and suppress inflammation and fibrosis in Rag1-/- mice (with normal Jag1 expression), we establish a biological framework that can now be further investigated with conditional genetic tools and in vitro systems, to elucidate specific molecular pathways, that were beyond the scope of the current study.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity (Required):

      Au et al. used two fly models to study how mitochondrial defects are implicated in C9ALS, the most common familial ALS type. They found that in these flies, mitochondrial, but not cytosolic, ROS is upregulated, accompanied by locomotion defects agreeing with previous publications. Consistent with these data, sod2, but not sod1, rescues the behavioral defects in these flies. Also, manipulating mitochondrial dynamics or mitophagy does not rescue these defects. Furthermore, the authors showed that the Nrf2 activity is upregulated, likely due to oxidative stress, and genetically or pharmacologically suppressing the Keap1 function, which activates Nrf2 and thereby its downstream antioxidative genes, suppresses behavior defects in these flies. This part is generally solid and convincing, with minor issues that need some revision. Finally, the authors showed that mitochondrial ROS and nuclear Nrf2 are both upregulated in C9 iPS neurons, both of which are suppressed by the Keap1 inhibitor DMF, or a known antioxidant. For this part, the data are convincing but insufficient to support a good translation of their fly data.

      Major concerns:

      1a. The authors really need a phenotypic readout for their iPS experiments, either cell death or some sort of toxicity, to support the translatability of their fly data.

      • We agree and appreciate the value of having such a phenotypic readout for the iPSC experiments but, unfortunately, within the context of the current work we did not observe any clear phenotype of toxicity or diminished viability under basal, unchallenged conditions. To support this, we have added our analysis of cell viability at the time of imaging, shown in new Supplementary Figure 3C and mentioned in the text (lines 620-621).

      1b. The authors also need to test the toxicity of DMF in iPS neurons.

      • As above, we found that treatment with DMF conferred no overt toxicity within the time-course of our experiments. These data are shown in new Supplementary Figure 3D and mentioned in the text (lines 626-628).

      The authors should use genetic ways, e.g., knocking down Keap1, to activate Nrf2 and test whether this suppresses ROS and neurodegeneration phenotype in iPS neurons, as they did in flies.

      They need to better characterize the Nrf2 activity in iPS neurons (see Minor Concern #1).

      • Regarding these two points, we agree that it would be interesting to further investigate the Keap1/Nrf2 pathway in these cells, but time, personnel and resource constraints preclude additional investigations on this occasion. It is important to note that the cell models were used specifically to validate that elevated mitochondrial oxidative stress and increased nuclear Nrf2 localisation also occurred in patient-derived neurons, and to test whether DMF treatment could reverse the oxidative stress. This was the extent to which the cell models were used in this instance and the current data are sufficient to support the conclusions made based on this. We regret that it was not possible to delve deeper into this at the current time, but this will be possible in future work.

      Minor concerns:

      1a. Fig 4A and B are hard to comprehend. Can the authors show images with more obvious differences?

      • We have now revised these figure panels, replacing them with alternative images. We hope that the new images show more appreciable differences. We understand that the differences can sometimes be subtle, which is why we rely on the quantification for unbiased interpretation.

      1b. Also, Gst-D1 is the only Nrf2 downstream gene tested. Can the authors use RT-PCR to test multiple genes? These will strengthen the point that Nrf2 is activated. Similar things should be done in iPS neurons.

      • Thanks for this suggestion. To complement the immunoblots of the genomic GstD1-GFP reporter, we have now performed qRT-PCR on flies treated with or without DMF for additional Keap1/Nrf2 pathway targets, including GstD1, Gclc, GstD2 and Cyp6a2. These data show that the degree of transcriptional activation was variable between different targets, but DMF treatment caused a general upregulation of CncC targets in G4C2x36 flies (new Fig. 6A).

      What about cytosolic ROS in C9 iPS neurons? Is it similar to the fly models?

      • We agree that this would be interesting to analyse. Unfortunately, given time and resource constraints we did not have the capacity to also explore this out of curiosity. Again, the specific focus for the iPSC neuron work was to validate the mitochondrial ROS aspect and action of DMF.

      Unless the authors confirm that mitochondrial dynamics or mitophagy are not contributing to neurodegeneration in iPS neurons, I wouldn't emphasize their related negative data in flies. Overall, the authors need to tone down their arguments if the findings are not verified in iPS or other mammalian models.

      • On reflection, we agree that the iNeuron data were given an overly prominent status within the study and we have adjusted the text accordingly throughout, including removing a specific mention of this in the title. That said, we still consider the negative results regarding the lack of rescue of organism-scale phenotypes (e.g., locomotion) by manipulating mitochondrial dynamics or mitophagy to be important indicators of the relative mechanistic contribution of these processes to the organism-scale pathology (most closely reflecting the clinical condition). As discussed above (major point 1a), within the context of the current work we did not observe any clear phenotype of toxicity or diminished viability in the patient iNeurons. Therefore, it is not readily possible to test the relative contribution of mitochondrial dynamics vs mitophagy vs ROS to the survival of these cells, so we have based our interpretations of this on the in vivo models. In summary, we have toned down our statements relating to and stemming from data arising from the iNeuron work but our interpretation of the negative results in flies remains the same.

      Can the authors measure the activities of OXPHOS complexes and ATP synthase/complex V?

      • The intention of this study was to explore mechanisms that could alleviate pathological phenotypes in vivo. We have characterised a wide range of cellular defects relating to mitochondrial dysfunction, including overall OXPHOS function by OCR. Analysing individual OXPHOS complexes from animal tissue is not a trivial undertaking and, other than providing a little more granularity to the nature of the respiratory defect, we considered that this would be a distraction from the main focus of the study.

      5a. Edaravone is one of the only two effective drugs for general ALS, and it's believed to work as an antioxidant. The authors should discuss it along with relating their findings to therapeutic development.

      • A statement on edaravone being an FDA-approved treatment for ALS and an antioxidant (ROS scavenger) was included in the text (lines 628-629). We have added further comment on this in the Discussion (lines 686-690). Since edaravone was used as a comparator in this study, and to maintain the focus on DMF, we prefer not to elaborate on this further in the Discussion.

      5b. Also, the discussion on SOD1 aggregation sounds somewhat farfetched. Plus, it's not directly related to the central message of this paper. I would remove it.

      • Fair enough. We have removed these statements from the text.

      Significance (Required):

      C9orf72-mediated ALS is the most common familial ALS type and also accounts for a fraction of sporadic ALS cases. Its pathomechanism is incompletely understood. Previous studies have linked mitochondrial defects and ROS to pathogenesis in fly, iPS, mouse, etc. models, and antioxidants can suppress some neurodegenerative features in these models. Consistent with these findings, one of the only two effective drugs for general ALS, edaravone, is believed to mitigate oxidative stress in motor neurons. Hence, oxidative stress is a critical pathogenic contributor that holds great potential as a therapeutic target. However, our understanding of its cause and consequence in ALS is limited. This paper includes at least two novel points: 1) identifying mitochondrial, but not cytosolic, ROS is upregulated and contributes to neurodegeneration in C9ALS models; 2) discovering that the Keap1/Nrf2 is altered and activating Nrf2 suppresses neurodegeneration. The first point presents an incremental advance in the field, but the second one is potentially critical, especially from a translational aspect. That being said, the novelty of the second point is somewhat dampened by a recently published paper (Jiménez-Villegas, et al. 2022), which showed that Nrf2/Keap1 is altered in C9 patient leukocytes and NSC cells overexpressing or treated with C9-DPRs. However, these cells/models are remotely related to the disease. The current manuscript still provided evidence in an in vivo neuronal model for the first time. If the authors could make their iPS part comprehensive, this could still be a major advance towards translation.

      This paper could be interesting to a broad audience beyond the ALS field.

      Another strength of this paper is that the fly analyses are comprehensive, the data are convincing, and the conclusions are solid. However, the major weakness is that the iPSN part is incomplete to support the translatability of their findings in flies. Current data only suggest that DMF and EDV are functional in iPSNs.

      Reviewer #2

      Evidence, reproducibility and clarity (Required):

      The study of ALS uses almost exclusively Drosophila larvae and adults and has a few expts with iNeurons (human) at the end. The results are interesting and relevant to human disease and do suggest potential ways to treat disease. Not all the effect sizes are large, but nonetheless this is publishable material. More expts would of course strengthen their case. None of what I suggest is essential, but this depends in part on where they eventually want to publish their work.

      Some comments below:

      All are overexpression models with strong phenotypes. This has to be mentioned.

      • The nature of the genetic models is clearly delineated in the manuscript. To highlight this further in the text, we have added comments at the start of the Results section stating that Drosophila do not have an orthologue of C9orf72, so we use previously established transgenic models (lines 372-376). In fact, it is incorrect to call these 'overexpression' models because there isn't a C9orf72 orthologue to be overexpressed. Formally, they are ectopic expression models.

      Furthermore, in any ageing model every aspect of cell biology is affected.

      • Agreed.

      In fig 1E to the non-expert it is hard to work out what is a mitochondrion. Some higher res imaging might help.

      • It is indeed difficult to discern individual mitochondria with this particular approach. We have a lot of experience in this kind of analysis and higher resolution imaging does not resolve the problem. The challenge of imaging mitochondria in such tiny cell bodies is the reason that we have adopted a categorical scoring system.

      Line 390 comments on morphology but fig s1b-c is survival. Do they have morphology data? If not then they should rephrase the text

      • This is a misunderstanding. The brief mention of mitochondrial morphology at the start of the paragraph ("Mitochondrial morphology is known to respond to changes in reactive oxygen species (ROS) levels as well as other physiological stimuli." - lines 414-415) is to provide a segue from the preceding section describing the morphology defects to the following sections that investigate the possible mechanisms affecting this.

      Line 441. Can they provide a reference for 1000 being physiologically relevant? 36 is certainly pathological in humans. In my opinion the only genuinely physiologically relevant model is a genetically faithful knockin without codon alteration.

      • We have rephrased this to be 'more physiologically relevant repeat length' and provided a reference.

      Line 482 - they say mitophagy is downstream, but isn't that obvious in a C9 transgenic model?

      • We appreciate that this statement was confusing. We are referring to 'upstream' or 'downstream' in the cascade of events that ensue from expression of DPRs, not upstream or downstream with respect to C9 mutations themselves, so we have rephrased this as "not a primary contributor to C9orf72 pathology" (lines 502-503).

      7a. Line 502 - they indicate 'exploring the basis', but I am a little unclear what they are saying. What is the reason for the reduced SOD1 in x36 v x3 flies? Are they simply killing cells that have the most SOD1 and therefore their qPCRs/blots only represent those cells with less SOD1? There is still SOD1 being expressed there of course.

      • Thanks for allowing us to clarify this point. We have not been able to clarify the mechanism for why Sod1 appears to be downregulated upon G4C2x36 expression, which we acknowledge is a limitation. So, we have decided to adjust the language from 'exploring the basis', to now simply report this as an associated observation (line 527).

      7b. In the text it would help if they clarified if the genes overexpressed are human or fly. If human, it might be worth overexpressing mutant ALS SOD1 if they are able.

      • In general, when reporting on experiments with a model organism such as Drosophila, we work on the assumption that genetic manipulations will typically be that of the host species, i.e., transgenic expression will be of Drosophila genes, unless specifically stated otherwise. In any case, all the necessary details of all genetic strains used in this study are laid out in Methods.

      Line 521 - this para should perhaps be in intro section, not results.

      • Agreed. We have now edited the start of this section (lines 543-546).

      In Fig 5, do they have CncC IHC to back up their conclusion that keap1 mutation is affecting this process?

      • Thank you for this suggestion. We have now analysed CncC localisation in C9 models ± Keap1 mutation. As before, we saw that G4C2x36 caused an increase in CncC nuclear localisation; although there was a trend towards an increase with Keap1 heterozygosity, this was not consistent enough to be significant. These data are presented in new Fig. 5D, E and discussed in the text (lines 579-581). Although these results do not show an additional increase of nuclear CncC by this treatment of DMF, we also performed qRT-PCR analysis of CncC target genes GstD1, GstD2, Gclc and Cyp6a2, from flies treated with or without DMF. These data show that the degree of transcriptional activation was variable between different targets, but DMF treatment caused a general upregulation of CncC targets in G4C2x36 flies (new Fig. 6A).

      The Induced neuron results are interesting. What kind of neurons are they? Have they been confirmed to be so with ICC? The figures in 6 are poor. They should make the point that correction of the mutation to ensure isogenicity would be an additional confirmatory measure. Isogenic lines are available from JAX and the UK MND Institute.

      • Agreed. We now provide further characterisation of the iNeurons that was done at the time of the original experiments but not presented. These analyses include immunostaining with neuronal marker antibodies against β-III Tubulin, MAP2 and NeuN. These data are shown in new Supplementary Figure 3A, B. We also report the relative viability of these neurons at the point of analysis (new Supplementary Figure 3C, D). We have added mention of this in the text (lines 620-621 and 627-628). Of note, these patient cell lines have been used and reported before (Reference 53) which we cite on line 618. We also acknowledge the limitations of using these lines, and that future work would be better done with isogenic controls (lines 690-692) as the reviewer indicates.

      Suppl fig 3 - interesting observation with edaravone, but do they have any survival/motility data in neurons/flies? Also, it would be good to compare with another drug that works via a different mechanism, e.g. riluzole.

      • Since edaravone is a known therapeutic for ALS and was used as a comparator, rather than being the primary focus, we do not have additional data on edaravone.

      Overall, they conclude they have done a comprehensive analysis of mito function, but I would argue that, while it is a good analysis, there are plenty of other studies they could have done, e.g. assessing the mitochondrial respiratory chain.

      • We agree that additional studies can always be envisaged.

      13a. I also think the imaging of mitochondria could be better, and much work needs to be done on the iNeurons to characterise them.

      • As mentioned above, we have provided additional characterisation of the iNeurons in this revision.

      13b. Sentence line 674 - needs rephrasing.

      • Thanks for prompting this. We have now rewritten these sentences (now lines 700-701).

      In their final paragraph what do they mean by oxidative stress being upstream? I would argue it is downstream of the C9 expansion, right?

      • We apologise that this was confusingly written. As per the comment above (response to point 6), we were referring to events 'upstream' or 'downstream' in the cascade of events that ensue from expression of DPRs. We have now rephrased this to be a "proximal" pathogenic mechanism (lines 708-710). We hope that our intended meaning is now clearer in the text.

      Significance (Required):

      A good study, modest degree of advancement in the field.

      Reviewer #3

      Evidence, reproducibility and clarity (Required):

      In the present paper the authors focused on the hyper-production of ROS in a C9orf72 fly model. They then sought to rescue the observed fly phenotype by manipulating mitochondrial dysfunctions or pathways downstream of these dysfunctions.

      Majors:

      Given the wide variety of statistical tests used, a rationale should be given as to why a certain test (one-way ANOVA) was used in one experiment (WB, qPCR) and another (Chi square) in another experiment (mitochondria morphology).

      • In all cases, the choice of statistical test is dictated by the nature of the data being analysed - a principle that should be well-understood by all experienced researchers - and so may vary between experiments but will be consistent between different data sets of the same type of experiment. For instance, for those data sets consisting of two groups, an unpaired t-test would be appropriate. Most other experiments consist of three or more experimental groups and so need an appropriate test with an additional post-hoc test to correct for multiple comparisons, such as one-way ANOVA with Bonferroni's post-hoc correction. Where data sets are not normally distributed, such as those generated by our climbing assay, a non-parametric analysis is required, such as the Kruskal-Wallis test. Here we also use a Dunn's post-hoc correction for multiple comparisons. In some assays of multiple groups, there are also multiple variables, such as the different drug concentrations tested on control and C9 iNeurons; here, a two-way ANOVA with an appropriate post-hoc correction test is used. Finally, some assays employ a categorical scoring system, such as the mitochondrial morphology analysis, which requires a different type of statistical analysis such as the Chi-squared test.
            These types of analysis are in no way unusual or 'cherry-picked' to give the most desirable outcomes but are selected simply based on the type of the data to be analysed following standard rules of statistical analysis. For this reason, we do not feel that any more elaborate explanation is necessary in the manuscript text itself, but we hope that the explanation given here will satisfy the reviewer of the rationale for employing different statistical tests for different data sets.
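The decision logic laid out in this response can be summarised as a small sketch. The function `choose_test` and its parameters are hypothetical names introduced here for illustration only and are not part of the authors' analysis pipeline. Combinations that the response does not describe (for example, two groups of non-normally distributed data) are deliberately left unhandled rather than guessed.

```python
def choose_test(n_groups, normally_distributed=True, n_factors=1, categorical=False):
    """Map the properties of a data set to the test family named in the response.

    All argument names are illustrative assumptions, not the authors' terminology.
    """
    if n_groups < 2:
        raise ValueError("at least two groups are needed for a comparison")
    if categorical:
        # Categorically scored data, e.g. mitochondrial morphology classes.
        return "Chi-squared test"
    if n_factors >= 2:
        # Several groups and several variables, e.g. genotype x drug concentration.
        return "two-way ANOVA with post-hoc correction"
    if not normally_distributed:
        if n_groups >= 3:
            # Non-parametric case, e.g. the climbing assay.
            return "Kruskal-Wallis test with Dunn's post-hoc correction"
        raise ValueError("two non-normal groups: not covered by the response")
    if n_groups == 2:
        return "unpaired t-test"
    return "one-way ANOVA with Bonferroni's post-hoc correction"


print(choose_test(2))
print(choose_test(3))
print(choose_test(3, normally_distributed=False))
print(choose_test(4, n_factors=2))
print(choose_test(3, categorical=True))
```

The order of the checks mirrors the response: the data type is considered first, then the number of variables, then the distribution, and only then the number of groups.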
        

      The entire second part of the paper, and the most important one to the authors (given the title), relies mostly on a supposed loss in protection against antioxidant. I feel the experiments in support of this hypothesis are not strong. It is true that there is an overproduction of ROS (as evaluated in the first figures) but the loss in protection stated based on Fig 4H does not hold much. I think more experiments are needed to support this hypothesis.

      • This is a fair comment and on reflection we also agree that our claim that the response to oxidative stress is blunted in the C9 models is based almost exclusively on the data from (old) Fig. 4H, and so is not strong. On reflection, prompted by the reviewer's comment, we have removed this interpretation from the manuscript and revised our comments accordingly. Consequently, we have also removed Fig. 4H.

      Moreover, I find it counterintuitive that, to rescue a phenotype, the authors overexpressed something that is already high in C9orf72 flies (nrf). I would suggest matching these results with downregulation of nrf, to effectively prove that an nrf decrease is detrimental to counteracting ROS species in C9orf72 flies (further reducing protection against ROS). I believe this experiment is quite critical for the entire manuscript.

      • We appreciate the thinking behind this suggestion, but this experiment can't be performed because loss of CncC function is lethal, as expected from a master regulator of a major cell-protection mechanism.

      Also, to me there is a little bit of disconnection between the first three figures and the last three. The authors also find a rescue effect overexpressing SOD2 etc. as shown in figure 3, where they actually show rescue in mitochondrial dysfunction (morphology etc). The only piece of data that shows rescue in mitochondrial dysfunction upon nrf overexpression is figure 5H. More extensive characterization of mitochondrial dysfunction rescue should be performed if the title is to be kept focused on the keap/nrf mechanism. Otherwise a broader title like "modulation of the mitochondria damage rescue C9orf72 phenotype." could help the reader understand the overarching message of the paper.

      • We do not see a disconnect between the first part of the paper and the second. To be clear, the first part was documenting mitochondria-related defects (morphology, ROS, mitophagy) and determining their causative hierarchy and mechanistic impact on organismal phenotypes (we found only certain antioxidants rescued locomotor deficits and could reverse mitochondrial morphology and mitophagy defects). As stated, these results strongly implicated oxidative stress as a major driver in organismal pathology. The second part of the study was characterising whether a major antioxidant defence pathway (Keap1/Nrf2) could be manipulated to provide phenotypic rescue on the organismal scale (i.e., locomotor behaviours). On reflection on the original title, we agree that this was too focussed on the mitochondrial dysfunction angle (and also gave too much prominence to the iNeuron part of the study). Therefore, we have now modified the title to reflect a greater focus on oxidative stress and locomotor behaviours across the study. We hope the reviewer feels that this better represents the study but will be happy to consider suggested alternatives.

      Minors:

      In Figure 1n, does each dot represent a cell? Or is it an average of several cells, with each dot representing an animal? I could not find this information anywhere, but if each dot is a single cell, I would recommend scaling up to at least 10 cells. The same concern applies to Figure 3F.

      We agree that this point needs clarification. Each dot represents data for one animal. The quantification per animal is based on at least 10 cells from one image. This has been added to the Methods section for clarification (lines 220-221).

      Lines 550-552: I do not agree with the statement. I do not think that the data show that the protection against ROS is less efficient. The only difference is the starting point. But the final point is the same, so why should protection against ROS be less efficient in G4C2x36 Drosophila?

      - This comment relates to point 2 above. As stated there, we agree that the data are not compelling enough to make this interpretation, so we have revised our comments accordingly.

      There are some concerns about the neurons in Figure 3: they do not appear to have axons and dendrites. I'd suggest co-staining with a neuronal marker.

      - The reviewer may be unfamiliar with the specific tissue in question: the larval ventral ganglion. As a complex, mature tissue, it contains multiple cell types (e.g., neurons and glia) that are very closely packed. Neuronal processes are very thin in this tissue, and they are squeezed between neighbouring cells. Thus, microscopy of neuronal cell biology within such a complex tissue does not look like in vitro cultured neurons. In the specific context of Figure 3, we are looking at markers for mitochondria or mitophagy. The reviewer may also be aware that mitochondria and mitolysosomes are most abundant in the cell bodies and have very limited abundance in neuronal processes. Thus, we do not generally try to observe these organelles in processes because there would be very little to see. We know that the signal is within neurons because the markers are transgenically expressed exclusively by a neuronal driver system, i.e. nSyb-GAL4. In summary, there is no problem with these cells or how they look. This is quite normal.

      iNeurons were only used to confirm the second part of the paper. It would be interesting to also confirm some of the results from the first part, such as SOD2 overexpression.

      • We appreciate this suggestion, which is similar to a comment from Reviewer 1, but, as replied above, time, personnel and resource constraints preclude additional investigations on this occasion. Just to reiterate, it is worth noting that the cell models were used specifically to validate that elevated mitochondrial oxidative stress and increased nuclear Nrf2 localisation also occurred in patient-derived neurons, and whether DMF treatment could reverse the oxidative stress. This was the extent to which the cell models were used in this instance, and the current data are sufficient to support the conclusions made based on this. We regret that it was not possible to delve deeper into this at the current time, but this will be the focus of future work.

      Significance (Required):

      The present work, while not extremely novel in its hypothesis, is well performed with state-of-the-art techniques, some of them also very novel to the field. The concept of oxidative stress as an important factor in ALS pathogenesis is not new in the field, but the identification of Nrf as an important player might pave the way for more human-related studies and possibly for therapeutic interventions.

      I think the work is technically sound and well performed; certain pieces of evidence are solidly demonstrated with multiple different techniques. Other pieces of evidence instead need a little more work to prove their solidity and to widen the audience that will appreciate the content of this paper.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      In 'Systems analysis of miR-199a/b-5p and multiple miR-199a/b-5p targets during chondrogenesis', Patel et al. present a variety of analyses using different methodologies to investigate the importance of two miRNAs in regulating gene expression in a cellular model of cartilage development. They first re-analysed existing data to identify these miRNAs as one of the most dynamic across a chondrogenesis development time course. Next, they manipulated the expression of these miRNAs and showed that this affected the expression of various marker genes as expected. An RNA-seq experiment on these manipulations identified putative mRNA targets of the miRNAs which were also supported by bioinformatics predictions. These top hits were validated experimentally and, finally, a kinetic model was developed to demonstrate the relationship between the miRNAs and mRNAs studied throughout the paper.

      I am convinced that the novel relationships reported here between miR-199a/b-5p and target genes FZD6, ITGA3, and CAV1 are likely to be genuine. It is important for researchers working on this system and related diseases to know all the miRNA/mRNA relationships but, as the authors have already published work studying the most dynamic miRNA (miR-140-5p) in this biological system I was not convinced that this study of the second miRNA in their list provided a conceptual advance on their previous work.

      We believe this study is an enhancement on our previous work for two reasons, which have been alluded to in new text within the introduction. Firstly, our previous work used experimental and bioinformatic analysis to identify microRNAs with significant regulatory roles during chondrogenesis. This new manuscript additionally uses a systems biology approach to identify novel miRNA-mRNA interactions and capture these within an in silico model. Secondly, this work was initiated by the analysis of our previously generated data, using a novel tool we developed for this type of data (Bioconductor - TimiRGeN).

      I was also concerned with the lack of reporting of details of the manipulation experiments. The authors state that they have over-expressed miR-199a-5p (Figure 2A) and knocked down miR-199b-5p (Figure 2B) but they should have reported their proof that these experiments had worked as predicted, e.g. showing the qRT-PCR change in miRNA expression. Similarly, I was concerned that one miRNA was over-expressed while the other was knocked down - why did the authors not attempt to manipulate both miRNAs in both directions? Were they unable to achieve a significant change in miRNA expression or did these experiments not confirm the results reported in the manuscript?

      We agree with the reviewer that some additional data were needed to demonstrate the effective regulation of miR-199-5p. Hence, Supplementary Figure 1 is now included, which provides validation of the effects of miR-199a-5p overexpression (Supplementary Figure 1A) and inhibition of miR-199a/b-5p (Supplementary Figure 1B). Within the main manuscript, Figure 2B has been amended to include the consequences of inhibition of miR-199a-5p, with 2C showing the consequences of miR-199b-5p inhibition. Further, we include new data with regards to miR-199a/b-5p inhibition on CAV1 (Figure 4A).

      I had a number of issues with the way in which some of the data was presented. Table 1 only reported whether a specific pathway was significant or not for a given differential expression analysis but this concealed the extent of this enrichment or the level of statistical significance reported. Could it be redrawn to more similarly match the format of Figure 3A? The various shades of grey in Figure 2 and Figure 4 made it impossible to discriminate between treatments and therefore identify whether these data supported the conclusions made in the text. It also appeared that the same results were reported in Figure 3B and 3C and, indeed, Figure 3B was not referred to in the main text. Perhaps this figure could be made more concise by removing one of these two sets of panels.

      We agree with all points made here and have amended these within the manuscript. Figure 1A now shows pathway enrichment plots from the TimiRGeN R Bioconductor package, and the table which previously showed the pathways enriched at each time point is now in the supplementary materials (Supplementary Table 1). Figures 2 and 4 now use color instead of shades of grey. Figure 3C has now been moved to the supplementary materials (Supplementary Figure 2) and is referenced in the text.

      Overall, while I think that this is an interesting and valuable paper, I think its findings are relatively limited to those interested in the role of miRNAs in this specific biomedical context.

      Reviewer #2 (Public review):

      Summary:

      This study represents an ambitious endeavor to comprehensively analyze the role of miR199a/b-5p and its networks in cartilage formation. By conducting experiments that go beyond in vitro MSC differentiation models, more robust conclusions can be achieved.

      Strengths:

      This research investigates the role of miR-199a/b-5p during chondrogenesis using bioinformatics and in vitro experimental systems. The significance of miRNAs in chondrogenesis and OA is crucial, warranting further research, and this study contributes novel insights.

      Weaknesses:

      While miR-140 and miR-455 are used as controls, these miRNAs have been demonstrated to be more relevant to Cartilage Homeostasis than chondrogenesis itself. Their deficiency has been genetically proven to induce Osteoarthritis in mice. Therefore, the results of this study should be considered in comparison with these existing findings.

      We agree with the reviewer's comments. miR-455-null mice develop normally but miR-140-null (or mutated) mice and humans do have skeletal abnormalities (e.g. Nat Med. 2019 Apr;25(4):583-590. doi: 10.1038/s41591-019-0353-2), indicating a role in chondrogenesis. We have made an addition in the description to point towards the need to assess the roles miR-199a/b-5p may play during skeletogenesis and OA. We anticipate miR-199a/b-5p to be relevant in OA and have ongoing additional work for this, but this is beyond the scope of this manuscript.

      Recommendations to Authors:

      Reviewer #1 (Recommendations to authors):

      Beyond the issues raised in the public review, I had a few minor recommendations that are largely designed to help improve the understanding of the manuscript as it is currently written.

      (1) Please provide the statistical tests used to obtain p-values in the Figure 2 and 4 legends.

      We have now added statistical test information to the figure legends of figures 2 and 4.

      (2) It is stated on p. 9 that both miRNAs may share a functional repertoire because 25 and 341 genes are intersected between their inhibition experiments. Please provide statistical support that this overlap is an enrichment over the null background in this experiment. Total DE genes – chi squared. Expected / Observed.

      A chi-squared test is now presented in the manuscript which shows that the number of significant genes which were found in common between miR-199a-5p knockdown and miR-199b-5p knockdown were significantly more than expected for day 0 or day 1 of the experiments. 
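
      The response does not spell out how the chi-squared test was set up, so the following is only a minimal sketch of one common way to test whether two differentially expressed (DE) gene lists overlap more than expected by chance. It assumes a 2x2 contingency table (DE in both knockdowns, DE in one only, DE in neither) against a shared background of tested genes. The function name `overlap_chi_squared` and all counts in the example call are hypothetical and are not taken from the manuscript.

```python
import math


def overlap_chi_squared(n_a, n_b, n_overlap, n_background):
    """Pearson chi-squared test (1 d.f., no continuity correction) on the 2x2
    table formed by two differentially expressed (DE) gene lists.

    Returns (expected_overlap, chi2, p_value).
    """
    # Cells of the 2x2 contingency table.
    both = n_overlap
    only_a = n_a - n_overlap
    only_b = n_b - n_overlap
    neither = n_background - n_a - n_b + n_overlap
    if min(both, only_a, only_b, neither) < 0:
        raise ValueError("counts are inconsistent with the background size")

    # Overlap expected if membership of the two lists were independent.
    expected_overlap = n_a * n_b / n_background

    # Shortcut formula for the Pearson statistic of a 2x2 table:
    # N * (ad - bc)^2 / (row1 * row2 * col1 * col2).
    numerator = n_background * (both * neither - only_a * only_b) ** 2
    denominator = n_a * (n_background - n_a) * n_b * (n_background - n_b)
    chi2 = numerator / denominator

    # Survival function of chi-squared with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected_overlap, chi2, p_value


# Hypothetical counts for illustration only, NOT taken from the manuscript.
expected, chi2, p = overlap_chi_squared(
    n_a=100, n_b=200, n_overlap=25, n_background=10000
)
print(f"expected overlap = {expected:.1f}, chi2 = {chi2:.1f}, p = {p:.3g}")
```

      In this framing, the observed overlap is compared with the overlap expected under independence (`n_a * n_b / n_background`). When expected counts are small, Fisher's exact test is often preferred over the chi-squared approximation.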

      (3) The final sentence on p. 12 (beginning 'Size of the points reflect...') seemed out of place - is it part of a legend?

      Thank you for pointing out this mistake - it was part of figure 3C and now is in the supplementary materials.

      (4) A sentence on p. 14 reads that 'FZD6 and ITGA3 levels increased significantly' but this should read decreased, rather than increased. Quite an important typo!

      Thank you for pointing this error out. It has been corrected.

      (5) Theoretical transcripts are mentioned in the legend of Figure 5A but these were not present in the figure. Please include these or remove them from the legend.

      This error has been removed from Figure 5A.

      (6) On p 20, the references 22 and 27 should I think be moved to earlier in the sentence (after 'miR-199a-5p-FZD6 has been predicted previously'). Currently, it reads as if these references support your luciferase assays which you claim are the first evidence for this target relationship.

      We agree with this change and have corrected the manuscript.

      (7) The reference to Figure 5D on p. 20 should be a reference to Figure 5C.

      Thank you for pointing this error out – this has been corrected.

      Reviewer #2 (Recommendations to authors):

      (1) The paper is based on the importance of miR-140 and miR-455 as miRNAs in chondrogenesis, citing only Barter, M. J. et al. Stem Cells 33, (2015). Considering the scope and results of this study, this citation is insufficient.

      We agree with this reviewer's comments. For many years, miR-140 and miR-455 have been studied experimentally, and their importance in OA research has become apparent. We have included additional references within the introduction to address this.

      (2) Analyzing chondrogenesis solely through differentiation experiments from MSCs is inadequate. It is essential to perform experiments involving the network within normal cartilage tissue and/or the generation of knockout mice to understand the precise role of miR199a/b-5p in chondrogenesis.

      We have added an additional paragraph in the discussion to state this, and do believe it is highly important that miR-199a/b-5p be tested in OA samples – however this would be beyond the intended scope of this article.

      (3) In light of the above points, it is imperative to investigate the role of miR-199a/b-5p beyond the in vitro differentiation model from MSCs, encompassing mouse OA models or human disease samples.

      In line with our previous response, we agree with the premise and believe additional experiments should be performed to gain more insight into the mechanism by which miR-199a/b-5p regulates OA. However, development of a new mouse line to investigate this is not within the scope of this manuscript.

    1. The argument/ideology that pins down Barthes’ deconstruction of the Eiffel Tower is very Nietzschean. Much like Nietzsche’s popular argument that art is the only truth because it allows one to live in a personal abstraction and intuition, the tower being art means it surpasses our rationalization, deconstruction, and assimilation of it into one side of binary schemas. It exists to emphasize its inability to be known by us and to serve almost “mythical” purposes that transcend rational rules of the world. In other words, “Barthes’ phenomenological approach brings us to the focus of our investigation: an architectural structure’s capacity to simultaneously be understood as agent and object, a capacity we regard as a peculiar oscillation between function and symbol in the case of the Eiffel Tower” (Steiner).

      There is a lot to unpack with the contradictory qualities of the “utterly useless monument”, which we actually learn is pretty useful (Barthes 5). The point that stands is that, physically, the tower is an uncontainable object that we try to domesticate. One way we do this is through “the installation of a restaurant [...or other] means of leisure” in the tower itself (Barthes 16). The fact that the tower is an open construction makes us uncomfortable when we are used to typical tourist hotspots (like museums, for instance) being enclosed for us to feel like we entered, experienced, and “owned” some of it. The tower doesn’t do that for us. So, we have to create a mini world surrounding the tower in order to make it feel normal. In our conception of the order of the world, the Eiffel Tower is unique to us because it is simultaneously a representation of the inside and of the outside world. This quality, that the tower is somehow both sides of an opposite binary, is too far outside of the social contract, and Nietzsche would say (and Barthes points to it) that we often try to tackle this discomfort by trying to reduce the tower. We do this by turning the tower into a site of projection. It becomes a symbol of industrialism, of Paris, of travel, of art, whatever one may choose. But it is in this choice that we strip the tower of the other symbols it projects equally as strongly. And this is where the problem lies. We must look at the tower as the embodiment of all the opposites it may be: inside/outside, industry/art, ugly/beautiful, all at the same time.

      Barthes asks us to consider why the tower makes us so uncomfortable in this binary presentation. Perhaps it is because this makes the tower oddly more powerful than us. The tower can be a spectacle and an object, useless and useful, inside and outside. We cannot be those things. If we are looking at the tower, we can't be in it; but the tower can be both an empty base space outside and an indoor restaurant. None of our relations to the tower can come together at the same time, while the tower can be opposites at the same time. We can only perceive the tower as one of its opposite meanings at a time, and we have to deal with the impossibility of bringing together two things that are true and simultaneous but cannot logically co-occur. I think one way we do this is by glossing over it all and pretending everything can occur at the same time, a comforting thought facilitated by the constructed surrounding environment.

      However, by doing this, what simultaneously happens is that the tower becomes basically an infinite site of projection. It is reduced to a symbol of Paris, of travel, of industrialism, of some kind of focal point in France. The tower being a signifier for everything really just makes it nothing. And when we come face-to-face with this (structural and symbolic) emptiness, we rush to find ways to create more perceived “somethingness” (we add restaurants, shops, carts of food, and other community experiences all around the tower) to fit into our schemas and orders.

      Barthes, Roland. The Eiffel Tower - Roland Barthes - LANTB, lantb.net/uebersicht/wp-pdf/eiffelTower.pdf. Accessed 13 May 2024. Steiner, Henriette, and Kristin Veel. “Towering invisibilities: A cultural-theoretical reading of the Eiffel Tower and the One World Trade Center.” Qualitative Inquiry, vol. 25, no. 4, 5 Aug. 2018, pp. 407–416, https://doi.org/10.1177/1077800418790297.

    2. Pentadic criticism can be used to analyze the Eiffel Tower as well. It requires of us that we identify 5 items of the pentad:

      * Agent (who is performing the act)
      * Act (what is happening)
      * Scene (where and when the artifact was produced)
      * Purpose (why)
      * Agency (how/what means does the agent use)

      Pentadic criticism allows us to assign various different characteristics or details to each “item”, resulting in various interpretations of the same artifact. (Foss 356)

      Here is one possible pentad and interpretation of the Eiffel Tower:

      * Agent: Gustave Eiffel
      * Act: constructing the Eiffel Tower
      * Scene: Paris during 1889
      * Purpose: to introduce the value of engineers as creative artisans and mathematical intellectuals in a climate heavily dominated by artists only
      * Agency: by submitting the design for the Eiffel Tower to the World Fair contest, Eiffel found a means through which he could gain extreme popularity for his cause. In construction of the tower itself, the use of metal and open structures contribute to the “engineer” aspect of the monument.

      Next is to analyze the artifact using some combination of these five characteristics. For instance, one might argue that agency and scene are the most important qualities of this artifact: By using the World Fair contest as a way to bring attention to his industrial artifact, Gustave Eiffel also shattered a perception of Paris as a classy and elegant city. He took attention away from the belle epoque and forced people to think of how structures could also serve useful purposes. The message behind the Eiffel Tower may not have come across this way had Eiffel, an engineer, not submitted his work to an art competition like this one.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Dear editor and reviewers,

      we thank you very much for your constructive comments, criticisms and suggestions for improvement of our manuscript. We have addressed all points raised by you and have added our point-by-point response to your comments below.

      With best regards on behalf of all authors,

      Andreas Wodarz

      1. Point-by-point description of the revisions

      Reviewer #1

      Evidence, reproducibility and clarity

      Baz/Par3 is an important conserved protein acting as a master regulator of cell polarity in a wide range of cell types. This study focuses on re-assessing the subcellular localisation of Baz/Par3 in a range of Drosophila tissues. This is an important study with respect to our understanding of Baz/Par3, as there have been conflicting reports on the localisation of Par complex members - while the majority show localisation to cell cortex and intercellular junctions, several reports have claimed that Par complex members localise at additional subcellular sites including the nucleus, nuclear envelope and neuromuscular junction. In this study the authors re-assess this issue for Baz/Par3 in a comprehensive and thorough manner.

      We thank the reviewer for this overall positive assessment of our work.

      *1. They used a variety of antibodies raised in different host animals against different epitopes of Baz
      2. They tested the specificity of these antisera using mosaic analysis with null mutant baz alleles and tissue-specific RNAi against baz
      3. They used a GFP-tagged Baz under control of its endogenous promoter in a baz null mutant background to compare the subcellular localisation of the respective GFP-Baz fusion proteins with the staining results obtained with anti-Baz antisera

      The data from each of these experiments are very clear and convincing. Comprehensive methods are included which means that each of the experiments with specific anti-sera/RNAi lines/GFP-tagged conditions could be reproduced. There are a couple of experiments which were performed in support of the conclusions (extra RNAi lines and stronger expression of Gal4) listed as (data not shown). I would strongly suggest including these data as extra supplemental figures. Together, their results clearly show that Baz/Par3 localises to the cortex and intercellular junctions, but that anti-sera staining at the NMJs and nuclear envelope appear to be a staining artifact, likely due to staining with an unidentified epitope.

      Minor comments

      1. Many of the figures have overlays of red and green which will be indistinguishable from each other to colour-blind readers. Please alter to make colour-blind friendly (e.g. magenta-green)*

      We have changed all figures in the following way: All single channel images have been converted to inverted grayscale to improve the visibility of weak fluorescence signals. In all multicolor overlay images, red has been omitted and instead green, magenta, blue and grayscale have been used to improve the visibility for color-blind readers.

      2. In Fig 2D please indicate where the epidermis and neuroblasts are

      We assume that the reviewer refers to Fig. S2D. In the revised version of the manuscript, this figure is now Fig. S2A. We have marked epidermal cells and neuroblasts by different symbols.

      *3. In the following two places there are experiments described where the data is listed as not shown. Please show the data as additional supplemental data. They are: P8 - This result was confirmed using the CY2::Gal4 driver line expressed in the follicular epithelium and with three different RNAi lines against baz (data not shown).*

      We have deleted this sentence because expression of CY2::Gal4 in our hands was weaker and thus the RNAi effects less reproducible than with tj::Gal4.

      P11 - We also did not see any downregulation of Baz or a-spectrin upon baz-RNAi in M12 at 29°C, when the UAS-Gal4 system is maximally active (data not shown).

      We now show these results in the new Fig. S8.

      4. Figure 3 - this would be easier to interpret with a few arrows/arrowheads indicating the NMJs

      We have added arrows pointing to NMJs and arrowheads pointing to nuclei.


      Significance

      It will be important to publish these results as it means that findings for a function of Baz/Par3 at the NMJ and the nuclear envelope should be regarded with caution, and it may save researchers chasing for functions for Baz/Par3 in places where it is simply not expressed. As much of our fundamental understanding of how Par3 works in vertebrates has its roots in studies in Drosophila, this is likely to be of wide relevance.


      Reviewer #2

      Evidence, reproducibility and clarity

      *Evidence, reproducibility and clarity

      1.1 Summary

      This reviewer acknowledges the expertise and contributions of Prof. Wodarz and his research group in the field of development, cell polarity regulation and Drosophila genetics.


      Manuscript summary:

      Kim S. et al. explored the localisation of Bazooka, the Drosophila homolog of the polarity protein Par-3, at two non-canonical positions for a cell polarity factor: the nuclear envelope in epithelial tissues and the postsynaptic membrane of the neuromuscular junction (NMJ). Previous work has shown the detection of Par-3/Baz at the nuclear envelope and the NMJ using antibodies against Par-3/Baz. Here, the authors used a combination of genetic perturbations (baz RNAi and generation of genetic mosaics for baz) and GFP-labelled Bazooka lines to test if the antibody-mediated detection of Baz at the nuclear envelope and NMJ is artifactual. The data provided by the authors strongly suggest both the nuclear envelope and NMJ detection of Baz using antibodies is non-specific.

      1.2 Major comments

      The manuscript is written in a clear manner and is easy for readers to follow. However, there are some important experimental details that should be provided as the authors advance over previous work regarding Baz localization (points 1.2.1 and 1.2.2). Furthermore, if possible, this reviewer considers that performing the experiment in 1.2.3 would strengthen the authors' main message of their manuscript.

      1.2.1 Methodology information is missing, and would be necessary to be included for: image acquisition (Objectives, Airyscan mode), image processing (projections, details on linear -e.g. brightness, contrast- or non-linear adjustments of signal -e.g. gamma-). For image processing information, please include it within each figure legend. *

      We have added the information regarding objectives and imaging modes to the Materials and Methods section. There it now reads: "Tissues were imaged on a Zeiss LSM880 Airyscan confocal microscope using 25x LCI Plan Neofluar NA 0.8 and 63x Plan Apochromat NA 1.4 oil immersion objectives. If not stated otherwise in the figure legend, all confocal images are single optical sections taken at a pinhole setting of 1 Airy unit. Images were processed with Zen black software (Zeiss) without contrast enhancement. Figures were assembled with Inkscape 1.2 (Inkscape.org) and Powerpoint (Microsoft)."

      RNAi experiments lines, temperature for each target and tissue (a table would be helpful) and number of heat shocks performed for FRT/FLP clones.

      We have added a table in the Supplementary information giving the precise genotypes for each figure. We have furthermore added the following sentences to the Materials and Methods section: "Crossings for RNAi experiments were set up at 25°C if not indicated otherwise. For generating follicle cell clones in ovaries by Flippase-mediated mitotic recombination of the FRT sites, flies were heat shocked for 1 h at 37°C 5-7 days prior to preparation of the ovaries. For generation of germ line clones by Flippase-mediated mitotic recombination of the FRT sites, flies were heat shocked twice for 2 h at 37°C on two consecutive days in late 2nd, early 3rd instar larval stages."

      1.2.2 For each experiment it is unclear the number of specimens (experimental units) and independent experiments that were analysed. It is unclear if the Baz localisation phenotypes are fully penetrant or not as judged by the data provided.

      We have added the following section to the Materials and Methods: "Images were analyzed for the presence or absence of a fluorescence signal at the nuclear envelope or the NMJ compared to negative or positive controls, either in the same tissue (mutant clones in the follicular epithelium, RNAi in a specific body wall muscle, junctional versus nuclear signal, anti-Baz staining versus Baz-GFP signal) or in samples processed in parallel (ovaries with follicle cell and germ line clones). Fluorescence intensities were not quantified because the results were obvious and fully penetrant. Therefore, no statistical analysis of the results was required."

      1.2.3 This reviewer agrees the data provided strongly suggest that the detection of Baz along the nuclear envelope and NMJ is artifactual in the Drosophila tissues that have been studied. However, the bazEH747 mutant allele is not a deletion of the Baz gene, but instead a nonsense mutation, which, as the authors describe, could potentially generate a small product of 51 amino acids, corresponding to the N-terminal part of Baz, which is also the target of the Baz rabbit antibody ('rb Baz 1-297'). Thus:

        • Would it be possible to complement the FRT/FLP analyses in the FE using a deficiency that uncovers the baz locus? A persistent detection of Baz signal at the nuclear compartment after complete removal of baz gene products would be an ideal experiment, if feasible.

      We agree with the reviewer that the use of a clean deletion allele of the whole baz locus would be the ideal tool for the clonal analysis. However, to our knowledge, such an allele does not exist.

        • Would the authors comment on the possibility that the rb Baz antibody 1-297 detects a 51-amino-acid peptide?

      We consider this possibility very unlikely for two reasons: 1) RNAi affects the baz mRNA and thus should knock down all epitopes to the same degree. However, we see a complete loss of junctional Baz signal but no reduction of the signal at the nuclear envelope or the NMJ upon RNAi targeting baz. 2) The GFP-Baz fusion proteins do not show any signal at the NMJ or the nuclear envelope upon imaging of the native GFP fluorescence or upon antibody staining with an anti-GFP antibody, although both the Baz-GFP BAC line and the GFP-Baz protein trap line express full-length Baz including the N-terminal epitope that is potentially still expressed in the bazEH747 allele. We have added a passage summarizing these considerations to the Discussion section.

      *1.3 Minor comments

      This manuscript is largely based on imaging data. Therefore, it would be beneficial for the ease of comprehension of figure panels:

      1.3.1 More general use of insets to show with larger magnification and clarity the data indicated with arrows and arrowheads.*

      We have added arrowheads, arrows and additional symbols to point to features of interest in all figure panels where this is helpful.

      1.3.2 Using negative grayscale either for insets or single channel data.

      We have changed all single channel image panels to negative (inverted) grayscale.

      1.3.3 For coloured overlays, please bear in mind using colours that would be suitable for colour-blind readers.

      In all multicolor overlay images, red has been omitted and instead green, magenta, blue and grayscale have been used to improve the visibility for color-blind readers.

      1.3.4 Figures showcasing the clonal analyses (both MARCM and FRT/FLP): might be worth indicating the boundaries of clones in single channel data with a dotted line.

      We have marked the clone boundaries of the MARCM clones by dashed lines in Fig. 2D, E and have added a high magnification inset to show the clone boundaries (Fig. 2D', E').

      Significance

      *2 Significance

      The findings provided by this manuscript will be of importance for researchers in the field of cell polarity, conducting research on Bazooka/Par-3 and associated proteins, both within the Drosophila field and other model organisms. The present study presents an advance towards a specific and most likely artifactual observation of Par-3/Bazooka. It will help to re-think the tools used for detecting Par-3/Bazooka in different animal models, and in this regard, will be helpful for the community.*

      We thank the reviewer for appreciating the importance of this work.

      *This work does not focus on Par-3/Bazooka biology, nor does it provide new insights into Par-3/Bazooka function; however, it is clear to this reviewer that the latter is not the aim of this manuscript.

      Reviewer expertise:

      • Drosophila genetics
      • Developmental cell biology and morphogenesis
      • Cytoskeleton, cell cell adhesion and cell polarity*

      Reviewer #3 *(Evidence, reproducibility and clarity (Required)):

      Kim et al. address a common but frequently neglected problem in molecular and cellular biology: sophisticated tests for the specificity of antibodies. The protein Bazooka (Baz) is a member of the Par complex that usually resides in apicocortical regions of epithelial cells. Several publications, however, report expression in other subcellular compartments or cell types, such as the nuclear lamina or neuromuscular junction (NMJ). The authors have used a panel of polyclonal antibodies, genetic constructs and mutant alleles to show that staining of Baz in the nuclear envelope or NMJ is likely unspecific due to an unknown cross-reactivity. Specifically, four antisera, raised against different GST-Baz fusion proteins in different species, recognized Baz at cortical membranes, around nuclei and at NMJs. Nuclear and NMJ staining, however, persisted in baz-RNAi experiments or baz mutant clones. If the endogenous locus is tagged with GFP, Baz-GFP localized to cortical membranes in imaginal disc epithelial cells but was not detectable in nuclear envelopes or NMJs in muscles. The authors conclude that they could not find evidence for either nuclear or NMJ localization of Baz and any results derived from these antibodies should be regarded with caution.

      The manuscript reports a careful and thorough evaluation of anti-Baz antibodies used in the scientific community. Since it might impact previous findings, any remaining uncertainties should be clarified before publication. I have therefore a number of suggestions to improve the manuscript.

      Major comments:

      1) Any truncation or addition of amino acids might affect the subcellular localization of proteins. Important molecular information on the baz alleles and GFP-fusion proteins are therefore missing in the manuscript. Specifically, what is the underlying molecular nature of the baz alleles used in the study, e.g. bazEH747 (nonsense? position?)? At which amino acid position and in which protein domain is GFP fused to Baz in Baz-GFP (Bac) and Baz-GFP (Trap)? Would these fusions affect subcellular localization and/or functionality? While the authors positively tested Baz-GFP (Bac) in a baz mutant background, this cannot easily be done for Baz-GFP (Trap). The authors should therefore clarify, e.g. by RT-PCR, which of the four Baz isoforms are fused to GFP in Baz-GFP (Trap) and if this might affect functionality and/or location? This information should be depicted or listed together with the epitopes of the antibodies in a figure or table, respectively, in the main manuscript for better orientation of the reader. *

      bazEH747 is a strong loss-of-function allele with a point mutation changing the codon for Q51 to Stop in all four isoforms (numbering is according to isoform A) (Krahn et al., 2010; Shahab et al., 2015). In the Results section, we have changed the wording as follows to make this clear: "For clonal analysis the strong loss-of-function allele bazEH747 was used, where a point mutation in exon 4 results in a premature stop close to the N-terminus of all four isoforms (the codon for amino acid residue Q51 is mutated to a stop in isoform A) (Krahn et al., 2010)."

      We have added two additional supplemental figures to precisely show the insertion site of GFP in the GFP-Baz trap line (Fig. S5) and the Baz-GFP BAC line (Fig. S6). We have changed the Results section to precisely explain the nature of the two Baz-GFP lines as follows: "While strong nuclear envelope immunostaining was observed using several independently raised anti Baz antibodies (Fig. 1; Fig. S1), no nuclear envelope localization was detected in follicular epithelial cells and in larval body wall muscles using a Baz-GFP BAC line (Besson et al., 2015) (Fig. S3C-D', S4A, A') nor in a GFP-Baz protein-trap line (Buszczak et al., 2007)(Fig. S3E-F', S4C, C'). In the GFP-Baz protein-trap line an engineered exon encoding for GFP is inserted into the second untranslated exon (Fig. S5). This exon encoding for GFP is predicted to be spliced in frame into the mRNAs RA and RC encoding for isoforms PA and PC whose translation starts in exon 1 (Fig. S5), resulting in insertion of GFP between amino acid residues K40 and P41 of isoforms PA and PC. The transcripts RB and RD encoding Baz isoforms PB and PD have their translation start within exon 3 and thus cannot form fusion proteins with GFP inserted in exon 2 (Fig. S5). However, GFP-Baz protein trap flies are homozygous viable and are phenotypically indistinguishable from wild type flies, indicating that the corresponding GFP fusion protein is fully functional and faithfully reflects the expression pattern and subcellular localization of Baz isoforms PA and PC. The BAC line integrates the GFP within exon 10 between amino acid residues L1424 and Q1425 of isoform PA, giving rise to GFP fusion proteins for all four isoforms (Fig. S6) (Besson et al., 2015). Like the protein-trap GFP-Baz fusion protein, the Baz-GFP fusion protein in the BAC line is fully functional as it completely rescued lethality and fertility of the bazEH747 allele (Fig. S7D-D') and the baz815-8 allele (Besson et al., 2015)."

      *2) Figure 3D-G: The images for Baz-GFP nicely show that GFP is expressed in imaginal discs but not at NMJs. However, when the brightness of Fig. 3D' and 3F' is increased, nuclear envelopes, tracheal branches and some synaptic boutons are clearly visible in the Baz-GFP channels. These are likely background signals due to the staining procedure, but to avoid any confusion, images showing unstained (native) GFP fluorescence should be included to prove that there are no residual signals. GFP fluorescence survives formaldehyde fixation and many GFP exon traps are clearly visible even in the absence of immunofluorescent stainings. Furthermore, Fig. 3G appears vastly different compared to Fig. 3E and Baz localization at cell-cell junctions cannot be recognized by people unfamiliar with imaginal discs. The images in Fig. 3G are therefore not suitable and should be replaced. *

      We have added the new Fig. S4 showing the GFP signal without antibody staining of somatic body wall muscles and wing imaginal discs of larvae expressing the Baz-GFP BAC and GFP-Baz trap transgenes. We have also replaced Fig. 3G with images that can easily be compared with the images in Fig. 3E. The following paragraph was added to the Results section: "These findings were confirmed by analysis of fixed larval tissues that were imaged for GFP fluorescence without anti GFP antibody staining (Fig. S4). Neither in the Baz-GFP BAC line (Fig. S4A, A'), nor in the GFP-Baz trap line (Fig. S4C, C') any nuclear envelope or NMJ signal was detectable in somatic muscles, whereas junctional signal in wing imaginal discs was readily detectable in both lines (Fig. S4B, D)."

      *3) The argument that baz4 and baz815-8 carry second site mutations is not fully convincing (page 10, 13). Why should two independent baz alleles carry an additional hit that affect Spectrin levels? Other explanations might be possible. While downregulation of Baz in muscles by RNAi is a good approach to tackle the question of Spectrin localization and expression levels, RNAi itself has its own uncertainties. Why not showing the effect on Spectrin levels or the lack of Baz at the NMJ (or the nuclear envelopes) in "clean" baz null embryos or larvae (e.g. bazEH747/Df)? NMJs can be stained in late stage embryos or compound heterozygous null mutants quite frequently survive until larval stages. *

      We do not have a good explanation for the published reduction of Baz and α-Spectrin signal at the NMJ in larvae heterozygous for the baz alleles baz4 and baz815-8 (Ruiz-Canada et al., 2004; Ramachandran et al., 2009), as our analysis shows that Baz is not expressed there, rendering the reported phenotypes very difficult to explain. It is beyond the scope of our paper to prove that the data published by Ruiz-Canada et al. (2004) and Ramachandran et al. (2009) are indeed reproducible. Our speculation that second site hits on these two mutant chromosomes may have caused the published effects is based solely on our own published observation that commonly used chromosomes with these two mutant baz alleles have stronger phenotypes than a clean baz loss-of-function allele (Shahab et al., 2015). We have changed the wording of the corresponding paragraph as follows: "It has been published that heterozygous baz4 mutant larvae show a significant decrease in immunofluorescence signal of Baz and also of Spectrin at the NMJ (Ruiz-Canada et al., 2004). Another publication showed a significant decrease in Baz and Spectrin immunostaining at the NMJ of larvae heterozygous for the baz815-8 allele (Ramachandran et al., 2009). We did not attempt to reproduce these findings. However, in our hands mitotic clones generated with FRT chromosomes carrying these latter two baz alleles showed polarity phenotypes in the follicular epithelium, whereas clones of the clean bazEH747 null allele did not show any polarity defect (Shahab et al., 2015), raising the possibility that the NMJ phenotypes observed by Ruiz-Canada et al. (2004) and Ramachandran et al. (2009) were caused by second site mutations on these chromosomes rather than by reduced Baz activity."

      bazEH747 hemizygous mutant embryos are so abnormal and malformed at late embryonic stages that we did not attempt to stain these for Baz immunoreactivity at NMJs.

      4) It is not really made clear in the manuscript, why the additional reactivity of the anti-Baz antibodies has not been noticed earlier. The paper should therefore include a summarizing paragraph that describes how the specificities of the antibodies have been tested in the past in the laboratories that used them. Have they never been tested in null mutant animals? In null mutants it should be obvious to determine, if some staining patterns do not disappear.

      The vast majority of publications on Baz, including those from our own laboratory, focused on the functions of Baz at junctions and in the control of cell polarity. For these functions the cortical localization of Baz is relevant, which has been shown to be specific in many independent studies using null alleles and RNAi. Only a few publications, in particular those from the laboratory of Vivian Budnik, have focused on potential functions of Baz at the NMJ and the nuclear envelope. Why these studies provided no convincing proof of the specificity of the signal at those "unconventional" locations is beyond our knowledge.

      5) Figure 4 is very difficult to comprehend and should be better labeled (e.g. anterior-posterior, dorsal-ventral, muscle fibers, unspecific signals). It is standard in the field to show ventral muscles 12, 13 or 6, 7 in the center of the image and in a similar orientation (anterior left, dorsal up). Better images should be shown.

      We understand that for researchers interested in the function of specific muscles it is important to adhere to conventions regarding the orientation of muscles in figures. However, in our case it is just relevant whether a muscle expresses RNAi against a gene of interest (GFP+) or not (GFP-) in order to compare the signal intensity for Baz and Spectrin in these two situations. Thus, although we appreciate the validity of this comment, we decided to leave the original images unchanged. However, to help the reader in identifying relevant structures more easily, we have added color-coded arrows and arrowheads to mark NMJs and nuclear envelopes in GFP+ and GFP- muscles.

      *Reviewer #3 (Significance (Required)):

      The authors provide a critical assessment on the specificity of antibodies and highlight the necessity to carefully test antibodies and the conclusions drawn from the resulting stainings, especially when antibodies are bought from companies or have previously been published as specific. This is extremely important for the interpretation of experiments in all fields of molecular and cellular biology. *

      We thank the reviewer for appreciating the importance of this work.

    1. There is no doubt that humans are an artistic species. We make music, television shows, and movies, plus we paint, draw, and sculpt. All of these things are art. Humans are able to think in the abstract. We imagine and create things that do not exist, such as unicorns, monsters, and superheroes. We also build upon the achievements of earlier periods to make art that is grounded in history but is also new.

      Human beings are naturally creative. We sing and use instruments, act out scenes, draw, paint, and express ourselves in unique ways that we may not even recognize as art. We use our imagination to portray fictional places and people, combine images to create a whole new composite image. We also admire or engage in some way with the art of others, past and present.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We are very grateful to the reviewers for their constructive comments. Here is a summary of the main changes we made from the previous manuscript version, based on the reviewers’ comments:

      (1) Introduction of a new model, based on a Markov chain, capturing within-trial evolution in search strategy.

      (2) Addition of a new figure investigating inter-animal variations in search strategy.

      (3) Measurement of model fit consistency across 10 simulation repetitions, to prevent the risk of model overfitting.

      (4) Several clarifications have been made in the main text (Results, Discussion, Methods) and figure legends.

      (5) We now provide processed data and code for the analyses and models in a GitHub repository.

      (6) Simplification of the previous modeling. We realized that the two first models in the previous manuscript version were simply special cases of the third model. Therefore, we retained only the third model, which has been renamed as the ‘mixture model’.

      (7) Modification (or creation) of Figures 4-6 and Supplementary Figures 7-8 to reflect the aforementioned changes.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors design an automated 24-well Barnes maze with 2 orienting cues inside the maze, then model what strategies the mice use to reach the goal location across multiple days of learning. They consider a set of models and conclude that one of these models, a combined strategy model, best explains the experimental data.

      This study is written concisely and the results presented concisely. The best fit model is reasonably simple and fits the experimental data well (at least the summary measures of the data that were presented).

      Major points:

      (1) One combined strategy (once the goal location is learned) that might seem to be reasonable would be that the animal knows roughly where the goal is, but not exactly where, so it first uses a spatial strategy just to get to the first vestibule, then switches to a serial strategy until it reaches the correct vestibule. How well would such a strategy explain the data for the later sessions? The best combined model presented in the manuscript is one in which the animal starts with a roughly 50-50 chance of a serial (or spatial strategy) from the start vestibule (i.e. by the last session before the reversal the serial and spatial strategies are at ~50-50 in Fig. 5d). Is it the case that even after 15 days of training the animal starts with a serial strategy from its starting point approximately half of the time? The broader point is whether additional examination of the choices made by the animal, combined with consideration of a larger range of possible models, would be able to provide additional insight into the learning and strategies the animal uses.

      Our analysis focused on the evolution of navigation strategies across days and trials. The reviewer raises the interesting possibility that navigation strategy might evolve in a specific manner within each trial, especially on the later days once the environment is learned. To address this possibility, we first examined how some of the statistical distributions, previously analyzed across days, evolved within trials. Consistent with the reviewer’s intuition, the statistical distributions changed within trials, suggesting a specific strategy evolution within trials. Second, we developed a new model, where strategies are represented as nodes of a Markov chain. This model allows potential strategy changes after each vestibule visit, according to a specific set of transition probabilities. Vestibules are chosen based on the same stochastic processes as in the previous model. This new model could be fitted to the experimental distributions and captured both the within-trial evolution and the global distributions. Interestingly, the trials were mostly initiated in the random strategy (~67% chance) and to a lesser extent in the spatial strategy (~25% chance), but rarely in the serial strategy (~8% chance). This new model is presented in Figure 6.
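      As an illustration only, a Markov chain over strategies of the kind described above might be sketched as follows. The initial probabilities are the approximate values quoted in the response (~67%, ~25%, ~8%); the transition matrix, the start position, the per-strategy vestibule rules and all function names are hypothetical placeholders, not the authors' fitted parameters or code.

```python
import random

N_VESTIBULES = 24          # the maze has 24 vestibules; the goal is labelled 0
GOAL = 0
STRATEGIES = ("random", "spatial", "serial")

# Approximate initial strategy probabilities quoted in the response.
INITIAL_PROBS = {"random": 0.67, "spatial": 0.25, "serial": 0.08}

# HYPOTHETICAL transition probabilities (not fitted values); rows sum to 1.
TRANSITIONS = {
    "random":  {"random": 0.6, "spatial": 0.3, "serial": 0.1},
    "spatial": {"random": 0.1, "spatial": 0.5, "serial": 0.4},
    "serial":  {"random": 0.1, "spatial": 0.1, "serial": 0.8},
}


def draw(probs, rng):
    """Sample one key of `probs` according to its probability."""
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names])[0]


def choose_vestibule(strategy, current, direction, rng):
    """Pick the next vestibule; these rules are simplifying assumptions."""
    if strategy == "random":
        return rng.randrange(N_VESTIBULES)               # uniform choice
    if strategy == "spatial":
        return (GOAL + rng.choice((-1, 0, 1))) % N_VESTIBULES  # near the goal
    return (current + direction) % N_VESTIBULES          # serial: next neighbour


def simulate_trial(rng, start=12, max_visits=1000):
    """Return the list of (strategy, vestibule) visits for one trial."""
    strategy = draw(INITIAL_PROBS, rng)
    direction = rng.choice((-1, 1))   # serial search direction for this trial
    current, visits = start, []
    for _ in range(max_visits):
        vestibule = choose_vestibule(strategy, current, direction, rng)
        visits.append((strategy, vestibule))
        if vestibule == GOAL:         # the trial ends at the goal vestibule
            break
        current = vestibule
        # The strategy may change after each vestibule visit.
        strategy = draw(TRANSITIONS[strategy], rng)
    return visits
```

      Calling `simulate_trial(random.Random(0))` yields one simulated visit sequence; in a sketch like this, fitting would amount to adjusting the initial and transition probabilities until simulated distributions match the experimental ones.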

      (2) To clarify, in the Fig. 4 simulations, is the "last" vestibule visit of each trial, which is by definition 0, not counted in the plots of Fig. 4b? Otherwise, I would expect that vestibule 0 is overrepresented because a trial always ends with Vi = 0.

      The last vestibule visit (vestibule 0 by definition) is counted in the plots of Fig. 4b. We initially shared the same concern as the reviewer. However, upon further consideration, we arrived at the following explanation: A factor that might lead to an overrepresentation of vestibule 0 is the fact that, unlike other vestibules, it has to be contained in each trial, as trials terminated upon the selection of vestibule 0. Conversely, a factor that might contribute to an underrepresentation of vestibule 0 is that, unlike other vestibules, it cannot be counted more than once per trial. These two factors seem to counterbalance each other, resulting in no discernible overrepresentation or underrepresentation of vestibule 0 in the random process.
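      The counterbalancing argument above can be checked with a small simulation, under the assumption (ours, not stated in the response) that the random process draws each vestibule independently and uniformly from the 24 vestibules, with replacement. Under that assumption a trial lasts 24 visits on average: the final one is vestibule 0, and the remaining 23 expected visits are shared equally by the 23 other vestibules, so every vestibule is expected to be visited once per trial. The function name and trial count below are illustrative.

```python
import random
from collections import Counter

N_VESTIBULES = 24   # vestibules are labelled 0..23
GOAL = 0            # a trial ends as soon as vestibule 0 is selected


def count_visits(n_trials, rng):
    """Count vestibule visits over trials of a purely random search."""
    counts = Counter()
    for _ in range(n_trials):
        while True:
            vestibule = rng.randrange(N_VESTIBULES)  # uniform, with replacement
            counts[vestibule] += 1
            if vestibule == GOAL:                    # the final visit is counted
                break
    return counts


if __name__ == "__main__":
    counts = count_visits(20000, random.Random(1))
    for vestibule in range(N_VESTIBULES):
        print(vestibule, counts[vestibule])
```

      Vestibule 0 is counted exactly once per trial, while the other vestibules fluctuate around the same value, which is consistent with the absence of over- or underrepresentation described above.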

      Reviewer #2 (Public Review):

      This paper uses a novel maze design to explore mouse navigation behaviour in an automated analogue of the Barnes maze. Overall I find the work to be solid, with the cleverly designed maze/protocol to be its major strength - however there are some issues that I believe should be addressed and clarified.

      (1) Whilst I'm generally a fan of the experimental protocol, the design means that internal odor cues on the maze change from trial to trial, along with cues external to the maze such as the sounds and visual features of the recording room, ultimately making it hard for the mice to use a completely allocentric spatial 'place' strategy to navigate. I do not think there is a way to control for these conflicts between reference frames in the statistical modelling, but I do think these issues should be addressed in the discussion.

      It should be pointed out that all cues on the maze (visual, tactile, odorant) remained unchanged across trials, since the maze was rotated together with the goal and guiding cues. Furthermore, the maze was equipped with an opaque cover to prevent mice from seeing the surrounding room (the imaging of mouse trajectories was achieved using infrared light and camera). It is however possible that some other cues, such as room sounds and odors, could be perceived and could somewhat interfere with the sensory cues provided inside the maze. We have now mentioned this possibility in the discussion.

      (2) Somewhat related - I could not find how the internal maze cues are moved for each trial to demarcate the new goal (i.e. the luminous cues) ? This should be clarified in the methods.

      The luminous cues were fixed to the floor of the arena. Consequently, they rotated along with the arena as a unified unit, as depicted in Figure 1. We have added some clarifications in the Figure 1 legend and the Methods.

      (3) It appears some data is being withheld from Figures 2&3? E.g. Days 3/4 from Fig 2b-f and Days 1-5 on for Fig 3. Similarly, Trials 2-7 are excluded from Fig 3. If this is the case, why? It should be clarified in the main text and Figure captions, preferably with equivalent plots presenting all the data in the supplement.

      The statistical distributions for all single days/trials are shown in the color-coded panels of Figures 2 and 3. In the line plots of Figures 2 and 3, we show only the overlay of 2-3 lines for the sake of clarity. The days/trials represented were chosen to capture the dynamic range of variability within the distributions. We have added this information in the figure legends.

      (4) I strongly believe the data and code should be made freely available rather than "upon reasonable request".

      Matrices of processed data and various codes for simulations and analyses are now available at https://github.com/sebiroyerlab/Vestibule_sequences.

      Reviewer #3 (Public Review):

      Royer et al. present a fully automated variant of the Barnes maze to reduce experimenter interference and ensure consistency across trials and subjects. They train mice in this maze over several days and analyze the progression of mouse search strategies during the course of the training. By fitting models involving stochastic processes, they demonstrate that a model combined of the random, spatial, and serial processes can best account for the observed changes in mice's search patterns. Their findings suggest that across training days the spatial strategy (using local landmarks) was progressively employed, mostly at the expense of the random strategy, while the serial strategy (consecutive nearby vestibule check) is reinforced from the early stages of training. Finally, they discuss potential mechanistic underpinnings within brain systems that could explain such behavioral adaptation and flexibility.

      Strength:

      The development of an automated Barnes maze allows for more naturalistic and uninterrupted behavior, facilitating the study of spatial learning and memory, as well as the analysis of the brain's neural networks during behavior when combined with neurophysiological techniques. The system's design has been thoughtfully considered, encompassing numerous intricate details. These details include the incorporation of flexible options for selecting start, goal, and proximal landmark positions, the inclusion of a rotating platform to prevent the accumulation of olfactory cues, and careful attention given to automation, taking into account specific considerations such as the rotation of the maze without causing wire shortage or breakage. When combined with neurophysiological manipulations or recordings, the system provides a powerful tool for studying the spatial navigation system.

      The behavioral experiment protocols, along with the analysis of animal behavior, are conducted with care, and the development of behavioral modeling to capture the animal's search strategy is thoughtfully executed. It is intriguing to observe how the integration of these innovative stochastic models can elucidate the evolution of mice's search strategy within a variant of the Barnes maze.

      Weakness:

      (1) The development of the well-thought-out automated Barnes maze may attract the interest of researchers exploring spatial learning and memory. However, this aspect of the paper lacks significance due to insufficient coverage of the materials and methods required for readers to replicate the behavioral methodology for their own research inquiries.

      Moreover, as discussed by the authors, the methodology favors specialists who utilize wired recordings or manipulations (e.g. optogenetics) in awake, behaving rodents. However, it remains unclear how the current maze design, which involves trapping mice in start and goal positions and incorporating angled vestibules resulting in the addition of numerous corners, can be effectively adapted for animals with wired implants.

      The reviewer is correct in pointing out that the current maze design is not suitable for performing experiments with wired implants, particularly due to the maze’s enclosed structure and the access to the start/goal boxes through side holes. Instead, pharmacogenetics and wireless approaches for optogenetics and electrophysiology would need to be used. We have now mentioned this limitation in the discussion.

      (2) Novelty: In its current format, the main axis of the paper falls on the analysis of animal behavior and the development of behavioral modeling. In this respect, while it is interesting to see how thoughtfully designed models can explain the evolution of mice search strategy in a maze, the conclusions offer limited novel findings that align with the existing body of research and prior predictions.

      We agree with the reviewer that our study is weakly connected to previous research on the hippocampus and spatial navigation, as it consists mainly of animal behavior analysis and modeling and addresses a relatively unexplored topic. We hope that combining our behavioral approach with optogenetics and electrophysiology will, in the future, yield new insights that are in line with the existing body of research.

      (3) Scalability and accessibility: While the approach may be intriguing to experts who have an interest in or are familiar with the Barnes maze, its presentation seems to primarily target this specific audience. Therefore, there is a lack of clarity and discussion regarding the scalability of behavioral modeling to experiments involving other search strategies (such as sequence or episodic learning), other animal models, or the potential for translational applications. The scalability of the method would greatly benefit a broader scientific community. In line with this view, the paper's conclusions heavily rely on the development of new models using custom-made codes. Therefore, it would be advantageous to make these codes readily available, and if possible, provide access to the processed data as well. This could enhance comprehension and enable a larger audience to benefit from the methodology.

      The current approach might indeed extend to other species in equivalent environments and might also constitute a general proof of principle regarding the characterization of animal behaviors by the mixing of stochastic processes. We have now mentioned these points in the discussion.

      As suggested by the reviewer, we have now provided model/simulation codes and processed data to replicate the figures, at https://github.com/sebiroyerlab/Vestibule_sequences.

      (4) Cross-validation of models: The authors have not implemented any measures to mitigate the risk of overfitting in their modeling. It would have been beneficial to include at least some form of cross-validation with stochastic models to address this concern. Additionally, the paper lacks the presence of analytics or measures that assess and compare the performance of the models.

      To avoid the risk of model overfitting, the most appropriate solution appeared to be repeating the simulations several times and examining the consistency of the obtained parameters across repetitions. For the mixture model, we now show in Supplementary figure 7 the probabilities obtained from 10 repetitions of the simulation. Similarly, for the Markov chain model, the probabilities obtained from 10 repetitions of the simulation are shown in Figure 6.

      Regarding model comparison, we have simplified our mixture model into only one model, as we realized the 2 other models in the previous manuscript version were simply special cases of the 3rd model. Nevertheless, comparison was still needed to estimate the best value of N (the number of consecutive segments that a strategy lasts) in the mixture model. We now show the comparison of mean square errors obtained for different values of N, using a t-test across 10 repetitions of the simulations (Figure 5c).
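      The two checks described above (consistency of fitted values across repetitions, and comparison of mean square errors between two values of N) could be sketched as follows. This is a generic illustration and not the authors' analysis code: the response does not say which t-test variant was used, so Welch's unequal-variance form is an assumption here, and the function names and example numbers are made up for demonstration.

```python
from math import sqrt
from statistics import mean, stdev, variance


def summarize(values):
    """Mean and sample standard deviation of one quantity across repetitions."""
    return mean(values), stdev(values)


def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    sa, sb = variance(a) / na, variance(b) / nb   # squared standard errors
    t = (mean(a) - mean(b)) / sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    # Made-up numbers for illustration only; NOT results from the study.
    mse_n_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    mse_n_b = [2.0, 3.0, 4.0, 5.0, 6.0]
    print(summarize(mse_n_a))
    print(welch_t(mse_n_a, mse_n_b))
```

      A small standard deviation across repetitions indicates that the fit converges to similar values each time; the t statistic and degrees of freedom can then be converted to a p-value with any standard t-distribution table or library.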

      (5) Quantification of inter-animal variations in strategy development: It is important to investigate, and address the argument concerning the possibility that not all animals recruit and develop the three processes (random, spatial, and serial) in a similar manner over days of training. It would be valuable to quantify the transition in strategy across days for each individual mouse and analyze how the population average, reflecting data from individual mice, corresponds to these findings. Currently, there is a lack of such quantification and analysis in the paper.

      We have added a figure (Supplementary figure 8) showing the mixture model matching analyses for individual animals. A lot of variability is indeed observed across animals, with some animals displaying strong preferences for certain strategies compared to others. The average across the mouse population showed a similar trend to the result obtained with the pooled data.

      Recommendations for the authors:

      Summary of Reviewer Comments:

      (1) In its present form, the manuscript lacks sufficient coverage of the materials and methods necessary for readers to replicate the behavioral methodology in their own research inquiries. For instance, it would be beneficial to clarify how the cues are rotated relative to the goal.

      (2) The models may be over-fitted, leading to spurious conclusions, and cross-validation is necessary to rule out this possibility.

      (3) The specific choice of the three strategies used to fit behavior in this model should be better justified, as other strategies may account for the observed behavior.

      (4) The study would benefit from an analysis of behavior on an animal-by-animal basis, potentially revealing individual differences in strategies.

      (5) Spatial behavior is not necessarily fully allocentric in this task, as only the two cues in the arena can be used for spatial orientation, unlike odor cues on the floor and sound cues in the room. This should be discussed.

      (6) Making the data and code fully open source would greatly strengthen the impact of this study.

      In addition, each reviewer has raised both major and minor concerns which should be addressed if possible.

      Reviewer #1 (Recommendations For The Authors):

      Minor points:

      (1) Change "tainted" to "tinted" in Fig. 1a

      (2) Should note explicitly in Fig. 2d that the goal is at vestibule 0, and also in the legend

      (3) Fig. 3 legend should say "c-e)", not "c-f)"

      (4) Supplementary Fig. 8 legend repeats "d)" twice

      Reviewer #2 (Recommendations For The Authors):

      Packard & McGaugh 1996 is cited twice as refs 5 and 14

      Reviewer #3 (Recommendations For The Authors):

      - Figure 3: Please correct the labels referenced as "c-f)" in the figure's legend.

      - Rounding numbers issue on page 4: 82.62% + 17.37% equals 99.99%, not 100%.

      We fixed all minor points. We are very thankful to the reviewers for their constructive comments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment:

      This study uses carefully designed experiments to generate a useful behavioural and neuroimaging dataset on visual cognition. The results provide solid evidence for the involvement of higher-order visual cortex in processing visual oddballs and asymmetry. However, the evidence provided for the very strong claims of homogeneity as a novel concept in vision science, separable from existing concepts such as target saliency, is inadequate.

      We appreciate the positive and balanced assessment from the reviewers. We agree that visual homogeneity is similar to existing concepts such as target saliency. We have tried our best to articulate our rationale for defining it as a novel concept. However, the debate about whether visual homogeneity is novel or related to existing concepts is completely beside the point, since that is not the key contribution of our study.

      Our key contribution is our quantitative model for how the brain could be solving generic visual tasks by operating on a feature space. In the literature there are no theories regarding the decision-making process by which the brain could be solving generic visual tasks. In fact, oddball search tasks, same-different tasks and symmetry tasks are never even mentioned in the same study because it is tacitly assumed that the underlying processes are completely different! Our work brings together these disparate tasks by proposing a specific computation that enables the brain to solve both types of tasks and providing evidence for it. This specific computation is a well-defined, falsifiable model that will need to be replicated, elaborated and refined by future studies.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors define a new metric for visual displays, derived from psychophysical response times, called visual homogeneity (VH). They attempt to show that VH is explanatory of response times across multiple visual tasks. They use fMRI to find visual cortex regions with VH-correlated activity. On this basis, they declare a new visual region in the human brain, area VH, whose purpose is to represent VH for the purpose of visual search and symmetry tasks.

      Thank you for your concise summary. We appreciate your careful reading and thoughtful and constructive comments.

      Strengths:

      The authors present carefully designed experiments, combining multiple types of visual judgments and multiple types of visual stimuli with concurrent fMRI measurements. This is a rich dataset with many possibilities for analysis and interpretation.

      Thank you for your accurate assessment of the strengths of our study.

      Weaknesses:

      The datasets presented here should provide a rich basis for analysis. However, in this version of the manuscript, I believe that there are major problems with the logic underlying the authors' new theory of visual homogeneity (VH), with the specific methods they used to calculate VH, and with their interpretation of psychophysical results using these methods. These problems with the coherency of VH as a theoretical construct and metric value make it hard to interpret the fMRI results based on searchlight analysis of neural activity correlated with VH.

      We appreciate your concerns, and have tried our best to respond fully to each of your specific concerns below.

      In addition, the large regions of VH correlations identified in Experiments 1 and 2 vs. Experiments 3 and 4 are barely overlapping. This undermines the claim that VH is a universal quantity, represented in a newly discovered area of the visual cortex, that underlies a wide variety of visual tasks and functions.

      We agree with you that the VH regions defined using the symmetry task and the search task do not overlap completely (as we have shown in Figure S13). However, this is to be expected for several reasons. First, the images in the symmetry task were presented at fixation, whereas the images in the visual search task were presented peripherally. Second, the lack of overlap could be due to variations across individuals. Indeed, considerable individual variability has been observed in the location of category-selective regions such as VWFA (Glezer and Riesenhuber, 2013) and FFA (Weiner and Grill-Spector, 2012). We propose that testing the same participants on both search and symmetry tasks would reveal overlapping VH regions. We now acknowledge these issues in the Results (p. 26).

      Maybe I have missed something, or there is some flaw in my logic. But, absent that, I think the authors should radically reconsider their theory, analyses, and interpretations, in light of the detailed comments below, to make the best use of their extensive and valuable datasets combining behavior and fMRI. I think doing so could lead to a much more coherent and convincing paper, albeit possibly supporting less novel conclusions.

      We appreciate your concerns. We have tried our best to respond fully to each of your specific concerns below.

      THEORY AND ANALYSIS OF VH

      (1) VH is an unnecessary, complex proxy for response time and target-distractor similarity. VH is defined as a novel visual quality, calculable for both arrays of objects (as studied in Experiments 1-3) and individual objects (as studied in Experiment 4). It is derived from a center-to-distance calculation in a perceptual space. That space in turn is derived from the multi-dimensional scaling of response times for target-distractor pairs in an oddball detection task (Experiments 1 and 2) or in a same-different task (Experiments 3 and 4).

      The above statements are not entirely correct. Experiments 1 & 3 are oddball visual search experiments. Their purpose was to estimate the underlying perceptual space of objects.

      Proximity of objects in the space is inversely proportional to response times for arrays in which they were paired. These response times are higher for more similar objects. Hence, proximity is proportional to similarity. This is visible in Fig. 2B as the close clustering of complex, confusable animal shapes.

      VH, i.e. distance-to-center, for target-present arrays, is calculated as shown in Fig. 1C, based on a point on the line connecting the target and distractors. The authors justify this idea with previous findings that responses to multiple stimuli are an average of responses to the constituent individual stimuli. The distance of the connecting line to the center is inversely proportional to the distance between the two stimuli in the pair, as shown in Fig. 2D. As a result, VH is inversely proportional to the distance between the stimuli and thus to stimulus similarity and response times. But this just makes VH a highly derived, unnecessarily complex proxy for target-distractor similarity and response time. The original response times on which the perceptual space is based are far more simple and direct measures of similarity for predicting response times.

      We agree that VH brings no explanatory power to target-present searches, since target-present response times are a direct estimate of target-distractor similarity. However, we are additionally explaining target-absent response times. Target-absent response times are well known to vary systematically with image properties, but why they do so has not been clear in the literature.

      Our key conceptual advance lies in relating the neural response to a search array to the neural response of the constituent elements, and in proposing a decision variable using which participants can make both target-present and target-absent judgements on any search array.

      (2) The use of VH derived from Experiment 1 to predict response times in Experiment 2 is circular and does not validate the VH theory.

      The use of VH, a response time proxy, to predict response times in other, similar tasks, using the same stimuli, is circular. In effect, response times are being used to predict response times across two similar experiments using the same stimuli. Experiment 1 and the target present condition of Experiment 2 involve the same essential task of oddball detection. The results of Experiment 1 are converted into VH values as described above, and these are used to predict response times in Experiment 2 (Fig. 2F). Since VH is a derived proxy for response values in Experiment 1, this prediction is circular, and the observed correlation shows only consistency between two oddball detection tasks in two experiments using the same stimuli.

      We agree that it would be circular to use oddball search times in Experiment 1 to explain only target-present search times in Experiment 2, since they basically involve the same searches. However, we are explaining both target-present and target-absent search times in a unified framework; systematic variations in target-absent search times have been noted in the literature but never really explained. One could still simply say that target-absent search times are some function of the target-present search times, but this still doesn’t provide an explanation for how participants are making target-present and absent decisions. The existing literature contains models for how visual search might occur for a specific target and distractor but does not elucidate how participants might perform generic visual search where target and distractors are not known in advance.

      Our key conceptual advance lies in relating the neural response to a search array to the neural response of the constituent elements, and in proposing a decision variable using which participants can make both target-present and target-absent judgements on any search array.

      (3) The negative correlation of target-absent response times with VH as it is defined for target-absent arrays, based on the distance of a single stimulus from the center, is uninterpretable without understanding the effects of center-fitting. Most likely, center-fitting and the different VH metrics for target-absent trials produce an inverse correlation of VH with target-distractor similarity.

      We see no cause for concern with the center-fitting procedure, for several reasons. First, the best-fitting center remained stable despite many randomly initialized starting points. Second, the best-fitting center derived from one set of objects was able to predict the target-absent and target-present responses of another set of objects. Finally, the VH obtained for each object (i.e. distance from the best-fitting center) is strongly correlated with the average distance of that object from all other objects (Figure S1A). We have now clarified this in the Results (p. 11).

      The construction of the VH perceptual space also involves fitting a "center" point such that distances to center predict response times as closely as possible. The effect of this fitting process on distance-to-center values for individual objects or clusters of objects is unknowable from what is presented here. These effects would depend on the residual errors after fitting response times with the connecting line distances. The center point location and its effects on the distance-to-center of single objects and object clusters are not discussed or reported here.

      While it is true that the optimal center needs to be found by fitting to the data, there is no particular mystery to the algorithm: we are simply performing standard gradient descent to maximize the fit to the data. We have described the algorithm clearly and are making our code public. We find that the algorithm yields stable optimal centers despite many randomly initialized starting points, and that the optimal center can predict responses to entirely novel images that were excluded during model training. We are making no assumption about the location of the center with respect to individual points. Therefore, we see no cause for concern regarding the center-finding algorithm.
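
      For illustration only, the kind of center fitting described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the published code: the squared-error objective, the learning rate, the toy points and the names `fit_center`, `points`, `targets` and `true_center` are all hypothetical, and the actual objective used in the study may differ. It shows gradient descent on a center point from several random starts, so that the stability of the fitted center can be checked directly.

```python
import math
import random


def fit_center(points, targets, start, lr=0.01, steps=1000):
    """Gradient descent on sum_i (||x_i - c|| - t_i)^2 over the center c.

    Assumed objective for illustration; not taken from the study.
    """
    c = list(start)
    for _ in range(steps):
        grad = [0.0] * len(c)
        for x, t in zip(points, targets):
            d = math.dist(c, x) or 1e-12  # guard against division by zero
            for k in range(len(c)):
                # d/dc_k of (d - t)^2 is 2 * (d - t) * (c_k - x_k) / d
                grad[k] += 2.0 * (d - t) * (c[k] - x[k]) / d
        c = [ck - lr * gk for ck, gk in zip(c, grad)]
    return c


# Toy 2-D "perceptual space" (hypothetical): 12 points on a circle of radius 5.
points = [(5 * math.cos(2 * math.pi * k / 12), 5 * math.sin(2 * math.pi * k / 12))
          for k in range(12)]
true_center = (0.3, -0.2)
# Noise-free targets: the distance of each point from the true center.
targets = [math.dist(x, true_center) for x in points]

# Several random starting points, seeded for reproducibility.
rng = random.Random(0)
starts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(5)]
centers = [fit_center(points, targets, s) for s in starts]

# Largest pairwise gap between fitted centers: a simple stability check.
spread = max(math.dist(a, b) for a in centers for b in centers)
```

      In this toy setting every start recovers the same center, so `spread` is close to zero. With real response times the targets would be noisy, and the spread across restarts would indicate how well determined the center is.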

      Yet, this uninterpretable distance-to-center of single objects is chosen as the metric for VH of target-absent displays (VHabsent). This is justified by the idea that arrays of a single stimulus will produce an average response equal to one stimulus of the same kind. However, it is not logically clear why response strength to a stimulus should be a metric for homogeneity of arrays constructed from that stimulus, or even what homogeneity could mean for a single stimulus from this set. It is not clear how this VHabsent metric based on single stimuli can be equated to the connecting line VH metric for stimulus pairs, i.e. VHpresent, or how both could be plotted on a single continuum.

      Most visual tasks, such as finding an animal, are thought to involve building a decision boundary on some underlying neural representation. Even visual search has been portrayed as a signal-detection problem where a particular target is to be discriminated from a distractor. However, none of these formulations work in the case of generic visual tasks, where the target and distractor identities are unknown. We are proposing that, when we view a search array, the neural response to the search array can be deduced from the neural responses to the individual elements using well-known rules, and that decisions about an oddball target being present or absent can be made by computing the distance of this neural response from some canonical mean firing rate of a population of neurons. This distance-to-center computation is what we denote as visual homogeneity. We have revised our manuscript throughout to make this clearer and we hope that this helps you understand the logic better.
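
      The distance-to-center computation can be sketched in a few lines. This is a hedged illustration, not the model as implemented in the study: it assumes that the response to an array is the plain, equal-weight average of the responses to its elements (the actual combination rule may use different weights), and the names `array_response` and `visual_homogeneity` and the toy vectors are hypothetical.

```python
import math


def array_response(item_vectors):
    """Assumed rule: array response = equal-weight mean of element responses."""
    n = len(item_vectors)
    dims = len(item_vectors[0])
    return tuple(sum(v[k] for v in item_vectors) / n for k in range(dims))


def visual_homogeneity(item_vectors, center):
    """Distance of the array response from the center point."""
    return math.dist(array_response(item_vectors), center)


# Toy 2-D example (hypothetical values).
center = (0.0, 0.0)
distractor = (3.0, 4.0)
target = (-3.0, -4.0)

# Target-absent array: four identical items, so the array response
# equals the single-item response, at distance 5 from the center.
vh_absent = visual_homogeneity([distractor] * 4, center)

# Target-present array: one oddball among three distractors pulls the
# averaged response toward the center, to distance 2.5.
vh_present = visual_homogeneity([target, distractor, distractor, distractor], center)
```

      Under these assumptions a homogeneous array lies farther from the center than the corresponding oddball array, which is the property a present/absent decision on this single variable would rely on.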

      It is clear, however, what should be correlated with difficulty and response time in the target-absent trials, and that is the complexity of the stimuli and the numerosity of similar distractors in the overall stimulus set. The complexity of the target, similarity with potential distractors, and the number of such similar distractors all make ruling out distractor presence more difficult. The correlation seen in Fig. 2G must reflect these kinds of effects, with higher response times for complex animal shapes with lots of similar distractors and lower response times for simpler round shapes with fewer similar distractors.

      You are absolutely correct that stimulus complexity should matter, but there are no good measures of stimulus complexity. Moreover, considering what factors are correlated with target-absent response times is entirely different from asking what decision variable or template is being used by participants to solve the task.

      The example points in Fig. 2G seem to bear this out, with higher response times for the deer stimulus (complex, many close distractors in the Fig. 2B perceptual space) and lower response times for the coffee cup (simple, few close distractors in the perceptual space). While the meaning of the VH scale in Fig. 2G, and its relationship to the scale in Fig. 2F, are unknown, it seems like the Fig. 2G scale has an inverse relationship to stimulus complexity, in contrast to the expected positive relationship for Fig. 2F. This is presumably what creates the observed negative correlation in Fig. 2G.

      Taken together, points 1-3 suggest that VHpresent and VHabsent are complex, unnecessary, and disconnected metrics for understanding target detection response times. The standard, simple explanation should stand. Task difficulty and response time in target detection tasks, in both present and absent trials, are positively correlated with target-distractor similarity.

      Respectfully, we disagree with your assessment. Your last point is not logically consistent, though: response times for target-absent trials cannot be correlated with any target-distractor similarity since there is no target in the first place in a target-absent array. We have shown that target-absent response times are, in fact, independent of experimental context, which means that they index an image property that is independent of any reference target (Results, p. 15; Section S4). This property is what we define as visual homogeneity.

      I think my interpretations apply to Experiments 3 and 4 as well, although I find the analysis in Fig. 4 especially hard to understand. The VH space in this case is based on Experiment 3 oddball detection in a stimulus set that included both symmetric and asymmetric objects. However, the response times for a very different task in Experiment 4, a symmetric/asymmetric judgment, are plotted against the axes derived from Experiment 3 (Fig. 4F and 4G). It is not clear to me why a measure based on oddball detection that requires no use of symmetry information should be predictive of within-stimulus symmetry detection response times. If it is, that requires a theoretical explanation not provided here.

      We are using an oddball detection task to estimate perceptual dissimilarity between objects, and construct the underlying perceptual representation of both symmetric and asymmetric objects. This enabled us to then ask if some distance-to-center computation can explain response times in a symmetry detection task, and obtain an answer in the affirmative. We have reworked the text to make this clear.

      (4) Contrary to the VH theory, same/different tasks are unlikely to depend on a decision boundary in the middle of a similarity or homogeneity continuum.

      We have provided empirical proof for our claims, by showing that target-present response times in a visual search task are correlated with “different” responses in the same-different task, and that target-absent response times in the visual search task are correlated with “same” responses in the same-different task (Section S3).

      The authors interpret the inverse relationship of response times with VHpresent and VHabsent, described above, as evidence for their theory. They hypothesize, in Fig. 1G, that VHpresent and VHabsent occupy a single scale, with maximum VHpresent falling at the same point as minimum VHabsent. This is not borne out by their analysis, since the VHpresent and VHabsent value scales are mainly overlapping, not only in Experiments 1 and 2 but also in Experiments 3 and 4. The authors dismiss this problem by saying that their analyses are a first pass that will require future refinement. Instead, the failure to conform to this basic part of the theory should be a red flag calling for revision of the theory.

      We respectfully disagree – by no means did we dismiss this problem! In fact, we have explicitly acknowledged this by saying that VH does not explain all the variance in the response times, but nonetheless explains substantial variance and might form the basis for an initial guess or a fast response. The remaining variance might be explained by processes that involve more direct scrutiny. Please see Results, pages 10 & 22.

      The reason for this single scale is that the authors think of target detection as a boundary decision task, along a single scale, with a decision boundary somewhere in the middle, separating present and absent. This model makes sense for decision dimensions or spaces where there are two categories (right/left motion; cats vs. dogs), separated by an inherent boundary (equal left/right motion; training-defined cat/dog boundary). In these cases, there is less information near the boundary, leading to reduced speed/accuracy and producing a pattern like that shown in Fig. 1G.

      The key conceptual advance of our study is that we show that even target present/absent, same/different or symmetry judgements can be fit into the standard decision-making framework.

      This logic does not hold for target detection tasks. There is no inherent middle point boundary between target present and target absent. Instead, in both types of trials, maximum information is present when the target and distractors are most dissimilar, and minimum information is present when the target and distractors are most similar. The point of greatest similarity occurs at the limit of any metric for similarity. Correspondingly, there is no middle point dip in information that would produce greater difficulty and higher response times. Instead, task difficulty and response times increase monotonically with the similarity between targets and distractors, for both target present and target absent decisions. Thus, in Figs. 2F and 2G, response times appear to be highest for animals, which share the largest numbers of closely similar distractors.

      Unfortunately, your logic does not boil down to any quantitative account, since you are using vague terms like “maximum information”. Further, any argument based solely on item similarity to explain visual search or symmetry responses cannot explain systematic variations observed for target-absent arrays and for symmetric objects, for the reasons below.

      If target-distractor dissimilarity were the sole driver of response times, target-absent judgements should always take the longest time since the target and distractor have zero dissimilarity, with no variation from one image to another. This account does not explain why target-absent response times vary so systematically.

      Similarly, if symmetry judgements are solely based on comparing the dissimilarity between two halves of an object, there should be no variation in the response times of symmetric objects since the dissimilarity between their two halves is zero. However, we do see systematic variation in the response times to symmetric objects.

      DEFINITION OF AREA VH USING fMRI

      (1) The area VH boundaries from different experiments are nearly completely non-overlapping.

      In line with their theory that VH is a single continuum with a decision boundary somewhere in the middle, the authors use fMRI searchlight to find an area whose responses positively correlate with homogeneity, as calculated across all of their target present and target absent arrays. They report VH-correlated activity in regions anterior to LO. However, the VH defined by symmetry Experiments 3 and 4 (VHsymmetry) is substantially anterior to LO, while the VH defined by target detection Experiments 1 and 2 (VHdetection) is almost immediately adjacent to LO. Fig. S13 shows that VHsymmetry and VHdetection are nearly non-overlapping. This is a fundamental problem with the claim of discovering a new area that represents a new quantity that explains response times across multiple visual tasks. In addition, it is hard to understand why VHsymmetry does not show up in a straightforward subtraction between symmetric and asymmetric objects, which should show a clear difference in homogeneity.

      Actually, VHsymmetry is apparent even in a simple subtraction between symmetric and asymmetric objects (Figure S10). The VH regions identified using the visual search task and symmetry task have a partial overlap, not zero overlap as you are incorrectly claiming.

      We have noted that it is not straightforward to interpret the overlap, since there are many confounding factors. One reason could simply be that the stimuli in the symmetry task were presented at fixation, whereas the visual search arrays contained items exclusively in the periphery. Another is that the participants in the two tasks were completely different, so the lack of overlap could simply be due to inter-individual variability. Testing the same participants in two tasks using similar stimuli would be ideal but this is outside the scope of this study. We have acknowledged these issues in the Results (p. 26) and in the Supplementary Material (Section S8).

      (2) It is hard to understand how neural responses can be correlated with both VHpresent and VHabsent.

      The main paper results for VHdetection are based on both target-present and target-absent trials, considered together. It is hard to interpret the observed correlations, since the VHpresent and VHabsent metrics are calculated in such different ways and have opposite correlations with target similarity, task difficulty, and response times (see above). It may be that one or the other dominates the observed correlations. It would be clarifying to analyze correlations for target-present and target-absent trials separately, to see if they are both positive and correlated with each other.

      Thanks. The positive correlation between VH and neural response holds even when we do the analysis separately for target-present and target-absent searches (correlation between neural response in the VH region and visual homogeneity: n = 32, r = 0.66, p < 0.0005 for target-present searches; n = 32, r = 0.56, p < 0.005 for target-absent searches).

      (3) The definition of the boundaries and purpose of a new visual area in the brain requires circumspection, abundant and convergent evidence, and careful controls.

      Even if the VH metric, as defined and calculated by the authors here, is a meaningful quantity, it is a bold claim that a large cortical area just anterior to LO is devoted to calculating this metric as its major task. Vision involves much more than target detection and symmetry detection. The cortex anterior to LO is bound to perform a much wider range of visual functionalities. If the reported correlations can be clarified and supported, it would be more circumspect to treat them as one byproduct of unknown visual processing in the cortex anterior to LO, rather than treating them as the defining purpose for a large area of the visual cortex.

      We totally agree with you that reporting a new brain region would require careful interpretation and abundant and converging evidence. However, this requires many studies' worth of work, and historically category-selective regions like the FFA have achieved consensus only after they were replicated and confirmed across many studies. We believe our proposal for the computation of a quantity like visual homogeneity is conceptually novel, and our study represents a first step that provides some converging evidence (through replicable results across different experiments) for such a region. We have reworked our manuscript to make this point clearer (Discussion, p. 32).

      Reviewer #2 (Public Review):

      Summary:

      This study proposes visual homogeneity as a novel visual property that enables observers to perform several seemingly disparate visual tasks, such as finding an odd item, deciding if two items are the same, or judging if an object is symmetric. In Experiment 1, the reaction times on several objects were measured in human subjects. In Experiment 2, the visual homogeneity of each object was calculated based on the reaction time data. The visual homogeneity scores predicted reaction times. This value was also correlated with the BOLD signals in a specific region anterior to LO. Similar methods were used to analyze reaction time and fMRI data in a symmetry detection task. It is concluded that visual homogeneity is an important feature that enables observers to solve these two tasks.

      Strengths:

      (1) The writing is very clear. The presentation of the study is informative.

      (2) This study includes several behavioral and fMRI experiments. I appreciate the scientific rigor of the authors.

      We are grateful to you for your balanced assessment and constructive comments.

      Weaknesses:

      (1) My main concern with this paper is the way visual homogeneity is computed. On page 10, lines 188-192, it says: "we then asked if there is any point in this multidimensional representation such that distances from this point to the target-present and target-absent response vectors can accurately predict the target-present and target-absent response times with a positive and negative correlation respectively (see Methods)". This is also true for the symmetry detection task. If I understand correctly, the reference point in this perceptual space was found by deliberately satisfying the negative and positive correlations in response times. And then on page 10, lines 200-205, it shows that the positive and negative correlations actually exist. This logic is confusing. The positive and negative correlations emerge only because this method is optimized to do so. It seems more reasonable to identify the reference point of this perceptual space independently, without using the reaction time data. Otherwise, the inference process sounds circular. A simple way is to just use the mean point of all objects in Exp 1, without any optimization towards reaction time data.

      We disagree with you since the same logic applies to any curve-fitting procedure. When we fit data to a straight line, we are finding the slope and intercept that minimize the error between the data and the straight line, but we would hardly consider the process circular when a good fit is achieved – in fact we take it as a confirmation that the data can be fit linearly. In the same vein, we would not have observed a good fit to the data, if there did not exist any good reference point relative to which the distances of the target-present and target-absent search arrays predicted these response times.

      First, in Section S1, we have already reported that the visual homogeneity estimate for each object is strongly correlated with the average distance of that object to all other objects (r = 0.84, p < 0.0005, Figure S1). Second, to confirm that the results we obtained are not due to overfitting, we have already reported a cross-validation analysis, where we removed all searches involving a particular image and predicted these response times using visual homogeneity. This too revealed a significant model correlation, confirming that our results are not due to overfitting.
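
      The leave-one-image-out logic of such a cross-validation can be sketched as follows. This is a simplified illustration under stated assumptions rather than the analysis reported in the study: the predictor is treated as a fixed number per search and mapped to response time by a straight line, whereas a full analysis would presumably refit the whole model on each training set. The function names and the toy data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def leave_one_image_out(searches):
    """searches: list of (image_a, image_b, predictor, response_time).

    For each image, fit on searches that do not involve it, predict the
    searches that do, and correlate all held-out predictions with the data.
    """
    images = sorted({img for s in searches for img in s[:2]})
    predicted, observed = [], []
    for held_out in images:
        train = [s for s in searches if held_out not in s[:2]]
        test = [s for s in searches if held_out in s[:2]]
        slope, intercept = fit_line([s[2] for s in train], [s[3] for s in train])
        predicted += [slope * s[2] + intercept for s in test]
        observed += [s[3] for s in test]
    return pearson_r(predicted, observed)


# Toy data (hypothetical): five images, every pair searched once, with
# response times that are an exact linear function of the predictor.
toy = [(i, j, float(j - i), 2.0 * (j - i) + 1.0)
       for i in range(5) for j in range(i + 1, 5)]
r_cv = leave_one_image_out(toy)
```

      Because the toy response times are exactly linear in the predictor, the held-out correlation here is essentially 1. With real data, a held-out correlation well above chance is what would argue against overfitting.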

      (2) On page 11, lines 214-221. It says: "these findings are non-trivial for several reasons". However, the first reason is confusing. It is unclear to me why "it suggests that there are highly specific computations that can be performed on perceptual space to solve oddball tasks". In fact, these two sentences provide no specific explanation for the results.

      We have now revised the text to make it clearer (Results, p. 11).

      (3) The second reason is interesting. Reaction times in target-present trials can be easily explained by target-distractor similarity. But why does reaction time vary substantially across target-absent stimuli? One possible explanation is that the objects that are distant from the feature distribution elicit shorter reaction times. Here, all objects constitute a statistical distribution in the feature (perceptual) space. There is certainly a mean of this distribution. Some objects look like outliers and these outliers elicit shorter reaction times in the target-absent trials because outlier detection is very salient.

      One might argue that the above account is merely a rephrasing of the idea of visual homogeneity proposed in this study. If so, feature saliency is not a new account. In other words, the idea of visual homogeneity is another way of reiterating the old feature saliency theory.

      Thank you for this interesting point. We don’t necessarily see a contradiction. However, we are proposing a quantitative decision variable that the brain could be using to make target present/absent judgements.

      (4) One way to reject the feature saliency theory is to compare the reaction times of the objects that are very different from other objects (i.e., no surrounding objects in the perceptual space, e.g., the wheel in the lower right corner of Fig. 2B) with the objects that are surrounded by several similar objects (e.g., the horse in the upper part of Fig. 2B). Also, please choose the two objects with similar distance from the reference point. I predict that the latter will elicit longer reaction times because they can be easily confounded by surrounding similar objects (i.e., four-legged horses can be easily confounded by four-legged dogs). If the density of object distribution per se influences the visual homogeneity score, I would say that the "visual homogeneity" is essentially another way of describing the distributional density of the perceptual space.

      We agree with you, and we have indeed found that visual homogeneity estimates from our model are highly correlated with the average distance of an object relative to all other objects. However, we performed several additional experiments to elucidate the nature of target-absent response times. We find that they are unaffected by whether these searches are performed in the midst of similar or dissimilar objects (Section S4, Experiment S6), and even when the same searches are performed among nearby sets of objects with completely uncorrelated average distances (Section S4, Experiment S7). We have now reworked the text to make this clearer.

      (5) The searchlight analysis looks strange to me. One can easily perform a parametric modulation by setting visual homogeneity as the trial-by-trial parametric modulator and reaction times as a covariate. This parametric modulation produces a brain map with the correlation of every voxel in the brain. On page 17 lines 340-343, it is unclear to me what the "mean activation" is.

      We have done something similar. For each region, we took the mean activation at each voxel to be the average activation over its 3x3x3 voxel neighborhood, and computed its correlation with visual homogeneity. We have now reworked this to make it clearer (Results, p. 16).
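
      The searchlight computation described above can be sketched as follows. This is a minimal illustration on toy data. It assumes one activation volume per image and neighborhoods clipped at the volume boundary, and the function names are hypothetical. It is not the analysis code used in the study.

```python
import math
from itertools import product


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def neighborhood_mean(volume, center, shape):
    """Mean of the 3x3x3 neighborhood around `center`, clipped at the edges."""
    cx, cy, cz = center
    values = []
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        x, y, z = cx + dx, cy + dy, cz + dz
        if 0 <= x < shape[0] and 0 <= y < shape[1] and 0 <= z < shape[2]:
            values.append(volume[x][y][z])
    return sum(values) / len(values)


def searchlight_correlation(volumes, homogeneity, shape):
    """Correlate neighborhood-mean activation with homogeneity at each voxel.

    volumes: one nested list volume[x][y][z] per image.
    homogeneity: one score per image, in the same order as `volumes`.
    """
    result = {}
    for center in product(range(shape[0]), range(shape[1]), range(shape[2])):
        means = [neighborhood_mean(v, center, shape) for v in volumes]
        result[center] = pearson(means, homogeneity)
    return result


# Toy data (made up): 5 images, one 4x4x4 volume per image. By construction,
# activation scales negatively with the score for x < 2 and positively for x >= 2.
shape = (4, 4, 4)
homogeneity = [0.2, 0.5, 0.9, 1.4, 2.0]
volumes = [
    [[[s * (x - 1.5) + 0.1 * y for z in range(shape[2])]
      for y in range(shape[1])]
     for x in range(shape[0])]
    for s in homogeneity
]
corr_map = searchlight_correlation(volumes, homogeneity, shape)
```

      The result is a map with one correlation value per voxel, which could then be thresholded or averaged within a region of interest.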

      Minor points

      (1) In the intro, it says: "using simple neural rules...". It is unclear what "neural rules" means here. It would be better to change it to "computational principles" or "neural network models".

      We have now replaced this with “using well-known principles governing multiple object representations”.

      (2) In the intro, it says: "while machine vision algorithms are extremely successful in solving feature-based tasks like object categorization (Serre, 2019), they struggle to solve these generic tasks (Kim et al., 2018; Ricci et al., 2021)". These are not generic tasks. They are just a specific type of visual task: judging relationships between multiple objects. Moreover, a large number of studies in machine vision have shown that DNNs are capable of solving these tasks and even more difficult tasks. Two survey papers are listed here.

      Wu, Q., Teney, D., Wang, P., Shen, C., Dick, A., & Van Den Hengel, A. (2017). Visual question answering: A survey of methods and datasets. Computer Vision and Image Understanding, 163, 21-40.

      Małkiński, M., & Mańdziuk, J. (2022). Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices. arXiv preprint arXiv:2201.12382.

      Thank you for sharing these references. In fact, a recent study has shown that specific deep networks can indeed solve the same-different task (Tartaglini et al., 2023). However, our broader point remains that the same-different task and other such visual tasks are non-trivial for machine vision algorithms.

      Reviewer #1 (Recommendations For The Authors):

      Nothing to add to the public review. If my concerns turn out to be invalid, I apologize and will happily accept correction. If they are valid, I hope they will point toward a new version of this paper that optimizes the insights to be gained from this impressive dataset.

      Reviewer #2 (Recommendations For The Authors):

      My suggestions are as follows:

      (1) Analyze the fMRI data using the parametric modulation approach first at the single-subject level and then perform group analysis.

      To clarify, we have obtained image-level activations from each subject, and used it for all our analyses.

      (2) Think about a way to redefine visual homogeneity from a purely image-computable approach. In other words, visual homogeneity should be first defined as an image feature that is independent of any empirical response data. And then use the visual homogeneity scores to predict reaction times.

      While we understand what you mean, any image-computable representation such as from a deep network may carry its own biases and may not be an accurate representation of the underlying object representation. By contrast, neural dissimilarities in the visual cortex are strongly predictive of visual search oddball response times. That is why we used visual search oddball response times as a proxy for the underlying neural representation, and then asked whether some decision variable can be derived from this representation to explain both target present and absent judgements in visual search.

    2. Reviewer #3 (Public Review):

      Summary:

      This study proposes visual homogeneity as a novel visual property that enables observers to perform several seemingly disparate visual tasks, such as finding an odd item, deciding if two items are the same, or judging if an object is symmetric. In Exp 1, the reaction times on several objects were measured in human subjects. In Exp 2, visual homogeneity of each object was calculated based on the reaction time data. The visual homogeneity scores predicted reaction times. This value was also correlated with the BOLD signals in a specific region anterior to LO. Similar methods were used to analyze reaction time and fMRI data in a symmetry detection task. It is concluded that visual homogeneity is an important feature that enables observers to solve these two tasks.

      Strengths:

      (1) The writing is very clear. The presentation of the study is informative.

      (2) This study includes several behavioral and fMRI experiments. I appreciate the scientific rigor of the authors.

      Weaknesses:

      (1) My main concern with this paper is the way visual homogeneity is computed. On page 10, lines 188-192, it says: "we then asked if there is any point in this multidimensional representation such that distances from this point to the target-present and target-absent response vectors can accurately predict the target-present and target-absent response times with a positive and negative correlation respectively (see Methods)". This is also true for the symmetry detection task. If I understand correctly, the reference point in this perceptual space was found by deliberately satisfying the negative and positive correlations in response times. And then on page 10, lines 200-205, it shows that the positive and negative correlations actually exist. This logic is confusing. The positive and negative correlations emerge only because this method is optimized to do so. It seems more reasonable to identify the reference point of this perceptual space independently, without using the reaction time data. Otherwise, the inference process sounds circular. A simple way is to just use the mean point of all objects in Exp 1, without any optimization towards reaction time data.

      (2) Visual homogeneity (at least in its current form) is an unnecessary term. It is similar to distractor heterogeneity/distractor variability/distractor statistics in the literature. However, the authors attempt to claim it as a novel concept. The title is "visual homogeneity computations in the brain enable solving generic visual tasks". The last sentence of the abstract is "a NOVEL IMAGE PROPERTY, visual homogeneity, is encoded in a localized brain region, to solve generic visual tasks". In the significance, it is mentioned that "we show that these tasks can be solved using a simple property WE DEFINE as visual homogeneity". If the authors agree that visual homogeneity is not new, I suggest a complete rewrite of the title, abstract, significance, and introduction.

      (3) Also, "solving generic tasks" is another overstatement. The oddball search tasks, same-different tasks, and symmetric tasks are only a small subset of many visual tasks. Can this "quantitative model" solve motion direction judgment tasks, visual working memory tasks? Perhaps so, but at least this manuscript provides no such evidence. On line 291, it says "we have proposed that visual homogeneity can be used to solve any task that requires discriminating between homogeneous and heterogeneous displays". I think this is a good statement. A title that says "XXXX enable solving discrimination tasks with multi-component displays" is more acceptable. The phrase "generic tasks" is certainly an exaggeration.

      (4) If I understand it correctly, one of the key findings of this paper is "the response times for target-present searches were positively correlated with visual homogeneity. By contrast, the response times for target-absent searches were negatively correlated with visual homogeneity" (lines 204-207). I think the authors have already acknowledged that the positive correlation is not surprising at all because it reflects the classic target-distractor similarity effect. But the authors claim that the negative correlation in target-absent searches is the truly novel finding.

      (5) I would like to make it clear that this negative correlation is not new either. The seminal paper by Duncan and Humphreys (1989) has clearly stated that "difficulty increases with increased similarity of targets to nontargets and decreased similarity between nontargets" (the sentence in their abstract). Here, "similarity between nontargets" is the same as the visual homogeneity defined here. Similar effects have been shown in Duncan (1989) and Nagy, Neriani, and Young (2005). See also the inconsistent results in Nagy & Thomas (2003) and Vincent, Baddeley, Troscianko, & Gilchrist (2009).

      More recently, Wei Ji Ma has systematically investigated the effects of heterogeneous distractors in visual search. I think the introduction part of Wei Ji Ma's paper (2020) provides a nice summary of this line of research.

      I am surprised that these references are not mentioned at all in this manuscript (except Duncan and Humphreys, 1989).

      (6) If the key contribution is the quantitative model, the study should be organized in a different way. Although the findings of positive and negative correlations are not novel, it is still good to propose new models to explain classic phenomena. I would like to mention the three studies by Wei Ji Ma (see below). In these studies, Bayesian observer models were established to account for trial-by-trial behavioral responses. These computational models can also account for the set-size effect, behavior in both localization and detection tasks. I see much more scientific rigor in their studies. Going back to the quantitative model in this paper, I am wondering whether the model can provide any qualitative prediction beyond the positive and negative correlations? Can the model make qualitative predictions that differ from those of Wei Ji's model? If not, can the authors show that the model can quantitatively better account for the data than existing Bayesian models? We should evaluate a model either qualitatively or quantitatively.

      (7) In my opinion, one of the advantages of this study is the fMRI dataset, which is valuable because previous studies did not collect fMRI data. The key contribution may be the novel brain region associated with display heterogeneity. If this is the case, I would suggest using a more parametric way to measure this region. For example, one can use Gabor stimuli and systematically manipulate the variations of multiple Gabor stimuli, the same logic also applies to motion direction. If this study uses static Gabor, random dot motion, object images that span from low-level to high-level visual stimuli, and consistently shows that the stimulus heterogeneity is encoded in one brain region, I would say this finding is valuable. But this sounds like another experiment. In other words, it is insufficient to claim a new brain region given the current form of the manuscript.

      REFERENCES

      - Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96(3), 433-458. doi: 10.1037/0033-295x.96.3.433
      - Duncan, J. (1989). Boundary conditions on parallel processing in human vision. Perception, 18(4), 457-469. doi: 10.1068/p180457
      - Nagy, A. L., Neriani, K. E., & Young, T. L. (2005). Effects of target and distractor heterogeneity on search for a color target. Vision Research, 45(14), 1885-1899. doi: 10.1016/j.visres.2005.01.007
      - Nagy, A. L., & Thomas, G. (2003). Distractor heterogeneity, attention, and color in visual search. Vision Research, 43(14), 1541-1552. doi: 10.1016/s0042-6989(03)00234-7
      - Vincent, B., Baddeley, R., Troscianko, T., & Gilchrist, I. (2009). Optimal feature integration in visual search. Journal of Vision, 9(5), 15-15. doi: 10.1167/9.5.15
      - Singh, A., Mihali, A., Chou, W. C., & Ma, W. J. (2023). A Computational Approach to Search in Visual Working Memory.
      - Mihali, A., & Ma, W. J. (2020). The psychophysics of visual search with heterogeneous distractors. bioRxiv, 2020-08.
      - Calder-Travis, J., & Ma, W. J. (2020). Explaining the effects of distractor statistics in visual search. Journal of Vision, 20(13), 11-11.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors provide convincing experimental evidence of extended motivational signals encoded in the mouse anterior cingulate cortex (ACC) that are implemented by the orbitofrontal cortex (OFC)-to-ACC signaling during learning. The results are valuable to the field of motivation and cognition. The experimental methods used were state-of-the-art. The manuscript would further benefit from theory-driven analyses to inform a mechanistic understanding, particularly for the single-cell calcium imaging results. These results will be of interest to those interested in cortical function, learning, and/or motivation.

      We thank the reviewers for their thoughtful reading of our paper and providing constructive feedback. We have made the relevant changes to the manuscript to improve the writing and figures. We provide responses below to each of the reviewer’s comments.

      Reviewer #1 (Public Review):

      (1) An important conclusion (Figure 4) is that when mice are trained to run through no reward (N) cues in order to reach reward (R) cues, the OFC neurons projecting to ACC each respond to different specific events in a manner that ensures that collectively they tile the extended behavioural sequence. What I was less sure of was whether the ACC neurons do the same or not. Figure 3 suggests that on average ACC neurons maintain activity across N cues in order to get to R cues but I was not sure whether this was because all individual neurons did this or whether some had activity patterns like the OFC neurons projecting to ACC.

      We agree that it remains uncertain what individual ACC neurons do during the extended behavioral sequence. We now include a few sentences in the discussion about what we hypothesize, as we did not perform the cellular resolution imaging to determine this:

      “While we did not perform single-cell imaging of ACC in our task, we hypothesize that individual ACC neurons could encode the distribution of actions/opportunities47 (i.e. stop, run, lick, suppress lick) taken during R or N cues. ACC neurons could compute the relative value of the action taken such that more ACC neurons become recruited once mice learn to run out of N cues. The sustained increase in bulk ACC activity across N cue trials (Figure 2) could come from a stable sequence of individual neurons that encode the timescale of the actions taken. In this way, OFC projections would encode current motivation across N cues before learning, which then triggers ACC to compute the value-based actions. Motivational signals in OFC would thus represent state since past rewards/goals, while in ACC these signals represent actions taken to pursue rewards/goals in the future.”

      (2) Figure 1 versus Figure 2: There does not seem to be a particular motivation for whether chemogenetic inactivation or optogenetic inhibition were used in different experiments. I think that this is not problematic but, if I am wrong and there were specific reasons for performing each experiment in a certain way, then further clarification as to why these decisions were made would be useful. If there is no particular reason, then simply explaining that this is the case might stop readers from seeking explanations.

      Thank you for this comment; we agree that clarification on this is important. We performed chemogenetic inhibition of ACC in Figure 1 to take a broad survey of behavioral effects throughout a 40-min long behavioral session, and performed optogenetic inhibition in Figure 2 because we wanted to restrict our inhibition to the few seconds of cue presentation during a behavioral session and across days. Furthermore, we wanted to combat any potential off-target effects that would come from repeated administration of CNO over the several days of training (Manvich et al., 2018). We have included a couple of sentences on page 4 to clarify this:

      “We proceeded to test whether these motivation related signals in ACC are required for learning. To restrict our inhibition to cue presentation portions of our task, and combat any potential off-target effects of CNO31 from repeated administration across several days of training, we used optogenetic inhibition.”

      (3) P5, paragraph 2. The authors argue that OFC and anteromedial (AM) thalamic inputs into ACC are especially important for mediating motivation through N cues in order to reach R cues. Is this based on a statistical comparison between the activity in OFC or AM inputs as opposed to the other inputs?

      We determined that OFC and AM thalamic inputs to ACC are particularly important by comparing the pre-cue activity in a reward-no reward-reward trial sequence (RNR; Figure 3B). Specifically, we performed paired t-tests comparing pre-cue activity between N and R cues, and found a statistically significant increase for R cues but only for the OFC and AM inputs, not for the BLA or LC inputs.
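
      For readers unfamiliar with the test, the paired comparison described above can be sketched as follows. This is a minimal illustration with made-up numbers. It assumes one pair of pre-cue activity values per animal and computes only the t statistic and degrees of freedom (a p-value would require the t distribution, for example from a statistics library). The function name is hypothetical.

```python
import math


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample variance of the paired differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1


# Made-up pre-cue activity (arbitrary units), one pair per animal (assumption).
pre_cue_r = [1.4, 1.1, 1.8, 1.3, 1.6]
pre_cue_n = [1.0, 0.9, 1.2, 1.1, 1.0]
t, df = paired_t_statistic(pre_cue_r, pre_cue_n)
print(f"t({df}) = {t:.2f}")
```

      Running the same comparison separately for each input (OFC, AM, BLA, LC) would show which inputs carry a reliable difference between R and N cues.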

      (4) P3, paragraph 2. Some papers by Khalighinejad and colleagues (eg Neuron 2020, Current Biology, 2022) might be helpful here in as much as they assess ACC roles in determining action frequency, initiation, and speed and mediating the relationship between reward availability and action frequency and speed.

      We thank the reviewer for bringing these relevant papers to our attention. We have included these papers in our citations in this paragraph.

      (5) Paragraph 1 "This learning is of a more deliberate, informed nature than habitual learning, as they are sensitive to the current value of outcomes and can lead to a novel sequence of actions for a desired outcome1-3." Should "they" be "it"?

      This is correct, we have edited this in the manuscript.

      Reviewer #2 (Public Review):

      Impact:

      The findings will be valuable for further research on the impact of motivational states on behaviour and cognition. The authors provided a promising concept of how persistent motivational states could be maintained, as well as established a novel, reproducible task assay. While experimental methods used are currently state-of-the-art, theoretical analysis seems to be incomplete/not extensive.

      We thank the reviewer for these comments. In our paper, we performed single-cell calcium imaging of OFC projection neurons to ACC to build a mechanistic understanding for the bulk ramp-like response we identified in these neurons with photometry. We identified ensembles of neurons that tile sequences of trials that match the bulk response, in particular a subset of neurons that are active at the time a reward (R) cue is reached after 2 no-reward (N) cues. We included a paragraph in the discussion outlining future theory-driven analyses that could address how computation is achieved by OFC projection neurons:

      “We linked the ramp-like increase in neural activity in OFC to motivation, but several questions still remain about how motivation is computed and why it would be represented as a ramp. Motivation could be computed as a combination of several variables such as time since last reward, value of reward, and effort to reach future rewards. Future theory-driven analyses could determine how motivation is computed, and whether individual variables of time, value, and effort, are encoded as clusters of similar tuned neurons, or mixed and collectively represented at the population level. In either case, it is likely that a combined map of task space and value-information carried by OFC are being used to inform downstream regions, such as ACC, for adjusting behavior.”

      Reviewer #2 (Recommendations for the Authors):

      Overall, the layout of the figures seems a little bit chaotic and makes it hard to understand the boundaries between panels.

      We agree that the figure layout could be improved upon to aid the reader in moving from panel to panel. We have edited two of the main figures with layouts that are most irregular (Figures 2 and 4) to help with this.

      Figures/text should include the promoters used for protein expression so that readers understand which cell types would be affected.

      We have made sure to edit the figures to include the promoter of the viruses we used, and edited the text to include both the AAV serotype and promoter.

      Discuss why it is necessary for multiple prefrontal areas to be involved in maintaining motivational signals.

      We thank the reviewer for this comment. We believe that prefrontal areas would be recruited as tasks to study motivational states become more complex and require animals to keep track of task structure and perform value-guided actions. We have included a couple of sentences in the final paragraph of the discussion about this:

      “Our work showed the recruitment of multiple frontal cortical areas in this process, which is to be expected as animals are required to build, maintain, and use representations of task structure and value to drive learned, motivated behaviors47. Future work can build upon the task we developed here to determine how the frontal cortex maintains motivational states across many more cue-outcome associations, and how these associations may dynamically change across time48”.

      Additionally, we included a short discussion on how motivational signals differ between OFC and ACC in our work. We suggest OFC encodes current motivation before and after learning, which then leads ACC to represent learned actions taken and thus have a longer timescale motivational response (see response to Reviewer 1).

      Minor: Page 4, Line 1: "increase" instead of "increases".

      This is correct, we have edited this in the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides important insights into the role of neurexins as regulators of synaptic strength and timing at the glycinergic synapse between neurons of the medial nucleus of the trapezoid body and the lateral superior olive, key components of the auditory brainstem circuit involved in computing sound source location from differences in the intensity of sounds arriving at the two ears. Through an elegant combination of genetic manipulation, fluorescence in-situ hybridization, ex vivo slice electrophysiology, pharmacology, and optogenetics, the authors provide convincing evidence to support their claims. While further work is needed to reveal the mechanistic basis by which neurexins influence glycinergic neurotransmission, this work will be of interest to both auditory and synaptic neuroscientists.

      We appreciate the recognition of the significance of our study in shedding light on the role of neurexins in regulating synaptic strength and timing at the glycinergic synapse. Indeed, further investigations are warranted to delve deeper into the specific role of each different variant of neurexins in the future. We hope that our work will spark more interest and collaboration in unraveling the complexities of molecular codes of synaptic function.

      Public Reviews:

      Reviewer #1 (Public Review):

      Jiang et al. demonstrated that ablating Neurexins results in alterations to glycinergic transmission and its calcium sensitivity, utilizing a robust experimental system. Specifically, the authors employed rAAV-Cre-EGFP injection around the MNTB in Nrxn1/2/3 triple conditional mice at P0, measuring Glycine receptor-dependent IPSCs from postsynaptic LSO neurons at P13-14. Notably, the authors presented a clear reduction of 60% and 30% in the amplitudes of opto- and electric stimulation-evoked IPSCs, respectively. Additionally, they observed changes in kinetics, alterations in PPR, and sensitivity to lower calcium and the calcium chelator, EGTA, indicating solid evidence for changes in presynaptic properties of glycinergic transmission.

      Furthermore, the authors uncovered an unexpected increase in sIPSC frequency without altering amplitude. Despite the reduction in evoked IPSC, immunostaining revealed an increase in GlyT2 and VGAT in TKO mice, supporting the notion of an increase in synapse number. However, the reviewer expresses caution regarding the authors' conclusion that "glycinergic neurotransmission likely by promoting the synapse formation/maintenance, which is distinct from the phenotypes observed in glutamatergic and GABAergic neurons (Chen et al., 2017; Luo et al., 2021)", as outlined in lines 173-175. The reviewer suggests that this statement may be overstated, pointing out the authors' own discussion in lines 254-265, which acknowledges multiple possibilities, including the potential that the increase in synapses is a consequence rather than a causal effect of Nrxn deletion.

      We appreciate the reviewer’s thoughtful evaluation of our study. We agree that our conclusion regarding the promotion of synapse formation/maintenance may have been overstated and recognize the need for a more nuanced interpretation of our findings. Accordingly, we have revised our interpretation by carefully discussing, in lines 256-266, the various possibilities that may explain the observed increase in synapse number.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Jiang et al. explore the role of neurexins at glycinergic MNTB-LSO synapses. The authors utilize elegant and compelling ex vivo slice electrophysiology to assess how the genetic conditional deletion of Nrxns1-3 impacts inhibitory glycinergic synaptic transmission and find that TKO of neurexins reduced electrically and optically evoked IPSC amplitudes, slowed optically evoked IPSC kinetics and reduced presynaptic release probability. The authors use classic approaches including reduced [Ca2+] in ACSF and EGTA chelation to propose that changes in these evoked properties are likely driven by the loss of calcium channel coupling. Intriguingly, while evoked transmission was impaired, the authors reported that spontaneous IPSC frequency was increased, potentially due to an increased number of synapses in LSO. Overall, this manuscript provides important insight into the role of neurexins at the glycinergic MNTB-LSO synapse and further emphasizes the need for continued study of both the non-redundant and redundant roles of neurexins.

      We thank the reviewer for the strong comments and support of our work.

      Strengths:

      This well-written manuscript seamlessly incorporates mouse genetics and elegant ex vivo electrophysiology to identify a role for neurexins in glycinergic transmission at MNTB-LSO synapses. Triple KO of all neurexins reduced the amplitude and timing of evoked glycinergic synaptic transmission. Further, spontaneous IPSC frequency was increased. The evoked synaptic phenotype is likely a result of reduced presynaptic calcium coupling while the spontaneous synaptic phenotype is likely due to increased synapse numbers. While neuroligin-4 has been identified at glycinergic synapses, this study, to the best of my knowledge, is the first to study Nrxn function at these synapses.

      We again appreciate the positive feedback on the strengths of our study. We agree that the observed reduction in evoked synaptic transmission and the increase in spontaneous IPSC frequency provide intriguing insights into the function of neurexins in regulating glycinergic synaptic activity.

      Weaknesses:

      The data are compelling and report an intriguing functional phenotype. The role of neurexins in redundantly controlling calcium channel coupling has been previously reported. Mechanistic insight would significantly strengthen this study.

      We wholeheartedly agree with the reviewer that understanding how neurexins control calcium channel coupling at the presynaptic active zone is crucial for elucidating their role in synaptic transmission. While our current study has provided compelling evidence for the functional phenotypes of pan-neurexin deletion, we recognize the importance of investigating the underlying molecular mechanisms in future research. Exploring these mechanisms would undoubtedly enhance our understanding of neurexin function at various synapses and contribute to advancing the field.

      The claim that triple KO of Nrxns from MNTB increases the number of synapses in LSO is not strongly supported.

      We agree. Echoing the suggestion made by reviewer 1 (as mentioned above), we acknowledge that the claim regarding the increase in synapse numbers in the LSO following the triple knockout of neurexins from the MNTB was overstated. Consequently, we have revised our conclusions accordingly.

      Despite the stated caveats of measuring electrically evoked currents and the more robust synaptic phenotypes observed using optically evoked transmission, the authors rely heavily on electrical stimulation for most measurements.

      We acknowledge that optogenetic stimulation offers crucial advantages, and we have provided a balanced discussion of the caveats associated with both methods in our manuscript. Additionally, we have conducted new optogenetic experiments specifically for measuring the paired-pulse ratio in control and Nrxn123 TKO mice. These results have been included as a new supplementary figure (Figure S2).

      For experiments involving EGTA and low Ca2+ manipulations, we opted for electrical stimulation due to concerns regarding potential side effects of optogenetics, including the phototoxicity and photobleaching during prolonged light exposure.
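
      For reference, the paired-pulse ratio mentioned above is commonly computed as the peak amplitude of the second evoked IPSC divided by that of the first. The sketch below uses made-up amplitudes and assumes the ratio of trial-averaged peaks (one of several conventions). The function name is hypothetical and this is not the analysis code used in the study.

```python
def paired_pulse_ratio(first_peaks, second_peaks):
    """Ratio of mean second-pulse to mean first-pulse peak amplitude.

    Amplitudes are taken as absolute values so that the sign convention of
    the recording (inward vs. outward current) does not affect the ratio.
    """
    if not first_peaks or len(first_peaks) != len(second_peaks):
        raise ValueError("need matching, non-empty lists of peak amplitudes")
    mean_first = sum(abs(a) for a in first_peaks) / len(first_peaks)
    mean_second = sum(abs(a) for a in second_peaks) / len(second_peaks)
    return mean_second / mean_first


# Made-up IPSC peak amplitudes in pA across three trials.
example_ppr = paired_pulse_ratio(
    [-500.0, -480.0, -520.0],  # first pulse
    [-400.0, -390.0, -410.0],  # second pulse
)
print(f"PPR = {example_ppr:.2f}")
```

      A higher ratio in one group than in another is typically read as a lower initial release probability, which is the interpretation applied to the TKO data.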

      The differential expression of individual neurexins might indicate that specific neurexins may dominantly regulate synaptic transmission, however, this possibility is not discussed in detail.

      We thank the reviewer for bringing up this important point. The differential expression of individual neurexins indeed suggests that specific neurexins may play dominant roles in regulating synaptic transmission. While our study primarily focused on the collective impact of ablating all neurexins, we acknowledge the significance of exploring the specific contributions of individual neurexin isoforms in the future. Understanding the distinct roles of each neurexin isoform could provide valuable insights into the precise mechanisms underlying synaptic function and plasticity. We have added a discussion of this point to our revised manuscript (lines 223-230).

      Reviewer #3 (Public Review):

      Summary:

      The authors investigate the hypothesis that neurexins serve a crucial role as regulators of the synaptic strength and timing at the glycinergic synapse between neurons of the medial nucleus of the trapezoid body (MNTB) and the lateral superior olivary complex (LSO). It is worth mentioning that LSO neurons are an integration station of the auditory brainstem circuit displaying high reliability and temporal precision. These features are necessary for computing interaural cues to derive sound source location from comparing the intensities of sounds arriving at the two ears. In this context, the authors' findings build up according to the hypothesis first by displaying that neurexins were expressed in the MNTB at varying levels. They followed this up with the deletion of all neurexins in the MNTB through the employment of a triple knock-out (TKO). Using electrophysiological recordings in acute brainstem slices of these TKO mice, they gathered solid evidence for the role of neurexins in synaptic transmission at this glycinergic synapse primarily by ensuring tight coupling of Ca2+ channels and vesicular release sites. Additionally, the authors uncovered a connection between the deletion of neurexins and a higher number of glycinergic synapses in TKO mice, for which they provided evidence in the form of immunostainings and related it to electrophysiological data on spontaneous release. Consequently, this investigation expands our knowledge on the molecular regulation of synaptic transmission at glycinergic synapses, as well as on the auditory processing at the level of the brainstem.

      Strengths:

      The authors demonstrate substantial results in support of the hypothesis of a critical role of neurexins for regulating glycinergic transmission in the LSO using various techniques. They provide evidence for the expression of neurexins in the MNTB and consecutively successfully generate and characterize the neurexin TKO. For their study on LSO IPSCs the authors transduced MNTB neurons by co-injection of virus-carrying Cre and ChR2 and subsequently optogenetically evoke release of glycine. As a result, they observed a significant reduction in amplitude and significantly slower rise and decay times of the IPSCs of the TKO in comparison with control mice in which MNTB neurons were only transduced with ChR2. Furthermore, they observed an increased paired pulse ratio (PPR) of LSO IPSCs in the TKO mice, indicating lower release probability. Elaborating on the hypothesis that neurexins are essential for the coupling of synaptic vesicles to Ca2+ channels, the authors show lowered Ca2+ sensitivity in the TKO mice. Additionally, they reveal convincing evidence for the connection between the increased frequency of spontaneous IPSC and the higher number of glycinergic synapses of the LSO in the TKO mice, revealed by immunolabeling against the glycinergic presynaptic markers GlyT2 or VGAT.

      We thank the reviewer for the thoughtful and thorough evaluation of the significance of investigating the role of neurexins in glycinergic transmission at the MNTB-LSO synapse, particularly in the context of auditory processing and sound localization. The positive feedback is greatly appreciated.

      Weaknesses:

      The major concern is novelty as this work on the effects of pan-neurexin deletion in a glycinergic synapse is quite consistent with the authors' prior work on glutamatergic synapses (Luo et al., 2020). The authors might want to further work out novel aspects and strengthen the comparative perspective. Conceptually, the authors might want to be more clear about interpreting the results on the altered dependence of release on voltage-gated Ca2+ influx (Ca2+ sensitivity, coupling).

      Regarding the reviewer’s concern about the novelty of our work, we acknowledge that our previous work explored the effects of pan-neurexin deletion on glutamatergic synapses (Luo et al., 2020). However, we would like to point out that the novelty of the present study stems from exploring how different types of synapses converge on the same mechanism of synaptic function, particularly in the context of neurexin-mediated regulation. Whereas our previous study focused on glutamatergic synapses, the current study examines glycinergic synapses, which represent a distinct population with unique properties and functions. Despite the differences between these synapse types, our findings reveal a commonality in the underlying mechanisms of synaptic regulation mediated by neurexins. This convergence of mechanisms across different synapse types highlights the fundamental role of neurexins in synaptic function and plasticity. By elucidating how neurexins regulate synaptic transmission at both excitatory and inhibitory synapses, we provide valuable insights into the general principles governing synaptic function. In addition, this comparative perspective may shed light on the complex interplay between excitatory and inhibitory neurotransmission, which is crucial for maintaining the balance of neuronal activity and network dynamics.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      During the developmental period spanning P3-P12, the MNTB-LSO synapses undergo a transition from GABAergic to glycinergic transmission. It is well-established that Neurexin plays a role in modulating GABAergic transmission. In the authors' experimental system, AAV was injected at P0, likely impacting GABAergic transmission, including potentially influencing synapse number, before subsequently affecting glycinergic transmission. A thoughtful discussion of how the experimental interventions might have influenced this developmental process and glycinergic transmission would enhance the clarity and interpretation of their findings.

      We thank the reviewer for raising the interesting topic of the transmitter switch during neurodevelopment. Strong evidence from gerbils and rats demonstrates that the MNTB-LSO synapses shift from GABAergic to glycinergic transmission during early development. However, in a more recent study by Friauf and colleagues (Fisher et al., 2019), patch-clamp recordings in acute mouse brainstem slices at P4-P11, combined with pharmacological blockade of GABAA receptors and/or glycine receptors, clearly demonstrated no GABAergic synaptic component on LSO principal neurons, suggesting that the transmitter switch may be species-specific. We have added a discussion to the revised manuscript to clarify this topic.

      Reviewer #2 (Recommendations For The Authors):

      The data are compelling and report an intriguing functional phenotype. Mechanistic insight into how this phenotype manifests would significantly strengthen this study. For example, which neuroligin is found at these MNTB-LSO synapses?

      We agree that investigating the underlying molecular mechanisms, particularly the specific function of each variant of neurexins and their respective ligands on the postsynaptic neurons, is crucial. Exploring these mechanisms, which extend beyond the scope of our current study, would undoubtedly enhance our understanding of neurexin function at various synapses and foster advancements in the field.

      Does the TKO alter the ability of MNTB inputs to induce AP firing in LSO neurons?

      Activation of the MNTB inputs does not directly induce AP firing in LSO neurons, because the MNTB-LSO synapses are glycinergic and serve to inhibit neuronal activity.

      We think the reviewer meant to ask whether pan-neurexin deletion in the MNTB neurons alters their ability to influence the firing of LSO neurons. Indeed, the weakening of glycinergic transmission due to pan-neurexin ablation in MNTB neurons could potentially alter the excitation-inhibition (E/I) balance, thereby impacting the overall excitability of LSO neurons. We have conducted preliminary experiments to investigate this aspect and found that the E/I balance at LSO neurons was notably increased in TKO mice. We are currently preparing a manuscript to comprehensively address the role of neurexins at the auditory circuit and behavior levels.

      Additional calcium measurements using GECIs would provide insight into whether nanodomain calcium or total calcium is altered at these synapses.

      We appreciate the valuable suggestion provided by the reviewer. However, distinguishing between Ca2+ nanodomain and Ca2+ microdomain using Ca2+ imaging techniques requires advanced systems such as two-photon STED microscopy, which are beyond the scope of our current research.

      It is unclear why fluorescence intensity is quantified instead of the number of synaptic clusters in LSO. In addition to changes in synapse numbers, fluorescent intensity can indicate a number of other possible morphological changes.

      We appreciate the valuable suggestion from the reviewer. We have re-analyzed our imaging data to compare synaptic density. The results, as included in Fig.3f and 3h, confirm an increase in the number of glycinergic synapses after pan-neurexin deletion.

      The most robust synaptic phenotypes were produced by measuring light-evoked oIPSCs and the authors acknowledge that electrically-evoked eIPSCs might be contaminated by uninfected fibers or by other sources of glycinergic inputs. I suggest that IPSC PPRs, EGTA, and low Ca2+ experiments be performed using optogenetics.

      As discussed in our response to the Public Reviews, we acknowledge that optogenetic stimulation offers crucial advantages, and we have provided a balanced discussion of the caveats associated with both methods in our manuscript. Additionally, following the reviewer’s suggestion, we have conducted new optogenetic experiments specifically for measuring the paired-pulse ratio in control and Nrxn123 TKO mice. We included this new dataset in supplementary Figure S2; it is consistent with our results obtained with electrical fiber stimulation.

      For experiments involving EGTA and low Ca2+ manipulations, we opted for electrical stimulation due to major concerns regarding potential side effects of optogenetics, including the phototoxicity and photobleaching during prolonged light exposure.

      It is sometimes confusing which type of evoked stimulation is being used (e.g. PPR, EGTA, and low Ca2+ experiments). To aid in the interpretations of these experiments, it would help to clarify.

      We appreciate the reviewer's suggestion regarding the clarity of the evoked stimulation methods used in our experiments. We have revised the manuscript to provide clearer descriptions of the specific types of evoked stimulation employed in each experiment. Thank you for guiding us toward this clarification.

      The comparisons to Chen et al 2017 and the senior author's 2020 paper seem disjointed and do not contribute to the findings, which alone, are quite interesting. Given the prevailing notion that neurexins control different synaptic properties depending on the brain region and/or synapse studied, is it surprising that the findings observed here differ from previous studies of different synapses (glutamatergic and GABAergic)?

      By comparing previous studies at different types of neurons/synapses, our findings reveal a commonality in the underlying mechanisms of synaptic regulation mediated by neurexins. This convergence of mechanisms across different synapse types highlights the fundamental role of neurexins in synaptic function and plasticity. In addition, this comparative perspective may shed light on the complex interplay between excitatory and inhibitory neurotransmission, which is crucial for maintaining the balance of neuronal activity and network dynamics.

      Despite Nrxn3 being the most abundant Nrxn mRNA in MNTB neurons, the possible contributions of this highly expressed protein are not discussed.

      We thank the reviewer for bringing up this important point. The differential expression of individual neurexins indeed suggests that specific neurexins may play dominant roles in regulating synaptic transmission. While our study primarily focused on the collective impact of ablating all neurexins, we acknowledge the significance of exploring the specific contributions of individual neurexin isoforms in the future. Understanding the distinct roles of each neurexin isoform could provide valuable insights into the precise mechanisms underlying synaptic function and plasticity. We have added a discussion of this point to the revised manuscript (lines 223-230).

      Reviewer #3 (Recommendations For The Authors):

      • There are several instances of spaces missing and typos, please carefully check the manuscript.

      We greatly appreciate the reviewer's helpful feedback on the text that could be clarified or improved. We have meticulously edited the manuscript to address these concerns.

      • While studying the properties of IPSCs, apart from optogenetic stimulation, the authors performed experiments with electrical fiber stimulation. Their findings showed a slightly significant reduction of the IPSC amplitude and no effect on the IPSC kinetics when comparing the TKO and control. One weakness is the discrepancy between the results from the optogenetic and fiber stimulation experiments, which the authors attribute to inefficient transfection in the fiber stimulation experiments. The authors state that they tried to optimize their virus injection protocols. However, they do not elaborate on how the transfection rates could be improved in the discussion section. Moreover, it would be good to further address the reasons for the difference in amplitude between the control IPSCs in the optogenetic and fiber stimulation experiments.

      Echoing the suggestion by Reviewer 2 (see above), we acknowledge that optogenetic stimulation offers certain advantages, and we have provided a balanced discussion of the caveats associated with both methods in our manuscript. In addition, we have performed a new set of optogenetic experiments to measure the paired-pulse ratio in control and Nrxn123 TKO mice and included the results as a new supplementary figure (Figure S2).

      For experiments involving EGTA and low Ca2+ manipulations, we opted for electrical stimulation due to major concerns regarding potential side effects of optogenetics, including the phototoxicity and photobleaching during prolonged light exposure.

      We have added details of the virus injection strategy that optimized the transfection rates to the Methods section: “To enhance virus infection efficiency, we decreased the dosage per injection while increasing the frequency of injections. Additionally, we ensured the pipette remained immobilized for 20-30 seconds to guarantee virus absorption at injection sites. As a result of this strategy, we estimated that the vast majority of MNTB neurons were inoculated by AAVs.” See lines 288-290.

      • Abstract: "ablation of all neurexins in MNTB neurons reduced not only the amplitude but also altered the kinetics of the glycinergic synaptic transmission at LSO neurons."

      Changed as suggested.

      • Consider revising to "The synaptic dysfunctions primarily resulted from an altered dependence of release on voltage-gated Ca2+ influx."

      We appreciate the reviewer's suggestion, which helps improve the clarity of our manuscript. We have revised the phrasing as follows: "The synaptic dysfunctions primarily resulted from an impaired calcium sensitivity of release and a loosened coupling between voltage-gated calcium channels and synaptic vesicles."

      • Line 39 should be vertebrates.

      Revised as suggested.

      • Line 49 it would sound better to say "which further points to the diverse actions of neurexins in specific neurons."

      Revised as suggested.

      • Line 60 - this paragraph could include information about GABA signaling from the MNTB to the LSO, because on line 113 you mention that LSO neurons receive inhibitory GABAergic/glycinergic inputs, but you do not mention blocking GABA currents to isolate the glycinergic ones.

      We thank the reviewer for the thoughtful and detailed suggestion. We revised the text in line 60 to “In the mature mammalian auditory brainstem” and in line 113, we removed GABAergic to emphasize the nature of glycinergic synapse, particularly in the mouse brainstem where no GABAergic components are found (Fisher et al., 2019).

      • Line 72/73 it should be adeno-associated virus; line 73: "combining this with the RNAScope technique" sounds better.

      Changed as suggested.

      • Line 91 using the RNAScope technique; lines 97, 119 as a control; line 108 the functional organization.

      Changed as suggested.

      • Line 113 should be a pharmacological approach; line 122 optogenetically evoked.

      Changed as suggested.

      • Line 132, 160: the control.

      Changed as suggested.

      • Line 147 thus were infected; line 148 likely to be present but were obscured.

      Changed as suggested.

      • Line 154 which has been routinely used.

      Changed as suggested.

      • Line 155 It is not supposed to be Figure 2h but 2i; following that Figure 2i should be 2j; in my opinion, Figure 2i does not display a strong depression for the TKO mice.

      Changed as suggested.

      • Line 171 a better flow is achieved by saying: together these data show.

      Changed as suggested.

      • EC50 rather than IC50 of [Ca2+].

      Changed as suggested.

      • Line 180 it is better to say "we approached the matter by..."; line 183 while recording.

      Changed as suggested.

      • Line 203 were much stronger than the effect at control synapses; line 206 tightly clustering.

      Changed as suggested.

      • Line 212 sounds like they provide evidence for retina and spinal cord as well, should be made clear.

      Changed as suggested.

      • Line 289 previously.

      Changed as suggested.

      • Line 295 should be 30 min.

      Changed as suggested.

      • Line 336, 337 confocal microscope.

      Changed as suggested.

      • Please provide the number of data points also in figure captions or in the results section.

      Added in the captions as suggested.

      • Line 533, a better phrasing would be: the blocking effect of 0.2 mM Ca on IPSC amplitude.

      Changed as suggested.

      • Explain either in the methods or results section how the EC50 of Ca2+ was calculated.

      Added in the methods as suggested.

    1. Abstract: Most of available reference genomes are lack of the sequence map of sex-limited chromosomes, that make the assemblies uncompleted. Recent advances on long reads sequencing and population sequencing raise the opportunity to assemble sex-limited chromosomes without the traditional complicated experimental efforts. We introduce a computational method that shows high efficiency on sorting and assembling long reads sequenced from sex-limited chromosomes. It will lead to the complete reference genomes and facilitate downstream research of sex-limited chromosomes. Competing Interest Statement: The authors have declared no competing interest.

      Reviewer 3. Arang Rhie

      Comments to Author:

      1. In the introduction, add recent marker-based graph phasing algorithms for long reads, such as hifiasm trio and verkko trio mode, after the T2T-Y. They are different from trio-binning, which tries to phase the reads upfront. Graph-based phasing uses markers to determine haplotype-specific paths to traverse.

      a. T2T-Y chromosome should be referencing Rhie et al., Nature 2023. Verkko is a successor of the manual efforts taken in T2T-Y, which should also be noted in the introduction.

      b. Reference for the sexPhase program is still missing. Also, some rephrasing of the sentence is needed, as the way it is currently written could easily be misunderstood to mean that sexPhase was part of the methods used in the assembly of the T2T-Y.

      2. There are other approaches for phasing genomes taken in plants, for example the polyploid potato phasing using many siblings of the child by Mari et al., bioRxiv 2022.

      3. "But only one male and one female could suffer from sampling error" - this part is unclear. Please clarify.

      4. References for the mason_simulator and badread software are missing.

      5. Provide the accession (HG02982) for the "African human Y" in the main text.

      6. I appreciate that the authors compared assemblies to T2T-Y as I requested before. However, fundamentally, mapping to T2T-Y and comparing the length of each sequence class is comparing apples to oranges, particularly in the heterochromatic region and ampliconic region of the Y. It is known to have variable copy numbers and size differences between two individuals. Frequent inversions have been reported in the ampliconic regions across different Y haplogroups. The number, size, and distribution of the repeat arrays composing the heterochromatic region have been shown to vary among different Y haplogroups in Hallast et al., Nature 2023. This can also be seen in Fig. 3c; the overall depth of the flow sorting in the heterochromatic region is below 1, indicating the Yqh is shorter than T2T-Y, as it is in Fig. 3b. To make the benchmark legitimate, the authors should compare SRY and the flow sorting method using samples from the same individual. HG02982 and HX1 presumably have very different sequence compositions given the diverged population history (African vs. Asian). Comparing the total length of the assembled region against a 3rd different Y haplogroup (HG002Y) makes things more complicated, especially in regions that are known to vary a lot. If the authors think the flow sorting based method needs to be compared, it should be benchmarked on the same individual to make an apples-to-apples comparison. I do agree that the results from read sorting (i.e. portion of reads sequenced from non-Y chromosomes in SRY vs. flow-sorting) are an important finding. However, I'd still argue comparing assemblies from the two different Y haplogroups is a stretch. The authors could have performed the same assembly length comparison on the T2T-Y using results from their SRY sorted reads with Verkko of HG002 vs. Verkko assembly using trio-binned markers.

      7. In the section where assemblies are compared, the authors point to Table 1, which contains results from HG01109. HG01109 has never been mentioned before. I thought the authors were comparing assemblies from SRY sorted reads of HX1? I am not sure why the authors suddenly added a 3rd PUR genome with no context. Was this a mistake? Add results from HX1 to Table 1.

      8. Please add divider lines in Table 1 between All / Ampliconic / X-degenerate / X-transposed / PAR / Het / Others. It is hard to see which rows belong to which category.

      9. In the last result section, where the authors compare results from Verkko, it is unclear how the verkko assembly was run. The authors say "default option", and later "in trio mode" in the methods. Did the authors collect parental reads from HG002 (HG003 and HG004)? How was "trio mode" performed? Did the authors use trio binning to sort the reads, then run Verkko? Or use the homopolymer compressed parental kmers in the Rukki step of Verkko (and this should be benchmarked)? Was the HG002 trio assembly taken from the Rautiainen et al. paper? Please clarify and add the missing parts to the main text and methods.

      10. Related to the above section, it is hard to see in Fig. 4a the "two approximately 1 Mb contigs aligning to the same region of the Y chromosome". An enlarged inset of the dotplot may be helpful. Also, add legends and scale to the X and Y axes of the dotplots.

      11. Note there is a mis-assembly reported on T2T-Y palindrome P5 (https://github.com/marbl/CHM13-issues/blob/main/v2.0_issues.bed), in which the entire P5 should be inverted. I don't see this in the dotplots of Fig. 4.

      12. In the discussion, the authors mention results from the 10 trios that have been removed from the previous results. Please add the 10 trio results to the main text if it was a mistake, or remove the irrelevant results from the Discussion and Supp. Tables.

      13. The authors discuss that the suboptimal performance of SRY in the PAR is caused by the restricted data types. I thought it was caused by the lower density of the markers? The PAR parental marker density was very similar to that of autosomes, with stretches of runs of homozygosity, presumably to maintain enough homology for recombination. What was the marker density in the PAR? Was it below their 7 kmer / 1kb?

      14. The authors mentioned there are no ZW genomes available to test SRY. There is a Zebra finch trio (ZW, female, bTaeGut2) and a male sample (ZZ, male, bTaeGut1) available with HiFi of the child (bTaeGut2) and Illumina of all the genomes from the Vertebrate Genomes Project (Rhie et al., Nature, 2021). Perhaps the authors could apply SRY on this individual, and compare the W chromosome results to what has been released on https://www.genomeark.org/vgp-all/Taeniopygia_guttata.html.

      Re-review: The authors have addressed most of my concerns. The revised manuscript reads much better than before. Regarding my last comment and response from the authors about the W chromosome, I was hoping to see comparable coverage of the W chromosome to the reference, as a proof of principle that SRY could be applied to non-human, highly diverged genomes. The assembly looks very fragmented though. Was it only the similarity to the Z chromosome that caused the fragmentation? Are there no other factors contributing to the discontinuity of the W chromosome? A few minor comments on the revised version follow: 1. Please indicate which genome was compared in the legend of Supp. Table 5. 2. When using et al. notations, please use the last name. Mari et al. should be Serra Mari et al., Mikko et al. should be Rautiainen et al. Also, Serra Mari et al. is now published in Genome Biology: https://doi.org/10.1186/s13059-023-03160-z. Please update the reference. 3. There are a few grammar corrections to make.

    1. Dynamic functional connectivity (dFC) has become an important measure for understanding brain function and as a potential biomarker. However, various methodologies have been developed for assessing dFC, and it is unclear how the choice of method affects the results. In this work, we aimed to study the results variability of commonly-used dFC methods. We implemented seven dFC assessment methods in Python and used them to analyze fMRI data of 395 subjects from the Human Connectome Project. We measured the pairwise similarity of dFC results using several similarity metrics in terms of overall, temporal, spatial, and inter-subject similarity. Our results showed a range of weak to strong similarity between the results of different methods, indicating considerable overall variability. Surprisingly, the observed variability in dFC estimates was comparable to the expected natural variation over time, emphasizing the impact of methodological choices on the results. Our findings revealed three distinct groups of methods with significant inter-group variability, each exhibiting distinct assumptions and advantages. These findings highlight the need for multi-analysis approaches to capture the full range of dFC variation. They also emphasize the importance of distinguishing neural-driven dFC variations from physiological confounds, and developing validation frameworks under a known ground truth. To facilitate such investigations, we provide an open-source Python toolbox that enables multi-analysis dFC assessment. This study sheds light on the impact of dFC assessment analytical flexibility, emphasizing the need for careful method selection and validation, and promoting the use of multi-analysis approaches to enhance reliability and interpretability of dFC studies. Competing Interest Statement: The authors have declared no competing interest.
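      As a rough illustration of what "pairwise similarity of dFC results" can mean in practice, the sketch below estimates a sliding-window correlation between two simulated regional time series with two different window lengths and then compares the two resulting dFC time courses with a Spearman correlation. This is a minimal sketch under stated assumptions, not the toolbox's implementation: the function names, window sizes, and simulated signals are all hypothetical, and the study's seven methods and its similarity metrics are more elaborate.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sliding_window_fc(ts_a, ts_b, half_width, centers):
    """Windowed correlation of two time series around each center index."""
    return [
        pearson(ts_a[c - half_width:c + half_width + 1],
                ts_b[c - half_width:c + half_width + 1])
        for c in centers
    ]


def ranks(values):
    """Rank of each value (ties are not averaged; fine for continuous data)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = float(rank)
    return result


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: two regions that share a common signal plus noise.
rng = random.Random(0)
n = 300
shared = [rng.gauss(0, 1) for _ in range(n)]
region_a = [s + rng.gauss(0, 1) for s in shared]
region_b = [s + rng.gauss(0, 1) for s in shared]

# Two "methods": the same estimator with a short and a long window,
# evaluated at the same time points so the outputs can be compared.
max_half = 30
centers = range(max_half, n - max_half)
short_fc = sliding_window_fc(region_a, region_b, 15, centers)
long_fc = sliding_window_fc(region_a, region_b, max_half, centers)

similarity = spearman(short_fc, long_fc)
print(f"temporal similarity between the two estimates: {similarity:.2f}")
```

      Evaluating both estimates at the same window centers is a design choice made here so that the two time courses have equal length; other alignment schemes are possible.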

      Reviewer 2. Nicolas Farrugia

      Comments to Author: Summary of review

      This paper fills a very important gap in the literature investigating time-varying functional connectivity (or dynamic functional connectivity, dFC), by measuring analytical flexibility of seven different dFC methods. An impressive amount of work has been put in to generate a set of convincing results, which essentially show that the main object of interest of dFC, which is the temporal variability of connectivity, cannot be measured with a high consistency, as this variability is of the same order of magnitude or even higher than the changes observed across different methods on the same data. In this very controversial field, it is very remarkable to note that the authors have managed to put together a set of analyses to demonstrate this in a very clear and transparent way. The paper is very well written, the overall approach is based on a few assumptions that make it possible to compare methods (e.g. subsampling of temporal aspects of some methods, spatial subsampling), and the provided analysis is very complete. The most important results are condensed in a few figures in the main manuscript, which is enough to convey the main messages. The supplementary materials provide an exhaustive set of additional results, which are shortly discussed one by one. Most importantly, the authors have provided an open source implementation of 7 main dFC methods. This is very welcome for the community and for reproducibility, and is of course particularly suited for this kind of contribution. A few suggestions follow.

      Clarification questions and suggestions:

      1- How was the uniform downsampling of 286 ROI to 96 done? Uniform in which sense? According to the RSN? Were ROIs regrouped with spatial contiguity? I understand this was done in order to reduce computational complexity and to harmonize across methods, but the manuscript would benefit from having an added sentence to explain what was done.

      2- Table A in figure 1 shows the important hyperparameters (HP) for each method, but the motivations regarding the choice of HP for each method are only explained in the discussion (end of page 11, "we adopted the hyperparameter values recommended by the original paper or consensus among the community for each method"). It would be better to explain it in the methods, and then only discuss why this can be a limitation, in the discussion.

      3- The github repository https://github.com/neurodatascience/dFC/tree/main does not reference the paper.

      4- The github repository https://github.com/neurodatascience/dFC/tree/main is not documented enough. There are two very large added values in this repo: open implementation of methods, and analytical flexibility tools. The demo notebook shows how to use the analytical flexibility tools, but the methods implementation is not documented. I expect that many people will want to perform analysis using the methods as well as comparison analysis, so the documentation of individual methods should not be minimized.

      5- For the reader, it would be better to mention early in the manuscript (in the introduction) the presence of the code for reproducibility. Currently, the toolbox is only introduced in the final paragraph of the discussion. It comes as a very nice surprise when reading the manuscript in full, but I think the manuscript would gain a lot of value if this paragraph was included earlier, and if the development of the toolbox was included much earlier (i.e. in the abstract).

      6- We have published two papers on dFC that the authors may want to include, although these papers have investigated cerebello-cerebral dFC using whole brain + cerebellum parcellations. The first paper used continuous HMM on healthy subjects, and found correlations with impulsivity scores, while the second paper used network measures on sliding window dFC matrices on a clinical cohort (patients with alcohol use disorder). I am not sure why the authors have not found our papers in their literature search, but maybe it would be good to include them. Authors need to update the final table in supplementary materials as well as the citations in the main paper.

      Abdallah, M., Farrugia, N., Chirokoff, V., & Chanraud, S. (2020). Static and dynamic aspects of cerebro-cerebellar functional connectivity are associated with self-reported measures of impulsivity: A resting-state fMRI study. Network Neuroscience, 4(3), 891-909.

      Abdallah, M., Zahr, N. M., Saranathan, M., Honnorat, N., Farrugia, N., Pfefferbaum, A., Sullivan, E. & Chanraud, S. (2021). Altered cerebro-cerebellar dynamic functional connectivity in alcohol use disorder: a resting-state fMRI study. The Cerebellum, 20, 823-835.

      Note that in Abdallah et al. (2020), while we did not compare HMM results with other dFC methods, we did investigate the influence of HMM hyperparameters, as well as perform internal cross validation on our sample + null models of dFC.

      Minor comments 6 - "[..] what lies behind the of methods. Instead, they reveal three groups of methods, 720 variations in dynamic functional connectivity?. " -> an extra "." was added (end of page 10).

    1. Background Culture-free real-time sequencing of clinical metagenomic samples promises both rapid pathogen detection and antimicrobial resistance profiling. However, this approach introduces the risk of patient DNA leakage. To mitigate this risk, we need near-comprehensive removal of human DNA sequence at the point of sequencing, typically involving use of resource-constrained devices. Existing benchmarks have largely focused on use of standardised databases and largely ignored the computational requirements of depletion pipelines as well as the impact of human genome diversity. Results We benchmarked host removal pipelines on simulated Illumina and Nanopore metagenomic samples. We found that construction of a custom kraken database containing diverse human genomes results in the best balance of accuracy and computational resource usage. In addition, we benchmarked pipelines using kraken and minimap2 for taxonomic classification of Mycobacterium reads using standard and custom databases. With a database representative of the Mycobacterium genus, both tools obtained near-perfect precision and recall for classification of Mycobacterium tuberculosis. Computational efficiency of these custom databases was again superior to most standard approaches, allowing them to be executed on a laptop device. Conclusions Nanopore sequencing and a custom kraken human database with a diversity of genomes leads to superior host read removal from simulated metagenomic samples while being executable on a laptop. In addition, constructing a taxon-specific database provides excellent taxonomic read assignment while keeping runtime and memory low. We make all customised databases and pipelines freely available. Competing Interest Statement The authors have declared no competing interest.

      Reviewer 2. Darrin Lemmer, M.S.

      Comments to Author: This paper describes a method for improving the accuracy and efficiency of extracting a pathogen of interest (M. tuberculosis in this instance, though the methods should work equally well for other pathogens) from a "clinical" metagenomic sample. The paper is well written and provides links to all source code and datasets used, which were well organized and easy to understand. The premise – that using a pangenome database improves classification -- seems pretty intuitive, but it is nice to see some benchmarking to prove it. For clarity I will arrange my comments by the three major steps of your methods: dataset generation, human read removal, and Mycobacterium read classification.

      1. Dataset generation -- I appreciate that you used a real-world study (reference #8) to approximate the proportions of organisms in your sample; however, I am disappointed that you generated exactly one dataset for benchmarking. Even if you use the exact same community composition, there is a level of randomness involved in generating sequencing reads, and therefore some variance. I would expect to see multiple generations and an averaging of the results in the tables. With a sufficiently high read depth the variance likely won't change your results much, so it would be nice, and more true to real sequencing data, to vary the number of reads generated (I didn't see where you specified the read depth to which you generated reads for each species), as it is rare in the real world to always get this deep of coverage. Ideally it would also be nice to see datasets varying the proportions of MTBC in the sample to test the limits of detection, but that may be beyond the scope of this particular paper.

      2. Human read removal -- The data provided do not really support the conclusion, as all methods benchmarked performed quite well and, particularly when using the long reads from the Nanopore simulated dataset, were fairly indistinguishable, with the exception of HRRT. The short Illumina reads show a little more separation between the methods, probably due to the shorter sequences being able to align to multiple sequences in the reference databases; however, comparing kraken human to kraken HPRC still shows very little difference, thus not supporting the conclusion that the pangenome reference provides "superior" host removal. The run times and memory used do much more to separate the performance of the various methods, particularly given the goal of being able to run the analysis on a personal computer, where peak memory usage is important. The only methods that perform well within the memory constraints of a personal computer for both long reads and short reads are HRRT and the two kraken methods, with kraken being superior at recall, but again, kraken human and kraken HPRC are virtually indistinguishable, making it hard to justify the claim that the pangenome is superior. Also, it appears your run time and peak memory usage are again based on a single data point; these should be measured multiple times and averaged. Finally, as an aside, I did find it interesting and disturbing that HRRT had such a high false negative rate compared to the other methods, given that this is the primary method used by NCBI for publishing in the SRA database, implying there are quite a few human reads remaining in SRA.

      3. Mycobacterium read classification -- Here we do have some pretty good support for using a pangenome reference database, particularly compared to the kraken standard databases, though as mentioned previously, a single datapoint isn't really adequate, and I'd like to see both multiple datasets and multiple runs of each method. Additionally, given that the purpose here is to improve the amount of MTB extracted from a metagenomic sample, these data should be taken the one extra step to show the coverage breadth and depth of the MTB genome provided by the reads classified as MTB, as a high number of reads doesn't mean much if they are all stacked at the same region of the genome. Given that these are simulated reads, which tend to have pretty even genome coverage, this may not show much; however, it is still an important piece to show the value of your recommended method. One final comment is that it should be fairly easy to take this beyond a theoretical exercise by running some actual real-world datasets through the methods you are recommending to see how well they perform in actuality. For instance, reference #8, which you used as a basis for the composition of your simulated metagenomic sample, published their actual sequenced sputum samples. It would be easy to show if you can improve the amount of Mycobacterium extracted from their samples over the methods they used, thus showing value to those lower income/high TB burden regions where whole metagenome sequencing may be the best option they have.
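The distinction the reviewer draws between read count, breadth, and depth can be illustrated with a minimal Python sketch. It assumes a per-position depth vector is already available (for example, exported from an alignment tool); the function name `coverage_stats` and the toy depth vectors are hypothetical and are not taken from the paper or its pipelines.

```python
def coverage_stats(depths):
    """Summarise a per-position depth vector.

    Returns (breadth, mean_depth, mean_depth_covered), where
      breadth            = fraction of positions with depth > 0,
      mean_depth         = total aligned bases / genome length,
      mean_depth_covered = total aligned bases / number of covered positions.
    """
    genome_len = len(depths)
    if genome_len == 0:
        raise ValueError("depth vector is empty")
    covered = [d for d in depths if d > 0]
    total_bases = sum(depths)
    breadth = len(covered) / genome_len
    mean_depth = total_bases / genome_len
    mean_depth_covered = total_bases / len(covered) if covered else 0.0
    return breadth, mean_depth, mean_depth_covered


# Hypothetical toy genomes of 10 positions with the same number of aligned bases.
stacked = [50, 50, 0, 0, 0, 0, 0, 0, 0, 0]  # reads piled on one region
even = [10] * 10                            # reads spread evenly

print(coverage_stats(stacked))  # (0.2, 10.0, 50.0)
print(coverage_stats(even))     # (1.0, 10.0, 10.0)
```

Both toy genomes have the same genome-wide mean depth, so only breadth (and depth within covered positions) reveals that the first one has reads stacked on a single region.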

      Re-review.

      This is a significantly stronger paper than originally submitted. I especially appreciate that multiple runs have now been done with more than one dataset, including a "real" dataset, and the analysis showing the breadth and depth of coverage of the retained Mtb reads, proving that you can still generally get a complete genome of a metagenomic sample with these methods. However kraken's low sensitivity when using the standard database definitely impacts the results, making a stronger argument for using a pangenome database (Kraken-Standard can identify the presence of Mtb, but if you want to do anything more with it, like AMR detection, you would need to use a pangenome database). I really think that this should be emphasized more, and perhaps some or all of the data in tables S9-S12 be brought into the main paper. It is maybe worth noting, that the significant drop in breadth, I would imagine, is a result of dividing the total size of the aligned reads by the size of the genome, implying a shallow coverage, but the reality is still high coverage in the areas that are covered, but lots of complete gaps in coverage. I did also like the switch to the somewhat more standard sensitivity/specificity metrics, though I do lament the actual FN/FP counts being relegated to the supplemental tables, as I thought these numbers valuable (or at least interesting) when comparing the results of the various pipelines, particularly with human read removal, where the various pipelines perform quite similarly.
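To make the relationship between the raw FN/FP counts and the sensitivity/specificity metrics concrete, here is a small Python sketch. It assumes that, for host read removal, a "positive" is a human read; the function name and all counts are hypothetical and are not results from the paper.

```python
import math


def sensitivity_specificity(tp, fp, tn, fn):
    """Compute (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN)   (also called recall)
    specificity = TN / (TN + FP)
    Returns NaN for a metric whose denominator is zero.
    """
    sens = tp / (tp + fn) if (tp + fn) else math.nan
    spec = tn / (tn + fp) if (tn + fp) else math.nan
    return sens, spec


# Hypothetical counts for two pipelines on the same simulated sample.
pipeline_a = dict(tp=999_900, fn=100, tn=50_000, fp=5)
pipeline_b = dict(tp=999_990, fn=10, tn=50_000, fp=5)

for name, counts in (("A", pipeline_a), ("B", pipeline_b)):
    sens, spec = sensitivity_specificity(**counts)
    print(f"pipeline {name}: sensitivity={sens:.5f} specificity={spec:.5f} "
          f"FN={counts['fn']} FP={counts['fp']}")
```

With large read counts the two rates differ only in the fourth or fifth decimal place, while the absolute number of human reads left behind differs tenfold, which is one reason the raw counts can remain informative alongside the rates.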

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The study examined the mechanisms behind the nuclear transport of capsid proteins of various flaviviruses. The study used mass spectrometry to identify the interaction partners of JEV capsid protein and found Importin 7 as the top hit. After validating this interaction with IP-western blotting, the authors used IPO7 knock-out cells to show that the nuclear accumulation of capsid is dependent on IPO7. Moreover, they also observed a nearly 10-fold reduction in the titre of virus produced from knock-out cells, without reduction in virus replication or particle assembly.

      The study needs improvements to bring it to publication standards. Some overarching problems include: all capsid localization studies being done with GFP-tagged capsid rather than wild-type capsid produced during authentic infection; lack of quantitation of most of the localization data; not showing capsid localization from infection experiments in knock-out cells; and no in-depth analysis of the potential mechanisms behind the observed reduction in titre in knock-out cells.

      Thank you for your constructive comments. We have sincerely answered all of them, as shown below. We hope you are satisfied with our additional data and the revised manuscript.

      The major comments are

      Fig 1B: Please add quantitation and statistical analyses of the ratio of nuclear and cytoplasmic capsid protein of all different capsids used. Also include western blot to prove that there is no cleavage between Capsid and GFP and the green signal indeed comes from the fusion protein. Ideally you should use capsid alone instead of a fusion protein for at least selected few constructs to prove that the Capsid-GFP behaves identical to Capsid alone.

      Following the reviewer’s comments, we have added quantification and statistical data in Figure 1D. We have added CBB data and western blot data in Figures 1B and S1. Because recombinant proteins of low molecular weights were artificially translocated into the nucleus through diffusion, less than 20 kDa proteins are typically used as GFP or GST fusion proteins for the IJ and PM experiments. Instead of IJ and PM experiments, we have added data on the translocation of the non-tagged core using IFA and its statistical data in Figure 1A. Although in vitro data on the translocation of capsid protein differ somewhat from IFA data, the data on nuclear translocation of core proteins are consistent across different experiments.
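The kind of quantification requested here, a nuclear-to-cytoplasmic (N/C) ratio per cell followed by a two-group comparison, can be sketched in Python as follows. This is only an illustration under stated assumptions: the intensity values are hypothetical, the function names are invented for this example, and Welch's t statistic is used as one reasonable choice of test; the source does not say which test the authors applied.

```python
from math import sqrt
from statistics import mean, variance


def nc_ratio(nuclear_intensity, cytoplasmic_intensity):
    """Nuclear/cytoplasmic ratio from mean fluorescence intensities of one cell."""
    if cytoplasmic_intensity <= 0:
        raise ValueError("cytoplasmic intensity must be positive")
    return nuclear_intensity / cytoplasmic_intensity


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Hypothetical (nuclear, cytoplasmic) mean intensities per cell.
wild_type = [(200, 50), (180, 60), (220, 55)]
mutant = [(60, 55), (70, 60), (50, 58)]

wt_ratios = [nc_ratio(n, c) for n, c in wild_type]
mut_ratios = [nc_ratio(n, c) for n, c in mutant]

print("mean N/C (WT):", round(mean(wt_ratios), 2))
print("mean N/C (mutant):", round(mean(mut_ratios), 2))
print("Welch t:", round(welch_t(wt_ratios, mut_ratios), 2))
```

Turning the statistic into a p-value requires the t distribution with Welch-Satterthwaite degrees of freedom, which a statistics library would normally supply (for instance `scipy.stats.ttest_ind` with `equal_var=False`).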

      Fig 1C: It is unclear from the figure legends the WT JEV capsid means GFP-Capsid or Capsid alone. You should clearly state the GFP part if the construct includes GFP. Quantitation and statistics are missing and the information on how many independent experiments were performed is also not included in the figure legend.

      Following the reviewer’s suggestion, we now describe the GFP-fused JEV proteins as follows: “AcGFP-JEVCoreWT or AcGFP-JEVCoreGP/AA” (Line 771). We added quantification and statistical analysis as shown in Figure 1E. IJ and PM experiments were performed three times independently, as described in the legend of Figure 1 in the revised manuscript (Lines 773–774).

      Fig 2B: Quantitation and statistics are missing. Ideally, the data need to be reproduced with Capsid alone instead of Capsid-GFP. A positive control is needed for the activity of Bimax to prove that the drug was working in the assay.

      We have added quantitative and statistical data in the revised Figure 2B. As mentioned above, capsid alone is potentially translocated into the nucleus artificially using the IJ and PM assay. Bimax binds to importin alpha but not importin beta, specifically inhibiting the importin alpha/beta pathway. The RanGTP mutant binds to the importin beta family, including importin beta 1, and widely inhibits importin beta-dependent nuclear import. These inhibitors are well-characterized and recognized in the field. We cited the following reference: Tsujii et al., JBC, 2015.

      Fig 2C: How do you reconcile the IP mass spectrometry data that Importin b1 is the second strongest hit with the lack of IP interaction you observed in fig 2C?

      As shown in Figure 2C, importin b1 does not interact with the JEV core. Importin b1 is the most abundant member of the importin beta family. Thus, it might be a non-specific interaction between importin b1 and the JEV core. Therefore, we excluded importin b1 from further analyses. We added a sentence to explain why importin b1 was excluded on Line 145.

      Fig 3C: How many independent confirmations of this experiment were performed?

      All IJ and PM experiments were performed thrice independently. We described this in the legend of Figure 3 in the revised manuscript (Line; 794).

      Fig 4A and B: Add quantitation for the western blot. 4A-D Include data on the number of biological repetitions. 4C-D: Add quantitation and statistical analyses of the ratio of nuclear and cytoplasmic capsid protein.

      We have added quantification data, as shown in Figures 4A and 4B. All experimental results shown in Figures 4A, 4B, 4C, and 4D were performed thrice independently, as described in the legend of Figure 4 of the revised manuscript (Lines; 810-812).

      Fig 5B. This data should be shown in the context of infection with untagged Capsid at least for 1-2 viruses. This is a serious drawback of the present study, as there is no clear evidence presented that the native capsid protein in an infection context depends on importin 7 for nuclear accumulation and behaves similarly to the GFP-Capsid constructs being used.

      Following the reviewer’s concerns, we used an un-tagged JEV and DENV core to examine core translocation in WT or IPO7KO Huh7 cells. As shown in Figures 5C and 5D and their quantitative data, nuclear translocation of JEV and DENV core protein was inhibited in IPO7KO Huh7 cells. We tested the translocation of core protein upon infection with DENV as shown in Figure 5F. Although we could not examine ZIKV infection because we could not find appropriate antibodies against the ZIKV core, these data are consistent in that nuclear translocation of flavivirus core protein largely depends on IPO7.

      Fig 5 A-D: Two repetitions are insufficient; a minimum of three biological repeats and statistical analysis need to be included. 5E-F: You cannot do statistics on two repeats, need minimum of three repeats to perform statistical analysis. 5G-H: I presume three repetitions based on the data points shown, this should be clearly stated in the figure legend.

      We repeated three independent experiments, shown in Figures 5A and 5C-5F, and indicated them on Lines 823. We have added statistical data in Figures 5B-5F. We have corrected the statement of biological repeats in Figures 6A and 6B (Lines; 843-844).

      Fig 5E-G: Taking the data of 5E and 5G together, it seems Importin 7 functions at the level of particle release and not particle assembly or maturation. Have you checked the specific infectivity of the particles released from knock-out cells to determine the reason behind the reduction in virus titre? You could look at prM maturation by furin cleavage to check if this is altered in the IPO7 knock-out cells.

      We determined the ratio of infectious titer per 10^3 copies of viral RNA in Figure 6F. The proportion of infectious viruses targeting extracellular JEV RNA was decreased in IPO7KO cells. Simultaneously, no difference was observed in the proportion of infectious viruses targeting intracellular JEV RNA between WT and IPO7KO cells. Although we could not find appropriate antibodies against the JEV core, we checked prM expression using the DENV virus. The expression of prM was slightly increased in JEV-infected IPO7-KO Huh7 cells (Figure S3D). This result suggests that the efficiency of prM cleavage by furin was partially involved in the impairment of infectious virus release in IPO7KO Huh7 cells.
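The specific-infectivity measure described in this response (infectious titer per fixed number of viral RNA copies) amounts to a simple normalization, sketched below in Python. The function name, the titer unit (focus-forming units per ml), and all numbers are assumptions made for illustration; they are not values reported by the authors.

```python
def specific_infectivity(titer_per_ml, rna_copies_per_ml, per_copies=1e3):
    """Infectious units per `per_copies` viral RNA copies.

    titer_per_ml       infectious titer (e.g. FFU/ml; the unit is an assumption)
    rna_copies_per_ml  viral RNA copies/ml measured in the same sample
    """
    if rna_copies_per_ml <= 0:
        raise ValueError("RNA copy number must be positive")
    return titer_per_ml / rna_copies_per_ml * per_copies


# Hypothetical supernatant measurements: similar RNA release, lower titer in KO cells.
wt = specific_infectivity(titer_per_ml=1e6, rna_copies_per_ml=1e8)
ko = specific_infectivity(titer_per_ml=1e5, rna_copies_per_ml=1e8)

print(f"WT: {wt:.1f} infectious units per 10^3 RNA copies")
print(f"KO: {ko:.1f} infectious units per 10^3 RNA copies")
print(f"fold reduction: {wt / ko:.1f}")
```

Normalizing by RNA copies separates a drop in the number of released particles from a drop in the infectivity of each particle, which is the distinction the reviewer asked about.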

      Fig 5H: Have you checked if the observation regarding intracellular RNA levels in 5F is applicable to these viruses as well?

      We checked the intracellular RNA levels of DENV- and ZIKV-infected cells. In contrast to JEV, intracellular ZIKV or DENV RNA showed no difference in IPO7-KO Huh7 cells (Figure 6H). We discuss this in the Discussion section (Lines 269–271).

      Fig 6: The figure legend "Data are representative of two (A, B) independent experiments and are presented as the mean {plus minus} SD of three independent experiments (C)" is confusing. The sentence should be reworded to state the repetitions separately for independent experiments. Fig 6C should show original titres and not percentages.

      We have corrected the figure legends according to the reviewer’s comments. We have shown the original titers in Figures 6C and 6E.

      Fig 7B: This experiment should be performed in IPO7 knock out cells to confirm that the observed reduction of core mutant is mainly contributed from its lack of interaction with IPO7 and not from any other confounding factors.

      Following the reviewer’s suggestion, we performed SRIP experiments for GP/AA mutation using IPO7KO Huh7 cells. As shown in Figure 7C, the SRIPs harboring WT core were impaired in IPO7KO Huh7 cells; no difference was observed in the SRIPs harboring GP/AA mutations in WT and IPO7KO cells. These results suggest that IPO7-dependent nuclear translocation of core protein is important for the viral release.

      Reviewer #1 (Significance (Required)): While the authors could convincingly demonstrate the interaction between capsid and IPO7, how that interaction results in the observed reduction in viral titre is largely unexplored. As all the localization data used a GFP-tagged capsid outside an infection context, this reviewer is not confident that all the reported observations will hold in an infection setting. This needs to be urgently addressed to raise confidence in the observation. The current data are insufficient to confidently attribute the change in titre to the interaction between capsid and IPO7 and the capsid localization to the nucleus. Knocking out IPO7 could have pleiotropic effects, independent of capsid nuclear accumulation, that could lead to the observed titre reduction. This needs to be addressed further before linking both these phenotypes. Certain key experiments needed to address these questions are currently missing. While the interaction of Capsid with IPO7 is certainly intriguing, the implications of this interaction for virus biology need further investigation before clear conclusions can be drawn regarding this observation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: In this study Itoh and colleagues investigate the mechanism, role and impact of the nuclear localization of the flavivirus core protein. The import of the core protein has long been observed and investigated and herein the authors use some novel approaches to identify potential cellular binding partners that facilitate nuclear import. Via proteomics and biochemical approaches they determine that importin-7 plays a crucial role in the import of the core protein that appears to be conserved across Flavivirus members. In general the findings and conclusions are sound but there are some significant omissions and caveats that warrant further investigation.

      Major comments: - one of the major caveats of the study is that the flavivirus NS5 protein also translocates to the nucleus in an Importin-alpha/beta dependent manner. Therefore how can the authors discount any impact of preventing NS5 import, in addition to core, on virus and SRIP replication and production. Some discussion, if not additional experiments are required here ie. NS5 localization in the KO cells during virus infection

      We examined the localization of NS5 using IPO7KO Huh7 cells. As shown in Figure S2D and S2E, we confirmed that IPO7 was not involved in the nuclear localization of NS5.

      • the localization is predominantly nucleolus rather than nucleoplasm when compared to the SV40 NLS. What are the sequence differences between the flavivirus proteins that potentially could account for this? A protein known to localize solely to the cytoplasm should also be used, e.g. NS1 or NS3.

      The JEV core does not contain a consensus nucleolar localization signal. Nuclear localization of NS5 depended on importin-α similar to the SV40 NLS, while flavivirus core proteins were independent of importin-α. Gly42 and Pro43 are critical amino acids for the nuclear localization of the core protein, as shown in Figures 1C and 1D. The Gly42 to Pro43 of core proteins were well-conserved in the core proteins of the Flaviviridae family.

      • controls for Figure 2? Ie. a protein known to be inhibited by Bimax but not the RanGTP mutant and vice versa.

      Bimax binds to importin alpha but not importin beta and specifically inhibits the importin alpha/beta pathway. The RanGTP mutant binds to the importin beta family, including importin beta 1, and widely inhibits importin beta-dependent nuclear import. These inhibitors are well-characterized and recognized in the field. Therefore, we have cited the following references: Tsujii et al., JBC, 2015.

      • Fig 5. Difference with WNV and DENV in nucleoplasm localization but also WNV still appeared to have Core in the nucleus in the KO cells

      We agree with the reviewer’s comment about differences in nuclear localization among the viruses using the IJ assay. We have added new data to examine the localization of the DENV core after DENV infection. Nucleolar localization of the DENV core following DENV infection was observed, as shown in Figure 5F. Therefore, differences in nucleoplasm or nucleolar localization among different viruses shown in Figure 1C and Figure 5B might be artifacts of recombinant proteins. One possibility is that the localization of core proteins using IJ assay was detected by anti-GFP antibodies. Although purified GFP-core proteins, as shown in Figure 1B and S1, were observed as a single band of fusion proteins, core proteins of WNV and DENV might be cleaved during IJ experiments, and GFP alone might be detected at nucleoplasm, as shown in Figure 5B. Because our study focused on the nuclear translocation of flavivirus core proteins, the detailed localization of each core protein in the nucleus will be studied in the future.

      • Fig 5C still has substantial JEV and DENV core but not WNV and ZIKV. Why is the DENV and WNV localization pattern different to Fig 5B?

      We appreciate the reviewer’s suggestion; we re-checked all our data presented in Figure 5B and other data shown in Figure 5B. We quantified the ratio of nuclear localization as shown on the right of Figure 5B. Our quantification data showed that the nuclear transport of all core proteins used in this study was dependent on IPO7. In contrast, Figure 5A shows that nuclear translocation of WNV core protein is partially dependent on IPO7. This discrepancy might arise because nuclear translocation of WNV core protein is regulated by several nuclear carriers. We described this in the Discussion section (Lines 250–254).

      • Fig 5F, does the KO also restrict NS5 from entering the nucleus and could this then results in increase polymerase activity confined to the cytoplasm resulting in more viral RNA?

      Following the reviewer’s suggestion, we examined NS5 localization during viral infection and plasmid transfection, as shown in Figure S2D and S2E. Previous data showed that the nuclear localization of NS5 depended on importin-α. Our data are consistent with previous reports in that IPO7 was not involved in the nuclear localization of NS5. In contrast to JEV, we also confirmed that intracellular ZIKV or DENV RNA showed no difference between WT and IPO7-KO Huh7 cells (Figure 6H). As described in the discussion, other factors, such as antiviral factors, might be involved in IPO7-mediated nuclear transport in JEV-infected cells (Lines 269–271).

      • Why was WNV infection not performed in Fig 5H? What were the viral titres compared to for the relative % values?

      Because our institution does not have a BSL3 facility, we could not use WNV. Following the reviewer’s comment, we showed viral titers in Figure 6G.

      • Fig 6B, still a significant amount of core present in the nucleolus. Also WT cells have (almost?) no cytoplasmic staining for core where this could be clearly observed in the WT cells in Fig 5D. Why the difference?

      Plasmid transfection of AcGFP-Core WT showed that almost all core proteins were located in the nucleus. We assumed that AcGFP might influence nuclear exports of core proteins or the efficiency of nuclear transports as shown in other data of in vitro experiments. However, our finding that IPO7 was involved in the nuclear transport of core proteins is consistent.

      • In Fig 7B, D and E, when were the SRIPs collected and what was the time period after subsequent infection?

      Following the reviewer’s comments, we have added more details on SRIP experiments in Materials & Methods (Line; 521-523).

      • In Fig 7C was the luciferase measured from the initial transfection and how did it correlate with RNA production? A 15-fold increase in replicon RNA actually seems quite low over a 48h period

      Because large amounts of in vitro-transcribed replicon RNA were injected into cells in this experiment, we observed that significant amounts of luciferase values were detected after 4 h. However, the 15-fold enhancement in luciferase value was consistent with previous reports (PMID: 30413742, PMID: 17024179). We have added references in the revised manuscript.

      • quantitation is required throughout all of the experimental IFA data provided

      Following reviewer comments, we have quantified all IFA data and showed their results.

      Reviewer #2 (Significance (Required)):

      The nuclear translocation of flavivirus proteins has long been studied, and it has been observed that the core, NS5 (RNA polymerase) and potentially the NS3 (helicase/protease) proteins all translocate to the nucleus. Importin alpha and beta have been shown to facilitate this process. The authors aim to extend this to identify importin-7 as a major cellular factor enabling nuclear translocation. Overall the experiments have been performed well, but there is a lack of quantitation for many of the results, and suitable controls are required.

      I am a researcher in the field of flavivirus replication

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In the presented study the authors identified and mechanistically investigated how Flaviviruses including Japanese encephalitis virus (JEV), Dengue virus (DENV), and Zika virus (ZIKV) commonly use importin-7 (IPO7), an importin-β family protein, as a cellular carrier protein to facilitate nuclear core protein translocation. The authors evaluated how the production of infectious viruses is regulated by IPO7 using cellular infection models including IPO7-deficient knockout cells. In the submitted manuscript, the authors provide evidence that IPO7 facilitates viral core protein import into the nucleus of infected cells, which is essential for effective Flavivirus replication. Taken together, the study is interesting to a broader readership with interest in molecular virology, and its findings are informative for potential future targeting of IPO7 to affect flavivirus replication using small molecule drugs. The manuscript is well-written and easy to follow, the methods are appropriate, the structure is logical, and statistical analysis is adequate.

      Major comments:

      • It is unclear why the authors specifically used Ala substitution at Gly42 and Pro43 to abolish nuclear core protein localization. It would be helpful to put this into more context and explain the approach.

      Mutations of Gly42 and Pro43 to Ala were previously reported and characterized by the same research group (PMID: 15731239). Following the reviewer’s comment, we have added more details of GP mutations in the text (Lines 66–70).

      • In Figure 4, the authors claim that the binding between IPO7 and RPS7 is disrupted upon the addition of RanGTPQ69L. This is not clearly evident from the pulldown experiment and should be proven experimentally with additional experiments (e.g. by using an imaging approach) to underline the statement that the binding mode of IPO7 to the JEV core protein is similar to that of RPS7. Loading controls for pulldown blots should be added.

      As described in response to the comment by reviewer#2 regarding Figure 2, the RanGTPQ69L mutant inhibits the interaction between the importin beta family, including IPO7 and its substrates, by directly binding to importin beta proteins. For the benefit of readers without knowledge of the typical Ran-dependent nuclear transport mechanism, we have described its effects with several cited references (Dickmanns et al., 1996; Tachibana et al., 2000). We referred to a study that showed that IPO7 transports RPL proteins, including RPS7 (Jäkel and Görlich, 1998). The data in Figures 4A and 4B demonstrate that adding RanGTPQ69L remarkably reduces the binding of IPO7 to the Core proteins and that the effect is more robust than that for RPS7. We believe that these results are experimentally valid, indicating that nuclear transport of Core proteins by IPO7 is achieved through a typical Ran-dependent pathway.

      • Most methods used are presented logically but require some more details so that they can be reproduced. In particular, the difference between Figure 4 E and 4H is confusing. What is the difference? Is 4E showing intracellular viral titers and 4H infectious viral titers in the supernatant of cells? Clarification needed. Put relevance of these experiments in context of the hypothesis.

      We apologize for the confusion regarding the data in Figures 5E and 5H (we assume). These data were derived from the same experiments, except for the time-course data presented in Figure 5E. We have removed Figure 5E to simplify our results.

      • Identical phenotypes induced by IPO7 knockout in a number of HuH7 clones are shown in Figures 6A to 6C. This data does not add to the overall understanding and should be moved to supplementary figures. Why are 293T cells used in experiments shown in Figure 6D and 6E? What is the relevance of kidney cells to Flavivirus infections?

      Following the reviewer’s comments, we have moved Figure 6 to supplementary figures. We used 293T cells because of efficient JEV propagation and gene-deficient efficiency. We wanted to demonstrate that our data are not Huh7-dependent through experiments in 293T cells.

      • Prior studies are referenced appropriately, however, in a recent study it was demonstrated that IPO7 is stabilized upon Epstein-Barr Virus infection and that IPO7 presence is required for the survival of host cells (Yang YC, Front Microbiol. 2021 Feb 16;12:643327. doi: 10.3389/fmicb.2021.643327).

      We deeply appreciate the publications in these fields. Following the reviewer’s comment, we have cited these references.

      This important study about the physiological relevance of IPO7 during viral infections has not been cited by Itoh and colleagues in the presented study. However, the results of the uncited study are very relevant to the provided manuscript, since Itoh and colleagues are using IPO7 knockout cells to investigate its function in Flavivirus core protein nuclear import. Hence, the authors should perform cell survival and cellular fitness experiments to demonstrate that observed phenomena of reduced viral replication and virus export in IPO7 knockout cells are independent of compromised cellular fitness due to IPO7 deficiency.

      We compared the cellular fitness of WT and IPO7KO Huh7 cells using propidium iodide (PI) staining and flow cytometry. As shown in Figure S2F, no differences in cell viability were observed between WT and IPO7KO Huh7 cells. This suggests that the reduced viral titers in IPO7KO Huh7 cells are not due to compromised cellular fitness.

      Minor comments:

      • Describing Figure 3B, the authors state that they focused on IPO7 among the core binding proteins belonging to the importin-b family, because IPO7 "was identified the most peptides" in the mass spectrometry approach. This requires a more detailed explanation. Also, an explanation of why HEK293T cells were used for this approach and not HuH7 cells, as used predominately in most parts of the study, would provide more clarity to the reader.

      We focused on IPO7 because it had the highest number of detected peptides, and we found that the protein with the second highest number of detected peptides, IPOB1, did not bind to JEV core proteins, as shown in Figure 2C. Therefore, we included the lack of interaction between the core protein and IPOB1 as part of the rationale.

      • In Figures 4E and 4F, colour coding is missing.

      We have indicated color coding in this data. Thank you for your comments.

      Reviewer #3 (Significance (Required)):

      The provided manuscript 'Importin-7-dependent nuclear localization of the Flavivirus core protein is required for infectious virus production' by Itoh and colleagues investigates a topic with important scientific relevance. The presented study builds on previous findings by the authors where they have demonstrated that Flavivirus core protein nuclear localization is actually conserved among Flaviviridae and represents a potential target for broad-range antiviral small molecule drugs (Tokunaga et al., Virology, 2020 Feb;541:41-51). However, our understanding of Flavivirus core protein nuclear localization during viral replication and how the processes could potentially be targeted using novel therapeutic drugs remains elusive. Here, the provided manuscript addresses a mechanistic investigation of how the Flavivirus core protein is actually translocated from the cytoplasm to the nucleus of infected cells. The study is informative particularly for virologists with expertise in Flavivirus replication.

      However, from my point of view as a virologist investigating host-pathogen interactions with a strong interest in clinical translational, the manuscript requires a more careful evaluation and interpretation of some results of key experiments. In addition, some of the results need to be more precisely described for clearer understanding by a broader readership.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary: In the manuscript entitled "Importin-7-dependent nuclear localization of the Flavivirus core protein is required for infectious virus production", by combining proteomics, CRISPR/Cas9 gene KO, CLSM and standard virology techniques, Yumi Itoh and colleagues report novel data concerning the involvement of IPO7 in the nuclear and nucleolar localization of Flaviviridae core proteins and in viral particle release. Surprisingly, IMPa/b1 inhibition via Bimax2 does not affect core nuclear transport, whereas both RanQ69L and WGA do. The authors try to identify the cellular transporters involved in core nuclear import, and to this end performed an MS analysis of JEV core interactors, which yielded IPO7 as the most likely candidate. After confirming the result by Co-IP, the authors go on to show that most core proteins require IPO7 for nuclear delivery using Huh7 and HEK293T IPO7-KO cells, with the exception of WNV core, which was able to partially enter the nucleus. In such cells, upon infection, extracellular (but not intracellular) viral titers were strongly reduced, a phenotype which was observed with a JEV core mutant bearing the Gly42 and Pro43 to Ala substitutions in a previous study.

      Major comments: - The major conclusions of the study are:

      1. IPO7 is the main driver of core nuclear transport.
      2. Core nuclear localization is somehow important for viral particle release.

      Both conclusions are well-supported by experimental evidence.

      Methods are clear and precise, the study appears to have been produced with high quality standards, and so is the presentation of the results. A few controls however should be added to increase the reliability of the results presented here (see below)

      Since the authors attempt to link the phenotype observed on virus release upon IPO7 KO to defects on core nuclear import by making a parallelism with core GP/AA mutant, it would be important to know the behavior of such virus in Huh7 wt and Huh IPO7 KO cells. In other words, is GP/AA JEV released efficiently in Huh7 IPO7 KO cells?

      We have added new data examining the propagation of the GP/AA JEV mutant in IPO7KO Huh7 cells (Figure 6F). Our new data showed that there were no differences in the propagation of the GP/AA mutant in WT and IPO7-KO Huh7 cells.

      A similar approach can be applied to data shown in Figure 7 (effect on release on a capsid nuclear deficient mutant). This would help understand if IPO7 KO, viral release defects and core nuclear import are somehow linked.

      We produced SRIPs harboring GP/AA core using WT and IPO7KO Huh7 cells and demonstrated that the number of infectious viruses produced by WT and IPO7KO Huh7 cells was the same (Figure 7C).

      Minor comments:

      INTRODUCTION • “Flaviviruses...are mosquito-borne human pathogens" What about tick borne encephalitis virus?

      We have corrected it (Line; 43-44).

      • " replication.... occur in the endoplasmic reticulum (ER)" This sentence is a bit inaccurate. Flaviviridae RNA replication occurs in so-called viral replication factories, double membrane vesicles which are partly derived from the ER. see "PMID: 26958917".

      We have corrected this sentence according to the reviewer’s comment (Line; 60-62).

      • "it is known that some flavivirus core proteins are translocated from the cytoplasm into the nucleus" o I think the first evidence of core in the nucleus dates back to 1989, and here it might be appropriate to cite the original reference: "PMID: 2471810". o It might be worth mentioning that NS5 has also been reported in the nucleus (See "PMID: 28106839")

      We have corrected the sentence according to the reviewer’s comment (Line; 63-65).

      • "In the cytoplasm, NLS-containing proteins are recognized by importin-α " o This is true only for classical NLSs, not every NLS binds IMPa, as the authors confirm in this study! Indeed, we have also PY-NLS, IPO7 specific NLSs, IPOb1 NLSs, etc. I therefore suggest rephrasing.

      Thank you for pointing out the exact description of NLS. We agree with the reviewer’s comment that “NLS” includes all types of signal sequences, such as PY-NLS. To clearly distinguish between the CLASSICAL nuclear transport pathway by importin α/β1 and the various nuclear transport pathways by the importin β family, such as transportin, we refer to NLS as classical NLS (cNLS) in the document. We have modified the following sentence by adding “such as transportin” and “without importin-α.”

      RESULTS

      • Fig. 1. o it is not clear what is new here, with respect to what has been already published. The authors should clearly differentiate novel findings from confirmatory results

      Thank you for your suggestion. We would like to introduce our new assay using recombinant virus core proteins, as shown in Figures 1C and 1D. The data shown in Figure 1 are crucial for understanding our data in Figure 2, and we believe this figure is required for broad-ranging readers.

      Fig. 2 and 4 o Proteins whose nuclear transport is dependent on IMPa/IMPb1 (such as SV40 NLS) are lacking here

      Bimax binds to importin alpha but not to importin beta and specifically inhibits the importin alpha/beta pathway. The RanGTP mutant binds to the importin beta family, including importin beta 1, and widely inhibits importin beta-dependent nuclear import. These inhibitors are well-characterized and recognized in the field. Therefore, we have cited the following references: Tsujii et al., JBC, 2015.

      • Fig.5 o It would be important to know the effect on total virus infectivity (intracellular + extracellular) and total viral RNA. It would also be important to know the effect on RNA replication by using a subgenomic viral replicon (with deletion of the env gene, for example). The question here is whether IPO7 depletion affects viral genome replication to any extent, and this is impossible to assess in a fully assembling system.

      We determined the ratio of infectious titer per 10^3 copies of viral RNA in Figure 5D. The ratio of infectious virus to extracellular JEV RNA was decreased in IPO7KO cells, whereas the ratio of infectious virus to intracellular JEV RNA did not differ between WT and IPO7KO cells. We also examined the effect of IPO7 on viral RNA replication using a subgenomic replicon. IPO7 deficiency enhanced viral RNA replication, as shown in Figure 7E. As described in the Discussion section, IPO7 may transport other factors possessing antiviral activity against flaviviruses. These possibilities will be investigated in the future.

      o Panels A-F legend is missing, consider adding it?

      We have added more details to Figure 5A-5F following the reviewer’s suggestion.

      • Fig.7 o I did not completely understand how NLuc is the readout here

      To quantify RNA replication, we quantified Nluc values using a plate reader. We have added more details on the reporter assay in Materials and Methods (Line; 521-523).

      o Also, I do not understand if the effect of GP/AA substitution of panel B has already been reported or if it is a novel finding

      Previous reports regarding the effect of GP/AA substitution of JEV showed the impairment of infectious virus release. However, the SRIP assay was performed to examine the viral release step. Our detailed data showed that the lack of IPO7-mediated nuclear transport of core proteins impaired infectious viral release, and our new results using SRIPs harboring GP/AA core showed that the lack of nuclear transport of core proteins also impaired the release of infectious viruses. Our data strongly suggest that the lack of nuclear transport of core proteins influences the viral release.

      • All CLSM figures lack quantification (Fn/c; Fno/n)

      We have added quantitative data for IFA experiments in our revised manuscript.

      DISCUSSION

      • "The nuclear entry of viral genomic DNA has been demonstrated to involve IPO7" o It would be nice to know which viruses the authors are referring to here

      We have added the virus name and corresponding references.

      • "While RNA viruses, including flaviviruses, are considered to replicate in the cytoplasm of mammalian cells, increasing evidence suggests nucleolar localization of the viruses " o I suspect Rawlinson did not propose the viruses localize to the nucleolus, as this sentence seems to imply. Rather, a trafficking of viral proteins to nucleoli, to manipulate cell function, is more realistic. I suggest considering rephrasing.

      We have corrected this sentence.

      Reviewer #4 (Significance (Required)):

      SECTION B - Significance ========================

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      As alluded to above, this work presents several advances of current knowledge in the field of viral proteins nuclear trafficking, and in Flavivirus biology. The finding of most core proteins depending on IPO-7 is novel and intriguing, and opens the question of what makes WNV core special. Indeed, this protein nuclear targeting is only partially inhibited in IPO7 deficient cells. The fact that the authors extend their findings to several Flaviviruses adds significance. The role of nuclear core for virus release is also intriguing, but appears poorly characterized. In this respect a mechanistic explanation of the phenomenon would be highly desirable to increase the significance of the work presented here.

      In this context I would have a few suggestions:

      A) The authors performed MS on JEV core, which most likely resulted in a long list of "hits". However, they only report IMPb superfamily members. This is perfectly fine, since they focus on identifying partners responsible for nuclear import. However, the full list might be helpful for understanding the role of nuclear core. By comparing MS of wt core and GP/AA core, and/or wt core in wt and IPO7KO cells, the authors could identify core binding partners in the nucleus (in the nucleolus?) which are important for virus release. This could be subsequently addressed by knocking down these factors and studying the effect on the virus life cycle.

      We appreciate the reviewer’s valuable comments. We did not perform MS analysis of the GP/AA core protein or of the core protein in WT or IPO7KO Huh7 cells. To report IPO7-mediated core translocation simply, we would like to keep our manuscript focused on IPO7. To clarify the importance of nuclear transport of the core protein in the viral life cycle, we will perform wide-ranging proteomics.

      B) Further, the authors should try to address the role of core in the nucleus (and nucleolus). Does it interact with cellular/nucleolar proteins? Does it deliver viral RNA to sites of assembly? Does it interfere with rRNA synthesis? All these findings would be easily obtainable using the GP/AA virus and/or Huh7 KO cells, and tremendously increase the impact of the study, which at the moment is limited at points 1 and 2 in the first section of the current report.

      Thank you for your valuable comments. We agree that we should clarify the roles of the nuclear or nucleolar localization of the core protein. We tested the effects of JEV core expression on rRNA synthesis. Our data showed that core protein expression slightly impaired rRNA maturation, as shown here. However, core expression did not influence protein translation. We focused on the phase separation capacity of the core protein localized in the nucleolus or nucleus. From our accumulating data, we hypothesize that the acquisition of phase separation capacity by the core protein might be involved in an efficient virus release step. We hope that these data will be reported in the near future.

      Overall, this work should be interesting for both cell biologists interested in trafficking of viral proteins, and virologists interested in virus-host interactions. The antiviral approach at the moment is a bit less convincing, but the manuscript might be interesting for scientists trying to develop new antiviral strategies. In this context it might be worth reading and possibly discussing the very recent paper from the Bartenschlager group ("PMID: 37702492"). Also, I think that it would be worth discussing the recent discovery that a closely related virus belonging to the Hepacivirus genus within the Flaviviridae family mediates re-localization of Nups to viral replication factories, where they are believed to control access to the DMV interior, thereby regulating virus replication and assembly. Could the core-IPO7 interaction have any role in core delivery to DMVs? See "PMID: 26150811".

      Thank you for your valuable comments. We have added several sentences to the Discussion section (Line; 297-305). We will investigate the role of nuclear transport in viral life cycles in the future.

      Since I am a molecular virologist studying viral nucleocytoplasmic trafficking, virus-host interactions, and antiviral drug-discovery I think I have sufficient expertise for an informative and helpful revision of this work.

    1. One study participant said, “I will work until everything is done and everything is beautiful and wonderful …  And if I have to not sleep for three days to do that, that's what happens.” For students with autism, this inclination could manifest as putting in extra effort to make eye contact when speaking with another person, even if it makes them uncomfortable. This pressure to take on more and hide parts of themselves can lead to burnout and have negative impacts on mental health.

      I chose this section because it’s extremely relatable to me as a STEM student who has spent countless nights staying up to complete assignments or study. Most community college STEM students are like me and want a perfect 4.0 to transfer to a nice university, so we all try to make our assignments look “beautiful and wonderful” to get a perfect grade. This article is important to me because I’m now more aware of neurodivergent students and the challenges they face; with my new knowledge, I’ll be able to be more supportive of my future STEM classmates. I also think this article is important for STEM students because some of them could be neurodivergent, and it could help them manage challenges they may have, like balancing their workload. I’m not personally a neurodivergent student, but I’ve noticed that many of the same traits are relatable to me, which is interesting. Overall, I’m glad I chose this article because it’s really informative and helps you learn about other people's struggles.

    2. Neurodivergent students see their neurotypical peers as the “ideal” students, which can lead to negative self-judgment—telling oneself, for example, “I don’t do things the way I’m expected to, so there’s something wrong with me,” Syharat says. They often have challenges in areas in which they feel they are expected to excel, so they may struggle to feel that they belong.

      I have chosen this quote because not only can I personally relate to it, but I know many of my friends could relate to it too. When talking to a friend I share a class with, we often wonder why everyone else understands the concepts so much more easily even though we are doing everything we can to try to understand. It's very frustrating and often makes me think I'm missing something, like everyone automatically knows what to do and I'm behind somehow, which all in all can lower confidence in classwork. So I could definitely relate to this paragraph. I think this article connects to the idea of designing for equity and inclusion by giving a voice to the fact that neurodivergent students have felt this way for a long time, and it gives examples of ways they have had to adapt to the "ideal students" world.

    1. According to all known laws of aviation,

      there is no way a bee should be able to fly.

      Its wings are too small to get its fat little body off the ground.

      The bee, of course, flies anyway

      because bees don't care what humans think is impossible.

      Yellow, black. Yellow, black. Yellow, black. Yellow, black.

      Ooh, black and yellow! Let's shake it up a little.

      Barry! Breakfast is ready!

      Coming!

      Hang on a second.

      Hello?

      Barry?

      Adam?

      Can you believe this is happening?

      I can't. I'll pick you up.

      Looking sharp.

      Use the stairs. Your father paid good money for those.

      Sorry. I'm excited.

      Here's the graduate. We're very proud of you, son.

      A perfect report card, all B's.

      Very proud.

      Ma! I got a thing going here.

      You got lint on your fuzz.

      Ow! That's me!

      Wave to us! We'll be in row 118,000.

      Bye!

      Barry, I told you, stop flying in the house!

      Hey, Adam.

      Hey, Barry.

      Is that fuzz gel?

      A little. Special day, graduation.

      Never thought I'd make it.

      Three days grade school, three days high school.

      Those were awkward.

      Three days college. I'm glad I took a day and hitchhiked around the hive.

      You did come back different.

      Hi, Barry.

      Artie, growing a mustache? Looks good.

      Hear about Frankie?

      Yeah.

      You going to the funeral?

      No, I'm not going.

      Everybody knows, sting someone, you die.

      Don't waste it on a squirrel. Such a hothead.

      I guess he could have just gotten out of the way.

      I love this incorporating an amusement park into our day.

      That's why we don't need vacations.

      Boy, quite a bit of pomp… under the circumstances.

      Well, Adam, today we are men.

      We are!

      Bee-men.

      Amen!

      Hallelujah!

      Students, faculty, distinguished bees,

      please welcome Dean Buzzwell.

      Welcome, New Hive City graduating class of…

      …9:15.

      That concludes our ceremonies.

      And begins your career at Honex Industries!

      Will we pick our job today?

      I heard it's just orientation.

      Heads up! Here we go.

      Keep your hands and antennas inside the tram at all times.

      Wonder what it'll be like? A little scary. Welcome to Honex, a division of Honesco

      and a part of the Hexagon Group.

      This is it!

      Wow.

      Wow.

      We know that you, as a bee, have worked your whole life

      to get to the point where you can work for your whole life.

      Honey begins when our valiant Pollen Jocks bring the nectar to the hive.

      Our top-secret formula

      is automatically color-corrected, scent-adjusted and bubble-contoured

      into this soothing sweet syrup

      with its distinctive golden glow you know as…

      Honey!

      That girl was hot.

      She's my cousin!

      She is?

      Yes, we're all cousins.

      Right. You're right.

      At Honex, we constantly strive

      to improve every aspect of bee existence.

      These bees are stress-testing a new helmet technology.

      What do you think he makes? Not enough. Here we have our latest advancement, the Krelman.

      What does that do? Catches that little strand of honey that hangs after you pour it. Saves us millions.

      Can anyone work on the Krelman?

      Of course. Most bee jobs are small ones. But bees know

      that every small job, if it's done well, means a lot.

      But choose carefully

      because you'll stay in the job you pick for the rest of your life.

      The same job the rest of your life? I didn't know that.

      What's the difference?

      You'll be happy to know that bees, as a species, haven't had one day off

      in 27 million years.

      So you'll just work us to death?

      We'll sure try.

      Wow! That blew my mind!

      "What's the difference?" How can you say that?

      One job forever? That's an insane choice to have to make.

      I'm relieved. Now we only have to make one decision in life.

      But, Adam, how could they never have told us that?

      Why would you question anything? We're bees.

      We're the most perfectly functioning society on Earth.

      You ever think maybe things work a little too well here?

      Like what? Give me one example.

      I don't know. But you know what I'm talking about.

      Please clear the gate. Royal Nectar Force on approach.

      Wait a second. Check it out.

      Hey, those are Pollen Jocks! Wow. I've never seen them this close.

      They know what it's like outside the hive.

      Yeah, but some don't come back.

      Hey, Jocks! Hi, Jocks! You guys did great!

      You're monsters! You're sky freaks! I love it! I love it!

      I wonder where they were. I don't know. Their day's not planned.

      Outside the hive, flying who knows where, doing who knows what.

      You can't just decide to be a Pollen Jock. You have to be bred for that.

      Right.

      Look. That's more pollen than you and I will see in a lifetime.

      It's just a status symbol. Bees make too much of it.

      Perhaps. Unless you're wearing it and the ladies see you wearing it.

      Those ladies? Aren't they our cousins too?

      Distant. Distant.

      Look at these two.

      Couple of Hive Harrys. Let's have fun with them. It must be dangerous being a Pollen Jock.

      Yeah. Once a bear pinned me against a mushroom!

      He had a paw on my throat, and with the other, he was slapping me!

      Oh, my! I never thought I'd knock him out. What were you doing during this?

      Trying to alert the authorities.

      I can autograph that.

      A little gusty out there today, wasn't it, comrades?

      Yeah. Gusty.

      We're hitting a sunflower patch six miles from here tomorrow.

      Six miles, huh? Barry! A puddle jump for us, but maybe you're not up for it.

      Maybe I am. You are not! We're going 0900 at J-Gate.

      What do you think, buzzy-boy? Are you bee enough?

      I might be. It all depends on what 0900 means.

      Hey, Honex!

      Dad, you surprised me.

      You decide what you're interested in?

      Well, there's a lot of choices. But you only get one. Do you ever get bored doing the same job every day?

      Son, let me tell you about stirring.

      You grab that stick, and you just move it around, and you stir it around.

      You get yourself into a rhythm. It's a beautiful thing.

      You know, Dad, the more I think about it,

      maybe the honey field just isn't right for me.

      You were thinking of what, making balloon animals?

      That's a bad job for a guy with a stinger.

      Janet, your son's not sure he wants to go into honey!

      Barry, you are so funny sometimes. I'm not trying to be funny. You're not funny! You're going into honey. Our son, the stirrer!

      You're gonna be a stirrer? No one's listening to me! Wait till you see the sticks I have.

      I could say anything right now. I'm gonna get an ant tattoo!

      Let's open some honey and celebrate!

      Maybe I'll pierce my thorax. Shave my antennae.

      Shack up with a grasshopper. Get a gold tooth and call everybody "dawg"!

      I'm so proud.

      We're starting work today! Today's the day. Come on! All the good jobs will be gone.

      Yeah, right.

      Pollen counting, stunt bee, pouring, stirrer, front desk, hair removal…

      Is it still available? Hang on. Two left! One of them's yours! Congratulations! Step to the side.

      What'd you get? Picking crud out. Stellar! Wow!

      Couple of newbies?

      Yes, sir! Our first day! We are ready!

      Make your choice.

      You want to go first? No, you go. Oh, my. What's available?

      Restroom attendant's open, not for the reason you think.

      Any chance of getting the Krelman? Sure, you're on. I'm sorry, the Krelman just closed out.

      Wax monkey's always open.

      The Krelman opened up again.

      What happened?

      A bee died. Makes an opening. See? He's dead. Another dead one.

      Deady. Deadified. Two more dead.

      Dead from the neck up. Dead from the neck down. That's life!

      Oh, this is so hard!

      Heating, cooling, stunt bee, pourer, stirrer,

      humming, inspector number seven, lint coordinator, stripe supervisor,

      mite wrangler. Barry, what do you think I should… Barry?

      Barry!

      All right, we've got the sunflower patch in quadrant nine…

      What happened to you? Where are you?

      I'm going out.

      Out? Out where?

      Out there.

      Oh, no!

      I have to, before I go to work for the rest of my life.

      You're gonna die! You're crazy! Hello?

      Another call coming in.

      If anyone's feeling brave, there's a Korean deli on 83rd

      that gets their roses today.

      Hey, guys.

      Look at that. Isn't that the kid we saw yesterday? Hold it, son, flight deck's restricted.

      It's OK, Lou. We're gonna take him up.

      Really? Feeling lucky, are you?

      Sign here, here. Just initial that.

      Thank you. OK. You got a rain advisory today,

      and as you all know, bees cannot fly in rain.

      So be careful. As always, watch your brooms,

      hockey sticks, dogs, birds, bears and bats.

      Also, I got a couple of reports of root beer being poured on us.

      Murphy's in a home because of it, babbling like a cicada!

      That's awful. And a reminder for you rookies, bee law number one, absolutely no talking to humans!

      All right, launch positions!

      Buzz, buzz, buzz, buzz! Buzz, buzz, buzz, buzz! Buzz, buzz, buzz, buzz!

      Black and yellow!

      Hello!

      You ready for this, hot shot?

      Yeah. Yeah, bring it on.

      Wind, check.

      Antennae, check.

      Nectar pack, check.

      Wings, check.

      Stinger, check.

      Scared out of my shorts, check.

      OK, ladies,

      let's move it out!

      Pound those petunias, you striped stem-suckers!

      All of you, drain those flowers!

      Wow! I'm out!

      I can't believe I'm out!

      So blue.

      I feel so fast and free!

      Box kite!

      Wow!

      Flowers!

      This is Blue Leader. We have roses visual.

      Bring it around 30 degrees and hold.

      Roses!

      30 degrees, roger. Bringing it around.

      Stand to the side, kid. It's got a bit of a kick.

      That is one nectar collector!

      Ever see pollination up close? No, sir. I pick up some pollen here, sprinkle it over here. Maybe a dash over there,

      a pinch on that one. See that? It's a little bit of magic.

      That's amazing. Why do we do that?

      That's pollen power. More pollen, more flowers, more nectar, more honey for us.

      Cool.

      I'm picking up a lot of bright yellow. Could be daisies. Don't we need those?

      Copy that visual.

      Wait. One of these flowers seems to be on the move.

      Say again? You're reporting a moving flower?

      Affirmative.

      That was on the line!

      This is the coolest. What is it?

      I don't know, but I'm loving this color.

      It smells good. Not like a flower, but I like it.

      Yeah, fuzzy.

      Chemical-y.

      Careful, guys. It's a little grabby.

      My sweet lord of bees!

      Candy-brain, get off there!

      Problem!

      Guys! This could be bad. Affirmative.

      Very close.

      Gonna hurt.

      Mama's little boy.

      You are way out of position, rookie!

      Coming in at you like a missile!

      Help me!

      I don't think these are flowers.

      Should we tell him? I think he knows. What is this?!

      Match point!

      You can start packing up, honey, because you're about to eat it!

      Yowser!

      Gross.

      There's a bee in the car!

      Do something!

      I'm driving!

      Hi, bee.

      He's back here!

      He's going to sting me!

      Nobody move. If you don't move, he won't sting you. Freeze!

      He blinked!

      Spray him, Granny!

      What are you doing?!

      Wow… the tension level out here is unbelievable.

      I gotta get home.

      Can't fly in rain.

      Can't fly in rain.

      Can't fly in rain.

      Mayday! Mayday! Bee going down!

      Ken, could you close the window please?

      Ken, could you close the window please?

      Check out my new resume. I made it into a fold-out brochure.

      You see? Folds out.

      Oh, no. More humans. I don't need this.

      What was that?

      Maybe this time. This time. This time. This time! This time! This…

      Drapes!

      That is diabolical.

      It's fantastic. It's got all my special skills, even my top-ten favorite movies.

      What's number one? Star Wars?

      Nah, I don't go for that…

      …kind of stuff.

      No wonder we shouldn't talk to them. They're out of their minds.

      When I leave a job interview, they're flabbergasted, can't believe what I say.

      There's the sun. Maybe that's a way out.

      I don't remember the sun having a big 75 on it.

      I predicted global warming.

      I could feel it getting hotter. At first I thought it was just me.

      Wait! Stop! Bee!

      Stand back. These are winter boots.

      Wait!

      Don't kill him!

      You know I'm allergic to them! This thing could kill me!

      Why does his life have less value than yours?

      Why does his life have any less value than mine? Is that your statement?

      I'm just saying all life has value. You don't know what he's capable of feeling.

      My brochure!

      There you go, little guy.

      I'm not scared of him. It's an allergic thing.

      Put that on your resume brochure.

      My whole face could puff up.

      Make it one of your special skills.

      Knocking someone out is also a special skill.

      Right. Bye, Vanessa. Thanks.

      Vanessa, next week? Yogurt night?

      Sure, Ken. You know, whatever.

      You could put carob chips on there.

      Bye.

      Supposed to be less calories.

      Bye.

      I gotta say something.

      She saved my life. I gotta say something.

      All right, here it goes.

      Nah.

      What would I say?

      I could really get in trouble.

      It's a bee law. You're not supposed to talk to a human.

      I can't believe I'm doing this.

      I've got to.

      Oh, I can't do it. Come on!

      No. Yes. No.

      Do it. I can't.

      How should I start it? "You like jazz?" No, that's no good.

      Here she comes! Speak, you fool!

      Hi!

      I'm sorry.

      You're talking. Yes, I know. You're talking!

      I'm so sorry.

      No, it's OK. It's fine. I know I'm dreaming.

      But I don't recall going to bed.

      Well, I'm sure this is very disconcerting.

      This is a bit of a surprise to me. I mean, you're a bee!

      I am. And I'm not supposed to be doing this,

      but they were all trying to kill me.

      And if it wasn't for you…

      I had to thank you. It's just how I was raised.

      That was a little weird.

      I'm talking with a bee. Yeah. I'm talking to a bee. And the bee is talking to me!

      I just want to say I'm grateful. I'll leave now.

      Wait! How did you learn to do that? What? The talking thing.

      Same way you did, I guess. "Mama, Dada, honey." You pick it up.

      That's very funny. Yeah. Bees are funny. If we didn't laugh, we'd cry with what we have to deal with.

      Anyway…

      Can I…

      …get you something?

      Like what? I don't know. I mean… I don't know. Coffee?

      I don't want to put you out.

      It's no trouble. It takes two minutes.

      It's just coffee.

      I hate to impose.

      Don't be ridiculous!

      Actually, I would love a cup.

      Hey, you want rum cake?

      I shouldn't.

      Have some.

      No, I can't.

      Come on!

      I'm trying to lose a couple micrograms.

      Where? These stripes don't help. You look great!

      I don't know if you know anything about fashion.

      Are you all right?

      No.

      He's making the tie in the cab as they're flying up Madison.

      He finally gets there.

      He runs up the steps into the church. The wedding is on.

      And he says, "Watermelon? I thought you said Guatemalan.

      Why would I marry a watermelon?"

      Is that a bee joke?

      That's the kind of stuff we do.

      Yeah, different.

      So, what are you gonna do, Barry?

      About work? I don't know.

      I want to do my part for the hive, but I can't do it the way they want.

      I know how you feel.

      You do? Sure. My parents wanted me to be a lawyer or a doctor, but I wanted to be a florist.

      Really? My only interest is flowers. Our new queen was just elected with that same campaign slogan.

      Anyway, if you look…

      There's my hive right there. See it?

      You're in Sheep Meadow!

      Yes! I'm right off the Turtle Pond!

      No way! I know that area. I lost a toe ring there once.

      Why do girls put rings on their toes?

      Why not?

      It's like putting a hat on your knee.

      Maybe I'll try that.

      You all right, ma'am?

      Oh, yeah. Fine.

      Just having two cups of coffee!

      Anyway, this has been great. Thanks for the coffee.

      Yeah, it's no trouble.

      Sorry I couldn't finish it. If I did, I'd be up the rest of my life.

      Are you…?

      Can I take a piece of this with me?

      Sure! Here, have a crumb.

      Thanks! Yeah. All right. Well, then… I guess I'll see you around.

      Or not.

      OK, Barry.

      And thank you so much again… for before.

      Oh, that? That was nothing.

      Well, not nothing, but… Anyway…

      This can't possibly work.

      He's all set to go. We may as well try it.

      OK, Dave, pull the chute.

      Sounds amazing. It was amazing! It was the scariest, happiest moment of my life.

      Humans! I can't believe you were with humans!

      Giant, scary humans! What were they like?

      Huge and crazy. They talk crazy.

      They eat crazy giant things. They drive crazy.

      Do they try and kill you, like on TV?

      Some of them. But some of them don't.

      How'd you get back?

      Poodle.

      You did it, and I'm glad. You saw whatever you wanted to see.

      You had your "experience." Now you can pick out your job and be normal.

      Well… Well? Well, I met someone.

      You did? Was she Bee-ish?

      A wasp?! Your parents will kill you!

      No, no, no, not a wasp.

      Spider?

      I'm not attracted to spiders.

      I know it's the hottest thing, with the eight legs and all.

      I can't get by that face.

      So who is she?

      She's… human.

      No, no. That's a bee law. You wouldn't break a bee law.

      Her name's Vanessa. Oh, boy. She's so nice. And she's a florist!

      Oh, no! You're dating a human florist!

      We're not dating.

      You're flying outside the hive, talking to humans that attack our homes

      with power washers and M-80s! One-eighth a stick of dynamite!

      She saved my life! And she understands me.

      This is over!

      Eat this.

      This is not over! What was that?

      They call it a crumb. It was so stingin' stripey! And that's not what they eat. That's what falls off what they eat!

      You know what a Cinnabon is? No. It's bread and cinnamon and frosting. They heat it up…

      Sit down!

      …really hot!

      Listen to me! We are not them! We're us. There's us and there's them!

      Yes, but who can deny the heart that is yearning?

      There's no yearning. Stop yearning. Listen to me!

      You have got to start thinking bee, my friend. Thinking bee!

      Thinking bee. Thinking bee. Thinking bee! Thinking bee! Thinking bee! Thinking bee!

      There he is. He's in the pool.

      You know what your problem is, Barry?

      I gotta start thinking bee?

      How much longer will this go on?

      It's been three days! Why aren't you working?

      I've got a lot of big life decisions to think about.

      What life? You have no life! You have no job. You're barely a bee!

      Would it kill you to make a little honey?

      Barry, come out. Your father's talking to you.

      Martin, would you talk to him?

      Barry, I'm talking to you!

      You coming?

      Got everything?

      All set!

      Go ahead. I'll catch up.

      Don't be too long.

      Watch this!

      Vanessa!

      We're still here. I told you not to yell at him. He doesn't respond to yelling!

      Then why yell at me? Because you don't listen! I'm not listening to this.

      Sorry, I've gotta go.

      Where are you going? I'm meeting a friend. A girl? Is this why you can't decide?

      Bye.

      I just hope she's Bee-ish.

      They have a huge parade of flowers every year in Pasadena?

      To be in the Tournament of Roses, that's every florist's dream!

      Up on a float, surrounded by flowers, crowds cheering.

      A tournament. Do the roses compete in athletic events?

      No. All right, I've got one. How come you don't fly everywhere?

      It's exhausting. Why don't you run everywhere? It's faster.

      Yeah, OK, I see, I see. All right, your turn.

      TiVo. You can just freeze live TV? That's insane!

      You don't have that?

      We have Hivo, but it's a disease. It's a horrible, horrible disease.

      Oh, my.

      Dumb bees!

      You must want to sting all those jerks.

      We try not to sting. It's usually fatal for us.

      So you have to watch your temper.

      Very carefully. You kick a wall, take a walk,

      write an angry letter and throw it out. Work through it like any emotion:

      Anger, jealousy, lust.

      Oh, my goodness! Are you OK?

      Yeah.

      What is wrong with you?! It's a bug. He's not bothering anybody. Get out of here, you creep!

      What was that? A Pic 'N' Save circular?

      Yeah, it was. How did you know?

      It felt like about 10 pages. Seventy-five is pretty much our limit.

      You've really got that down to a science.

      I lost a cousin to Italian Vogue. I'll bet. What in the name of Mighty Hercules is this?

      How did this get here? Cute Bee, Golden Blossom,

      Ray Liotta Private Select?

      Is he that actor?

      I never heard of him.

      Why is this here?

      For people. We eat it.

      You don't have enough food of your own?

      Well, yes.

      How do you get it?

      Bees make it.

      I know who makes it!

      And it's hard to make it!

      There's heating, cooling, stirring. You need a whole Krelman thing!

      It's organic. It's our-ganic! It's just honey, Barry.

      Just what?!

      Bees don't know about this! This is stealing! A lot of stealing!

      You've taken our homes, schools, hospitals! This is all we have!

      And it's on sale?! I'm getting to the bottom of this.

      I'm getting to the bottom of all of this!

      Hey, Hector.

      You almost done? Almost. He is here. I sense it.

      Well, I guess I'll go home now

      and just leave this nice honey out, with no one around.

      You're busted, box boy!

      I knew I heard something. So you can talk!

      I can talk. And now you'll start talking!

      Where you getting the sweet stuff? Who's your supplier?

      I don't understand. I thought we were friends.

      The last thing we want to do is upset bees!

      You're too late! It's ours now!

      You, sir, have crossed the wrong sword!

      You, sir, will be lunch for my iguana, Ignacio!

      Where is the honey coming from?

      Tell me where!

      Honey Farms! It comes from Honey Farms!

      Crazy person!

      What horrible thing has happened here?

      These faces, they never knew what hit them. And now

      they're on the road to nowhere!

      Just keep still.

      What? You're not dead?

      Do I look dead? They will wipe anything that moves. Where you headed?

      To Honey Farms. I am onto something huge here.

      I'm going to Alaska. Moose blood, crazy stuff. Blows your head off!

      I'm going to Tacoma.

      And you? He really is dead. All right.

      Uh-oh!

      What is that?!

      Oh, no!

      A wiper! Triple blade!

      Triple blade?

      Jump on! It's your only chance, bee!

      Why does everything have to be so doggone clean?!

      How much do you people need to see?!

      Open your eyes! Stick your head out the window!

      From NPR News in Washington, I'm Carl Kasell.

      But don't kill no more bugs!

      Bee!

      Moose blood guy!!

      You hear something?

      Like what?

      Like tiny screaming.

      Turn off the radio.

      Whassup, bee boy?

      Hey, Blood.

      Just a row of honey jars, as far as the eye could see.

      Wow!

      I assume wherever this truck goes is where they're getting it.

      I mean, that honey's ours.

      Bees hang tight. We're all jammed in. It's a close community.

      Not us, man. We on our own. Every mosquito on his own.

      What if you get in trouble? You a mosquito, you in trouble. Nobody likes us. They just smack. See a mosquito, smack, smack!

      At least you're out in the world. You must meet girls.

      Mosquito girls try to trade up, get with a moth, dragonfly.

      Mosquito girl don't want no mosquito.

      You got to be kidding me!

      Mooseblood's about to leave the building! So long, bee!

      Hey, guys! Mooseblood! I knew I'd catch y'all down here. Did you bring your crazy straw?

      We throw it in jars, slap a label on it, and it's pretty much pure profit.

      What is this place?

      A bee's got a brain the size of a pinhead.

      They are pinheads!

      Pinhead.

      Check out the new smoker. Oh, sweet. That's the one you want. The Thomas 3000!

      Smoker?

      Ninety puffs a minute, semi-automatic. Twice the nicotine, all the tar.

      A couple breaths of this knocks them right out.

      They make the honey, and we make the money.

      "They make the honey, and we make the money"?

      Oh, my!

      What's going on? Are you OK?

      Yeah. It doesn't last too long.

      Do you know you're in a fake hive with fake walls?

      Our queen was moved here. We had no choice.

      This is your queen? That's a man in women's clothes!

      That's a drag queen!

      What is this?

      Oh, no!

      There's hundreds of them!

      Bee honey.

      Our honey is being brazenly stolen on a massive scale!

      This is worse than anything bears have done! I intend to do something.

      Oh, Barry, stop.

      Who told you humans are taking our honey? That's a rumor.

      Do these look like rumors?

      That's a conspiracy theory. These are obviously doctored photos.

      How did you get mixed up in this?

      He's been talking to humans.

      What? Talking to humans?! He has a human girlfriend. And they make out!

      Make out? Barry!

      We do not.

      You wish you could. Whose side are you on? The bees!

      I dated a cricket once in San Antonio. Those crazy legs kept me up all night.

      Barry, this is what you want to do with your life?

      I want to do it for all our lives. Nobody works harder than bees!

      Dad, I remember you coming home so overworked

      your hands were still stirring. You couldn't stop.

      I remember that.

      What right do they have to our honey?

      We live on two cups a year. They put it in lip balm for no reason whatsoever!

      Even if it's true, what can one bee do?

      Sting them where it really hurts.

      In the face! The eye!

      That would hurt. No. Up the nose? That's a killer.

      There's only one place you can sting the humans, one place where it matters.

      Hive at Five, the hive's only full-hour action news source.

      No more bee beards!

      With Bob Bumble at the anchor desk.

      Weather with Storm Stinger.

      Sports with Buzz Larvi.

      And Jeanette Chung.

      Good evening. I'm Bob Bumble. And I'm Jeanette Chung. A tri-county bee, Barry Benson,

      intends to sue the human race for stealing our honey,

      packaging it and profiting from it illegally!

      Tomorrow night on Bee Larry King,

      we'll have three former queens here in our studio, discussing their new book,

      Classy Ladies, out this week on Hexagon.

      Tonight we're talking to Barry Benson.

      Did you ever think, "I'm a kid from the hive. I can't do this"?

      Bees have never been afraid to change the world.

      What about Bee Columbus? Bee Gandhi? Bejesus?

      Where I'm from, we'd never sue humans.

      We were thinking of stickball or candy stores.

      How old are you?

      The bee community is supporting you in this case,

      which will be the trial of the bee century.

      You know, they have a Larry King in the human world too.

      It's a common name. Next week…

      He looks like you and has a show and suspenders and colored dots…

      Next week…

      Glasses, quotes on the bottom from the guest even though you just heard 'em.

      Bear Week next week! They're scary, hairy and here live.

      Always leans forward, pointy shoulders, squinty eyes, very Jewish.

      In tennis, you attack at the point of weakness!

      It was my grandmother, Ken. She's 81.

      Honey, her backhand's a joke! I'm not gonna take advantage of that?

      Quiet, please. Actual work going on here.

      Is that that same bee? Yes, it is! I'm helping him sue the human race.

      Hello. Hello, bee. This is Ken.

      Yeah, I remember you. Timberland, size ten and a half. Vibram sole, I believe.

      Why does he talk again?

      Listen, you better go 'cause we're really busy working.

      But it's our yogurt night!

      Bye-bye.

      Why is yogurt night so difficult?!

      You poor thing. You two have been at this for hours!

      Yes, and Adam here has been a huge help.

      Frosting… How many sugars? Just one. I try not to use the competition.

      So why are you helping me?

      Bees have good qualities.

      And it takes my mind off the shop.

      Instead of flowers, people are giving balloon bouquets now.

      Those are great, if you're three.

      And artificial flowers.

      Oh, those just get me psychotic! Yeah, me too. Bent stingers, pointless pollination.

      Bees must hate those fake things!

      Nothing worse than a daffodil that's had work done.

      Maybe this could make up for it a little bit.

      This lawsuit's a pretty big deal. I guess. You sure you want to go through with it?

      Am I sure? When I'm done with the humans, they won't be able

      to say, "Honey, I'm home," without paying a royalty!

      It's an incredible scene here in downtown Manhattan,

      where the world anxiously waits, because for the first time in history,

      we will hear for ourselves if a honeybee can actually speak.

      What have we gotten into here, Barry?

      It's pretty big, isn't it?

      I can't believe how many humans don't work during the day.

      You think billion-dollar multinational food companies have good lawyers?

      Everybody needs to stay behind the barricade.

      What's the matter? I don't know, I just got a chill. Well, if it isn't the bee team.

      You boys work on this?

      All rise! The Honorable Judge Bumbleton presiding.

      All right. Case number 4475,

      Superior Court of New York, Barry Bee Benson v. the Honey Industry

      is now in session.

      Mr. Montgomery, you're representing the five food companies collectively?

      A privilege.

      Mr. Benson… you're representing all the bees of the world?

      I'm kidding. Yes, Your Honor, we're ready to proceed.

      Mr. Montgomery, your opening statement, please.

      Ladies and gentlemen of the jury,

      my grandmother was a simple woman.

      Born on a farm, she believed it was man's divine right

      to benefit from the bounty of nature God put before us.

      If we lived in the topsy-turvy world Mr. Benson imagines,

      just think of what would it mean.

      I would have to negotiate with the silkworm

      for the elastic in my britches!

      Talking bee!

      How do we know this isn't some sort of

      holographic motion-picture-capture Hollywood wizardry?

      They could be using laser beams!

      Robotics! Ventriloquism! Cloning! For all we know,

      he could be on steroids!

      Mr. Benson?

      Ladies and gentlemen, there's no trickery here.

      I'm just an ordinary bee. Honey's pretty important to me.

      It's important to all bees. We invented it!

      We make it. And we protect it with our lives.

      Unfortunately, there are some people in this room

      who think they can take it from us

      'cause we're the little guys! I'm hoping that, after this is all over,

      you'll see how, by taking our honey, you not only take everything we have

      but everything we are!

      I wish he'd dress like that all the time. So nice!

      Call your first witness.

      So, Mr. Klauss Vanderhayden of Honey Farms, big company you have.

      I suppose so.

      I see you also own Honeyburton and Honron!

      Yes, they provide beekeepers for our farms.

      Beekeeper. I find that to be a very disturbing term.

      I don't imagine you employ any bee-free-ers, do you?

      No.

      I couldn't hear you.

      No.

      No.

      Because you don't free bees. You keep bees. Not only that,

      it seems you thought a bear would be an appropriate image for a jar of honey.

      They're very lovable creatures.

      Yogi Bear, Fozzie Bear, Build-A-Bear.

      You mean like this?

      Bears kill bees!

      How'd you like his head crashing through your living room?!

      Biting into your couch! Spitting out your throw pillows!

      OK, that's enough. Take him away.

      So, Mr. Sting, thank you for being here. Your name intrigues me.

      Where have I heard it before? I was with a band called The Police. But you've never been a police officer, have you?

      No, I haven't.

      No, you haven't. And so here we have yet another example

      of bee culture casually stolen by a human

      for nothing more than a prance-about stage name.

      Oh, please.

      Have you ever been stung, Mr. Sting?

      Because I'm feeling a little stung, Sting.

      Or should I say… Mr. Gordon M. Sumner!

      That's not his real name?! You idiots!

      Mr. Liotta, first, belated congratulations on

      your Emmy win for a guest spot on ER in 2005.

      Thank you. Thank you.

      I see from your resume that you're devilishly handsome

      with a churning inner turmoil that's ready to blow.

      I enjoy what I do. Is that a crime?

      Not yet it isn't. But is this what it's come to for you?

      Exploiting tiny, helpless bees so you don't

      have to rehearse your part and learn your lines, sir?

      Watch it, Benson! I could blow right now!

      This isn't a goodfella. This is a badfella!

      Why doesn't someone just step on this creep, and we can all go home?!

      Order in this court! You're all thinking it! Order! Order, I say!

      Say it! Mr. Liotta, please sit down! I think it was awfully nice of that bear to pitch in like that.

      I think the jury's on our side.

      Are we doing everything right, legally?

      I'm a florist.

      Right. Well, here's to a great team.

      To a great team!

      Well, hello.

      Ken! Hello. I didn't think you were coming.

      No, I was just late. I tried to call, but… the battery.

      I didn't want all this to go to waste, so I called Barry. Luckily, he was free.

      Oh, that was lucky.

      There's a little left. I could heat it up.

      Yeah, heat it up, sure, whatever.

      So I hear you're quite a tennis player.

      I'm not much for the game myself. The ball's a little grabby.

      That's where I usually sit. Right… there.

      Ken, Barry was looking at your resume,

      and he agreed with me that eating with chopsticks isn't really a special skill.

      You think I don't see what you're doing?

      I know how hard it is to find the right job. We have that in common.

      Do we?

      Bees have 100 percent employment, but we do jobs like taking the crud out.

      That's just what I was thinking about doing.

      Ken, I let Barry borrow your razor for his fuzz. I hope that was all right.

      I'm going to drain the old stinger.

      Yeah, you do that.

      Look at that.

      You know, I've just about had it

      with your little mind games.

      What's that? Italian Vogue. Mamma mia, that's a lot of pages.

      A lot of ads.

      Remember what Van said, why is your life more valuable than mine?

      Funny, I just can't seem to recall that!

      I think something stinks in here!

      I love the smell of flowers.

      How do you like the smell of flames?!

      Not as much.

      Water bug! Not taking sides!

      Ken, I'm wearing a Chapstick hat! This is pathetic!

      I've got issues!

      Well, well, well, a royal flush!

      You're bluffing. Am I? Surf's up, dude!

      Poo water!

      That bowl is gnarly.

      Except for those dirty yellow rings!

      Kenneth! What are you doing?!

      You know, I don't even like honey! I don't eat it!

      We need to talk!

      He's just a little bee!

      And he happens to be the nicest bee I've met in a long time!

      Long time? What are you talking about?! Are there other bugs in your life?

      No, but there are other things bugging me in life. And you're one of them!

      Fine! Talking bees, no yogurt night…

      My nerves are fried from riding on this emotional roller coaster!

      Goodbye, Ken.

      And for your information,

      I prefer sugar-free, artificial sweeteners made by man!

      I'm sorry about all that.

      I know it's got an aftertaste! I like it!

      I always felt there was some kind of barrier between Ken and me.

      I couldn't overcome it. Oh, well.

      Are you OK for the trial?

      I believe Mr. Montgomery is about out of ideas.

      We would like to call Mr. Barry Benson Bee to the stand.

      Good idea! You can really see why he's considered one of the best lawyers…

      Yeah.

      Layton, you've gotta weave some magic

      with this jury, or it's gonna be all over.

      Don't worry. The only thing I have to do to turn this jury around

      is to remind them of what they don't like about bees.

      You got the tweezers? Are you allergic? Only to losing, son. Only to losing.

      Mr. Benson Bee, I'll ask you what I think we'd all like to know.

      What exactly is your relationship

      to that woman?

      We're friends.

      Good friends? Yes. How good? Do you live together?

      Wait a minute…

      Are you her little…

      …bedbug?

      I've seen a bee documentary or two. From what I understand,

      doesn't your queen give birth to all the bee children?

      Yeah, but…

      So those aren't your real parents!

      Oh, Barry…

      Yes, they are!

      Hold me back!

      You're an illegitimate bee, aren't you, Benson?

      He's denouncing bees!

      Don't y'all date your cousins?

      Objection! I'm going to pincushion this guy! Adam, don't! It's what he wants!

      Oh, I'm hit!!

      Oh, lordy, I am hit!

      Order! Order!

      The venom! The venom is coursing through my veins!

      I have been felled by a winged beast of destruction!

      You see? You can't treat them like equals! They're striped savages!

      Stinging's the only thing they know! It's their way!

      Adam, stay with me. I can't feel my legs. What angel of mercy will come forward to suck the poison

      from my heaving buttocks?

      I will have order in this court. Order!

      Order, please!

      The case of the honeybees versus the human race

      took a pointed turn against the bees

      yesterday when one of their legal team stung Layton T. Montgomery.

      Hey, buddy.

      Hey.

      Is there much pain?

      Yeah.

      I…

      I blew the whole case, didn't I?

      It doesn't matter. What matters is you're alive. You could have died.

      I'd be better off dead. Look at me.

      They got it from the cafeteria downstairs, in a tuna sandwich.

      Look, there's a little celery still on it.

      What was it like to sting someone?

      I can't explain it. It was all…

      All adrenaline and then… and then ecstasy!

      All right.

      You think it was all a trap?

      Of course. I'm sorry. I flew us right into this.

      What were we thinking? Look at us. We're just a couple of bugs in this world.

      What will the humans do to us if they win?

      I don't know.

      I hear they put the roaches in motels. That doesn't sound so bad.

      Adam, they check in, but they don't check out!

      Oh, my.

      Could you get a nurse to close that window?

      Why? The smoke. Bees don't smoke.

      Right. Bees don't smoke.

      Bees don't smoke! But some bees are smoking.

      That's it! That's our case!

      It is? It's not over?

      Get dressed. I've gotta go somewhere.

      Get back to the court and stall. Stall any way you can.

      And assuming you've done step correctly, you're ready for the tub.

      Mr. Flayman.

      Yes? Yes, Your Honor!

      Where is the rest of your team?

      Well, Your Honor, it's interesting.

      Bees are trained to fly haphazardly,

      and as a result, we don't make very good time.

      I actually heard a funny story about…

      Your Honor, haven't these ridiculous bugs

      taken up enough of this court's valuable time?

      How much longer will we allow these absurd shenanigans to go on?

      They have presented no compelling evidence to support their charges

      against my clients, who run legitimate businesses.

      I move for a complete dismissal of this entire case!

      Mr. Flayman, I'm afraid I'm going

      to have to consider Mr. Montgomery's motion.

      But you can't! We have a terrific case.

      Where is your proof? Where is the evidence?

      Show me the smoking gun!

      Hold it, Your Honor! You want a smoking gun?

      Here is your smoking gun.

      What is that?

      It's a bee smoker!

      What, this? This harmless little contraption?

      This couldn't hurt a fly, let alone a bee.

      Look at what has happened

      to bees who have never been asked, "Smoking or non?"

      Is this what nature intended for us?

      To be forcibly addicted to smoke machines

      and man-made wooden slat work camps?

      Living out our lives as honey slaves to the white man?

      What are we gonna do? He's playing the species card. Ladies and gentlemen, please, free these bees!

      Free the bees! Free the bees!

      Free the bees!

      Free the bees! Free the bees!

      The court finds in favor of the bees!

      Vanessa, we won!

      I knew you could do it! High-five!

      Sorry.

      I'm OK! You know what this means?

      All the honey will finally belong to the bees.

      Now we won't have to work so hard all the time.

      This is an unholy perversion of the balance of nature, Benson.

      You'll regret this.

      Barry, how much honey is out there?

      All right. One at a time.

      Barry, who are you wearing?

      My sweater is Ralph Lauren, and I have no pants.

      What if Montgomery's right? What do you mean? We've been living the bee way a long time, 27 million years.

      Congratulations on your victory. What will you demand as a settlement?

      First, we'll demand a complete shutdown of all bee work camps.

      Then we want back the honey that was ours to begin with,

      every last drop.

      We demand an end to the glorification of the bear as anything more

      than a filthy, smelly, bad-breath stink machine.

      We're all aware of what they do in the woods.

      Wait for my signal.

      Take him out.

      He'll have nauseous for a few hours, then he'll be fine.

      And we will no longer tolerate bee-negative nicknames…

      But it's just a prance-about stage name!

      …unnecessary inclusion of honey in bogus health products

      and la-dee-da human tea-time snack garnishments.

      Can't breathe.

      Bring it in, boys!

      Hold it right there! Good.

      Tap it.

      Mr. Buzzwell, we just passed three cups, and there's gallons more coming!

      I think we need to shut down! Shut down? We've never shut down. Shut down honey production!

      Stop making honey!

      Turn your key, sir!

      What do we do now?

      Cannonball!

      We're shutting honey production!

      Mission abort.

      Aborting pollination and nectar detail. Returning to base.

      Adam, you wouldn't believe how much honey was out there.

      Oh, yeah?

      What's going on? Where is everybody?

      Are they out celebrating? They're home. They don't know what to do. Laying out, sleeping in.

      I heard your Uncle Carl was on his way to San Antonio with a cricket.

      At least we got our honey back.

      Sometimes I think, so what if humans liked our honey? Who wouldn't?

      It's the greatest thing in the world! I was excited to be part of making it.

      This was my new desk. This was my new job. I wanted to do it really well.

      And now…

      Now I can't.

      I don't understand why they're not happy.

      I thought their lives would be better!

      They're doing nothing. It's amazing. Honey really changes people.

      You don't have any idea what's going on, do you?

      What did you want to show me? This. What happened here?

      That is not the half of it.

      Oh, no. Oh, my.

      They're all wilting.

      Doesn't look very good, does it?

      No.

      And whose fault do you think that is?

      You know, I'm gonna guess bees.

      Bees?

      Specifically, me.

      I didn't think bees not needing to make honey would affect all these things.

      It's not just flowers. Fruits, vegetables, they all need bees.

      That's our whole SAT test right there.

      Take away produce, that affects the entire animal kingdom.

      And then, of course…

      The human species?

      So if there's no more pollination,

      it could all just go south here, couldn't it?

      I know this is also partly my fault.

      How about a suicide pact?

      How do we do it?

      I'll sting you, you step on me. That just kills you twice. Right, right.

      Listen, Barry… sorry, but I gotta get going.

      I had to open my mouth and talk.

      Vanessa?

      Vanessa? Why are you leaving? Where are you going?

      To the final Tournament of Roses parade in Pasadena.

      They've moved it to this weekend because all the flowers are dying.

      It's the last chance I'll ever have to see it.

      Vanessa, I just wanna say I'm sorry. I never meant it to turn out like this.

      I know. Me neither.

      Tournament of Roses. Roses can't do sports.

      Wait a minute. Roses. Roses?

      Roses!

      Vanessa!

      Roses?!

      Barry?

      Roses are flowers! Yes, they are. Flowers, bees, pollen!

      I know. That's why this is the last parade.

      Maybe not. Could you ask him to slow down?

      Could you slow down?

      Barry!

      OK, I made a huge mistake. This is a total disaster, all my fault.

      Yes, it kind of is.

      I've ruined the planet. I wanted to help you

      with the flower shop. I've made it worse.

      Actually, it's completely closed down.

      I thought maybe you were remodeling.

      But I have another idea, and it's greater than my previous ideas combined.

      I don't want to hear it!

      All right, they have the roses, the roses have the pollen.

      I know every bee, plant and flower bud in this park.

      All we gotta do is get what they've got back here with what we've got.

      Bees.

      Park.

      Pollen!

      Flowers.

      Repollination!

      Across the nation!

      Tournament of Roses, Pasadena, California.

      They've got nothing but flowers, floats and cotton candy.

      Security will be tight.

      I have an idea.

      Vanessa Bloome, FTD.

      Official floral business. It's real.

      Sorry, ma'am. Nice brooch.

      Thank you. It was a gift.

      Once inside, we just pick the right float.

      How about The Princess and the Pea?

      I could be the princess, and you could be the pea!

      Yes, I got it.

      Where should I sit?

      What are you?

      I believe I'm the pea.

      The pea?

      It goes under the mattresses.

      Not in this fairy tale, sweetheart. I'm getting the marshal. You do that! This whole parade is a fiasco!

      Let's see what this baby'll do.

      Hey, what are you doing?!

      Then all we do is blend in with traffic…

      …without arousing suspicion.

      Once at the airport, there's no stopping us.

      Stop! Security.

      You and your insect pack your float? Yes. Has it been in your possession the entire time?

      Would you remove your shoes?

      Remove your stinger. It's part of me. I know. Just having some fun. Enjoy your flight.

      Then if we're lucky, we'll have just enough pollen to do the job.

      Can you believe how lucky we are? We have just enough pollen to do the job!

      I think this is gonna work.

      It's got to work.

      Attention, passengers, this is Captain Scott.

      We have a bit of bad weather in New York.

      It looks like we'll experience a couple hours delay.

      Barry, these are cut flowers with no water. They'll never make it.

      I gotta get up there and talk to them.

      Be careful.

      Can I get help with the Sky Mall magazine?

      I'd like to order the talking inflatable nose and ear hair trimmer.

      Captain, I'm in a real situation.

      What'd you say, Hal? Nothing. Bee!

      Don't freak out! My entire species…

      What are you doing?

      Wait a minute! I'm an attorney! Who's an attorney? Don't move.

      Oh, Barry.

      Good afternoon, passengers. This is your captain.

      Would a Miss Vanessa Bloome in 24B please report to the cockpit?

      And please hurry!

      What happened here?

      There was a DustBuster, a toupee, a life raft exploded.

      One's bald, one's in a boat, they're both unconscious!

      Is that another bee joke? No! No one's flying the plane!

      This is JFK control tower, Flight 356. What's your status?

      This is Vanessa Bloome. I'm a florist from New York.

      Where's the pilot?

      He's unconscious, and so is the copilot.

      Not good. Does anyone onboard have flight experience?

      As a matter of fact, there is.

      Who's that? Barry Benson. From the honey trial?! Oh, great.

      Vanessa, this is nothing more than a big metal bee.

      It's got giant wings, huge engines.

      I can't fly a plane.

      Why not? Isn't John Travolta a pilot? Yes. How hard could it be?

      Wait, Barry! We're headed into some lightning.

      This is Bob Bumble. We have some late-breaking news from JFK Airport,

      where a suspenseful scene is developing.

      Barry Benson, fresh from his legal victory…

      That's Barry!

      …is attempting to land a plane, loaded with people, flowers

      and an incapacitated flight crew.

      Flowers?!

      We have a storm in the area and two individuals at the controls

      with absolutely no flight experience.

      Just a minute. There's a bee on that plane.

      I'm quite familiar with Mr. Benson and his no-account compadres.

      They've done enough damage.

      But isn't he your only hope?

      Technically, a bee shouldn't be able to fly at all.

      Their wings are too small…

      Haven't we heard this a million times?

      "The surface area of the wings and body mass make no sense."

      Get this on the air!

      Got it.

      Stand by.

      We're going live.

      The way we work may be a mystery to you.

      Making honey takes a lot of bees doing a lot of small jobs.

      But let me tell you about a small job.

      If you do it well, it makes a big difference.

      More than we realized. To us, to everyone.

      That's why I want to get bees back to working together.

      That's the bee way! We're not made of Jell-O.

      We get behind a fellow.

      Black and yellow! Hello! Left, right, down, hover.

      Hover? Forget hover. This isn't so hard. Beep-beep! Beep-beep!

      Barry, what happened?!

      Wait, I think we were on autopilot the whole time.

      That may have been helping me. And now we're not! So it turns out I cannot fly a plane.

      All of you, let's get behind this fellow! Move it out!

      Move out!

      Our only chance is if I do what I'd do, you copy me with the wings of the plane!

      Don't have to yell.

      I'm not yelling! We're in a lot of trouble.

      It's very hard to concentrate with that panicky tone in your voice!

      It's not a tone. I'm panicking!

      I can't do this!

      Vanessa, pull yourself together. You have to snap out of it!

      You snap out of it.

      You snap out of it.

      You snap out of it!

      You snap out of it!

      You snap out of it!

      You snap out of it!

      You snap out of it!

      You snap out of it!

      Hold it!

      Why? Come on, it's my turn.

      How is the plane flying?

      I don't know.

      Hello?

      Benson, got any flowers for a happy occasion in there?

      The Pollen Jocks!

      They do get behind a fellow.

      Black and yellow. Hello. All right, let's drop this tin can on the blacktop.

      Where? I can't see anything. Can you?

      No, nothing. It's all cloudy.

      Come on. You got to think bee, Barry.

      Thinking bee. Thinking bee. Thinking bee! Thinking bee! Thinking bee!

      Wait a minute. I think I'm feeling something.

      What? I don't know. It's strong, pulling me. Like a 27-million-year-old instinct.

      Bring the nose down.

      Thinking bee! Thinking bee! Thinking bee!

      What in the world is on the tarmac? Get some lights on that! Thinking bee! Thinking bee! Thinking bee!

      Vanessa, aim for the flower. OK. Cut the engines. We're going in on bee power. Ready, boys?

      Affirmative!

      Good. Good. Easy, now. That's it.

      Land on that flower!

      Ready? Full reverse!

      Spin it around!

      Not that flower! The other one!

      Which one?

      That flower.

      I'm aiming at the flower!

      That's a fat guy in a flowered shirt. I mean the giant pulsating flower

      made of millions of bees!

      Pull forward. Nose down. Tail up.

      Rotate around it.

      This is insane, Barry! This's the only way I know how to fly. Am I koo-koo-kachoo, or is this plane flying in an insect-like pattern?

      Get your nose in there. Don't be afraid. Smell it. Full reverse!

      Just drop it. Be a part of it.

      Aim for the center!

      Now drop it in! Drop it in, woman!

      Come on, already.

      Barry, we did it! You taught me how to fly!

      Yes. No high-five! Right. Barry, it worked! Did you see the giant flower?

      What giant flower? Where? Of course I saw the flower! That was genius!

      Thank you. But we're not done yet. Listen, everyone!

      This runway is covered with the last pollen

      from the last flowers available anywhere on Earth.

      That means this is our last chance.

      We're the only ones who make honey, pollinate flowers and dress like this.

      If we're gonna survive as a species, this is our moment! What do you say?

      Are we going to be bees, or just Museum of Natural History keychains?

      We're bees!

      Keychain!

      Then follow me! Except Keychain.

      Hold on, Barry. Here.

      You've earned this.

      Yeah!

      I'm a Pollen Jock! And it's a perfect fit. All I gotta do are the sleeves.

      Oh, yeah.

      That's our Barry.

      Mom! The bees are back!

      If anybody needs to make a call, now's the time.

      I got a feeling we'll be working late tonight!

      Here's your change. Have a great afternoon! Can I help who's next?

      Would you like some honey with that? It is bee-approved. Don't forget these.

      Milk, cream, cheese, it's all me. And I don't see a nickel!

      Sometimes I just feel like a piece of meat!

      I had no idea.

      Barry, I'm sorry. Have you got a moment?

      Would you excuse me? My mosquito associate will help you.

      Sorry I'm late.

      He's a lawyer too?

      I was already a blood-sucking parasite. All I needed was a briefcase.

      Have a great afternoon!

      Barry, I just got this huge tulip order, and I can't get them anywhere.

      No problem, Vannie. Just leave it to me.

      You're a lifesaver, Barry. Can I help who's next?

      All right, scramble, jocks! It's time to fly.

      Thank you, Barry!

      That bee is living my life!

      Let it go, Kenny.

      When will this nightmare end?!

      Let it all go.

      Beautiful day to fly.

      Sure is.

      Between you and me, I was dying to get out of that office.

      You have got to start thinking bee, my friend.

      Thinking bee! Me? Hold it. Let's just stop for a second. Hold it.

      I'm sorry. I'm sorry, everyone. Can we stop here?

      I'm not making a major life decision during a production number!

      All right. Take ten, everybody. Wrap it up, guys.

      I had virtually no rehearsal for that.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02438

      Corresponding author(s): Ryusuke, Niwa

      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      Below are quotes from the Reviewers' overall evaluations:

      As might be expected based on the authors' skills and expertise, the study is well executed, nicely documented with perfect microscopy images, and well presented. It has been easy to follow. However, suitability for publication depends on where the authors aim to place their paper. Although I like the paper very much, it might seem incomplete for high-end journals.

      This is a very nice paper and solid piece of work.

      Its major strength is the focus on the poorly studied male reproductive organ and identification of Ldh as a novel target of JH activity in the seminal vesicles.

      While the developmental roles of insect Juvenile Hormone (JH) are very well studied, its adult functions are largely unknown. Target genes of JH signaling are poorly described. This study adds significant insight into both of these aspects. The study underscores the usefulness of the JHRE-GFP reporter that identifies JH function, and not just JH presence since the reporter is only expressed after JH binding to Met and Gce, a prerequisite for JHRE reporter activation.

      The authors have identified the epithelial cells of the Drosophila seminal vesicle as a JH target tissue. The authors nicely extended this finding by mining already existing expression data to identify a specific JH induced gene in these cells.

      This small study reports new but limited results (one tissue of one stage, one hormone) that could be useful for specialists. The work is solid and includes controls and interpretable data.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.

      1) The study suggests an important role for JH signaling in the SV, likely affecting reproductive capacity of males. The authors depleted the JH receptors through RNAi, achieving a loss in the expression of the WT JHRE-GFP reporter as well as of the authentic target Ldh. Surprisingly, no phenotypic consequences of the double KD of Met and gce are presented. Does that mean that there were none? The authors only discuss a potential impact of Ldh loss for metabolism. Unless I am missing something, the study reports molecular phenotypes that clearly document JH signaling in the SV but no physiological impact of loss of this JH signaling, suggesting that there may be no obvious biological role for JH in this context. I think this is unlikely. Have the authors checked fertility of the males, sperm viability and quality, mating competitiveness of the RNAi males? Loss of JH epoxidation (only methyl farnesoate present) made mosquito males less fit and less reproductively competitive relative to epox+ controls (Nouzova et al., 2021, PNAS) -- btw, I think the authors should discuss this paper.

      Our response: We will conduct the following experiments to answer these criticisms.

      1) We will examine male fertility by counting the number of offspring from wild-type mothers crossed with seminal vesicle-specific Met & gce double RNAi males and with control RNAi males.

      2) We will also examine the mating competitiveness of the RNAi males. In more detail, we will cross w1118 (white eye) wild-type background females with (i) a mixed population of w1118 wild-type background males and w+ (red eye) control RNAi males, and (ii) a mixed population of w1118 wild-type background males and w+ Met & gce double RNAi males. We can distinguish the progeny of RNAi males from those of wild-type males by eye color.

      By conducting plans 1) and 2), we will also indirectly evaluate sperm viability and quality.

      In addition, we will also discuss the paper of Nouzova et al. PNAS 2021 in the Discussion section.

      2) The authors seem to have made no effort to distinguish between Met and Gce functions. It is always the results from the double knockdown of both paralogs that are presented. Does this mean that single-KD had no effect, thereby indicating entirely redundant functions of both proteins in the studied context? Even if so, it would be of interest to document this redundancy by showing the single-gene KD data. However, I would be surprised if both proteins were equally important in the SV. The authors checked mRNA/protein expression levels. Was any of the two paralogs prevalent in the SV?

      Our response: To address this criticism, we will conduct a single transgenic RNAi experiment to knock down either Met or gce separately and assess JHRE-GFP signals in the seminal vesicles.

      Regarding the expression of Met and gce in the seminal vesicles, a previous study (Baumann et al. Scientific Reports 7: 2132, DOI:10.1038/s41598-017-02264-4) has already reported that GFP signals are observed in the seminal vesicles of Met-T2A-GAL4>UAS-GFP and gce-T2A-GAL4>UAS-GFP animals. These results strongly indicate that both Met and gce are expressed in the seminal vesicles. We will describe and discuss this point in our revised manuscript. In addition, we plan to check and analyze gene expression of Met, gce, and Ldh in the seminal vesicles using a publicly available single-cell RNA-seq database, such as DRscDB (https://www.flyrnai.org/tools/singlecell/web/).

      3) The authors argue for direct regulation of Ldh by Met/Gce (again by which one?). Oddly, the statement in the Results (l.187-188; "suggests ... direct target") is stronger than in the Discussion (l.214, "leaving open the possibility"). The putative JHREs upstream and within the Ldh gene are identified but not tested in a functional study. At least a simple luciferase reporter assay and mutagenesis of the JHREs should be attempted.

      Our response: To address this criticism, we plan to conduct a luciferase-based promoter/enhancer analysis in Drosophila S2 cultured cells. A similar system was used to test the JH-responsiveness of the JHRE promoter in a previous study (Jindra et al. PLoS Genetics 11: e1005394, DOI: 10.1371/journal.pgen.1005394). We will generate plasmid constructs carrying the luciferase coding region. In these plasmids, the luciferase coding region will be fused with the upstream region and the first intron region of Ldh, carrying either intact or mutated E-boxes. Then, we will determine whether luciferase activity is enhanced by the presence of a JH analog (methoprene) when the E-boxes are intact.

      For this revision, a new collaborator, Ryosuke Hayashi (a graduate student in the Niwa lab), will participate in this analysis. Thus, he will become a co-author of the revised manuscript.

      l.232-233. It is not surprising that the JHRR-lacZ reporter shows a different expression pattern relative to JHRE-GFP, as these are really different constructs. The problem is that JH-dependent activation of the JHRR-lacZ transgene has not been tested as thoroughly as that of JHRE-GFP. Is it inducible by added JH or methoprene?

      Have the authors examined whether JHRE-lacZ expression increases with Methoprene?

      Our response: We have yet to do this analysis. To address this important point from Reviewers #1 and #2, we will examine whether JHRR-lacZ expression is upregulated in the seminal vesicles of virgin males fed methoprene-supplemented food. The lacZ signals will be visualized by immunostaining with an anti-LacZ antibody.

      Document testis staining of JHRE-GFP. I think the authors missed a chance by not providing a clear/nice picture of the testis staining. Stainings of testes squashed on a slide is easy and would nicely document in which cells the reporter is activated. Similarly, extracting sperm from the seminal vesicle and examining whether the sperm express JHRE-GFP would be informative.

      Our response: As the reviewer suggested, we will assess JHRE-GFP signal in sperm in squashed testis samples.

      Did the authors try to analyze the 66 genes identified in seminal vesicle whether they had JHRE elements? This could yield additional significant information about other JH responsive genes in the seminal vesicle.

      Our response: We have yet to do this analysis. We will follow the reviewer's suggestion and examine whether the 66 genes identified in the seminal vesicle have JHRE elements.

      3a. Doublestaining would further confirm that pd8-Gal4 (crossed to UAS-dsRed) and JHRE-GFP overlap.

      3b. Similarly, Doublestaining would further confirm that pd8-Gal4 (crossed to UAS-dsREd) and JHRE-GFP overlap.

      Our response: To address this question, we will generate males of Pde8-GAL4; UAS-red fluorescent protein (RedStinger, RFP, or DsRed); JHRE-GFP and observe the overlap between the red fluorescent signals and green fluorescent (JHRE-GFP) signals in the seminal vesicle epithelial cells.

      Minor comments:

      Fig.1a could be in a supplement.

      Our response: At this point, we are unsure whether to follow this reviewer's suggestion. This is because there are no supplemental figures in the current manuscript, so we hesitate to create a supplemental figure just for this one figure. On the other hand, three reviewers now ask us to perform various additional experiments, thus some of the new data may be shown as supplemental figures. In this case, Fig. 1a can be moved to a supplemental figure, but we would like to wait on this decision.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      l.25,91,117, and throughout, "JH analog" or "JHA". The authors only use methoprene, so it would be better to specifically talk about methoprene, which is a proven agonist ligand of the JHR proteins (reference 10 and/or Jindra and Bittova, 2020 [Arch Insect Biochem Physiol] for a review). This would lend more credibility to using methoprene than just referring to a "JHA".

      Our response: According to the reviewer's suggestion, we have replaced "JHA" with "methoprene" wherever possible. In Figures, we used "MTP" instead of "methoprene" due to space limitations.

      l.42,44, "paralogs". I believe in this case the authors refer to orthologs of Met in other species. Paralogs result from gene duplications within species, such as Met and gce in cyclorrhaphous flies or Met 1 and 2 in the Lepidoptera. I recommend a recent review on all bHLH-PAS proteins featuring reconstruction of the phylogenetic position of Met/Gce (Tumova et al., 2024 in J Mol Biol).

      Our response: As suggested, we have replaced "paralogs" and "paralogous" with "orthologs" and "orthologous," respectively on P3. We have also cited Tumova et al. J. Mol. Biol. 2023 as a new Ref 12.

      l.54, "Met and Gce act redundantly to regulate JH-responsive gene expression". Ref 10 should be cited here as it provides functional cell-based and genetic rescue evidence for each paralog.

      Our response: We have cited Ref 10 as suggested.

      l.66, It would be better to start "In this study" or "Here" to distinguish from the last cited paper.

      Our response: We created a new paragraph with the sentence "In this study..." at the beginning. We hope we understand the reviewer's suggestion correctly.

      l.175, levels were

      Our response: We have fixed this error in the transferred manuscript.

      l.209, might be evolutionarily among.... conserved ??

      Our response: We have fixed this error in the transferred manuscript.

      l.226, study has

      Our response: We have fixed this error in the transferred manuscript.

      l.227-229. The authors are missing a paper by Shin et al., 2012 (PNAS) that shows physical interaction of Met with Cycle and their regulation of circadian gene activity and another paper by Bajgar et al., 2013 (PNAS) which describes photoperiod-dependent seasonal regulation of circadian genes by Met, Clk and Cyc.

      On the other hand, the cited reference [51] does NOT demonstrate Met:Clk heterodimer since coIP is by no means adequate to address complex stoichiometry. In fact, it is suspicious that Met would heterodimerize with either Cyc or Clk, as they represent class II and class I bHLH-PAS proteins.

      Our response: In response to both comments from Reviewer #1, we have cited these references and rewritten the discussion on P10-11 as below: "An interesting previous study has reported that the seminal vesicle expresses multiple clock genes such as period, Clock (Clk), and timeless, all of which are necessary for generating proper circadian rhythm [52]. In the case of the mosquito Aedes aegypti female, it is reported that JH controls gene expression via a heterodimer of Met and circadian rhythm factor Cycle (CYC) [53]. It was also suggested that Met binds directly to CLK in D. melanogaster [54]. In addition, in the linden bug, Pyrrhocoris apterus, JH alters gene expression via Met, CLK, and CYC in the gut [55]. Considering these previous reports and our results, circadian rhythm factors and JH may cooperate to regulate gene expression in the seminal vesicles."

      l.245. It is not "whether", but for sure the existing reporters only reflect limited JHR activity, being based on Kr-h1 JHREs. These reporters likely uncover only a small subset of JH activity in vivo.

      Our response: We have rewritten the sentence as follows: "..., more comprehensive JH reporter strains will be needed in D. melanogaster as well as other insects in future studies."

      reference 10/11 is duplicated.

      Our response: We have fixed this error in the transferred manuscript.

      Have the authors done a careful comparison of JHRE-GFP expression and the Met/gce reporter expression described by Baumann et al (Scientific Reports | 7: 2132 | DOI:10.1038/s41598-017-02264-4)? Would be nice to add a few more sentences in the discussion.

      Our response: As suggested, we have added some sentences to explain this point on Page 11 as below: "Previous studies reported that Met-T2A-GAL4 and gce-T2A-GAL4 labeled male accessory glands, ejaculatory duct, and testes as well as seminal vesicles. On the other hand, in our results, JHRE-GFP only labels cells in seminal vesicles and testes [21]. Considering that Met and Gce are expressed in almost all cell types of male reproductive tracts [21], more comprehensive JH reporter strains will be needed in D. melanogaster as well as other insects in future studies."

      In the discussion:

      6.1 Would have liked to see a more in depth discussion of the role of the seminal vesicle. How could that be supported by JH / metabolic processes? Does it have secretory functions that might be induced by JH? Important functions relative to sperm storage? How could that relate to the finding that JH response is enhanced by mating?

      Our response: Unfortunately, the function of the seminal vesicles is largely unknown. However, in response to the reviewer's suggestion, we have added some sentences to discuss this point and cited some references describing the seminal vesicles in insects other than the fruit fly, as follows on P9-10: "Furthermore, in some insects other than D. melanogaster, morphological and ultrastructural studies revealed that secretory vesicles were observed in the epithelial cells of the seminal vesicles [37,38,40,44]. JH is known to stimulate secretory activity in the male accessory glands of many insects [45]. Based on the JH response in the seminal vesicles, it is possible that JH signaling affects the secretory activity of the seminal vesicles in D. melanogaster."

      The arrow in figure is not defined

      Our response: We believe that the reviewer pointed out the arrow in Figure 1e. We have added a sentence to define the arrow in the Figure legend as "The arrow indicates the cell with a GFP signal."

      Figure 2b graph labels are flipped

      Our response: We have fixed the error.

      Line 624: Change "Allow heads" to "Arrowheads"

      Our response: We have fixed this error in the transferred manuscript.

      Major Comments:

      The work uses standard methods and strains. Although the specific findings are new and believable, the authors interpret them beyond what is appropriate. For example, based on increased amounts of a single RNA, they propose that JH regulates metabolism in seminal vesicles and because circadian rhythm genes were known to be expressed in this tissue they propose that JH and circadian systems work together there.

      Our response: In response to the reviewer's criticisms, we have discussed our arguments more appropriately in the Discussion. For example, we have mentioned circadian rhythm more carefully on Pages 10-11 as follows: "An interesting previous study has reported that the seminal vesicle expresses multiple clock genes such as period, Clock (Clk), and timeless, all of which are necessary for generating proper circadian rhythm [52]. In the case of the mosquito Aedes aegypti female, it is reported that JH controls gene expression via a heterodimer of Met and circadian rhythm factor Cycle (CYC) [53]. It was also suggested that Met binds directly to CLK in D. melanogaster [54]. In addition, in the linden bug, Pyrrhocoris apterus, JH alters gene expression via Met, CLK and CYC in the gut [55]. Considering these previous reports and our results, it is possible that circadian rhythm factors and JH cooperatively regulate gene expression in the seminal vesicles."

      Regarding Ldh, we have added a sentence on Page 10 as "Also, the biological significance of the induction of Ldh expression by JH signaling is not clear."

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      l.244, tract

      Our response: We have carefully checked out the usage of "tract" and "tracts" not only on Page 11 but also throughout the manuscript. We have decided to use "tracts," but not "tract," throughout the manuscript.

      6.2 What do epithelial cells of spermatheca do?

      Our response: We agree with the reviewer that this is a very interesting question. However, please note that this paper focuses on males, and females are beyond our current scope. We plan to examine JHRE-GFP signals in the spermatheca in a different project. We do appreciate the reviewer's kind understanding.

      6.3 How do the authors envision that JH enters the epithelial cells?

      Our response: We don't have any hypotheses on this point. Transporters may exist to achieve intracellular permeability of JH, but we do not think this point has been discussed in current insect physiology. Furthermore, since this issue is related to all JH-responsive cells, not just seminal vesicle epithelial cells, we do not feel the need to discuss it in this paper.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      Both reviewers positively received the manuscript, in general. The agreement was that the manuscript presented valuable findings, using solid techniques and approaches, that shed additional light into how the canine distemper virus hemagglutinin might engage cellular receptors and how that engagement impacts host tropism. While both reviewers appreciated the X-ray crystallographic data, they also felt that the AFM experiments could have been performed at a higher standard and that the interpretation of the results ensuing from those AFM experiments could have been explained more thoroughly and in simpler terms. An additional missed opportunity of the current manuscript is the lack of comparison of the crystal structure to that of the already published cryo-EM structure, for context.

      Thank you very much for the constructive comments of the editor and reviewers. Following your comments, we have rewritten the text related to the AFM experiments in simpler terms, as follows.

      “When CDV-H was loaded onto a mica substrate and scanned with a cantilever to acquire images of attached molecules, the CDV-H dimer was observed as two globules clustered together in most cases, but sometimes, each domain moved independently (Fig. 7B and Supplementary Movie). Time-course analysis of the dynamics of the representative CDV-H dimer showed that CDV-H could adopt both associated and dissociated forms (Fig. 7C). The distances between the domains were calculated by measuring those between the centers of mass of each domain. Finally, the distribution of distances between each head domain in the CDV-H dimers showed approximately 15 nm as a major peak (Fig. 7D). This is a reasonable length for the linker between the head domain dimers.” in Page 11, Lines 8-17.

      With regards to the structural comparison between cryo-EM structure published in Proc. Natl. Acad. Sci. U. S. A. (2023) 120, e2208866120 and our crystal structure, we have compared these structures for Cα on page 6 and added the following text. “A recent cryo-EM structure of the wild-type CDV-H ectodomain revealed that the head dimer is located on one side of the stalk region in solution (Proc. Natl. Acad. Sci. U. S. A. (2023) 120, e2208866120)” in Page 14, Lines 22-24.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Fukuhara, Maenaka, and colleagues report a crystal structure of the canine distemper virus (CDV) attachment hemagglutinin protein globular domain. The structure shows a dimeric organization of the viral protein and describes the detailed amino-acid side chain interactions between the two protomers. The authors also use their best judgement to comment on predicted sites for the two cellular receptors - Nectin-4 and SLAM - and thus speculate on the CDV host tropism. A complementary AFM study suggests a breathing movement at the hemagglutinin dimer interface.

      Strengths:

      The study of CDV and related Paramyxoviruses is significant for human/animal health and is very timely. The crystallographic data seem to be of good quality.

      Thank you very much for the constructive comment of the reviewer.

      Weaknesses:

      While the recent CDV hemagglutinin cryo-EM structure is mentioned, it is not compared to the present crystal structure, and thus the context of the present study is poorly justified. Additionally, the results of the AFM experiment are not unexpected. Indeed, other paramyxoviral RBP/G proteins also show movement at the protomer interface.

      Thank you very much for the constructive comments of the reviewer. When we submitted our manuscript to eLife, the cryo-EM structure published in Proc. Natl. Acad. Sci. U. S. A. (2023) 120, e2208866120 had appeared only a week earlier and was not yet available to us. Following the comment of the reviewer, we have added text about the structural comparison between the cryo-EM structure and our crystal structure. We have also changed the text related to the AFM experiments to tone down the movement of the protomer interface, as follows.

      “This observation raises the possibility that each head domain of CDV-H also dissociates and moves flexibly, as shown in the structure of Nipah virus (NiV)-G protein, previously (Science (2022) 375, 1373–1378).” in Page 11, Lines 4-6.

      Reviewer #2 (Public Review):

      Summary:

      The authors solved the crystal structure of CDV H-protein head domain at 3.2 Å resolution to better understand the detailed mechanism of membrane fusion triggering. The structure clearly showed that the orientation of the H monomers in the homodimer was similar to that of measles virus H and different from other paramyxoviruses. The authors used the available co-crystal structures of the closely related measles virus H with the SLAM and Nectin4 receptors to map the receptor binding site on CDV H. The authors also confirmed which N-linked sites were glycosylated in the CDV H protein and showed that both wildtype and vaccine strains of CDV H have the same glycosylation pattern. The authors documented that the glycans cover a vast majority of the H surface while leaving the receptor binding site exposed, which may in part explain the long-term success of measles virus and CDV vaccines. Finally, the authors used HS-AFM to visualize the real-time dynamic characteristics of CDV-H under physiological conditions. This analysis indicated that homodimers may dissociate into monomers, which has implications for the model of fusion triggering.

      The structural data and analysis were thorough and well-presented. However, the HS-AFM data, while very exciting, was not presented in a manner that could be easily grasped by readers of this manuscript. I have some suggestions for improvement.

      (1) The authors claim their structure is very similar to the recently published cryo-EM structure of CDV H. Can the authors provide us with a quantitative assessment of this statement?

      Thank you very much for the constructive comments of the reviewer. When we submitted our manuscript to eLife, the cryo-EM structure published in Proc. Natl. Acad. Sci. U. S. A. (2023) 120, e2208866120 had appeared only a week earlier and was not yet available to us. Following the comment of the reviewer, we have added text about the structural comparison between the cryo-EM structure and our crystal structure. We have also changed the text related to the AFM experiments to tone down the movement of the protomer interface, as follows.

      “This observation raises the possibility that each head domain of CDV-H also dissociates and moves flexibly, as shown in the structure of Nipah virus (NiV)-G protein, previously (Science (2022) 375, 1373–1378).” in Page 11, Lines 4-6.

      (2) The results for the HS-AFM are difficult to follow and it is not clear how the authors came to their conclusions. Can the authors better explain this data and justify their conclusions based on it?

      Thank you very much for the constructive comments of the reviewer. Following your comments, we have rewritten the text related to the AFM experiments in simpler terms, as follows.

      “When CDV-H was loaded onto a mica substrate and scanned with a cantilever to acquire images of attached molecules, the CDV-H dimer was observed as two globules clustered together in most cases, but sometimes, each domain moved independently (Fig. 7B and Supplementary Movie). Time-course analysis of the dynamics of the representative CDV-H dimer showed that CDV-H could adopt both associated and dissociated forms (Fig. 7C). The distances between the domains were calculated by measuring those between the centers of mass of each domain. Finally, the distribution of distances between each head domain in the CDV-H dimers showed approximately 15 nm as a major peak (Fig. 7D). This is a reasonable length for the linker between the head domain dimers.” in Page 11, Lines 8-17.
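
      The distance measurement described in the quoted passage (distance between the centers of mass of the two head domains) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' analysis pipeline: the height-weighted centroid, the `(x_nm, y_nm, height)` pixel representation, and the function names `center_of_mass` and `domain_distance` are all hypothetical, and the toy coordinates are invented so that the result matches the approximately 15 nm peak mentioned in the text.

```python
import math

def center_of_mass(pixels):
    """Height-weighted centroid of one domain (assumed weighting scheme).

    pixels: list of (x_nm, y_nm, height) tuples belonging to the domain.
    """
    total = sum(h for _, _, h in pixels)
    cx = sum(x * h for x, _, h in pixels) / total
    cy = sum(y * h for _, y, h in pixels) / total
    return cx, cy

def domain_distance(domain_a, domain_b):
    """Euclidean distance (nm) between the centroids of two domains."""
    ax, ay = center_of_mass(domain_a)
    bx, by = center_of_mass(domain_b)
    return math.hypot(bx - ax, by - ay)

# Toy data (invented for illustration only): two small globules.
domain_a = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)]    # centroid at (1, 0)
domain_b = [(15.0, 0.0, 2.0), (17.0, 0.0, 2.0)]  # centroid at (16, 0)

distance_nm = domain_distance(domain_a, domain_b)
print(f"inter-domain distance: {distance_nm:.1f} nm")
```

      Repeating such a measurement frame by frame would give the kind of distance distribution summarized in Fig. 7D, although the actual segmentation and weighting used for the HS-AFM images are not specified here.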

      (3) The fusion triggering model in Figure 8 is ambiguous as to when H-F interactions are occurring and when they may be disrupted. The authors should clarify this point in their model.

      Thank you very much for the constructive comments of the reviewer. Following your comments, we have changed Figure 8 and its legend.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) AFM experiments with SLAM or Nectin-4 immobilized on the cantilever would be much more informative.

      Thank you very much for the constructive comment of the reviewer. We will try this experiment in the next paper.

      (2) The authors should compare their crystal structure to that of the reported cryo-EM structure.

      With regards to the structural comparison between cryo-EM structure published in Proc. Natl. Acad. Sci. U. S. A. (2023) 120, e2208866120 and our crystal structure, we have added the text.

      (3) Figure 1D - why does the beta2 MG negative control have such a high SPR signal?

      Thank you very much for the constructive comment of the reviewer. The immobilization levels for β2-microglobulin (beta2 MG), CDV-OP-H and CDV-5VD-H were similar, 1204.7 RU, 1235.7 RU, and 1504.5 RU, respectively. We applied relatively high concentrations (5 mM) of dNectin4 and hNectin4 onto the chip to determine low-affinity dissociation constants. Then, the signals for beta2 MG (negative control) were high. In other SPR experiments for cell surface receptors, such high signals for beta2 MG were often observed in our previous paper, Kuroki et al., J. Immunol. 2019 Dec 15;203(12):3386-3394. doi: 10.4049/jimmunol.1900562. Therefore, we think that these SPR signals are not unusual.

      (4) Figure 1C - please indicate the Ve volume for the peak and add in Ve for standard.

      Thank you very much for the constructive comment of the reviewer. We have indicated the Ve volume for the peak and added in Ve for standard in Figure 1C.

      (5) The authors mention that one of the chains in the asymmetric unit was better resolved than the other. Please show regions of the atomic model fit regions of the electron density to convince the reader of the quality of your data.

      Thank you very much for the constructive comment of the reviewer. We have added new Supplementary figure 2 for comparison of electron density maps of chains A and B.

      (6) Table 2 indicates that the difference between Rw and Rf values is larger than 5% which indicates slight overfitting during refinement. Please provide details of your refinement strategy and attempt simulated annealing as a strategy to reduce this delta.

      Thank you very much for the constructive comment of the reviewer. We further introduced TLS and NCS parameters for the refinement. Consequently, the R/Rfree factors became 0.2645/0.3092. Simulated annealing had already been carried out. All the refinement statistics in Table 2 have been updated.
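
      As a quick arithmetic check of the values quoted above against the reviewer's 5% rule of thumb, the gap between Rfree and R can be computed directly. This is a minimal sketch; the function name `r_gap` is hypothetical, and the 0.05 cutoff is the reviewer's stated threshold rather than a universal criterion.

```python
def r_gap(r_work: float, r_free: float) -> float:
    """Return the difference R-free minus R-work."""
    return r_free - r_work

# Values quoted in the response after adding TLS and NCS parameters.
gap = r_gap(0.2645, 0.3092)

# 0.05 is the threshold named by the reviewer, not a fixed standard.
exceeds_threshold = gap > 0.05

print(f"Rfree - R = {gap:.4f}; exceeds 5%: {exceeds_threshold}")
```

      With the updated statistics the difference is about 0.045, which is below the 5% gap the reviewer flagged.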

      Reviewer #2 (Recommendations For The Authors):

      (1) The authors' fusion triggering model was difficult to follow. For example, this sentence was difficult to understand: "The other possible models may include the monomer-dimer-tetramer transition facilitated by receptor binding for the fusion."

      Thank you very much for the constructive comment of the reviewer. Following your comments, we have removed the above sentence and have added the detailed mechanism of the proposed model in the Discussion. Furthermore, we have changed Figure 8 and its legend so that readers can understand the model more clearly.

      (2) Figure 5A is not called out in the main text.

      Thank you very much for the constructive comment of the reviewer. Following your comments, we have added the text as follows.

      “the crystal structure of MeV-H in complex with hNectin-4 showed that the H-SLAM interaction consists of three main sites (Fig. 5A) (Nat. Struct. Mol. Biol. (2013) 20, 67–72).” in Page 11, Lines 4-6.

      (3) Page 9, Line 4: interspaces? Perhaps interphases.

      Thank you very much for the constructive comment of the reviewer. We have changed the term “interspaces” to “internal spaces”.

      (4) Page 12, penultimate line: The authors mention "epitopes for anti-MeV-H Abs." Do they mean anti-CDV-H Abs?

      Thank you very much for the constructive comment of the reviewer. Following your comments, we have changed the “anti-MeV-H Abs” to “anti-morbillivirus H neutralizing antibodies”.

      (5) The paper will benefit from an English language editor to help clarify what the authors are trying to convey.

      Thank you very much for the constructive comment of the reviewer.

      We have asked an English proofreading company to check the manuscript.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We thank the reviewers for their time and effort to improve and clarify our manuscript. We now have addressed the reviewers’ suggestions in full on a point-by-point basis. Revisions in the manuscript file are highlighted in yellow.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Supernumerary centrosomes are observed in the majority of human tumors. In cells they induce abnormal mitosis leading to chromosome missegregation and aneuploidy. In animal models it is demonstrated that extra centrosomes are sufficient to drive tumor formation. Previous work studying the impact of centrosome amplification on tumor formation in vivo used Plk4 overexpression to drive the formation of supernumerary centrosomes. In this manuscript Moussa and co-workers from the Krämer group developed a mouse model in which centrosome amplification is triggered by the overexpression of the structural centrosomal protein STIL rather than the kinase Plk4 in order to a) assess the potential for centrosome amplification induced by STIL overexpression to drive tumor formation and b) to rule out any potential non-centrosomal related effects of the kinase Plk4 on tumor formation. The authors show that STIL overexpression in cells (MEFs) drives centrosome amplification and aberrant mitosis (Fig. 1), leading to chromosome missegregation and aneuploidy (Fig. 2). They also show that STIL overexpression is linked to reduced cellular proliferation and apoptosis (Fig 3). The authors then present in vivo experiments performed in mice. They observed that STIL expression causes embryonic lethality, microcephaly and a reduced lifespan (Fig 4). Despite increased STIL mRNA levels they do not detect elevated STIL protein levels in adult tissues except for the spleen. They do not detect a significant increase of centrosome amplification or aneuploidy in animal tissues (Fig 4) and they conclude that STIL translation is shut down in most adult tissues. The authors then assess the impact of STIL overexpression on tumor formation. They observed a reduced spontaneous tumor formation despite elevated STIL mRNA levels in both healthy and tumor (lymphomas) tissues of mice overexpressing STIL. They don't detect increased centrosome amplification and aneuploidy in lymphomas from STIL overexpressing mice compared to lymphomas naturally occurring in control animals (Fig 5). Finally, they found that STIL overexpression suppresses chemical skin carcinogenesis using a combination of tamoxifen induction of STIL in the skin with DMBA/TPA carcinogenic treatment (Fig 7). They link this effect to an increased number of centrioles and a reduction in the number of cycling cells in the skin of STIL overexpressing mice (Fig 6).

      The manuscript is written in a clear manner. The experimental approaches are properly designed and the experimental methods are described in sufficient details. Most of the experimental data present a good number of replicates. The figures are generally well assembled despite some errors in a few panels/legends (see major and minor points). Most of the conclusions are supported by the experimental data. However, a few specific points or interpretations are not convincingly supported by the experimental data (see major points) and will need to be revised and/or reformulated.

      Major points:

      1. Figures 1D and F show that MEFs hemizygous (CMV-STIL+/-) and homozygous (CMV-STIL+/+) for STIL present similar levels of centrosome amplification and aberrant mitosis. However, despite these similarities, the homozygous MEFs display about two times more micronuclei and chromosome aberrations (Fig. 2). The authors explain this discrepancy by the fact that MEFs homozygous for STIL have reduced proliferation and an increased propensity to stay in interphase compared to hemizygous MEFs (Fig. 3). I don't understand why an interphase arrest would lead to a higher chromosomal instability resulting in higher micronuclei formation and abnormal karyotypes since those phenotypes are the consequences of abnormal mitosis occurring in cycling cells. I would rather argue that homozygous MEFs are more prone to cell cycle arrest because of mitotic errors, but those mitotic errors cannot be explained by the centrosome status or the mitotic figures quantified in homozygous MEFs. Therefore, the authors' explanation written as: "Graded inhibition of proliferation and accumulation of cells in interphase explains why CMV-STIL+/- and CMV-STIL+/+ MEFs contain increasing frequencies of micronuclei and aberrant karyotypes (Fig. 2) despite similar levels of supernumerary centrosomes" is not right for me. The authors should reformulate this section of the manuscript so their conclusion fits their data. The differences between hemi- and homozygous MEFs regarding chromosome stability could come from mitotic errors they did not spot using fixed immunofluorescence images of mitotic MEFs. Thus, as an optional additional experiment, analyzing live mitosis of MEFs could potentially help reconcile results from mitotic figures and from karyotypes.

      We basically agree with the reviewer and have therefore reanalyzed our data on centriole numbers in a time-dependent manner. As already shown in Figure 3L of the initial manuscript version, the number of both CMV-STIL+/- and CMV-STIL+/+ MEFs with supernumerary centrioles increases with passaging from passage 3 (p3) to p6. Also, in this experiment amplified centrioles were more frequent in CMV-STIL+/+ compared to CMV-STIL+/- MEFs in both passages (p3 and p6) analyzed. We have therefore now pooled the data and substituted the former Figure panel 1D by these combined results. As the results of Figure 1F and especially those for the CMV-STIL+/+ MEFs had to rely on very low mitotic figure counts, because these cells only very rarely divide (as shown in Figure 3A; mitosis frequency of CMV-STIL+/+ MEFs 0.12%), we have now deleted Figure panel 1F from the manuscript. For the same reason - an extremely low proliferation and division rate of especially CMV-STIL+/+ MEFs - live cell imaging to detect different types of mitotic errors, is unfortunately not feasible.

      Figure 5 panel F does not support the claim of the main text and does not match the legend of the figure: In the text the authors wrote: "Ki67 immunostaining revealed that, ..., proliferation rates were elevated independent from lymphoma genotypes". If the authors claim an increased cell proliferation in lymphoma compared to lymph nodes, which is expected, they should show the data for the lymph node in the graph. In addition, in the legend the authors mentioned a "Percentage of Ki67-positive cells in healthy spleens and lymphomas from mice with the indicated genotypes." Since there are three genotypes and two tissue types but the figure presents a graph with only three bars, were the spleen and lymphoma data combined? Or were some data not inserted in the graph? Thus, since the data do not support the claim for an increased cell proliferation in lymphoma, the authors' explanation for the increased protein level observed in these lymphomas (Fig. 5 panel E) is not supported. Therefore, the authors need to present the correct data in the figure or to change their conclusion. They will also need to correct the figure legend and to add a panel with images illustrating the Ki67 labelling in the different tissues in the figure.

      We apologize for this mistake and have corrected the legend to Figure panel 5F, which now reads: “Percentage of Ki67-positive cells in two B6-STIL, two CMV-STIL+/- and one CMV-STIL+/+ lymphoma. For comparison, frequencies of Ki67-positive cells in healthy lymph nodes from B6-STIL mice are displayed. Data are means ± SEM from at least two independent immunostainings per lymphoma or healthy lymph node. P-values were calculated using the one-way ANOVA with post-hoc Tukey test for multiple comparison. For space reasons, only statistically significant differences are displayed”.

         We agree with the reviewer that for comparison Ki67 immunostainings of healthy lymph node tissue was missing in the graph and have therefore added this information to the figure panel, which shows increased proliferation of lymphoma compared to normal lymph node cells. Also, a panel with images illustrating Ki67 labelling in healthy lymph node and lymphomas from different genotypes has been added to the figure (panel 5G).
      

      Minor points:

      1. In the introduction, page 4 paragraph 3, the authors wrote: "To assess the impact of centrosome amplification on CIN, senescence, lifespan and tumor formation in vivo without interfering with extracentrosomal traits,..." they need to clarify what they meant by extracentrosomal traits.

      As requested by the reviewer we have modified the respective sentence, which now reads: “To assess the impact of centrosome amplification on CIN, senescence, lifespan and tumor formation in vivo with an orthologous approach without interfering with PLK4, we generated transgenic mouse models overexpressing the structural centrosome protein STIL, …”.


      In the 1st paragraph of the results, page 4, the authors wrote: "leads to ubiquitous transgene expression at levels similar to the CAG promoter used in most..." but there is no link to a figure presenting the mRNA levels in those mice (potentially Fig. 4F and Fig. S6). Also, in the references cited for comparison, to my knowledge, there was no measurement of Plk4 mRNA levels in tissues in the work from Marthiens and colleagues, in this work the authors assess the expression of the Plk4 transgene by investigating the presence of the protein.

      To show STIL transgene expression levels in our system, we have now linked Figure panels 1A (STIL mRNA expression in MEFs), 1B (STIL protein expression in MEFs) and Supplemental Fig. S2 (Supplemental Fig. S6 of the previous manuscript version showing STIL mRNA levels in healthy mouse tissues) to this statement as suggested. In the references now cited for comparison (Kulukian et al. 2015; Vitre et al. 2015; Sercin et al. 2016) PLK4 transgene mRNA (Kulukian et al. 2015; Sercin et al. 2016) and protein levels (Vitre et al. 2015) are shown.


      Page 5 second line the authors wrote: "Despite the graded increase in Plk4 expression, CMV-STIL+/- and, CMV-STIL+/+ MEFs exhibited a similar increase in supernumerary centrioles". The authors must have meant an increase in STIL expression, or do they have data not shown about an increase of Plk4 expression? Then they explain this absence of difference in supernumerary centrioles by the ability of "excess Plk4" to access the centrosome; again they probably meant STIL. Regarding this point and related to Major Point 1 it might be worthwhile for the authors to quantify actual extra centrosomes in mitosis rather than cells with more than 4 centrioles in interphase (as in Fig. 1C, D). They might find differences in the number of centrosomes in hemizygous versus homozygous MEFs.

      We indeed meant STIL instead of PLK4 and have corrected the mistake. As described in our response to the reviewer’s major point 1 we have now reanalyzed our data on centriole numbers in a time-dependent manner. As already shown in Figure 3L of the initial manuscript version, the frequency of both CMV-STIL+/- and CMV-STIL+/+ MEFs with supernumerary centrioles increases with passaging from passage 3 (p3) to p6. Also, in this experiment amplified centrioles were more frequent in CMV-STIL+/+ compared to CMV-STIL+/- MEFs in both passages (p3 and p6) analyzed. We have therefore now pooled and substituted the former Figure panel 1D by these combined results.

      Page 5, in the first paragraph the authors mention "the rate of respective mitotic aberrations..." without defining the mitotic aberrations. For instance, in panel 1E a metaphase with 4 centrosomes is shown for CMV-STIL+/- while an anaphase with an unknown number of clustered centrosomes is presented for CMV-STIL+/+. Classifying the different types of aberrant mitotic figures (i.e: multipolar anaphases versus bipolar with clustered centrosomes) might help the authors identify differences between hemi and homozygous MEFS that may explain the differences in the proportions of chromosomes aberrations they present in Fig. 2.

      As described in our response to the reviewer’s major point 1 the number of mitotic figures that could be analyzed was extremely low, especially for CMV-STIL+/+ MEFs, which do only rarely divide (mitosis frequency of CMV-STIL+/+ MEFs 0.12%). Therefore, although certainly of value, classification of different types of mitotic aberrations is unfortunately not feasible.


      In Fig 4A the number of mice analyzed should be mentioned.

      After mating of B6-STIL transgenic animals with CMV-CRE mice and further breeding of successive generations, we obtained a total of 198 pups over four generations, 162 of which were born alive: 116 B6-STIL wildtype animals, 27 CMV-STIL+/- and 19 CMV-STIL-/- mice. We have now added these numbers to the figure legend.


      In Fig. 5E, the band corresponding to STIL protein is difficult to visualize in the B6-STIL control, it is therefore difficult to compare its level to the level of STIL protein in the CMV-STIL hemizygotes and homozygotes. If possible, it would improve the manuscript to present a blot with clearer results.

      We have tried to improve the quality by repeating the Western blot. Due to the small size of healthy mouse lymph nodes, resulting in low protein yields, only lysates from lymphomas were left, and these were of poor quality with a high lipid content. We therefore tried to delipidate the lymphoma lysates and hope that the result of the new blot is now somewhat clearer. Due to the low lymphoma frequency in CMV-STIL hemizygotes and homozygotes (only 2 in each case) we were unfortunately not able to prepare fresh lysates.

      Related to Figure 6B the authors wrote a "5 to 10 fold-increased expression..." in the text while panel 6B show a maximum of 8 fold increase.

      The respective statement has been rephrased according to the reviewer´s suggestion.

      Reviewer #1 (Significance (Required)):

      Centrosome amplification is a demonstrated cause of genomic instability and tumor development as shown in multiple previous works performed in mice. In this work, Moussa and co-workers developed a mouse model that does not depend on Plk4 to trigger centrosome amplification but which depends on the overexpression of the centrosome structural protein STIL. This effort is welcome as previous works could not formally rule out a potential role of Plk4, not related to its centrosome duplication function, in tumor formation. The authors show that their system is functional in MEFs where STIL overexpression drives centrosome amplification and aneuploidy. Unfortunately, in vivo, despite elevated levels of STIL mRNA they do not detect centrosome amplification in tissues and consequently, they do not observe an increased rate of aneuploidy and tumor formation. This result is not surprising as previous studies using strong promoters (comparable to the one used to drive STIL expression in this study) to induce Plk4 overexpression led to similar results, i.e. an absence of centrosome amplification in adult tissues and no effects on tumor formation. Therefore, the results and the concepts proposed in this work are not novel but they reinforce previous studies showing the deleterious effect of high levels of centrosome amplification on cells. This work also confirms that strong mechanisms, here the authors propose a translational shut-down, are preventing the appearance or the persistence of high levels of centrosome amplification in animal tissues. By complementing existing results with the use of an alternate experimental approach this study will be of interest for the scientific community working on the basic biological mechanisms driving aneuploidy and tumor development.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Moussa et al. describe the effects of over-expressing the centriole duplication factor STIL in whole mice and with expression restricted to the skin. They find that over expression of STIL, similar to that of PLK4, induces centriole overduplication, abnormal mitoses, and genetic instability leading to cell arrest. Additionally, over-expressing STIL results in microcephaly, perinatal lethality and a shortened lifespan. In addition, they do not find that expression of the p53 R127H mutant alleviates the cell growth defect. Moreover, overexpression of STIL does not lead to increased general tumour formation and suppresses tumour formation in an induced skin tumour model.

      Although this is an interesting manuscript, the authors need to address a number of issues before this manuscript can be recommended for publication. Importantly, the manuscript lacks statistical analyses to support some of its conclusions, some figures should be quantified, and controls are missing in some cases.

      Major Issues

      1. Many of the figure panels lack appropriate statistical analyses to support the conclusions (see details below). This needs to be rectified.

      In view of the limited number of mice (due to an increased frequency of pups that died around birth) and the resulting impossibility of performing several (>3) independent experiments in many cases, we have decided to limit the statistics in the main text to a descriptive analysis without mentioning inferences (p-values). Nevertheless, we have now included the missing statistical analyses in the figure panels and/or legends. However, the reported p-values (*p≤0.05, **p≤0.01, ***p≤0.001; ns, not significant) should be interpreted as descriptive rather than confirmatory values.
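
      The significance notation quoted in this response (*p≤0.05, **p≤0.01, ***p≤0.001; ns, not significant) amounts to a simple threshold lookup, sketched below. The function name `significance_label` is hypothetical and the example p-values are invented for illustration; the labels are descriptive annotations, as the authors state, not confirmatory tests.

```python
def significance_label(p: float) -> str:
    """Map a p-value to the star notation quoted in the response."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"

# Invented example p-values, for illustration only.
for p in (0.0004, 0.008, 0.03, 0.2):
    print(f"p = {p}: {significance_label(p)}")
```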


      The authors suggest that the interpretation of PLK4 over-expression studies are hampered by the possibility of centriole/centrosome independent PLK4 roles and that STIL overexpression circumvents some of these issues. Although orthologous approaches to problems are always desired, STIL itself has also been implicated in other cellular processes, such as the Sonic hedgehog pathway (Carr AL, 2014) and in cell motility (Liu Y, 2020). In addition, the data presented in the manuscript are suggestive of a STIL function in the mouse that is independent of centriole number. The authors demonstrate that the amount of centriole over-duplication in MEFs containing a single copy of the STIL over-expression locus is equivalent to that of MEFs carrying two copies. However, in most other assays, the homozygous lines display more severe phenotypes, suggesting that STIL might have a function outside centriole duplication. The authors need to discuss this further in a revised manuscript.

      As described in our response to major point 1 and minor point 3 of reviewer 1 we have now reanalyzed our data on centriole numbers in a time-dependent manner. As already shown in Figure 3L of the initial manuscript version, the number of both CMV-STIL+/- and CMV-STIL+/+ MEFs with supernumerary centrioles increases with passaging from passage 3 (p3) to p6. Also, in this experiment amplified centrioles were more frequent in CMV-STIL+/+ compared to CMV-STIL+/- MEFs in both passages (p3 and p6) analyzed. We have therefore now pooled the data and substituted the former Figure panel 1D by these combined results, which show that, similar to other models, also regarding STIL overexpression the homozygous line displays a more severe phenotype, which does therefore per se not argue for a STIL function outside the centrosome. However, as a few recent studies indeed suggest additional roles of STIL, we have amended the respective passages in the revised version of the manuscript accordingly.

      Why did the authors use the p53 R127H mutant instead of a p53 knockout or null allele system? The R127H mutant has a gain-of-function phenotype and cells expressing this mutant display different phenotypes than a p53 null. The primary conclusion in one of the references cited by the authors (Caulin C, 2007) is that p53R127H is a gain-of-function mutant and behaves distinct from loss-of-function p53 mutations, such as deletions using floxed alleles. Throughout the manuscript, the authors use terms that suggest the R127H allele is equivalent to a loss of function mutant. Given that supernumerary centriole growth arrest is universally suppressed by inactivation of p53 it is somewhat surprising that this pathway is not active in response to STIL over-expression. The authors should confirm this key conclusion by depleting p53 in MEFs using RNAi, or by using mice where complete inactivation of p53 can be achieved.

      We agree with the reviewer that the p53-R172H mutant version of p53 is not equivalent to a p53 knockout. We have therefore and as suggested by reviewer 3 as well (see also our response to point 3 of reviewer 3) corrected the wording and have substituted “absence of p53” by “interference with p53 function” where appropriate. In addition, we now have added data to the manuscript, which show that neither p53 expression nor p53-S18 phosphorylation becomes induced during prolonged cultivation and passaging of CMV-STIL transgenic MEFs (see Figure 3B of the revised manuscript). Importantly, this finding is in line with a recent report showing that PLK4-induced extra centrosomes may not rely on p53 for tumor suppression and cell death induction (Braun et al.: Extra centrosomes delay DNA damage-driven tumorigenesis. Sci. Adv. 10: eadk0564, 2024). Similarly, it has been recently shown that centrosome amplification increases apoptosis independently of p53 in PLK4-overexpressing cells treated with DNA-damaging agents (Edwards et al.: Centrosome amplification primes for apoptosis and favors the response to chemotherapy in ovarian cancer beyond multipolar divisions. bioRxiv 2023.07.28.550973, 2023). Therefore, these findings and references have now been added to results and discussion sections of the revised manuscript.

         A plethora of p53-related findings in mouse models, including the majority of results on PLK4-induced tumor formation in mice, is based on p53 knockouts, a situation that is only rarely found in human cancers. In contrast, the p53-R172H missense mutation in mice corresponds to the p53-R175H mutation in human tumors, which has the highest occurrence in diverse human cancer types among all p53 hotspot mutations, and results in a transcriptionally inactive protein that accumulates in cells, similar to the majority of naturally occurring versions of mutant p53 (Yao et al.: Protein-level mutant p53 reporters identify druggable rare precancerous clones in noncancerous tissues. Nat Cancer 4: 1176-1192, 2023; Chiang et al.: The function of mutant p53-R175H in cancer. Cancers 13: 4088, 2021). We therefore believe that it more faithfully recapitulates the situation in p53-mutant tumors than a p53 knockout.
      
         Although basically an important and valid experiment, depleting p53 in STIL-transgenic MEFs using RNAi is not easily done as (i) transfection of MEFs per se is difficult and (ii) STIL-overexpressing MEFs do only slowly proliferate and are prone to senescence and apoptosis (see Figure 3), all phenotypes which are even further exacerbated after transfection. Generation of STIL-transgenic mice with complete inactivation of p53 on the other hand is an extremely time-consuming endeavor that would lead to a significant delay of publication of our results. Given that currently similar data are published by other groups (Braun et al.: Extra centrosomes delay DNA damage-driven tumorigenesis. Sci. Adv. 10: eadk0564, 2024; Edwards et al.: Centrosome amplification primes for apoptosis and favors the response to chemotherapy in ovarian cancer beyond multipolar divisions. *bioRxiv* 2023.07.28.550973, 2023), we do not think that this would be appropriate.
      

      Minor Issues and details

      Figure 1

      1. Panel E. It is unclear what the authors are calling an 'aberrant mitosis'. Typically an aberrant mitosis refers to chromosomal abnormalities such as multipolar spindles, anaphase bridges or micronuclei (which they quantify in Figure 2).

      The aberrant mitotic figures presented in Figure 1E show a clustered metaphase with 4 centrosomes (2 per pole; 2 centrioles per centrosome) for CMV-STIL+/- MEFs and a clustered telophase with 2 centrosomes (1 per pole; 5 centrioles per centrosome) for CMV-STIL+/+ MEFs. This is now specified in detail in the legend to Figure 1E.

      Panel E. Please include images representing a normal mitosis from control cells derived from B6-STIL mice.

      As suggested, we have now included a representative image of a normal mitosis from B6-STIL control mice.

      Figure 2

      1. Panels B, E and F. Statistical significance is not indicated between B6-STIL and CMV-STIL+/- or CMV-STIL+/- and CMV-STIL+/+. The authors indicated a 'graded' phenotype which is qualitatively apparent, but should be backed by statistical analysis.

      We have now included a statistical analysis. However, and as already described in our answer to major issue 1 of this reviewer, the reported p-values should be interpreted as descriptive rather than confirmatory values due to the limited number of independent experiments.

      Can the authors indicate how they scored a tetraploid cell? Some of the cells are 100% tetraploid while others contain other aberrations.

      According to the 2020 version of the International System for Human Cytogenomic Nomenclature (ISCN), polyploidy is defined by the modal number of chromosomes in the karyotype. A count of 81-103 chromosomes is called near-tetraploid, within which hypotetraploidy (81-91 chromosomes) is distinguished from hypertetraploidy (93-103 chromosomes) (An International System for Human Cytogenomic Nomenclature, Karger (2020), Eds.: McGowan-Jordan, Hastings, Moore). For mouse karyotypes, the respective numbers were recalculated on the basis of a diploid chromosome content of 40 instead of 46 chromosomes. To be strictly in accordance with this nomenclature, we have replaced the term "tetraploid" with "near-tetraploid".
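      The recalculation for mouse karyotypes can be sketched as follows. This is only an illustration under a stated assumption: the near-tetraploid window is taken to be the exact tetraploid count plus or minus half the haploid number (rounded down), a rule that reproduces the human figures quoted above. The response does not give the resulting mouse ranges, so the mouse output is an assumption of this sketch, and the function name is hypothetical.

```python
def near_tetraploid_ranges(haploid_n):
    """Return near-, hypo- and hypertetraploid chromosome ranges.

    Assumption (not stated in the response): the window is
    4n +/- floor(n / 2), which matches the human numbers 81-103.
    """
    tetraploid = 4 * haploid_n
    half_width = haploid_n // 2
    low, high = tetraploid - half_width, tetraploid + half_width
    return {
        "near_tetraploid": (low, high),
        "hypotetraploid": (low, tetraploid - 1),
        "hypertetraploid": (tetraploid + 1, high),
    }


human = near_tetraploid_ranges(23)  # diploid content of 46
mouse = near_tetraploid_ranges(20)  # diploid content of 40
print("human:", human)
print("mouse (assumed rule):", mouse)
```

      The human call serves as a sanity check against the quoted ISCN numbers; only the mouse result depends on the assumed rule.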

      Is the height of the rows in Panel D significant? What are the solid black rows?

      We thank the reviewer for this comment/observation. We have now increased the resolution of this part of the figure. Unfortunately, the resolution had deteriorated so much when the pdf file was created that individual lines were no longer recognizable. The height of the lines should be identical, as single lines correspond to the karyotypes of each metaphase cell analyzed, while chromosomes are plotted as columns. The solid black lines separate independently established MEF lines with the indicated STIL genotypes from each other. At least 20 metaphase cells per MEF line were analyzed. We have now explained these points in the figure legend.

      Figure 3

      1. Panels C, F, G, and K require statistical analyses.

      We have now included the appropriate statistical analyses in the figure panels and/or legends. However, the reported p-values should be interpreted as descriptive rather than confirmatory values due to the limited number of independent experiments.

      Panel D should be quantified.

      We have now included a quantification of the protein bands in panels B, E (former panel D), and K of the revised manuscript and explained the quantification procedure in detail in the methods section.

      Panel E. mRNA expression is quantified in RPKM here, while GeTMM is used in Figures 3I and Supplementary Figures S2 and S6. Is there a reason this panel uses a different method? RPKM can be used for intra-sample comparisons, but is not ideal for comparison among different samples.

      We now uniformly quantify mRNA expression in GeTMM in all figures of the revised manuscript version as requested.
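      The reviewer's point about RPKM can be illustrated with a small sketch. It uses the standard RPKM definition (reads per kilobase of transcript per million mapped reads) on made-up counts for two hypothetical genes; none of these numbers come from the manuscript. It does not implement GeTMM, which additionally applies a between-sample normalization step; it only shows that RPKM totals differ from sample to sample, which is why values are hard to compare across samples.

```python
def rpkm(counts, lengths_bp):
    """RPKM per gene: reads * 1e9 / (gene length in bp * total mapped reads)."""
    total_reads = sum(counts.values())
    return {
        gene: reads * 1e9 / (lengths_bp[gene] * total_reads)
        for gene, reads in counts.items()
    }


# Hypothetical gene lengths and read counts, for illustration only.
lengths_bp = {"gene1": 1000, "gene2": 2000}
sample_a = {"gene1": 100, "gene2": 200}
sample_b = {"gene1": 100, "gene2": 800}

rpkm_a = rpkm(sample_a, lengths_bp)
rpkm_b = rpkm(sample_b, lengths_bp)

# The per-sample totals are not equal, so an RPKM value does not
# represent the same share of the library in both samples.
print(sum(rpkm_a.values()), sum(rpkm_b.values()))
```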

      Panel G. Can the authors show the original FACS profiles in Supplementary material?

      As requested, we have now included representative examples of original FACS profiles from the cell cycle analyses into Supplemental Figure S5.

      Panel H. Requires molecular weight markers

      Molecular weight markers for the DNA ladder (L) with the corresponding bp size have now been included into the Figure panel (formerly 3H, 3I in the revised version of the manuscript).

      Panel J. Missing B6-STIL control. Quantify Western blots.

      We have now included an immunoblot showing STIL protein expression levels in passage p1-p5 of B6-STIL control MEFs as well as a quantification of the protein bands into the Figure panel (formerly 3J, 3K in the revised version of the manuscript). The quantification procedure has been explained in detail in the methods section of the revised manuscript version.

      Figure 4

      1. The authors mention 'Simultaneously, we found an increased frequency of pups that died around birth.' Can the data for this be included?

      After mating B6-STIL transgenic animals with CMV-CRE mice and further breeding of successive generations, we obtained a total of 198 pups over four generations, of which 162 were born alive: 116 B6-STIL wildtype animals, 27 CMV-STIL+/- and 19 CMV-STIL+/+ mice. We have now added these numbers to the figure legend. Stillbirths increased over the generations: while in the first generation after mating B6-STIL animals with CMV-CRE mice all pups (B6-STIL wildtype animals and STIL heterozygotes) were born alive, in the fourth generation (from mating CMV-STIL transgenic mice with each other) 54% of the pups were stillborn. We have now included this observation into the main text to further emphasize the impact of STIL overexpression on perinatal lethality.
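      The pup counts quoted above can be cross-checked with a few lines of arithmetic. This sketch only uses the totals stated in the response (198 pups, 162 born alive, split by genotype); the variable names are illustrative, and the per-generation breakdown is not given in the response, so it is not modelled here.

```python
total_pups = 198
born_alive = {
    "B6-STIL wildtype": 116,
    "CMV-STIL+/-": 27,
    "CMV-STIL+/+": 19,
}

live_total = sum(born_alive.values())
not_born_alive = total_pups - live_total


def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


# Genotype distribution among live-born pups.
live_fractions = {g: percent(n, live_total) for g, n in born_alive.items()}

print(f"born alive: {live_total} of {total_pups}")
print(f"not born alive: {not_born_alive} "
      f"({percent(not_born_alive, total_pups):.1f}% of all pups)")
for genotype, share in live_fractions.items():
    print(f"{genotype}: {share:.1f}% of live-born pups")
```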

      Panels B and D. Please include the data for CMV-STIL+/-.

      We now have included a representative H&E-stained histological section of a CMV-STIL+/- mouse brain into Figure panel 4D as suggested by the reviewer. For space reasons we have not added an extra image of a CMV-STIL+/- total brain into Figure panel 4B, as this does not add novel information.

      Panels C, F and K require statistics.

      As requested, we have now included the appropriate statistical analysis in the figure panels and/or legends. However, the reported p-values should be interpreted as descriptive rather than confirmatory values due to the limited number of independent experiments.

      Panel F. Include statistical analysis.

      We have now included the appropriate statistical analysis in the figure panels and/or legends. However, the reported p-values should be interpreted as descriptive rather than confirmatory values due to the limited number of independent experiments.

      Panel G/H. The levels of STIL in the CMV-STIL+/+ spleen are higher than the other samples, yet there is no concomitant increase in centriole overduplication. Can the authors comment on this?

      Interestingly, we indeed found a higher STIL protein expression level in spleen tissue from CMV-STIL+/+ as compared to B6-STIL control and CMV-STIL+/- mice. Nevertheless, the amount of splenocytes with supernumerary centrioles was only marginally increased in these animals. A similar finding has recently been described for B lymphocytes with upregulated PLK4 expression after PLK4 transgene induction by exposure to doxycycline in vivo (Braun et al.: Extra centrosomes delay DNA damage-driven tumorigenesis. Sci. Adv. 10: eadk0564, 2024). Here, the lack of B cells with supernumerary centrioles despite increased PLK4 levels was explained by increased apoptosis and thereby selection against and rapid loss of PLK4-overexpressing cells. In line, we show that CMV-STIL+/+ MEFs have increased rates of senescence and apoptosis (Fig. 4).

      Panel J. The font within the plots is difficult to read.

      We thank the reviewer for this comment/observation. We have now increased the resolution of this figure panel, and the font is now outside of the plots.

      Figure 5

      [...] the reported p-values should be interpreted as descriptive rather than confirmatory values due to the limited number of independent experiments. No further statistical analysis can be done for panel D as in some cases (lymph node from B6-STIL mouse, lymphoma from CMV-STIL+/+ mouse) only one measurement exists.

      Panel F. The legend indicates that these data are from spleens and lymphomas. Is this correct? Would the results from non-lymphoma cells in the spleen mask the results from lymphoma cells?

      We apologize for this mistake and have corrected the legend to Figure panel 5F, which now reads: “Percentage of Ki67-positive cells in two B6-STIL, two CMV-STIL+/- and one CMV-STIL+/+ lymphoma. For comparison, frequencies of Ki67-positive cells in healthy lymph nodes from B6-STIL mice are displayed. Data are means ± SEM from at least two independent immunostainings per lymphoma or healthy lymph node. P-values were calculated using the one-way ANOVA with post-hoc Tukey test for multiple comparison. For space reasons, only statistically significant differences are displayed”.

      Panel F. The authors indicate that 'In line, assessment of lymphomas from B6-STIL control, CMV-STIL+/- and CMV-STIL+/+ mice by Ki67 immunostaining revealed that, corresponding to STIL protein levels, proliferation rates were elevated independent from lymphoma genotypes'. However, levels of Ki67, the marker for proliferation, actually decreased in these samples, indicating fewer proliferative cells. This needs to be clarified since the data shown appear to show the opposite of what is stated in the manuscript.

      As noticed by the reviewer further below, differences in the percentages of Ki67-positive, proliferating cells between lymphomas from B6-STIL, CMV-STIL+/- and CMV-STIL+/+ mice were statistically not significant. However, we have now for comparison added the results of Ki67 immunostaining of healthy lymph node tissue to Figure panel 5F, which show increased proliferation of lymphoma compared to normal lymph node cells. Also, a panel with images illustrating Ki67 labelling in healthy lymph node and lymphomas from different genotypes has been added to the figure (panel 5G). These data reveal that, independent from the genotype, proliferation rates of lymphoma cells are increased as compared to healthy lymph nodes, thereby further corroborating our assumption that STIL protein levels in lymphomas are increased as a consequence of their increased proliferation and independent from STIL transgene expression.

      Corresponding to point 3 above, the authors suggest that 'STIL protein expression is a consequence of increased lymphoma cell proliferation.' This hypothesis cannot explain STIL protein levels if proliferation has actually decreased.

      Please see our response to point 3 above.

      Corresponding to point 3 and 4 above, the actual data is marked as non-significant indicating there is actually no proliferative difference among the samples.

      This is correct. See also our comments to point 3 and 4 above.

      Panel 5I. The authors state that 'On the other hand, overall levels of chromosomal copy number aberrations were higher in lymphomas (mean gains + losses: 225.2 ± 173.7 Mb) as compared to healthy tissues (mean gains + losses: 87.3 ± 127.5 Mb; p=0.06), irrespective of their STIL transgene status (Fig. 4J; Fig. 5I), although the difference did not quite reach statistical significance.' The authors need to soften this statement since statistically, the samples are not different. For example, 'On the other hand, overall levels of chromosomal copy number aberrations appeared to trend higher in lymphomas as compared to healthy tissues irrespective of their STIL transgene status, although the difference did not quite reach statistical significance.'

      The statement was rephrased according to the reviewer's suggestion.

      Figure 6

      1. Panels A, B, and C require statistical analysis.

      We have now included the appropriate statistical analyses into panels A, B, and C in the figure panels and/or legends. However, the reported p-values should be interpreted as descriptive rather than confirmatory values due to the limited number of independent experiments.

      The figure legend references to panels C and D appear to be swapped.

      We thank the reviewer for this comment/observation. We have corrected this mistake.

      Panel F. Indicate that the samples are not significantly different.

      We have now included the appropriate statistical analysis including the indication that the samples are not statistically significantly different.

      Corresponding to point 3, the authors indicate that 'the proportion of Ki67-positive cycling cells was lower in tamoxifen-treated... ... although the difference did not quite reach statistical significance.' The authors need to soften this statement to reflect that the samples are not statistically different (i.e. 'appeared lower' or similar).

      The statement was rephrased according to the reviewer's suggestion.

      Figure 6 and 7

      Do you have data for B6-STIL animals treated with and without tamoxifen? The experiments as shown demonstrate the differences between control and tamoxifen-treated animals of the same genotype, but it is unclear if any of these effects are due to the underlying genotypes or from tamoxifen itself.

      The experiments presented in Figures 6 and 7 have not been performed in B6-STIL control mice with and without tamoxifen treatment.

      Supplemental Figure 1

      1. Please include molecular weight marker for this and all panels showing PCR products.

      Molecular weight markers for the DNA ladder (L) with the corresponding bp size have now been included into all Figure panels showing PCR products as requested.

      The B6-STIL and CMV-STIL+/- lines should contain a larger MW band corresponding to the STIL-F and STIL-R PCR product. Please show if possible.

      We thank the reviewer for this important remark. We agree that there should be a large PCR product band at around 3000 bp containing the bacterial neomycin phosphotransferase gene (TK-neo-pA) and the STOP cassette in the B6-STIL control mice/MEFs, and two PCR product bands (large: 3000 bp, small: 410 bp) in the heterozygous CMV-STIL+/- mice/MEFs. When we began genotyping, we did indeed observe both bands depending on the STIL background (see figure below). However, the band intensity of the larger PCR product was relatively weak (arrowheads) compared to the smaller PCR product, and its visibility was dependent on genomic DNA input and PCR efficiency. During the PCR optimization process, the PCR conditions were changed in such a way that the yield of the small band was increased despite small input amounts of genomic DNA, but at the expense of the large PCR product band (arrows). At the end of the optimization process the larger PCR product had almost disappeared, making the discrimination between heterozygous CMV-STIL+/- and homozygous CMV-STIL-/- DNA difficult. Therefore, we decided to additionally check for STOP cassette excision in a second, parallel PCR approach. In the genotyping results shown in Supplemental Figure S1B, which were produced after PCR optimization, the larger STIL PCR product band was no longer visible.

      Supplemental Figure 6

      1. The 'Spleen' sample is missing the B6-STIL control data. 'Liver' is missing CMV-STIL+/+. Please include or indicate why they are missing. The plot order of the samples differs for 'Liver' (red, black) compared to the others (black, red, blue). Indicate statistical significances.

      We apologize for this mistake, have corrected the Figure (formerly Supplemental Figure S6, S2 in the revised version of the manuscript), and have included the missing spleen and liver samples.

      General issues

      1. The materials and methods indicate that HPRT and PIPB were used as reference genes, but only HPRT is referred to in the qPCR figure legend.

      We thank the reviewer for this comment/observation. As generally recommended (Vandesompele et al., Genome Biol 3(7): research0034.1-research0034.11, 2002; Kozera and Rapacz, J Appl Genet 54(4): 391-406, 2013) we used both reference genes for accurate normalization of qPCR in all experiments. We have now corrected this mistake in the figure legend.

      Figure panels 1F and 3C display 95% confidence intervals while others use SEM. Is there a reason for this?

      In the two referenced figures (former Figure 1F has been deleted from the manuscript, see also our comment to point 1 of reviewer #1 for reasons; Figure 3C of the former manuscript is now Figure 3D in the revised manuscript version) the endpoint variable was defined by whether individual cells in a single experiment showed a certain property or not (binary variables). By definition, these kinds of variables show a nonsymmetric error structure, which cannot be expressed properly by a single value such as the standard error (SEM), but can be covered correctly by a confidence interval. For the same reason, Fisher’s exact tests were employed to obtain p-values in these situations. In the other figures, the relevant endpoint variables were roughly normally distributed, either directly, or due to them being an average of many values. In this case, a symmetric SEM was thus considered sufficient, and t-tests were used for p-values. To make this clear in the figures, we used different display options to distinguish between error bars showing SEM or 95% CI.
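      The distinction drawn above can be made concrete with a short sketch for binary endpoints. The response does not say which confidence-interval method was used, so the Wilson score interval below is an assumption chosen for illustration, and the cell counts are made up. The sketch shows that the interval around a proportion is not symmetric (so a single plus-or-minus value cannot describe it) and how a two-sided Fisher's exact test on a 2x2 table can be computed from the hypergeometric distribution.

```python
from math import comb, sqrt


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables (with the same margins) that
    are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical example: 5 of 50 cells show the property of interest.
low, high = wilson_interval(5, 50)
print(f"proportion 0.10, interval {low:.3f} to {high:.3f}")

# Hypothetical 2x2 table: rows are genotypes, columns are with/without property.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"Fisher's exact p = {p_value:.4f}")
```

      For roughly normally distributed endpoints, a symmetric SEM and a t-test remain the simpler choice, as the response states.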

      Reviewer #2 (Significance (Required)):

      In this manuscript, Moussa et al. describe the effects of over-expressing the centriole duplication factor STIL in whole mice and with expression restricted to the skin. They find that over expression of STIL, similar to that of PLK4, induces centriole overduplication, abnormal mitoses, and genetic instability leading to cell arrest. Additionally, over-expressing STIL results in microcephaly, perinatal lethality and a shortened lifespan. In addition, they do not find that expression of the p53 R127H mutant alleviates the cell growth defect. Moreover, overexpression of STIL does not lead to increased general tumour formation and suppresses tumour formation in an induced skin tumour model. Although this is an interesting manuscript, the authors need to address a number of issues before this manuscript can be recommended for publication. Importantly, the manuscript lacks statistical analyses to support some of their conclusions, some figures should be quantified, and controls are missing in some cases.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Previously it has been proposed that supernumerary centrioles play important deleterious effects in vivo including increased tumorigenesis. However, the work was inconclusive because the way of inducing centriole amplification via the PLK4 kinase could have induced other effects besides supernumerary centrioles. To resolve this question, the authors generated a mouse model of centrosome amplification, in which the structural centriole protein STIL is overexpressed. Using this mouse model in vivo along with mutant mouse embryonic feeder (MEF) lines in vivo, the authors test out the role of centrosome amplification in vivo in animal development, lifespan, and tumorigenesis. They report both embryonic lethality, defects in brain development, and shortened life span in these mice. They also find that skin tumorigenesis is reduced in the mutant mice, and demonstrates that the STIL overexpression effects are not perturbed in a dominant negative p53 model. The authors demonstrate that STIL overexpression causes centrosome amplification accompanied by aneuploidy, which however is highly deleterious for cell fitness even in the absence of p53. Clearly, tissue corrective mechanisms lead to the elimination of cells with extra centrosomes and/or aneuploidy by impaired proliferation, senescence, and apoptosis. This finding is interesting and significant and seems worthy of dissemination to the broader readership.

      This study is thorough and well executed and there is a significant body of work that leads to solid conclusions. The data is convincing, and the figures are well presented. It was refreshing to read this paper, as it was not so cluttered with data that the message gets murky, yet the data was clearly very substantial. The text is clear and easy to follow.


      There really are only minor aspects of this paper that need correction, in my opinion. The text should be thoroughly checked for typos, few extra redundant words here and there, and a couple of confusing sentences.

      As suggested by the reviewer we have rechecked the manuscript for typos, redundancies, and confusing sentences and corrected where necessary and appropriate.

      For example, the last sentence in abstract is confusing 'These results suggest that supernumerary centrosomes... [result in]... tumor formation' because it should read 'reduced tumor formation' or 'impairs tumorigenesis' or otherwise be written more clearly because it seems to convey the opposite message the way it is right now.

      We thank the reviewer for this comment and have corrected the sentence, which now reads: “These results suggest that supernumerary centrosomes impair proliferation in vitro as well as in vivo, resulting in reduced lifespan and delayed spontaneous as well as carcinogen-induced tumor formation”.

      The p53 dominant negative mutant is not exactly a KO so it is not fair to say "in the absence of p53"; the verbiage should be corrected and checked throughout the paper - perhaps 'interfering with p53 normal function' is more appropriate.

      As suggested by the reviewer we have corrected the wording and have substituted “absence of p53” by “interference with p53 function” where appropriate.

      The sentence "Senescence- and apoptosis-driven depletion of the stem cell pool may explain reduced life span and tumor formation in STIL transgenic mice." from discussion is highly speculative and should be edited to clearly convey its speculative nature or removed entirely.

      We agree with the reviewer and have deleted the sentence from the discussion section of the manuscript.

      Reviewer #3 (Significance (Required)):

      Clearly, tissue corrective mechanisms lead to the elimination of cells with extra centrosomes and/or aneuploidy by impaired proliferation, senescence, and apoptosis. This finding is interesting and significant and seems worthy of dissemination to the scientific community. It adds to previous work on another centriole related protein PLK4 kinase that led to very different conclusions.

    1. Many designers also rely on their own experiences to inform the work they do.

      I chose this section because it gives another perspective on certain design choices. As we are all aware, design is relative, and many people have their own personal preferences for what counts as a 'good' or a 'bad' design. In relation to designing for equity and inclusion, I chose this text because I find it interesting how different cultures may have different perspectives, opinions, and solutions in their designs. For example, as a freelance graphic designer, I often rely on my own experience and my exposure to digital media. If a client asked me to design their packaging however I liked, I would base it on my knowledge, my opinion, and my belief about what makes a good design. However, if the client had a more specific request for the packaging, I would cater the design to what they want. Sometimes their preferences come across as questionable or unflattering in my opinion, but at the end of the day I remind myself that these people are requesting designs according to their own experience and their culture's exposure, which influence their preferences in design. This is why this text is very impactful to me: it encapsulates a designer's entire relationship with a client in one sentence. Soft skills such as open-mindedness, flexibility, and willingness to compromise on the clients' needs are also important in becoming a good designer; technical skills are not all that matters. As we do this, we indirectly include people from different backgrounds, experiences, and abilities in the design process in order to create a more effective and relevant end product.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We sincerely thank the Referees for providing important and constructive comments. We have addressed their concerns point-by-point as described below.

      Associated to Reviewer#1's comments

      *- Diploid embryos are used as controls. Gynogenetic diploids seem to be better controls to ensure that the observed phenotypes are not related to loss of heterozygosity. To limit the amount of work, the use of gynogenetic diploids could be restricted to spindle polarity and centrosome number experiments. *

      Response 1-1

      [Experimental plan] Following the reviewer's suggestion, we will conduct immunostaining of α-tubulin and centrin (for visualizing the spindles and centrioles, respectively) in gynogenetic diploids that will be generated by applying heat shock to gynogenetic haploid embryos during the 1st - 2nd cleavage stage. We will observe the head area of gynogenetic diploid larvae at 3 dpf, when the haploid counterparts suffer particularly drastic centrosome loss and spindle monopolarization.

      *- As the authors discuss, it would be necessary to rescue centrosome loss to establish a causal relationship between centrosome loss and haploid viability. I certainly acknowledge that this is difficult (if not impossible), but it currently limits the significance of the results. *

      Response 1-2

      We agree that rescuing centrosome loss would provide an important advancement in understanding the cause of haploid syndrome in the context of our study. However, as the reviewer also pointed out in the above comment, this poses a significant technical challenge. As described in Discussion in the original manuscript, we have attempted to restore normal centrosome number through cell cycle modulations. However, we have not found a condition that rescues centrosome loss without damaging larval viability. As an alternative approach, we have also tried to induce centriole amplification by injecting mRNA encoding plk4, an essential centriole duplication inducer. However, this caused earlier embryonic death, precluding us from observing its effects on larval morphology after 1 dpf. The main challenge is that any treatment to increase centrosome number can cause centrosome overduplication, which is as deleterious to development as centrosome loss. Efforts to identify a key factor enabling the rescue of centrosome loss in haploid larvae are underway in our laboratory, which requires new explorations over several years and is beyond the scope of the present study. Reflecting on the reviewer's comment, we added a new sentence explaining the situation on this issue (line 395, page 19). To further discuss possible contributions of centrosome loss and mitotic defects to haploidy-linked embryonic defects, we also added a citation of a previous study reporting that depletion of centrosomal proteins caused mitotic defects leading to embryonic defects similar to those observed in haploid embryos in zebrafish (Novorol et al., 2013 Open Biology; line 380, page 19).

      __[Experimental plan]__ Meanwhile, as a new trial to induce centriole amplification in a scalable and temporally controllable manner, we plan the following experiment, which can be conducted within the time range of the revision schedule: We will investigate the effects of low-dose treatment with the plk4 inhibitor centrinone B on tissue growth and viability of haploid larvae. A recent study reported that centrinone B had complicated effects on the centriole duplication process, which is highly dose-sensitive (Tkach et al., 2022 Elife, PMID: 35758262). While it blocks centriole duplication at concentrations sufficiently high to block plk4 activity, it paradoxically causes centriole amplification at suboptimal concentrations, presumably through over-stabilizing plk4 by blocking its autophosphorylation-dependent degradation (while its centriole-duplicating function remains active). Since a previous study showed that centrinone B is also effective in zebrafish embryos (Rathbun et al., 2020 Current Biology, PMID: 32916112), we will try to find an optimal centrinone B treatment condition that potentially restores tissue growth or viability of haploid embryos. If we find such a rescuing condition, we will address the principle of the rescuing effects by investigating whether mitotic cells in these haploid larvae possess centrioles.

      *- Some experiments are not, or arguably, quantified/statistically analyzed. *

      *o Figure 2, Active caspase level. Larvae are sorted into three categories, and no statistical test is performed on the obtained contingency table. A Fisher's exact test could be performed here or, much better, the active caspase-3 levels should be quantified instead of sorting larvae into categories.*

      Response 1-3

      We apologize that we showed only "zoomed-out" images of the immunostained embryos in the original figures (Fig. 2A), which precluded a clear presentation of the haploidy-associated aggravation of apoptosis and mitotic arrest. We could clearly distinguish cleaved caspase-3- and pH3-positive cells from non-specific background staining with an enlarged view of the same immunostaining data. Therefore, to quantitatively evaluate the extent of the haploidy-linked apoptosis and mitotic arrest, we compared the density of these cells within the right midbrain. This new quantification demonstrated a statistically significant increase in cleaved caspase-3- or pH3-positive cells in haploids compared to diploids.

      In the revised manuscript, we added the enlarged views of cleaved caspase-3 and pH3 immunostaining (Fig. 2B) and new quantifications with statistical analyses (Fig. 2C). Accompanying these revisions, we omitted the categorization of the severity of apoptosis, which was pointed out to be subjective in Reviewer #2's comment (see Response 2-3). We rewrote the corresponding section of the manuscript to explain the new quantitative analyses (line 143, page 7).

      o Same comment for 3E-F. Larvae are scored as Scarce, Mild or Severe. Looking at Fig S3A, I see one mild p53MO embryo, but the two others are not that different from 'severe' cases, which would completely change the contingency table. Again, a proper quantification would be better.

      Response 1-4

      We also quantified the frequency of cleaved caspase-3-positive cells in control and p53MO larvae (original Fig. 3E and F) as described in Response 1-3. While conducting the cell counting with enlarged images, we realized that staining quality within the inner larval layers of morphants was relatively poor in these experiments. This problem precluded us from counting cleaved caspase-3-positive cells within the inner larval layers. Therefore, we tentatively quantified only the surface larval layers of these morphants and found that cleaved caspase-3-positive cells were significantly reduced in haploids upon depletion of p53. We currently show this quantification in Fig. 3G of the revised manuscript. While this quantification confirmed the trend of p53MO-dependent decrease in apoptosis, we think it more appropriate to newly conduct the same experiment with better quality of the staining to apply the same standard of quantification for Fig. 3 as Fig. 2.


      __[Experimental plan]__ For the reason described above, we propose to re-conduct immunostaining of cleaved caspase-3 in control and p53MO-injected haploid larvae to improve the visibility of the inner layer of the larvae for better quality of the quantitation.

      Meanwhile, we revised Fig. 3 by adding an enlarged view of immunostaining in Fig. 3F and omitting the subjective categorization shown in the original Fig. 3F and S3A. We plan to replace these data with new images and quantification to be obtained during the next revision. We also rewrote the main text to update these changes (line 166, page 8).

      *o Figure 4D-E, no stats. *

      Response 1-5

      We conducted the ANOVA followed by the post-hoc Tukey test for new Fig. 4D and the Fisher exact test with Benjamini-Hochberg multiple testing correction for new Fig. 4E. Please note that statistical analyses were conducted after adding the data from original Fig. 6B-C following the reviewer's suggestion (see also Response 1-6).
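
      As an illustration of the kind of categorical analysis named above (pairwise Fisher exact tests followed by Benjamini-Hochberg correction), a minimal sketch is given below. It assumes 2x2 tables of counts; the counts, condition labels, and function names are hypothetical and are not taken from the manuscript, and the software actually used by the authors is not stated here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: (defective, normal) cells in each of two conditions.
comparisons = {
    "diploid vs haploid": (2, 18, 9, 11),
    "haploid vs haploid + reversine": (9, 11, 4, 16),
    "diploid vs haploid + reversine": (2, 18, 4, 16),
}
raw = [fisher_exact_two_sided(*counts) for counts in comparisons.values()]
for name, p, q in zip(comparisons, raw, benjamini_hochberg(raw)):
    print(f"{name}: p = {p:.4f}, BH-adjusted p = {q:.4f}")
```

      The two-sided p-value here sums the probabilities of all tables that are no more likely than the observed one, which is one common convention; other definitions of the two-sided test exist and can give slightly different values.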

      *o Figure 6, Reversine treated haploid should be compared to haploid embryos (on the graphs and statistically). If no specific controls have been quantified for this experiment, data could be reused from previous figures, provided this is stated. *

      Response 1-6

      The live imaging data shown in original Fig. 4C-E and Fig. 6A-C were obtained within the same experimental series conducted in parallel at the same period under the same experimental condition. In the original manuscript, we separated them into two different figures according to the logical flow. However, following the reviewers' comments (see also Response 2-1), we realized it more appropriate to show them as a single figure panel as in the original experimental design. Therefore, we moved the reversine-treated haploid data from the original Fig. 6A-C to Fig. 4C-E to facilitate direct comparison among conditions with statistical analyses (see also Response 1-5).

      *o Rescue by p53MO and Reversine, it would be nice to also include diploid measurements on the graphs, so that the reader can appreciate the extent of the rescue. *

      Response 1-7

      Following the reviewer's comment, we added control MO-injected or DMSO-treated diploid larval data in the corresponding graphs in Fig. 3I and 6G, respectively. Please refer to Response 2-6 for further discussion on the extent of the rescue.

      Minor comments:

      *- Lines 221-223, authors claim that centriole loss and spindle monopolarization commence earlier in the eyes and brain than in skin. I am not sure I see this in Fig. S5. It could as well be that the defect is less pronounced in skin.*

      Response 1-8

      We rewrote the manuscript to include the possible interpretation suggested by the reviewer on the result (line 225, page 11).

      *- Lines 227-229, authors claim that 'The developmental stage when haploid larvae suffered the gradual aggravation of centrosome loss corresponded to the stage when larval cell size gradually decreased through successive cell divisions'. I did not get that. Doesn't cell size decrease since the first division? Fig 5D shows that cell size decreases all along development.*

      Response 1-9

      We agree that the original sentence implies, against our intention, that cell size does not decrease before the developmental stage mentioned here. To correct this problem, we rewrote the corresponding part of Discussion as below (line 230, page 11):

      "Since the first division, embryonic cell size continuously reduces through successive cell divisions during early development (Menon et al., 2020). Cell size reduction continued at the developmental stage when we observed the gradual aggravation of the centrosome loss in haploid larvae."

      *- Some correlations are used to draw conclusions: *

      *o Line 301-303. "The correlation between centrosome loss and spindle monopolarization indicates that haploid larval cells fail to form bipolar spindle because of the haploidy-linked centrosome loss." As stated by the authors, this is a correlation only. I agree it points in this direction.*

      Response 1-10

      We added a note to the corresponding sentence to draw readers' attention to the discussion on the limitation of the study with respect to the lack of centrosome rescue experiment (line 332, page 16).

      *o Line 305-308. "Interestingly, centrosome loss occurred almost exclusively in haploid cells whose size became smaller than a certain border (Fig. 5), indicating that cell size is a key determinant of centrosome number homeostasis in the haploid state." This one is more problematic. There is no causal link established between cell size and centrosome number homeostasis. It could very well be that some unidentified problem induces both a reduction in cell size and the loss of centrioles.*

      Response 1-11

      To avoid an over-speculative description, we deleted the clause "indicating that cell size is a key determinant of centrosome number homeostasis in the haploid state." (line 336, page 17). We also added a new sentence, "Alternatively, it is also possible that other primary causes, such as the lack of second active allele producing sufficient protein pools induced cell size reduction and centrosome loss in parallel without causality between them." to discuss the possibility raised by the reviewer (line 348, page 17), in association with another comment from Reviewer #3 (see also Response 3-3).

      *I have concerns regarding the significance of the reported findings. Haploid zebrafish embryos show numerous developmental defects (some as early as gastrulation, as previously shown by the authors, Menon 2020), and they die by 4 dpf. That they experience massive apoptosis at day 3 does not seem very surprising, and that inhibiting p53 transiently improves the phenotype is not a big surprise. *

      Response 1-12

      Many reports have revealed tissue-level developmental abnormalities in haploid embryos since the discovery of haploid lethality in vertebrates more than 100 years ago. This has stimulated speculation of underlying causes of haploid intolerance for decades. However, there have been surprisingly few descriptions of cellular abnormalities underlying these tissue defects, precluding an evidence-based understanding of the principle that limits developmental ability in haploid embryos. Our findings of the haploidy-linked p53 upregulation and mitotic defects illustrate what happens in the dying haploid embryos at a cellular level. These findings would provide an evidence-based frame of reference for understanding why vertebrates cannot develop in the haploid state and also provide clues to controlling haploidy-linked embryonic defects in future studies. We added a new section in Discussion to discuss the importance of addressing the haploidy-linked defects at a cellular level (line 276, page 14).

      *This reminds me of the non-specific effects of morpholino injection, which can be partially rescued by knocking down p53. *

      Response 1-13

      We believe the reviewer refers to the previous findings that different morpholinos generally have off-target effects activating p53-mediated apoptosis (e.g., Robu et al., 2007 PLoS Genet, PMID:17530925). However, p53 upregulation and apoptosis aggravation were also observed in uninjected haploid embryos free from morpholinos' artificial effects (Fig. 2, Fig. 3A, and B). To further address this issue, we plan to compare the frequency of cleaved caspase-3-positive cells between uninjected and control MO-injected haploids after revising the immunostaining of morphants in the original Fig. 3E-F (see Response 1-4 for details).

      *The observation of mitotic arrest and mitotic defects and the observation that haploid cells often lack a centrosome is interesting. However, I felt that the manuscript suggested that these observations were novel and could explain the haploid syndrome specifically in non-mammalian embryos, when the authors reported the same observations in human haploid cells as well as in mouse haploid embryos (Yaguchi 2018). To me, this manuscript mainly confirms that their previous observation is not mammalian specific, but at least conserved in vertebrates. *

      Response 1-14

      As we originally wrote (line 341, page 17 in the original manuscript), we think these haploidy-linked cellular defects are conserved among mammalian and non-mammalian vertebrates. To improve the clarity of our interpretation, we rewrote a corresponding part of the manuscript (line 50, page 2).

      *While I am no expert at centrosome duplication, I find the observation that haploidy leads to centrosome loss very intriguing, but have the impression that this manuscript falls short of improving our understanding of this phenomenon. *

      Response 1-15

      We express our gratitude to the reviewer for being interested in our findings. We hope the revisions made in the manuscript and the new results provided by the planned experiments will strengthen the contribution of this study to our understanding of haploidy-linked cellular defects.

      Associated to Reviewer#2's comments

      - Lack of proper controls in many experiments. For example, in the experiments where the authors treated haploids with reversine to suppress the SAC, there was no no-treatment control (Fig. 6A-C).

      Response 2-1

      We addressed the same point in __Response 1-6__. In the original manuscript, we separately presented control and experimental conditions in the same experiment series in Fig. 4 and Fig. 6. We rejoined them in Fig. 4 as in the original experimental design. Please refer to __Response 1-6__ for further details.

      - In Fig. 6D, when a DMSO control was included, the control fish were from 3 dpf while the reversine-treated fish were from 0.5-3 dpf. This is a big flaw in experimental design, especially considering the authors were looking at mitotic index, which is hugely impacted by developmental time.

      Response 2-2

      In this experiment, we treated haploid larvae with either DMSO or reversine from 0.5 to 3 dpf, isolated cells from the larvae at 3 dpf, and subjected them to flow cytometry. Both DMSO- and reversine-treated larval cells were from 3-dpf larvae. Therefore, this experiment does not have the problem noted by the reviewer. To improve the clarity of the description of the experimental design, we rewrote the corresponding part of the figure legend (line 646, page 34).

      - Subjective and inadequate data quantification. In the immunostaining experiments to detect caspase-3 and pH3, the authors either did not quantify at all and only showed single micrographs that might or might not be representative (for pH3), or only did very subjective and unconvincing quantification (for caspase-3). Objective measurements of fluorescence intensity could have been done, but the authors instead chose to categorize the staining into arbitrary categories with unclear standards. In example images they showed in the supplementary data, it is not obvious at all why some of the samples were classified as "mild" and others as "severe" when their staining did not appear to be very different.

      Response 2-3

      We apologize that we showed only "zoomed-out" images of the immunostained embryos in the original figures (Fig. 2A, 3E, and 6F), in which the distribution of individual cleaved caspase-3- or pH3-positive cells could not be clearly recognized. We added the enlarged view of identical immunostaining where these cells were clearly visualized in a countable manner (Fig. 2B, 3F, and 6D). Following the reviewer's suggestion, we newly conducted quantification by comparing the density of these cells within the right midbrain in haploids and diploids.

      This new quantification demonstrated the haploidy-linked increase in cleaved caspase-3- or pH3-positive cells and a reversine-dependent decrease in pH3-positive cells. We added these new quantifications with statistical analyses to the revised manuscript (Fig. 2C and 6E). Accompanying these revisions, we omitted the categorization of the severity of apoptosis, which was pointed out to be subjective. We rewrote the corresponding section of the manuscript to explain the new quantitative analyses (line 143, page 7; line 260, page 12).

      While we also quantified cleaved caspase-3-positive cells in control and p53MO larvae in the original Fig. 3E, we realized that the staining quality of the inner larval layers of these morphants was relatively poor, so we could not apply the same standard of quantification as for Fig. 2. Though we confirmed a statistically significant reduction in cleaved caspase-3-positive cells upon p53 depletion by quantifying a limited number of confocal sections (shown in Fig. 3G; please see also Response 1-4 for details), we decided to re-conduct this experiment to improve the staining quality and apply the same criteria of quantification to Fig. 3 as to Fig. 2 (the experimental plan is provided in Response 1-4).

      Please note that we also tried to evaluate the extent of apoptosis and mitotic arrest based on the fluorescence intensity of organ areas. However, background staining outside the dead cell area precluded the precise quantification.

      Additionally, the authors claimed that "clusters of apoptotic cells" were only present in haploids but not diploids or p53 MO haploids, but they did not show any quantification. From the few example images (Fig. S3A), apoptotic clusters can be seen in p53 MO treated fish. Also, in some cases, the clusters were visible only because those fish were mounted in an incorrect orientation. For example, in Fig. S3A, control #2, that fish was visualized from its side, thus exposing areas around its eye that contained such clusters. These areas are not visible in other images where the fish were visualized from the top.

      Response 2-4

      We agree that the definition of "apoptotic clusters" was ambiguous in the original manuscript. We also agree that the visuals of the clusters could be affected by sample conditions, making them less reliable criteria for judging the severity of apoptotic upregulation in larvae. Following the reviewer's suggestion, we newly conducted apoptotic cell counting (Response 2-3), which recapitulated more reliably ploidy- or condition-dependent changes in the extent of apoptosis. Therefore, we decided to omit the description of the clusters in the new version of the manuscript.

      *- Subpar data quality. Aside from issues with quantification, the IF data was not convincing as staining appeared to be inconsistent and uneven, with potential artefacts.*

      Response 2-5

      We apologize that the zoomed-out images in the original figures did not appropriately demonstrate the specific visualization of individual apoptotic or mitotic cells. As described in Response 2-3, we added enlarged views of the immunostaining to the revised manuscript, in which these individual cells are clearly distinguished from non-specific background staining (Fig. 2B, 3F, and 6D). Because of the poorer staining of inner layers of control and p53 morphants, we plan to re-conduct immunostaining for Fig. 3 and Fig. S3 (please refer to Response 1-4 for further detail). The current version of immunostaining and quantification in these figures will be replaced in the next revision.

      - Unsupported and overstated claims. There were many overstatements. For one, in line 268, the authors claimed that "the haploidy-linked mitotic stress with SAC activation is a primary constraint for organ growth in haploid larvae", while what they actually showed was that reversine treatment, which suppresses the SAC, partially rescued 2 out of the 3 growth defects they assessed, to such a small extent that the difference between haploid and haploid rescue was only

      Response 2-6

      Following the reviewer's comment, we added control MO-injected or DMSO-treated diploid larval data in the corresponding graphs in Fig. 3I and 6G, respectively. We newly estimated the relative extent of the recovery in Results (line 174, page 8; line 268, page 13).

      Reflecting the estimation, we rewrote the manuscript to discuss that haploidy-linked cell death or mitotic defects are a partial cause of organ growth retardation but that there could be other unaddressed cellular defects that also contribute to the growth retardation (line 305, page 15). We also discussed the possibility that incomplete resolution of cell death by p53MO or mitotic defects by reversine treatment may have limited their rescue effects on organ growth retardation (line 303, page 15). We also toned down several descriptions in our manuscript (lines 48 and 50, page 2; line 111, page 5; line 271, page 13; line 298, page 15; line 403, page 20) to achieve a more balanced interpretation on the potential contributions of cell death and mitotic defects to the formation of haploid syndrome.

      In association with this issue, we also discussed the difficulty of assuming a priori "fully-rescued" haploid larval size in this context. This is because even normally developing haploid larvae in haplodiplontic species tend to be much smaller than their diploid counterparts. We newly cited a few cases of haplodiplontic species where haploids are smaller than or the same in size as diploids (line 307, page 15).

      *With so many fundamental flaws, the data seem unreliable and the paper does not meet publishable standards. *

      Response 2-7

      We express our gratitude to the reviewer for providing important suggestions to improve the quality of analyses, data presentations, and interpretations in this study. We sincerely hope that one-by-one verifications of the points raised by the reviewer have improved the credibility of the paper and made it suitable for publication.

      *The low quality of the analysis makes the significance low. *

      *Reviewers have expertise in vertebrate embryogenesis and ploidy manipulation. *

      Response 2-8

      We hope that by addressing and solving the concerns pointed out by the reviewer, we could have clarified the significance of the study.

      Associated to Reviewer#3's comments

      *There seem to be a discrepancy between the microscopic images from Figure 2A and the quantification of pH3 positive cells using flow cytometry in Figure 4. According to the flow cytometric results the proportion of pH3 positive cells is about 3 times higher in haploid larvae compared to the control. The increase in mitotic cells in the imaging results however seems much more drastic. It would be helpful if the authors explain here. *

      Response 3-1

      Following comments provided by other reviewers (see also Response 1-2, 1-4, and __2-3__), we newly compared the frequency of pH3-positive cells between the immunostained haploid and diploid larvae. In this new analysis, pH3-positive cells were 6.4 times more frequent in haploids than in diploids, which is a more substantial difference than the one estimated based on the flow cytometric analysis.

      The apparent discrepancy between the immunostaining and flow cytometric quantification likely arises because pH3-positive mitotic cells tended to be more localized on the surface than in the inner region of larvae. This inevitably results in higher pH3-positive cell density in immunostaining, in which only the larval surface is analyzed. To address this point, we newly conducted pH3 immunostaining in haploid larvae made transparent using RapiClear reagent and showed a vertical section of a 3D-reconstructed larval image of pH3 immunostaining in Fig. S4E. We rewrote the manuscript to add our interpretation of this issue (line 652, page 34).

      *Mitotic slippage that the authors observe to be increased in the haploid larvae to up to 5% of cells should result in an increase in the number of aneuploid cells. I am wondering why this is not recapitulated in the analyses of the DNA content in Figure S1. *

      Response 3-2

      A possible interpretation would be that the limited viability of newly formed aneuploid progenies precluded the detection of these populations in flow cytometric analyses. We discussed the possible generation of aneuploid progenies with our interpretation of their absence in the flow cytometric analyses in Discussion (line 293, page 14).

      *Discussion: *

      *I find the explanation of centrosomal loss due to depletion of centrosomal protein pools in the cytoplasm during drastic cell reduction interesting. I wonder if the reduction in size is not necessarily caused by the reduction in cells, but rather the result of the absence of a second active allele that produces centrosomal proteins? *

      Response 3-3

      We added the possible interpretation provided by the reviewer to the corresponding part of Discussion, in association with another comment from reviewer #1 (line 348, page 17; see also Response 1-11).

      Reviewer #3 (Significance (Required)):

      *Overall, I find the study interesting even to a broader audience since diploid development is a fundamental feature of most animals. The authors also manage to discuss their findings on the consequences of haploidy in this bigger context of the restricted diploid development in animals. The study is very well-written even to non-experts. *

      Response 3-4

      We express our gratitude to the reviewer for providing positive comments on the significance of our findings. We sincerely hope that one-by-one verifications of the points raised by the reviewer further improve the quality of the paper.

      I am not an expert of the literature describing previous characterizations of the consequences associated with haploid cell development in animals, which is why I cannot comment on the novelty of their study. Based on my expertise on centromeres and genome organisation I can however assess the results regarding the mitotic defects observed in haploid larvae (see comments).

      Response 3-5

      We sincerely thank the reviewer for providing constructive suggestions and critiques based on the expertise.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      RESPONSE TO REVIEWS_RC-2024-02383

      We thank all the reviewers for their comments and suggestions. Our point-by-point response is shown below, in bold.

      —----------------------------------------------------------------------------------------------------------------------------

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: the work presented by the authors detail how pharmacological inhibition of the rate limiting one carbon metabolic enzyme DHFR by the drug methotrexate increases the lifespan of yeast and worms. Furthermore, placing aged mice on dietary folate and choline restriction potentially enhanced metabolic plasticity but did not significantly increase lifespan with sex specific differences observed.

      The findings in this manuscript are very interesting and important to our understanding of the conserved mechanisms that regulate longevity through one carbon metabolism. This is especially significant in light of the current folate intake and supplementation in the adult human population. The manuscript, however, requires major revisions. Please see comments below for details.

      Major comments:

      1. The overall tone in this manuscript is colloquial and conversational in nature. A third person academic style and tone, while avoiding the use of subjective descriptive terms would improve the quality of this text. Using terms such as "appeared less diverse", "results are remarkable ...strikingly more pronounced", "possibly positive outcomes", "appear younger...for unknown reasons", "little Uracil", "tended to be higher", "roughly proportional", "slightly higher", "as a rough readout", and many other examples from the text should not be used in a scientific manuscript. The language should be academic, scientific, precise, and non-ambiguous. A thorough revision of the manuscript with substantial changes to the language and tone is necessary prior to publication.

      RESPONSE: Thank you for your feedback on the manuscript's tone. We revised most of the expressions mentioned by the reviewer. We note, however, that these phrases were used along with numbers and statistics. Hence, there was no lack of specifics, and readers could quickly evaluate the conclusions. We strive for a balance between scientific rigor and readability to maintain accessibility for a diverse audience.

      In the results section, we find multiple instances where the results are interpreted and extensively discussed. This should be reserved for the discussion section. The results section should be used to simply report the findings in a detailed manner.

      RESPONSE: We appreciate the suggestion on the integration of interpretation within the Results section. Upon review, we have clarified the presentation of our findings, ensuring a more distinct separation from interpretive commentary. Brief explanations remain to aid the reader's comprehension in light of the complex data, aiming to keep the flow and coherence of the manuscript and prevent overextension of the Discussion section (already ~1,300 words long). We welcome specific suggestions for further refinement.

      The materials and methods section is severely lacking in details in some areas. For example, no details were provided regarding how the worm lifespans were conducted and previous work of collaborators were referenced instead. Important details such as worm numbers, biological and technical replicates, solid agar vs liquid culture, temperature, use of FUdR, antibiotics, transfer frequency, methods of scoring, etc... are lacking. Other details such as the preparation of the plates (Was MTX incorporated into the agar, seeded with the bacterial lawn, or liquid culture was used), storage conditions, age of the plates when lifespan started, how was the UV killing of the lawn verified etc...

      Many other methods subsections lack crucial details. Please carefully review the methodology and include sufficient pertinent details.

      RESPONSE: The number of worms assayed in each case was shown in each figure, as described in the legend. We have now also added all the information requested by the reviewer in the methods section. The text now reads:

      “Briefly, the assays were done on solid agar nematode growth media (NGM) plates prepared fresh before each experiment. The bacterial lawn was exposed twice to a UV dose of 120mJ/cm2 using a UVC-515 Ultraviolet Multilinker (Ultra-Lum, Inc.). Streaking these UV-exposed bacteria to fresh LB agar plates (1% w/v tryptone, 0.5% w/v yeast extract, 1% w/v sodium chloride) produced no visible colonies. Methotrexate, or the ATIC inhibitor, was first dissolved in dimethyl sulfoxide (DMSO) and then added to the media used to prepare the plates after autoclaving (the media were kept in a 50°C water bath until the plates were poured). Mock-treated control plates contained only DMSO. At the start of each experiment, a sufficient number of eggs were collected from plates without any drugs and then placed on plates containing the indicated doses of each compound tested. After hatching and progression to the adult stage, animals were transferred to new plates (marked as the start of the lifespan assay) containing the drug tested and fluorodeoxyuridine (FUDR; dissolved in water), added at 50μM to block hatching of new animals. The plates were scored at least every other day until all the worms died. If an animal responded to gentle touch, it was scored as alive, otherwise a death was recorded, and the animal was removed from the plate. Worms were transferred to fresh plates as needed (e.g., if there was evidence of microbial contamination, dryness/cracks on the agar surface, consumption of the bacterial lawn, or hatching of new animals that escaped the FUDR block). The reported lifespans were compiled from several independent experiments done over several months (9-10 months for the methotrexate experiments and 4-5 months for the ATIC inhibitor), each scored by multiple individuals (4-5 persons per experiment). No experiments were excluded from the analysis.”

      In the worms, interventions that impact germline proliferation can extend lifespan. Methotrexate is known to impact germline proliferation and can lead to toxic developmental effects and germline arrest. Was fecundity impacted by methotrexate using the dosages found to extend lifespan?

      RESPONSE: We did not score fecundity in our experiments.

      The authors stated that UV killed bacteria was used in the worm experiments but did not provide the reasoning for it. Virk had concluded that reduced bacterial pathogenicity is responsible for the lifespan extension and not the worm's OCM. How does your work agree with or refute these previous findings?

      RESPONSE: The dose of methotrexate used by Virk et al was very high, so it is difficult to directly compare it to our experiment. Nonetheless, we do not think there is any contradiction. We added the following in the text to clarify this point:

      “At higher doses (10-100μΜ), methotrexate did not extend lifespan (not shown), in agreement with (Virk et al., 2016), who treated adult animals with a very high dose of methotrexate (220μM). We also note that the bacteria used to feed the worms in our experiments were killed by ultraviolet radiation to exclude any impacts from bacterial folate metabolism, which is known to affect worm lifespan (Virk et al., 2016, 2012).”

      The authors state that AICAR administration (100 μM) to the worms (no experimental details were given) increases their lifespan and conclude that this is proof that manipulation of 1C metabolism promotes longevity. There are 2 concerns here. First, AMPK activation leads to inhibition of TOR and that has been shown to promote longevity in multiple models. While we agree that a significant crosstalk between TOR and OCM exists, this experiment does not necessarily contribute to the argument that the authors are making. Second, it has been established by multiple groups that inhibition (RNAi and pharmacological) of DHFR1, TYMS1, SAMS1 and possibly other OCM enzymes leads to lifespan extension in worms. These findings provide stronger evidence that OCM regulates organismal longevity.

      RESPONSE: We acknowledged prior research on lifespan extension and do not claim our use of the ATIC inhibitor as the first evidence of 1C metabolism's impact on longevity. Rather, our findings complement existing studies from us and several other groups (including the examples mentioned by the reviewer, which we had cited) by introducing novel evidence of lifespan increase through this specific inhibitor in C. elegans. Please also note that we added a detailed description of the experiment in the Methods, as suggested in a previous comment.

      In the mouse study, the authors do not provide a rationale on why a folate and choline deficient diet was adopted as opposed to only a folate deficient diet. Additionally, we assume that the diets did not contain antibiotics (succinyl sulfathiazole) to reduce microbiome folate production since it was not mentioned. Were wire bottom cages used to eliminate coprophagy? Were there any significant differences between male and female serum folate levels that could have contributed to the endpoints. Was only a subset of samples assayed for total folate? (fig 2b shows a possible n of 6 per group?). If no antibiotics and no wire bottom cages were used, mice can maintain adequate folate levels from coprophagy without developing signs of anemia. Please discuss these details as it helps clarify the conditions used.

      RESPONSE: Excellent points, and we have now added this information (see Materials and Methods):

      “We note that when designing experiments to assess the consequences of folate limitation, it is common to control both folate and choline intake to ensure that the observed effects are due to the restriction of folate (Beaudin et al., 2011) because the presence of choline can mask the effects of folate deficiency. Choline can be oxidized to betaine, which provides methyl groups for converting homocysteine to methionine, independent of the folate cycle. Choline can also be incorporated into phosphatidylcholine, a major methyl ‘sink’ in the cell, through the Kennedy pathway. Lastly, we did not use any antibiotics to interfere with the microbiome nor wire bottom cages to eliminate coprophagy. Wire bottom cages were used only in the metabolic chamber experiments.”

      Were there any significant differences between male and female serum folate levels that could have contributed to the endpoints. Was only a subset of samples assayed for total folate? (fig 2b shows a possible n of 6 per group?).

      RESPONSE: Regarding folate levels, no significant sex differences were observed. We assayed all the animals we had at 120 weeks of age, the euthanasia endpoint, as shown in Figure 2B. There were fewer females than males in both diets.

      There are instances in the results section where statements were made implying that there are differences observed ("slightly higher", "negative association") when they are not statistically significant. There can be either statistically significant differences/correlation or not. Please be precise in your wording.

      RESPONSE: We have revised the Results section to ensure that qualitative descriptions such as "slightly higher" are only used when supported by appropriate statistical evidence. We have listed all the relevant numbers in each case after performing thorough and robust statistical analyses. We note, however, that mentioning qualitative descriptors is not always unwarranted, as long as they are factual.

      Graying was observed less significantly in the F/C- group according to the authors. However, no quantitative assessment was made, and it is merely observational.

      RESPONSE: It is not clear how to quantify graying non-invasively. Hence, we simply took photographs.

      Inference to inhibition of mTOR was made, but mTOR protein and phosphorylation levels were not performed. The authors did perform western blotting on ribosomal S6 protein, however no assessment of the downstream mTOR targets P70S6k1 and 4EBP are shown.

      RESPONSE: This is a good suggestion. We added a new experiment, looking at 4EBP1 phosphorylation (see new Figure S2). The results mirror those looking at S6 phosphorylation.

      Can the change in RER in F/C- mice compared to controls be explained by the increased adiposity in these animals?

      RESPONSE: We do not know. The relationship between adiposity and respiratory exchange rate can be quite complex. The increased adiposity of male mice limited for folate may lead to higher RER, reflecting perhaps a greater reliance on carbohydrate metabolism. But this is very speculative, especially since these mice are not obese. It is unclear how the improved metabolic plasticity could be associated with adiposity for the females.

      How was the microbiome normalized between groups prior to the beginning of the experiment (fecal slurry gavage, bedding exchange, cohabitation, none of the above)? There is no mention of this crucial step in the materials and methods section. Furthermore, additional details regarding the microbiome analysis are required (analysis pipeline, read depth, denoising, software, data processing, PCA analysis, etc.). It is not sufficient to state that Zymo performed the analysis.

      RESPONSE: We now revised the text and added a detailed description of the methods, as follows:

      “There was no microbiome normalization between groups prior to the beginning of the experiment. Mouse fecal pellets were gathered by positioning the mice on a paper towel beneath an overturned glass beaker. A minimum of three fecal pellets from each animal were transferred into cryovials using sterile forceps. The samples were preserved at -80°C and shipped to Zymo Research, where they were processed and analyzed with the ZymoBIOMICS® Shotgun Metagenomic Sequencing Service (Zymo Research, Irvine, CA). For DNA extraction, the ZymoBIOMICS®-96 MagBead DNA Kit (Zymo Research, Irvine, CA) was used according to the manufacturer’s instructions. Genomic DNA samples were profiled with shotgun metagenomic sequencing. Sequencing libraries were prepared with Illumina® DNA Library Prep Kit (Illumina, San Diego, CA) with up to 500 ng DNA input following the manufacturer’s protocol using unique dual-index 10 bp barcodes with Nextera® adapters (Illumina, San Diego, CA). All libraries were pooled in equal abundance. The final pool was quantified using qPCR and TapeStation® (Agilent Technologies, Santa Clara, CA). The final library was sequenced on the NovaSeq® (Illumina, San Diego, CA) platform. The ZymoBIOMICS® Microbial Community DNA Standard (Zymo Research, Irvine, CA) was used as a positive control for each library preparation. Negative controls (i.e. blank extraction control, blank library preparation control) were included to assess the level of bioburden carried by the wet-lab process.

      Raw sequence reads were trimmed to remove low quality fractions and adapters with Trimmomatic-0.33 (Bolger et al., 2014): quality trimming by sliding window with 6 bp window size and a quality cutoff of 20, and reads with size lower than 70 bp were removed. Antimicrobial resistance and virulence factor gene identification was performed with the DIAMOND sequence aligner (Buchfink et al., 2015). Microbial composition was profiled with Centrifuge (Kim et al., 2016) using bacterial, viral, fungal, mouse, and human genome datasets. Strain-level abundance information was extracted from the Centrifuge outputs and further analyzed to perform alpha- and beta-diversity analyses and biomarker discovery with LEfSe (Segata et al., 2011) with default settings (p < 0.05 and LDA effect size > 2).”
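
      The read-trimming step in the quoted methods can be illustrated with a short sketch. This is a simplified illustration under stated assumptions, not Trimmomatic's exact algorithm: the function names are hypothetical, quality values are assumed to be integer Phred scores, and the read is simply cut at the start of the first window whose mean quality falls below the cutoff. Only the window size (6 bp), the quality cutoff (20), and the minimum length (70 bp) come from the text.

```python
from typing import List

WINDOW = 6    # window size in bp (from the quoted methods)
CUTOFF = 20   # mean Phred quality cutoff (from the quoted methods)
MIN_LEN = 70  # reads shorter than this are removed (from the quoted methods)


def sliding_window_keep_length(quals: List[int],
                               window: int = WINDOW,
                               cutoff: int = CUTOFF) -> int:
    """Return how many leading bases to keep (hypothetical helper).

    Scans from the 5' end and cuts at the start of the first window
    whose mean quality drops below the cutoff. Simplified on purpose.
    """
    for start in range(len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < cutoff:
            return start
    return len(quals)


def passes_length_filter(quals: List[int], min_len: int = MIN_LEN) -> bool:
    """Keep a read only if its trimmed length is at least min_len."""
    return sliding_window_keep_length(quals) >= min_len


# A 100 bp read whose last 20 bases are low quality
read = [30] * 80 + [10] * 20
kept = sliding_window_keep_length(read)
print(kept, passes_length_filter(read))
```

      In this toy read the cut falls two bases before the low-quality stretch, because the window mean only drops below 20 once four of its six bases are low quality.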

      What is an "easily distinguishable gut microbiome" and "appeared less diverse"?

      RESPONSE: To clarify these points, we have now edited the text as follows:

      “The different sex and diet groups had an easily distinguishable gut microbiome, occupying different areas of principal component analysis graphs (Figure 5A), based on Bray-Curtis β-diversity dissimilarity indices (Knight et al., 2018). The intestinal microbiome of male mice on the F/C- diet was not statistically less diverse (p=0.222, based on the Wilcoxon rank sum test; Figure 5 - Supplement 1).”
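
      For readers unfamiliar with the Bray-Curtis index mentioned in the revised text, the following is a minimal sketch of how the dissimilarity between two samples is usually computed from abundance vectors. The function name and the example abundances are hypothetical and are not taken from the study's data or from the company's pipeline.

```python
from typing import Sequence


def bray_curtis(u: Sequence[float], v: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance vectors.

    0 means identical composition; 1 means no shared taxa.
    """
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a + b for a, b in zip(u, v))
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Hypothetical counts for three taxa in two fecal samples
sample_a = [6, 4, 0]
sample_b = [4, 6, 0]
print(bray_curtis(sample_a, sample_b))
```

      A matrix of such pairwise values is what ordination methods then project onto a small number of axes for plotting.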


      A two-dimensional plot using two principal components would be more suitable for Figure 5A and allow for better visualization of the clustering of the groups.

      RESPONSE: We tried displaying the data on a multipanel (3 panels per group, 12 total) two-dimensional figure, but the result is more confusing. Since the sample number is small (n=6 animals per group), the 3D graphs are visually adequate and more pleasing. They are also the standard way of representing this kind of data.

      Since the authors suggest that the microbiome could be a source of 1C metabolites (including natural folate), it is important to clarify if coprophagy is involved.

      RESPONSE: We agree and have added the information as requested.

      How are inflammatory cytokines and marker levels linked to reduced anabolism and immune function in non-challenged animals?

      RESPONSE: We do not make any claims for such links if that is what the reviewer implied. If the intent was more towards speculation, we suspect one could imagine various situations. For instance, nutrients may be more heavily used during inflammation to support immune cell responses instead of central anabolic processes in other tissues, limiting the building blocks available for tissue growth and repair. Since we do not see major changes in inflammatory cytokines, we prefer not to speculate about possible links.

      When discussing the epigenetic analysis, the authors state "no changes in the DNA methylation from liver samples.." and "groups appear younger than expected". Please clarify these statements. Additional details are needed regarding the analysis performed and the choice of methylated loci and methods. Please reference the epigenetic clock or model that was used and whether it was developed for the same strain and sub-strain of mice. Is it using a modified "Horvath" mouse DNA age epigenetic clock? If so, provide the necessary details and a possible explanation for the discrepancy other than "unknown reasons".

      RESPONSE: The assay is based on the "Horvath" mouse DNA age epigenetic clock, for the strain we used (C57BL/6). We have now added a detailed description, which we received from the company, as follows (see Materials and Methods):

      "Liver samples (~15mg) collected at euthanasia were placed in 0.75mL of 1X DNA/RNA Shield™ solution (Zymo Research, Irvine, CA), shipped to Zymo Research, and processed with DNAge® Service according to their established protocols. Briefly, after DNA extraction, the EZ DNA Methylation-Lightning Kit (Zymo Research, Irvine, CA) following the standard protocol was used for bisulfite conversion. Samples were enriched specifically for the sequencing of >1000 age-associated gene loci using Simplified Whole-panel Amplification Reaction Method (SWARM®), where specific CpGs are sequenced at minimum 1000X coverage. Sequencing was run on an Illumina NovaSeq instrument. Sequences were identified by Illumina base calling software then aligned to the reference genome using Bismark. Methylation levels for each cytosine were calculated by dividing the number of reads reporting a "C" by the number of reads reporting a "C" or "T". The percentage of methylation for these specific sequences was used to assess DNA age according to Zymo Research's proprietary DNAge® predictor which had been established using elastic net regression to determine the DNAge®."

      As for a possible explanation for the discrepancy, since all our "groups appear younger than expected," unfortunately, other than "unknown reasons," we have none to offer. Nonetheless, the critical point for this study is that we saw no diet effects, regardless of where the company's assay draws the baseline.
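
      The per-cytosine methylation calculation described in the quoted methods can be sketched as follows. After bisulfite conversion, an unmethylated cytosine is read as "T" and a methylated one remains "C", so the methylation level is the fraction of "C" reads among all "C" or "T" reads at that position. The function name and the read counts below are hypothetical; the company's DNAge® predictor itself is proprietary and is not reproduced here.

```python
def methylation_level(c_reads: int, t_reads: int) -> float:
    """Fraction of reads reporting "C" among reads reporting "C" or "T".

    c_reads: reads reporting "C" (methylated) at a given cytosine
    t_reads: reads reporting "T" (unmethylated) at the same cytosine
    """
    if c_reads < 0 or t_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no coverage at this position")
    return c_reads / total


# Hypothetical CpG covered by 1000 reads, 750 of which report "C"
print(methylation_level(750, 250))
```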

      Regarding Uracil misincorporation, the liver contains significant stores of folate as it is the main hub for several critical OCM reactions (Phospholipid methylation is a major one). Earlier studies used antibiotics with or without coprophagy prevention measures to induce a state of folate depletion to induce uracil incorporation in various tissues of rodent models. There is some controversy whether dietary folic acid restriction/methyl donor restriction alone will lead to uracil misincorporation when there is no apparent depletion or anemia. Please discuss your specific experimental procedures and how it agrees or disagrees with the published literature.

      RESPONSE: We have now added the experimental details, as suggested in a previous comment. Since we do not see uracil misincorporation, we prefer not to comment on the published literature for possible links between misincorporation and anemia.

      The section discussing RPS6 needs to be rewritten, as it is difficult to understand.

      RESPONSE: We revised the text, which now reads:

      “Immunoblot analysis of liver tissue samples gathered at the time of euthanasia revealed variability in the detected values across individual mice. When examining the male mice, we observed that, on average, those fed the F/C- diet had approximately half the amount of phosphorylated RPS6 (P-RPS6) compared to those on the F/C+ diet. However, due to high variability in the measured values, the overall differences in P-RPS6 levels between the two dietary groups did not reach statistical significance (Figure 7 - Supplement 1; p>0.05, based on the Wilcoxon rank sum test).”

      Furthermore, as stated previously, considering phosphorylation of mTOR and its downstream targets 4EBP and S6K1 will give a clear indication of proliferative signaling.

      RESPONSE: As we mentioned above, we have now added the suggested 4EBP experiment (see new Figure S2).

      Additionally, these pathways are impacted by feeding status, diurnal cycles, and sex. Were these factors controlled prior to sacrifice? Were the animals sacrificed at the same time? In a fed or unfed state?

      RESPONSE: The animals were sacrificed at the same time, with no feeding limitations.

      The western blots provided in supplementary files show uneven protein loading across lanes (ponceau stain). No loading control is shown such as B-actin. A separate blot is used for total and phosphorylated proteins as opposed to gently stripping the membrane of the phosphorylated blot and re-incubating with the antibody for total. While normalizing phosphorylated to total protein levels will eliminate some of the variability in the authors' method, the uneven loading may introduce errors in the calculated ratios.

      RESPONSE: The uneven loading across mouse samples is inconsequential. We report the ratio of phospho-RPS6 to the total amount of RPS6 within each mouse sample. These ratios were then compared among the different animals and diet groups. We also note that stripping could introduce other artifacts if it is not uniform across all the blot areas.

      While the authors referenced older studies utilizing low dose methotrexate on rodents and provided a composite lifespan based on these findings, why was dietary folate and choline restriction used instead of a low dose methotrexate in mice in the current study? Please provide a rationale for this approach.

      RESPONSE: First, in the context of current folate fortification policies, we reasoned that testing dietary folate limitation late in life would be more informative. Second, three of us (M.P., B.K.K., and M.K.) proposed to the Interventions Testing Program at the National Institutes of Health to test whether low-dose methotrexate extends lifespan in mice. The proposal was accepted, and the study is ongoing (the ITP decided to test methotrexate at 0.2ppm, starting at 14 months of age; https://www.nia.nih.gov/research/dab/interventions-testing-program-itp/supported-interventions).

      Minor comments:

      1. While the authors make compelling arguments that lower folate intake later in life may promote healthy aging, an important consideration in the human population is that a considerable percentage of older individuals may be consuming an excessive amount of folate due to the combination of fortification and voluntary supplementation. An alternate hypothesis that could apply to humans and lab models is that the existing levels of exposure to folate/folic acid may be accelerating the aging process and promoting disease in later life.

      RESPONSE: Perhaps, but as we describe in the text (2nd paragraph in the introduction):

      “...analyses ‘did not identify specific risks from existing mandatory folic acid fortification’ in the general population (Field and Stover, 2018). This conclusion neither refutes nor contradicts the idea that a moderate decrease in folic acid intake among older adults may improve healthspan. Merely because high folic acid intake does not harm the health of older adults does not negate the possibility that a lower folic acid intake might enhance health.”

      The common C57BL/6j is being referred to as the "long lived strain". Is this relative to mice in wild conditions? There are many transgenic C57bl/6 strains that live considerably longer. Please clarify if this is meant to describe the aged mice used in the experimental process.

      RESPONSE: This was from a comprehensive comparison of many different inbred strains. We apologize for omitting the citation, which we have now added (Yuan et al, 2009).

      While the authors state early in the manuscript that longevity was not a measured outcome in the mouse study, the manuscript contains statements discussing animal survival in the results and survival curves (figure 2). This gives the impression that the study was planned as a survival analysis initially and since no difference was observed between the experimental groups during the earlier stages, the secondary endpoints of health span analysis were adopted. Either approach does not detract from the significance of the study's findings. Further clarity on the approach would be beneficial to the readers.

      RESPONSE: The study was designed, and the Animal Use Protocol was institutionally approved for healthspan, not lifespan. The number of animals we used did not have sufficient power to detect lifespan differences. Note that, at least for males, very few animals had died by 120 weeks, our approved euthanasia endpoint. However, it was important to report that folate limitation did not adversely affect overall survival during the analysis time frame.

      For yeast culture conditions, what are the folate sources and content? Is there added folic acid, similar to cell culture conditions where supraphysiological concentrations are used in standard media (RPMI and DMEM)?

      RESPONSE: The yeast media we used were undefined (YPD, see Materials and Methods). The source of folate in these media is “yeast extract,” which is generally considered to contain very high amounts of folate (it was used decades ago to treat anemia and folate deficiency in pregnant women). Note also that, unlike animals, yeast can synthesize folate.

      In the metabolism section, the authors make statements such as "the differences were minimal" , "probably were due..", "minimal effects", "apparent increase", "tended to be", "little uracil" etc.. please refrain from using subjective language and use precise scientific terms.

      RESPONSE: Please see our earlier response to this comment.

      Figure 2-c, there is a typo, Weeks not months

      RESPONSE: Corrected. Thank you!

      **Referees cross-commenting**

      While we generally agree with the other reviewers' concerns, we find reviewer 3's rejection of the authors' conclusion, without considering the evidence presented in the context of what is currently known in the field, potentially limiting. Multiple groups have shown that manipulation of OCM enzymes (DHFR, TYMS, SAMS) can extend lifespan in worms. The recent report from Antebi's group (Annibal et al. Nature Com, 2021) provides strong evidence that OCM is central to longevity regulation in worms and mice and that folate intake can interact with and modulate organismal longevity. While this manuscript's findings are not conclusive, I think it is premature to dismiss it completely. Perhaps the alternative is to discuss the limitations of this approach and interpret the results (or the lack of significant differences) in order to help guide future research into this important subject. Generalizing rodent results to humans is always going to be a limiting factor in this type of work. Mice have significantly higher circulating folate. Additionally, DHFR activity (the rate limiting enzyme in folate OCM) in rodents can be up to 100 times higher than its human equivalent. Another consideration is that mice, similar to other rodents, engage in coprophagy, thereby recycling and supplementing bacterially produced folate in the absence of antibiotics in the diet. Therefore, mice placed on dietary folate restriction in the absence of antibiotics do not develop signs of anemia or deficiency. Therefore, it could be argued that there is no loss of nutrients in mice in this scenario and that supplementation at the arbitrarily recommended level of synthetic folic acid (2mg/kg day) or higher could impact health and aging. Similarly, in humans excess folate intake has been controversially associated with a number of deleterious health effects. It is important not to dismiss these reports and to encourage further research into this subject that impacts a significant percentage of the human population due to the widespread use of supplements.

      RESPONSE: We thank the reviewers for their evaluation of the work we presented. We have also added the following in the discussion, expanding the limitations of the study:

      “Since mice engage in coprophagy, microbiome contributions to folate metabolism are bound to be substantial in this species. There are also significant differences in folate status between mice and people. For example, people have lower levels (~10-15 ng/mL) of serum folate than mice (Bailey et al., 2015), and the activity of DHFR, an enzyme essential for maintaining tetrahydrofolate pools (the folate form used in 1C reactions), may be only 2% of that in rodents (Bailey and Ayling, 2009). Hence, mice are likely more refractory to a low folate dietary intake.”

      Reviewer #1 (Significance (Required)):

      Significance:

      A major strength of this study is that the authors show that manipulation of OCM either through pharmacological inhibition or dietary restriction can impact organismal longevity in a conserved manner across species from yeast to worms and mammals. These findings provide compelling evidence that folate intake and metabolism in humans should be rigorously researched as potential regulator of aging. These findings complement and agree with a recent report by Antebi's group (Annibal et al. Nature Com, 2021) highlighting that long-lived worm and mice strains exhibit similar metabolic regulation of one carbon metabolism. In the same report low levels of folate supplementation partially or completely abrogated the lifespan extension in some models. This study provides additional evidence that restricting OCM through drugs or dietary restriction can significantly impact healthspan and lifespan. Additionally, it raises the question whether excessive folate intake in aged adults may have potentially deleterious effects on health and longevity. The limitations of this study can be seen in the overall lack of significant impact of the dietary intervention on the health metrics that were measured in mice. The study does not provide strong evidence that restricting folate and choline intake will produce favorable effects on health. Similarly, no significant impact on mice lifespan was observed based on the partial lifespan analysis. Further clarity is needed regarding the experimental procedures and methods used. The study, nonetheless, is an important step towards investigating the role of folate and OCM in regulating mammalian healthspan and lifespan. Future studies can expand on these findings and investigate whether OCM interventions that are started in early life can produce significant and measurable effects on longevity and health in mammals. The findings here provide a conceptual and incremental advance in our understanding of these complex interactions.

      These findings are important to the research communities especially in the areas of longevity, metabolism, and nutrition.

      RESPONSE: We appreciate the recognition of our work's significance in furthering understanding of longevity, metabolism, and nutrition. We would also like to stress that this study is not an incremental advance. We believe our study's focus on dietary folate limitation in aged mice represents a novel and more radical contribution, considering the lack of prior research in this specific context, underscoring the distinctiveness and importance of our findings.

      —---------------------------------------------------------------------------------------------------------------

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: In this manuscript the authors investigate whether disruption of the folate cycle can slow ageing/improve health in yeast, worms and mice. There are a few experiments in yeast and C. elegans but the rest is a meta-analysis of some old data on folate-deprived mice and their own study of mice on a diet with and without folic acid and choline. They find that various interventions of the folate cycle extend lifespan in yeast and worms, that the old study suggests mice live longer without folic acid supplementation, and that there is no change to healthspan in mice without folic acid and choline in the diet late in life and that these mice show some positive benefits. Analysis of the microbiome and the transcriptomics suggests small changes to the microbiota and changes in gene expression. Overall the authors conclude that biosynthetic processes have been inhibited without negative effects on healthspan.

      Major comments

      1. The two worm lifespan experiments in Fig 1 show very different controls despite the methods stating that the conditions were the same. Controls can vary from one experiment to another but the difference is striking. It would be good to have supplementary data about the number of repeats and other data about these experiments.

      RESPONSE: We also noted the difference. However, we believe our conclusions are valid and robust because we used only experiment-matched controls for each comparison. We now describe in detail how the experiments were done (see revised Materials and Methods). Lastly, the two compounds were tested years apart by different individuals, and the different lifespans of the controls could arise from differences in the media batches, temperature control, etc.

      The diet lacks folic acid and choline, yet the conclusions are only about folate. The choline aspect of the diet needs to be acknowledged as a potential factor.

      RESPONSE: As we mentioned above, we have now added this information (see Materials and Methods):

      “We note that when designing experiments to assess the consequences of folate limitation, it is common to control both folate and choline intake to ensure that the observed effects are due to the restriction of folate (Beaudin et al., 2011) because the presence of choline can mask the effects of folate deficiency. Choline can be oxidized to betaine, which provides methyl groups for converting homocysteine to methionine, independent of the folate cycle. Choline can also be incorporated into phosphatidylcholine, a major methyl ‘sink’ in the cell, through the Kennedy pathway. Lastly, we did not use any antibiotics to interfere with the microbiome nor wire bottom cages to eliminate coprophagy. Wire bottom cages were used only in the metabolic chamber experiments.”

      The authors argue that the effects of the diet on the mice are not mediated by the microbiome because there is no statistically significant effect on diversity. However, they do show a clear difference at the metagenomic level that fits with a metabolic difference. The argument also ignores work in C. elegans showing that inhibition of bacterial folate synthesis increases lifespan, not by decreasing folate supply but because lowered bacterial folate prevents an age-accelerating activity in the bacteria (Virk et al 2016). It has also been shown that a breakdown product of folic acid can be taken up by bacteria and influence ageing (Maynard et al 2018). I do not think the evidence is strong enough to conclude that the changes seen in the mice are not mediated by microbes.

      RESPONSE: We do not state that “changes seen in the mice are not mediated by microbes”. On the contrary, we agree with the reviewer that the microbiome likely contributes significantly, and we hope this is conveyed in the text. We also agree with the references the reviewer pointed out, which we cite (see also our response to point#5 of reviewer 1).

      Minor comments

      1. It was shown a long time ago that sams-1 mutants in C. elegans have an extended lifespan. MTX is likely to influence SAMS levels. This point needs to be mentioned.

      RESPONSE: Thank you. We added the reference.

      Page - 6 "folate accelerates worm aging". This statement is not correct and is not what Virk et al 2016 suggests.

      RESPONSE: We revised it to the following: “It has been reported that treating worms with high levels of methotrexate (220 μM) at the adult stage did not extend their lifespan (Virk et al., 2016)”.

      Page 7. "at 100μM, a dose similar to the one used in mice with metabolic syndrome (Asby et al., 2015)." It's not valid to compare the concentration of a drug in the media in a C. elegans experiment to a dose given to mice.

      RESPONSE: We appreciate the reviewer's point on comparing drug dosages across species. The intention was to provide a reference point for the concentration used rather than suggesting a direct equivalence with outcomes. We recognize the complexities of cross-species dosage comparisons and have amended the text to clarify that the mention of dosage is for contextual purposes only.

      ** Referees cross-commenting**

      I would like to add that it is important to consider whether there are in fact negative effects of folic acid given in later life and this is one of the only studies that addresses this question in a mammalian model, and thus needs to be reported, once the issues raised have been addressed.

      RESPONSE: As we mentioned in response to a comment from reviewer 1 and describe in the text (2nd paragraph in the introduction):

      “...analyses ‘did not identify specific risks from existing mandatory folic acid fortification’ in the general population (Field and Stover, 2018). This conclusion neither refutes nor contradicts the idea that a moderate decrease in folic acid intake among older adults may improve healthspan. Merely because high folic acid intake does not harm the health of older adults does not negate the possibility that a lower folic acid intake might enhance health.”

      Reviewer #2 (Significance (Required)):

      The main strength of this manuscript is that it examines the effect of mice given a folate and choline deficient diet late in life and finds mostly positive effects. This finding challenges the dogma that folate

      —--------------------------------------------------------------------------------------------------

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Blank/Polymenis and colleagues explore how reduced folate metabolism impacts aging. While folate supplementation is known to benefit the development and health of young people, little is known about the impact of this substrate at advanced ages. The paper consists of two parts: 1) blocking folate metabolism in yeast and C. elegans while measuring lifespan (reproductive or age of death); 2) measuring a vast array of traits in mice where folate (and choline) is removed from the diet starting at age 1 year. The second approach is most central to the paper's theme, and the authors conclude their "data raise the exciting possibility that ... reduced folate intake later in life might be beneficial." However, I do not accept this conclusion. Instead, the overwhelming fact is that there were no changes in any phenotype due to the absence of F/C in the older animals. Loss of this nutrient is neutral, although perhaps bad for the kidney. In my view, the authors misinterpret their very basic results: loss of dietary folate has no impact on aged mice (one strain, at that). And there is no way to generalize this simple conclusion to humans.

      RESPONSE: We respectfully disagree with the reviewer's assessment of our study's conclusions and its significance. With the primary focus on evaluating the effects of reduced folate intake in aged mice, we explored a comprehensive range of healthspan markers and molecular analyses. Contrary to the reviewer's assertion, our data demonstrate significant outcomes such as altered body weight and metabolic parameters in mice subjected to folate restriction, along with insights into molecular changes indicative of lower anabolism.

      The reviewer's interpretation that folate limitation has no observable impact on aged mice overlooks the nuanced findings presented in our study. While acknowledging the neutral effects observed in some phenotypes, we contend that our results collectively contribute to a deeper understanding of the implications of late-life folate restriction. It is unwarranted to dismiss these findings.

      Generalizing findings from model systems to humans is indeed complex, as noted by the reviewer. However, our study, alongside existing literature, provides valuable insights that warrant consideration and further exploration. We stand by the rigor of our methodology, the diversity of data presented, and the significance of our results in enhancing knowledge on the impact of folate metabolism in aging models.

      There are other issues throughout the work that need to be addressed, but given the weakness of its key argument, I will not elaborate on these points.

      RESPONSE: Since the reviewer offered no specifics on “other issues,” we cannot respond. We hope, however, that we have addressed them in our response to the other reviewers’ comments.

      Reviewer #3 (Significance (Required)):

      Blank/Polymenis and colleagues explore how reduced folate metabolism impacts aging. While folate supplementation is known to benefit the development and health of young people, little is known about the impact of this substrate at advanced ages.

      RESPONSE: We concur with the reviewer's observation regarding the knowledge gap surrounding the impact of reduced folate metabolism on aging, particularly in advanced stages of life, which is why our study significantly contributes to the field. As we mentioned above, not only do we report that some healthspan metrics were improved in folate-limited animals (e.g., body weight, improved metabolic plasticity), but our study also offers for the first time a comprehensive biomarker analysis of folate limitation late in life (e.g., metabolite and mRNA changes associated with lower anabolism, lower IGF1 levels in females). This original contribution enhances our understanding of the complex interplay between folate metabolism and aging.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: The work presented by the authors details how pharmacological inhibition of the rate-limiting one-carbon metabolic enzyme DHFR by the drug methotrexate increases the lifespan of yeast and worms. Furthermore, placing aged mice on dietary folate and choline restriction potentially enhanced metabolic plasticity but did not significantly increase lifespan, with sex-specific differences observed. The findings in this manuscript are very interesting and important to our understanding of the conserved mechanisms that regulate longevity through one-carbon metabolism. This is especially significant in light of the current folate intake and supplementation in the adult human population. The manuscript, however, requires major revisions. Please see comments below for details.

      Major comments:

      1. The overall tone in this manuscript is colloquial and conversational in nature. A third person academic style and tone, while avoiding the use of subjective descriptive terms would improve the quality of this text. Using terms such as "appeared less diverse", "results are remarkable ...strikingly more pronounced", "possibly positive outcomes" , "appear younger...for unknown reasons", "little Uracil", "tended to be higher", "roughly proportional", "slightly higher", "as a rough readout", and many other examples from the text should not be used in a scientific manuscript. The language should be academic, scientific, precise, and non-ambiguous. A thorough revision of the manuscript with substantial changes to the language and tone is necessary prior to publication.
      2. In the results section, we find multiple instances where the results are interpreted and extensively discussed. This should be reserved for the discussion section. The results section should be used to simply report the findings in a detailed manner.
      3. The materials and methods section is severely lacking in details in some areas. For example, no details were provided regarding how the worm lifespans were conducted and previous work of collaborators were referenced instead. Important details such as worm numbers, biological and technical replicates, solid agar vs liquid culture, temperature, use of FUdR, antibiotics, transfer frequency, methods of scoring, etc... are lacking. Other details such as the preparation of the plates (Was MTX incorporated into the agar, seeded with the bacterial lawn, or liquid culture was used), storage conditions, age of the plates when lifespan started, how was the UV killing of the lawn verified etc... many other methods subsections lack crucial details. Please carefully review the methodology and include sufficient pertinent details.
      4. In the worms, interventions that impact germline proliferation can extend lifespan. Methotrexate is known to impact germline proliferation and can lead to toxic developmental effects and germline arrest. Was fecundity impacted by methotrexate using the dosages found to extend lifespan?
      5. The authors stated that UV-killed bacteria were used in the worm experiments but did not provide the reasoning for it. Virk had concluded that reduced bacterial pathogenicity is responsible for the lifespan extension and not the worm's OCM. How does your work agree with or refute these previous findings?
      6. The authors state that AICAR administration to the worms (100 μM; no experimental details were given) increases their lifespan and conclude that this is proof that manipulation of 1C metabolism promotes longevity. There are two concerns here. First, AMPK activation leads to inhibition of TOR, and that has been shown to promote longevity in multiple models. While we agree that a significant crosstalk between TOR and OCM exists, this experiment does not necessarily contribute to the argument that the authors are making. Second, it has been established by multiple groups that inhibition (RNAi and pharmacological) of DHFR1, TYMS1, SAMS1 and possibly other OCM enzymes leads to lifespan extension in worms. These findings provide stronger evidence that OCM regulates organismal longevity.
      7. In the mouse study, the authors do not provide a rationale for why a folate and choline deficient diet was adopted as opposed to only a folate deficient diet. Additionally, we assume that the diets did not contain antibiotics (succinyl sulfathiazole) to reduce microbiome folate production, since this was not mentioned. Were wire bottom cages used to eliminate coprophagy? Were there any significant differences between male and female serum folate levels that could have contributed to the endpoints? Was only a subset of samples assayed for total folate (Fig 2b shows a possible n of 6 per group)? If no antibiotics and no wire bottom cages were used, mice can maintain adequate folate levels from coprophagy without developing signs of anemia. Please discuss these details, as doing so helps clarify the conditions used.
      8. There are instances in the results section where statements were made implying that differences were observed ("slightly higher", "negative association") when they are not statistically significant. A difference or correlation is either statistically significant or it is not. Please be precise in your wording.
      9. According to the authors, less graying was observed in the F/C- group. However, no quantitative assessment was made, and the finding is merely observational. Inhibition of mTOR was inferred, but mTOR protein and phosphorylation levels were not measured. The authors did perform western blotting on ribosomal S6 protein; however, no assessment of the downstream mTOR targets P70S6k1 and 4EBP is shown.
      10. Can the change in RER in F/C- mice compared to controls be explained by the increased adiposity in these animals?
      11. How was the microbiome normalized between groups prior to the beginning of the experiment (fecal slurry gavage, bedding exchange, cohabitation, none of the above)? There is no mention of this crucial step in the materials and methods section. Furthermore, additional details regarding the microbiome analysis are required (analysis pipeline, read depth, denoising, software, data processing, PCA analysis, etc.). It is not sufficient to state that Zymo performed the analysis. What is an "easily distinguishable gut microbiome" and what does "appeared less diverse" mean? A two-dimensional plot using two principal components would be more suitable for image 5A and allow for better visualization of the clustering of the groups. Since the authors suggest that the microbiome could be a source of 1C metabolites (including natural folate), it is important to clarify if coprophagy is involved.
      12. How are inflammatory cytokines and marker levels linked to reduced anabolism and immune function in non-challenged animals?
      13. When discussing the epigenetic analysis, the authors state "no changes in the DNA methylation from liver samples.." and "groups appear younger than expected". Please clarify these statements. Additional details are needed regarding the analysis performed and the choice of methylated loci and methods. Please reference the epigenetic clock or model that was used and state whether it was developed for the same strain and sub-strain of mice. Is it a modified "Horvath" mouse DNA age epigenetic clock? If so, provide the necessary details and a possible explanation for the discrepancy other than "unknown reasons".
      14. Regarding uracil misincorporation, the liver contains significant stores of folate, as it is the main hub for several critical OCM reactions (phospholipid methylation is a major one). Earlier studies used antibiotics with or without coprophagy prevention measures to induce a state of folate depletion and thereby uracil incorporation in various tissues of rodent models. There is some controversy over whether dietary folic acid restriction/methyl donor restriction alone will lead to uracil misincorporation when there is no apparent depletion or anemia. Please discuss your specific experimental procedures and how they agree or disagree with the published literature.
      15. The section discussing RPS6 needs to be rewritten, as it is difficult to understand. Furthermore, as stated previously, considering phosphorylation of mTOR and its downstream targets 4EBP and S6K1 will give a clear indication of proliferative signaling. Additionally, these pathways are impacted by feeding status, diurnal cycles, and sex. Were these factors controlled prior to sacrifice? Were the animals sacrificed at the same time? In a fed or unfed state?
      16. The western blots provided in supplementary files show uneven protein loading across lanes (ponceau stain). No loading control, such as β-actin, is shown. A separate blot is used for total and phosphorylated proteins, as opposed to gently stripping the membrane of the phosphorylated blot and re-incubating with the antibody for total protein. While normalizing phosphorylated to total protein levels will eliminate some of the variability in the authors' method, the uneven loading may introduce errors in the calculated ratios.
      17. While the authors referenced older studies utilizing low dose methotrexate on rodents and provided a composite lifespan based on these findings, why was dietary folate and choline restriction used instead of a low dose methotrexate in mice in the current study? Please provide a rationale for this approach.

      Minor comments:

      1. While the authors make compelling arguments that lower folate intake later in life may promote healthy aging, an important consideration in the human population is that a considerable percentage of older individuals may be consuming an excessive amount of folate due to the combination of fortification and voluntary supplementation. An alternate hypothesis that could apply to humans and lab models is that the existing levels of exposure to folate/folic acid may be accelerating the aging process and promoting disease in later life.
      2. The common C57BL/6J is referred to as the "long lived strain". Is this relative to mice in wild conditions? There are many transgenic C57BL/6 strains that live considerably longer. Please clarify if this is meant to describe the aged mice used in the experimental process.
      3. While the authors state early in the manuscript that longevity was not a measured outcome in the mouse study, the manuscript contains statements discussing animal survival in the results and survival curves (figure 2). This gives the impression that the study was planned as a survival analysis initially and since no difference was observed between the experimental groups during the earlier stages, the secondary endpoints of health span analysis were adopted. Either approach does not detract from the significance of the study's findings. Further clarity on the approach would be beneficial to the readers.
      4. For yeast culture conditions, what are the folate sources and content? Is there added folic acid, similar to cell culture conditions where supraphysiological concentrations are used in standard media (RPMI and DMEM)?
      5. In the metabolism section, the authors make statements such as "the differences were minimal" , "probably were due..", "minimal effects", "apparent increase", "tended to be", "little uracil" etc.. please refrain from using subjective language and use precise scientific terms.
      6. Figure 2-c, there is a typo, Weeks not months

      ** Referees cross-commenting**

      While we generally agree with the other reviewers' concerns, we find reviewer 3's rejection of the authors' conclusion, without considering the evidence presented in the context of what is currently known in the field, potentially limiting. Multiple groups have shown that manipulation of OCM enzymes (DHFR, TYMS, SAMS) can extend lifespan in worms. The recent report from Antebi's group (Annibal et al. Nature Com, 2021) provides strong evidence that OCM is central to longevity regulation in worms and mice and that folate intake can interact with and modulate organismal longevity. While this manuscript's findings are not conclusive, I think it is premature to dismiss it completely. Perhaps the alternative is to discuss the limitations of this approach and interpret the results (or the lack of significant differences) in order to help guide future research into this important subject. Generalizing rodent results to humans is always going to be a limiting factor in this type of work. Mice have significantly higher circulating folate. Additionally, DHFR activity (the rate-limiting enzyme in folate OCM) in rodents can be up to 100 times higher than its human equivalent. Another consideration is that mice, similar to other rodents, engage in coprophagy, thereby recycling and supplementing bacterially produced folate in the absence of antibiotics in the diet. Therefore, mice placed on dietary folate restriction in the absence of antibiotics do not develop signs of anemia or deficiency. It could thus be argued that there is no loss of nutrients in mice in this scenario and that supplementation at the arbitrarily recommended level of synthetic folic acid (2mg/kg day) or higher could impact health and aging. Similarly, in humans, excess folate intake has been controversially associated with a number of deleterious health effects. It is important not to dismiss these reports and to encourage further research into this subject, which impacts a significant percentage of the human population due to the widespread use of supplements.

      Significance

      A major strength of this study is that the authors show that manipulation of OCM, either through pharmacological inhibition or dietary restriction, can impact organismal longevity in a conserved manner across species from yeast to worms and mammals. These findings provide compelling evidence that folate intake and metabolism in humans should be rigorously researched as a potential regulator of aging. These findings complement and agree with a recent report by Antebi's group (Annibal et al. Nature Com, 2021) highlighting that long-lived worm and mouse strains exhibit similar metabolic regulation of one-carbon metabolism. In the same report, low levels of folate supplementation partially or completely abrogated the lifespan extension in some models. This study provides additional evidence that restricting OCM through drugs or dietary restriction can significantly impact healthspan and lifespan. Additionally, it raises the question of whether excessive folate intake in aged adults may have potentially deleterious effects on health and longevity. The limitations of this study can be seen in the overall lack of significant impact of the dietary intervention on the health metrics that were measured in mice. The study does not provide strong evidence that restricting folate and choline intake will produce favorable effects on health. Similarly, no significant impact on mouse lifespan was observed based on the partial lifespan analysis. Further clarity is needed regarding the experimental procedures and methods used. The study, nonetheless, is an important step towards investigating the role of folate and OCM in regulating mammalian healthspan and lifespan. Future studies can expand on these findings and investigate whether OCM interventions that are started in early life can produce significant and measurable effects on longevity and health in mammals. The findings here provide a conceptual and incremental advance in our understanding of these complex interactions.

      These findings are important to the research communities especially in the areas of longevity, metabolism, and nutrition.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Will the nanobody be available to the TB research community?

      Yes, we will make E11rv available upon request. Please see our materials availability statement.

      Reviewer #2 (Recommendations For The Authors):

      (1) It would be interesting to test the potential impact of residual ASB-14 contaminant on the biochemical behavior of ESAT-6-CFP10 heterodimer and ESAT-6 homodimer or tetramer and their hemolytic activity in comparison with the ones without ASB-14.

      We agree that this is an interesting line of questioning. Based on the study by Refai et al. that we cite in the text, ESAT-6 treated with nonionic detergents ASB-14 or LDAO, but not other common detergents, undergoes a conformational change that increases its cytotoxicity in cell assays, hemolytic activity, and ability to dimerize with CFP-10. What is not known at this point is how similar the ASB-bound conformation is to anything seen physiologically.

      (2) Building on the progress in making anti-ESAT-6 nanobodies and their anti-Mtb effects in the cells, it could have been tested in human or mouse primary macrophages infected with Mtb and a mouse model of Mtb infection for its anti-Mtb efficiency.

      We thank the reviewer for this suggestion, and we agree that these would be very informative next steps for determining the therapeutic potential of anti-ESAT-6 nanobodies.

      Reviewer #3 (Recommendations For The Authors):

      Minor comments:

      Line 133: "It is well established that Mm-induced hemolysis is ESX-1 dependent, but our results suggest that Mtb must lack one or more factors necessary for efficient hemolysis.". I would tone this down a bit, as it is also known that M. tuberculosis escapes much later than M. marinum from the phagosome, which could indicate different kinetics.

      We thank the reviewer for their insightful comments. We agree that the kinetics of Mtb and Mm infection are quite different and that this may impact the hemolysis assay. As described by Augenstreich et al., some hemolysis by Mtb is observed at 48 hours, though the method of normalization makes it impossible to determine the absolute amount of hemolysis that occurred in their experiment. Our findings show only that the absolute amount of Mtb hemolysis in 2 hours is negligible, setting it apart from Mm. We have edited the wording of this statement in the manuscript to avoid any confusion.

      Line 155: "Because Mtb often exists in an acidified compartment". First of all, the reference used here does not discuss anything about Mtb; secondly, papers that do measure the acidification of Mtb-loaded phagosomes indicate that this acidification is very mild (typically to pH 6.2).

      We agree that this point should be articulated more precisely. We have added additional clarification that the pH of Mtb-containing compartments in macrophages can fall in a broad range depending on the activation state of the macrophages, and that these compartments are typically only mildly acidic in non-activated macrophages. We have updated our references to better describe the current state of knowledge on this topic.

      Line 339: "Whereas most of these functions rely only on the secretion of ESAT-6 into the cytoplasm, the ability of E11rv to access Mtb suggests that this communication is likely two-way." No, not necessarily; there are many processes in which ESX-1 substrates affect the macrophage. This nanobody could affect EsxA functioning only once the bacteria reach the cytoplasm. I think checking phagosomal escape in these cells is therefore crucial.

      We agree that phagosomal escape and subsequent direct secretion of ESAT-6 into the cytoplasm is a reasonable alternative hypothesis. We have added this point to our discussion, and we agree that looking directly at phagosomal escape is an important next step.

      Figure 7 is not mentioned in the text (mistake for Fig 6).

      This has been corrected.

    1. Author response:

      Public Reviews: 

      Reviewer #1 (Public Review): 

      As a reviewer for this manuscript, I recognize its significant contribution to understanding the immune response to saprophytic Leptospira exposure and its implications for leptospirosis prevention strategies. The study is well-conceived, addressing an innovative hypothesis with potentially high impact. However, to fully realize its contribution to the field, the manuscript would benefit greatly from a more detailed elucidation of immune mechanisms at play, including specific cytokine profiles, antigen specificity of the antibody responses, and long-term immunity. Additionally, expanding on the methodological details, such as immunophenotyping panels, qPCR normalization methods, and the rationale behind animal model choice, would enhance the manuscript's clarity and reproducibility. Implementing functional assays to characterize effector T-cell responses and possibly investigating the microbiota's role could offer novel insights into the protective immunity mechanisms. These revisions would not only bolster the current findings but also provide a more comprehensive understanding of the potential for saprophytic Leptospira exposure in leptospirosis vaccine development. Given these considerations, I believe that after substantial revisions, this manuscript could represent a valuable addition to the literature and potentially inform future research and vaccine strategy development in the field of infectious diseases. 

      We have been interested in understanding how pathogenic and non-pathogenic Leptospira species affect each other in a mammalian reservoir host. With the current study we continue to elucidate the immune mechanisms engaged by pathogenic Leptospira interrogans versus non-pathogenic L. biflexa, as a follow-up to our previous work (Shetty et al, 2021 PMID: 34249775, and Kundu et al 2022 PMID 35392072). We found that both species engaged partially overlapping myeloid immune cells and inflammatory signatures of infection. For example, some chemokines were increased, and macrophages and dendritic cells were engaged at 24h post inoculation with both species of Leptospira (PMID: 34249775). Thus, we questioned whether this robust innate immune response, raised to eliminate an immunogenic but non-pathogenic bacterium, could also help restrain L. interrogans pathogenesis. In this study we show that exposure to L. biflexa before L. interrogans challenge mediates improved kidney homeostasis, mitigates leptospirosis severity and leads to increased shedding of L. interrogans in urine. This suggests an interspecies symbiotic commensalistic process that facilitates survival of the pathogenic species. These findings have high impact on the lives of millions of people in areas endemic for leptospirosis who are naturally exposed to non-pathogenic Leptospira species.

      We will expand on the methodological details and will update the introduction and discussion to include answers to questions raised by the three reviewers to further clarify the importance and impact of our study.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors try to achieve a method of protection against pathogenic strains using saprophytic species. It is undeniable that the saprophytic species, despite not causing the disease, activates an immune response. However, based on these results, using the saprophytic species does not significantly impact the animal's infection by a virulent species. 

      We separate concepts of exposure to a non-virulent bacterium that establishes a brief infection with engagement of an immune response (L. biflexa), from infection established by a virulent species of Leptospira that leads to pathogenesis (L. interrogans). While trying to understand how both pathogenic and non-pathogenic Leptospira species affect each other on a mammalian reservoir host, we previously found that L. biflexa induces immune responses that should affect immunity of populations naturally exposed to this spirochete. Thus, we designed this study to answer that question.

      Strengths: 

      Exposure to the saprophytic strain before the virulent strain reduces animal weight loss, reduces tissue kidney damage, and increases cellular response in mice.

      Weaknesses: 

      Even after the challenge with the saprophyte strain, kidney colonization and the release of bacteria through urine continue. Moreover, the authors need to determine the impact on survival if the experiment ends on the 15th. 

      Another novel and unexpected aspect of our findings in the single exposure experiment was that L. biflexa pre-exposure mediated a homeostatic environment in the kidney (lower ColA1, healthier renal physiology) that restrained pathogenesis of L. interrogans after challenge, which resulted in better health outcomes and increased shedding of L. interrogans in urine; in contrast, if the kidney is compromised (high ColA1) by L. interrogans (without L. biflexa pre-exposure), there was lower shedding of L. interrogans in urine. Interestingly, this suggests an interspecies symbiotic commensalistic process that facilitates survival of the pathogenic species. Thus, these data suggest that higher shedding of L. interrogans in urine may not be a hallmark of increased disease; rather, it could be the opposite.

      We will include these concepts in the updated discussion.

      We don’t think that extending this experiment to d21 or d28 would add relevant data to our findings. We provide survival curves for both experiments up to d15 post infection.

      Reviewer #3 (Public Review): 

      Summary: 

      Kundu et al. investigated the effects of pre-exposure to a non-pathogenic Leptospira strain in the prevention of severe disease following subsequent infection by a pathogenic strain. They utilized a single or double exposure method to the non-pathogen prior to challenge with a pathogenic strain. They found that prior exposure to a non-pathogen prevented many of the disease manifestations of the pathogen. Bacteria, however, were able to disseminate, colonize the kidneys, and be shed in the urine. This is an important foundational work to describe a novel method of vaccination against leptospirosis. Numerous studies have attempted to use recombinant proteins to vaccinate against leptospirosis, with limited success. The authors provide a new approach that takes advantage of the homology between a non-pathogen and a pathogen to provide heterologous protection. This will provide a new direction in which we can approach creating vaccines against this re-emerging disease. 

      Strengths: 

      The major strength of this paper is that it is one of the first studies utilizing a live non-pathogenic strain of Leptospira to immunize against severe disease associated with leptospirosis. They utilize two independent experiments (a single and double vaccination) to define this strategy. This represents a very interesting and novel approach to vaccine development. This is of clear importance to the field. 

      The authors use a variety of experiments to show the protection imparted by pre-exposure to the non-pathogen. They look at disease manifestations such as death and weight loss. They define the ability of Leptospira to disseminate and colonize the kidney. They show the effects infection has on kidney architecture and a marker of fibrosis. They also begin to define the immune response in both of these exposure methods. This provides evidence of the numerous advantages this vaccination strategy may have. Thus, this study provides an important foundation for future studies utilizing this method to protect against leptospirosis. 

      Weaknesses: 

      Although they provide some evidence of the utility of pretreatment with a non-pathogen, there are some areas in which the paper needs to be clarified and expanded. 

      The authors draw their conclusions based on the data presented. However, they state the graphs only represent one of two independent experiments. Each experiment utilized 3-4 mice per group. In order to be confident in the conclusions, a power analysis needs to be done to show that there is sufficient power with 3-4 mice per group. In addition, it would be important to show both experiments in one graph which would inherently increase the power by doubling the group size, while also providing evidence that this is a reproducible phenotype between experiments. Overall, this weakens the strength of the conclusions drawn and would require additional statistical analysis or additional replicates to provide confidence in these conclusions. 

      We will take these suggestions into consideration and will address as many of these issues as possible in the revised manuscript.
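
      One quick check related to the sample-size concern above can be sketched as follows. It is not a full power analysis, and it assumes an exact two-sided rank or permutation test with no ties, which the source does not say was used; the function name `min_two_sided_p` is hypothetical. The sketch computes the smallest p-value such a test can return for a given group size, which shows why pooling two experiments of 3-4 mice per group matters.

```python
from math import comb


def min_two_sided_p(n1, n2):
    """Smallest two-sided p-value an exact rank/permutation test can return.

    With complete separation of the two groups and no ties, only 2 of the
    comb(n1 + n2, n1) equally likely group labelings are as extreme as the
    observed one, so the p-value cannot go below 2 / comb(n1 + n2, n1).
    """
    return min(1.0, 2 / comb(n1 + n2, n1))


# Group sizes of 3-4 (single experiment) versus 6-8 (two experiments pooled).
for n in (3, 4, 6, 8):
    print(n, min_two_sided_p(n, n))
```

      Under these assumptions, groups of 3 can never reach p < 0.05 (the floor is 2/20 = 0.1), groups of 4 can (2/70), and pooled groups of 6-8 leave far more room. A parametric test would behave differently, so this is only a lower-bound illustration.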

      A direct comparison between single and double exposure to the non-pathogen cannot be made. The ages of mice infected were different between the single (8 weeks) and double (10 weeks) exposure methods, thus the phenotypes associated with LIC infection are different at these two ages. The authors state that this is expected, but do not provide reasoning for this drastic difference in phenotypes. It is therefore difficult to compare the two exposure methods, and thus determine if one approach provides advantages over the other. An experiment directly comparing the two exposure methods while infecting mice at the same age would be of great relevance to, and would strengthen, this work.

      Both experiments need to be analyzed as separate but complementary, as they provide different insights into L. interrogans pathogenesis and potential solutions to the problem. Optimal measurements of disease progression (weight loss, survival curves) require infection of mice at 8 weeks. Based on this, a new L. biflexa double exposure experiment would have to start when mice are 4 weeks old, which is just after weaning and before the mouse immune system is fully developed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This is a valuable contribution to the electric fish community, and to studies of active sensing more generally, in that it provides evidence that a well-studied behavior (chirping) may serve in active sensing rather than communication. For the most part, the evidence is solid. In particular, the evidence showing increased chirping in more cluttered environments and the relationship between chirping and movement are convincing. Nevertheless, evidence to support the argument that chirps are mostly used for navigation rather than communication is incomplete.

      Thank you for the comment. In response to what seemed to be a generalized need for more evidence to support our hypothesis, we have extensively reviewed the manuscript, changed the existing figures and added new ones (3 new figures in the main text and 4 in the supplementary information section). Our edits include:

      (1) changes to the written text to remove categorical statements ruling out the possible communication function of chirps. When necessary, we have also added details on why we believe a social communication function of chirps could interfere with a role in electrolocation.

      (2) new experiments (and related figures) adding details on the behavioral correlates of chirping, on the effects of chirps on electric images (which are a way to represent current flow on the fish skin), and behavioral responses to ramp frequency playback EODs (used to test a continuous range of beat frequencies and fill the sampling gaps left by our experiments using real fish).

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors investigate the role of chirping in a species of weakly electric fish. They subject the fish to various scenarios and correlate the production of chirps with many different factors. They find major correlations between the background beat signals (continuously present during any social interactions) or some aspects of social and environmental conditions with the propensity to produce different types of chirps. By analyzing more specifically different aspects of these correlations they conclude that chirping patterns are related to navigation purposes and the need to localize the source of the beat signal (i.e. the location of the conspecific).

      We thank the Reviewer for the extensive feedback. Below we respond to each of the points raised.

      We have clarified that our intention is not to propose chirps as tools for “conspecific localization” in the sense of pinpointing a conspecific's particular location. Instead, based on our observation of chirps being employed at very close ranges, we suggest that chirps may serve to assess other parameters related to “conspecific positioning” (which, in a wide sense, is still “electrolocation”) that could be derived from the beat. These parameters might include size, relative orientation, or subtle changes in position during movement. While the experiments discussed in the manuscript do not provide a conclusive answer in this regard, we prioritize here the presentation of broader evidence for a different use of chirping. We are actively working on another manuscript that explores this aspect in more detail, but, due to space limitations, additional results had to be excluded.

      In the abstract we mention a role of chirps in the enhancement of “electrolocation”, but, as mentioned above, this is meant only in a broad sense. In the introduction (at the very end) we propose chirps as self-directed signals (homeoactive sensing). In the results paragraph dedicated to the novel environment exploration experiment, the following lines were added: “Most chirps (90%) in fact are produced within a distance corresponding to 1% of the maximum field intensity (i.e. roughly 30 cm; Figure S12B), indicating that chirping occurs way below the threshold range for beat detection (i.e. roughly in the range of 60-120 cm, depending on the study; see appendix 1: Detecting beats at a distance) and likely does not represent a way to improve it”. We conclude this paragraph by stating: “This further corroborates the hypothesized role of chirps in beat processing.” The last results paragraph (on chirping in cluttered environments) ends with: “This supports the notion of chirps as self-referenced probing cues, potentially employed to optimize short-range aspects of conspecific electrolocation, such as conspecific size, orientation, and swimming direction - a hypothesis that will certainly be explored in future studies.” In the discussion paragraph entitled “probing with chirps”, we provide hints about the possible mechanisms underlying the role of chirps in beat processing. As mentioned, we plan to add further details in another manuscript, currently in preparation.

      The study provides a wealth of interesting observations of behavior, and much of this data constitutes a useful dataset to document the patterns of social interactions in these fish. Some data, in particular the high propensity to chirp in cluttered environments, raise interesting questions. Their main hypothesis is a useful addition to the debate on the function of these chirps and is worth being considered and explored further. However, the data they provide do not support strong conclusions arguing that these chirps are used for localization purposes, and are even less convincing at rejecting previously established hypotheses on the communication purpose of the chirps.

      We intentionally framed our aims somewhat provocatively to underscore that, to date, the role of chirps in social communication has been supported solely by correlative evidence. While the evidence we provide to support the role of chirps as probes is also correlative, it raises critical questions about the long-assumed role of chirps in social communication. In fact, chirping is strongly dependent on the reciprocal positioning of the fish, highly constrained by beat frequency, and patterned in ways that, in our opinion, make the links between chirp types and internal states suggested by the current view less likely. Moreover, the use of different chirp types does not appear specific to any of the social contexts analyzed but is primarily explained by DF (beat frequency). This observation, coupled with the analysis of chirp transitions (more self-referenced than reflecting an actual exchange between subjects), leads us to hypothesize with greater confidence that chirp production may be more related to sensing the environment than to transmitting information about a specific behavioral state.

      Nevertheless, the Reviewer's comment is valid. We've tempered the study's conclusions by introducing the possibility of chirps serving both communication and electrolocation functions, as stated in the conclusion paragraph: "While our results do not completely dismiss the possibility of chirps serving a role in electrocommunication—probing cues could, for instance, function as proximity signals to signal presence, deter approaches, or coordinate behaviors like spawning (Henninger et al., 2018).". Nonetheless, we do emphasize that our hypothesis is more likely to apply - based on our data. We refrain from categorically excluding a communicative function for chirps (between subjects), but we hypothesize that this communication - if occurring - may contain the same type of information as the self-directed signaling implied by the “chirps as probes” idea (i.e. spatial information).

      In response to the Reviewer's feedback, we've revised the end of the introduction, removing suggestions of conclusiveness: "Finally, by recording fish in different conditions of electrical 'visibility,' we provide evidence supporting a previously neglected role of chirps: homeoactive sensing." (edit: the word “validating” has been removed to give a less “conclusive” answer to the open functional questions about chirping).

      I would suggest thoroughly revising the manuscript to provide a neutral description of the results and leaving any speculations and interpretations for the discussion where the authors should be careful to separate strongly supported hypotheses from more preliminary speculations. I detail below several instances where the argumentation and/or the analysis are flawed.

      Following the Reviewer’s comment, we have revised the manuscript to emphasize the following points: 1) the need for a revision of the current view on chirping, 2) our proposal of an alternative hypothesis based on correlations between chirping and behavior, which were previously unexplored, and 3) our acknowledgment that while we offer evidence supporting a probing role of chirps (e.g., lack of behavioral correlation, DF-dependency, stereotypy in repeated trials, modulation by clutter and distance), we do not present here conclusive evidence for chirps detecting specific details of conspecific positioning. Nor do we categorically exclude a role of chirps in social communication.

      They analyze chirp patterning and show that, most likely, a chirp by an individual is followed by a chirp in the same individual. They argue that it is rare that a chirp elicits a "response" in the other fish. Even if there are clearly stronger correlations between chirps in the same individual, they provide no statistical analysis that discards the existence of occasional "response" patterns. The fact that these are rare, and that the authors don't do an appropriate analysis of probabilities, leads to this unsupported conclusion.

      We employed cross-correlation indices, calculated and assessed with a 3 standard deviation symmetrical boundary (which is a statistically sound and strict criterion). Median values were utilized to depict trends in each group/pair. To support our findings, we added new experiments and new figures: 1) a correlation analysis between chirps and behaviors, providing more convincing evidence of how chirps are employed during "scanning" swimming activity (backward swimming); 2) a text mining approach to underscore chirp-behavior correlations, employing alternative and statistically more robust methods.
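
      The kind of check described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the bin width, the lag window, and the choice to derive the mean and standard deviation from the correlogram bins themselves are all assumptions. It builds a cross-correlogram of chirp times from two fish and reports the lag bins that fall outside a symmetrical 3 standard deviation boundary.

```python
import math


def cross_correlogram(times_a, times_b, max_lag, bin_width):
    """Histogram of lags (t_b - t_a), in seconds, within +/- max_lag."""
    n_bins = int(round(2 * max_lag / bin_width))
    counts = [0] * n_bins
    for ta in times_a:
        for tb in times_b:
            lag = tb - ta
            if -max_lag <= lag < max_lag:
                # Clamp guards against floating-point rounding at the edge.
                idx = min(int((lag + max_lag) // bin_width), n_bins - 1)
                counts[idx] += 1
    return counts


def bins_outside_boundary(counts, n_sd=3.0):
    """Indices of bins outside mean +/- n_sd standard deviations.

    Assumption: the baseline is the population mean/SD of the bins
    themselves; a shuffled-train baseline would be an alternative.
    """
    mean = sum(counts) / len(counts)
    sd = math.sqrt(sum((c - mean) ** 2 for c in counts) / len(counts))
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    return [i for i, c in enumerate(counts) if c > upper or c < lower]


# Hypothetical example: fish B chirps 0.5 s after each chirp of fish A.
fish_a = [0.0, 10.0, 20.0, 30.0, 40.0]
fish_b = [0.5, 10.5, 20.5, 30.5, 40.5]
counts = cross_correlogram(fish_a, fish_b, max_lag=2.0, bin_width=0.25)
print(bins_outside_boundary(counts))
```

      In this toy case a consistent "response" lag produces a single bin that crosses the boundary, whereas a flat correlogram produces none. Rare, occasional responses would stay inside the boundary, which is the limitation the Reviewer points to.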

      One of the main pieces of evidence that chirps can be used to enhance conspecific localization is based on their "interference" measure. The measure is based on an analysis of "inter-peak intervals". This in itself is a questionable choice. The nervous system encodes all parts of the stimulus, not just the peak, and disruption occurring at other phases of the beat might be as relevant. The interference will be mostly affected by the summed duration of intervals between peaks in the chirp AM. They do not explain why this varies with beat frequency. It is likely that the changes they see are simply an artifact of the simplistic measure. A clear demonstration that this measure is not adequate comes from the observation in Fig7E-H. They show that the interference value changes as the signal is weaker. This measure should be independent of the strength of the signal. The method is based on detecting peaks and quantifying the time between peaks. The only reason this measure could be affected by signal strength is if noisy recordings affect how the peak detection occurs. There is no way to argue that this phenomenon would happen the same way in the nervous system. Furthermore, they qualitatively argue that patterns of chirp production follow patterns of interference strength. No statistical demonstration is done. Even the qualitative appraisal is questionable. For example, they argue that there are relatively few chirps being produced for DFs of 60 or -60 Hz. But these are DFs where they have only a very small sample size. The single pair of fish that they recorded at some of these frequencies might not have chirped by chance and a rigorous statistical analysis is necessary. Similarly, in Fig 5C they argue that the chirps fall on areas of the graph where the interferences are strongest (darker blue) but this is far from obvious and, again, not proven.

      We would like to clarify that the estimation of the effects of chirps on the beat (referred to as “beat interference”) was not intended to serve as the primary evidence supporting a different use of chirping. In fact, all the experiments conducted prior to that calculation already provide substantial evidence supporting the hypothesis we have proposed. To address the Reviewer’s concern and to avoid misleading interpretations, we have now moved this part to the Supplementary Information (see Figures S8 and S9), in keeping with its non-central role in our argument. We also added the following statement to the results paragraph entitled “Chirps significantly interfere with the beat and enhance electric image contrast”: “Obviously, measuring chirp-triggered beat interferences by using an elementary outlier detection algorithm on the distribution of beat cycles does not reflect any physiological process carried out by the electrosensory system and can be therefore used only as an oversimplified estimate.”

      Regarding the meaning of “beat interference” (as here estimated) from a perspective of brain physiology: chirp interference was calculated using the beat cycles as a reference. Beat peaks were used only to estimate beat cycle duration. Regardless of whether or not a beat peak is represented in the brain, beat cycle duration (estimated using the peaks) is the main determinant of p-unit rhythmic response to a beat. Regarding the effect of signal amplitude, this is also not very relevant. It is obvious that a chirp creates more - or less - interference based on the chirp FM and its duration (but also the sign of the DF and the magnitude of the amplitude modulation). If electroreceptor responses are entrained in waves of beat AMs and if “interference” is a measure of how such waves are scrambled, then “interference” is a measure of how chirps scramble waves of electroreceptor activity by affecting beat AMs.

      The reason why the interference fades with the signal (previous figure 7, now Figure S12) is because it is weighted on the signal strength (the signals used as carrier for chirps are recalculated based on real measurements of signal strength at different distances). Nonetheless, the Reviewer is right: mathematically speaking interference would not change at all because it is just the result of an outlier detection algorithm. This outlier detection is actually set to have a 1% threshold (percent of beat contrast).
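
      An outlier detection on beat cycles of the kind discussed here can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the function names are hypothetical, the median is used as the reference cycle, and the `tolerance` parameter is an illustrative relative threshold that is not the 1% beat-contrast threshold mentioned above.

```python
from statistics import median


def beat_cycle_outliers(peak_times, tolerance=0.1):
    """Return the reference beat cycle and the cycles that deviate from it.

    Beat peaks are used only to estimate cycle durations (inter-peak
    intervals). A cycle counts as an outlier when it differs from the
    median cycle by more than `tolerance` (a fraction of the median).
    """
    intervals = [t1 - t0 for t0, t1 in zip(peak_times, peak_times[1:])]
    reference = median(intervals)
    outliers = [iv for iv in intervals
                if abs(iv - reference) > tolerance * reference]
    return reference, outliers


def interference_fraction(peak_times, tolerance=0.1):
    """Fraction of the analyzed window occupied by outlier beat cycles."""
    _, outliers = beat_cycle_outliers(peak_times, tolerance)
    window = peak_times[-1] - peak_times[0]
    return sum(outliers) / window


# Hypothetical 8 Hz beat (0.125 s cycle) with one early peak, as a chirp
# might produce: one cycle is shortened and the next one is lengthened.
peaks = [0.0, 0.125, 0.25, 0.375, 0.4375, 0.625, 0.75, 0.875, 1.0]
print(beat_cycle_outliers(peaks))
print(interference_fraction(peaks))
```

      Because the functions take only peak times as input, the result does not depend on signal amplitude unless the amplitude is used as an explicit weight or noise shifts the detected peaks, which is the point raised in the exchange above.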

      Regarding the comparison “chirps vs interference”, we did not perform a statistical analysis because we only wanted to show a qualitative observation. Similar results can be obtained for slightly shorter or longer time windows, within certain limits of course (see added Figure S9, in the Supplementary Information). We hope that moving this analysis to the Supplementary Information makes it clear that this approach is not central to our point.

      The Reviewer’s point on the DF sampling is correct: we have reconsidered the low chirping at 60 Hz as potentially the result of sampling bias and edited the respective results paragraph.

      They relate the angle at which one fish produces chirps relative to the orientation of the mesh enclosing. They argue that this is related to the orientation of electric field lines by doing a qualitative comparison with a simplified estimate of field lines. To be convincing this analysis should include a quantitative comparison using the exact same body position of the two fish when the chirps are emitted.

      We agree with the Reviewer: this type of experiment would be much better suited to illustrate the correlations between chirping and reciprocal positioning in fish. What we can see is that chirping occurs at certain orientations more often than at others. This could have something to do with either field geometry or with locomotion in the particular test environment we have used. As mentioned earlier, we are currently editing a second manuscript which will include the type of analysis/experiment the Reviewer is thinking of. In this first study we preferred to focus on the broader behavioral correlates of chirping. We removed the mention of the field current lines because, we agree, the argument is vague as presented here.

      They show that the very vast majority of chirps in Fig 6 occur when the fish are within a few centimeters (e.g. very large first bin in Fig6E-Type2). This is a situation when the other fish signal will be strongest and localization will be the easiest. It is hard to understand why the fish would need a mechanism to enhance localization in these conditions (this is the opposite of difficult conditions e.g. the "cluttered" environment).

      Agreed, in fact we do not explicitly propose chirps as means to improve “electrolocation” (this word is used only broadly in the abstract) but instead as probes to extract spatial information (e.g. shape, motion, orientation) from a beat source. In a broader sense, all these spatial parameters contribute to any given instance of "localization." Because we were unable to explore all these aspects in greater detail, we chose to maintain a broader perspective. If chirps contribute to a better resolution of fine spatial attributes of conspecific locations, it is reasonable to expect higher chirping rates in proximity to the target fish.

      The argumentation aimed at rejecting the well-established role of chirp in communication is weak at best. First, they ignored some existing data when they argue that there is no correlation between chirping and behavioral interactions. Particularly, Hupe and Lewis (2008) showed a clear temporal correlation between chirps and a decrease in bites during aggressive encounters. It could be argued that this is "causal evidence" (to reuse their wording) that chirps cause a decrease in attacks by the receiver fish (see Fig 8B of the Hupe paper and associated significant statistics). Also, Oboti et al. argue that social interactions involve "higher levels of locomotion" which would explain the use of chirps since they are used to localize. But chirps are frequent in "chirp chamber" paradigms where no movement is involved. They also point out that social context covaries with beat frequency and thus that it is hard to distinguish which one is linked to chirping propensity and then say that it is hard to disentangle this from "biophysical features of EOD fields affecting detection and localization of conspecific fish". But they don't provide any proof that beat frequency affects detection and localization so their argument is not clear. Last, they argue that tests in one species shouldn't be extrapolated to other species. But many of the studies arguing for the role of chirps in communication was done on brown ghost. In conclusion of this point, they do not provide any strong argument that rejects the role of chirps as a communication signal. A perspective that would be better supported by their data and consistent with past research would be to argue that, in addition to a role in communication, chirps could sometimes be used to help localize conspecifics.

      We did not intend to disregard the extensive body of literature supporting a role of chirps in social communication. Rather, the primary goal of this study was to present a valid alternative perspective to this prevailing view. The existence of a well-established hypothesis does not imply that new evidence cannot change it; it simply indicates that changing it may be challenging either because it's genuinely difficult or because the idea has not been thoroughly explored. Whatever the case may be, proposing new hypotheses, whether complementary or alternative to established theories, is a challenging undertaking for a single study. We judged that starting from broad correlations would be the most desirable approach.

      We did not ignore data from Hupé and Lewis 2008. We cited this study repeatedly and compared their findings to those of others, not only for the chirp-behavior correlation but also for chirping distance considerations. However, following the Reviewer’s comment, we now cite this study in the context of the recently added behavioral analysis (data from the PSTH plots could possibly confirm the observation of fewer chirps during attacks). We also cited the study by Triefenbach and Zakon 2008, which reports something along the same lines. See the statement: “Overall, these results provided mutually reinforcing evidence indicating that chirps are produced more often during locomotion or scanning-related motor activity and confirm previous reports of a lower occurrence of chirping during more direct aggressive contact (as shown also by Triefenbach and Zakon, 2008; Hupé and Lewis, 2008).”, in the results paragraph related to the behavioral correlates of chirping.

      In our study we make it clear how we distinguish causal evidence (i.e. providing evidence that A is required for B) from correlation (i.e. evidence for A simply occurring together with B). We also make it clear that we are not going to provide causal evidence but we are going to provide new evidence for correlations that were so far not considered, in order to propose a new unexplored function of chirps.

      The Reviewer's point on chirping during motion and while caged in a chirp chamber is valid. Indeed, at first we were also puzzled by this finding. However, under the “chirps as probes” paradigm, chirping in a chirp chamber can be explained by the need to obtain spatial information from an otherwise unreachable beat source (brown ghosts typically explore new environmental objects or conspecifics by actively swimming around them, something caged fish cannot do). Thus, the observation of chirping under conditions of limited movement (such as in a chirp chamber experiment) does not contradict our hypothesis; rather, it can be used to support it. Further experiments are required, as rightfully pointed out, to evaluate the effects of beat frequency on beat detection. We added a note about this in the “probing with chirps” discussion paragraph.

      The Reviewer's comment regarding generalization is unclear. We acknowledge that most studies are conducted in brown ghosts, as stated in the abstract. Our intention was to highlight that insights gained from this species have been applied to broaden the understanding of chirps in other species. Specifically, the "behavioral meaning idea" of chirping has been extended to other gymnotiform species producing EOD frequency modulations.

      Our study's aim is not to dismiss the idea of chirps being used for communication but to present an alternative hypothesis and to provide supporting evidence. While our results may not align well with the communication theory, our intention is not dismissal but rather engaging in a discussion and exploration of alternative perspectives.

      The discussion they provide on the possible mechanism by which chirps could help with localization of the conspecific is problematic. They imply that chirps cause a stronger response in the receptors. For most chirps considered here, this is not true. For a large portion of the beat frequencies shown in this paper, chirps will cause a de-synchronization of the receptors with no increase in firing rate. They cannot argue that this represents an enhanced response. They also discuss a role for having a broader frequency spectrum -during the chirp- in localization by making a parallel with pulse fish. There is no evidence that a similar mechanism could even work in wave-type fish.

      We have already commented on the “localization” idea in our previous responses. The Reviewer is right in saying that we have provided only vague descriptions of the potential mechanisms implied by our hypothesis. The studies by Benda and others (2005, 2006) demonstrate a clear synchronizing effect of chirps on p-unit firing rates, especially at low DFs (at ranges similar to those considered in this study). This synchronization could lead to an enhanced response at the electroreceptor level, as described in these very studies, which in turn would result in a higher probability of firing in downstream neurons (E-cells in the ELL).

      As also reported within the same works, chirps may also exert an opposite effect on p-units (i.e. desynchronization). This is what happens for large chirps at high DFs. Desynchronization may cause temporary lapses of p-unit firing, which in turn may lead to increased activity of I-cells in the ELL (which are indeed specifically tuned to p-unit lack of activity).

      So, in general, if we consider both ON and OFF pyramidal cells (in the ELL) and small and large chirps, we could state that chirps can be potentially used to enhance the activity of peripheral electrosensory circuits through different mechanisms, contingent on the chirp type and beat frequency. Unfortunately, space constraints limited our ability to dig into these details in the present study.

      However, to address the Reviewer’s rightful point, we now mention this in the manuscript: “Since the beat AMs generated by the chirps always trigger reliable responses in primary electrosensory circuits (pyramidal cells in the ELL respond to both increases and decreases in beat AM), any chirp-triggered AM causing a sudden change in p-unit firing could potentially amplify the downstream signal (Marsat and Maler, 2010) and thus enhance EI contrast.” (see results paragraph on beat interference and electric images).

      They write the whole paper as if males and females had been identified in their experiments. Although EOD frequency can provide some guess of the sex the method is unreliable. We can expect a non-negligible percentage of error in assigning sex.

      We agree; in fact, in the Methods section we state:

      “The limitation of this approach is that females cannot be distinguished from immature males with absolute certainty, since no post-mortem gonadal inspection was carried out.”

      To this we added:

      “Although a more accurate way to determine the sex of brown ghosts would be to consider other morphological features such as the shape of the snout, the body size, the occurrence of developing eggs, EOD frequency has been extensively used for this purpose.”

      Moreover, the consistent behavioral differences observed in low frequency fish, measured with those behavioral experiments aimed at assessing responses to playback stimuli and swimming behavior in novel environments, could also be caused by a younger age (as opposed to femaleness). However, the size ranges of our fish (an admittedly unreliable proxy of age) were all comparable, making this possibility perhaps less likely.

      Reviewer #2 (Public Review):

      Studying the weakly electric brown ghost knifefish, the authors provide evidence that 'chirps' (brief modulations in the frequency and amplitude of the ongoing electric signal) function in active sensing (specifically homeoactive sensing) rather than communication. This is a behavior that has been very well studied, including numerous studies on the sensory coding of chirps and the neural mechanisms for chirp generation. Chirps are largely thought to function in communication behavior, so this alternative function is a very exciting possibility that could have a great impact on the field. The authors do provide convincing evidence that chirps may function in homeoactive sensing. However, their evidence arguing against a role for chirps in communication is not as strong, and neglects a large body of research. Ultimately, the manuscript has great potential but suffers from framing these two possibilities as mutually exclusive and dismissing evidence in favor of a communicative function.

      We thank the Reviewer for the comment. Overall, we have edited the manuscript to soften our conclusions and avoid any strong categorical statement excluding the widely accepted role of chirps in social communication. We have added some new experiments with the aim of adding more detail to the behavioral correlates of chirping and to the DF dependency of the production of different types of chirps. Nonetheless, based on our results, we are inclined to conclude that the communication idea - although widely accepted - is not as well substantiated as it should be.

      Although we do not dismiss the bulk of literature supporting a role of chirps in social communication, we think that our hypothesis (i.e. decoding of spatial parameters from the beat) may not be fully compatible with the social communication hypothesis for the following reasons:

      (1) Chirp type dependency on DF makes chirps likely to be adaptive responses to beat frequency. While this idea is compatible with a role of chirps in the detection of beat parameters, their concurrent role in social communication would imply that fish interacting at given beat frequencies (DFs) would communicate only (or mainly) by delivering a very limited range of “messages”. For instance, assuming type 2 chirps are related to aggression (as widely suggested), are female-male pairs - with larger DFs - interacting less aggressively than same-sex pairs? Our experiments often suggested this is not the case. In addition, large DFs are not always indicative of opposite-sex interactions, yet they are very often characterized by the emission of large chirps. Moreover, large chirps are often considered courtship signals regardless of the reproductive state of the emitting fish, even though opposite-sex interactions in the absence of breeding-like conditions cannot be considered truly courtship-related.

      (2) Chirping is highly affected by locomotion (consider female/male pairs with or without mesh divider) and distance (as shown in the novel environment exploration experiment). While the involvement of both parameters is compatible with a role of chirps in active sensing, a role of chirps in social communication implies that such signaling would occur only when fish are in very close proximity to each other. In this case, the beat is therefore heavily distorted not only by fish position/locomotion but also by chirps. This means that when fish are close to each other, the two different types of information relayed by the beat (electrolocation and electrocommunication) would certainly interfere (this idea is now better phrased in the Introduction).

      (3) In our playback experiments we could not see any meaningful matching (e.g. angry-chirp → angry-chirp or sexy-chirp → approach) between playback chirps and evoked chirps, raising doubts about the meaning associated so far with the different types. Considering that playback experiments are typically used to assess signal meaning based on how animals respond to them, this result suggests quite strongly that such meaning cannot be assigned to chirps.

      (4) In playback experiments in which the same stimulus is provided multiple times, chirp type transitions (i.e. emission of a different chirp type after a given chirp) become predictable (as shown in the added playback experiments using ramping signals). This confirms that the choice to emit a given chirp type has something to do with beat frequency (or a change in this parameter) and not a communication of internal states. It would be otherwise unclear how a fish could change its internal state so quickly - and so reliably - even in the span of a few seconds.

      Despite this evidence against a semantic content of chirps in the context of social communication, we conclude the manuscript by noting that we are not providing strong evidence dismissing the communication hypothesis, and that both functions could coexist (see the example of “proximity signals” in the mating context given in the concluding paragraph).

      (1) The specific underlying question of this study is not made clear in the abstract or introduction. It becomes apparent in reading through the manuscript that the authors seek to test the hypothesis that chirps function in active sensing (specifically homeoactive sensing). This should be made explicitly clear in both the abstract and introduction, along with the rationale for this hypothesis.

      In the abstract we state “Despite the success of this model in neuroethology over the past seven decades, the underlying logic of their electric communication remains unclear. This study re-evaluates this view, aiming to offer an alternative, and possibly complementary, explanation for why these freshwater bottom dwellers emit electric chirps.”. This statement is meant as a summary of our aims. However, in order to convey a clearer message, we have revised the whole manuscript to articulate our objectives more explicitly. In particular, we stress that with our experiments we intend to provide correlative evidence for a different (previously unexplored) role of chirps, with the aim of stimulating a discussion and possibly a revision of the current theory about the functional role of chirps.

      In the introduction we have added a paragraph explaining our aim and also why we think that communicating through chirps could potentially interfere with efficient electrolocation: “Since both chirps and positional parameters (such as size, orientation or motion) can only be detected as perturbations of the beat (Petzold et al., 2016; Yu et al., 2012; Fotowat et al., 2013), and via the same electroreceptors, the inputs relaying both types of information are inevitably interfering. Moreover, as the majority of chirps are produced within a short range (< 50 cm; Zupanc et al., 2006; Hupé and Lewis 2008; Henninger et al., 2018; see appendix 1) this interference is likely to occur consistently during social interactions.

      Under the communication-hypothesis, the assumption that chirps and beats are conveying different types of information (i.e. semantic value as opposed to position and related geometrical parameters) is therefore leaving this issue unresolved.”.

      (2) My biggest issue with this manuscript is that it is much too strong in dismissing evidence that chirping correlates with context. This is captured in this sentence in the introduction, "We first show that the choice of different chirp types does not significantly correlate with any particular behavioral or social context." This very strong conclusion comes up repeatedly, and I disagree with it, for the following reasons:

      In your behavioral observations, you found sex differences in chirping as well as differences between freely interacting and physically separated fish. Your model of chirp variability found that environmental experience, social experience, and beat frequency (DF) are the most important factors explaining chirp variability. Are these not all considered "behavioral or social context"? Beat frequency (DF) in particular is heavily downplayed as being a part of "context" but it is a crucial part of the context, as it provides information about the identity of the fish you're interacting with.

      In your playback experiments, fish responded differently to small vs. large DFs, males chirped more than females, type 2 chirps became more frequent throughout a playback, and rises tended to occur at the end of a playback. These are all examples of context-dependent behavior.

      We agree with the Reviewer’s comment and we think that we were probably unclear about what that statement meant. We also agree with the Reviewer about what is defined as “context”, and that a given beat frequency (DF) can in the end represent a “behavioral context” as well. In order to make it clearer, we have rephrased this statement and changed it to: “We first show that the relative number of different chirp types in a given recording does not significantly correlate with any particular behavioral or social context.”. This new form refers specifically to the observation that - in all different social conditions examined - the relative amounts of different types of chirps are unchanged (see Figure S2). The Reviewer may have interpreted our statement as suggesting that chirp type choice is random or unaffected by any social variable. We agree with the Reviewer that this is not the case. We also reported that sex differences in chirping are present, but we have emphasized that they may have something to do with the propensity of brown ghosts of either sex to swim/explore as opposed to seeking refuge and waiting (as suggested by our experiments in which FM pairs were either divided or freely interacting and by our novel environment exploration experiments).

      We agree DF is important; in fact, it is the third most important factor explaining chirp variance in our model. In our fish pair recordings, we see a strong correlation of chirp total variance with tank experience (one naïve, one experienced, both fish equally experienced) and social context (novel to each other/familiar to each other, subordinate/dominant, breeding/non breeding, accessible/not accessible), although data clustering seems to distinguish “divided” vs “freely moving” conditions better than other variables (sex may also play a role because of the reversal of sexual dimorphism in chirp rates in precisely this case). However, we do not see a specific effect of these variables on the proportion of different types of chirps in any recording (see Figure S2).

      We also edited the beginning of the first result paragraph and changed it to “Thus, if behavioral meaning can be attributed to different types of chirps, as posed by the prevailing view (e.g., Hagedorn and Heiligenberg, 1985; Larimer and MacDonald, 1968; Rose, 2004), one should be able to identify clear correlations between behavioral contexts characterizing different internal states and the relative amounts of different types of chirp”, to emphasize we are here assessing the meaning of different types of chirps (not of the total amount of chirping in general).

      Further, you only considered the identity of interacting fish or stimulated fish, not their behavior during the interaction or during playback. Such an analysis is likely beyond the scope of this study, but several other studies have shown correlations between social behavior and chirping. In the absence of such data here, it is too strong to claim that chirping is unrelated to context.

      We agree with the Reviewer. In fact, this analysis had already been carried out but was purposely left out in an attempt to limit the manuscript length. We have now made space for this experimental work, which has been added (see the new Figure 6).

      In summary, it is simply too strong to say that chirping does not correlate with context. Importantly, however, this does not detract from your hypothesis that chirping functions in homeoactive sensing. A given EOD behavior could serve both communication and homeoactive sensing. I actually suspect that this is quite common in electric fish. The two are not mutually exclusive, and there is no reason for you to present them as such. I recommend focusing more on the positive evidence for a homeoactive function and less on the negative evidence against a communication function.

      We aimed to clarify that our reference was to the lack of correlation between "chirp type relative numbers" and the analyzed context. Regarding the communication function, we tempered negative statements. However, as this study stems from evidence within the established paradigm of "chirps as communication signals", and aims at proposing an alternative hypothesis, eliminating all references to it could undermine the study's purpose.

      (3) The results were generally challenging to follow. In the first 4 sections, it is not made clear what the specific question is, what the approach to addressing that question is, and what specific experiment was carried out (the last two sections of the results were much clearer). The independent variables (contexts) are not clearly established before presenting the results. Instead they are often mentioned in passing when describing the results. They come across as an unbalanced hodgepodge of multiple factors, and it is not made clear why they were chosen. This makes it challenging to understand why you did what you did, the results, and their implications. For each set of major results, I recommend: First, pose a clear question. Then, describe the general approach to answering that question. Next, describe the specifics of the experimental design, with a rationale that appeals to the general approach described. Finally, describe the specific results.

      The introductory sentences of the first result paragraphs have been edited, rendering the aim of the experiments more explicit.

      (4) Results: "We thus predicted that, if behavioral meaning can be attributed to different types of chirps, as posed by the prevailing view (e.g., Hagedorn and Heiligenberg, 1985; Larimer and MacDonald, 1968; Rose, 2004)..." It should be made clear why this is the prevailing view, and this description should likely be moved to the introduction. There is a large body of evidence supporting this view and it is important to be complete in describing it, especially since the authors seem to seek to refute it.

      We understand the Reviewer’s question and we tried to express in the introduction the main reasons for why this is the current view. We state “Different types of chirps are thought to carry different semantic content based on their occurrence during either affiliative or agonistic encounters (Larimer and MacDonald 1968; Bullock 1969; Hopkins 1974; Hagedorn and Heiligenberg 1985; Zupanc and Maler 1993; Engler et al. 2000; Engler and Zupanc 2001; Bastian et al., 2001).”. To this we added: “Although supported mainly by correlative evidence, this idea gained popularity because it is intuitive and because it matches well enough with the numerous behavioral observations of interacting brown ghosts.”.

      We believe the prevailing view is based on intuition and a series of basic observed correlations repeated throughout the years. The crystallization of this idea is not due to negligence but mainly to technical limitations existing at the time of the first recordings. In order to assess the role of chirps in behaving fish, tight and precise temporal control over synchronized video-EOD recordings is most likely necessary, and this technical capability probably became available only much later than the 1950s-60s, when electric communication was first described.

      (5) I am not convinced of the conclusion drawn by the analysis of chirp transitions. The transition matrices show plenty of 1-2 and 2-1 transitions occurring. Further, the cross-correlation analysis only shows that chirp timing between individuals is not phase-locked at these small timescales. It is entirely possible that chirp rates are correlated between interacting individuals, even if their precise timing is not.

      We agree with the Reviewer: chirp repertoires recorded in different social contexts are not devoid of reciprocal chirp transitions (i.e. fish 1 chirp - to - fish 2 chirp, or vice versa). Yet our point is to emphasize that their abundance is far more limited when compared to the self-referenced ones (i.e. 1-1 and 2-2). This is a fair concern and, in order to further address this point, we have added a whole new set of analyses and new experiments (see chirp-behavior correlations, PSTHs and more analysis based on more solid statistical methods; see Figure 6).
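
      As a rough illustration of the comparison described above, the sketch below counts sender-to-sender transitions from a time-ordered list of chirp senders and reports the share of self-referenced transitions (1-1 and 2-2). The function names and the toy sequence are hypothetical and are not taken from the analysis code used in the manuscript.

```python
import numpy as np

def sender_transition_matrix(senders, n_fish=2):
    """Count transitions between consecutive chirps.

    `senders` is a time-ordered sequence of fish IDs (1..n_fish), one per chirp.
    Entry [i, j] is the number of times a chirp by fish i+1 was followed
    by a chirp by fish j+1.
    """
    counts = np.zeros((n_fish, n_fish), dtype=int)
    for prev, nxt in zip(senders[:-1], senders[1:]):
        counts[prev - 1, nxt - 1] += 1
    return counts

def self_transition_fraction(counts):
    """Share of transitions in which the same fish chirps twice in a row."""
    total = counts.sum()
    return float(np.trace(counts) / total) if total else float("nan")

# Toy sequence (illustrative only): fish 1 chirps in runs, fish 2 answers rarely.
example = [1, 1, 1, 2, 2, 1, 1]
matrix = sender_transition_matrix(example)
fraction = self_transition_fraction(matrix)
```

      A fraction close to 1 would indicate that chirps mostly follow chirps from the same fish. On its own it says nothing about whether chirp rates covary between individuals, which is the separate point raised by the Reviewer.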

      Reviewer #3 (Public Review):

      Summary:

      This important paper provides the best-to-date characterization of chirping in weakly electric fish using a large number of variables. These include environment (free vs divided fish, with or without clutter), breeding state, gender, intruder vs resident, social status, locomotion state and social and environmental experience, as well as with playback experiments. It applies state-of-the-art methods for reducing dimensionality and finding patterns of correlation between different kinds of variables (factor analysis, K-means). The exceptional strength of the evidence, collated from a large number of trials with many controls, leads to the conclusion that a number of commonly accepted truths about which variable affects chirping must be carefully rewritten or nuanced. Based on their extensive analyses, the authors suggest that chirps are mainly used as probes that help detect beats and objects.

      Strengths:

      The work is based on completely novel recordings using interaction chambers. The amount of new data and associated analyses is simply staggering, and yet, well organized in presentation. The study further evaluates the electric field strength around a fish (via modelling with the boundary element method) and how its decay parallels the chirp rate, thereby relating the above variables to electric field geometry.

      The main conclusions are that the lack of any significant behavioural correlates for chirping, and the lack of temporal patterning in chirp time series, cast doubt on a communication goal for most chirps. Rather, the key determinants of chirping are the difference frequency between two interacting conspecifics as well as individual subjects' environmental and social experience. These conclusions by themselves will be hugely useful to the field. They will also allow scientists working on other "communication" systems to at least reconsider, and perhaps expand the precise goal of the probes used in those senses. There are a lot of data summarized in this paper, and thorough referencing to past work. For example, the paper concludes that there is a lack of evidence for stereotyped temporal patterning of chirp time series, as well as of sender-received chirp transitions beyond the known increase in chirp frequency during an interaction.

      The alternative hypotheses that arise from the work are that chirps are mainly used as environmental probes for better beat detection and processing and object localization.

      The authors also advance the interesting idea that the sinusoidal frequency modulations caused by chirps are the electric fish's solution to the minute (and undetectable by neural wetware) echo-delays available to it, due to the propagation of electric fields at the speed of light in water.

      Weaknesses:

      My main criticism is that the alternative putative role for chirps as probe signals that optimize beat detection could be better developed. The paper could be clearer as to what that means precisely.

      We appreciate the Reviewer's kind comments. While we acknowledge that our exploration of chirp function in this study may be limited and not entirely satisfying, we made this decision due to space constraints, opting for a broader and diversified approach. We hope that future studies will build on these data and start filling the gaps. We are also working on another manuscript which addresses this point in more detail.

      Nonetheless, we considered the Reviewer’s criticism and added not only a new figure (to show more explicitly what chirps can do to the perceived electric fields, as simulated by electric images) but also more descriptive parts explaining how we think chirps may act to improve the spatial resolution of beat processing (see the discussion paragraph “probing with chirps”). In this paragraph we explain more clearly how chirps could improve beat processing by phase shifting EODs and recovering possible blind spots on the fish skin caused by disruptive EOD interferences (resulting in lower beat contrast). We also mention that the enhancement of electrosensory input triggered by chirps could be localized not only at the level of electroreceptors (consider the synchronizing effects small chirps have on p-units at low frequency beats) but also at the level of ON and OFF pyramidal cells in the ELL. From the perspective of these neurons, any chirp would enhance the activity of these input lines, yet in opposite ways.

      And there is an egg-and-chicken type issue as well, namely, that one needs a beat in order to "chirp" the beating pattern, but then how does chirping optimize the detection of the said beat? Perhaps the authors mean (as they wrote elsewhere in the paper) that the chirps could enhance electrosensory responses to the beat.

      Following the Reviewer’s comment, we have now revised several instances of the misleading phrasing identified.

      In the results on novel environment exploration: “If chirps enhance beat processing, for instance, chirping should occur within beat detection range but at a certain distance.”.

      “This, in turn, could be used to validate our beat-interference estimates as meaningfully related to beat processing.” and “In all this, rises may represent an exception as their locations are spread over larger distances and even in presence of obstacles potentially occluding the beat source (such as shelters, plants, or walls), all of which are conditions in which beat detection or beat processing could be more difficult (this, could be coherent with the production of rises right at the end of EOD playbacks; Figure S5).”

      Last result paragraph (clutter experiment): “Overall, these results indicate that chirping is significantly affected by the presence of environmental clutter partially disrupting - or simply obstructing - the processing of beat related information during locomotion”.

      In the probing with chirps discussion paragraph “In theory, chirps could also be used to improve electrolocation of objects as well (as opposed to the processing of the beat).”.

      In the conclusions: “optimizing the otherwise passive responses to the beat”.

      A second criticism is that the study links the beat detection to underwater object localization. I did not see a sufficiently developed argument in this direction, nor how the data provided support for this argument. It is certainly possible that the image on the fish's body of an object in the environment will be slightly modified by introducing a chirp on the waveform, as this may enhance certain heterogeneities of the object in relation to its environment. The thrust of this argument seems to derive more from the notion of Fourier analysis with pulse type fish (and radar theory more generally) that the higher temporal frequencies in the beat waveform induced by the chirp will enable a better spatial resolution of objects. It remains to be seen whether this is significant.

      The Reviewer is correct in noting that this point is not addressed in the manuscript. We introduced it as a speculative discussion point to mention alternative possibilities. These could be subject to further testing in future studies.

      I would also have liked to see a proposal for new experiments that could test these possible new roles.

      We have added clearer suggestions for future experiments throughout the discussion: these may be aimed at 1) improving playback experiments using more realistic copies of the brown ghost’s EODs (including harmonics), 2) assessing the reciprocal positioning of fish during chirping in better detail and 3) testing the use of chirping during target-reaching tasks in order to better assess the probing function of chirps.

      The authors should recall for the readers the gist of Bastian's 2001 argument that the chirp "can adjust the beat frequency to levels that are better detectable" in the light of their current. Further, at the beginning of the "Probing with chirps" section, the 3rd way in which chirps could improve conspecific localization mentions the phase-shifting of the EOD. The authors should clarify whether they mean that the tuberous receptors and associated ELL/toral circuitry could deal with that cue, or that the T_unit pathway would be needed?

      We thank the Reviewer for identifying this unclear point. We added a reference to p-units in the above-mentioned paragraph: “Yet, this does not exclude the possibility that chirps could be used to briefly shift the EOD phase in order to avoid disruptive interferences caused by phase opposition (at the level of p-units)”. We would prefer to omit a more detailed reference to t-units in order to avoid the lengthy descriptions required to discuss the different electroreceptor types.

      On p.17 I don't understand what is meant by most chirps being produced, possibly aligned with the field lines, since field lines are everywhere. And what is one to conclude from the comparison of Fig.6D and 7A? Likewise it was not clear what is meant by chirps having a detectable effect on randomly generated beats.

      We agree on the valid point raised by the Reviewer and we have removed reference to current lines from the text.

      In the section on Inconsistencies between behaviour and hypothesized signal meaning, the authors could perhaps nuance the interpretation of the results further in the context of the unrealistic copy of natural stimuli using EOD mimics. In particular, Kelly et al. 2008 argued that electrode placement mattered in terms of representation of a mimic fish onto the body of a real fish, and thus, if I properly understand the set up here, the movement would cause the mimic to vary in quality. This may nevertheless be a small confounding issue.

      We agree with the Reviewer and added a comment at the beginning of the paragraph mentioned. “Nonetheless, it's plausible that playback stimuli, as employed in our study and others, may not faithfully replicate natural signals, thus potentially influencing the reliability of the observed behaviors. Future studies might consider replicating these findings using either natural signals or improved mimics, which could include harmonic components (excluded in this study).”

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      (1) Abstract: "...is probably the most intensely studied species..." is a weak, unsupported, and unnecessary statement. Just state that it has been heavily studied, or is one of the most well-studied,...

      rephrased

      (2) Abstract: "...are thus used as references to specific internal states during recordings - of either the brain or the electric organ..." This was not clear to me.

      rephrased

      (3) Abstract: "...the logic underlying this electric communication..." It is not clear to me what the authors mean here by "logic".

      rephrased

      (4) I strongly recommend clearly defining homeoactive sensing and distinguishing it from alloactive sensing when this term is first introduced in the introduction. This is not a commonly used term. Most readers likely think they understand what is meant by the term active sensing; however, I recommend first defining it, and then distinguishing between these two different types of active sensing.

      rephrased

      (5) Introduction: "Together with a few other species (Rose, 2004),..." More than a few. There are hundreds of species with electric organs. It is certainly not a "unique" capability.

      rephrased

      (6) Introduction: "But the real advantage of active electrolocation can be appreciated in the context of social interaction." This is unclear. Why is this the "real advantage" of active electrolocation when an electrically silent fish could detect an electrically communicating fish just fine without interference? Active electrolocation is needed to detect objects that are not actively emitting an electric field. It is not needed to detect signaling individuals.

      rephrased

      (7) Introduction: why is active sensing using EODs limited to distances of 6-12 cm? Why does it not work at closer range?

      Here we meant to give a range based on published data. We rephrased it to “up to 12”.

      (8) Introduction: electric fields decay with the cube of distance, as you show in appendix 1.

      rephrased
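
      For reference, the scaling at issue can be written out under the assumption of an ideal point current dipole in a homogeneous conductive medium. This is a textbook idealization, not the expression from appendix 1; the symbols $I$ (source current), $d$ (pole separation), $\sigma$ (water conductivity), $r$ (distance) and $\theta$ (angle from the dipole axis) are introduced here only for illustration.

```latex
\begin{align}
V(r,\theta) &= \frac{I d \cos\theta}{4\pi\sigma r^{2}}, \\
E_r &= -\frac{\partial V}{\partial r} = \frac{2 I d \cos\theta}{4\pi\sigma r^{3}}, \qquad
E_\theta = -\frac{1}{r}\frac{\partial V}{\partial \theta} = \frac{I d \sin\theta}{4\pi\sigma r^{3}}, \\
|\mathbf{E}| &= \frac{I d}{4\pi\sigma r^{3}}\sqrt{1 + 3\cos^{2}\theta}.
\end{align}
```

      Under these assumptions the potential falls off with the square of the distance, while the field magnitude falls off with the cube.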

      (9) Introduction: it is not clear what is meant by "blurred EOD amplitude".

      rephrased (“noisy”)

      (10) Figure 2C is very challenging to interpret. I recommend spending more time in the manuscript walking the reader through this analysis and its presentation.

      We are grateful for the comment as we probably overlooked this point. We now added a small paragraph to explain these data in better detail.

      (11) Results: "This was done by calculating the ratio between the duration of the beat cycles affected by the chirp (beat interpeak intervals) and the total duration of the beat cycles detected within a fixed time window (roughly double the size of the maximum chirp duration, 700 ms)." This was not clear to me.

      We have now rephrased this to “Estimates of beat interference were made by calculating the ratio between the cumulative duration of the beat cycles affected by a given chirp (1 beat cycle corresponding to the beat comprised by two consecutive beat peaks, or - more simply - the beat inter-peak interval) over the cumulative duration of all the beat cycles within the time window used as a reference (700 ms; other analysis windows were tested Figure S9)” to clarify this method.
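
      The ratio described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name, the argument names, and the rule that a beat cycle counts as “affected” whenever it overlaps the chirp interval are all assumptions made here. Times are in milliseconds.

```python
import numpy as np

def beat_interference(peak_times_ms, chirp_start_ms, chirp_end_ms,
                      window_start_ms, window_ms=700.0):
    """Cumulative duration of beat cycles overlapping the chirp, divided by
    the cumulative duration of all beat cycles inside the reference window.

    A beat cycle is the interval between two consecutive beat peaks.
    Assumption: a cycle is 'affected' if it overlaps the chirp interval.
    """
    peaks = np.sort(np.asarray(peak_times_ms, dtype=float))
    window_end_ms = window_start_ms + window_ms
    # Keep only the beat peaks that fall inside the reference window.
    peaks = peaks[(peaks >= window_start_ms) & (peaks <= window_end_ms)]
    if peaks.size < 2:
        return 0.0  # no complete beat cycle in the window
    starts, ends = peaks[:-1], peaks[1:]
    durations = ends - starts
    # Cycles whose span intersects the open chirp interval.
    affected = (starts < chirp_end_ms) & (ends > chirp_start_ms)
    return float(durations[affected].sum() / durations.sum())

# Toy example: a regular 10 Hz beat (peaks every 100 ms) and a chirp
# lasting from 250 ms to 350 ms, which overlaps two of the seven cycles.
toy_peaks = np.arange(0, 800, 100)
ratio = beat_interference(toy_peaks, 250.0, 350.0, window_start_ms=0.0)
```

      With a real chirp the inter-peak intervals inside the chirp are themselves shortened or lengthened, so the ratio depends on chirp parameters and beat frequency rather than on chirp duration alone.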

      (12) Results: "For each chirp, the interference values obtained for 4 different phases (90° steps) were averaged." Why was this done?

      To consider an average effect across phases. Although it is true that chirp parameters may have a different impact on the beat, depending on EOD phase, including this parameter in our figure/s would have considerably increased the volume of data reported giving too much emphasis to an analysis we judged not crucially important. In addition, since we did not consider EOD phase in our recordings, we opted for an average estimate encompassing different phase values.

      (13) Discussion: "Third, observations in a few species are generalized to all other gymnotiforms without testing for species differences (Turner et al., 2007; Smith et al., 2013; Petzold et al., 2016)." I strongly disagree with this statement. First, the studies referenced here do explicitly compare chirps across species. Second, you only studied one species here, so it is not clear to me how this is a relevant concern in interpreting your findings.

      Here we have probably been unclear in the writing: the point we wanted to make is that the idea of chirps having semantic content has been generalized to other species without investigating the nature of their chirping with as much detail as done for brown ghosts.

      We have now rephrased the statement and changed it to: “Second, observations in a few species are generalized to all other gymnotiforms without testing whether chirping may have similar functions in other species (Turner et al., 2007; Smith et al., 2013; Petzold et al., 2016)”

      (14) Discussion: "The two beats could be indistinguishable (assuming that the mechanism underlying the discrimination of the sign of DF at low DFs, and thought to be the basis of the so called jamming avoidance response (JAR; Metzner, 1999), is not functional at higher DFs)." Why would you assume this?

      What we meant here is that it is unlikely that the two DFs are not discriminated by the same mechanisms involved in the JAR, even if the DF is higher than the levels at which JARs are usually detected (i.e. DF = 1-10 Hz?). To improve clarity, we rephrased this statement: “The two beats could be indistinguishable (assuming - perhaps not realistically - that the same mechanism involved in DF discrimination at lower DF values would not work in this case; Metzner, 1999)”.

      (15) Discussion: "...an idea which seems congruent with published electrophysiological studies..." How so?

      Rephrased to “Based on our beat interference estimates, we propose that the occurrence of the different types of chirps at more positive DFs (such as in male-to-female chirping) may be explained by their different effect on the beat (Figure 5D; Benda et al., 2006; Walz et al., 2013).”

      Reviewer #3 (Recommendations For The Authors):

      On p.2 there is a discrepancy between the quoted ranges for active sensing of objects, first 10-12 cm, and then 6-12 cm further down. And in the following paragraph right below this passage, electric fields are said to decay with the squared distance (appendix 1). That expression has a cos(theta) which is inversely proportional to the distance, and so one is really dealing, as expected for dipolar fields, with a drop-off that decays with the distance cubed.

      We thank the Reviewer for the comment. We have now corrected the mistake and added “cubed”. We also removed the imprecise reference to the range 6-12 cm and rephrased it to “up to 12 cm”.
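      For reference, the cubic drop-off raised by the Reviewer follows from the standard expression for an idealized point dipole. The symbols below (source current $I$, pole separation $d$, medium conductivity $\sigma$) are assumptions chosen for illustration; this sketch does not reproduce the expression in the manuscript's Appendix 1.

```latex
% Potential of an ideal current dipole in a homogeneous conductive medium
V(r,\theta) = \frac{I\,d\,\cos\theta}{4\pi\sigma\,r^{2}}

% Field components obtained from E = -\nabla V
E_{r} = -\frac{\partial V}{\partial r} = \frac{2\,I\,d\,\cos\theta}{4\pi\sigma\,r^{3}},
\qquad
E_{\theta} = -\frac{1}{r}\frac{\partial V}{\partial \theta} = \frac{I\,d\,\sin\theta}{4\pi\sigma\,r^{3}}

% Field magnitude: falls off with the cube of the distance
|E| = \frac{I\,d}{4\pi\sigma\,r^{3}}\sqrt{1 + 3\cos^{2}\theta}
```

      The potential therefore decays with the squared distance, while the field magnitude decays with the distance cubed.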

      At the end of the section on Inconsistencies..., it is not clear what "activity levels" refers to. It should also be made clearer at the outset, and reminded in this section too, that for the authors, behavioural context does not include social experience, which is somewhat counter-intuitive.

      We have now specified that we meant “locomotor activity levels”. Regarding social experience, we do include it as part of the “behavioral context”, and we have now made this clearer in the first Results paragraph. We hope this resolves the confusion.

      The caption of Fig.8 could use more clarity in terms of what is being compared in (C) (and is "1*2p" a typo?)

      We corrected the typo and edited the figure to make the references clearer.

      The concept of "high self-correlation of chirp time series" is presented only in the Conclusion using those words. The word self-correlation is not used beforehand. This needs to be fixed so the reader knows clearly what is being referred to.

      Thank you for noting this. We have now changed the wording using the term “auto-correlation” and changed a statement at the beginning of the “interference” result paragraph accordingly, removing references to self-correlation.

    1. Author response:

      eLife assessment

      The authors present an algorithm and workflow for the inference of developmental trajectories from single-cell data, including a mathematical approach to increase computational efficiency. While such efforts are in principle useful, the absence of benchmarking against synthetic data and a wide range of different single-cell data sets make this study incomplete. Based on what is presented, one can neither ultimately judge if this will be an advance over previous work nor whether the approach will be of general applicability.

      We thank the eLife editor for the valuable feedback. We wish to emphasize that both benchmarking against other methods and validation on a synthetic dataset (“dyntoy”) are indeed presented in the Supplementary Note, although we failed to sufficiently emphasize this in the main text.

      We will extend the benchmarking to more TI methods and we will improve the results and discussion sections to present those facts more clearly to the reader.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors present tviblindi, a computational workflow for trajectory inference from molecular data at single-cell resolution. The method is based on (i) pseudo-time inference via expected hitting time, (ii) sampling of random walks in a directed acyclic k-NN graph where edges are oriented away from a cell of origin w.r.t. the involved nodes' expected hitting times, and (iii) clustering of the random walks via persistent homology. An extended use case on mass cytometry data shows that tviblindi can be used to elucidate the biology of T cell development.
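      Steps (i) and (ii) of this summary can be illustrated with a toy sketch. It is not the tviblindi implementation: the function names, the iterative solver, the tiny hand-made graph, and the choice to define a cell's pseudotime as the expected number of steps for a simple random walk started at that cell to first reach the origin are all assumptions made here for illustration. Step (iii), the persistent-homology clustering, is omitted.

```python
import random


def hitting_time_pseudotime(neighbors, origin, tol=1e-12, max_iter=100_000):
    """Expected number of steps for a simple random walk started at each
    node to first reach `origin` (assumed definition of pseudotime)."""
    h = {v: 0.0 for v in neighbors}
    for _ in range(max_iter):
        # Fixed-point update of h(v) = 1 + mean of h over neighbors, h(origin) = 0.
        new_h = {
            v: 0.0 if v == origin
            else 1.0 + sum(h[u] for u in neighbors[v]) / len(neighbors[v])
            for v in neighbors
        }
        delta = max(abs(new_h[v] - h[v]) for v in neighbors)
        h = new_h
        if delta < tol:
            break
    return h


def orient_edges(neighbors, pseudotime):
    """Keep only edges pointing towards strictly larger pseudotime (a DAG)."""
    return {v: [u for u in neighbors[v] if pseudotime[u] > pseudotime[v]]
            for v in neighbors}


def simulate_walk(dag, origin, rng):
    """Random walk from the origin along oriented edges until a dead end."""
    walk = [origin]
    while dag[walk[-1]]:
        walk.append(rng.choice(dag[walk[-1]]))
    return walk


# Hypothetical toy graph: origin 0, a stem 0-1, and two branches 1-2 and 1-3.
graph = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
pseudotime = hitting_time_pseudotime(graph, origin=0)
dag = orient_edges(graph, pseudotime)
rng = random.Random(0)
walks = [simulate_walk(dag, 0, rng) for _ in range(20)]
```

      Every simulated walk leaves the origin through the stem and terminates in one of the two branch tips, which is the kind of walk collection that the clustering step would then group into trajectories.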

      Strengths:

      - Overall, the paper is very well written and most (but not all, see below) steps of the tviblindi algorithm are explained well.

      - The T cell biology use case is convincing (at least to me: I'm not an immunologist, only a bioinformatician with a strong interest in immunology).

      We thank the reviewer for the feedback and suggestions, which we will accommodate. We respond point-by-point below.

      Weaknesses:

      - The main weakness of the paper is that a systematic comparison of tviblindi against other tools for trajectory inference (there are many) is entirely missing. Even though I really like the algorithmic approach underlying tviblindi, I would therefore not recommend to our wet-lab collaborators that they should use tviblindi to analyze their data. The only validation in the manuscript is the T cell development use case. Although this use case is convincing, it does not suffice for showing that the algorithm's results are systematically trustworthy and more meaningful (at least in some dimension) than trajectories inferred with one of the many existing methods.

      We have compared tviblindi to several trajectory inference methods (Supplementary note section 8.2: Comparison to state-of-the-art methods, namely Monocle3 (v1.3.1) Cao et al. (2019), Stream (v1.1) Chen et al. (2019), Palantir (v1.0.0) Setty et al. (2019), VIA (v0.1.89) Stassen et al. (2021) and PAGA (scanpy==1.9.3) Wolf et al. (2019).) We will add thorough and systematic comparisons to the other algorithms mentioned by reviewers. We will include extended evaluation on publicly available datasets.

      Also, we have successfully used tviblindi to investigate human B-cell development in primary immunodeficiency (manuscript in revisions) and double negative T-cell development in ALPS (Autoimmune Lymphoproliferative Syndrome) by mass cytometry (project in progress).

      - The authors' explanation of the random walk clustering via persistent homology in the Results (subsection "Real-time topological interactive clustering") is not detailed enough, essentially only concept dropping. What does "sparse regions" mean here and what does it mean that "persistent homology" is used? The authors should try to better describe this step such that the reader has a chance to get an intuition how the random walk clustering actually works. This is especially important because the selection of sparse regions is done interactively. Therefore, it's crucial that the users understand how this selection affects the results. For this, the authors must manage to provide a better intuition of the maths behind clustering of random walks via persistent homology.

      In order to satisfy both reader types: the biologist and the mathematician, we explain the mathematics in detail in the Supplementary Note, section 4. We will improve the Results text to better point the reader to the mathematical foundations in the Supplementary Note.

      - To motivate their work, the authors write in the introduction that "TI methods often use multiple steps of dimensionality reduction and/or clustering, inadvertently introducing bias. The choice of hyperparameters also fixes the a priori resolution in a way that is difficult to predict." They claim that tviblindi is better than the original methods because "analysis is performed in the original high-dimensional space, avoiding artifacts of dimensionality reduction." However, in the manuscript, tviblindi is tested only on mass cytometry data which has a much lower dimensionality than scRNA-seq data for which most existing trajectory inference methods are designed. Since tviblindi works on a k-NN graph representation of the input data, it is unclear if it could be run on scRNA-seq data without prior dimensionality reduction. For this, cell-cell distances would have to be computed in the original high-dimensional space, which is problematic due to the very high dimensionality of scRNA-seq data. Of course, the authors could explicitly reduce the scope of tviblindi to data of lower dimensionality, but this would have to be stated explicitly.

      In the manuscript we tested the framework on the scRNA-seq data from Park et al 2020 (DOI: 10.1126/science.aay3224). To illustrate that tviblindi can work directly in the high-dimensional space, we applied the framework successfully on imputed 2000 dimensional data.

      The idea behind tviblindi is to be able to work without the necessity to use non-linear dimensionality reduction techniques, which reduce the dimensionality to a very low number of dimensions and whose effects on the data distribution are difficult to predict. On the other hand the use of (linear) dimensionality reduction techniques which effectively suppress noise in the data such as PCA is a good practice (see also response to reviewer 2). We will emphasize this in the revised version and add the results of the corresponding analysis.

      - Also tviblindi has at least one hyper-parameter, the number k used to construct the k-NN graphs (there are probably more hidden in the algorithm's subroutines). I did not find a systematic evaluation of the effect of this hyper-parameter.

      A detailed discussion of the topic is presented in the Supplementary Note, section 8.1, where the Spearman correlation coefficient between pseudotime estimated using k=10 and k=50 nearest neighbors was 0.997. The number k does, however, affect the number of candidate endpoints. But even when a larger k causes spurious connections between unrelated cell fates, the topological clustering of random walks allows for the separation of different trajectories. We will expand the “sensitivity to hyperparameters” section, also in response to reviewer 2.
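      The stability check described here compares two pseudotime vectors by rank correlation. A minimal, dependency-free sketch of that comparison is given below; the function names are hypothetical and the two short vectors are made-up placeholder values, not data from the study.

```python
def average_ranks(values):
    """1-based ranks, with tied values receiving the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Placeholder pseudotime values for the same five cells under two settings of k.
pseudotime_k10 = [0.0, 1.2, 2.5, 3.1, 4.8]
pseudotime_k50 = [0.0, 1.0, 2.9, 2.7, 5.0]
rho = spearman(pseudotime_k10, pseudotime_k50)
```

      A coefficient close to 1 indicates that the ordering of cells along pseudotime is largely preserved when k changes, even if the absolute pseudotime values differ.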

      Reviewer #2 (Public Review):

      Summary:

      In Deconstructing Complexity: A Computational Topology Approach to Trajectory Inference in the Human Thymus with tviblindi, Stuchly et al. propose a new trajectory inference algorithm called tviblindi and a visualization algorithm called vaevictis for single-cell data. The paper utilizes novel and exciting ideas from computational topology coupled with random walk simulations to align single cells onto a continuum. The authors validate the utility of their approach largely using simulated data and establish known protein expression dynamics along CD4/CD8 T cell development in thymus using mass cytometry data. The authors also apply their method to track Treg development in single-cell RNA-sequencing data of human thymus.

      The technical crux of the method is as follows: The authors provide an interactive tool to align single cells along a continuum axis. The method uses expected hitting time (given a user input start cell) to obtain a pseudotime alignment of cells. The pseudotime gives an orientation/direction for each cell, which is then used to simulate random walks. The random walks are then arranged/clustered based on the sparse region in the data they navigate using persistent homology.

      We thank the reviewer for feedback and suggestions that we will accommodate, we respond point-by-point below.

      Strengths:

      The notion of using persistent homology to group random walks to identify trajectories in the data is novel.

      The strength of the method lies in the implementation details that make computationally demanding ideas such as persistent homology more tractable for large scale single-cell data.

      This enables the authors to make the method more user friendly and interactive allowing real-time user query with the data.

      Weaknesses:

      The interactive nature of the tool is also a weakness, by allowing for user bias leading to possible overfitting for a specific data.

      tviblindi is not designed as a fully automated TI tool (although it implements a fully automated module), but as a data driven framework for exploratory analysis of unknown data. There is always a risk of possible bias in this type of analysis - starting with experimental design, choice of hyperparameters in the downstream analysis, and an expert interpretation of the results. The successful analysis of new biological data involves a great deal of expert knowledge which is difficult to a priori include in the computational models.

      tviblindi tries to solve this challenge by intentionally overfitting the data and keeping the level of resolution on a single random walk. In this way we aim to capture all putative local relationships in the data. The on-demand aggregation of the walks using the global topology of the data allows researchers to use their expert knowledge to choose the right level of detail (as demonstrated in Figure 4 of the manuscript) while relying on the topological structure of the high dimensional point cloud. At all times tviblindi allows the user to inspect the composition of the trajectory to assess the variance in the development, possible hubs on the KNN-graph etc.

      The main weakness of the method is the lack of benchmarking on real data and comparison to other methods. Trajectory inference is a very crowded field with many highly successful and widely used algorithms; the two most relevant ones (closest to this manuscript) are not only not benchmarked against, but also not cited. These include methods that specifically use persistent homology to discover trajectories (Rizvi et al., Nat Biotech 2017) and methods that specifically implement the idea of simulating random walks to identify stable states in single-cell data (e.g. CellRank, Lange et al., Nat Meth 2022), as well as many trajectory algorithms that take alternative approaches. The paper has much less benchmarking, demonstration on real data and comparison than the very many trajectory algorithms published before it. Generally speaking, in a crowded field of previously published trajectory methods, I do not think this one approach will compete well against prior work, especially due to its inability to handle the noise typical in real-world data (as was even demonstrated in the little application to real-world data provided).

      We provide comparisons of tviblindi and vaevictis in the Supplementary Note, section 8.2, where we compare it to Monocle3 (v1.3.1) Cao et al. (2019), Stream (v1.1) Chen et al. (2019), Palantir (v1.0.0) Setty et al. (2019), VIA (v0.1.89) Stassen et al. (2021) and PAGA (scanpy==1.9.3) Wolf et al. (2019). We use two datasets: artificial Dyntoy and real mass cytometry thymus+peripheral blood dataset. We thank the reviewer for suggesting specific methods. CellRank was excluded from the benchmarking as it was originally designed for RNA-velocity data (not available in mass cytometry data), but we will include the recent upgrade CellRank2 (preprint at doi.org/10.1101/2023.07.19.549685), which offers more flexibility.

      We will add further benchmarking as suggested by the reviewer in the course of revisions.

      Beyond the general lack of benchmarking there are two issues that give me particular concern. As previously mentioned, the algorithm is highly susceptible to user bias and overfitting. The paper gives the example (Figure 4) of a trajectory which mistakenly shows that cells may pass from an apoptotic phase to a different developmental stage. To circumvent this mistake, the authors propose the interactive version of tviblindi that allows users to zoom in (increase resolution) and identify that there are in fact two trajectories in one. In this case, the authors show how the user can fix a mistake when the answer is known. However, the point of trajectory inference is to discover the unknown. With so many interactive options for the user to guide the result, the method is more user/bias driven than data-driven. So a rigorous and quantitative discussion of the robustness of the method, as well as of how to ensure data-driven inference and avoid over-fitting, would be useful.

      Local directionality in expression data is a challenge which is not, to our knowledge, solved. And we are not sure it can be solved entirely, even theoretically. The random walks passing “through” the apoptotic phase are biologically infeasible, but it is an (unbiased) representation of what the data look like based on the diffusion model. It is a property of the data (or of the panel design), which has to be interpreted properly rather than a mistake. Of note, except for Monocle3 (which does not provide the directionality) other tested methods did not discover this trajectory at all.

      The “zoom in” has in fact nothing to do with “passing through the apoptosis”. We show how the researcher can investigate the suggested trajectory to see if there is an additional structure of interest and/or relevance. This investigation is still data driven (although not fully automated). Anecdotally, in this particular case this branching was discovered by a bioinformatician who knew nothing about the presence of beta-selection in the data.

      We show that the trajectory of apoptosis of cortical thymocytes consists of 2 trajectories corresponding to 2 different checkpoints (beta-selection and positive/negative selection). This type of structure, where 2 (or more) trajectories share the same path for most of the time, then diverge only to be connected at a later moment (immediately from the point of view of the beta-selection failure trajectory), is a challenge for TI algorithms and none of the tested methods gave a correct result. More importantly, there seems to be no clear way to focus on these kinds of structures (common origin and common fate) in TI methods.

      Of note, the “zoom in” is a recommended and convenient method to look for an inner structure, but it does not necessarily mean addition of further homological classes. Indeed, in this case the reason that the structure is not visible directly is the limitation of the dendrogram complexity (only branches containing at least 10% of simulated random walks are shown by default).

      In summary, tviblindi effectively handled all noise in the data that obscured biologically valid trajectories for other methods. We will improve the discussion of the robustness in the revised version.

      Second, the paper discusses the benefit of tviblindi operating in the original high dimensions of the data. This is perhaps adequate for mass cytometry data, where there is less of an issue of dropouts and the proteins may be chosen to be largely independent. But in the context of single-cell RNA-sequencing data, the massive undersampling of mRNA, as well as the high degree of noise (e.g. ambient RNA), introduces a very large degree of noise, so that modeling data in the original high dimensions leads to methods being fit to the noise. Therefore ALL other methods for trajectory inference work in a lower dimension, for very good reason; otherwise one is learning noise rather than signal. It would be great to have a discussion on the feasibility of the method as is for such noisy data and to provide users with guidance. We note that the example scRNA-seq data included in the paper is denoised using imputation, which will likely result in the trajectory inference being oversmoothed as well.

      We agree with the reviewer. In our manuscript we wanted to showcase that tviblindi can directly operate in high-dimensional space (thousands of dimensions) and we used MAGIC imputation for this purpose. This was not ideal. A more standard approach, which uses 30-50 PCs as input to the algorithm, resulted in equivalent trajectories. We will add this analysis to the study.

      In summary, the fact that tviblindi scales well with dimensionality of the data and is able to work in the original space does not mean that it is always the best option. We will emphasize in the revised paper that we aim to avoid the non-linear dimensional reduction techniques as a data preprocessing tool, as the effect of the reduction is difficult to predict. We will also discuss the preprocessing of scRNA-seq data in greater detail.

      Reviewer #3 (Public Review):

      Summary:

      Stuchly et al. proposed a single-cell trajectory inference tool, tviblindi, which was built on a sequential implementation of the k-nearest neighbor graph, random walk, persistent homology and clustering, and interactive visualization. The paper was organized around the detailed illustration of the usage and interpretation of results through the human thymus system.

      Strengths:

      Overall, I found the paper and method to be practical and needed in the field. Especially the in-depth, step-by-step demonstration of the application of tviblindi in numerous T cell development trajectories and how to interpret and validate the findings can be a template for many basic science and disease-related studies. The videos are also very helpful in showcasing how the tool works.

      Weaknesses:

      I only have a few minor suggestions that hopefully can make the paper easier to follow and the advantage of the method to be more convincing.

      (1) The "Computational method for the TI and interrogation - tviblindi" subsection under the Results is a little hard to follow without having a thorough understanding of the tviblindi algorithm procedures. I would suggest that the authors discuss the uniqueness and advantages of the tool after the detailed introduction of the method (moving it after the "Connectome - a fully automated pipeline" subsection).

      We thank the reviewer for the suggestion and we will accommodate it to improve readability of the text.

      Also, considering it is a computational tool paper, inevitably, readers are curious about how it functions compared to other popular trajectory inference approaches. I did not find any formal discussion until almost the end of the supplementary note (even that is not cited anywhere in the main text). Authors may consider improving the summary of the advantages of tviblindi by incorporating concrete quantitative comparisons with other trajectory tools.

      We provide comparisons of tviblindi and vaevictis in the Supplementary Note, section 8.2, where we compare it to Monocle3 (v1.3.1) Cao et al. (2019), Stream (v1.1) Chen et al. (2019), Palantir (v1.0.0) Setty et al. (2019), VIA (v0.1.89) Stassen et al. (2021) and PAGA (scanpy==1.9.3) Wolf et al. (2019). We use two datasets: artificial Dyntoy and real mass cytometry thymus+peripheral blood dataset. We will also add CellRank2 into comparisons and we will strengthen the message of the benchmarking results in the Discussion section.

      (2) Regarding the discussion of Figure 4, where the trajectory goes through the apoptotic stage and reconnects back to the canonical trajectory with counterintuitive directionality: this can be a checkpoint, as the authors interpret using their expert knowledge, or it may be a false discovery of the tool. Maybe the authors can consider running other algorithms on those cells to see which tracks they identify and whether the directionality matches that of tviblindi.

      We have indeed used the thymus dataset for comparison of all TI algorithms listed above. Except for Monocle 3 they failed to discover the negative selection branch (Monocle 3 does not offer directionality information). Therefore, a valid topological trajectory with incorrect (expert-corrected) directionality was partly or entirely missed by other algorithms.

      (3) The paper mainly focused on mass cytometry data and had a brief discussion on scRNA-seq. Can the tool be applied to multimodality data such as CITE-seq data that have both protein markers and gene expression? Any suggestions if users want to adapt to scATAC-seq or other epigenomic data?

      The analysis of multimodal data is the logical next step and is the topic of our current research. At this moment tviblindi cannot be applied directly to multimodal data. It is possible to use the KNN-graph based on multimodal data (such as weighted nearest neighbor graph implemented in Seurat) for pseudotime calculation and random walk simulation. However, we do not have a fully developed triangulation for the multimodal case yet.

    2. Reviewer #2 (Public Review):

      Summary: In Deconstructing Complexity: A Computational Topology Approach to Trajectory Inference in the Human Thymus with tviblindi, Stuchly et al. propose a new trajectory inference algorithm called tviblindi and a visualization algorithm called vaevictis for single-cell data. The paper utilizes novel and exciting ideas from computational topology coupled with random walk simulations to align single cells onto a continuum. The authors validate the utility of their approach largely using simulated data and establish known protein expression dynamics along CD4/CD8 T cell development in thymus using mass cytometry data. The authors also apply their method to track Treg development in single-cell RNA-sequencing data of human thymus.

      The technical crux of the method is as follows: The authors provide an interactive tool to align single cells along a continuum axis. The method uses expected hitting time (given a user input start cell) to obtain a pseudotime alignment of cells. The pseudotime gives an orientation/direction for each cell, which is then used to simulate random walks. The random walks are then arranged/clustered based on the sparse region in the data they navigate using persistent homology.

      Strengths:

      The notion of using persistent homology to group random walks to identify trajectories in the data is novel.

      The strength of the method lies in the implementation details that make computationally demanding ideas such as persistent homology more tractable for large scale single-cell data. This enables the authors to make the method more user friendly and interactive allowing real-time user query with the data.

      Weaknesses:

      The interactive nature of the tool is also a weakness, by allowing for user bias leading to possible overfitting for a specific data.

      The main weakness of the method is the lack of benchmarking on real data and comparison to other methods. Trajectory inference is a very crowded field with many highly successful and widely used algorithms; the two most relevant ones (closest to this manuscript) are not only not benchmarked against, but also not cited. These include methods that specifically use persistent homology to discover trajectories (Rizvi et al., Nat Biotech 2017) and methods that specifically implement the idea of simulating random walks to identify stable states in single-cell data (e.g. CellRank, Lange et al., Nat Meth 2022), as well as many trajectory algorithms that take alternative approaches. The paper has much less benchmarking, demonstration on real data and comparison than the very many trajectory algorithms published before it. Generally speaking, in a crowded field of previously published trajectory methods, I do not think this one approach will compete well against prior work, especially due to its inability to handle the noise typical in real-world data (as was even demonstrated in the little application to real-world data provided).

      Beyond the general lack of benchmarking there are two issues that give me particular concern. As previously mentioned, the algorithm is highly susceptible to user bias and overfitting. The paper gives the example (Figure 4) of a trajectory which mistakenly shows that cells may pass from an apoptotic phase to a different developmental stage. To circumvent this mistake, the authors propose the interactive version of tviblindi that allows users to zoom in (increase resolution) and identify that there are in fact two trajectories in one. In this case, the authors show how the user can fix a mistake when the answer is known. However, the point of trajectory inference is to discover the unknown. With so many interactive options for the user to guide the result, the method is more user/bias driven than data-driven. So a rigorous and quantitative discussion of the robustness of the method, as well as of how to ensure data-driven inference and avoid over-fitting, would be useful.

      Second, the paper discusses the benefit of tviblindi operating in the original high dimensions of the data. This is perhaps adequate for mass cytometry data, where there is less of an issue of dropouts and the proteins may be chosen to be largely independent. But in the context of single-cell RNA-sequencing data, the massive undersampling of mRNA, as well as the high degree of noise (e.g. ambient RNA), introduces a very large degree of noise, so that modeling data in the original high dimensions leads to methods being fit to the noise. Therefore ALL other methods for trajectory inference work in a lower dimension, for very good reason; otherwise one is learning noise rather than signal. It would be great to have a discussion on the feasibility of the method as is for such noisy data and to provide users with guidance. We note that the example scRNA-seq data included in the paper is denoised using imputation, which will likely result in the trajectory inference being oversmoothed as well.

Ask Lisi By Lisi Tesher And Lisi advises a letter writer who is struggling to understand her professor. Life Zendaya, Demi Moore and Lana Del Rey were the 2024 Met Gala best dressed By Liz Guber Life My friend is so cute and sweet, but he's never had a girlfriend. I think I know why — but telling him might break his heart. Should I do it anyway? Ask Lisi By Lisi Tesher Things To Do A Negroni journey: I travelled to Italy to sip my favourite cocktail in Venice, Florence and Rome By Tim Johnson Special to the Star Food & Drink News Starbucks unveiling several new menu items across Canada May 7 and people already have strong reactions online By Louie Rosella Available now. Food And Drink Dairy Queen unveils new Blizzard menu items at restaurants across Canada and people are reacting online By Louie Rosella Available for a limited time. Food And Drink It's time to weigh in on the KitKat break debate with #MyBreak social media posts By Bruce Froude Updated Apr 18, 2024 Food And Drink Tim Hortons to start selling pizza April 17 at restaurants and coffee shops across Canada and the online response has been huge By Louie Rosella Updated Apr 29, 2024 Food And Drink Starbucks and A&W unveil new menu items at restaurants and coffee shops across Canada and here's what people are saying online By Louie Rosella Updated Apr 15, 2024 Opinion Contributed Children’s books on nature, dancing, self-confidence and signing! By Glenn Perrett Glenn Perrett's latest list of recommended books for young readers includes “The Art of Rewilding: The Return of Yellowstone’s Wolves,” “Why We Dance: A Story of Hope and Healing” and “Butterfly On the Wind.” Contributed Tool gift ideas for Mother's Day and Father's Day By Glenn Perrett If you're looking for a gift for mom or dad this spring, Glenn Perrett recommends considering these tools from DeWalt, Irwin and Craftsman. 
Contributed Education workers frustrated for students as province promises change, delivers more of the same cuts and distraction: union Editorials Monday's highway carnage is yet more proof that police chases are never worth the risk By Star Editorial Board Money Matters ASK THE MONEY LADY: Should I skip the pre-nup to save on legal fees? By Christine Ibbotson googletag.cmd.push(function() { googletag.display('ad-1168974'); }); @media (min-width:768px) { .newsletterSignup {display: flex;justify-content: center;align-items: center;} } .newsletterSignup {background-color: #c4e4c2;text-align:center;padding:15px} .newsletterText {color:black;font-size:20px;/*font-weight:700*/} .newsletterText small {font-family: 'Source Sans Pro', sans-serif; letter-spacing: .10ch;} .newsletterText p { margin: 5px 0;line-height:1;} .newsletterSignupButton{color:white;background-color:#006633;display:inline-block;text-transform: uppercase;font-family: 'Source Sans Pro', sans-serif; letter-spacing: .10ch;-webkit-transition: background .3s ease-in-out; -moz-transition: background .3s ease-in-out; -ms-transition: background .3s ease-in-out; -o-transition: background .3s ease-in-out; transition: background .3s ease-in-out;} .newsletterSignupButton:hover {background-color:#00ac56;color:white} @media (max-width:767px) { .newsletterText {font-size:18px;margin-bottom:15px;} } @media (min-width:992px) { .main-sidebar .newsletterSignup {display: block; max-width: 300px; margin: auto;} } .main-sidebar .newsletterSignup .col-md-8, .main-sidebar .newsletterSignup .col-md-4 {width:100%;} .main-sidebar .newsletterText {font-size:18px;margin-bottom:15px;} HEADLINES NEWSLETTER TOP STORIES, delivered to your inbox. 
Sign Up Follow us on Facebook (function(d, s, id) { var js, fjs = d.getElementsByTagName(s)[0]; if (d.getElementById(id)) return; js = d.createElement(s); js.id = id; js.src = "//connect.facebook.net/en_US/sdk.js#xfbml=1&version=v2.5&appId=1550124928647000"; fjs.parentNode.insertBefore(js, fjs); }(document, 'script', 'facebook-jssdk')); TOP STORIES, delivered to your inbox.Headlines Newsletter Sign Up googletag.cmd.push(function() { googletag.display('ad-1202244'); }); More News News 'A meaningful difference': Annual McHappy Day returns to McDonald's restaurants for 30th year on May 8 to raise money for charity News 'Critical service': What's happening May 15 on phones in Ontario and what you need to know about it By Louie Rosella News Toronto's May 6 forecast: Mainly sunny By Torstar Open Data Team News Toronto's May 5 forecast: Showers By Torstar Open Data Team Crime TIMELINE OF A TRAGEDY: It started as a liquor store robbery in Bowmanville and ended with four dead on Highway 401 in Whitby By Bruce Froude News Grade 11, 12 students to have more access to skilled trades through co-op programming News Skilled trades in Ontario: What are the industries and jobs in most need? 
News No GO Train or bus service available May 3 to 5 from Pickering GO to Toronto Union News Toronto's May 3 forecast: Chance of showers By Torstar Open Data Team googletag.cmd.push(function() { googletag.display('ad-1168986'); }); Follow us on Twitter (function(w, d) { var twitterWidget = { init: function () { var twitHolder = d.getElementById("tncms-block-1366069").parentNode, widget = d.getElementById("twitter-widget-1366069"); function handleIntersection(entries) { entries.map((entry) => { if (entry.isIntersecting) { twttr.widgets.createTimeline( { sourceType: "profile", screenName: "torontodotcom" }, d.getElementById("twitter-widget-1366069"), { height: '350' } ).then(function (el) {} ); observer.unobserve(entry.target); } }); } const options = { threshold: 0.1 } const observer = new IntersectionObserver(handleIntersection, options); observer.observe(widget); } } if (d.readyState == "loading") { d.onreadystatechange = function () { if (d.readyState == "complete") { twitterWidget.init(); } } } else { twitterWidget.init(); } })(window, document); googletag.cmd.push(function() { googletag.display('ad-1168962'); }); Helpful Links Classifieds Digital Editions Marketplace Obituaries Sitemap Toronto.com Readers Choice Metroland Gives Back Walk-In Clinics Connect with us About Us Advertising Standards Become a Carrier Contact Us Delivery Concerns Newsletter Signup Feedback Submit a Letter Submit Multimedia Contact Information Phone: 1-833-440-7474 Email: newsroom@toronto.com Follow Us Facebook Twitter Instagram { "@context" : "https://schema.org", "@type" : "Organization", "url" : "http://www.toronto.com", "sameAs" : ["https://www.facebook.com/torontodotcom","https://twitter.com/torontodotcom","https://www.instagram.com/torontodotcom/?hl=en"] } × Browser Compatibility Your browser is out of date and potentially vulnerable to security risks.We recommend switching to one of the following browsers: Microsoft Edge Google Chrome Firefox Copyright 2023 Toronto Star 
Newspapers Limited. All Rights Reserved. 8 Spadina Avenue, Suite 10A, Toronto, ON M5V 0S8 Corporate Privacy Policy | Terms of Use | Advertising Terms | Accessibility googletag.cmd.push(function() { googletag.display('ad-1168980'); }); window.__tnt = window.__tnt || {}; __tnt.compatibility = __tnt.compatibility || {}; __tnt.compatibility.status = ''; __tnt.compatibility.check = function() { if (typeof __tnt.advertisements == 'undefined') { __tnt.compatibility.status = 'FAIL: object 0 undefined'; return false; } return true; }; __tnt.compatibility.notification = function() { }; (function() { function compatibilityCheck() { if (!__tnt.compatibility.check()) { __tnt.trackEvent({ 'category':'subscription', 'action':'adblock', 'label':'adblock detected', 'value':'1' }); __tnt.compatibility.notification(); } } if (document.readyState != 'loading') { compatibilityCheck(); } else { document.addEventListener('DOMContentLoaded', compatibilityCheck); } })(); jQuery(function() { if(typeof TNCMS.Tracking != 'undefined'){ jQuery(TNCMS.Tracking.trackDeclarativeEvents); }}); __tnt.trackEvent = function(obj) { if (typeof obj === 'object') { if (obj.category && obj.action) { __tnt.googleEvent(obj); } else if (obj.network && obj.socialAction) { __tnt.googleSocial(obj); } else if (obj.url) { __tnt.googlePageView(obj); } if (typeof TNCMS.Tracking != 'undefined' && obj.metric) { TNCMS.Tracking.addEvent({ app: obj.app, metric: obj.metric, id: obj.uuid }); } } }; if (__tnt.trackEventLater.length > 0) { __tnt.trackEventLater.forEach(function(obj) { __tnt.trackEvent(obj); }); } Array.from(document.querySelectorAll('body [data-track]')).forEach(function(el) { el.addEventListener(__tnt.client.clickEvent, function() { __tnt.trackEvent(JSON.parse(el.dataset.track)); }); }); Array.from(document.querySelectorAll('body [data-tncms-track-event]')).forEach(function(el) { el.addEventListener(__tnt.client.clickEvent, function() { __tnt.trackEvent(JSON.parse(el.dataset.tncmsTrackEvent)); }); }); 
Array.from(document.querySelectorAll('body [data-tncms-track-dmp]')).forEach(function(el) { el.addEventListener(__tnt.client.clickEvent, function() { var dmpData = el.dataset.tncmsTrackDmp; }); }); /*<![CDATA[*/ __tnt.googleEvent = function(obj) { dataLayer.push({ 'event': 'tncms.event.trigger', 'tncms.event.trigger.category': obj.category, 'tncms.event.trigger.action': obj.action, 'tncms.event.trigger.label': obj.label, 'tncms.event.trigger.value': obj.value }); } /* Virtual page view */ __tnt.googlePageView = function(obj) { var sURL = obj.url.replace(/^.*\/\/[^\/]+/, ''); dataLayer.push({ 'event': 'tncms.event.virtual_pageview', 'tncms.event.virtual_pageview.url': sURL, 'tncms.event.virtual_pageview.title': obj.title, 'tncms.event.virtual_pageview.metric': obj.metric }); } /* Social event */ __tnt.googleSocial = function(obj) { dataLayer.push({ 'event': 'tncms.event.social', 'tncms.event.social.network': obj.network, 'tncms.event.social.action': obj.socialAction, 'tncms.event.social.target': obj.url }); } /*]]>*/ /*<![CDATA[*/ { "@context": "https://schema.org", "@type": "WebSite", "url": "https://www.toronto.com", "potentialAction": { "@type": "SearchAction", "target": "https://www.toronto.com/search?q={search_term_string}", "query-input": "required name=search_term_string" } } /*]]>*/ /*<![CDATA[*/ (function(d) { var form = d.getElementById('site-search-1168614'), query_input = d.getElementById('site-search-1168614-term'), search_dropdown = d.getElementById('site-search-1168614-dropdown'); /** Input focus */ try { search_dropdown.onmouseenter = function(){ setTimeout(function(){ query_input.focus(); }, 700); }; } catch (error) { // No dropdown behavior } /** Submit handler */ form.onsubmit = function(){ // Filter query var elem = document.querySelector("#site-search-1168614 input[name=q]"), sQueryFiltered = elem.value.replace(/\?/g, ''); elem.value = sQueryFiltered; // No submit if empty input if( query_input.val() ){ return true; } else{ return false; } 
};})(document); /*]]>*/ /*<![CDATA[*/ !function(t,i,n){var e,a,s,o,c,d={init:function(){a=i.getElementById("site-navbar-container"),n.client.platform.ios?a.classList.add("affix-sticky"):(e=i.getElementById("main-body-container"),s=a.offsetHeight||a.clientHeight,o=!1,c=0,t.addEventListener("scroll",d.navPosition,!1),t.addEventListener("mousewheel",d.navPosition,!1))},navPosition:function(){o||(o=!0,setTimeout(function(){var n=a.getBoundingClientRect(),d=t.pageYOffset||i.documentElement.scrollTop,f=n.top+d;d>=f&&d>c?a.classList.contains("affix")||(c=f,a.classList.add("affix"),a.classList.remove("affix-top"),e.style.marginTop=s+"px"):a.classList.contains("affix-top")||(a.classList.remove("affix"),a.classList.add("affix-top"),e.style.marginTop="0px"),o=!1},25))}};"loading"==i.readyState?i.addEventListener("DOMContentLoaded",d.init,!1):d.init()}(window,document,__tnt); document.addEventListener('DOMContentLoaded', function() { var isIOS = /iPad|iPhone|iPod/.test(navigator.userAgent) && !window.MSStream; if (isIOS) { Array.from(document.querySelectorAll('[data-toggle="offcanvas"]')).forEach(function(drawer) { drawer.addEventListener("mouseover", function(e) { var drawerCls = drawer.dataset.target === 'left' ? 
'active-left' : 'active-right'; document.documentElement.classList.add('drawer-open', drawerCls); }) }) } }); /*]]>*/ /*<![CDATA[*/ (function() { window.addEventListener('load', function() { __tnt.regions.stickySide.init(document.getElementById('sticky-side-primary'), document.getElementById('sticky-side-primary-spacer'), 'siderail', '.row'); }); })(); /*]]>*/ /*<![CDATA[*/ (function() { window.addEventListener('load', function() { __tnt.regions.stickySide.init(document.getElementById('sticky-side-secondary'), document.getElementById('sticky-side-secondary-spacer'), 'siderail', '.row'); }); })(); /*]]>*/ /*<![CDATA[*/ (function() { window.addEventListener('load', function() { __tnt.regions.stickySide.init(document.getElementById('sticky-side-tertiary'), document.getElementById('sticky-side-tertiary-spacer'), 'siderail', '.row'); }); })(); /*]]>*/ /*<![CDATA[*/ document.addEventListener("DOMContentLoaded", __tnt.deprecatedCheck, false); /*]]>*/ /*<![CDATA[*/ __tnt.regions.stickyAnchor.init(); /*]]>*/ _satellite["_runScript1"](function(event, target, Promise) { var existingEcid = _satellite.getVar('cookie:s_ecid'); if (!existingEcid){ var ecid = _satellite.getVisitorId().getMarketingCloudVisitorID(); if (ecid){ var now = new Date(); var time = now.getTime(); var expireTime = time + 1000 * 60 * 60 * 24 * 730; now.setTime(expireTime); var cookieName = "s_ecid"; var cookieValue = "MCMID|" + _satellite.getVisitorId().getMarketingCloudVisitorID(); cookieValue = encodeURIComponent(cookieValue); var cookieString = ""; cookieString = cookieName +'=' + cookieValue + ';expires=' + now.toGMTString() + ';path=/;domain=' + _satellite.getVar('processed:MainDomain'); document.cookie = cookieString; } } });_satellite["_runScript2"](function(event, target, Promise) { "no"===_satellite.getVar("processed:UserLoggedInState")?sessionStorage.setItem("cls","false"):sessionStorage.setItem("cls2","false"); });!function(){var 
a=window.analytics=window.analytics||[];if(!a.initialize)if(a.invoked)window.console&&console.error&&console.error("Segment snippet included twice.");else{a.invoked=!0;a.methods="trackSubmit trackClick trackLink trackForm pageview identify reset group track ready alias debug page once off on addSourceMiddleware addIntegrationMiddleware setAnonymousId addDestinationMiddleware".split(" ");a.factory=function(b){return function(){var c=Array.prototype.slice.call(arguments);c.unshift(b); a.push(c);return a}};for(var e=0;e<a.methods.length;e++){var f=a.methods[e];a[f]=a.factory(f)}a.load=function(b,c){var d=document.createElement("script");d.type="text/javascript";d.async=!0;d.src="https://cdn.segment.com/analytics.js/v1/"+b+"/analytics.min.js";b=document.getElementsByTagName("script")[0];b.parentNode.insertBefore(d,b);a._loadOptions=c};a._writeKey="YNwPRuYDOjrAr7O9PCSVIw1QoK0Oimn6";a.SNIPPET_VERSION="4.15.3";a.debug(google_tag_manager["rm"]["61227858"](44));a.load("YNwPRuYDOjrAr7O9PCSVIw1QoK0Oimn6");a.ready(function(){var b= window.analytics.user();sUserId=null;b&&(sUserId=b.id()||b.anonymousId());b=new CustomEvent("TownnewsSegmentLoaded",{detail:{analytics:window.analytics,user_id:sUserId}});window.document.dispatchEvent(b)})}}();_satellite["_runScript3"](function(event, target, Promise) { var adWordsPixelId=_satellite.getVar("processed:AdWordsPixelJSON"),pageType=_satellite.getVar("processed:PageType"),template=_satellite.getVar("processed:Template");try{if(adWordsPixelId&&"x"!==adWordsPixelId.accountId){var googleConversionScript=document.createElement("script");function 
gtag(){dataLayer.push(arguments)}googleConversionScript.type="text/javascript",googleConversionScript.src="https://www.googletagmanager.com/gtag/js?id="+adWordsPixelId.accountId,googleConversionScript.async=!0,document.getElementsByTagName("head")[0].appendChild(googleConversionScript),window.dataLayer=window.dataLayer||[],gtag("config",adWordsPixelId.accountId),setTimeout((function(){!window.newsletterSignupG&&!0===window.atLeastOneSubscribe&&adWordsPixelId.use.newsletterSuccess&&(gtag("event","conversion",{send_to:adWordsPixelId.accountId+"/"+adWordsPixelId.use.newsletterSuccess}),window.newsletterSignupG=!0)}),400)}}catch(e){} });_satellite["_runScript4"](function(event, target, Promise) { var doubleClickPixelId=_satellite.getVar("processed:DoubleClickPixelJSON"),pageType=_satellite.getVar("processed:PageType"),template=_satellite.getVar("processed:Template");try{if(doubleClickPixelId&&"x"!==doubleClickPixelId.accountId){var doubleclickScript=document.createElement("script");function gtag(){dataLayer.push(arguments)}doubleclickScript.type="text/javascript",doubleclickScript.src="https://www.googletagmanager.com/gtag/js?id="+doubleClickPixelId.accountId,doubleclickScript.async=!0,document.getElementsByTagName("head")[0].appendChild(doubleclickScript),window.dataLayer=window.dataLayer||[],gtag("config",doubleClickPixelId.accountId),doubleClickPixelId.use.allPages&&gtag("event","conversion",{allow_custom_scripts:!0,send_to:doubleClickPixelId.accountId+"/"+doubleClickPixelId.use.allPages})}}catch(e){} });_satellite["_runScript5"](function(event, target, Promise) { function waitForTwq(t){counter++,"undefined"!=typeof twq?t():counter>500||setTimeout((function(){waitForTwq(t)}),100)}var twitterPixelId=_satellite.getVar("processed:TwitterPixelJSON"),template=_satellite.getVar("processed:Template"),counter=0;try{twitterPixelId&&"x"!=twitterPixelId.accountId&&"undefined"==typeof 
twq&&function(t,e,i,n,o,r){t.twq||(n=t.twq=function(){n.exe?n.exe.apply(n,arguments):n.queue.push(arguments)},n.version="1.1",n.queue=[],(o=e.createElement(i)).async=!0,o.src="//static.ads-twitter.com/uwt.js",(r=e.getElementsByTagName(i)[0]).parentNode.insertBefore(o,r))}(window,document,"script")}catch(t){}waitForTwq((function(){twq("config",twitterPixelId.accountId)})); });_satellite["_runScript6"](function(event, target, Promise) { var redditPixelId=_satellite.getVar("processed:RedditPixelJSON"),pageType=_satellite.getVar("processed:PageType"),template=_satellite.getVar("processed:Template");try{redditPixelId&&"x"!==redditPixelId.accountId&&(!function(e,t){if(!e.rdt){var a=e.rdt=function(){a.sendEvent?a.sendEvent.apply(a,arguments):a.callQueue.push(arguments)};a.callQueue=[];var d=t.createElement("script");d.src="https://www.redditstatic.com/ads/pixel.js",d.async=!0;var r=t.getElementsByTagName("script")[0];r.parentNode.insertBefore(d,r)}}(window,document),rdt("init",redditPixelId.accountId,{optOut:!1,useDecimalCurrencyValues:!0}),rdt("track","PageVisit"))}catch(e){} });_satellite["_runScript7"](function(event, target, Promise) { var linkedInPixelId=_satellite.getVar("processed:LinkedInPixelJSON"),pageType=_satellite.getVar("processed:PageType"),template=_satellite.getVar("processed:Template");try{linkedInPixelId&&"x"!==linkedInPixelId.accountId&&(_linkedin_partner_id=linkedInPixelId.accountId,window._linkedin_data_partner_ids=window._linkedin_data_partner_ids||[],window._linkedin_data_partner_ids.push(_linkedin_partner_id),function(){window.lintrk||(window.lintrk=function(e,n){window.lintrk.q.push([e,n])},window.lintrk.q=[]);var e=document.getElementsByTagName("script")[0],n=document.createElement("script");n.type="text/javascript",n.async=!0,n.src="https://snap.licdn.com/li.lms-analytics/insight.min.js",e.parentNode.insertBefore(n,e)}())}catch(e){} });_satellite["_runScript8"](function(event, target, Promise) { var 
bingPixelId=_satellite.getVar("processed:BingPixelJSON"),pageType=_satellite.getVar("processed:PageType"),template=_satellite.getVar("processed:Template");try{bingPixelId&&"x"!==bingPixelId.accountId&&function(e,t,a,n,i){var o,c,l;e[i]=e[i]||[],o=function(){var t={ti:bingPixelId.accountId};t.q=e[i],e[i]=new UET(t),e[i].push("pageLoad")},(c=t.createElement(a)).src=n,c.async=1,c.onload=c.onreadystatechange=function(){var e=this.readyState;e&&"loaded"!==e&&"complete"!==e||(o(),c.onload=c.onreadystatechange=null)},(l=t.getElementsByTagName(a)[0]).parentNode.insertBefore(c,l)}(window,document,"script","//bat.bing.com/bat.js","uetq")}catch(e){} });_satellite["_runScript9"](function(event, target, Promise) { var pinterestPixelId=_satellite.getVar("processed:PinterestPixelJSON"),pageType=_satellite.getVar("processed:PageType"),template=_satellite.getVar("processed:Template");try{pinterestPixelId&&"x"!==pinterestPixelId.accountId&&(!function(e){if(!window.pintrk){window.pintrk=function(){window.pintrk.queue.push(Array.prototype.slice.call(arguments))};var t=window.pintrk;t.queue=[],t.version="3.0";var r=document.createElement("script");r.async=!0,r.src=e;var i=document.getElementsByTagName("script")[0];i.parentNode.insertBefore(r,i)}}("https://s.pinimg.com/ct/core.js"),pintrk("load",pinterestPixelId.accountId),pintrk("page"))}catch(e){} }); var janrainUUID=_satellite.getVar("processed:UserScreenNameJanrainUUID"),loggedIn=_satellite.getVar("processed:UserLoggedInState"),entitled=_satellite.getVar("processed:Entitlement"),siteLevelUserId=_satellite.getVar("processed:SiteLevelUserId"),hubLevelUserId=_satellite.getVar("processed:HubLevelUserId"),scrollIncrement=0,AMCID=_satellite.getVar("processed:VisitorID"),wordCount=_satellite.getVar("var:WordCount"),plan="";"yes"===loggedIn&&(plan="no"===entitled?"registered":"subscribed"),function(e,t,o){var 
r=o.location.protocol,i=t+"-"+e,d=o.getElementById(i),c=o.getElementById(t+"-root"),l="https:"===r?"d1z2jf7jlzjs58.cloudfront.net":"static."+t+".com";d||((d=o.createElement(e)).id=i,d.async=!0,d.src=r+"//"+l+"/p.js",c.appendChild(d))}("script","parsely",document);try{function trackScroll(e,t){PARSELY.beacon&&PARSELY.beacon.trackPageView({action:"_scroll",data:{_scrollIncrement:e,_scrollMethod:t,_y:Math.round(window.scrollY),_bodyHeight:window.document.body.clientHeight,_articleTop:window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking')?Math.round(window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking').getBoundingClientRect().top+window.scrollY):void 0,_articleBottom:window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking')?Math.round(window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking').getBoundingClientRect().bottom+window.scrollY):void 0,_articleMidway:window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking')?Math.round(window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking').getBoundingClientRect().top+window.scrollY+window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking').clientHeight/2):void 0}})}window.PARSELY=window.PARSELY||{autotrack:!1,video:{autotrack:!1},onload:function(){PARSELY.updateDefaults({data:{plan:plan,janrain_uuid:janrainUUID,site_level_uuid:siteLevelUserId,hub_level_uuid:hubLevelUserId,adobe_mcid:AMCID,word_count:wordCount}}),PARSELY.beacon.trackPageView({url:window.location.href,urlref:document.referrer,data:{_scrollIncrement:0,_scrollMethod:"pageview",_y:Math.round(window.scrollY),_bodyHeight:window.document.body.clientHeight,_articleTop:window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking')?Math.round(window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking').getBoundingClientRect().top+window.scrollY):void 
0,_articleBottom:window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking')?Math.round(window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking').getBoundingClientRect().bottom+window.scrollY):void 0,_articleMidway:window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking')?Math.round(window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking').getBoundingClientRect().top+window.scrollY+window.document.querySelector('div[class*="asset-body"],div#SA_article_tracking').clientHeight/2):void 0},js:1})},onHeartbeat:function(){scrollIncrement++,scrollMethod="heartbeat",trackScroll(scrollIncrement,scrollMethod)}},window.setInterval((function(){scrollIncrement++,scrollMethod="setinterval",trackScroll(scrollIncrement,scrollMethod)}),1e4)}catch(e){} _satellite["_runScript10"](function(event, target, Promise) { setTimeout((function(){if("true"===sessionStorage.getItem("createAccountSubmittedP")&&("thestar|page|create-account-traditional"!==_satellite.getVar("processed:PageName")||!window.document.querySelector("#system_errors"))){function e(t){window.PARSELY&&window.PARSELY.beacon?(PARSELY.conversions.trackLeadCapture("registration-success"),sessionStorage.removeItem("createAccountSubmittedP")):t<20&&setTimeout((function(){e(++t)}),300)}e(1)}}),500); });_satellite["_runScript11"](function(event, target, Promise) { var ele,elelist,pageType=_satellite.getVar("processed:PageType"),subPageType=_satellite.getVar("processed:SubPageType"),channel=_satellite.getVar("processed:Channel");if(window.document.querySelector("#site-top-nav-container")&&(ele=window.document.querySelector("#site-top-nav-container")).setAttribute("data-lpos","header"),window.document.querySelector("#site-header-container")&&(ele=window.document.querySelector("#site-header-container")).setAttribute("data-lpos","header"),window.document.querySelector("#main-navigation 

      When the website is resized, the layout does not change (it is unresponsive), which means it is not robust.

  3. classroom.google.com classroom.google.com
    1. According to all known laws of aviation,

      there is no way a bee should be able to fly.

      Its wings are too small to get its fat little body off the ground.

      The bee, of course, flies anyway

      because bees don't care what humans think is impossible.

      Yellow, black. Yellow, black. Yellow, black. Yellow, black.

      Ooh, black and yellow! Let's shake it up a little.

      Barry! Breakfast is ready!

      Coming!

      Hang on a second.

      Hello?

      Barry?

      Adam?

      Can you believe this is happening?

      I can't. I'll pick you up.

      Looking sharp.

      Use the stairs. Your father paid good money for those.

      Sorry. I'm excited.

      Here's the graduate. We're very proud of you, son.

      A perfect report card, all B's.

      Very proud.

      Ma! I got a thing going here.

      You got lint on your fuzz.

      Ow! That's me!

      Wave to us! We'll be in row 118,000.

      Bye!

      Barry, I told you, stop flying in the house!

      Hey, Adam.

      Hey, Barry.

      Is that fuzz gel?

      A little. Special day, graduation.

      Never thought I'd make it.

      Three days grade school, three days high school.

      Those were awkward.

      Three days college. I'm glad I took a day and hitchhiked around the hive.

      You did come back different.

      Hi, Barry.

      Artie, growing a mustache? Looks good.

      Hear about Frankie?

      Yeah.

      You going to the funeral?

      No, I'm not going.

      Everybody knows, sting someone, you die.

      Don't waste it on a squirrel. Such a hothead.

      I guess he could have just gotten out of the way.

      I love this incorporating an amusement park into our day.

      That's why we don't need vacations.

      Boy, quite a bit of pomp… under the circumstances.

      Well, Adam, today we are men.

      We are!

      Bee-men.

      Amen!

      Hallelujah!

      Students, faculty, distinguished bees,

      please welcome Dean Buzzwell.

      Welcome, New Hive City graduating class of…

      …9:15.

      That concludes our ceremonies.

      And begins your career at Honex Industries!

      Will we pick our job today?

      I heard it's just orientation.

      Heads up! Here we go.

      Keep your hands and antennas inside the tram at all times.

      Wonder what it'll be like? A little scary. Welcome to Honex, a division of Honesco

      and a part of the Hexagon Group.

      This is it!

      Wow.

      Wow.

      We know that you, as a bee, have worked your whole life

      to get to the point where you can work for your whole life.

      Honey begins when our valiant Pollen Jocks bring the nectar to the hive.

      Our top-secret formula

      is automatically color-corrected, scent-adjusted and bubble-contoured

      into this soothing sweet syrup

      with its distinctive golden glow you know as…

      Honey!

      That girl was hot.

      She's my cousin!

      She is?

      Yes, we're all cousins.

      Right. You're right.

      At Honex, we constantly strive

      to improve every aspect of bee existence.

      These bees are stress-testing a new helmet technology.

      What do you think he makes? Not enough. Here we have our latest advancement, the Krelman.

      What does that do? Catches that little strand of honey that hangs after you pour it. Saves us millions.

      Can anyone work on the Krelman?

      Of course. Most bee jobs are small ones. But bees know

      that every small job, if it's done well, means a lot.

      But choose carefully

      because you'll stay in the job you pick for the rest of your life.

      The same job the rest of your life? I didn't know that.

      What's the difference?

      You'll be happy to know that bees, as a species, haven't had one day off

      in 27 million years.

      So you'll just work us to death?

      We'll sure try.

      Wow! That blew my mind!

      "What's the difference?" How can you say that?

      One job forever? That's an insane choice to have to make.

      I'm relieved. Now we only have to make one decision in life.

      But, Adam, how could they never have told us that?

      Why would you question anything? We're bees.

      We're the most perfectly functioning society on Earth.

      You ever think maybe things work a little too well here?

      Like what? Give me one example.

      I don't know. But you know what I'm talking about.

      Please clear the gate. Royal Nectar Force on approach.

      Wait a second. Check it out.

      Hey, those are Pollen Jocks! Wow. I've never seen them this close.

      They know what it's like outside the hive.

      Yeah, but some don't come back.

      Hey, Jocks! Hi, Jocks! You guys did great!

      You're monsters! You're sky freaks! I love it! I love it!

      I wonder where they were. I don't know. Their day's not planned.

      Outside the hive, flying who knows where, doing who knows what.

      You can't just decide to be a Pollen Jock. You have to be bred for that.

      Right.

      Look. That's more pollen than you and I will see in a lifetime.

      It's just a status symbol. Bees make too much of it.

      Perhaps. Unless you're wearing it and the ladies see you wearing it.

      Those ladies? Aren't they our cousins too?

      Distant. Distant.

      Look at these two.

      Couple of Hive Harrys. Let's have fun with them. It must be dangerous being a Pollen Jock.

      Yeah. Once a bear pinned me against a mushroom!

      He had a paw on my throat, and with the other, he was slapping me!

      Oh, my! I never thought I'd knock him out. What were you doing during this?

      Trying to alert the authorities.

      I can autograph that.

      A little gusty out there today, wasn't it, comrades?

      Yeah. Gusty.

      We're hitting a sunflower patch six miles from here tomorrow.

      Six miles, huh? Barry! A puddle jump for us, but maybe you're not up for it.

      Maybe I am. You are not! We're going 0900 at J-Gate.

      What do you think, buzzy-boy? Are you bee enough?

      I might be. It all depends on what 0900 means.

      Hey, Honex!

      Dad, you surprised me.

      You decide what you're interested in?

      Well, there's a lot of choices. But you only get one. Do you ever get bored doing the same job every day?

      Son, let me tell you about stirring.

      You grab that stick, and you just move it around, and you stir it around.

      You get yourself into a rhythm. It's a beautiful thing.

      You know, Dad, the more I think about it,

      maybe the honey field just isn't right for me.

      You were thinking of what, making balloon animals?

      That's a bad job for a guy with a stinger.

      Janet, your son's not sure he wants to go into honey!

      Barry, you are so funny sometimes. I'm not trying to be funny. You're not funny! You're going into honey. Our son, the stirrer!

      You're gonna be a stirrer? No one's listening to me! Wait till you see the sticks I have.

      I could say anything right now. I'm gonna get an ant tattoo!

      Let's open some honey and celebrate!

      Maybe I'll pierce my thorax. Shave my antennae.

      Shack up with a grasshopper. Get a gold tooth and call everybody "dawg"!

      I'm so proud.

      We're starting work today! Today's the day. Come on! All the good jobs will be gone.

      Yeah, right.

      Pollen counting, stunt bee, pouring, stirrer, front desk, hair removal…

      Is it still available? Hang on. Two left! One of them's yours! Congratulations! Step to the side.

      What'd you get? Picking crud out. Stellar! Wow!

      Couple of newbies?

      Yes, sir! Our first day! We are ready!

      Make your choice.

      You want to go first? No, you go. Oh, my. What's available?

      Restroom attendant's open, not for the reason you think.

      Any chance of getting the Krelman? Sure, you're on. I'm sorry, the Krelman just closed out.

      Wax monkey's always open.

      The Krelman opened up again.

      What happened?

      A bee died. Makes an opening. See? He's dead. Another dead one.

      Deady. Deadified. Two more dead.

      Dead from the neck up. Dead from the neck down. That's life!

      Oh, this is so hard!

      Heating, cooling, stunt bee, pourer, stirrer,

      humming, inspector number seven, lint coordinator, stripe supervisor,

      mite wrangler. Barry, what do you think I should… Barry?

      Barry!

      All right, we've got the sunflower patch in quadrant nine…

      What happened to you? Where are you?

      I'm going out.

      Out? Out where?

      Out there.

      Oh, no!

      I have to, before I go to work for the rest of my life.

      You're gonna die! You're crazy! Hello?

      Another call coming in.

      If anyone's feeling brave, there's a Korean deli on 83rd

      that gets their roses today.

      Hey, guys.

      Look at that. Isn't that the kid we saw yesterday? Hold it, son, flight deck's restricted.

      It's OK, Lou. We're gonna take him up.

      Really? Feeling lucky, are you?

      Sign here, here. Just initial that.

      Thank you. OK. You got a rain advisory today,

      and as you all know, bees cannot fly in rain.

      So be careful. As always, watch your brooms,

      hockey sticks, dogs, birds, bears and bats.

      Also, I got a couple of reports of root beer being poured on us.

      Murphy's in a home because of it, babbling like a cicada!

      That's awful. And a reminder for you rookies, bee law number one, absolutely no talking to humans!

      All right, launch positions!

      Buzz, buzz, buzz, buzz! Buzz, buzz, buzz, buzz! Buzz, buzz, buzz, buzz!

      Black and yellow!

      Hello!

      You ready for this, hot shot?

      Yeah. Yeah, bring it on.

      Wind, check.

      Antennae, check.

      Nectar pack, check.

      Wings, check.

      Stinger, check.

      Scared out of my shorts, check.

      OK, ladies,

      let's move it out!

      Pound those petunias, you striped stem-suckers!

      All of you, drain those flowers!

      Wow! I'm out!

      I can't believe I'm out!

      So blue.

      I feel so fast and free!

      Box kite!

      Wow!

      Flowers!

      This is Blue Leader. We have roses visual.

      Bring it around 30 degrees and hold.

      Roses!

      30 degrees, roger. Bringing it around.

      Stand to the side, kid. It's got a bit of a kick.

      That is one nectar collector!

      Ever see pollination up close? No, sir. I pick up some pollen here, sprinkle it over here. Maybe a dash over there,

      a pinch on that one. See that? It's a little bit of magic.

      That's amazing. Why do we do that?

      That's pollen power. More pollen, more flowers, more nectar, more honey for us.

      Cool.

      I'm picking up a lot of bright yellow. Could be daisies. Don't we need those?

      Copy that visual.

      Wait. One of these flowers seems to be on the move.

      Say again? You're reporting a moving flower?

      Affirmative.

      That was on the line!

      This is the coolest. What is it?

      I don't know, but I'm loving this color.

      It smells good. Not like a flower, but I like it.

      Yeah, fuzzy.

      Chemical-y.

      Careful, guys. It's a little grabby.

      My sweet lord of bees!

      Candy-brain, get off there!

      Problem!

      Guys! This could be bad. Affirmative.

      Very close.

      Gonna hurt.

      Mama's little boy.

      You are way out of position, rookie!

      Coming in at you like a missile!

      Help me!

      I don't think these are flowers.

      Should we tell him? I think he knows. What is this?!

      Match point!

      You can start packing up, honey, because you're about to eat it!

      Yowser!

      Gross.

      There's a bee in the car!

      Do something!

      I'm driving!

      Hi, bee.

      He's back here!

      He's going to sting me!

      Nobody move. If you don't move, he won't sting you. Freeze!

      He blinked!

      Spray him, Granny!

      What are you doing?!

      Wow… the tension level out here is unbelievable.

      I gotta get home.

      Can't fly in rain.

      Can't fly in rain.

      Can't fly in rain.

      Mayday! Mayday! Bee going down!

      Ken, could you close the window please?

      Ken, could you close the window please?

      Check out my new resume. I made it into a fold-out brochure.

      You see? Folds out.

      Oh, no. More humans. I don't need this.

      What was that?

      Maybe this time. This time. This time. This time! This time! This…

      Drapes!

      That is diabolical.

      It's fantastic. It's got all my special skills, even my top-ten favorite movies.

      What's number one? Star Wars?

      Nah, I don't go for that…

      …kind of stuff.

      No wonder we shouldn't talk to them. They're out of their minds.

      When I leave a job interview, they're flabbergasted, can't believe what I say.

      There's the sun. Maybe that's a way out.

      I don't remember the sun having a big 75 on it.

      I predicted global warming.

      I could feel it getting hotter. At first I thought it was just me.

      Wait! Stop! Bee!

      Stand back. These are winter boots.

      Wait!

      Don't kill him!

      You know I'm allergic to them! This thing could kill me!

      Why does his life have less value than yours?

      Why does his life have any less value than mine? Is that your statement?

      I'm just saying all life has value. You don't know what he's capable of feeling.

      My brochure!

      There you go, little guy.

      I'm not scared of him. It's an allergic thing.

      Put that on your resume brochure.

      My whole face could puff up.

      Make it one of your special skills.

      Knocking someone out is also a special skill.

      Right. Bye, Vanessa. Thanks.

      Vanessa, next week? Yogurt night?

      Sure, Ken. You know, whatever.

      You could put carob chips on there.

      Bye.

      Supposed to be less calories.

      Bye.

      I gotta say something.

      She saved my life. I gotta say something.

      All right, here it goes.

      Nah.

      What would I say?

      I could really get in trouble.

      It's a bee law. You're not supposed to talk to a human.

      I can't believe I'm doing this.

      I've got to.

      Oh, I can't do it. Come on!

      No. Yes. No.

      Do it. I can't.

      How should I start it? "You like jazz?" No, that's no good.

      Here she comes! Speak, you fool!

      Hi!

      I'm sorry.

      You're talking. Yes, I know. You're talking!

      I'm so sorry.

      No, it's OK. It's fine. I know I'm dreaming.

      But I don't recall going to bed.

      Well, I'm sure this is very disconcerting.

      This is a bit of a surprise to me. I mean, you're a bee!

      I am. And I'm not supposed to be doing this,

      but they were all trying to kill me.

      And if it wasn't for you…

      I had to thank you. It's just how I was raised.

      That was a little weird.

      I'm talking with a bee. Yeah. I'm talking to a bee. And the bee is talking to me!

      I just want to say I'm grateful. I'll leave now.

      Wait! How did you learn to do that? What? The talking thing.

      Same way you did, I guess. "Mama, Dada, honey." You pick it up.

      That's very funny. Yeah. Bees are funny. If we didn't laugh, we'd cry with what we have to deal with.

      Anyway…

      Can I…

      …get you something?

      Like what? I don't know. I mean… I don't know. Coffee?

      I don't want to put you out.

      It's no trouble. It takes two minutes.

      It's just coffee.

      I hate to impose.

      Don't be ridiculous!

      Actually, I would love a cup.

      Hey, you want rum cake?

      I shouldn't.

      Have some.

      No, I can't.

      Come on!

      I'm trying to lose a couple micrograms.

      Where? These stripes don't help. You look great!

      I don't know if you know anything about fashion.

      Are you all right?

      No.

      He's making the tie in the cab as they're flying up Madison.

      He finally gets there.

      He runs up the steps into the church. The wedding is on.

      And he says, "Watermelon? I thought you said Guatemalan.

      Why would I marry a watermelon?"

      Is that a bee joke?

      That's the kind of stuff we do.

      Yeah, different.

      So, what are you gonna do, Barry?

      About work? I don't know.

      I want to do my part for the hive, but I can't do it the way they want.

      I know how you feel.

      You do? Sure. My parents wanted me to be a lawyer or a doctor, but I wanted to be a florist.

      Really? My only interest is flowers. Our new queen was just elected with that same campaign slogan.

      Anyway, if you look…

      There's my hive right there. See it?

      You're in Sheep Meadow!

      Yes! I'm right off the Turtle Pond!

      No way! I know that area. I lost a toe ring there once.

      Why do girls put rings on their toes?

      Why not?

      It's like putting a hat on your knee.

      Maybe I'll try that.

      You all right, ma'am?

      Oh, yeah. Fine.

      Just having two cups of coffee!

      Anyway, this has been great. Thanks for the coffee.

      Yeah, it's no trouble.

      Sorry I couldn't finish it. If I did, I'd be up the rest of my life.

      Are you…?

      Can I take a piece of this with me?

      Sure! Here, have a crumb.

      Thanks! Yeah. All right. Well, then… I guess I'll see you around.

      Or not.

      OK, Barry.

      And thank you so much again… for before.

      Oh, that? That was nothing.

      Well, not nothing, but… Anyway…

      This can't possibly work.

      He's all set to go. We may as well try it.

      OK, Dave, pull the chute.

      Sounds amazing. It was amazing! It was the scariest, happiest moment of my life.

      Humans! I can't believe you were with humans!

      Giant, scary humans! What were they like?

      Huge and crazy. They talk crazy.

      They eat crazy giant things. They drive crazy.

      Do they try and kill you, like on TV?

      Some of them. But some of them don't.

      How'd you get back?

      Poodle.

      You did it, and I'm glad. You saw whatever you wanted to see.

      You had your "experience." Now you can pick out your job and be normal.

      Well… Well? Well, I met someone.

      You did? Was she Bee-ish?

      A wasp?! Your parents will kill you!

      No, no, no, not a wasp.

      Spider?

      I'm not attracted to spiders.

      I know it's the hottest thing, with the eight legs and all.

      I can't get by that face.

      So who is she?

      She's… human.

      No, no. That's a bee law. You wouldn't break a bee law.

      Her name's Vanessa. Oh, boy. She's so nice. And she's a florist!

      Oh, no! You're dating a human florist!

      We're not dating.

      You're flying outside the hive, talking to humans that attack our homes

      with power washers and M-80s! One-eighth a stick of dynamite!

      She saved my life! And she understands me.

      This is over!

      Eat this.

      This is not over! What was that?

      They call it a crumb. It was so stingin' stripey! And that's not what they eat. That's what falls off what they eat!

      You know what a Cinnabon is? No. It's bread and cinnamon and frosting. They heat it up…

      Sit down!

      …really hot!

      Listen to me! We are not them! We're us. There's us and there's them!

      Yes, but who can deny the heart that is yearning?

      There's no yearning. Stop yearning. Listen to me!

      You have got to start thinking bee, my friend. Thinking bee!

      Thinking bee. Thinking bee. Thinking bee! Thinking bee! Thinking bee! Thinking bee!

      There he is. He's in the pool.

      You know what your problem is, Barry?

      I gotta start thinking bee?

      How much longer will this go on?

      It's been three days! Why aren't you working?

      I've got a lot of big life decisions to think about.

      What life? You have no life! You have no job. You're barely a bee!

      Would it kill you to make a little honey?

      Barry, come out. Your father's talking to you.

      Martin, would you talk to him?

      Barry, I'm talking to you!

      You coming?

      Got everything?

      All set!

      Go ahead. I'll catch up.

      Don't be too long.

      Watch this!

      Vanessa!

      We're still here. I told you not to yell at him. He doesn't respond to yelling!

      Then why yell at me? Because you don't listen! I'm not listening to this.

      Sorry, I've gotta go.

      Where are you going? I'm meeting a friend. A girl? Is this why you can't decide?

      Bye.

      I just hope she's Bee-ish.

      They have a huge parade of flowers every year in Pasadena?

      To be in the Tournament of Roses, that's every florist's dream!

      Up on a float, surrounded by flowers, crowds cheering.

      A tournament. Do the roses compete in athletic events?

      No. All right, I've got one. How come you don't fly everywhere?

      It's exhausting. Why don't you run everywhere? It's faster.

      Yeah, OK, I see, I see. All right, your turn.

      TiVo. You can just freeze live TV? That's insane!

      You don't have that?

      We have Hivo, but it's a disease. It's a horrible, horrible disease.

      Oh, my.

      Dumb bees!

      You must want to sting all those jerks.

      We try not to sting. It's usually fatal for us.

      So you have to watch your temper.

      Very carefully. You kick a wall, take a walk,

      write an angry letter and throw it out. Work through it like any emotion:

      Anger, jealousy, lust.

      Oh, my goodness! Are you OK?

      Yeah.

      What is wrong with you?! It's a bug. He's not bothering anybody. Get out of here, you creep!

      What was that? A Pic 'N' Save circular?

      Yeah, it was. How did you know?

      It felt like about 10 pages. Seventy-five is pretty much our limit.

      You've really got that down to a science.

      I lost a cousin to Italian Vogue. I'll bet. What in the name of Mighty Hercules is this?

      How did this get here? Cute Bee, Golden Blossom,

      Ray Liotta Private Select?

      Is he that actor?

      I never heard of him.

      Why is this here?

      For people. We eat it.

      You don't have enough food of your own?

      Well, yes.

      How do you get it?

      Bees make it.

      I know who makes it!

      And it's hard to make it!

      There's heating, cooling, stirring. You need a whole Krelman thing!

      It's organic. It's our-ganic! It's just honey, Barry.

      Just what?!

      Bees don't know about this! This is stealing! A lot of stealing!

      You've taken our homes, schools, hospitals! This is all we have!

      And it's on sale?! I'm getting to the bottom of this.

      I'm getting to the bottom of all of this!

      Hey, Hector.

      You almost done? Almost. He is here. I sense it.

      Well, I guess I'll go home now

      and just leave this nice honey out, with no one around.

      You're busted, box boy!

      I knew I heard something. So you can talk!

      I can talk. And now you'll start talking!

      Where you getting the sweet stuff? Who's your supplier?

      I don't understand. I thought we were friends.

      The last thing we want to do is upset bees!

      You're too late! It's ours now!

      You, sir, have crossed the wrong sword!

      You, sir, will be lunch for my iguana, Ignacio!

      Where is the honey coming from?

      Tell me where!

      Honey Farms! It comes from Honey Farms!

      Crazy person!

      What horrible thing has happened here?

      These faces, they never knew what hit them. And now

      they're on the road to nowhere!

      Just keep still.

      What? You're not dead?

      Do I look dead? They will wipe anything that moves. Where you headed?

      To Honey Farms. I am onto something huge here.

      I'm going to Alaska. Moose blood, crazy stuff. Blows your head off!

      I'm going to Tacoma.

      And you? He really is dead. All right.

      Uh-oh!

      What is that?!

      Oh, no!

      A wiper! Triple blade!

      Triple blade?

      Jump on! It's your only chance, bee!

      Why does everything have to be so doggone clean?!

      How much do you people need to see?!

      Open your eyes! Stick your head out the window!

      From NPR News in Washington, I'm Carl Kasell.

      But don't kill no more bugs!

      Bee!

      Moose blood guy!!

      You hear something?

      Like what?

      Like tiny screaming.

      Turn off the radio.

      Whassup, bee boy?

      Hey, Blood.

      Just a row of honey jars, as far as the eye could see.

      Wow!

      I assume wherever this truck goes is where they're getting it.

      I mean, that honey's ours.

      Bees hang tight. We're all jammed in. It's a close community.

      Not us, man. We on our own. Every mosquito on his own.

      What if you get in trouble? You a mosquito, you in trouble. Nobody likes us. They just smack. See a mosquito, smack, smack!

      At least you're out in the world. You must meet girls.

      Mosquito girls try to trade up, get with a moth, dragonfly.

      Mosquito girl don't want no mosquito.

      You got to be kidding me!

      Mooseblood's about to leave the building! So long, bee!

      Hey, guys! Mooseblood! I knew I'd catch y'all down here. Did you bring your crazy straw?

      We throw it in jars, slap a label on it, and it's pretty much pure profit.

      What is this place?

      A bee's got a brain the size of a pinhead.

      They are pinheads!

      Pinhead.

      Check out the new smoker. Oh, sweet. That's the one you want. The Thomas 3000!

      Smoker?

      Ninety puffs a minute, semi-automatic. Twice the nicotine, all the tar.

      A couple breaths of this knocks them right out.

      They make the honey, and we make the money.

      "They make the honey, and we make the money"?

      Oh, my!

      What's going on? Are you OK?

      Yeah. It doesn't last too long.

      Do you know you're in a fake hive with fake walls?

      Our queen was moved here. We had no choice.

      This is your queen? That's a man in women's clothes!

      That's a drag queen!

      What is this?

      Oh, no!

      There's hundreds of them!

      Bee honey.

      Our honey is being brazenly stolen on a massive scale!

      This is worse than anything bears have done! I intend to do something.

      Oh, Barry, stop.

      Who told you humans are taking our honey? That's a rumor.

      Do these look like rumors?

      That's a conspiracy theory. These are obviously doctored photos.

      How did you get mixed up in this?

      He's been talking to humans.

      What? Talking to humans?! He has a human girlfriend. And they make out!

      Make out? Barry!

      We do not.

      You wish you could. Whose side are you on? The bees!

      I dated a cricket once in San Antonio. Those crazy legs kept me up all night.

      Barry, this is what you want to do with your life?

      I want to do it for all our lives. Nobody works harder than bees!

      Dad, I remember you coming home so overworked

      your hands were still stirring. You couldn't stop.

      I remember that.

      What right do they have to our honey?

      We live on two cups a year. They put it in lip balm for no reason whatsoever!

      Even if it's true, what can one bee do?

      Sting them where it really hurts.

      In the face! The eye!

      That would hurt. No. Up the nose? That's a killer.

      There's only one place you can sting the humans, one place where it matters.

      Hive at Five, the hive's only full-hour action news source.

      No more bee beards!

      With Bob Bumble at the anchor desk.

      Weather with Storm Stinger.

      Sports with Buzz Larvi.

      And Jeanette Chung.

      Good evening. I'm Bob Bumble. And I'm Jeanette Chung. A tri-county bee, Barry Benson,

      intends to sue the human race for stealing our honey,

      packaging it and profiting from it illegally!

      Tomorrow night on Bee Larry King,

      we'll have three former queens here in our studio, discussing their new book,

      Classy Ladies, out this week on Hexagon.

      Tonight we're talking to Barry Benson.

      Did you ever think, "I'm a kid from the hive. I can't do this"?

      Bees have never been afraid to change the world.

      What about Bee Columbus? Bee Gandhi? Bejesus?

      Where I'm from, we'd never sue humans.

      We were thinking of stickball or candy stores.

      How old are you?

      The bee community is supporting you in this case,

      which will be the trial of the bee century.

      You know, they have a Larry King in the human world too.

      It's a common name. Next week…

      He looks like you and has a show and suspenders and colored dots…

      Next week…

      Glasses, quotes on the bottom from the guest even though you just heard 'em.

      Bear Week next week! They're scary, hairy and here live.

      Always leans forward, pointy shoulders, squinty eyes, very Jewish.

      In tennis, you attack at the point of weakness!

      It was my grandmother, Ken. She's 81.

      Honey, her backhand's a joke! I'm not gonna take advantage of that?

      Quiet, please. Actual work going on here.

      Is that that same bee? Yes, it is! I'm helping him sue the human race.

      Hello. Hello, bee. This is Ken.

      Yeah, I remember you. Timberland, size ten and a half. Vibram sole, I believe.

      Why does he talk again?

      Listen, you better go 'cause we're really busy working.

      But it's our yogurt night!

      Bye-bye.

      Why is yogurt night so difficult?!

      You poor thing. You two have been at this for hours!

      Yes, and Adam here has been a huge help.

      Frosting… How many sugars? Just one. I try not to use the competition.

      So why are you helping me?

      Bees have good qualities.

      And it takes my mind off the shop.

      Instead of flowers, people are giving balloon bouquets now.

      Those are great, if you're three.

      And artificial flowers.

      Oh, those just get me psychotic! Yeah, me too. Bent stingers, pointless pollination.

      Bees must hate those fake things!

      Nothing worse than a daffodil that's had work done.

      Maybe this could make up for it a little bit.

      This lawsuit's a pretty big deal. I guess. You sure you want to go through with it?

      Am I sure? When I'm done with the humans, they won't be able

      to say, "Honey, I'm home," without paying a royalty!

      It's an incredible scene here in downtown Manhattan,

      where the world anxiously waits, because for the first time in history,

      we will hear for ourselves if a honeybee can actually speak.

      What have we gotten into here, Barry?

      It's pretty big, isn't it?

      I can't believe how many humans don't work during the day.

      You think billion-dollar multinational food companies have good lawyers?

      Everybody needs to stay behind the barricade.

      What's the matter? I don't know, I just got a chill. Well, if it isn't the bee team.

      You boys work on this?

      All rise! The Honorable Judge Bumbleton presiding.

      All right. Case number 4475,

      Superior Court of New York, Barry Bee Benson v. the Honey Industry

      is now in session.

      Mr. Montgomery, you're representing the five food companies collectively?

      A privilege.

      Mr. Benson… you're representing all the bees of the world?

      I'm kidding. Yes, Your Honor, we're ready to proceed.

      Mr. Montgomery, your opening statement, please.

      Ladies and gentlemen of the jury,

      my grandmother was a simple woman.

      Born on a farm, she believed it was man's divine right

      to benefit from the bounty of nature God put before us.

      If we lived in the topsy-turvy world Mr. Benson imagines,

      just think of what would it mean.

      I would have to negotiate with the silkworm

      for the elastic in my britches!

      Talking bee!

      How do we know this isn't some sort of

      holographic motion-picture-capture Hollywood wizardry?

      They could be using laser beams!

      Robotics! Ventriloquism! Cloning! For all we know,

      he could be on steroids!

      Mr. Benson?

      Ladies and gentlemen, there's no trickery here.

      I'm just an ordinary bee. Honey's pretty important to me.

      It's important to all bees. We invented it!

      We make it. And we protect it with our lives.

      Unfortunately, there are some people in this room

      who think they can take it from us

      'cause we're the little guys! I'm hoping that, after this is all over,

      you'll see how, by taking our honey, you not only take everything we have

      but everything we are!

      I wish he'd dress like that all the time. So nice!

      Call your first witness.

      So, Mr. Klauss Vanderhayden of Honey Farms, big company you have.

      I suppose so.

      I see you also own Honeyburton and Honron!

      Yes, they provide beekeepers for our farms.

      Beekeeper. I find that to be a very disturbing term.

      I don't imagine you employ any bee-free-ers, do you?

      No.

      I couldn't hear you.

      No.

      No.

      Because you don't free bees. You keep bees. Not only that,

      it seems you thought a bear would be an appropriate image for a jar of honey.

      They're very lovable creatures.

      Yogi Bear, Fozzie Bear, Build-A-Bear.

      You mean like this?

      Bears kill bees!

      How'd you like his head crashing through your living room?!

      Biting into your couch! Spitting out your throw pillows!

      OK, that's enough. Take him away.

      So, Mr. Sting, thank you for being here. Your name intrigues me.

      Where have I heard it before? I was with a band called The Police. But you've never been a police officer, have you?

      No, I haven't.

      No, you haven't. And so here we have yet another example

      of bee culture casually stolen by a human

      for nothing more than a prance-about stage name.

      Oh, please.

      Have you ever been stung, Mr. Sting?

      Because I'm feeling a little stung, Sting.

      Or should I say… Mr. Gordon M. Sumner!

      That's not his real name?! You idiots!

      Mr. Liotta, first, belated congratulations on

      your Emmy win for a guest spot on ER in 2005.

      Thank you. Thank you.

      I see from your resume that you're devilishly handsome

      with a churning inner turmoil that's ready to blow.

      I enjoy what I do. Is that a crime?

      Not yet it isn't. But is this what it's come to for you?

      Exploiting tiny, helpless bees so you don't

      have to rehearse your part and learn your lines, sir?

      Watch it, Benson! I could blow right now!

      This isn't a goodfella. This is a badfella!

      Why doesn't someone just step on this creep, and we can all go home?!

      Order in this court! You're all thinking it! Order! Order, I say!

      Say it! Mr. Liotta, please sit down! I think it was awfully nice of that bear to pitch in like that.

      I think the jury's on our side.

      Are we doing everything right, legally?

      I'm a florist.

      Right. Well, here's to a great team.

      To a great team!

      Well, hello.

      Ken! Hello. I didn't think you were coming.

      No, I was just late. I tried to call, but… the battery.

      I didn't want all this to go to waste, so I called Barry. Luckily, he was free.

      Oh, that was lucky.

      There's a little left. I could heat it up.

      Yeah, heat it up, sure, whatever.

      So I hear you're quite a tennis player.

      I'm not much for the game myself. The ball's a little grabby.

      That's where I usually sit. Right… there.

      Ken, Barry was looking at your resume,

      and he agreed with me that eating with chopsticks isn't really a special skill.

      You think I don't see what you're doing?

      I know how hard it is to find the right job. We have that in common.

      Do we?

      Bees have 100 percent employment, but we do jobs like taking the crud out.

      That's just what I was thinking about doing.

      Ken, I let Barry borrow your razor for his fuzz. I hope that was all right.

      I'm going to drain the old stinger.

      Yeah, you do that.

      Look at that.

      You know, I've just about had it

      with your little mind games.

      What's that? Italian Vogue. Mamma mia, that's a lot of pages.

      A lot of ads.

      Remember what Van said, why is your life more valuable than mine?

      Funny, I just can't seem to recall that!

      I think something stinks in here!

      I love the smell of flowers.

      How do you like the smell of flames?!

      Not as much.

      Water bug! Not taking sides!

      Ken, I'm wearing a Chapstick hat! This is pathetic!

      I've got issues!

      Well, well, well, a royal flush!

      You're bluffing. Am I? Surf's up, dude!

      Poo water!

      That bowl is gnarly.

      Except for those dirty yellow rings!

      Kenneth! What are you doing?!

      You know, I don't even like honey! I don't eat it!

      We need to talk!

      He's just a little bee!

      And he happens to be the nicest bee I've met in a long time!

      Long time? What are you talking about?! Are there other bugs in your life?

      No, but there are other things bugging me in life. And you're one of them!

      Fine! Talking bees, no yogurt night…

      My nerves are fried from riding on this emotional roller coaster!

      Goodbye, Ken.

      And for your information,

      I prefer sugar-free, artificial sweeteners made by man!

      I'm sorry about all that.

      I know it's got an aftertaste! I like it!

      I always felt there was some kind of barrier between Ken and me.

      I couldn't overcome it. Oh, well.

      Are you OK for the trial?

      I believe Mr. Montgomery is about out of ideas.

      We would like to call Mr. Barry Benson Bee to the stand.

      Good idea! You can really see why he's considered one of the best lawyers…

      Yeah.

      Layton, you've gotta weave some magic

      with this jury, or it's gonna be all over.

      Don't worry. The only thing I have to do to turn this jury around

      is to remind them of what they don't like about bees.

      You got the tweezers? Are you allergic? Only to losing, son. Only to losing.

      Mr. Benson Bee, I'll ask you what I think we'd all like to know.

      What exactly is your relationship

      to that woman?

      We're friends.

      Good friends? Yes. How good? Do you live together?

      Wait a minute…

      Are you her little…

      …bedbug?

      I've seen a bee documentary or two. From what I understand,

      doesn't your queen give birth to all the bee children?

      Yeah, but…

      So those aren't your real parents!

      Oh, Barry…

      Yes, they are!

      Hold me back!

      You're an illegitimate bee, aren't you, Benson?

      He's denouncing bees!

      Don't y'all date your cousins?

      Objection! I'm going to pincushion this guy! Adam, don't! It's what he wants!

      Oh, I'm hit!!

      Oh, lordy, I am hit!

      Order! Order!

      The venom! The venom is coursing through my veins!

      I have been felled by a winged beast of destruction!

      You see? You can't treat them like equals! They're striped savages!

      Stinging's the only thing they know! It's their way!

      Adam, stay with me. I can't feel my legs. What angel of mercy will come forward to suck the poison

      from my heaving buttocks?

      I will have order in this court. Order!

      Order, please!

      The case of the honeybees versus the human race

      took a pointed turn against the bees

      yesterday when one of their legal team stung Layton T. Montgomery.

      Hey, buddy.

      Hey.

      Is there much pain?

      Yeah.

      I…

      I blew the whole case, didn't I?

      It doesn't matter. What matters is you're alive. You could have died.

      I'd be better off dead. Look at me.

      They got it from the cafeteria downstairs, in a tuna sandwich.

      Look, there's a little celery still on it.

      What was it like to sting someone?

      I can't explain it. It was all…

      All adrenaline and then… and then ecstasy!

      All right.

      You think it was all a trap?

      Of course. I'm sorry. I flew us right into this.

      What were we thinking? Look at us. We're just a couple of bugs in this world.

      What will the humans do to us if they win?

      I don't know.

      I hear they put the roaches in motels. That doesn't sound so bad.

      Adam, they check in, but they don't check out!

      Oh, my.

      Could you get a nurse to close that window?

      Why? The smoke. Bees don't smoke.

      Right. Bees don't smoke.

      Bees don't smoke! But some bees are smoking.

      That's it! That's our case!

      It is? It's not over?

      Get dressed. I've gotta go somewhere.

      Get back to the court and stall. Stall any way you can.

      And assuming you've done step correctly, you're ready for the tub.

      Mr. Flayman.

      Yes? Yes, Your Honor!

      Where is the rest of your team?

      Well, Your Honor, it's interesting.

      Bees are trained to fly haphazardly,

      and as a result, we don't make very good time.

      I actually heard a funny story about…

      Your Honor, haven't these ridiculous bugs

      taken up enough of this court's valuable time?

      How much longer will we allow these absurd shenanigans to go on?

      They have presented no compelling evidence to support their charges

      against my clients, who run legitimate businesses.

      I move for a complete dismissal of this entire case!

      Mr. Flayman, I'm afraid I'm going

      to have to consider Mr. Montgomery's motion.

      But you can't! We have a terrific case.

      Where is your proof? Where is the evidence?

      Show me the smoking gun!

      Hold it, Your Honor! You want a smoking gun?

      Here is your smoking gun.

      What is that?

      It's a bee smoker!

      What, this? This harmless little contraption?

      This couldn't hurt a fly, let alone a bee.

      Look at what has happened

      to bees who have never been asked, "Smoking or non?"

      Is this what nature intended for us?

      To be forcibly addicted to smoke machines

      and man-made wooden slat work camps?

      Living out our lives as honey slaves to the white man?

      What are we gonna do? He's playing the species card. Ladies and gentlemen, please, free these bees!

      Free the bees! Free the bees!

      Free the bees!

      Free the bees! Free the bees!

      The court finds in favor of the bees!

      Vanessa, we won!

      I knew you could do it! High-five!

      Sorry.

      I'm OK! You know what this means?

      All the honey will finally belong to the bees.

      Now we won't have to work so hard all the time.

      This is an unholy perversion of the balance of nature, Benson.

      You'll regret this.

      Barry, how much honey is out there?

      All right. One at a time.

      Barry, who are you wearing?

      My sweater is Ralph Lauren, and I have no pants.

      What if Montgomery's right? What do you mean? We've been living the bee way a long time, 27 million years.

      Congratulations on your victory. What will you demand as a settlement?

      First, we'll demand a complete shutdown of all bee work camps.

      Then we want back the honey that was ours to begin with,

      every last drop.

      We demand an end to the glorification of the bear as anything more

      than a filthy, smelly, bad-breath stink machine.

      We're all aware of what they do in the woods.

      Wait for my signal.

      Take him out.

      He'll have nauseous for a few hours, then he'll be fine.

      And we will no longer tolerate bee-negative nicknames…

      But it's just a prance-about stage name!

      …unnecessary inclusion of honey in bogus health products

      and la-dee-da human tea-time snack garnishments.

      Can't breathe.

      Bring it in, boys!

      Hold it right there! Good.

      Tap it.

      Mr. Buzzwell, we just passed three cups, and there's gallons more coming!

      I think we need to shut down! Shut down? We've never shut down. Shut down honey production!

      Stop making honey!

      Turn your key, sir!

      What do we do now?

      Cannonball!

      We're shutting honey production!

      Mission abort.

      Aborting pollination and nectar detail. Returning to base.

      Adam, you wouldn't believe how much honey was out there.

      Oh, yeah?

      What's going on? Where is everybody?

      Are they out celebrating? They're home. They don't know what to do. Laying out, sleeping in.

      I heard your Uncle Carl was on his way to San Antonio with a cricket.

      At least we got our honey back.

      Sometimes I think, so what if humans liked our honey? Who wouldn't?

      It's the greatest thing in the world! I was excited to be part of making it.

      This was my new desk. This was my new job. I wanted to do it really well.

      And now…

      Now I can't.

      I don't understand why they're not happy.

      I thought their lives would be better!

      They're doing nothing. It's amazing. Honey really changes people.

      You don't have any idea what's going on, do you?

      What did you want to show me? This. What happened here?

      That is not the half of it.

      Oh, no. Oh, my.

      They're all wilting.

      Doesn't look very good, does it?

      No.

      And whose fault do you think that is?

      You know, I'm gonna guess bees.

      Bees?

      Specifically, me.

      I didn't think bees not needing to make honey would affect all these things.

      It's not just flowers. Fruits, vegetables, they all need bees.

      That's our whole SAT test right there.

      Take away produce, that affects the entire animal kingdom.

      And then, of course…

      The human species?

      So if there's no more pollination,

      it could all just go south here, couldn't it?

      I know this is also partly my fault.

      How about a suicide pact?

      How do we do it?

      I'll sting you, you step on me. That just kills you twice. Right, right.

      Listen, Barry… sorry, but I gotta get going.

      I had to open my mouth and talk.

      Vanessa?

      Vanessa? Why are you leaving? Where are you going?

      To the final Tournament of Roses parade in Pasadena.

      They've moved it to this weekend because all the flowers are dying.

      It's the last chance I'll ever have to see it.

      Vanessa, I just wanna say I'm sorry. I never meant it to turn out like this.

      I know. Me neither.

      Tournament of Roses. Roses can't do sports.

      Wait a minute. Roses. Roses?

      Roses!

      Vanessa!

      Roses?!

      Barry?

      Roses are flowers! Yes, they are. Flowers, bees, pollen!

      I know. That's why this is the last parade.

      Maybe not. Could you ask him to slow down?

      Could you slow down?

      Barry!

      OK, I made a huge mistake. This is a total disaster, all my fault.

      Yes, it kind of is.

      I've ruined the planet. I wanted to help you

      with the flower shop. I've made it worse.

      Actually, it's completely closed down.

      I thought maybe you were remodeling.

      But I have another idea, and it's greater than my previous ideas combined.

      I don't want to hear it!

      All right, they have the roses, the roses have the pollen.

      I know every bee, plant and flower bud in this park.

      All we gotta do is get what they've got back here with what we've got.

      Bees.

      Park.

      Pollen!

      Flowers.

      Repollination!

      Across the nation!

      Tournament of Roses, Pasadena, California.

      They've got nothing but flowers, floats and cotton candy.

      Security will be tight.

      I have an idea.

      Vanessa Bloome, FTD.

      Official floral business. It's real.

      Sorry, ma'am. Nice brooch.

      Thank you. It was a gift.

      Once inside, we just pick the right float.

      How about The Princess and the Pea?

      I could be the princess, and you could be the pea!

      Yes, I got it.

      Where should I sit?

      What are you?

      I believe I'm the pea.

      The pea?

      It goes under the mattresses.

      Not in this fairy tale, sweetheart. I'm getting the marshal. You do that! This whole parade is a fiasco!

      Let's see what this baby'll do.

      Hey, what are you doing?!

      Then all we do is blend in with traffic…

      …without arousing suspicion.

      Once at the airport, there's no stopping us.

      Stop! Security.

      You and your insect pack your float? Yes. Has it been in your possession the entire time?

      Would you remove your shoes?

      Remove your stinger. It's part of me. I know. Just having some fun. Enjoy your flight.

      Then if we're lucky, we'll have just enough pollen to do the job.

      Can you believe how lucky we are? We have just enough pollen to do the job!

      I think this is gonna work.

      It's got to work.

      Attention, passengers, this is Captain Scott.

      We have a bit of bad weather in New York.

      It looks like we'll experience a couple hours delay.

      Barry, these are cut flowers with no water. They'll never make it.

      I gotta get up there and talk to them.

      Be careful.

      Can I get help with the Sky Mall magazine?

      I'd like to order the talking inflatable nose and ear hair trimmer.

      Captain, I'm in a real situation.

      What'd you say, Hal? Nothing. Bee!

      Don't freak out! My entire species…

      What are you doing?

      Wait a minute! I'm an attorney! Who's an attorney? Don't move.

      Oh, Barry.

      Good afternoon, passengers. This is your captain.

      Would a Miss Vanessa Bloome in 24B please report to the cockpit?

      And please hurry!

      What happened here?

      There was a DustBuster, a toupee, a life raft exploded.

      One's bald, one's in a boat, they're both unconscious!

      Is that another bee joke? No! No one's flying the plane!

      This is JFK control tower, Flight 356. What's your status?

      This is Vanessa Bloome. I'm a florist from New York.

      Where's the pilot?

      He's unconscious, and so is the copilot.

      Not good. Does anyone onboard have flight experience?

      As a matter of fact, there is.

      Who's that? Barry Benson. From the honey trial?! Oh, great.

      Vanessa, this is nothing more than a big metal bee.

      It's got giant wings, huge engines.

      I can't fly a plane.

      Why not? Isn't John Travolta a pilot? Yes. How hard could it be?

      Wait, Barry! We're headed into some lightning.

      This is Bob Bumble. We have some late-breaking news from JFK Airport,

      where a suspenseful scene is developing.

      Barry Benson, fresh from his legal victory…

      That's Barry!

      …is attempting to land a plane, loaded with people, flowers

      and an incapacitated flight crew.

      Flowers?!

      We have a storm in the area and two individuals at the controls

      with absolutely no flight experience.

      Just a minute. There's a bee on that plane.

      I'm quite familiar with Mr. Benson and his no-account compadres.

      They've done enough damage.

      But isn't he your only hope?

      Technically, a bee shouldn't be able to fly at all.

      Their wings are too small…

      Haven't we heard this a million times?

      "The surface area of the wings and body mass make no sense."

      Get this on the air!

      Got it.

      Stand by.

      We're going live.

      The way we work may be a mystery to you.

      Making honey takes a lot of bees doing a lot of small jobs.

      But let me tell you about a small job.

      If you do it well, it makes a big difference.

      More than we realized. To us, to everyone.

      That's why I want to get bees back to working together.

      That's the bee way! We're not made of Jell-O.

      We get behind a fellow.

      Black and yellow! Hello! Left, right, down, hover.

      Hover? Forget hover. This isn't so hard. Beep-beep! Beep-beep!

      Barry, what happened?!

      Wait, I think we were on autopilot the whole time.

      That may have been helping me. And now we're not! So it turns out I cannot fly a plane.

      All of you, let's get behind this fellow! Move it out!

      Move out!

      Our only chance is if I do what I'd do, you copy me with the wings of the plane!

      Don't have to yell.

      I'm not yelling! We're in a lot of trouble.

      It's very hard to concentrate with that panicky tone in your voice!

      It's not a tone. I'm panicking!

      I can't do this!

      Vanessa, pull yourself together. You have to snap out of it!

      You snap out of it.

      You snap out of it.

      You snap out of it!

      You snap out of it!

      You snap out of it!

      You snap out of it!

      You snap out of it!

      You snap out of it!

      Hold it!

      Why? Come on, it's my turn.

      How is the plane flying?

      I don't know.

      Hello?

      Benson, got any flowers for a happy occasion in there?

      The Pollen Jocks!

      They do get behind a fellow.

      Black and yellow. Hello. All right, let's drop this tin can on the blacktop.

      Where? I can't see anything. Can you?

      No, nothing. It's all cloudy.

      Come on. You got to think bee, Barry.

      Thinking bee. Thinking bee. Thinking bee! Thinking bee! Thinking bee!

      Wait a minute. I think I'm feeling something.

      What? I don't know. It's strong, pulling me. Like a 27-million-year-old instinct.

      Bring the nose down.

      Thinking bee! Thinking bee! Thinking bee!

      What in the world is on the tarmac? Get some lights on that! Thinking bee! Thinking bee! Thinking bee!

      Vanessa, aim for the flower. OK. Cut the engines. We're going in on bee power. Ready, boys?

      Affirmative!

      Good. Good. Easy, now. That's it.

      Land on that flower!

      Ready? Full reverse!

      Spin it around!

      Not that flower! The other one!

      Which one?

      That flower.

      I'm aiming at the flower!

      That's a fat guy in a flowered shirt. I mean the giant pulsating flower

      made of millions of bees!

      Pull forward. Nose down. Tail up.

      Rotate around it.

      This is insane, Barry! This's the only way I know how to fly. Am I koo-koo-kachoo, or is this plane flying in an insect-like pattern?

      Get your nose in there. Don't be afraid. Smell it. Full reverse!

      Just drop it. Be a part of it.

      Aim for the center!

      Now drop it in! Drop it in, woman!

      Come on, already.

      Barry, we did it! You taught me how to fly!

      Yes. No high-five! Right. Barry, it worked! Did you see the giant flower?

      What giant flower? Where? Of course I saw the flower! That was genius!

      Thank you. But we're not done yet. Listen, everyone!

      This runway is covered with the last pollen

      from the last flowers available anywhere on Earth.

      That means this is our last chance.

      We're the only ones who make honey, pollinate flowers and dress like this.

      If we're gonna survive as a species, this is our moment! What do you say?

      Are we going to be bees, or just Museum of Natural History keychains?

      We're bees!

      Keychain!

      Then follow me! Except Keychain.

      Hold on, Barry. Here.

      You've earned this.

      Yeah!

      I'm a Pollen Jock! And it's a perfect fit. All I gotta do are the sleeves.

      Oh, yeah.

      That's our Barry.

      Mom! The bees are back!

      If anybody needs to make a call, now's the time.

      I got a feeling we'll be working late tonight!

      Here's your change. Have a great afternoon! Can I help who's next?

      Would you like some honey with that? It is bee-approved. Don't forget these.

      Milk, cream, cheese, it's all me. And I don't see a nickel!

      Sometimes I just feel like a piece of meat!

      I had no idea.

      Barry, I'm sorry. Have you got a moment?

      Would you excuse me? My mosquito associate will help you.

      Sorry I'm late.

      He's a lawyer too?

      I was already a blood-sucking parasite. All I needed was a briefcase.

      Have a great afternoon!

      Barry, I just got this huge tulip order, and I can't get them anywhere.

      No problem, Vannie. Just leave it to me.

      You're a lifesaver, Barry. Can I help who's next?

      All right, scramble, jocks! It's time to fly.

      Thank you, Barry!

      That bee is living my life!

      Let it go, Kenny.

      When will this nightmare end?!

      Let it all go.

      Beautiful day to fly.

      Sure is.

      Between you and me, I was dying to get out of that office.

      You have got to start thinking bee, my friend.

      Thinking bee! Me? Hold it. Let's just stop for a second. Hold it.

      I'm sorry. I'm sorry, everyone. Can we stop here?

      I'm not making a major life decision during a production number!

      All right. Take ten, everybody. Wrap it up, guys.

      I had virtually no rehearsal for that.

    1. Author response:

      We would like to thank all the reviewers and editors for their thoughtful and detailed comments, critiques and suggestions. We will revise our manuscript in accordance with all the points raised by the reviewers. Here we summarize some of the main points that we intend to address in our revised manuscript.

      The reviewers noted that we were not sufficiently careful in identifying possible exogenous cues that the mice might be using to locate the food and that we did not consider why such cues might be ineffective. As the reviewers point out, the mice may be ignoring the visual landmarks (and floor scratches) because they are not reliable cues and their relation to the food varies with the entrance the mice have used. In particular, a reviewer refers to papers that show that “in environments with 'unreliable' landmarks, place cells are not controlled by landmarks”. These papers were known to the authors but failed to make the final cut of our extensive discussion. This important point will be thoroughly addressed.

      Another critical point was that the mice often exhibited thigmotaxis. The literature on thigmotaxis was known to us and we will now directly refer to this point. We do note that the final average start-to-food trajectory (TEV) leads directly to the food. In other words, the thigmotaxic trajectories and “towards the center” trajectories effectively average out.

      There was a very cogent point about the difficulty of totally eliminating odor cues that we will now address. Finally, based on studies using a virtual reality environment, one reviewer questioned the use of “path integration” as a signal that encodes goal location. The relevance of path integration to spatial learning and performance is a very difficult issue that, to our knowledge, has never been entirely settled in the vast spatial learning literature. We do not think that our data can “settle” this issue but will try to at least be explicit re the complexity of the path integration hypothesis as it applies to both our own data and the virtual reality literature. In particular, we will discuss the potential roles of optic flow versus proprioceptive and vestibular inputs to a putative path integration mechanism.

      Finally, the reviewers raised many important technical points re statistics reporting and how the figures are presented. In our revision, we will completely comply with all these helpful critiques.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      Chang et al. provide glutamate co-expression profiles in the central noradrenergic system and test the requirement of Vglut2-based glutamatergic release in respiratory and metabolic activity under physiologically relevant gas challenges. Their experiments provide compelling evidence that conditional deletion of Vglut2 in noradrenergic neurons does not impact steady-state breathing or metabolic activity in room air, hypercapnia, or hypoxia. This study provides an important contribution to our understanding of how noradrenergic neurons regulate respiratory homeostasis in conscious adult mice.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Chang et al. provide glutamate co-expression profiles in the central noradrenergic system and test the requirement of Vglut2-based glutamatergic release in respiratory and metabolic activity under physiologically relevant gas challenges. Their experiments show that conditional deletion of Vglut2 in NA neurons does not impact steady-state breathing or metabolic activity in room air, hypercapnia, or hypoxia. Their observations challenge the importance of glutamatergic signaling from Vglut2 expressing NA neurons in normal respiratory homeostasis in conscious adult mice.

      Strengths:

      The comprehensive Vglut1, Vglut2, and Vglut3 co-expression profiles in the central noradrenergic system and the combined measurements of breathing and oxygen consumption are two major strengths of this study. Observations from these experiments provide previously undescribed insights into (1) expression patterns for subtypes of the vesicular glutamate transporter protein in the noradrenergic system and (2) the dispensable nature of Vglut2-dependent glutamate signaling from noradrenergic neurons to breathing responses to physiologically relevant gas challenges in adult conscious mice.

      Weaknesses:

      Although the cellular expression profiles for the vesicular glutamate transporters are provided, the study fails to document that glutamatergic-based signaling originating from noradrenergic neurons is evident at the cellular level under normal, hypoxic, and/or hypercapnic conditions. This limits the reader's understanding of why conditional Vglut2 knockdown is dispensable for breathing under the conditions tested.

      We thank the reviewers for their positive evaluation of our work. First, we would like to highlight that multiple studies have provided anatomical evidence of innervation of multiple cardio-respiratory nuclei by Vglut2+ noradrenergic fibers. Thus, the anatomical substrates are present for noradrenergic-based Vglut2 signaling to either play a direct role in breathing control or, upon perturbation, to indirectly affect breathing through disrupted metabolic or cardiovascular control. We have included supplemental table 1 that summarizes central noradrenergic Vglut2+ innervations of respiratory and autonomic nuclei. Additionally, ultrastructural evidence shows asymmetric synaptic contacts, presumed to mediate glutamatergic transmission, between C1 neurons and LC, A1, A2 and the dorsal motor nucleus of the vagus (DMV) (Milner et al., 1989; Abbott et al., 2012; Holloway et al., 2013; DePuy et al., 2013).

      Functionally, electrophysiological evidence showed that photostimulating C1 neurons activates LC, A1, A2 noradrenergic neurons monosynaptically by releasing glutamate (Holloway et al., 2013; DePuy et al., 2013) and that optogenetic stimulation of LC neurons excites the downstream parabrachial nucleus (PBN) neurons by releasing glutamate. Thus, at least the glutamatergic signaling from C1 and LC noradrenergic neurons (two noradrenergic nuclei that have been shown to play a role in breathing control) is evident at the cellular level under normal conditions. Other evidence, highlighted in our manuscript, is more circumstantial.

      Reviewer #2 (Public Review):

      The authors characterized the recombinase-based cumulative fate maps for vesicular glutamate transporters (Vglut1, Vglut2 and Vglut3) expression and compared those maps to their real-time expression profiles in central NA neurons by RNA in situ hybridization in adult mice. The authors have revealed a new and intriguing expression pattern for Vglut2, along with an entirely uncharted co-expression domain for Vglut3 within central noradrenergic neurons. Interestingly, and in contrast to previous studies, the authors demonstrated that glutamatergic signaling in central noradrenergic neurons does not exert any influence on breathing and metabolic control either under normoxic/normocapnic conditions or after chemoreflex stimulation. Also, they showed for the first time the Vglut3-expressing NA population in C2/A2 nuclei. In addition, they were also able to demonstrate Vglut2 expression in anterior NA populations, such as LC neurons, by using more refined techniques, unlike previous studies.

      A major strength of the study is the use of a set of techniques to investigate the participation of NA-based glutamatergic signaling in breathing and metabolic control. The authors provided a full characterization of the recombinase-based cumulative fate maps for Vglut transporters. They performed real-time mRNA expression of Vglut transporters in central NA neurons of adult mice. Further, they evaluated the effect of knocking down Vglut2 expression in NA neurons using DBH-Cre; Vglut2 cKO mice on breathing and metabolic control in unanesthetized mice. Finally, they injected the AAV virus containing Cre-dependent tdTomato into the LC of Vglut2-Cre mice to verify the Vglut2 expression in LC-NA neurons. A very positive aspect of the article is that the authors combined ventilation with metabolic measurements. This integration holds particular significance, especially when delving into the exploration of respiratory chemosensitivity. Furthermore, the sample size of the experiments is excellent.

      Despite the clear strengths of the paper, some weaknesses exist. It is not clear in the manuscript if the experiments were performed in males and females and if the data were combined. I believe that the study would have benefited from a more comprehensive analysis exploring the sex specific differences. The reason I think this is particularly relevant is the developmental disorders mentioned by the authors, such as SIDS and Rett syndrome, which could potentially arise from disruptions in central noradrenergic (NA) function, exhibit varying degrees of sex predominance. Moreover, some of the noradrenergic cell groups are sexually dimorphic. For instance, female Wistar rats exhibit a larger LC size and more LC-NA neurons than male subjects (Pinos et al., 2001; Garcia-Falgueras et al., 2005). More recently, a detailed transcriptional profiling investigation has unveiled the identities of over 3,000 genes in the LC. This revelation has highlighted significant sexual dimorphisms, with more than 100 genes exhibiting differential expression within LC-NA neurons at the transcript level. Furthermore, this investigation has convincingly showcased that these distinct gene expression patterns have the capacity to elicit disparate behavioral responses between sexes (Mulvey et al., 2018). Therefore, the authors should compare the fate maps, Vglut transporters in males and females, at least considering LC-NA neurons. Even in the absence of identified sex differences, this information retains significant importance.

      All experiments contained both males and females as described in the original submission. In our analysis of breathing and metabolism, sex was included in the analysis and no significant phenotypic difference was observed. For the fate map and in situ experiments, we did not see obvious differences in the expression patterns in the three glutamate transporters between females and males, though the group size is small. Though all the anatomical and phenotypic data in this manuscript are presented as combined graphs, we have differentially labeled our data points by sex. The reviewer does raise important questions regarding possible sexual dimorphisms in the central noradrenergic system and whether such dimorphisms may extend to glutamate transporter co-expression. Our thorough interrogation of respiratory-metabolic parameters fails to reveal any sex specific differences in control or experimental mice. Thus, it is unclear if any of the previously described and cited dimorphisms are functionally relevant in this setting. Given the large differences in the real time expression and cumulative fate maps of Vglut2, a worthwhile interrogation of differential glutamate transporter expression would be best served by longitudinal studies with large group sizes across age as it is not clear what underlies the dynamic VGlut2 expression changes. Such changes may at times be greater in males and other times in females, driven by experience or physiological challenges etc., but resulting in averaged cumulative fatemaps that are similar between sexes. Such a longitudinal quantitative study of real-time and fatemapped cell populations across the central NA system would be of a scale that is beyond the scope of this report, especially when no phenotypic changes have been observed in our respiratory data.

      An important point well raised by the authors is that although suggestive, these experiments do not definitively rule out that NA-Vglut2 based glutamatergic signaling has a role in breathing control. Subsequent experiments will be necessary to validate this hypothesis.

      As noted, we discuss that we only address requirement, not sufficiency, of NA Vglut2 in breathing. Functional sufficiency experiments usually involve increasing the relevant output. However, these experiments can lead to non-specific, pleiotropic effects that would be difficult to disambiguate, even if done with high cellular specificity. Viral or genetic overexpression of Vglut2 in NA neurons may be a feasible approach. Conditional ablation of TH or DBH with concurrent chemo or optogenetic stimulation may also be informative. These approaches would require significant investments in mouse model generation and suffer additional experimental limitations.

      An improvement could be made in terms of measuring body temperature. Opting for implanted sensors over rectal probes would circumvent the need to open the chamber, thereby preventing alterations in gas composition during respiratory measurements. Further, what happens to body temperature phenotype in these animals under different gas exposures? These data should be included in the Tables.

      While surgical implantation of sensors would provide a more direct assessment of temperature, it requires components that were not available at the time of the study and addresses a question (temperature changes during a time course of gas exposure) that goes beyond the scope of the current work focused on respiratory response. As we have done for prior experiments (Martinez et al., 2019; Ray et al., 2011), the body temperature was measured immediately before and after measuring breathing only. Our flow-through system using inline gas sensors (AEI P-61B CO2 sensor and AEI N-22M O2 sensor) ensures that gas challenges were constant and consistent across all measurements. Any disruption in gas composition would have been noted by our software analysis system, Breathe Easy, and the data rejected. We did not observe any such perturbations.

      Is it plausible that another neurotransmitter within NA neurons might be released in higher amounts in DBH-Cre; Vglut2 cKO mice to compensate for the deficiency in glutamate and prevent changes in ventilation?

      We agree that compensation is always a possibility at the synaptic, cellular, and circuit levels that may involve a variety of transcriptional, translational, cellular, and circuit mechanisms (i.e., synaptic strength). This could be interrogated by combining multiple conditional alleles and recombinase drivers for various transmitters and receptors, but would, in our experience, take multiple years for the requisite breeding to be completed.

      Continuing along the same line of inquiry is there a possibility that Vglut2 cKO from NA neurons not only eliminates glutamate release but also reduces NA release? A similar mechanism was previously found in VGLUT2 cKO from DA neurons in previous studies (Alsio et al., 2011; Fortin et al., 2012; Hnasko et al., 2010). Additionally, does glutamate play a role in the vesicular loading of NA? Therefore, could the lack of effect on breathing be explained by the lack of noradrenaline and not glutamate?

      These are all excellent points, but prior studies suggest that reductions in NA signaling would itself have an apparent effect (Zanella et al., 2006; Kuo et al., 2016). Although several studies showed that LC and C1 NA neurons co-release noradrenaline and glutamate, no direct evidence yet makes clear that glutamate facilitates NA release or vice versa. However, it would be of great interest to test in the future whether reduced or absent NA release compensated for the loss of glutamate. We do fully acknowledge in the manuscript that any number of compensatory events could be at play in these findings.

      Reviewer #3 (Public Review):

      Summary:

      The authors, Y Chang and colleagues, have performed elegant studies in transgenic mouse models that were designed to examine glutamatergic transmission in noradrenergic neurons, with a focus on respiratory regulation. They generated 3 different transgenic lines, in which a red fluorophore was expressed in dopamine-β-hydroxylase (DBH; noradrenergic and adrenergic neurons) neurons that did not express a vesicular glutamate transporter (Vglut) and a green fluorophore in DBH neurons that did express one of either Vglut1, Vglut2 or Vglut3.

      Further experiments generated a transgenic mouse with knockout of Vglut2 in DBH neurons. The authors used plethysmography to measure respiratory parameters in conscious, unrestrained mice in response to various challenges.

      Strengths:

      The distribution of the Vglut expression is broadly in agreement with other studies, but with the addition of some novel Vglut3 expression. Validation of the transgenic results, using in situ hybridization histochemistry to examine mRNA expression, revealed potential modulation of Vglut2 expression during phases of development. This dataset is comprehensive, well-presented and very useful.

      In the physiological studies the authors observed that neither baseline respiratory parameters, nor respiratory responses to hypercapnia (5, 7, 10% CO2) or hypoxia (10% O2) were different between knockout mice and littermate controls. The studies are well-designed and comprehensive. They provide observations that are supportive of previous reports using similar methodology.

      Weaknesses:

      In relation to the expression of Vglut2, the authors conclude that modulation of expression occurs, such that in adulthood there are differences in expression patterns in some (nor)adrenergic cell groups. Altered sensitivity is provided as an explanation for different results between studies examining mRNA expression. These are likely explanations; however, the conclusion would really be definitive with inclusion of a conditional Cre-expressing mouse. Given the effort taken to generate this dataset, it seems to me that taking that extra step would be of value for the overall understanding of glutamatergic expression in these catecholaminergic neurons.

      The seemingly dynamic Vglut2 expression pattern across the NA system is intriguing. As noted in our comments to reviewer 2, a robust age-dependent interrogation would require a large-magnitude study. The reviewer correctly points out that a temporally controlled recombinase fate mapping experiment would offer greater insight into the dynamic expression of Vglut2. We strongly agree with that idea and did work to develop a Vglut2-CreER targeted allele that, despite our many other successes in mouse genetic engineering (Lusk et al., 2022; Sun and Ray, 2016), did not succeed on the first attempt. We aim to complete the line in the near future so that we may better understand the Vglut2 expression pattern in central noradrenergic neurons in a time- and sex-specific manner.

      The respiratory physiology is very convincing and provides clear support for the view that Vglut2 is not required for modulation of the respiratory parameters measured and the reflex responses tested. It is stated that this is surprising. However, comparison with the data from Abbott et al., Eur J Neurosci (2014) in which the same transgenic approach was used, shows that they also observed no change in baseline breathing frequency. Differences were observed with strong, coordinated optogenetic stimulation, but, as discussed in this manuscript, it is not clear what physiological function this is relevant to. It just shows that some C1 neurons can use glutamate as a signaling molecule. Further, Holloway et al., Eur J Neurosci (2015), using the same transgenic mouse approach, showed that the respiratory response to optogenetic activation of Phox2 expressing neurons is not altered in DBH-Vglut2 KO mice. The conclusion seems to be that some C1 neuron effects are reliant upon glutamatergic transmission (C1-DMV for example), and some not.

      We agree that activation of C1 neurons may be sufficient to modulate breathing when artificially stimulated and that such stimulation relies on glutamatergic transmission for its effect. This is why we find our results surprising and important in clarifying for the field that glutamatergic signaling in noradrenergic cells is dispensable for breathing and hypoxic and hypercapnic responses under physiological conditions.

      Further contrast is made in this manuscript to the work of Malheiros-Lima and colleagues (eLife 2020) who showed that the activation of abdominal expiratory nerve activity in response to peripheral chemoreceptor activation with cyanide was dependent upon C1 neurons and could be attenuated by blockade of glutamate receptors in the pFRG - i.e. the supposition that glutamate release from C1 neurons was responsible for the function. However, it is interesting to observe that diaphragm EMG responses to hypercapnia (10% CO2) or cyanide, and the expiratory activation to hypercapnia, were not affected by the glutamate receptor blockade. Thus, a very specific response is affected and one that was not measured in the current study.

      As we mention above, we do not dispute that glutamate signaling can be manipulated to create a response in non-physiological conditions – we suggest that framing the interpretation around the glutamatergic role in a model that better matches physiological conditions should inform our interpretation. Furthermore, we do include an examination of expiratory flow – which was not impacted by loss of glutamatergic activity in NA neurons – which would be likely to have been impacted if abdominal expiratory nerve activity was modified.

      These previously published observations are consistent with the current study, which provides a more comprehensive analysis of the role of glutamatergic contributions to respiratory physiology. A more nuanced discussion of the data and acknowledgement of the differences, which are not actually at odds, would improve the paper and place the information within a more comprehensive model.

      Thank you for the comments. As noted in the original and extended discussion, we respectfully disagree with the perspective that our results align with prior results.

      Recommendations for the authors:

      The three reviewers believe this is an important study. They have numerous suggestions for improvement of the manuscript (outlined below), but no new experiments are required. The Editor requests some nomenclature changes as indicated in attachment 1.

      Reviewer #1 (Recommendations For The Authors):

      Abstract/Introduction: Although the need for this study is obvious, it is important that the authors explicitly communicate to the reader the working hypothesis they held before the start of the work. In the current form, it is unclear whether the authors aimed to test the hypothesis that glutamatergic signaling from noradrenergic neurons is important to breathing or the hypothesis that it is not important to breathing. If it is the latter (not important), then the study (related to the breathing measurements) is poorly justified and designed, as additional orthogonal approaches (e.g., actual measurements of glutamatergic signaling at the cellular level) are almost requisite. If the authors' hypothesis was originally based on existing literature suggesting that glutamatergic signaling from noradrenergic neurons is important to breathing, then the experimental design is appropriate.

      Thank you for the suggestion. The working hypothesis has been added in the abstract (line 24-25) and the introduction (line 92-94), making clear that we initially hypothesized that glutamatergic signaling from noradrenergic neurons is important in breathing.

      Results: While the steady-state measurements for breathing metrics are clearly important in defining how glutamatergic signaling may contribute to pulmonary function, glutamatergic signaling may have a greater role in the dynamics of breathing patterns (i.e., regularity of the breathing rhythms). Such traits can be described using SD1 and SD2 from Poincaré maps and/or entropy measurements. Such an analysis should be performed.

      Thank you for the suggestion. The dynamic patterns of respiratory rate (Vf), tidal volume (VT), minute ventilation (VE), inspiratory duration (TI), expiratory duration (TE), breath cycle duration (TTOT), inspiratory flow rate (VT/TI), and expiratory flow rate (VT/TE) have been shown as Poincaré plots and quantified and tested using the SD1 and SD2 statistics in the supplemental figures of Figures 4-7.
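
      For readers unfamiliar with the Poincaré descriptors mentioned above, the following is a minimal sketch of how SD1 and SD2 are commonly computed from a series of consecutive per-breath values. It is not the authors' analysis code; the function name `poincare_sd` and the use of the sample standard deviation are assumptions made for illustration.

```python
import math
import statistics


def poincare_sd(values):
    """Return (SD1, SD2) for consecutive per-breath measurements.

    Hypothetical helper, not taken from the manuscript. Each point of the
    Poincare plot is (x_n, x_{n+1}); SD1 is the spread perpendicular to the
    line of identity (breath-to-breath variability) and SD2 is the spread
    along it (longer-term variability).
    """
    if len(values) < 3:
        raise ValueError("need at least three consecutive breaths")
    pairs = list(zip(values[:-1], values[1:]))
    # Rotate each point by 45 degrees onto the short and long axes.
    short_axis = [(nxt - cur) / math.sqrt(2) for cur, nxt in pairs]
    long_axis = [(nxt + cur) / math.sqrt(2) for cur, nxt in pairs]
    return statistics.stdev(short_axis), statistics.stdev(long_axis)


if __name__ == "__main__":
    # Illustrative values only (e.g. breath cycle durations in seconds).
    sd1, sd2 = poincare_sd([0.30, 0.32, 0.29, 0.31, 0.33, 0.30])
    print(f"SD1={sd1:.4f}  SD2={sd2:.4f}")
```

      A perfectly regular or steadily drifting series gives SD1 = 0, whereas strictly alternating breaths give SD2 = 0, which is why the two values are usually reported together.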

      Results: Analyses of Inspiratory time (Ti) and flow rate (i.e., Tidal Volume / Ti) should be assessed and included.

      Thank you for the suggestion. Inspiratory duration (TI), expiratory duration (TE), breath cycle duration (TTOT), inspiratory flow rate (VT/TI), and expiratory flow rate (VT/TE) have been included in Figures 4-7.
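
      The relationships among the metrics listed above can be sketched as follows. This is an illustrative calculation under stated assumptions, not the authors' pipeline: the `Breath` class and `breath_metrics` function are hypothetical, durations are assumed to be in seconds, TTOT is taken as TI + TE, and no body-weight normalization is applied.

```python
from dataclasses import dataclass


@dataclass
class Breath:
    ti: float  # inspiratory duration TI, seconds (assumed unit)
    te: float  # expiratory duration TE, seconds (assumed unit)
    vt: float  # tidal volume VT, arbitrary volume unit


def breath_metrics(breath):
    """Derive per-breath timing and flow metrics (hypothetical helper)."""
    ttot = breath.ti + breath.te      # breath cycle duration
    vf = 60.0 / ttot                  # instantaneous rate, breaths per minute
    return {
        "TTOT": ttot,
        "Vf": vf,
        "VE": vf * breath.vt,         # minute ventilation = rate x tidal volume
        "VT/TI": breath.vt / breath.ti,  # inspiratory flow rate
        "VT/TE": breath.vt / breath.te,  # expiratory flow rate
    }


if __name__ == "__main__":
    print(breath_metrics(Breath(ti=0.125, te=0.375, vt=0.25)))
```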

      Results/Methods: If similar analytical approaches were used in the current study as to that in Lusk et al. 2022, it appears that data was discontinuously sampled, rejecting periods of movement and only including periods of quiescent breathing. Were the periods of quiescent breathing different? Information should be provided to describe the total sampling duration included.

      For room air, the entire gas condition was used for data analysis. For hypercapnia (5% CO2, 7% CO2, 10% CO2), only the last 5 minutes of the gas challenge period was used for data analysis. For hypoxia (10% O2), we analyzed the breathing trace of three 5-minute epochs following initiation of the gas exposure separately, e.g., epoch 1 = 5-10min, epoch 2 = 10-15min, and epoch 3 = 15-20min. All breaths included as quiescent breathing were analyzed in the aggregate for each group and experimental condition; we did not compare individual periods of quiescent breathing within or across an animal(s)/group(s)/experimental condition(s). We have added the details in the Materials and Methods (line 637-642).
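
      The analysis windows described above can be illustrated with a short sketch. It assumes each breath is stored as a `(time_in_minutes, value)` pair with time measured from the start of the gas exposure; the function names and the half-open treatment of epoch boundaries are assumptions for illustration, not details taken from the manuscript.

```python
def hypoxia_epochs(breaths, start_min=5.0, epoch_min=5.0, n_epochs=3):
    """Split breaths into consecutive epochs (default 5-10, 10-15, 15-20 min).

    Hypothetical helper. Each epoch is treated as half-open, [start, end),
    which is an assumption about boundary handling.
    """
    epochs = [[] for _ in range(n_epochs)]
    for t, value in breaths:
        index = int((t - start_min) // epoch_min)
        if 0 <= index < n_epochs:
            epochs[index].append(value)
    return epochs


def last_minutes(breaths, challenge_end_min, window_min=5.0):
    """Keep breaths from the final window of a challenge (hypercapnia case)."""
    start = challenge_end_min - window_min
    return [value for t, value in breaths if start <= t <= challenge_end_min]


if __name__ == "__main__":
    trace = [(4.9, 1), (5.0, 2), (9.99, 3), (10.0, 4),
             (14.5, 5), (15.0, 6), (19.9, 7), (20.0, 8)]
    print(hypoxia_epochs(trace))
    print(last_minutes(trace, challenge_end_min=20.0))
```

      Breaths already rejected as movement or sniffing would be removed before this step, so each epoch holds only quiescent breaths that are then pooled per group and condition.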

      Results: As mice were conscious in this study, were sniff periods (transient periods of fast breathing, i.e.,>8Hz) included in the analysis?

      No, only regular quiescent breathing periods were included in the analysis.

      Discussion: The authors need to discuss the limitations of their findings.

      • How should the reader interpret the findings? Concluding that glutamatergic signaling is dispensable implies that it occurs in room air, hypoxia, and hypercapnia.

      We have edited our discussion for clarity to highlight our conclusions that Vglut2-based glutamatergic signaling from noradrenergic neurons is ultimately dispensable for baseline breathing and hypercapnia and hypoxic chemoreflex in unanesthetized and unrestrained mice.

      • Assuming that glutamatergic signaling is active during the conditions tested, then the authors should discuss what may be the potential compensations.

      We have provided additional discussion surrounding potential compensatory events that may have taken place and could result in the unchanged phenotype in the experimental group.

      • The authors need to discuss how age and state of consciousness may play a role in their findings. The current discussion gives the impression that their findings are broadly applicable in all cases, but the lack of differences in this study may not hold true under different conditions.

      The study was done in adult (6–8-week-old) unanesthetized and unrestrained mice. In the discussion (line 472-474), we highlight that in our unpublished results, loss of NA-expressed Vglut2 does not change the survival curve in P7 neonate mice undergoing repeated bouts of autoresuscitation until death. Thus, we believe that Vglut2-based glutamatergic signaling in central NA neurons is dispensable for baseline breathing and the hypercapnic and hypoxic chemoreflexes in unanesthetized and unrestrained mice across different ages. Otherwise, we do not imply that we have interrogated any other aspects of breathing in our discussion.

      Methods: Further description of the analysis window for the respiratory metrics should be provided. Were breath values for each condition taken throughout the entire condition? This is particularly important for hypoxia, where the stereotypical respiratory response is biphasic.

      For room air, the entire gas condition was used for data analysis. For hypercapnia (5% CO2, 7% CO2, 10% CO2), only the last 5 min of the gas challenge period was used for data analysis. For hypoxia (10% O2), we analyzed three 5 min epochs of the breathing trace separately: 5-10 min, 10-15 min, and 15-20 min of the hypoxic challenge. As noted in our original manuscript, we graph and assess these three epochs to capture the dynamic nature of the hypoxic ventilatory response. We have added the details in the Materials and Methods (line 637-642).
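
      The windowing described above can be sketched as follows. This is only an illustration, not the authors' analysis code: the function name, the data layout (breath times in minutes from the onset of the gas challenge), and the assumption of a 20 min challenge are ours.

```python
from statistics import mean

# Analysis windows in minutes from gas-challenge onset (challenge assumed to last 20 min).
# Hypercapnia: last 5 min only. Hypoxia: three consecutive 5 min epochs.
WINDOWS = {
    "hypercapnia": [(15, 20)],
    "hypoxia": [(5, 10), (10, 15), (15, 20)],
}

def epoch_means(times_min, values, condition):
    """Return {(start, end): mean value} for each analysis window of a condition.

    A window with no breaths maps to None instead of raising an error.
    """
    result = {}
    for start, end in WINDOWS[condition]:
        in_window = [v for t, v in zip(times_min, values) if start <= t < end]
        result[(start, end)] = mean(in_window) if in_window else None
    return result
```

      Keeping the three hypoxic epochs separate, rather than pooling them, is what allows a biphasic response to show up as different means across windows.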

      Methods: How was consciousness determined?

      The conscious mice mentioned in the manuscript refer to the mice without anesthesia. We have replaced “awake” and “conscious” with “unanesthetized” in the text.

      Reviewer #2 (Recommendations For The Authors):

      Since no EEG/EMG recording was performed it would be more appropriate to remove "awake" and "conscious" throughout the manuscript and include the term "unanesthetized".

      Thank you for the suggestion. “Awake” and “conscious” have been replaced by “unanesthetized” in the text.

      Line 545: Why 32C? Isn't this temperature too high for animals?

      30-32°C is the thermoneutral zone for mice. It is the range of ambient temperature where mice can maintain a stable core temperature with their minimal metabolic rate (Gordon, 1985). Whole-body plethysmography uses the barometric technique to detect pressure oscillations caused by changes in temperature and humidity with each breathing act when an animal sits in a sealed chamber (Mortola et al., 2013). Thus, maintaining the chamber temperature near the thermoneutral zone during the plethysmography assay is required to maintain constancy in respiratory and metabolic parameters from trial to trial as well as to maintain linearity of ventilatory pressure changes due to humidification, rarefaction, and thermal expansion and contraction during inspiration and expiration (Ray et al., 2011). The chamber temperature that has been used for adult plethysmography has been set across a range 30-34°C (Hodges et al., 2008; Ray et al., 2011; Hennessy et al., 2017). We use 32°C in this manuscript which is consistent with previously published literature from other groups and our own work (Sun et al., 2017; Lusk et al., 2022).

      I would include the units of the physiological variables in the tables.

      Thank you for the suggestion. The units of the physiological variables have been added in all the tables.

      Reviewer #3 (Recommendations For The Authors):

      Why is the C3 group not considered in this study?

      The C3 adrenergic group, best characterized in rat, is only seen in rodents but not in many other species including primates (including human) (Kitahama et al., 1994). Thus, the C3 group is not the focus of this study where we aim to discuss if glutamate derived from noradrenergic neurons could be the potential therapeutic target of human respiratory disorders. The C3 adrenergic group is typically described as a population containing only about 30 neurons. We have added the fate map data and the adult expression pattern for the three vesicular glutamate transporters for the C3 group in the figure 1 and 2 supplements for reference.

      Sub CD/CV does not appear to be defined in the manuscript.

      Thank you for the point. The definition of sub CD/CV has been added in the text (line 126).

      The data on line 131-133 is interesting but could be described more effectively and clearly.

      Thank you for the suggestion. The text has been modified accordingly.

      The end of the paragraph at lines 140 onwards is rather repeated in the paragraph that starts at line 146.

      The repeated text has been removed accordingly.

      Whilst anterior and posterior are correct anatomical terms, for a quadruped, rostral and caudal are more widely used - particularly in the brainstem field. Is there a particular reason for using anterior/posterior?

      We followed the anatomical terminology of Robertson et al. (2013), who used anterior/posterior to describe C2/A2 and C1/A1.

      On the protocol lines included in Figures 4-7 it would be worth adding the test day. This seems a little strange. Why wait up to one week after the habituation to perform the stimulation? How many mice were left for each day between habituation and experimentation, and does this timing affect responses? Do mice forget the habituation after a period?

      Thank you for the point. We have added the test day for plethysmography in figures 4-7. After the 5 days of habituation, we began the plethysmography recordings on the sixth day. A maximum of 6 mice can be assayed for plethysmography per day due to the limited number of barometric flow through plethysmography and metabolic measurement systems we have. Thus, all animals were finished with plethysmography “within” one week of the last day of habituation. This protocol is consistent with our previous published work (Martinez et al., 2019; Lusk et al., 2022; Lusk et al., 2023). For the experiments in this manuscript, mice were assayed within 3 days after habituation. As noted in our methods and figures, each mouse is given as much as 40 mins to acclimate to the chamber (determined by directly observed quiet breathing) before data acquisition. We have no reason or evidence that indicates testing order and thus timing was a factor. The detailed explanation for the plethysmography protocol has been added in the material and methods section (line 606-625).

      Please state clearly that each mouse is only exposed to one gas mixture (what I interpret is the case), or could one mouse be exposed to several different stimuli?

      Each mouse is only exposed to one gas challenge (5% CO2, 7% CO2, 10% CO2, or 10% O2) in a testing period. Testing periods for an individual mouse were separated by 24 h to allow for a full recovery. The protocol is to put the mouse under room air for 45 min, switch to one gas challenge for 20 min, and switch back to room air for 20 min.
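
      As a minimal sketch, the session timeline stated above can be written out explicitly. The phase labels and the helper name are illustrative assumptions; only the three durations come from the response.

```python
# Phases of one test session, in order: (label, duration in minutes).
SESSION = [("room air", 45), ("gas challenge", 20), ("room air recovery", 20)]

def phase_boundaries(session):
    """Return (label, start_min, end_min) for each phase, counted from session start."""
    boundaries = []
    start = 0
    for label, duration in session:
        boundaries.append((label, start, start + duration))
        start += duration
    return boundaries
```

      Under these durations the gas challenge occupies minutes 45 to 65 of an 85 min session.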

      With apologies if I missed this, but did each of the respiratory stimuli produce a statistically significant response in the control mice? For example, the response to 10%O2?

      Yes, each respiratory stimulus (5/7/10% CO2 and 10% O2) produced a statistically significant response in both mutant and control mice. We have labeled the statistical significance in Figures 4-7. Thank you for pointing this out.

      Line 312: Optogenetic stimulation induced an increase from 130 to 180 breaths per min (Abbott et al., EJN 2014). It is surprising that this is called "modest". Baseline respiratory frequency was presented.

      Thank you for the point. The word “modest” has been removed and the discussion has been changed accordingly (line 355-360).

      Line 338: This discussion is not sufficiently nuanced. It is the increased Dia amplitude (to KCN only, not 10%CO2 ) and the stimulation of active expiration, to both stimuli, that is blocked by kyn in pFRG. There is no effect of breathing frequency. The current study would not detect such differences in active expiration.

      Thank you for the suggestion. The discussion has been modified accordingly (line 382-388).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this important paper, Blin and colleagues develop a high-throughput behavioral assay to test spontaneous swimming and olfactory preference in individual Mexican cavefish larvae. The authors present compelling evidence that the surface and cave morphs of the fish show different olfactory preferences and odor sensitivities and that individual fish show substantial variability in their spontaneous activity that is relevant for olfactory behaviour. The paper will be of interest to neurobiologists working on the evolution of behaviour, olfaction, and the individuality of behaviour.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors posed a research question about how an animal integrates sensory information to optimize its behavioral outputs and how this process evolved. Their data (behavioral output analysis with detailed categories in response to the different odors in different concentrations by comparing surface and cave populations and their hybrid) partially answer this tough question. They built a new low-disturbance system to answer the question. They also found that the personality of individual fish is a good predictor of behavioral outputs against odor response. They concluded that cavefish evolved to specialize their response to alanine and histidine while surface fish are more general responders, which was supported by their data.

      Strengths:

      With their new system, the authors could generate clearer results without mechanical disturbances. The authors characterize multiple measurements to score the odor response behaviors, and also brought a new personality analysis. Their conclusion that cavefish evolved as a specialist to sense alanine and histidine among 6 tested amino acids was well supported by their data.

      Weaknesses:

      The authors posed a big research question: How do animals evolve the processes of sensory integration to optimize their behavioral outputs? I personally feel that, to answer the questions about how sensory integration generates proper (evolved) behavior, the authors at least need to show the ecological relevance of their response. For the alanine/histidine preference in cavefish, they need data for the alanine and other amino acid concentrations in the local cave water and compare them with those of surface water.

      We agree with the reviewer. This is why, in the Discussion section, we had written: “…Such significant variations in odor preferences or value may be adaptive and relate to the differences in the environmental and ecological conditions in which these different animals live. However, the reason why Pachón cavefish have become “alanine specialists” remains a mystery and prompts analysis of the chemical ecology of their natural habitat. Of note, we have not found an odor that would be repulsive for Astyanax so far, and this may relate to their opportunist, omnivorous and detritivore regime (Espinasa et al., 2017; Marandel et al., 2020).” This is also why we currently develop field work projects aimed at clarifying this question. However, such experiments and analyses are challenging, practically and technically. We hope we can reach some conclusions in the future.

      To complete the discussion we have also added an important hypothesis: “Alternatively, specialization for alanine may not need to be specific for an olfactory cue present only, or frequently, or in high amounts in caves. Bat guano for example, which is probably the main source of food in the Pachón cave, must contain many amino acids. Enhanced recognition of one of them - in the present case alanine but evolution may have randomly acted for enhanced recognition of another amino acid – should suffice to confer cavefish with augmented sensitivity to their main source of nutriment.”

      Also, as for "personality matters", I read that personality explains a large variation in surface fish. Also, thigmotaxis or wall-following cavefish individuals are exceeded to respond well to odorants compared with circling and random swimming cavefish individuals. However, I failed to understand the authors' point about how much percentages of the odorant-response variations are explained (PVE) by personality. Association (= correlation) was good to show as the authors presented, but showing proper PVE or the effect size of personality to predict the behavioral outputs is important to conclude "personality is matter"; otherwise, the conclusion is not so supported.

      From the above, I recommend the authors reconsider the title and also their research questions. At this moment, I feel that the authors' conclusions and their research questions are a little too exaggerated, with less supportive evidence.

      Thank you for this interesting suggestion, which we have fully taken into consideration. We have therefore now calculated and plotted PVE (the percentage of variation explained on the olfactory score) as a function of swimming speed or as a function of swimming pattern. The results are shown in modified Figure 8 of our revised ms and they suggest that the personality (here, swimming patterns or swimming speed) indeed predicts the olfactory response skills. Therefore, we would like to keep our title as we provide support for the fact that “personality matters”.
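
      For readers unfamiliar with PVE, one common way to compute it is sketched below. The response does not state which estimator was used for the revised Figure 8, so both functions are assumptions: 100 × R² from a simple linear fit for a continuous predictor such as swimming speed, and 100 × η² (between-group over total sum of squares) for a categorical predictor such as swimming pattern.

```python
def pve_linear(x, y):
    """Percentage of variance in y explained by a least-squares line on x (100 * R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return 100.0 * sxy ** 2 / (sxx * syy)

def pve_groups(groups):
    """Percentage of variance explained by group membership (100 * eta squared).

    `groups` maps a category label (e.g. a swimming pattern) to a list of scores.
    """
    all_values = [v for scores in groups.values() for v in scores]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return 100.0 * ss_between / ss_total
```

      Both functions require some variation in the data: they divide by zero if all scores (or all predictor values) are identical.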

      Also, for the statistical method, Fisher's exact test is not appropriate for the compositional data (such as Figure 2B). The authors may quickly check it at https://en.wikipedia.org/wiki/Compositional_data or https://www.annualreviews.org/doi/pdf/10.1146/annurev-statistics-042720-124436.

      The authors may want to use centered log transformation or other appropriate transformations (Rpackage could be: https://doi.org/10.1016/j.cageo.2006.11.017). According to changing the statistical tests, the authors' conclusion may not be supported.

      Actually, in most cases, the distributions are so different (as seen by the completely different colors in the distribution graphs) that there is little doubt that swimming behaviors are indeed different between surface and cavefish, or between ‘before’ and ‘after’ odor stimulation. However, it is true that Fisher’s exact test is not fully appropriate because the data can be considered compositional. For this kind of data, a centered log transformation has been suggested. However, our dataset contains many zeros, which log transformations have difficulty handling.

      To help us deal with our data, the reviewer proposed that we consider the paper by Greenacre (2021) (https://www.annualreviews.org/doi/pdf/10.1146/annurev-statistics-042720-124436). In his paper, Greenacre clearly wrote: "Zeros in compositional data are the Achilles heel of the logratio approach (LRA)."
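
      The difficulty with zeros can be seen directly in a minimal sketch of the centered log-ratio (CLR) transform. The function name is ours and the example is illustrative only: because the transform takes the logarithm of every part, a composition in which one swimming pattern is never observed has no defined CLR.

```python
import math

def clr(parts):
    """Centered log-ratio: the log of each part minus the mean of the logs."""
    if any(p <= 0 for p in parts):
        # log(0) is undefined, so a single zero part breaks the transform.
        raise ValueError("CLR is undefined when a part is zero or negative")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

      Calling `clr([0.5, 0.5, 0.0])` raises the `ValueError`, whereas any strictly positive composition yields values that sum to zero.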

      Therefore, we have now tested our data using CA (Correspondence Analysis), which can deal with tables containing many zeros and is a reliable alternative to LRA (Cook-Thibeau, 2021; Greenacre, 2011).

      The results of CA analysis are shown in Supplemental figure 8 and they fully confirm the difference in baseline swimming patterns between morphs as well as changes (or absence of changes) in behavioral patterns after odor stimulation suggested by the colored bar plots in main figures, with confidence ellipses overlapping or not overlapping, depending on cases. Therefore, the CA method fully confirms and even strengthens our initial interpretations.
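
      To illustrate why CA tolerates zeros, the sketch below computes the quantity that CA decomposes: the matrix of standardized residuals of a contingency table, whose sum of squares is the total inertia (the Pearson chi-square statistic divided by the grand total). This is not the authors' pipeline; it stops before the singular value decomposition that yields the CA axes, and the function names are ours.

```python
def standardized_residuals(table):
    """Standardized residuals (p_ij - r_i * c_j) / sqrt(r_i * c_j) of a count table.

    Zero cells are fine; only an entirely empty row or column (zero margin) is not.
    """
    n = sum(sum(row) for row in table)
    row_mass = [sum(row) / n for row in table]
    col_mass = [sum(row[j] for row in table) / n for j in range(len(table[0]))]
    return [
        [
            (table[i][j] / n - row_mass[i] * col_mass[j])
            / (row_mass[i] * col_mass[j]) ** 0.5
            for j in range(len(col_mass))
        ]
        for i in range(len(row_mass))
    ]

def total_inertia(table):
    """Sum of squared standardized residuals, i.e. chi-square divided by the grand total."""
    return sum(s ** 2 for row in standardized_residuals(table) for s in row)
```

      For a table such as `[[10, 0], [0, 10]]`, which contains zeros, the residuals are well defined, whereas a log-ratio transform of the same rows would fail.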

      Finally, we have kept our initial graphical representation in the ms (color-coded bar plots; the complete color code is now given in Suppl. Fig7), and CA results are shown in Suppl. Figure 8 and added in text.

      Reviewer #2 (Public Review):

      In their submitted manuscript, Blin et al. describe differences in the olfactory-driven behaviors of river-dwelling surface forms and cave-dwelling blind forms of the Mexican tetra, Astyanax mexicanus. They provide a dataset of unprecedented detail, that compares not only the behaviors of the two morphs but also that of a significant number of F2 hybrids, therefore also demonstrating that many of the differences observed between the two populations have a clear (and probably relatively simple) genetic underpinning.

      To complete the monumental task of behaviorally testing 425 six-week-old Astyanax larvae, the authors created a setup that allows for the simultaneous behavioral monitoring of multiple larvae and the infusion of different odorants without introducing physical perturbations into the system, thus biasing the responses of cavefish that are particularly fine-tuned for this sensory modality. During the optimization of their protocol, the authors also found that for cave-dwelling forms one hour of habituation was insufficient and a full 24 hours were necessary to allow them to revert to their natural behavior. It is also noteworthy that this extremely large dataset can help us see that population averages of different morphs can mask quite significant variations in individual behaviors.

      Testing with different amino-acids (applied as relevant food-related odorant cues) shows that cavefish are alanine- and histidine-specialists, while surface fish elicit the strongest behavioral responses to cysteine. It is interesting that the two forms also react differently after odor detection: while cave-dwelling fish decrease their locomotory activity, surface fish increase it. These differences are probably related to different foraging strategies used by the two populations, although, as the observations were made in the dark, it would be also interesting to see if surface fish elicit the same changes in light as well.

      Thank you for these nice comments.

      Further work will be needed to pinpoint the exact nature of the genetic changes that underlie the differences between the two forms. Such experimental work will also reveal how natural selection acted on existing behavioral variations already present in the SF population.

      Yes. Searching for genetic underpinnings of the sensory-driven behavioral differences is our current endeavor through a QTL study and we should be able to report it in the near future.

      It will be equally interesting, however, to understand what lies behind the large individual variation of behaviors observed in both the surface and cave populations. Are these differences purely genetic, or do environmental cues perhaps also contribute to their development? Does the stochasticity of the developmental process also have a role in this? Answering these questions will reveal if the evolvability of Astyanax behavior was an important factor in the repeated successful colonization of underground caves.

      Yes. We will also address (at least partially) most of these questions in our current QTL study.

      Reviewer #3 (Public Review):

      Summary:

      The paper explores chemosensory behaviour in surface and cave morphs and F2 hybrids in the Mexican cavefish Astyanax mexicanus. The authors develop a new behavioural assay for the long-term imaging of individual fish in a parallel high-throughput setup. The authors first demonstrate that the different morphs show different basal exploratory swimming patterns and that these patterns are stable for individual fish. Next, the authors test the attraction of fish to various concentrations of alanine and other amino acids. They find that the cave morph is a lot more sensitive to chemicals and shows directional chemotaxis along a diffusion gradient of amino acids. For surface fish, although they can detect the chemicals, they do not show marked chemotaxis behaviour and have an overall lower sensitivity. These differences have been reported previously but the authors report longer-term observations on many individual fish of both morphs and their F2 hybrids. The data also indicate that the observed behavior is a quantitative genetic trait. The approach presented will allow the mapping of genes' contribution to these traits. The work will be of general interest to behavioural neuroscientists and those interested in olfactory behaviours and the individual variability in behavioural patterns.

      Strengths:

      A particular strength of this paper is the development of a new and improved setup for the behavioural imaging of individual fish for extended periods and under chemosensory stimulation. The authors show that cavefish need up to 24 h of habituation to display a behavioural pattern that is consistent and unlikely to be due to the stressed state of the animals. The setup also uses relatively large tanks that allow the build-up of chemical gradients that are apparently present for at least 30 min.

      The paper is well written, and the presentation of the data and the analyses are clear and to a high standard.

      Thank you for these nice comments.

      Weaknesses:

      One point that would benefit from some clarification or additional experiments is the diffusion of chemicals within the behavioural chamber. The behavioural data suggest that the chemical gradient is stable for up to 30 min, which is quite surprising. It would be great if the authors could quantify e.g. by the use of a dye the diffusion and stability of chemical gradients.

      OK. We had tested the diffusion of dyes in our previous setup and we also did in the present one (not shown). We think that, due to differences of molecular weight and hydrophobicity between the tested dyes and the amino acid molecules we are using, their diffusion does not constitute a proper read-out of actual amino acid diffusion. We anticipate that amino acid diffusion is extremely complex in the test box, possibly with odor plumes diffusing and evolving in non-gradient patterns, in the 3 dimensions of the box, and potentially further modified by the fish swimming through it, the flow coming from the opposite water injection side and the borders of the box. This is the reason why we have designed the assay with contrasting “odor side” and “water control side”. Moreover, our question here is not to determine the exact concentration of amino acid to which the fish respond, but to compare the responses in cavefish, surface fish and F2 hybrids. Finally and importantly, we have performed dose/response experiments whereby varying concentrations have been presented for 3 of the 6 amino acids tested, and these experiments clearly show a difference in the threshold of response of the different morphs.

      The paper starts with a statement that reflects a simplified input-output (sensory-motor) view of the organisation of nervous systems. "Their brains perceive the external world via their sensory systems, compute information and generate appropriate behavioral outputs." The authors' data also clearly show that this is a biased perspective. There is a lot of spontaneous organised activity even in fish that are not exposed to sensory stimulation. This sentence should be reworded, e.g. "The nervous system generates autonomous activity that is modified by sensory systems to adapt the behavioural pattern to the external world." or something along these lines.

      Done

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      In addition to my comments in the "weakness" section above, here are my other comments.

      How many times fish were repeatedly assayed and what the order (alanine followed by cysteine, etc) was, is not clear (Pg 24, Materials and Methods). I am afraid that fish memorize the prior experience to get better/worse their response to the higher conc of alanine, etc. Please clarify this point.

      Many fish were tested in different conditions on consecutive days, indeed. Most often, control experiments (eg, water/nothing; water/water; nothing/nothing) were followed by odor testing. In such cases, there is no risk that fish memorize prior experience and that such previous experience interferes with response to odor. In other instances, fish were tested with a low concentration of one amino acid, followed by a high concentration of another amino acid, which is also on the safe side. Of note, on consecutive days, the odors were always perfused on alternate sides of the test box, to avoid possibility of spatial memory. Finally, in the few cases where increasing concentrations of the same amino acids were perfused consecutively, 1) they were perfused on alternate sides, 2) if the fish does not detect a low concentration below threshold / does not respond, then prior experience should not interfere for responding to higher concentrations, and 3) we have evidence (unpublished, current studies) that when a fish is given increasing concentrations of the same amino acid above detection threshold, then the behavioral response is stable and reproducible (eg does not decrease or increase).

      Minor points:

      Thigmotaxis and wall following.

      Classically, thigmotaxis and wall following are treated as the same (Sharma et al., 2009; https://pubmed.ncbi.nlm.nih.gov/19093125/) but the authors split thigmotaxis into X-axis and Y-axis thigmotaxis because fish repeatedly swam back and forth along an x-axis wall or a y-axis wall. I understand the authors' point in discriminating WF and T, but please present them with more explanation (what the differences between them are) in the introduction and result sections.

      Done

      Pg5 "genetic architecture" in the introduction.

      "Genetic architecture" analysis needs a more genomic survey, such as GWAS, QTL mapping, and Hi-C. Phenotype differences in F2 generation can be stated as "genetic factor(s)" "genetic component(s)", etc. please revise.

      Done

      Pg10 At the serine treatment, the authors concluded that "...suggesting that their detection threshold for serine is lower than for alanine." I believe that the 'threshold for serine is higher' according to the authors' data. Their threshold-related statement is correct in Pg21 "as SF olfactory concentration detection threshold are higher than CF,..." So the statement on page 10 is a just mistake, I think. Please revise.

      Done (mistake indeed)

      Pg11 After explaining Fig5, the statement "In sum, the responses of the different fish types to different concentrations of different amino acids were diverse and may reflect complex, case-by-case, behavioral outputs" does not convey any information. Please revise.

      OK. Done: “In sum, the different fish types show diverse responses to different concentrations of different amino acids.”

      For the personality analysis (Fig 7)

      The index value needs more explanation. I read the materials and methods three times but am still confused. From the equation, the index does not seem to exceed 1.0, unless the "before score" was a negative value, and the "after score" value was positive. I could not get why the authors set a score of 1.5 as the threshold for the cumulative score of these different behavior index values (= individual score). Please provide more description. Currently, I am skeptical about this index value in Fig 7.

      Done, in results and methods.

      Pg15 the discussion section

      Please discuss well the difference between the authors' finding (cavefish respond 10^-4M for position and surface fish responded 10^-4 for thig-Y; Fig 4AB), and those in Hinaux et al. 2016 (cavefish responded 10^-10M alanine but surface fish responded 10^-5M or higher). It seems that surface fish could respond to the low conc of alanine as cavefish do, which is opposed to the finding in Hinaux 2016.

      The increase in NbrtY at population level for surface fish with 10^-4M alanine (~10^-6M in box) was most probably due to only a few individuals. Unlike in cavefish, all other parameters were unchanged in surface fish for this concentration. Moreover, at individual level, only 3.2% of surface fish had significant olfactory scores (to be compared to 81.3% for cavefish). Thus, we think that globally this result does not contradict our previous findings in Hinaux et al. (2016), and solely represents the natural, unexplained variations inherent to the analysis of complex animal behaviors, even when we attempt to use the highest standards of controlled conditions.

      Of note, in the revised version, we have now included a full dose/response analysis for alanine concentrations ranging from 10^-2M to 10^-10M, on cavefish. Alanine 10^-5M has significant effects (now shown in Suppl Fig2 and indicated in text; a column has been added for 10^-5M in Summary Table 1). Lower concentrations have milder effects (described in text) but confirm the very low detection threshold of cavefish for this amino acid.

      Pg19, "In sum, CF foraging strategy has evolved in response to the serious challenge of finding food in the dark"

      My point is the same as explained in the 'weakness' section above: how this behavior is effective in the cave life, if they conclude so? Please explain or revise this statement.

      The present manuscript reports on experiments performed in “artificial” and controlled laboratory conditions. We are fully aware that these conditions are probably distantly related to conditions encountered in the wild. Note that we had written in original version (page 20) “…for 6-week old juveniles in a rectangular box - but the link may be more elusive when considering a fish swimming in a natural, complex environment.” As the reviewer may know, we also perform field studies in a more ethological approach of animal behaviors, thus we may be able to discuss this point more accurately in the future.

      Pg20 "To our knowledge, this is the first time individual variations are taken into consideration in Astyanax behavioral studies."

      This is wrong. Please see Fernandes et al., 2022. (https://pubmed.ncbi.nlm.nih.gov/36575431/).

      OK. The sentence is wrong if taken in its absolute sense, i.e., considering inter-individual variations of a given parameter (e.g., number of neuromasts per individual or number of approaches to vibrating rod in Fernandes et al., 2022). In this same sense, past Astyanax QTL studies on behaviors also took into account variations among F2 individuals. Here, we wanted to stress that personality was taken into consideration. The sentence has been changed: “To our knowledge, this is the first time individual temperament is taken into consideration in Astyanax behavioral studies.”

      Figure 2B and others.

      The order of categories (R, R-TX, etc) should match in all columns (SF, F2, and CF). Currently, the category orders seem random or the larger ratio categories at the bottom, which is quite difficult to compare between SF, F2, and CF. Also, the writings in Fig 2A (times, Y-axis labels, etc), and the bargraphs' writings are quite difficult to read in Fig 2B, Fig 3B 4H, 5GN, 6EFG. Also, no need to show fish ID in Fig 2C in the current way, but identify the fish data points of the fish in Fig 2D (SF#40, CF#65, and F2#26) in Fig 2C if the authors want to show fish ID numbers in the boxplots. Fish ID numbers in other boxplot figures are recommended to be removed too.

      We have thought a lot about how to best represent the distributions of swimming patterns in graphs such as Fig 2B and others. The difficulty is due to the existence of many combinations (33 possibilities in total, see new Suppl Fig7), which are never the same in different plots/conditions because individual tested fish are different. We decided that the best way was to represent, from bottom to top, the most used to the least used swimming patterns, and to use a color code that best matches the different combinations. It was impossible to give the full color code on each figure, therefore it was simplified, and we believe that the results are well conveyed on the graphs. We would like to keep it as it is. To respond (partially) to the reviewer’s concern, we have now added a full color code description in a new Supplemental Figure 7 (associated to Methods).

      Size of lettering has been modified in all pattern graphs like Fig2A. Thanks for the suggestion, it reads better now.

      Finally, we would like to keep the fish ID numbers because this contributes to conveying the message of the paper, that individuality matters.

      Raw data files were not easy to read in Excel or LibreOffice. Please convert them into the csv format to support the rigor in the authors' conclusion.

      We do not understand this request. Our very large dataset must be analysed with R, not Excel, for statistics, plotting and pattern analysis. However, raw data files can be opened in Excel with format conversion.

      Reviewer #2 (Recommendations For The Authors):

      I think most of the experimental procedures (with few exceptions, see below) are well-defined and nicely described, so the majority of my suggestions will be related to the visualization of the data. I think the authors have done a great job in presenting this complex dataset, but there are still some smaller tweaks that could be used to increase the legibility of the presented data.

      First and perhaps foremost, a better definition of the swimming pattern subsets is needed. I have no problem understanding the main behavioral types, but whereas the color codes for these suggest that there is continuous variance within each pattern, it is not clear (at least to me), what particular aspect(s) of the behaviors vary. Also, whereas the sidebars/legends suggest a continuum within these behaviors, the bar charts themselves clearly present binned data. I did not find a detailed description of how the binning was done. As this has been - according the Methods section - a manual process, more clarity about the details of the binning would be welcome. I would also suggest using binned color codes for the legends as well.

      Done, in Results and Methods. We hope it is now clear that there is no “continuum”, rather multiple combinations of discrete swimming patterns. The gradient aspect of the color code in figures has been removed to avoid the idea of a continuum. According to the chosen color code, WF is in red, R in blue, T in yellow and C in green. Combinations are then represented by colors in between; for example, R+WF is purple. We have now added a full color code description for the swimming patterns and their combinations in a new Supplemental Figure 7 (associated to Methods).

      Also, to better explain the definition of the swimming patterns and the graphical representation, it now reads (in Methods):

      “The determination of baseline swimming patterns and swimming patterns after odor injection was performed manually based on graphical representations such as in Figure 2A or Figure 3A. Four distinctive baseline behaviors clearly emerged: random swim (R; defined as haphazard swimming with no clear pattern, covering entirely or partly the surface of the arena), wall following (WF; defined as the fish continuously following along the 4 sides of the box and turning around it, in a clockwise or counterclockwise fashion), large or small circles (C; self explanatory), and thigmotactism (T, along the X- or the Y-axis of the box; defined as the fish swimming back and forth along one of the 4 sides of the box). On graphical representations of swimming pattern distributions, we used the following color code: R in blue, WF in red, C in green, T in yellow. Of note, many fish swam according to combination(s) of these four elementary swimming patterns (see descriptions in the legends of Supplemental figures, showing many examples). To fully represent the diversity and the combinations of swimming patterns used by individual fish, we used an additional color code derived from the “basic” color code described above and where, for example R+WF is purple. The complete combinatorial color code is shown in Suppl. Fig7.”
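
      As a purely illustrative sketch of how a combinatorial color code of this kind can be derived, the snippet below blends the four base colors named above by averaging their RGB channels. The RGB values, the averaging rule, and the function name `combination_color` are assumptions made for the example; the actual color assignments are those shown in Suppl. Fig7.

```python
# Base colors follow the text (R blue, WF red, C green, T yellow).
# The exact RGB triplets are assumed for illustration only.
BASE_COLORS = {
    "R": (0, 0, 255),     # random swim
    "WF": (255, 0, 0),    # wall following
    "C": (0, 128, 0),     # circles
    "T": (255, 255, 0),   # thigmotactism
}


def combination_color(patterns):
    """Return an RGB tuple for a combination of elementary patterns.

    Hypothetical rule: average each RGB channel over the base colors.
    """
    channels = zip(*(BASE_COLORS[p] for p in patterns))
    return tuple(sum(channel) // len(patterns) for channel in channels)


r_plus_wf = combination_color(["R", "WF"])   # blue + red gives a purple
only_t = combination_color(["T"])            # a single pattern keeps its base color
```

      With this rule, R+WF falls between blue and red, which is consistent with the purple mentioned in the text; other blending rules would serve equally well.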

      It would be also easier to comprehend the stacked bar charts, presenting the particular swimming patterns in each population, if the order of different swimming patterns was the same for all the plots (e.g. the frequency of WF always presented at the bottom, R on the top, and C and T in the middle). This would bring consistency and would highlight existing differences between SF, CF, and F2s. Furthermore, such a change would also make it much easier to see (and compare) shifts in behaviors.

      We have thought a lot on how to best represent the distributions of swimming patterns in graphs such as Fig 2B and others. The difficulty is due to the existence of many combinations, which are never the same in different plots/conditions because the individual fish tested are different. We decided to keep it as it currently stands, because we think re-doing all the graphs and figures would not significantly improve the representation. In fact, we think that the differences between morphs (dominant blue in SF, dominant red in CF) and between conditions (bar charts next to each other) are easy to interpret at first glance in the vast majority of cases. Moreover, they are now completed by CA analyses (Suppl Figure 8).

      While the color coding of the timeline in the "3D" plots presented for individual animals is a nice feature, at the moment it is slightly confusing, as the authors use the same color palette as for the stacked bar charts, representing the proportionality of the particular swimming patterns. As the y-axis is already representing "time" here, the color coding is not even really necessary. If the authors would like to use a color scheme for aesthetic reasons, I would suggest using another palette, such as "grey" or "viridis".

      We would like to keep the graphical aspect of our figures as they are, for aesthetic reasons. To avoid confusion with stacked bar chart color code, we have added a sentence in Methods and in the legend of Figure 2, where the colors first appear:

      “The complete combinatorial color code is shown in Suppl. Figure 7. Of note, in all figures, the swimming pattern color code does not relate whatsoever with the time color code used in the 2D plus time representation of swimming tracks such as in Figure 2A”.

      I would also suggest changing the boxplots to violin-plots. Figure 7 clearly shows bimodality for F2 scores (something, as the authors themselves note, not entirely surprising given the probably polygenic nature of the trait), but looking at SF and CF scores I think there are also clear hints for non-normal distributions. If non-normal distribution of traits is the norm, violin-plots would capture the variance in the data in a more digestible way. (The existence of differently behaving cohorts within the population of both SF and CF forms would also help to highlight the large pre-existing variance, something that was probably exploited by natural selection as well, as mentioned briefly in the Discussion by the authors, too.)

      The bimodal distribution of scores shown by F2s in Figure 7B is indeed probably due to the polygenic nature of the trait. However, such a distribution is the exception rather than the norm. Moreover, the boxplot representations we have used throughout the figures include all the individual points, and outliers can be identified as they have the fish ID number next to them. This allows the reader to grasp the variance of the data. Again, redoing all graphs and figures would constitute a lot of work for little gain in terms of conveying the results. Therefore, we choose not to replace the boxplots with violin plots.

      The summary data of individual scores in Table 1B shows some intriguing patterns, that warrant a bit further discussion, in my opinion. For example, we can see opposite trends in scores of SF and CF forms with increasing alanine concentration. Is there an easy explanation for this? Also, in the case of serine, the CF scores do not seem to respond in a dose-dependent manner and puzzlingly at 10^(-3)M serine concentration F2 scores are above those of both grandparental populations.

      That is true. However, we have no simple explanation for this. To begin responding to this question, we have now performed full dose/response experiments for alanine (concentrations tested from 10^-2 M to 10^-10 M on cavefish; these confirm that CF are bona fide “alanine specialists”) and for serine (10^-2 M to 10^-4 M tested on both morphs; these confirm that both morphs respond well to this amino acid). These complementary results are now included in text and figures (partially) and in the summary table 1.

      If anything is known about this, I would also welcome some discussion on how thigmotactic behavior, a marker of stress in SF, could have evolved to become the normal behavior of CF forms, with lower cortisol levels and, therefore lower anxiety.

      We actually think thigmotactism is a marker of stress in both morphs. See Pierre et al, JEB 2020, Figure S3A: in both SF and CF thigmotaxis behavior decreases after long habituation times. In our hands, the only difference between the two morphs is that surface fish (at 5 months of age) express stress by thigmotactism but also freezing and rapid erratic movements, while cavefish have a more restricted stress repertoire.

      This is why in the present paper we have carefully made the distinction between thigmotactism (= possible stress readout) and wall following (= exploratory behavior). Our finding that WF and large circles confer better olfactory response scores to cavefish strongly supports the different nature of these two swimming patterns. Then, why is swimming along the 4 walls of a tank fundamentally different from swimming along one wall? The question is open, although the number of changes of direction is probably an important parameter: in WF the fish always swims forward in the same direction, while in T the fish constantly changes direction when reaching the corner of the tank, which is similar to the erratic swimming of stressed surface fish.

      Finally two smaller suggestions:

      • When referring to multiple panels on the same figure it would be better to format the reference as "Figure 4D-G" instead of "Figure 4DEFG";

      Done

      • On page 4, where the introduction reads as "although adults have a similar olfactory rosette with 20-25 lamellae", in my opinion, it would be better to state that "while adults of the two forms have a similar olfactory rosette with 20-25 lamellae".

      Done

      Reviewer #3 (Recommendations For The Authors):

      Consider moving Figure 3 to be a supplement of Figure 4. This figure shows a water control and therefore best supplements the alanine experiment.

      We would like to keep this figure as a main figure: we consider it very important to establish the validity of our behavioral setup at the beginning of the ms, and to establish that in all the following figures we are recording bona fide olfactory responses.

      "sensory changes in mecano-sensory and gustatory systems " - mechano-sensory.

      Done

      Figure 2 legend: "(3) the right track is the 3D plus time (color-coded)" - shouldn't it be 2D plus time or 3D (x,y, time).

      True! Thanks for noting this, corrected.

      Figure 4 legend "E, Change in swimming patterns" should be H.

      Done

      "suggesting that their detection threshold for serine is lower than for alanine" - higher?

      Done

      In the behavioural plots, I assume that the "mean position" value represents the mean position along the X-axis of the chamber - this should be clarified and the axis label updated accordingly.

      That is correct and has been updated in Methods and Figures and legends.

      "speed, back and forth trips in X and Y, position and pattern changes (see Methods; Figure 7A)." - here it would be helpful to add an explanation like "to define an olfactory score for individual fish."

      This has been changed in Results and more detailed explanations on score calculations are now given in Methods.

      "possess enhanced mecanosensory lateral line" - mechanosensory.

      Done

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript "comparative transcriptomics reveal a novel tardigrade specific DNA binding protein induced in response to ionizing radiation" aims to provide insights into the mediators and mechanisms underlying tardigrade radiation tolerance. The authors start by assessing the effect of ionizing radiation (IR) on the tardigrade lab species, H. exemplaris, as well as the ability of this organism to recover from this stress - specifically, they look at DNA double and single-strand breaks. They go on to characterize the response of H. exemplaris and two other tardigrade species to IR at the transcriptomic level. Excitingly, the authors identify a novel gene/protein called TDR1 (tardigrade DNA damage response protein 1). They carefully assess the induction of expression/enrichment of this gene/protein using a combination of transcriptomics and biochemistry - even going so far as to use a translational inhibitor to confirm the de novo production of this protein. TDR1 binds DNA in vitro and co-localizes with DNA in tardigrades.

      Reverse genetics in tardigrades is difficult, thus the authors use a heterologous system (human cells) to express TDR1 in. They find that when transiently expressed TDR1 helps improve human cell resistance to IR.

      This work is a masterclass in integrative biology incorporating a holistic set of approaches spanning next-gen sequencing, organismal biology, biochemistry, and cell biology. I find very little to critique in their experimental approaches.

      Strengths:

      (1) Use of trans/interdisciplinary approaches ('omics, molecular biology, biochemistry, organismal biology)

      (2) Careful probing of TDR1 expression/enrichment

      (3) Identification of a completely novel protein seemingly involved in tardigrade radio-tolerance.

      (4) Use of multiple, diverse tardigrade species for 'omics comparison.

      Weaknesses:

      (1) No reverse genetics in tardigrades - all insights into TDR1 function from heterologous cell culture system.

      (2) Weak discussion of Dsup's role in preventing DNA damage in light of DNA damage levels measured in this manuscript.

      (3) Missing sequence data which is essential for making a complete review of the work.

      Overall, I find this to be one of the more compelling papers on tardigrade stress-tolerance I have read. I believe there are points still that the authors should address, but I think the editor would do well to give the authors a chance to address these points as I find this manuscript highly insightful and novel.

      We thank the reviewer for his comments.

      We agree that it will be important to further investigate the role of Dsup in radio-tolerance. We briefly mentioned this point in the discussion (p14). Our findings show that tardigrades undergo DNA damage at levels roughly similar to radio-sensitive organisms and therefore support a major role for DNA repair in the maintenance of genome integrity after exposure to IR. Nevertheless, we believe that more precise quantification of DNA damage may still reveal a contribution of genome protection to radio-tolerance of tardigrades compared to radio-sensitive organisms. Dsup loss of function experiments in tardigrades would clearly be the best way to assess this possibility. In the absence of experiments directly addressing the function of Dsup, we prefer to refrain from drawing any firm conclusion on prevention of DNA damage by Dsup and thus to keep a more open position. In any case, as discussed in the text, we note that Dsup has only been reported in Hypsibioidea and other molecular players, such as TDR1, are likely involved in radio-tolerance in other tardigrade species.

      The sequence data can be accessed at the NCBI SRA database with Bioproject ID PRJNA997229.

      Reviewer #3 (Public Review):

      Summary:

      This paper describes transcriptomes from three tardigrade species with or without treatment with ionizing radiation (IR). The authors show that IR produces numerous single-strand and double-strand breaks as expected and that these are substantially repaired within 4-8 hours. Treatment with IR induces strong upregulation of transcripts from numerous DNA repair proteins including Dsup specific to the Hypsibioidea superfamily. Transcripts from the newly described protein TDR1 with homologs in both Hypsibioidea and Macrobiotoidea superfamilies are also strongly upregulated. They show that TDR1 transcription produces newly translated TDR1 protein, which can bind DNA and co-localizes with DNA in the nucleus. At higher concentrations, TDR1 appears to form aggregates with DNA, which might be relevant to a possible function in DNA damage repair. When introduced into human U2OS cells treated with bleomycin, TDR1 reduces the number of double-strand breaks as detected by gamma-H2A.X spots. This paper will be of interest to the DNA repair field and to radiobiologists.

      Strengths:

      The paper is well-written and provides solid evidence of the upregulation of DNA repair enzymes after irradiation of tardigrades, as well as upregulation of the TDR1 protein. The reduction of gamma-H2A.X spots in U2OS cells after expression of TDR1 supports a role in DNA damage.

      Weaknesses:

      Genetic tools are still being developed in tardigrades, so there is no mutant phenotype to support a DNA repair function for TDR1, but this may be available soon.

      We thank the reviewer for his comments.

      Reviewer #4 (Public Review):

      The manuscript brings convincing results regarding genes involved in the radio-resistance of tardigrades. It is nicely written and the authors used different techniques to study these genes. There are sometimes problems with the structure of the manuscript but these could be easily solved. In my opinion, there are also some points which should be clarified in the results sections. The discussion section is clear but could be more detailed, although some results were actually discussed in the results section. I wish that the authors would go deeper in the comparison with other IR-resistant eukaryotes. Overall, this is a very nice study and of interest to researchers studying molecular mechanisms of ionizing radiation resistance.

      I have two small suggestions regarding the content of the study itself.

      (1) I think the study would benefit from the analyses of a gene tree (if feasible) in order to verify if TDR1 is indeed tardigrade-specific.

      (2) It would be appreciated to indicate the expression level of the different genes discussed in the study, using, for example, transcripts per million (TPMs).

      We thank the reviewer for his comments.

      (1) To identify TDR1 homologous sequences in non-tardigrade species, we conducted extensive homology searches using multiple homology-based approaches (Blastp and Diamond against the NCBI non-redundant protein sequences (nr) database and hmmsearch against the EBI reference proteomes), which failed to identify TDR1 homologs in non-tardigrade ecdysozoans, thus strongly supporting that TDR1 is indeed tardigrade-specific.

      To be clearer in the manuscript, we now state the absence of hits for TDR1 in non-tardigrade ecdysozoans. Given the absence of homologs in non-tardigrade species, it is not possible to make a gene tree with non-tardigrade species.

      (2) To further document expression levels (which were already available from the Tables in the initial submission), we added MA plots (representing log2 fold change and log normalized read counts) in the supplementary materials (Supp Figure 3 and Supp Figure 8). These additional figures clearly document that the DNA repair genes discussed in the main text and TDR1 are highly expressed genes after IR and after Bleomycin treatment.
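
      For readers unfamiliar with MA plots, the coordinates of each gene can be sketched as follows. This is a minimal illustration and not the pipeline used for Supp Figures 3 and 8: the gene names, the counts, the pseudocount, and the choice of the mean log2 expression as the x-axis are all assumptions made for the example.

```python
import math


def ma_values(control_counts, treated_counts, pseudocount=1.0):
    """Return {gene: (A, M)} from normalized read counts.

    M is the log2 fold change (treated vs control) and A is the mean
    log2 expression. The pseudocount avoids log(0) and is an assumption.
    """
    points = {}
    for gene, control in control_counts.items():
        c = control + pseudocount
        t = treated_counts[gene] + pseudocount
        m = math.log2(t / c)                       # y-axis: log2 fold change
        a = 0.5 * (math.log2(t) + math.log2(c))    # x-axis: mean log2 expression
        points[gene] = (a, m)
    return points


# Hypothetical normalized counts for two made-up genes.
control = {"geneA": 7.0, "geneB": 63.0}
treated = {"geneA": 127.0, "geneB": 63.0}
points = ma_values(control, treated)
```

      A gene that is both strongly induced and highly expressed appears in the upper right of such a plot, which is the region the response refers to for TDR1 and the DNA repair genes.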

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      We thank the reviewer for his comments.

      (1) It has always seemed strange to me that tardigrades accumulate just as much DNA damage as any other organism when irradiated and yet their Dsup protein is supposed to shield and protect their DNA from damage. Perhaps this is an appropriate time for this idea to be reconsidered given the Dsup was NOT induced by IR in this study and the authors found that their animals incurred just as much damage as other biological systems. While Dsup is clearly not the focus of this manuscript, it is the protein most associated with tardigrade radio-tolerance and I would argue this new paper would call into question previous conclusions made about Dsup.

      We agree that it will be important to further investigate the role of Dsup in radio-tolerance. We briefly mentioned this point in the discussion (p14). Our findings show that tardigrades undergo DNA damage at levels roughly similar to radio-sensitive organisms and therefore support a major role for DNA repair in the maintenance of genome integrity after exposure to IR. Nevertheless, we believe that more precise quantification of DNA damage may still reveal a contribution of genome protection to radio-tolerance of tardigrades compared to radio-sensitive organisms. Dsup loss of function experiments in tardigrades would clearly be the best way to assess this possibility. In the absence of experiments directly addressing the function of Dsup, we prefer to refrain from drawing any firm conclusion on prevention of DNA damage by Dsup and thus to keep a more open position. In any case, as discussed in the text, we note that Dsup has only been reported in Hypsibioidea and other molecular players, such as TDR1, are likely involved in radio-tolerance in other tardigrade species.

      (2) While reverse genetics are difficult in tardigrades, they are not impossible, and RNAi can be used to good effect in these animals. In fact several authors on this manuscript have used RNAi to examine the necessity of genes in tardigrade stress tolerance in the past. Was an attempt made to RNAi TDR1? If not, why? With the large amount of work that the authors put into showing the sufficiency of TDR1 for increasing radiotolerance in cell culture, one would think looking at necessity in tardigrades would be of great interest. If RNAi was performed, what were the results? Even a negative result here is informative since a protein can be sufficient but not necessary for a function - if this were the case it would mean tardigrades have some redundant mechanism(s) for surviving radiation exposure beyond TDR1.

      We have attempted RNAi experiments targeting TDR1 or a mix of DNA repair genes (including XRCC5) and examined the response to a 2-week bleomycin treatment. Unfortunately, we could not distinguish any difference between uninjected animals and animals injected with TDR1 dsRNAs or with the mix of DNA repair gene dsRNAs. We concluded that bleomycin treatment, which we used because it is much easier to perform than irradiation, was perhaps not the best way to assay a potential impact of RNAi on survival, since it required long-term treatment for several days during which the effect of RNAi may have waned. Another attempt was therefore made, injecting TDR1 or control GFP dsRNAs and exposing animals to a 2000 Gy IR treatment. We noticed that viability was lower after injection with GFP dsRNAs than with TDR1 dsRNAs (likely due to problems we had with the injection needle during injections). The next day, animals were irradiated, and we observed after 24h that animals injected with GFP dsRNAs exhibited higher lethality rates than animals injected with TDR1 dsRNAs or uninjected animals. We found that this set of experiments was not conclusive. Our current experimental setup makes it difficult to distinguish lethality due to injections from lethality due to potentially decreased resistance to IR. In particular, many key controls are difficult to perform (for instance, we could not confirm the efficiency of target gene knockdown, as this is very challenging given the low amount of biological material available and the poor expression of these genes without irradiation). From a practical point of view, performing these experiments is thus very challenging. We nevertheless agree that, in future work, further experimentation is needed to examine the impact of RNAi knock-down of TDR1 or of other genes, such as DNA repair genes or Dsup, on tardigrade DNA repair and survival after IR.
      Gene knock-out with CRISPR-Cas9 is a very promising alternative to RNAi, given that studies in mutant lines will eliminate the confounding effect of lethality due to injections.

      (3) Regarding the U2OS experiments. I have several questions/points of clarification:

      (a) Were survival/proliferation levels tested or only H2AX foci? I think that showing decreased H2AX foci (fewer double-stranded breaks) correlates with higher survival rates would be important.

      In the experiments reported in Figure 6, cells were transiently transfected with expression vectors and we did not examine the impact on survival rates. U2OS cells are resistant to high doses of Bleomycin and testing survival would require longer exposure at much higher concentrations (Buscemi et al, 2014, PMID: 25486478). In order to better address an impact on cell survival, we therefore generated populations of cells stably expressing the candidate tardigrade proteins fused to GFP. Despite trying different experimental conditions for treatment with Bleomycin, we could not detect a reproducibly significant benefit on cell survival for any of the tardigrade proteins tested, including RvDsup, which was used as a positive control (since it was previously reported to improve cell survival in response to X-rays). One possibility is that the analysis should be performed in clones and not in populations of cells with heterogeneous expression levels of the tardigrade protein tested. For example, expression levels of the tardigrade protein needed to reduce the number of phospho-H2AX foci in response to DNA damage may interfere with cell division. We note that in the original Dsup paper, the benefit of RvDsup on cell survival was reported in specific transgenic clones. Experiments in different biological systems have also started to document toxic effects of RvDsup expression, illustrating the challenge, when performing experiments in heterologous systems, of achieving suitable expression levels of the tested protein. Performing such a finer analysis would, in our opinion, go beyond the scope of our manuscript and will be best addressed in future studies. We are therefore careful in the text not to make any claim on the benefit of TDR1 expression on cell survival in response to Bleomycin in human cultured cells.

      (b) From the methods I am a bit confused as to how the images were treated/foci quantified. With the automatic segmentation and foci identification, is this done through the entire Z-series or a single layer? If the latter then I am not sure the results are meaningful, since we do not know how many foci might be present in other layers of the nuclei analyzed. If the former, please clarify this in the method since it is a very important consideration.

      We acquired images throughout the entire Z-series and have edited the text to make this clearer. We now write: “Z-stacks were maximum projected and analyzed with Zen Blue software (v2.3)...”. To limit the time needed for image analysis, we generated an artificial image by projecting the entire Z-series into a single image and counted foci in that single maximum projection image. Although there are potential drawbacks, such as counting only one focus when two foci are superposed along the Z axis, this approach overcomes the limitations of quantification from a single layer. We further ensured statistical robustness of the analysis by performing quantification from several independent fields of the labelled cells and several independent biological replicates (n ≥ 3, as now specified in the legend of Figure 6a).
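
      The trade-off described above can be illustrated with a toy example. The sketch below is not the Zen Blue analysis used in the study; it is a minimal NumPy illustration, with a hypothetical 3-plane stack, an arbitrary threshold, and a simple connected-component count standing in for automated foci detection.

```python
import numpy as np


def max_project(stack):
    """Collapse a (z, y, x) stack into one (y, x) image, keeping the brightest value along Z."""
    return np.asarray(stack).max(axis=0)


def count_foci(image, threshold):
    """Count 4-connected groups of pixels brighter than the threshold (simple flood fill)."""
    mask = np.asarray(image) > threshold
    seen = np.zeros(mask.shape, dtype=bool)
    count = 0
    for y, x in zip(*np.nonzero(mask)):
        if seen[y, x]:
            continue
        count += 1
        seen[y, x] = True
        pending = [(y, x)]
        while pending:
            cy, cx = pending.pop()
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                inside = 0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                if inside and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    pending.append((ny, nx))
    return count


# Toy nucleus: three Z planes of 5 x 5 pixels (all values are made up).
stack = np.zeros((3, 5, 5))
stack[0, 1, 1] = 10.0   # focus A, plane 0
stack[2, 1, 1] = 12.0   # focus B, plane 2, superposed on A along Z
stack[1, 3, 3] = 9.0    # focus C, plane 1

projection = max_project(stack)
foci_in_projection = count_foci(projection, threshold=5.0)
foci_per_plane = [count_foci(plane, threshold=5.0) for plane in stack]
```

      In this toy stack, any single plane contains only one of the three foci, whereas the projection recovers two: foci A and B merge because they share the same X-Y position, which is the drawback acknowledged in the response.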

      (c) RvDsup reduced levels of H2AX foci in these experiments; however, HeDsup was not found to be enriched in the transcriptomic analysis performed here. Was there a reason HeDsup was not used in the cell-based experiments? One could argue that RvDsup is from a different species of tardigrade, but it is a bit concerning that an ortholog of a protein found NOT to be induced by radiation exposure seems to perform as well as (if not better than) some versions of TDR1.

      RvDsup is the protein initially shown to increase survival of human HEK293 cells treated with X-rays and reduce the number of phospho-H2AX foci induced: it was therefore used as a positive control in our experiments. The sequence of HeDsup is only poorly similar to RvDsup (with 26% identity) and activity of HeDsup in cultured cells has not been reported before. We therefore believe that HeDsup is not well suited to provide a positive control for the experiments performed in our manuscript.

      (d) From the methods, it seems that cells were treated with Bleomycin and then immediately fixed without any sort of recovery time. In this short timeframe, the presence of TDR1 appears to be enough to deal with a substantial amount of double-stranded breaks (as evidenced by the reduced number of H2AX foci). Does this make sense? How quickly could one expect DNA repair machinery to make significant progress in resolving damaged DNA? This response seems much faster than what was observed in tardigrades. Perhaps the authors could comment on this.

      Kinetic studies in human cells show extremely rapid repair of DNA double-strand breaks. Sensing of DNA double-strand breaks by PARP proteins takes place within seconds after irradiation by IR (Pandey and Black, 2021, PMID: 33674152). NHEJ is then observed to take place by formation of 53BP1 foci within 15 minutes (Schultz et al, 2000, PMID: 11134068). The number of phospho-H2AX and 53BP1 foci peaks at 30 minutes and starts declining thereafter, showing that at a significant number of sites, DNA repair is proceeding very rapidly (by NHEJ). Although we are not aware of any studies of DNA repair kinetics in U2OS cells after addition of Bleomycin, DNA damage must be instantaneous and continue to take place during exposure to the drug, in parallel to DNA repair, which would be expected to have kinetics similar to those after irradiation with IR.

      In our experiments, several mechanisms may be involved in reducing the number of phospho-H2AX foci induced by Bleomycin, such as DNA protection (for Dsup expression) or stimulation of DNA repair (for RNF146 expression). For TDR1, the molecular mechanism involved remains to be determined. Given our finding that TDR1 can form aggregates with DNA, an additional possibility is that clustering of phospho-H2AX foci is induced.

      (4) I could not find the sequences of the TDR1 proteins studied here. I did find the cDNA sequence of HeTDR1 in the final supplementary file, but not the other TDR1 orthologs. Where the TDR1 sequences from other tardigrades appeared to belong, there were only very short segments of the HeTDR1 sequence. All sequences of proteins used in this study should be easily accessible to the reader and reviewers as it is not possible to review this work without accessing the sequences.

      Our apologies for the inappropriate documentation of TDR1 sequences in the original manuscript. As requested, we have now included the TDR1 sequences in the Supplementary Table 4.

      (5) Likewise, the RNA sequence data is said to be deposited in NCBI under PRJNA997229, but I do not find this available on NCBI.

      The RNA sequence data was deposited in NCBI under the indicated reference before submission of the manuscript. The data has now been released and is fully available on NCBI.

      (6) A few typographical errors: e.g., Page 10 - sentence 4 has two periods ". ." or page 14 which has an open parenthesis that is not closed.

      These typos have been corrected in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      We thank the reviewer for his comments.

      In Figure 4C, what fraction of the 50 genes upregulated in all species and treatments are DNA repair genes? Is there any other notable commonality between these 50 genes? The bulk of upregulated genes are specific to a species and to treatment with IR or bleomycin. What fraction of DNA repair genes are specific to a species or treatment?

      The results in Figure 4C on the 50 putative orthologous genes upregulated in all species and treatments are further detailed in Supp Figure 10. The legend to Supp Figure 10 now provides the requested information: 14/50 genes are DNA repair genes, and the other notable commonality is that 21/50 are “stress response genes”. We did not further break down the analysis to evaluate the fraction of DNA repair genes specific to a species or treatment. It will be interesting to gather data in more species to shed light on the evolutionary history of DNA repair gene regulation in response to IR.

      How does the suite of upregulated tardigrade DNA repair proteins after IR or bleomycin compare with DNA or repair proteins upregulated under similar treatments in human cells? Are they quantitatively or qualitatively different, or both?

      There is a great wealth of studies documenting genes differentially expressed in human cells in response to IR (e.g. Borras-Fresneda et al, 2016, PMID: 27245205; Rieger and Chu, 2004, PMID: 15356296; Budworth et al, 2012, PMID: 23144912; Rashi-Elkeles et al, 2011, PMID: 21795128; Jen and Cheung, 2003, PMID: 12915489...). Upregulation of DNA repair and cell cycle genes is commonly found. However, the number of DNA repair genes induced is always very limited and the fold stimulation very modest compared to the massive upregulation observed in tardigrades.

      On page 14, please explain the acronym BER. Do the authors mean Base Excision Repair? Or something else?

      As assumed by the reviewer, the acronym BER stands for Base Excision Repair. The acronym has been removed from the main text and replaced by the full name.

      Reviewer #4 (Recommendations For The Authors):

      We thank the reviewer for his comments.

      Abstract:

      The abstract is fine. What was hard to grasp at the beginning is why the TDR1 gene was named that way. It should be clearer that this study decided to further focus on that gene, one of the most overexpressed genes after IR, with an unknown function. Then maybe introduce that it was found to be unique to tardigrades and to interact with DNA. Therefore, it was named TDR1.

      Introduction:

      The introduction has been modified according to the suggestions of Reviewer#4 below. One of the suggested references, Nicolas et al 2023 from the Van Doninck lab, was published while our manuscript was under review and cannot be considered as background information for our study.

      1st paragraph:

      The study is on tardigrades, I found it strange that the first paragraph is on D. radiodurans. I think it is fine to mention what is known in bacteria and eucaryotes but we should already know what will be the main topic in the first paragraph of the introduction. Some details about D. radiodurans seem less important and distracting from the main topic (3D conformation).

      2nd paragraph:

      When mentioning radio-resistant eucaryotes the authors do not mention the larvae of the anhydrobiotic insect Polypedilum vanderplanki. Stating that the mechanisms of resistance are poorly characterized should perhaps be nuanced. There are some recent studies on D. radiodurans (Ujaoney et al., 2017), the insect P. vanderplanki (Ryabova et al., 2017), tardigrades (Kamilari et al., 2019), and rotifers (Nicolas et al., 2023, Moris et al., 2023). Perhaps these papers are worth citing to indicate that, even if mechanisms are not elucidated yet, recent studies suggest some actors involved in their resistance. The sentence stating that DNA repair rather than DNA protection plays a predominant role in the radio-resistance of bdelloid rotifers should also be nuanced. Indeed, many chaperones and antioxidants were mentioned to play a role in the radio-resistance of bdelloid rotifers (Moris et al., 2023). The authors mentioned the reference Hespeels et al., 2023, which is not found in their list of references; I am not sure which paper they refer to. The last sentence of the second paragraph does not mean much. I am not sure what the authors want to state with this. Perhaps they should specify if they mean that the function of many other genes overexpressed after IR remains unknown.

      Still, in the second paragraph, the authors focus on rotifers. They also do not mention what is known in the insect P. vanderplanki, which should be added. They still do not mention tardigrades. I think it is nice to first start with eucaryotes and then focus on tardigrades but as I mentioned before it would help to understand the aim of the paper if the first paragraph mentioned briefly the tardigrades and then could go into detail in the third paragraph.

      3rd paragraph:

      The sentence starting "with over 1400 species" best to remove from it "but they can differ in their resistance" and start the next sentence with that.

      4th paragraph:

      Very clear, we finally understand what is the focus of the manuscript.

      5th paragraph:

      Very clear. The authors should mention the names of the three studied species. Here, A. antarcticus is missing. The sentence "Further analyses in H. exemplaris... showed that TDR1 protein is present and upregulated". The authors should mention in which conditions the protein is upregulated. In that paragraph the authors mention phospho-H2AX: it might be good to introduce its functions before in the introduction (it is mentioned in the second sentence of the results: best to move it to the introduction).

      Results:

      There are a few sentences in this section which rather discuss the results than describe them. I think the manuscript might gain in quality if these interpretations of the results are moved into the discussion section. That would make the result section more concise and the discussion enriched.

      For instance, I suggest to move these sentences into the discussion:

      • "the finding of persistent DSBs in gonads at 72h.... likely explains...".

      • "suggesting that (i) DNA synthesis..."

      • " Phospho-H2AX....also suggested"

      • "Moreover, expression of TDR1-GFP..., supporting the potential role of TDR1 proteins..."

      • "our results suggest that RNF146 upregulation could contribute..."

      • "AMNP gene g12777 was shown to increase...Based on our results, it is possible that..."

      The interpretations mentioned above were always introduced cautiously (-"suggesting that (i) DNA synthesis..." ; -" Phospho-H2AX....also suggested" ; -"Moreover, expression of TDR1-GFP..., supporting the potential role of TDR1 proteins..." ; -"our results suggest that RNF146 upregulation could contribute..." ). These cautious interpretations were usually important in deciding the next steps of the work. We therefore believe it is important to mention these interpretations in the results section to clearly expose the milestones marking the progression of the study.

      For some results, they were directly discussed in the results section for the sake of concision (for example -"the finding of persistent DSBs in gonads at 72h.... likely explains..."; -"AMNP gene g12777 was shown to increase...Based on our results, it is possible that..." ) since, in our opinion, there was no need to mention them again in the main discussion.

      Some other parts could be good to be moved into the introduction:

      • "Previous studies have indicated that irradiation with IR increases expression of Rad51,..." none of the actors involved in DNA repair are mentioned in the introduction. Also, change resistant into resistance

      • "A. antarcticus ..., known for its resistant to high doses of UV....

      We have moved these parts to the introduction as recommended.

      It was in O. areolatus.... that the first demonstration..."

      This piece of information is somewhat anecdotal. We chose to keep it here in the results section. This information on the radio-resistance of the species P. areolatus is only relevant at this specific step of the study because it encouraged us to consider that P. fairbanksi, which we isolated fortuitously, would be a good model species for studying radio-resistance of tardigrades.

      Here are some additional comments/suggestions on the result section:

      1st section

      • Remove the Gross et al., 2018 from the sentence "using confocal microscopy", it looks otherwise that these results are from their study, not yours.

      We have changed the text to make it clear that this is indeed a finding of Gross et al which was previously made in non-irradiated tardigrades. We replicated this finding, which showed that the protocol was working appropriately, and that we could use this control result for comparison with irradiated animals. We apologize for this confusion.

      The text now states: “Using confocal microscopy, we could detect DNA synthesis in replicating intestinal cells of control animals, as previously shown by (Gross et al. 2018).”

      2nd section

      • It is confusing what has been found induced by IR and/or by Bleomycin.

      • I think it might help if the authors first present what is induced after IR, then write if it is similar after Bleomycin. Especially since they start to do it in the first paragraph of that section. However, they only mention TDR1 in the second paragraph dedicated to Bleomycin treatment which is confusing as it is also overexpressed after IR. It is also not clear if RNF146 is also induced by Bleomycin.

      As recommended, the text presents first what is induced after IR and then what is induced by Bleomycin in the following paragraph. When reporting results with Bleomycin, we have provided a global assessment of what is common to both treatments in Supp Figure 3 and in Supp Table 3. In this figure, we also specifically highlighted several key genes of DNA repair induced by both treatments. These are also mentioned in the text (p8) to illustrate the point that many key DNA repair genes are common to both treatments. We have now added RNF146 to that list as recommended.

      • Regarding TDR1, it is not clear when introduced in the text as "promising candidate" why it is the case. It is clear in the figures but perhaps the authors should explain why they chose these genes for further analyses: high log2foldchange and expression level for instance. Regarding that last comment, it would be interesting to have an idea about the expression level of the genes with high log2foldchange. In Figures 2, 3, and 4 the pvalue and log2foldchange are represented but not the expression level (ideally Transcript per Millions). These values would give an additional idea on the importance of that gene. While looking at the figures, it is unclear why you did not further characterize other genes with high log2foldchange (some with even hints of their function): the mentioned RNF146, macroH2A1 (not even mentioned in the results), some genes unannotated in the figures with likely unknown functions.

      When selecting genes of interest, we did indeed take into account high expression levels. To more clearly document expression levels (which were already available from the Tables), we added MAplots (representing log2foldchange and logNormalized read counts) in the supplementary materials (Supp Figure 3 and Supp Figure 8).

      • It is also unclear at that stage why you named it "Tardigrade DNA damage response protein", as it is characterized as DNA repair/damage proteins by specific GO id or is it based on your downstream analyses, I think it might be worth to quickly mention the reason of that name.

      The name illustrates two points which were already characteristic at this point in time of the study i.e. 1) it is a tardigrade specific protein and 2) it is induced in response to DNA damage.

      • Regarding the BLAST analyses the protein was searched in C. elegans, D. melanogaster and H. sapiens. Why only these three species? What were the threshold e-values used for these analyses? As mentioned in the main comment, it would be worth searching species phylogenetically close to tardigrades to verify if it is indeed tardigrade-specific. Did you try to make a gene tree, after looking for a conserved domain (using hmmersearch)?

      As indicated in the methods section, the “Tardigrade-specific” annotation was determined by absence of hits after high-throughput alignment (with DIAMOND using the --ultra-sensitive option) on the NCBI nr database and absence of hits after blast search on C. elegans, D. melanogaster and H. sapiens proteomes as a complementary criterion (the latter blast search was primarily performed to enrich for functional annotations). Based on these criteria, TDR1 was annotated as “Tardigrade-specific”. As stated in the text, we also searched for TDR1-related sequences with 1) blastp (which is more sensitive than DIAMOND) on the NCBI nr database and 2) HMMER on Reference Proteomes, and no hits were found among non-tardigrade ecdysozoan organisms, confirming that TDR1 is specific to tardigrades. For the BLAST search, for example, there were five hits in non-ecdysozoan organisms (two cephalochordates, one mollusc and two echinoderms). The blastp and HMMER results are now included in the revised supplementary material (Supp Table 5). These very few hits in species phylogenetically distant from tardigrades cannot be taken to support the existence of TDR1 genes outside tardigrades.

      To be clearer in the manuscript, we now state the absence of hits for TDR1 in non-tardigrade ecdysozoans. Given the absence of homologs in non-tardigrade species, it is not possible to make a gene tree with non-tardigrade species.

      • Page 9: "Proteins extracts from H. exemplaris... at 4h and 24h..." I think this sentence can be removed as this is mentioned again 2 paragraphs after: "...we conducted an unbiased proteome analysis... at 4h..." The log2foldchange threshold mentioned for the proteomic analyses is 0.3: why this threshold, was it chosen randomly?

      This threshold is commonly used when considering log2foldchange with the technology used in our study, an isobaric multiplexed quantitative proteomic strategy which is known to compress ratios (Hogrebe et al. 2018).
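
      To make the scale of this cutoff concrete, the following minimal sketch converts a log2 fold change into a linear fold change and applies the 0.3 threshold. The function names are illustrative only and are not taken from the authors' pipeline.

```python
def log2fc_to_fold(log2fc: float) -> float:
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2fc


def passes_threshold(log2fc: float, threshold: float = 0.3) -> bool:
    """Return True when the absolute log2 fold change meets the cutoff.

    The default of 0.3 is the value quoted in the response above.
    """
    return abs(log2fc) >= threshold


# A log2 fold change of 0.3 corresponds to roughly a 1.23-fold change.
print(round(log2fc_to_fold(0.3), 2))
```

      A cutoff this small on the linear scale is in keeping with the ratio compression mentioned above.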

      • Page 10:

      It would be good for more clarity to indicate at the beginning of the new section which species were investigated after IR or Bleomycin treatment.

      TDR1 homologs in the other tardigrade species were identified based on what? Best reciprocal hit?

      As indicated in the methods section of the manuscript, we searched for homologs in other tardigrade species by BLAST. A best reciprocal hit approach was not performed to try to determine which homologs might be orthologs. In particular, most TDR1 homologs identified are known from transcriptome assemblies and high-contiguity genome assemblies are needed to more confidently identify orthology (using synteny). The results of the BLASTP search are now provided as supplementary material (Supp Table 5).

      Preliminary experiments indicated that A. antarcticus and P. fairbanksi survived exposure to 1000 Gy: is there a supplementary graph showing this?

      We have corrected the text to avoid any confusion. We have not rigorously examined the dose-dependent survival of P. fairbanksi in response to irradiation. Text was changed to: “We found by visual inspection of animals after IR that A. antarcticus and P. fairbanksi readily survived exposure to 1000 Gy.”

      • Page 11:

      "A set of 50 genes was upregulated in the three species": please be precise if only after IR.

      Done

      These genes cannot be the same as they are from different species. Did the author mean that they are coding for similar proteins? It might be good to give some more details even if the supplementary figure is mentioned.

      Obviously, these genes are putative orthologs. We have changed the text to:

      “a set of 50 putative orthologous genes was upregulated in response to IR in all three species”

      Discussion:

      • General comment: the discussion is focused mainly on TDR1, it would be nice to also discuss the other results: DNA repair genes, RNF146.

      A whole paragraph is devoted to the discussion of results on DNA repair genes and RNF146. We have extended that discussion following the suggestion of the reviewer. In particular, we have explicitly mentioned the apparent paradox that XRCC5 and XRCC6, which are among the most highly stimulated genes at the mRNA level, only display modest upregulation at the protein level. Although further studies would be needed to examine the mechanisms involved, we propose that upregulation of RNF146, whose human homolog has been shown to drive degradation of PARylated XRCC5 and XRCC6 proteins in response to IR (Kang et al. 2011), may be responsible for higher degradation rates and may thus counterbalance increased levels of protein synthesis.

      • Pulse field electrophoresis would be nice to be performed. It has been used to assess DSBs in bdelloid rotifers, is it possible in tardigrades?

      As stated in the discussion, we believe that it would be challenging to perform pulse field electrophoresis in tardigrades. However, if possible, these experiments would certainly bring invaluable information to complement our analysis of DNA damage induced by IR.

      • "By comparative transcriptomics": please rephrase that sentence.

      • Proteins acting early in DNA repair: I am not sure I understand this sentence. Actors as ligases act not at the beginning of the repair pathways.

      Well noted. We have removed ligases from the list.

      • It is confusing that the authors mention NHEJ and double-strand break repair pathways as different pathways. There are 2 main pathways to repair DSBs: NHEJ and HR. It would be nice to add a reference to the sentence "PARP proteins act as sensors of DNA damage etc."

      A typo in the sentence gave rise to the misleading suggestion that NHEJ is not a double strand repair pathway. It has been corrected.

      A reference has been added for PARP proteins.

      • It would be nice if the authors can explain deeper their suggestion that degradation of DNA repair actors is essential for tardigrade IR resistance.

      We have expanded this part of the discussion and hope that it is clearer.

      “For XRCC5 and XRCC6, our study established, by two independent methods, proteomics and Western blot analyses, that the stimulation at the protein level could be much more modest (6- and 20-fold at most; Supp Figure 6) than at the RNA level (420- and 90-fold, respectively). This finding suggests that the abundance of DNA repair proteins does not simply increase massively to quantitatively match high levels of DNA damage. Interestingly, in response to IR, the RNF146 ubiquitin ligase was also found to be strongly upregulated. RNF146 was previously shown to interact with PARylated XRCC5 and XRCC6 and to target them for degradation by the ubiquitin-proteasome system (Kang et al. 2011). To explain the lower fold stimulation of XRCC5 and XRCC6 at the protein level, it is therefore tempting to speculate that XRCC5 and XRCC6 protein levels (and perhaps those of other scaffolding complexes of DNA repair as well) are regulated by a dynamic balance of synthesis, promoted by gene overexpression, and degradation, made possible by RNF146 upregulation. Consistent with this hypothesis, we found that, similar to human RNF146 (Kang et al. 2011), He-RNF146 expression in human cells reduced the number of phospho-H2AX foci detected in response to Bleomycin (Figure 6).”

      • Page 15: Please add a reference for the sentence "Functional analysis of promotor sequences in transgenic tardigrades etc."

      The reference has been added to fix this omission.

      Material and Methods:

      Small comments:

      • 40 μm mesh: space missing

      • 100 μm mesh: space missing

      • (for Bleomycin)): parenthesis missing

      • remove "as indicated in the text"

      • The investigated time points after radiation need to be clearly stated in the method section. It is also unclear in the IR and Bleomycin section which tardigrades were treated with what. Not all were treated with Bleomycin.

      The small comments above have been fixed in the revised version of the manuscript.

      • Page 21: please precise the coverage of the RNA sequencing

      Statistics on mapping of RNAseq reads are now provided in Supp Table 10.

      • Page 22: Was any read trimming performed? Anything about the quality check of the reads?

      Trimming was conducted using Trimmomatic (v0.39) and quality checks using FastQC (v. ?). This information has been added to the Methods section.

      • Were the analyses confirmed by a second approach: for instance, EdgeR? Deseq2 and EdgeR do not always have the same results. For more robust analyses it is advised to use both.

      Differential transcriptome analyses were conducted with DESeq2 only. The robustness of our identification of differentially expressed genes in response to IR stems from performing comparative analyses in three different species, rather than from using two bioinformatics pipelines in a single species. We also note that benchmarking reported in the initial DESeq2 paper showed that identification of differentially expressed genes with large log fold changes (which, as reported in our manuscript, is characteristic of many DNA repair genes in response to IR) is very consistent between DESeq2 and edgeR.
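
      The kind of concordance check the reviewer has in mind can be sketched as a simple overlap summary between two lists of differentially expressed genes. This is a hypothetical illustration, not an analysis performed in the study; the gene identifiers are placeholders, and the lists are assumed to have been exported beforehand from each tool.

```python
def overlap_summary(genes_a, genes_b):
    """Summarize the agreement between two sets of gene identifiers."""
    a, b = set(genes_a), set(genes_b)
    shared = a & b
    union = a | b
    # Jaccard index: shared genes divided by all genes called by either tool.
    jaccard = len(shared) / len(union) if union else 1.0
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "jaccard": jaccard,
    }


# Placeholder identifiers standing in for genes called by each tool.
tool_a_calls = ["g1", "g2", "g3", "g4"]
tool_b_calls = ["g2", "g3", "g4", "g5"]
print(overlap_summary(tool_a_calls, tool_b_calls))
```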

      Figures:

      • Figure 2: Legend vertical dotted line does not indicate log2foldchange value of 4 in all panels: it would be good to indicate for panels a and c as well.

      Figure 2 has been improved following the suggestions of the reviewer. Dotted lines now show a log2foldchange value of 2 in all panels (i.e. a fold change of 4, as mentioned in the main text).

      • Figure 2C: There are a few points with high log2foldchange which are not annotated: was it because nothing was found in the blast research? If yes, it would be good to indicate their functions. If not, it would be good to mention in the discussion that there are some genes with still unknown functions which might play an important role in the resistance of tardigrades to IR.

      The few points which are not annotated in Figure 2C can now be found in Supp Table 3. Some of them have no hit in the BLAST search; others, such as BV898_09662 or BV898_07145, have hits on DNA repair genes (RBBP8/CtIP and XRCC6, respectively) but are not annotated as such by eggNOG in the KEGG pathway.

      • Figure 4C: Why not have included the response of P. fairbanksi to bleomycin? I guess it was not done, but it is unclear in the results and methods sections.

      The response of P. fairbanksi to bleomycin was not assessed because we did not obtain enough animals to run the study. The methods section has been modified to clarify this point.

    1. Author response:

      Reviewer #1 (Public Review):

      “… it remains unclear how ninein reduction causes bone defects …”

      We have added several control experiments that permit us to conclude that osteoblast numbers remain unaltered in the ninein-knockout embryos, and that bone abnormalities in vivo are caused by fusion defects of osteoclast precursor cells, whereas the proliferation, viability, or the adhesion of these precursor cells remain unaffected. For details, please see our comments below.

      “Discussion includes several unfounded potential mechanisms that really need to be thoroughly analyzed to gain a mechanistic understanding of the bone defects…”

      The new data back up our claim of fusion defects as a cause for limited osteoclast function. We have re-written parts of the discussion, to take into account our new findings.

      “Data showing normal osteoblasts in ninein-null mice was qualitative and requires further in-depth analysis and quantification of osteoblast …”

      To address this point, quantification of osteoblast numbers in tibiae at E16.5 and E18.5 was performed in control and ninein-deleted mouse embryos. The data are presented in the new Figures 3G and J.

      “In ninein knock-out mice, reduced TRAP+ve multinuclear cells were observed (Figure 6A and 6B). However, the magnitude of difference (about 5% decrease in multinucleated cells) is not consistent with the skeletal deformities reported in Figures 2-4, potentially suggesting the contribution of additional mechanisms.”

      We agree that the difference appears to be small at first glance, but nevertheless it remains statistically significant (a more than three-fold difference). We would like to recall that these observations (Fig. 6A) were performed at E14.5, i.e. at a stage when no ossification has occurred yet. We are looking at the first fusion events of myeloid precursors, likely derived from the fetal liver, that colonize the area of the first bone to form, and small differences in the number of functional osteoclasts may account for different timing of ossification. We think that differences in osteoclast fusion also account for the premature appearance of ossification centers for other skeletal elements, at later time points during development.

      “The fusion assay in Figure 6C needs further clarification. How was the syncytia perimeter defined to measure cell surface? The x-axis suggests that there are syncytia that contain up to 160 nuclei at day 3. How were the nuclei differentially stained and quantified?”

      We now provide additional information on the experimental approach in the revised manuscript, on pages 16-17 (Materials and Methods). For information: high numbers of syncytial nuclei in cultures were also observed by other groups in the past (Tiedemann et al., 2017, Front Cell Dev Biol. 5:54). In addition, we performed new experiments and quantified the fusion of osteoclast precursors by staining for actin and nuclei (new Figure 7C). This allowed us to quantify several additional parameters related to cell fusion (as initially performed in Raynaud-Messina et al., 2018, PNAS, 115:E2556-E2565).

      “Some text needs clarification. … What is the definition of "large syncytia"? Is the fusion index increase by day 5 diminished in later days? A graph of the syncytia size/ nuclei number or fusion index in the above-mentioned days will be helpful.”

      Information on the definition of “large syncytia” is now provided on page 10 (1st paragraph). We added further experimental details on osteoclast size for days 3, 4, and 5 in the supplemental Figures 7A and B. Most importantly, we performed additional assays of the fusion index by quantifying syncytial versus non-syncytial nuclei in a semi-automated manner. The new data are presented in Figure 7C, and the methods are explained on page 17. Together with our new analysis of cell proliferation, cell viability, and cell adhesion (Figure 7C, D, suppl. Fig. 7C-G), we now provide solid evidence for a fusion defect at the origin of impaired formation of ninein del/del osteoclasts.
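
      As a rough illustration of a fusion index based on syncytial versus non-syncytial nuclei, the sketch below takes the number of nuclei counted in each cell and returns the fraction of nuclei that sit in syncytia. The cutoff of three nuclei for calling a cell a syncytium is an assumption made here for the example; the definition actually used is the one given in the manuscript's methods, and the counts shown are invented.

```python
def fusion_index(nuclei_per_cell, min_nuclei=3):
    """Fraction of all nuclei that belong to syncytia.

    `nuclei_per_cell` lists the number of nuclei counted in each cell.
    `min_nuclei` is the assumed cutoff for calling a cell a syncytium.
    """
    total = sum(nuclei_per_cell)
    if total == 0:
        return 0.0
    syncytial = sum(n for n in nuclei_per_cell if n >= min_nuclei)
    return syncytial / total


# Invented counts: four cells below the cutoff and three syncytia.
counts = [1, 1, 2, 5, 12, 1, 3]
print(fusion_index(counts))
```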

      “Assessment of resorption was qualitative in Figure 6E and since the fusion deficiencies are transient, quantification of a corresponding resorption activity is needed. This should be described in the Materials and Methods section.”

      Quantifications of the bone resorption activities are now provided in the new Figure 7E, and a reference for the methods is provided on page 16.

      “Further experiments are needed to show connections between reduced centrosome clustering and reduced osteoclast formation as there is no evidence to date that suggest centrosome clustering is required for cell fusion. Multi-color live imaging and dynamic analysis can be used to determine if the ninein deficient cells show defective movement/migration/ fusion dynamics.”

      We agree that it is an important question, and studying potential links between centrosomal microtubule organization and osteoclast fusion is an ongoing project of the team. However, we estimate that in order to obtain conclusive results this will require 1-2 additional years of research activity, and we intend to present this as a separate project in the future. At the current point of our investigation, we think that providing a solid link between ninein, osteoclast fusion, and controlled timing of ossification, as shown in this manuscript, represents valuable progress to understand previously published bone abnormalities in patients with ninein mutations.

      “Quantification of the % of multinucleated osteoclasts that contain clustered and dispersed centrosomes is needed.”

      New quantification experiments on centrosome clustering are now provided in Figure 8H. These quantifications demonstrate that the potential of centrosome clustering is almost completely lost in osteoclasts without ninein.

      Reviewer #2 (Public Review):

      “Based on the decrease in the number of osteoclasts (Fig 5E, G, and also per coverslip after 2 days in culture), the authors suggest that the loss of ninein impacts osteoclast proliferation. First, proliferation can be directly quantified using Ki67 staining or EdU incorporation. Second, other interpretations are also plausible and can also be experimentally tested. These include less adhesion and attachment of the mutants to the coverslips, but perhaps more relevant in vivo is cell death of the ninein mutant osteoclasts. It has been established that the loss of centrosome function activates p53- dependent cell death and osteoclasts might be a vulnerable cell population. Quantifying p53 immunoreactivity and/or cell death in osteoclasts might help clarify the phenotype of osteoclast reduction.”

      In response to the reviewers, we have performed a series of new experiments that include:

      1) A careful analysis of the fusion index, using a semi-automated approach, indicating significant differences in the fusion of precursor cells into osteoclasts (Fig. 7C).

      2) We have repeated the quantification of cell numbers prior to fusion and find variations between samples from different mice (also among mice of the same genotype), but we see on average comparable cell adhesion between samples from control mice and ninein-del/del mice. The data are provided in the supplemental Figure 7F. Moreover, we have quantified the expression of three main beta-integrins at the surface of control and ninein del/del osteoclast precursors (suppl. Fig. 7G), without detecting significant differences. Altogether, these data suggest that cell adhesion is comparable for the two genotypes.

      3) We have addressed the question of altered cell proliferation, by performing flow cytometry experiments and by quantifying the different cell cycle stages (Fig. 7D), and by quantifying Ki67 expression (suppl. Fig. 7C). We see no significant differences between samples from control and ninein-del/del mice.

      4) We have addressed the question of cell death, by performing Annexin V staining and flow cytometry (suppl. Fig. 7D), and by immunoblotting for cleaved caspase 3 and PARP (suppl. Fig. 7E). These experiments reveal no significant differences between the control and ninein del/del samples. Our data permit us to exclude cell death as a likely cause for the reduction of fused osteoclasts in the absence of ninein.

      Overall, the new experiments show that the defects in osteoclast formation from ninein-deleted samples are due to defects in cell fusion, but not in cell proliferation, cell adhesion or viability.

      Reviewer #3 (Public Review):

      “The authors put much emphasis on the centrosome in the Introduction section. However, it was not until Figure 7 that they showed abnormal centriole clustering in osteoclasts. The introduction should include more background on osteoclast and osteoblast balance during skeletal development.”

      To address this, we included more background on the role of osteoclasts and osteoblasts in the revised introduction (page 4).

    1. Reviewer #3 (Public Review):

      Summary:

      The authors consider several known aspects of PV and SOM interneurons and tie them together into a coherent single-cell model that demonstrates how the aspects interact. These aspects are:

      (1) While SOM interneurons target distal parts of pyramidal cell dendrites, PV interneurons target perisomatic regions.

      (2) SOM interneurons are associated with beta rhythms, PV interneurons with gamma rhythms.

      (3) Clustered excitation on dendrites can trigger various forms of dendritic spikes independent of somatic spikes.

      The main finding is that SOM and PV interneurons are not simply associated with beta and gamma frequencies respectively, but that their ability to modulate the activity of a pyramidal cell "works best" at their assigned frequencies. For example, distally targeting SOM interneurons are ideally placed to precisely modulate dendritic Ca-spikes when their firing is modulated at beta frequencies or timed relative to excitatory inputs. Outside those activity regimes, not only is modulation weakened, but overall firing reduced.

      Strengths:

      I think the greatest strength is the model itself. While the various individual findings were largely known or strongly expected, the model provides a coherent and quantitative picture of how they come together and interact.

      The paper also powerfully demonstrates that an established view of "subtractive" vs. "divisive" inhibition may be too soma-focused and provide an incomplete picture in cells with dendritic nonlinearities giving rise to a separate, non-somatic all-or-nothing mechanism (Ca-spike).

      Weaknesses:

      While the authors overall did an admirable job of simulating the neuron in an in-vivo-like activity regime, I think it still provides an idealized picture that is optimized for the generation of the types of events the authors were interested in. That is not a problem per se - studying a mechanism under idealized conditions is a great advantage of simulation techniques - but this should be more clearly characterized. Specifics on this are very detailed and will follow in the comments to authors.

      What disappointed me a bit was the lack of a concise summary of what we learned beyond the fact that beta and gamma act differently on dendritic integration. The individual paragraphs of the discussion often are 80% summary of existing theories and only a single vague statement about how the results in this study relate. I think a summarizing schematic or similar would help immensely.

      Orthogonal to that, there were some points where the authors could have offered more depth on specific features. For example, the authors summarized that their "results suggest that the timescales of these rhythms align with the specialized impacts of SOM and PV interneurons on neuronal integration". Here they could go deeper and try to explain why SOM impact is specialized at slower time scales. (I think their results provide enough for a speculative outlook.)

      Beyond that, the authors invite the community to reappraise the role of gamma and beta in coding. This idea seems to be hindered by the fact that I cannot find a mention of a release of the model used in this work. The base pyramidal cell model is of course available from the original study, but it would be helpful for follow-up work to release the complete setup, including excitatory and inhibitory synapses and their activation in the different simulation paradigms used, as well as the related code.

      Impact:

      Individually, most results were at least qualitatively known or at least expected. However, demonstrating that beta-modulation of dendritic events and gamma-modulation of soma spiking can work together, at the same time and in the same model can lead to highly valuable follow-up work. For example, by studying how top-down excitation onto apical compartments and bottom-up excitation on basal compartments interacts with the various rhythms; or what the impact of silencing of SOM neurons by VIP interneuron activation entails. But this requires - again - public release of the model and the code controlling the simulation setups.

      Beyond that, the authors clearly demonstrated that a single compartment, i.e., only a soma-focused view is too simple, at least when beta is considered. Conversely, the authors were able to describe the impact of most things related to the apical dendrite on somatic spiking as "going through" the Ca-spike mechanism. Therefore, the setup may serve as the basis of constraining simplified two-compartment models in the future.

    1. Reviewer #1 (Public Review):

      This manuscript describes the pattern of relaxed selection observed at spermatogenesis genes in gorillas, presumably due to the low sperm competition associated with single-male polygyny. The analyses to detect patterns of selection are very thorough, as are the follow up analyses to characterize the function of these genes. Furthermore, the authors take the extra steps of in vivo determination of function with a Drosophila model.

      This is an excellent paper. It addresses the interesting phenomenon of relaxation of selection as a genomic signal of reproductive strategies using multiple computational approaches and follow-up analyses by pulling in data from GO, mouse knockouts, human infertility database, and even Drosophila RNAi experiments. I really appreciate the comprehensive and creative approach to analyze and explore the data. As far as I can tell, the analyses were performed soundly and statistics are appropriate. The Introduction and Discussion sections are thoughtful and well-written. I have no major criticisms of the manuscript.

      The main area that I would suggest for improvement is in the "Caveats and Limitations" section of the Discussion. Currently, the first paragraph of this section states the obvious that genetic manipulation of gorillas is not feasible. Beyond a reminder to the reader that this was a rationale for the Drosophila work, it isn't really adding much insight. The second paragraph is a brief discussion of the directionality of change. I think it comes across as overly simplistic, with a sort of "well, we can never know" feel. Obviously, there are plenty of researchers who do model change to infer direction and causation, and there are plenty of published papers attempting to do so with respect to mating systems in primates.

      I do not think the authors need to remove these paragraphs, but I do encourage them to turn the "Caveats and Limitations" section into something more meaningful by addressing limitations of the work that was actually done rather than limitations of hypothetical things that were not done. A few areas come to mind. First, the authors should discuss the effect of gene-tree vs species-tree inconsistencies in the analyses, which could affect the identification of gorilla-specific amino acid changes and/or the dN/dS estimates. Incomplete lineage sorting is very common in primates including the gorilla-chimp-human splits (Rivas-González et al. 2023). It would be nice to hear the authors' thoughts on how that might affect their analyses. Second, the dN/dS-based analyses assume the neutrality of synonymous substitutions. Of course, that assumption is not completely true; it might be true enough, and the authors should at least note it as a caveat. Third, and potentially related, is the consideration that these protein-coding genes may be functioning in other ways such as via antisense transcription. The genes under relaxed selection may be on their way to becoming pseudogenes and evolving as such at the sequence level, but many pseudogenes continue to be transcribed in the sense or antisense direction for a regulatory purpose. I don't think there is a way to incorporate this into the authors' analyses but it would be nice to see it acknowledged as a caveat or limitation.

    1. As we look at the above examples we can see examples of intersectionality [q13], which means that not only are people treated differently based on their identities (e.g., race, gender, class, disability, weight, height, etc.), but combinations of those identities can compound unfair treatment in complicated ways. For example, you can test a resume filter and find that it isn’t biased against Black people, and it isn’t biased against women. But it might turn out that it is still biased against Black women. This could happen because the filter “fixed” the gender and race bias by over-selecting white women and Black men while under-selecting Black women.

      I think intersectionality is essential because it helps us understand how different identity traits can intersect to affect an individual's experience. For example, an Asian pansexual male may face discrimination based on both race and sexual orientation, and the impact of such compounded discrimination may be much more complex than discrimination based on a single identity. This situation shows that we cannot consider only one factor when addressing discrimination and inequality; we must consider how multiple identity factors interact and may lead to more complex injustices. Such insights are critical to developing more effective equality policies and interventions.
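
      The resume-filter scenario in the quoted passage can be checked with simple arithmetic. The sketch below uses invented applicant counts (every number and name in it is hypothetical, not taken from any real audit) to show how selection rates can look equal by race and equal by gender while the combined group of Black women is still selected far less often.

```python
# Hypothetical audit counts: (race, gender) -> (applicants, selected).
# All numbers are invented purely to illustrate the arithmetic.
COUNTS = {
    ("Black", "woman"): (100, 10),
    ("Black", "man"): (100, 50),
    ("white", "woman"): (100, 50),
    ("white", "man"): (300, 70),
}


def selection_rate(counts, race=None, gender=None):
    """Return selected / applicants over all groups matching the filters."""
    applicants = 0
    selected = 0
    for (r, g), (n, k) in counts.items():
        if (race is None or r == race) and (gender is None or g == gender):
            applicants += n
            selected += k
    return selected / applicants


if __name__ == "__main__":
    # Single-identity checks: these rates come out equal.
    for race in ("Black", "white"):
        print(f"race={race}: {selection_rate(COUNTS, race=race):.2f}")
    for gender in ("woman", "man"):
        print(f"gender={gender}: {selection_rate(COUNTS, gender=gender):.2f}")
    # Intersectional check: the combined groups differ sharply.
    for race, gender in COUNTS:
        rate = selection_rate(COUNTS, race=race, gender=gender)
        print(f"{race} {gender}: {rate:.2f}")
```

      In this toy table a test on race alone or gender alone would pass, and only the combined breakdown reveals the gap, which is the point the passage makes about testing identities one at a time.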

    1. In what ways have you found social media bad for your mental health and good for your mental health?

      When it comes to mental health issues in our generation, I can say with almost 100% confidence that social media is the main culprit. With technology integrated into every aspect of our lives nowadays, we can blame social media for being so easily accessible and influential on our generation and generations to come. From my experience, many social media sites encourage people to showcase only the good aspects of their lives, which gives a very one-sided view of everyone. This is very harmful, as many younger people (and probably older people too) tend to compare themselves to influencers and other accounts, which can worsen conditions like depression and anxiety and fuel jealousy. Additionally, with all these ways to edit photos, these posts may not even be real, but they sure seem real, leading to negative body-image thoughts and unhealthy diet and workout plans. I think in some ways social media can be good for your mental health, whether that is watching funny videos to lighten the mood or looking at what other people with similar hobbies do for inspiration. However, it is hard to filter the harmful content out from the content we want and enjoy.

    1. C. L. Lynch. “Autism is a Spectrum” Doesn’t Mean What You Think. NeuroClastic, May 2019. URL: https://neuroclastic.com/its-a-spectrum-doesnt-mean-what-you-think/ (visited on 2023-12-08).

      The article addresses the misconception that autism is a gradient rather than a true spectrum. It uses vivid analogies, such as the visible light spectrum, to illustrate its point. It highlights the diversity within autism by explaining how different individuals can have unique combinations of traits, emphasizing that autism is not just one condition but a complex collection of related neurological conditions. The author argues that we shouldn't compare different types of autism as "more" or "less" severe, and emphasizes the importance of understanding individual strengths and challenges rather than making assumptions based on outward behaviors.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reply to reviewer comments

      We extend our gratitude to the reviewers for their time and valuable feedback on our manuscript. We especially appreciate the insightful suggestions that have significantly contributed to refining our work and elucidating our findings. With the revisions made to the text and the inclusion of new experimental data, we believe our manuscript now effectively addresses all reviewer comments. We eagerly await your evaluation of our revised submission.

      Small ARF-like GTPases play fundamental roles in dynamic signaling processes linked with vesicular trafficking in eukaryotes. Despite their evolutionary conservation, little is known about ARF-like GTPase functions in plants. Our manuscript reports the biochemical and cell biological characterization of the small ARF-like GTPase TTN5 from the model plant Arabidopsis thaliana. Fundamental investigations like ours are mostly lacking for ARF and ARL GTPases in Arabidopsis.

      We employed fluorescence-based enzymatic assays suited to uncover different types of the very rapid GTPase activities for TTN5. The experimental findings are now illustrated in a more comprehensive modified Figure 2 and in the form of a summary of the GTPase activities for TTN5 and its mutant variants in the NEW Figure 7A in the Discussion part. Taken together, we found that TTN5 is a non-classical GTPase based on its enzymatic kinetics. The reviewers appreciated these findings and highlighted them as being "impressive in vitro biochemical characterization" and "major conceptual advance". Since such experiments are "uncommon" for being conducted with plant GTPases, reviewers regarded this analysis as "useful addition to the plant community in general". The significance of these findings is given by the circumstance that "the ARF-like proteins are poorly addressed in Arabidopsis while they could reveal completely different function than the canonical known ARF proteins". Reviewers saw here clearly a "strength" of the manuscript.

      With regard to the cell biological investigation and initial assessment of cell physiological roles of TTN5, we now provide the requested additional evidence. First of all, we provide NEW data on the localization of TTN5 by immunolocalization using a complementing HA3-TTN5 construct, supporting our initial suggestions that TTN5 may be associated with vesicles and processes of the endomembrane system. The previous preprint version had left the reviewers "less convinced" of the cell biological data due to the lack of complementation of our YFP-TTN5 construct, the lack of Western blot data and the low resolution of microscopic images. We fully agree that these points were of concern and needed to be addressed. We have therefore worked intensively on these "weaknesses" and now present a more detailed whole-mount immunostaining series with the complementing HA3-TTN5 transgenic line (NEW Figure 4, NEW Figure 3P) and Western blot data (NEW Supplementary Figures S7C and D). Upon publication of our manuscript, we will also provide all original images at BioImage Archive, in the high quality needed for re-analysis. BioImage Archive is an online repository for biological image data associated with a peer-reviewed publication. This way, readers will be able to inspect each image in detail. The immunolocalization data are of particular importance as they indicate that HA3-TTN5 can be associated with punctate vesicle structures and BFA bodies, as seen with YFP studies of YFP-TTN5 seedlings. We have re-phrased the text very carefully and emphasized those localization patterns which are backed up by both immunostaining and YFP fluorescence detection of YFP-TTN5 signals. To improve comprehension, the findings are summarized in a schematic overview in NEW Figure 7B of the Discussion. We have also addressed all other comments related to the cell biological experiments to "provide the substantial improvement" that had been requested.

      We emphasize that we found two cell physiological phenotypes for the TTN5T30N mutant: a difference in the mobility of the fluorescent vesicles in the epidermis of hypocotyls of YFP-TTN5T30N seedlings (see Video material and NEW Supplementary Video Material S1M-O), and a root growth phenotype of transgenic HA3-TTN5T30N seedlings (NEW Figure 3O). We explain the cell physiological phenotypes in relation to the enzymatic GTPase data. These findings convince us of the validity of the YFP-TTN5 analysis as indicative of TTN5 localization.

      We are deeply thankful to the reviewers for judging our manuscript as "generally well written", "important" and "of interest to a wide range of plant scientists" and "for scientists working in the trafficking field" as it "holds significance" and will form the basis for future functional studies of TTN5.

      We very carefully prepared our revised manuscript, in which we address all reviewer comments one by one. Please find our revision and our detailed rebuttal to all reviewer comments below. Changes in the revised version are highlighted in yellow and green in the "revised version with highlighted changes".

      With these adjustments, we hope that our peer-reviewed study will receive a positive response.

      We are looking forward to your evaluation of our revised manuscript and thank you in advance,

      Sincerely

      Petra Bauer and Inga Mohr on behalf of all authors

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript from Mohr and collaborators reports the characterization of an ARF-like GTPase of Arabidopsis. Small GTPases of the ARF family play a crucial role in intracellular trafficking and plant physiology. The ARF-like proteins are poorly addressed in Arabidopsis while they could reveal completely different function than the canonical known ARF proteins. Thus, the aim of the study is important and could be of interest to a wide range of plant scientists. I am impressed by the biochemical characterization of the TTN5 protein and its mutated versions; this is clearly a very nice point of the paper and allows for proper interpretations of the other results. However, I was much less convinced by the cell biology part of this manuscript, and aside from the subcellular localization of TTN5, I think the paper would benefit from a more functional angle. Below are my comments to improve the manuscript:

      1- In the different pictures and movies, TTN5 quite clearly appears in a typical ER-like pattern. The pattern of localization further extends to dotty-like structures and structures labeled only at the periphery of the structure, with a depletion of fluorescence inside the structure. These observations raise several points. First, the ER pattern is never mentioned in the manuscript while I think it can be clearly observed. Given that the YFP-TTN5 construct is not functional (the mutant phenotype is not rescued), the ER localization could result from retention at the ER by quality control. The HA-TTN5 construct is functional but to me its localization shows a quite different pattern from the YFP version; I do not see the ER, for example, or the periphery-labeled structures. In this case, it will be a crucial point to perform co-localization experiments between HA-TTN5 and organelle markers to confirm that the functional TTN5 construct is labeling the Golgi and MVBs, as does the non-functional one. I am also quite sure that a co-localization between YFP-TTN5 and HA-TTN5 will not completely match... The ER is contacting so many organelles that the localization of YFP-TTN5 might not reflect the real location of the protein.

      Our response:

      First, we would like to state that specific detection of the intracellular localization of plant proteins in plant cells is generally technically very difficult when the protein abundance is not overly high. In this revised version, we extended the immunostaining analysis to different membrane compartments, now including immunostaining of complementing HA3-TTN5 in the absence and presence of BFA, along with immunodetection of ARF1 and FM4-64 labeling in roots (NEW Figure 3P, NEW Figure 4A, B). In the revised version, we focus the analysis and conclusions on the fluorescence patterns that overlap between YFP-TTN5 detection and HA3-TTN5 immunodetection. With this, we can be most confident about subcellular TTN5 localization. Please find this NEW text in the Result section (starting Line 323):

      "For a more detailed investigation of HA3-TTN5 subcellular localization, we then performed co-immunofluorescence staining with an Alexa 488-labeled antibody recognizing the Golgi and TGN marker ARF1, while detecting HA3-TTN5 with an Alexa 555-labeled antibody (Robinson et al. 2011, Singh et al. 2018) (Figure 4A). ARF1-Alexa 488 staining was clearly visible in punctate structures representing presumably Golgi stacks (Figure 4A, Alexa 488), as previously reported (Singh et al. 2018). Similar structures were obtained for HA3-TTN5-Alexa 555 staining (Figure 4A, Alexa 555). But surprisingly, colocalization analysis demonstrated that the HA3-TTN5-labeled structures were mostly not colocalizing and thus distinct from the ARF1-labeled ones (Figure 4A). Yet the HA3-TTN5- and ARF1-labeled structures were in close proximity to each other (Figure 4A). We hypothesized that the HA3-TTN5 structures can be connected to intracellular trafficking steps. To test this, we performed brefeldin A (BFA) treatment, a commonly used tool in cell biology for preventing dynamic membrane trafficking events and vesicle transport involving the Golgi. BFA is a fungal macrocyclic lactone that leads to a loss of cis-cisternae and accumulation of Golgi stacks, known as BFA-induced compartments, up to the fusion of the Golgi with the ER (Ritzenthaler et al. 2002, Wang et al. 2016). For a better identification of BFA bodies, we additionally used the dye FM4-64, which can emit fluorescence in a lipophilic membrane environment. FM4-64 marks the plasma membrane in the first minutes following application to the cell, then may be endocytosed and in the presence of BFA become accumulated in BFA bodies (Bolte et al. 2004). We observed BFA bodies positive for both, HA3-TTN5-Alexa 488 and FM4-64 signals (Figure 4B). Similar patterns were observed for YFP-TTN5-derived signals in YFP-TTN5-expressing roots (Figure 4C). Hence, HA3-TTN5 and YFP-TTN5 can be present in similar subcellular membrane compartments."

      We did not find evidence that HA3-TTN5 can localize at the ER using whole-mount immunostaining (NEW Figure 3P; NEW Figure 4A, B). Hence, we are careful with describing that fluorescence at the ER, as seen in the YFP-TTN5 line (Figure 3M, N) reflects TTN5 localization. We therefore do not focus the text on the ER pattern in the Result section (starting Line 295):

      "Additionally, YFP signals were also detected in a net-like pattern typical for ER localization (Figure 3M, N). (...) We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging between 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other side, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), as was the case for fluorescence signals in YFP-TTN5-expressing cells. Presumably, this can indicate that either the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused interpretation on patterns of localization overlapping between the fluorescence staining with YFP-labeled TTN5 and with HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

      And we discuss in the Discussion section (starting Line 552):

      "We based the TTN5 localization data on tagging approaches with two different detection methods to enhance reliability of specific protein detection. Even though YFP-TTN5 did not complement the embryo-lethality of a ttn5 loss of function mutant, we made several observations that suggest YFP-TTN5 signals to be meaningful at various membrane sites. We do not know why YFP-TTN5 does not complement. There could be differences in TTN5 levels and interactions in some cell types, which were hindering specifically YFP-TTN5 but not HA3-TTN5. (...) Though constitutively driven, the YFP-TTN5 expression may be delayed or insufficient at the early embryonic stages resulting in the lack of embryo-lethal complementation. On the other hand, the very fast nucleotide exchange activity may be hindered by the presence of a large YFP-tag in comparison with the small HA3-tag which is able to rescue the embryo-lethality. The lack of complementation represents a challenge for the localization of small GTPases with rapid nucleotide exchange in plants. Despite these limitations, we made relevant observations in our data that made us believe that YFP signals in YFP-TTN5-expressing cells at membrane sites can be meaningful."

      2- What are the structures with TTN5 fluorescence depleted at the center that appear in control conditions? They look different from the Golgi labeled by Man1 but similar to MVBs upon wortmannin treatment, except that in control conditions MVBs never appear like this. Are they related to any kind of vacuolar structures that would be involved in quality control-induced degradation of non-functional proteins?

      Our response:

      The reviewer certainly refers to fluorescence images from N. benthamiana leaf epidermal cells where different circularly shaped structures are visible. In these structures, the fluorescent circles are depleted of fluorescence in the center, e.g. in Figure 5C, YFP fluorescent signals in TTN5T30N-transformed leaf discs. We suspect that these structures can be of vacuolar origin as described for similar fluorescent rings in Tichá et al., 2020 for ANNI-GFP (reference in manuscript). The reviewer certainly does not refer to swollen MVBs that are seen following wortmannin treatment, as in Figure 5N-P, which look similar in their shape but are larger in size. Please note that we always included the control conditions, namely the images recorded before the wortmannin treatment, so that we were able to investigate the changes induced by wortmannin. Hence, we can clearly say that the structures with depleted fluorescence in the center as in Figure 5C are not wortmannin-induced swollen MVBs. To make these points clear to the reader, we added an explanation into the text (Line 385-388):

      "We also observed YFP fluorescence signals in the form of circularly shaped ring structures with a fluorescence-depleted center. These structures can be of vacuolar origin as described for similar fluorescent rings in Tichá et al. (2020) for ANNI-GFP."

      3- The fluorescence at the nucleus could be due to a proportion of YFP-TTN5 that is degraded and releases free GFP; a Western blot of the membrane fraction vs the cytosolic fraction could help solve this issue.

      Our response:

      In an α-GFP Western blot using YFP-TTN5 Arabidopsis seedlings, we detected besides the expected and strong 48 kDa YFP-TTN5 band, three additional weak bands ranging between 26 to 35 kDa (NEW Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins expressed from aberrant transcripts. α-HA Western blot controls performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size (Supplementary Figure S7D). We must therefore be cautious about nuclear TTN5 localization and we rephrased the text carefully (starting Line 300):

      "We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging between 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other side, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), as was the case for fluorescence signals in YFP-TTN5-expressing cells. Presumably, this can indicate that either the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused interpretation on patterns of localization overlapping between the fluorescence staining with YFP-labeled TTN5 and with HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

      4- It is not so easy to draw conclusions from the co-localization experiments. The confocal pictures are not always of high quality; some of them appear blurry. The Golgi localization looks convincing, but the BFA experiments are not that clear. The MVB localization is pretty convincing but the images are blurry. An issue is the quantification of the co-localizations. Several methods were employed but they do not provide consistent results. As for the object-based co-localization method, the authors report in the text co-localization results based either on the % of YFP-labeled structures or on the % of mCherry/mRFP-labeled structures, but the results do not always go in the same direction. For example, the proportion of YFP-TTN5 that co-localizes with MVBs is not so different between the WT and mutated versions, but the proportion of MVBs that co-localize with TTN5 is largely increased in the Q70L mutant. Thus it is quite difficult to interpret these results consistently and in an unbiased way. Moreover, the results coming from the centroid-based method were presented in a table rather than a graph; I think here the authors wanted to hide the huge standard deviation of these results. What is the statistical meaning of these results?

      Our response:

      First of all, we would like to point out that, as explained above, the BFA experiments are now clearer. We performed additional BFA treatment coupled with immunostaining using HA3-TTN5-expressing Arabidopsis seedlings and coupled with fluorescence analysis using YFP-TTN5-expressing Arabidopsis plants. In both experiments, we observed the typical BFA bodies very clearly (NEW Figure 4B, C).

      Second, we would like to stress that we performed colocalization very carefully and quantified the data in three different manners. We would like to state that there is no general standardized procedure that best suits the idea of a colocalization pattern. Results of colocalization are represented in stem diagrams and table format, including statistical analysis. Colocalization was carried out with the ImageJ plugin JACoP for Pearson's and Overlap coefficients and based on the centroid method. The plotted Pearson's and Overlap coefficients are presented in bar diagrams in Supplementary Figure S8A and C, including statistics. The values obtained by the centroid method are represented in table format in Supplementary Figure S8B and D, which can be considered a standard method (see Ivanov et al., 2014).

      Colocalization of two different fluorescence signals was performed for the two channels in a specific chosen region of interest (indicating in % the overlapping signal versus the sum of signal for each channel). The differences between the YFP/mRFP and mRFP/YFP ratios indicate that a higher percentage of ARA7-RFP signal is colocalizing with YFP-TTN5Q70L signal than with the TTN5WT or the TTN5T30N mutant form signals, while the YFP signals have a similar overlap with ARA7-positive structures. This is not a contradiction. Presumably this answers well the questions on colocalization.
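
      The asymmetry between the two ratios can be illustrated with a small object-based sketch. It assumes a simplified criterion in which an object counts as colocalized when a centroid of the other channel lies within a chosen distance; the function name, the coordinates and the distance threshold are hypothetical and do not reproduce the plugin's actual procedure or the data of this study.

```python
import math

def percent_colocalized(centroids_a, centroids_b, max_dist):
    """Percentage of objects in channel A that have at least one
    channel-B centroid within max_dist (same units as the coordinates)."""
    if not centroids_a:
        return 0.0
    hits = sum(
        1
        for a in centroids_a
        if any(math.dist(a, b) <= max_dist for b in centroids_b)
    )
    return 100.0 * hits / len(centroids_a)

# Hypothetical centroids: four YFP objects, two RFP objects.
yfp_objects = [(0, 0), (10, 0), (20, 0), (30, 0)]
rfp_objects = [(0, 1), (10, 1)]

# Every RFP object sits next to a YFP object, but only half of the
# YFP objects have an RFP neighbour, so the two percentages differ.
print(percent_colocalized(yfp_objects, rfp_objects, max_dist=2))
print(percent_colocalized(rfp_objects, yfp_objects, max_dist=2))
```

      In this toy case the same set of overlapping objects is divided by a different total in each direction, so one ratio can change between conditions while the other stays similar.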

      Please note that upon acceptance for publication, we will upload all original colocalization data to BioImage Archive. Hence, the high-quality data can be reanalyzed by readers.

      5- The use of FM4-64 to address vacuolar trafficking is hazardous. FM4-64 allows the tracking of endocytosis but does not say anything about vacuolar degradation targeting, and even less about the potential function of TTN5 in endosomal vacuolar targeting. Similarly, TTN5, even if localized at the Golgi, does not necessarily function in Golgi trafficking.

      Our response:

      Perhaps our previous description was misleading. Thank you for pointing this out. We reformulated the text and modified the schematic representation of FM4-64 in NEW Figure 6A:

      "(A), Schematic representation of progressive stages of FM4-64 localization and internalization in a cell. FM4-64 is a lipophilic substance. After infiltration, it first localizes in the plasma membrane, at later stages it localizes to intracellular vesicles and membrane compartments. This localization pattern reflects the endocytosis process (Bolte et al. 2004)."

      6- The manuscript in its present form lacks functional evidence for a role of TTN5 in any trafficking step. I understand that the KO mutant is lethal, but what are the phenotypes of the Q70L and T30N mutant plants? What is the seedling phenotype, and what do the Golgi and MVBs look like in these mutants? Do the Q70L or T30N mutants perturb the trafficking of any cargos?

      Our response:

      We fully agree that functional evidence is valuable for assigning roles to TTN5 in trafficking steps. A phenotype associated with TTN5T30N and TTN5Q70L would clearly be meaningful.

      First of all, we would like to emphasize that it is incorrect that the manuscript lacks functional evidence for a role of TTN5 and the two mutants. In fact, the manuscript highlights several functional activities that are meaningful in a cellular context. These include different types of kinetic GTPase enzyme activities, subcellular localization in planta and association with different endomembrane compartments and subcellular processes such as endocytosis. We agree that future research can focus even more on cell physiological aspects and the physiological functions in plants to examine the proposed roles of TTN5 in intracellular trafficking steps. Our findings are the fundamental basis for such studies.

      Concerning the aspect of colocalization of the mutants with the markers we show in Figure 5C, D and G, H that YFP-TTN5T30N- and YFP-TTN5Q70L-related signals colocalize with the Golgi marker GmMan1-mCherry. Figure 5K, L and O, P show that YFP-TTN5T30N and YFP-TTN5Q70L-related signals can colocalize with the MVB marker, and this may affect relevant vesicle trafficking processes and plasma membrane protein regulation involved in root cell elongation.

      At present, we have not yet investigated perturbed cargo trafficking. These aspects are certainly interesting but require extensive work and testing of appropriate physiological conditions and appropriate cargo targets. We discuss future perspectives in the Discussion. We agree that such functional information is of great importance, but it needs to be clarified in future studies.

      Reviewer #1 (Significance (Required)):

      In conclusion, I think this manuscript is a good biochemical description of an ARF-like protein, but it would need to be strengthened on the cell biology and functional sides. Nonetheless, provided these limitations are fixed, this manuscript would advance our knowledge of small GTPases in plants. The major conceptual advance of this study is to describe a non-canonical behavior of the active/inactive cycle dynamics for a small GTPase. Of course, this dynamic probably has an impact on TTN5 function and involvement in trafficking, although this remains to be fully demonstrated. Provided a substantial amount of additional experiments is added to support the claims of the study, it could be of general interest for scientists working in the trafficking field.

      Our response:

      We thank reviewer 1 for the very fruitful comments. We hope that with the additional experiments, NEW Figures and NEW Supplementary Figures as well as our changes in the text, all comments by the reviewer have been addressed.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Mohr and colleagues characterizes the Arabidopsis predicted small GTPase TITAN5 in both biochemical and cell biology contexts using in vitro and in planta techniques. In the first half of the manuscript, the authors use in vitro nucleotide exchange assays to characterise the GTPase activity and nucleotide binding properties of TITAN5 and two mutant variants of it. The in vitro data they produce indicate that TITAN5 does indeed have the general GTPase and nucleotide binding capability that would be expected for a protein predicted to be a small GTPase. Interestingly, the authors show that TITAN5 favors a GTP-bound form, which is different from many other characterized GTPases that favor GDP-binding. The authors follow their biochemical characterisation of TITAN5 with in planta experiments characterizing the association of TITAN5 and its mutant variants with the plant endomembrane system, both by stable expression in Arabidopsis and transient expression in N.benthamiana.

      The strength of this manuscript is in its in vitro biochemical characterisation of TITAN5 and variants. I am not an expert on in vitro GTPase characterisation and so cannot comment specifically on the assays they have used, but generally speaking this appears to have been well done, and the authors are to be commended for it. In vitro characterisation of plant small GTPases is uncommon, and much of our knowledge is inferred from work on animal or yeast GTPases, so this will be a useful addition to the plant community in general, especially as TITAN5 is an essential gene. The in planta data that follows is sadly not as compelling as the biochemical data, and suffers from several weaknesses. I would encourage the authors to consider trying to improve the quality of the in planta data in general. If improved and then combined with the biochemical aspects of the paper, this has the potential to make a nice addition to plant small GTPase and endomembrane literature.

      The manuscript is generally well written and includes the relevant literature.

      Major issues:

      1. The authors make use of a p35s: YFP-TTN5 construct (and its mutant variants) both stably in Arabidopsis and transiently in N.benthamiana. I know from personal experience that expressing small GTPases from non-endogenous promoters and in transient expression systems can give very different results to when working from endogenous promoters/using immunolocalization in stable expression systems. Strong over-expression could for example explain why the authors see high 'cytosolic' levels of YFP-TTN5. It is therefore questionable how much of the in planta localisation data presented using p35S and expression in tobacco is of true relevance to the biological function of TITAN5. The authors do present some immunolocalization data of HA3-TTN5 in Arabidopsis, but this is fairly limited and it is very difficult in its current form to use this to identify whether the data from YFP-TTN5 in Arabidopsis and tobacco can be corroborated. I would encourage the authors to consider expanding the immunolocalization data they present to validate their findings in tobacco.

      Our response:

      We are aware that endogenous promoters may be preferred over the 35S promoter. However, neither of the two types of lines that we generated with the endogenous promoter showed fluorescent signals, so unfortunately we could not use them (not shown). Besides 35S promoter-mediated expression, we also investigated inducible expression vectors for fluorescence imaging in N. benthamiana (not shown). Inducible and constitutive expression showed very similar expression patterns, so we chose to characterize in detail the 35S::YFP-TTN5 fluorescence in both N. benthamiana and Arabidopsis.

      We have expanded immunolocalization using the HA3-TTN5 line and compare it now along with YFP fluorescence signal in YFP-TTN5 seedlings (NEW Figure 3P; NEW Figure 4).

      „For a more detailed investigation of HA3-TTN5 subcellular localization, we then performed co-immunofluorescence staining with an Alexa 488-labeled antibody recognizing the Golgi and TGN marker ARF1, while detecting HA3-TTN5 with an Alexa 555-labeled antibody (Robinson et al. 2011, Singh et al. 2018) (Figure 4A). ARF1-Alexa 488 staining was clearly visible in punctate structures representing presumably Golgi stacks (Figure 4A, Alexa 488), as previously reported (Singh et al. 2018). Similar structures were obtained for HA3-TTN5-Alexa 555 staining (Figure 4A, Alexa 555). But surprisingly, colocalization analysis demonstrated that the HA3-TTN5-labeled structures were mostly not colocalizing and thus distinct from the ARF1-labeled ones (Figure 4A). Yet the HA3-TTN5- and ARF1-labeled structures were in close proximity to each other (Figure 4A). We hypothesized that the HA3-TTN5 structures can be connected to intracellular trafficking steps. To test this, we performed brefeldin A (BFA) treatment, a commonly used tool in cell biology for preventing dynamic membrane trafficking events and vesicle transport involving the Golgi. BFA is a fungal macrocyclic lactone that leads to a loss of cis-cisternae and accumulation of Golgi stacks, known as BFA-induced compartments, up to the fusion of the Golgi with the ER (Ritzenthaler et al. 2002, Wang et al. 2016). For a better identification of BFA bodies, we additionally used the dye FM4-64, which can emit fluorescence in a lipophilic membrane environment. FM4-64 marks the plasma membrane in the first minutes following application to the cell, then may be endocytosed and in the presence of BFA become accumulated in BFA bodies (Bolte et al. 2004). We observed BFA bodies positive for both, HA3-TTN5-Alexa 488 and FM4-64 signals (Figure 4B). Similar patterns were observed for YFP-TTN5-derived signals in YFP-TTN5-expressing roots (Figure 4C). Hence, HA3-TTN5 and YFP-TTN5 can be present in similar subcellular membrane compartments."

      Many of the confocal images presented are of poor quality, particularly those from N.benthamiana.

      Our response:

      All confocal images are of high quality in their original format. To make them accessible, we will upload all raw data to BioImage Archive upon acceptance of the manuscript.

      The authors in some places see YFP-TTN5 in cell nuclei. This could be a result of YFP-cleavage rather than genuine nuclear localisation of YFP-TTN5, but the authors do not present western blots to check for this.

      Our response:

      As described in our response to reviewer 1, comment 3, fluorescence signals were detected within the nuclei of root cells of YFP-TTN5 plants, while immunostaining signals of HA3-TTN5 were not detected in the nucleus. In an α-GFP Western blot using YFP-TTN5 Arabidopsis seedlings, we detected, besides the expected and strong 48 kDa YFP-TTN5 band, three additional weak bands ranging between 26 and 35 kDa (NEW Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins expressed from aberrant transcripts. α-HA Western blot controls performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size (Supplementary Figure S7D). We must therefore be cautious about nuclear TTN5 localization and we rephrased the text carefully (starting Line 300):

      „We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging between 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other side, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), as was the case for fluorescence signals in YFP-TTN5-expressing cells. Presumably, this can indicate that either the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused interpretation on patterns of localization overlapping between the fluorescence staining with YFP-labeled TTN5 and with HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

      That YFP-TTN5 fails to rescue the ttn5 mutant indicates that YFP-tagged TTN5 may not be functional. If the authors cannot corroborate the YFP-TTN5 localisation pattern with that of HA3-TTN5 via immunolocalization, then the fact that YFP-TTN5 may not be functional calls into question the biological relevance of YFP-TTN5's localisation pattern.

      Our response:

      This refers to your comment 1; please check this comment for a detailed response. Please also see our answer to reviewer 1, comment 1.

      First, we would like to state that specific detection of the intracellular localization of plant proteins in plant cells is generally technically very difficult when the protein abundance is not overly high. In this revised version, we extended immunostaining analysis to different membrane compartments, now including immunostaining of complementing HA3-TTN5 in the absence and presence of BFA, along with immunodetection of ARF1 and FM4-64 labeling in roots (NEW Figure 3P, NEW Figure 4A, B). In the revised version, we focus the analysis and conclusions on the fluorescence patterns that overlap between YFP-TTN5 detection and HA3-TTN5 immunodetection. With this, we can be most confident about subcellular TTN5 localization. Please find this NEW text in the Results section (starting Line 323):

      „For a more detailed investigation of HA3-TTN5 subcellular localization, we then performed co-immunofluorescence staining with an Alexa 488-labeled antibody recognizing the Golgi and TGN marker ARF1, while detecting HA3-TTN5 with an Alexa 555-labeled antibody (Robinson et al. 2011, Singh et al. 2018) (Figure 4A). ARF1-Alexa 488 staining was clearly visible in punctate structures representing presumably Golgi stacks (Figure 4A, Alexa 488), as previously reported (Singh et al. 2018). Similar structures were obtained for HA3-TTN5-Alexa 555 staining (Figure 4A, Alexa 555). But surprisingly, colocalization analysis demonstrated that the HA3-TTN5-labeled structures were mostly not colocalizing and thus distinct from the ARF1-labeled ones (Figure 4A). Yet the HA3-TTN5- and ARF1-labeled structures were in close proximity to each other (Figure 4A). We hypothesized that the HA3-TTN5 structures can be connected to intracellular trafficking steps. To test this, we performed brefeldin A (BFA) treatment, a commonly used tool in cell biology for preventing dynamic membrane trafficking events and vesicle transport involving the Golgi. BFA is a fungal macrocyclic lactone that leads to a loss of cis-cisternae and accumulation of Golgi stacks, known as BFA-induced compartments, up to the fusion of the Golgi with the ER (Ritzenthaler et al. 2002, Wang et al. 2016). For a better identification of BFA bodies, we additionally used the dye FM4-64, which can emit fluorescence in a lipophilic membrane environment. FM4-64 marks the plasma membrane in the first minutes following application to the cell, then may be endocytosed and in the presence of BFA become accumulated in BFA bodies (Bolte et al. 2004). We observed BFA bodies positive for both, HA3-TTN5-Alexa 488 and FM4-64 signals (Figure 4B). Similar patterns were observed for YFP-TTN5-derived signals in YFP-TTN5-expressing roots (Figure 4C). Hence, HA3-TTN5 and YFP-TTN5 can be present in similar subcellular membrane compartments."

      We did not find evidence that HA3-TTN5 can localize at the ER using whole-mount immunostaining (NEW Figure 3P; NEW Figure 4A, B). Hence, we are careful with describing that fluorescence at the ER, as seen in the YFP-TTN5 line (Figure 3M, N) reflects TTN5 localization. We therefore do not focus the text on the ER pattern in the Result section (starting Line 295):

      „Additionally, YFP signals were also detected in a net-like pattern typical for ER localization (Figure 3M, N). (...) We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging between 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other side, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), as was the case for fluorescence signals in YFP-TTN5-expressing cells. Presumably, this can indicate that either the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused interpretation on patterns of localization overlapping between the fluorescence staining with YFP-labeled TTN5 and with HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

      And we discuss in the Discussion section (starting Line 552):

      „We based the TTN5 localization data on tagging approaches with two different detection methods to enhance reliability of specific protein detection. Even though YFP-TTN5 did not complement the embryo-lethality of a ttn5 loss of function mutant, we made several observations that suggest YFP-TTN5 signals to be meaningful at various membrane sites. We do not know why YFP-TTN5 does not complement. There could be differences in TTN5 levels and interactions in some cell types, which were hindering specifically YFP-TTN5 but not HA3-TTN5. (...) Though constitutively driven, the YFP-TTN5 expression may be delayed or insufficient at the early embryonic stages resulting in the lack of embryo-lethal complementation. On the other hand, the very fast nucleotide exchange activity may be hindered by the presence of a large YFP-tag in comparison with the small HA3-tag which is able to rescue the embryo-lethality. The lack of complementation represents a challenge for the localization of small GTPases with rapid nucleotide exchange in plants. Despite of these limitations, we made relevant observations in our data that made us believe that YFP signals in YFP-TTN5-expressing cells at membrane sites can be meaningful."

      Without a cell wall label/dye, the plasmolysis data presented in Figure 5 is hard to visualize.

      Our response:

      Figure 6E-G (previously Fig. 5) shows the results of plasmolysis experiments with YFP-TTN5 and the two mutant variant constructs. It is clearly possible to observe plasmolysis when focusing on the Hechtian strands. Hechtian strands are formed due to the retraction of the protoplast as a result of the osmotic pressure exerted by the added mannitol solution. Hechtian strands consist of PM that remained in contact with the cell wall and are visible as thin filamentous structures. We stained the PM and the Hechtian strands with the PM dye FM4-64, similarly to Yoneda et al., 2020. In the YFP-TTN5-transformed cells, we could detect colocalization of the YFP channels and the PM dye in filamentous structures between two neighbouring FM4-64-labelled PMs. Although an additional labeling of the cell wall may further indicate plasmolysis, it is not needed here.

      Please consider that we will upload all original image data to BioImage Archive so that a detailed re-investigation of the images can be done.

      Minor issues:

      In some of the presented N.benthamiana images, it looks like YFP-TTN5 may be partially ER-localised. However, co-localisation with an ER marker is not presented.

      Our response:

      Referring to our response to comments 1 and 3 of reviewer 2 and to comment 1 of reviewer 1:

      We did not find evidence that HA3-TTN5 can localize at the ER using whole-mount immunostaining (NEW Figure 3P; NEW Figure 4A, B). Hence, we are careful with describing that fluorescence at the ER, as seen in the YFP-TTN5 line (Figure 3M, N) reflects TTN5 localization. We therefore do not focus the text on the ER pattern in the Result section (starting Line 295):

      „Additionally, YFP signals were also detected in a net-like pattern typical for ER localization (Figure 3M, N). (...) We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging between 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other side, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), as was the case for fluorescence signals in YFP-TTN5-expressing cells. Presumably, this can indicate that either the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused interpretation on patterns of localization overlapping between the fluorescence staining with YFP-labeled TTN5 and with HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

      And we discuss in the Discussion section (starting Line 552):

      „We based the TTN5 localization data on tagging approaches with two different detection methods to enhance reliability of specific protein detection. Even though YFP-TTN5 did not complement the embryo-lethality of a ttn5 loss of function mutant, we made several observations that suggest YFP-TTN5 signals to be meaningful at various membrane sites. We do not know why YFP-TTN5 does not complement. There could be differences in TTN5 levels and interactions in some cell types, which were hindering specifically YFP-TTN5 but not HA3-TTN5. (...) Though constitutively driven, the YFP-TTN5 expression may be delayed or insufficient at the early embryonic stages resulting in the lack of embryo-lethal complementation. On the other hand, the very fast nucleotide exchange activity may be hindered by the presence of a large YFP-tag in comparison with the small HA3-tag which is able to rescue the embryo-lethality. The lack of complementation represents a challenge for the localization of small GTPases with rapid nucleotide exchange in plants. Despite of these limitations, we made relevant observations in our data that made us believe that YFP signals in YFP-TTN5-expressing cells at membrane sites can be meaningful."

      There is some inconsistency within the N.benthamiana images. For example, compare Figure 4C of YFP-TTN5T30N to Figure 4O of YFP-TTN5T30N. Figure 4O is presented as being significant because wortmannin-induced swollen ARA7 compartments are labelled by YFP-TTN5T30N. However, structures very similar to these can already be seen in Figure 4C, which is apparently an unrelated experiment. This, to my mind, is likely a result of the very different expression levels between different cells that can be produced by transient expression in N.benthamiana.

      Our response:

      Former Figure 4 is now Figure 5. As detailed in our response to comment 2 of reviewer 1:

      The reviewer certainly refers to fluorescence images from N. benthamiana leaf epidermal cells where different circularly shaped structures are visible. In these structures, the fluorescent circles are depleted of fluorescence in the center, e.g. in Figure 5C, YFP fluorescent signals in TTN5T30N-transformed leaf discs. We suspect that these structures can be of vacuolar origin as described for similar fluorescent rings in Tichá et al., 2020 for ANNI-GFP (reference in manuscript). The reviewer certainly does not refer to swollen MVBs that are seen following wortmannin treatment, as in Figure 5N-P, which look similar in their shape but are larger in size. Please note that we always included the control conditions, namely the images recorded before the wortmannin treatment, so that we were able to investigate the changes induced by wortmannin. Hence, we can clearly say that the structures with depleted fluorescence in the center as in Figure 5C are not wortmannin-induced swollen MVBs. To make these points clear to the reader, we added an explanation to the text (Line 385-388):

      „We also observed YFP fluorescence signals in the form of circularly shaped ring structures with a fluorescence-depleted center. These structures can be of vacuolar origin as described for similar fluorescent rings in Tichá et al. (2020) for ANNI-GFP."

      **Referees cross-commenting**

      It seems that all of the reviewers have converged on the conclusion that the in planta characterisation of TTN5 is insufficient to be of substantial interest to the field, highlighting the fact that major improvements are required to strengthen this part of the manuscript and increase its relevance.

      Reviewer #2 (Significance (Required)):

      General assessment: the strengths of this work are in its in vitro characterisation of TITAN5, however, the in planta characterisation lacks depth.

      Significance: the in vitro characterisation of TITAN5 is commendable as such work is lacking for plant GTPases. However, the significance of the work would be boosted substantially by better in planta characterisation, which is where the broadest interest will lie.

      My expertise: my expertise is in in planta characterisation of small GTPases and their interactors.

      Our response:

      We thank the reviewer for the kind evaluation of our manuscript. We are confident that the changes in the text and the NEW Figures and NEW Supplementary Figures make a convincing case for our work.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Cellular traffic is an important and well-studied biological process in animal and plant systems. While components involved in transport are known, the mechanism by which these components control activity or destination remains to be studied. A critical step in regulating traffic is proper budding and tethering of vesicles. A critical component in determining this step is a family of proteins with GTPase activity, which act as switches facilitating vesicle interaction between proteins, or cytoskeleton. The current manuscript by Mohr and colleagues characterizes a small GTPase, TITAN5 (TTN5), and identifies two residues, Gln70 and Thr30, in the protein which the authors propose to have functional roles. The authors catalogue the localization, GTP hydrolytic activity, and discuss putative functions of TTN5 and the mutants.

      Major comments:

      The core of the manuscript, which is descriptive characterization of TTN5, lies in reliably demonstrating putative roles. While the GTP hydrolysis rates are well-quantified (though the claims need to be toned down), the microscopy data especially the association of TTN5 with different endomembrane compartments is not convincing due to the quality (low resolution) of the figures submitted. The manuscript text is difficult to navigate due to repetition and inconsistency in the order that the mutants are referred. I am requesting additional experiments which should be feasible considering the authors have all the materials required to perform the experiments and obtain high-quality images which support their claims.

      In general the figure quality needs to be improved for all microscopy images. I would suggest that the authors highlight 1-2 individual cells to make their point and use the current images as supplementary to establish a broader spread.

      Our response:

      We have worked substantially on the text and figures to make the content easier to follow. The mutants are referred to in a consistent manner in the text and figures. We have addressed the requested experiments.

      As we pointed out in the cover letter and our responses to reviewers 1 and 2, we will upload all raw image data to BioImage Archive upon acceptance of the manuscript so that they can be re-examined without any reduction of resolution. Furthermore, we have conducted new experiments on immunolocalization of HA3-TTN5 (NEW Figure 3P, NEW Figure 4A, B). The text has been improved in several places (see highlighted changes in the manuscript and as detailed in the responses to reviewer 1). We think this addresses the reviewers' concerns well.

      Fig. S1 lacks clarity.

      Our response:

      Supplementary Figure S1 shows TTN5 gene expression in different organs and growing stages as revealed by transcriptomic data, made available through the AtGenExpress eFB tool of the Bio-Analytic Resource for Plant Biology (BAR). The figure visualizes that TTN5 is ubiquitously expressed in different plant organs and tissues, e.g. the epidermis layers that we investigated here, and throughout development including embryo development. In accordance with the embryo-lethal phenotype, this highlights that TTN5 is needed throughout plant growth, and it emphasizes that our investigation of TTN5 localization in epidermis cells is valid.

      We have added a better description to the figure legend. We now also mention the respective publications from which the transcriptome data-sets are derived. The modified figure legend is:

      "Supplementary Figure S1. Visualization of TTN5 gene expression levels during plant development based on transcriptome data. Expression levels in (A), different types of aerial organs at different developmental stages; from left to right and bottom to top are represented different seed and plant growth stages, flower development stages, different leaves, vegetative to inflorescence shoot apex, embryo and silique development stages; (B), seedling root tissues based on single cell analysis represented in form of a uniform manifold approximation and projection plot; (C), successive stages of embryo development. As shown in (A) to (C), TTN5 is ubiquitously expressed in these different plant organs and tissues. In particular, it should be noted that TTN5 transcripts were detectable in the epidermis cell layer of roots that we used for localization of tagged TTN5 protein in this study. In accordance with the embryo-lethal phenotype, the ubiquitous expression of TTN5 highlights its importance for plant growth. Original data were derived from (Nakabayashi et al. 2005, Schmid et al. 2005) (A); (Ryu et al. 2019) (B); (Waese et al. 2017) (C). Gene expression levels are indicated by local maximum color code, ranging from the minimum (no expression) in yellow to the maximum (highest expression) in red."

      For the supplementary videos, it is difficult to determine if punctate structures are moving or is it cytoplasmic streaming? Could this be done with a co-localized marker? Considering that such markers have been used later in Fig. 4? Our response:

      We had detected movement of YFP fluorescent structures in all analyzed YFP-TTN5 plant parts except the root tip. Movement of fluorescence signals in YFP-TTN5T30N seedlings was slowed in hypocotyl epidermis cells. To answer the reviewer's comment, we added three NEW supplemental videos (NEW Supplementary Video Material S1M-O) generated with all three YFP-TTN5 constructs imaged over time in N. benthamiana leaf epidermal cells upon colocalization with the cis-Golgi marker GmMan1-mCherry, as requested by the reviewer. In these NEW videos, some of the YFP fluorescent spots seem to move together with the Golgi stacks. GmMan1 has been described to show a stop-and-go directed movement mediated by the actomyosin system (Nebenführ 1999), and the same might apply to YFP-TTN5 signals based on the colocalization.

      It would be good if the speed of movement is quantified, if the authors want to retain the current claims in results and the discussion. Our response:

      We describe a difference in the movement of YFP fluorescent signal for the YFP-TTN5T30N variant in the hypocotyl compared to YFP-TTN5 and YFP-TTN5Q70L. In hypocotyl cells, we could observe a slowed down or arrested movement specifically of YFP-TTN5T30N fluorescent structures, and we describe this in the Results section (Lines 278-291).

      "Interestingly, the mobility of these punctate structures differed within the cells when the mutant YFP-TTN5T30N was observed in hypocotyl epidermis cells, but not in the leaf epidermis cells (Supplementary Video Material S1E, compare with S1B) nor was it the case for the YFP-TTN5Q70L mutant (Supplementary Video Material S1F, compare with S1E)."

      The slowed movement in the YFP-TTN5T30N mutant is well visible even without quantification. We checked that the manuscript text does not contain overstatements in this regard.

      Fig.2 I am not sure what the unit / scale is in Fig. 2D/E if each parameter (Kon, Koff, and Kd) are individually plotted? Could the authors please clarify/simplify this panel?

      Our response:

      We presented kinetics for nucleotide association (kon) and dissociation (koff) and the dissociation constant (Kd) in a bar diagram for each nucleotide, mdGDP (Figure 2D) and mGppNHp (Figure 2E). We modified and relabeled the bar diagram representation. The parameters and units should now be clear. Please see also the other modified figures (NEW modified Figure 2A-H). We also modified the legend of Figure 2D and E:

      "(D-E), Kinetics of association and dissociation of fluorescent nucleotides mdGDP (D) or mGppNHp (E) with TTN5 proteins (WT, TTN5T30N, TTN5Q70L) are illustrated as bar charts. The association of mdGDP (0.1 µM) or mGppNHp (0.1 µM) with increasing concentration of TTN5WT, TTN5T30N and TTN5Q70L was measured using a stopped-flow device (see A, B; data see Supplementary Figure S3A-F, S4A-E). Association rate constants (kon in µM-1s-1) were determined from the plot of increasing observed rate constants (kobs in s-1) against the corresponding concentrations of the TTN5 proteins. Intrinsic dissociation rates (koff in s-1) were determined by rapidly mixing 0.1 µM mdGDP-bound or mGppNHp-bound TTN5 proteins with the excess amount of unlabeled GDP (see A, C, data see Supplementary Figure S3G-I, S4F-H). The nucleotide affinity (dissociation constant or Kd in µM) of the corresponding TTN5 proteins was calculated by dividing koff by kon. When mixing mGppNHp with nucleotide-free TTN5T30N, no binding was observed (n.b.o.) under these experimental conditions."

      Are panels D and E representing values for mdGDP and GppNHP? This is not very clear from the figure legend.

      Our response:

      Yes, Figure 2D and E represent the kon, koff and Kd values for mdGDP (Figure 2D) and mGppNHP (Figure 2E). As detailed in our previous response to comment 2a, we modified figure and figure legend to make the representation more clear.
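
      The legend of Figure 2D and E describes obtaining kon as the slope of kobs plotted against protein concentration, and Kd as koff divided by kon. A minimal sketch of that calculation follows, assuming a simple pseudo-first-order relation (kobs = kon x [protein] + koff). All data points, the generating constants and the function name `fit_line` are hypothetical and are not values or code from the manuscript.

```python
# Hypothetical illustration: estimate k_on as the slope of k_obs versus
# protein concentration, then derive K_d = k_off / k_on.
# Every number below is invented for the example.


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up protein concentrations (uM) and noise-free observed rates (s^-1).
protein_um = [0.5, 1.0, 2.0, 4.0, 8.0]
assumed_kon, assumed_koff = 0.05, 0.01  # used only to generate the data
k_obs = [assumed_kon * c + assumed_koff for c in protein_um]

k_on, _intercept = fit_line(protein_um, k_obs)  # slope, in uM^-1 s^-1
k_off = 0.01  # assumed to come from a separate chase experiment, in s^-1
k_d = k_off / k_on  # dissociation constant, in uM

print(f"k_on = {k_on:.3f} uM^-1 s^-1, K_d = {k_d:.3f} uM")
```

      Taking koff from a separate displacement measurement rather than from the fitted intercept mirrors the procedure described in the legend; with real, noisy data the intercept is usually the less reliable estimate.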

      Fig. 3 Same comments as in para above - improve resolution of images, concentrate on a few selected cells, if required use an inset figure to zoom-in to specific compartments. Our response:

      As detailed in our responses to reviewers 1 and 2, we will upload all original image data to BioImage Archive upon acceptance of the manuscript, so that a detailed investigation of all our images is possible without any reduction of resolution.

      Please provide the non-fluorescent channel images to understand cell topography. Our response:

      We presented our microscopic images with the respective fluorescent channel and for colocalization with an additional merge. We did not present brightfield images as the cell topography was already well visible by fluorescent signal close to the PM. Therefore, brightfield images would not provide any benefit. Since we will upload all original data to BioImage Archive for a detailed investigation of all our images, the data can be obtained if needed.

      Is the nuclear localization seen in transient expression (panel L-N) an artefact? If so, this needs to be mentioned in the text. Our response:

      As explained in our responses to reviewers 1 and 2, fluorescence signals were detected within the nuclei of root cells of YFP-TTN5 plants, while immunostaining signals of HA3-TTN5 were not detected in the nucleus.

      In an α-GFP Western blot using YFP-TTN5 Arabidopsis seedlings, we detected besides the expected and strong 48 kDa YFP-TTN5 band, three additional weak bands ranging between 26 to 35 kDa (NEW Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins expressed from aberrant transcripts. α-HA Western blot controls performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size (Supplementary Figure S7D). We must therefore be cautious about nuclear TTN5 localization and we rephrased the text carefully (starting Line 300):

      „We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging between 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other side, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), as was the case for fluorescence signals in YFP-TTN5-expressing cells. Presumably, this can indicate that either the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused interpretation on patterns of localization overlapping between the fluorescence staining with YFP-labeled TTN5 and with HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

      Fig. 4 - In addition to the points made for Fig. 3 The authors should consider reducing gain/exposure to improve image clarity. Especially for the punctate structures, which are difficult to observe in TTN5, likely because of the cytoplasmic localization as well.

      Our response:

      Thank you for this comment. We record image z-stacks and present single z-planes. Reducing the gain to decrease the cytoplasmic signal does not increase the clarity of the punctate structures, as their signal also becomes weak. As mentioned above, we will upload all original image data to BioImage Archive for a detailed investigation of all our images without any reduction of resolution.

      Reducing Agrobacterial load could be considered. OD of 0.4 is a bit much, 0.1 or even 0.05 could be tried. If available try expression in N. tabacum, which is more amenable to microscopy. However, this is OPTIONAL, benthamiana should suffice. Our response:

      Thank you for the suggestion. We are routinely using N. benthamiana leaf infiltration. When setting up this method at first, we did not observe different localization results by using different ODs of bacterial cultures. Hence, an OD600 of 0.4 is routinely used in our institute. This value is comparable with the literature although some literature reports even higher OD values for infiltration (Norkunas et al., 2018; Drapal et al., 2021; Zhang et al., 2020, Davis et al., 2020; Stephenson et al., 2018).

      A standard norm now is to establish the level of colocalization by quantifying a Pearson's or Manders' correlation. Which I believe has been done in the text, I didn't find a plot representing the same? Could the data (which the authors already have) be plotted along with "n" as a table or graph? Our response:

      Please check our response to reviewer 1, comment 4.

      We would like to emphasize that we performed colocalization very carefully and quantified the data in three different manners. There is no generally standardized procedure that best captures a colocalization pattern. Results of colocalization are represented in stem diagrams and table format, including statistical analysis. Colocalization was carried out with the ImageJ plugin JACoP for Pearson's and Overlap coefficients and based on the centroid method. The plotted Pearson's and Overlap coefficients are presented in bar diagrams in Supplementary Figure S8A and C, including statistics. The values obtained by the centroid method are represented in table format in Supplementary Figure S8B and D; this can be considered a standard method (see Ivanov et al., 2014).

      Colocalization of two different fluorescence signals was performed for the two channels in a specifically chosen region of interest (indicating in % the overlapping signal versus the sum of signal for each channel). The differences between the YFP/mRFP and mRFP/YFP ratios indicate that a higher percentage of ARA7-RFP signal is colocalizing with YFP-TTN5Q70L signal than with the TTN5WT or the TTN5T30N mutant form signals, while the YFP signals have a similar overlap with ARA7-positive structures. This is not a contradiction. We hope this answers the questions on colocalization.
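
      For readers less familiar with the two intensity-based measures named above, the following sketch computes Pearson's coefficient and the overlap coefficient from their standard textbook definitions. It is not the JACoP implementation; the pixel intensities and the function names `pearson` and `overlap` are invented for illustration.

```python
import math

# Illustrative only: standard definitions of Pearson's coefficient and the
# overlap coefficient for two channels, applied to made-up pixel intensities.


def pearson(a, b):
    """Pearson's correlation coefficient between two equally long pixel lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def overlap(a, b):
    """Overlap coefficient: like Pearson's, but without mean subtraction."""
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return numerator / denominator


# Hypothetical flattened pixel intensities from one region of interest.
yfp = [10, 200, 30, 180, 20, 220]
marker = [12, 190, 25, 170, 30, 210]

print(f"Pearson: {pearson(yfp, marker):.3f}")
print(f"Overlap: {overlap(yfp, marker):.3f}")
```

      Both coefficients are symmetric in the two channels, which is why an object-based measure is needed to see the channel-specific differences described above.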

      Please note that upon acceptance for publication, we will upload all original colocalization data to BioImage Archive. Hence, the high-quality data can be reanalyzed by readers.

      The cartoons for the action of chemicals are useful, but need a bit more clarity. Our response:

      The schematic explanations of pharmacological treatments and expected outcomes are useful to readers. For a better understanding, we added explanatory sentences to the figure legends (Figure 5E, M; Figure 6A). We also modified Figure 6A and the corresponding legend.

      "(E), Schematic representation of GmMan1 localization at the ER upon brefeldin A (BFA) treatment. BFA blocks ARF-GEF proteins which leads to a loss of Golgi cis-cisternae and the formation of BFA-induced compartments due to an accumulation of Golgi stacks up to a redistribution of the Golgi to the ER by fusion of the Golgi with the ER (Renna and Brandizzi 2020)."

      "(M), Schematic representation of ARA7 localization in swollen MVBs upon wortmannin treatment. Wortmannin inhibits phosphatidylinositol-3-kinase (PI3K) function leading to the fusion of TGN/EE to swollen MVBs (Renna and Brandizzi 2020)."

      "(A), Schematic representation of progressive stages of FM4-64 localization and internalization in a cell. FM4-64 is a lipophilic substance. After infiltration, it first localizes in the plasma membrane, at later stages it localizes to intracellular vesicles and membrane compartments. This localization pattern reflects the endocytosis process (Bolte et al. 2004)."

      Fig. 5 does the Q70L mutant show reduced endocytosis ?

      Our response:

      We have not investigated this question. As detailed in our response to reviewer 1, we fully agree that functional evidence would be valuable for assigning a role to TTN5 in trafficking steps. A phenotype associated with TTN5T30N and TTN5Q70L would clearly be meaningful.

      Concerning the aspect of colocalization of the mutants with the markers we show in Figure 5C, D and G, H that YFP-TTN5T30N- and YFP-TTN5Q70L-related signals colocalize with the Golgi marker GmMan1-mCherry. Figure 5K, L and O, P show that YFP-TTN5T30N and YFP-TTN5Q70L-related signals can colocalize with the MVB marker, and this may affect relevant vesicle trafficking processes and plasma membrane protein regulation involved in root cell elongation.

      At present, we have not yet investigated perturbed cargo trafficking. These aspects are certainly interesting but require extensive work and testing of appropriate physiological conditions and appropriate cargo targets. We discuss future perspectives in the Discussion. We agree that such functional information is of great importance, but needs to be clarified in future studies.

      The main text needs to be organized in a way that a reader can separate what is the hypothesis/assumption from actual results and conclusions (see lines #143-149).

      Our response:

      Thank you for this comment. We reformulated text throughout the manuscript.

      The text is repeated in multiple places, while I understand that this is not plagiarism, the repetitiveness makes it difficult to read and understand the text. I highlight a couple of examples here, but please check the whole text thoroughly and edit/delete as necessary. a. Lines #124-125 with Lines #149-151 Lines #140-143

      Our response:

      We checked the text and removed unnecessary repetitions.

      Could the authors elaborate on whether there are plant homologs of TTN5? Also, have other ARF/ARLs been compared to TTN5 beyond HsARF1?

      Our response:

      Phylogenetic trees of the ARF family in Arabidopsis in comparison to the human ARF family were already published by Vernoud et al. (2003). In this phylogenetic tree, ARF, ARL and SAR proteins of Arabidopsis are compared with the members in humans and S. cerevisiae. It is difficult to deduce whether the proteins are homologs or orthologs. In this setting, an ortholog of TTN5 may be HsARL2 followed by HsARL3. In Figure 1A we show GTPases closely related in sequence to TTN5, namely HsARL2, HsARF1 and AtARF1, since they are the best-studied ARF GTPases. HRAS is a well-known member of the RAS superfamily which we used for kinetic comparison in Figure 2. We additionally compared published kinetics of RAC1, HsARF3, CDC42, RHOA, ARF6, RAD, GEM, and RAS GTPases.

      On a related note, a major problem I have with these kinetic values is the assumption of significance or not. For eg. Line#180 the values represent and 2 and 6-fold increase, if these numbers do not matter can a significance threshold be applied so as to understand how much fold-change is appreciable?

      Our response:

      The kinetics of TTN5 and its two mutant variants can be compared with those of other studied GTPases. To provide a basis for the statements about differences in GTPase activities, we modified the text and added respective references in the text for comparisons of fold changes.

      The new text is now as follows (Lines 175-231):

      „We next measured the dissociation (koff) of mdGDP and mGppNHp from the TTN5 proteins in the presence of excess amounts of GDP and GppNHp, respectively (Figure 2C) and found interesting differences (Figure 2D, E; Supplementary Figures S3G-I, S4F-H). First, TTN5WT showed a koff value (0.012 s-1 for mGDP) (Figure 2D; Supplementary Figure S3G), which was 100-fold faster than those obtained for classical small GTPases, including RAC1 (Haeusler et al. 2006) and HRAS (Gremer et al. 2011), but very similar to the koff value of HsARF3 (Fasano et al. 2022). Second, the koff values for mGDP and mGppNHp, respectively, were in a similar range between TTN5WT (0.012 s-1 mGDP and 0.001 s-1 mGppNHp) and TTN5Q70L (0.025 s-1 mGDP and 0.006 s-1 mGppNHp), respectively, but the koff values differed 10-fold between the two nucleotides mGDP and mGppNHp in TTN5WT (koff = 0.012 s-1 versus koff = 0.001 s-1; Figure 2D, E; Supplementary Figure S3G, I, S4F, H). Thus, mGDP dissociated from proteins 10-fold faster than mGppNHp. Third, the mGDP dissociation from TTN5T30N (koff = 0.149 s-1) was 12.5-fold faster than that of TTN5WT and 37-fold faster than the mGppNHp dissociation of TTN5T30N (koff = 0.004 s-1) (Figure 2D, E; Supplementary Figure S3H, S4G). Mutants of CDC42, RAC1, RHOA, ARF6, RAD, GEM and RAS GTPases, equivalent to TTN5T30N, display decreased nucleotide binding affinity and therefore tend to remain in a nucleotide-free state in a complex with their cognate GEFs (Erickson et al. 1997, Ghosh et al. 1999, Radhakrishna et al. 1999, Jung and Rösner 2002, Kuemmerle and Zhou 2002, Wittmann et al. 2003, Nassar et al. 2010, Huang et al. 2013, Chang and Colecraft 2015, Fisher et al. 2020, Shirazi et al. 2020). Since TTN5T30N exhibits fast guanine nucleotide dissociation, these results suggest that TTN5T30N may also act in either a dominant-negative or fast-cycling manner as reported for other GTPase mutants (Fiegen et al. 2004, Wang et al. 2005, Fidyk et al. 2006, Klein et al. 2006, Soh and Low 2008, Sugawara et al. 2019, Aspenström 2020).

      The dissociation constant (Kd) is calculated from the ratio koff/kon, which inversely indicates the affinity of the interaction between proteins and nucleotides (the higher Kd, the lower affinity). Interestingly, TTN5WT binds mGppNHp (Kd = 0.029 µM) 10-fold tighter than mGDP (Kd = 0.267 µM), a difference that was not observed for TTN5Q70L (Kd for mGppNHp = 0.026 µM, Kd for mGDP = 0.061 µM) (Figure 2D, E). The lower affinity of TTN5WT for mdGDP compared to mGppNHp brings us one step closer to the hypothesis that classifies TTN5 as a non-classical GTPase with a tendency to accumulate in the active (GTP-bound) state (Jaiswal et al. 2013). The Kd value for the mGDP interaction with TTN5T30N was 11.5-fold higher (3.091 µM) than for TTN5WT, suggesting that this mutant exhibited faster nucleotide exchange and lower affinity for nucleotides than TTN5WT. Similar to other GTPases with a T30N exchange, TTN5T30N may behave in a dominant-negative manner in signal transduction (Vanoni et al. 1999).

      To get hints on the functionalities of TTN5 during the complete GTPase cycle, it was crucial to determine its ability to hydrolyze GTP. Accordingly, the catalytic rate of the intrinsic GTP hydrolysis reaction, defined as kcat, was determined by incubating 100 µM GTP-bound TTN5 proteins at 25 °C and analyzing the samples at various time points using a reversed-phase HPLC column (Figure 2F; Supplementary Figure S5). The determined kcat values were quite remarkable in two respects (Figure 2G). First, all three TTN5 proteins, TTN5WT, TTN5T30N and TTN5Q70L, showed quite similar kcat values (0.0015 s-1, 0.0012 s-1, 0.0007 s-1; Figure 2G; Supplementary Figure S5). The GTP hydrolysis activity of TTN5Q70L was quite high (0.0007 s-1). This was unexpected because, as with most other GTPases, the glutamine mutations at the corresponding position drastically impair hydrolysis, resulting in a constitutively active GTPase in cells (Hodge et al. 2020, Matsumoto et al. 2021). Second, the kcat value of TTN5WT (0.0015 s-1) although quite low as compared to other GTPases (Jian et al. 2012, Esposito et al. 2019), was 8-fold lower than the determined koff value for mGDP dissociation (0.012 s-1) (Figure 2E). This means that a fast intrinsic GDP/GTP exchange versus a slow GTP hydrolysis can have drastic effects on TTN5 activity in resting cells, since TTN5 can accumulate in its GTP-bound form, unlike the classical GTPase (Jaiswal et al. 2013). To investigate this scenario, we pulled down GST-TTN5 protein from bacterial lysates in the presence of an excess amount of GppNHp in the buffer using glutathione beads and measured the nucleotide-bound form of GST-TTN5 using HPLC. As shown in Figure 2H, isolated GST-TTN5 increasingly bonds GppNHp, indicating that the bound nucleotide is rapidly exchanged for free nucleotide (in this case GppNHp). This is not the case for classical GTPases, which remain in their inactive GDP-bound forms under the same experimental conditions (Walsh et al. 2019, Hodge et al. 2020)."
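
      The fold changes quoted in the revised text can be re-derived from the constants it reports. The sketch below does this with values transcribed from the quoted passage. The dictionary layout and function names are illustrative choices, and the implied kon values are back-calculated here as koff / Kd rather than taken from the manuscript.

```python
# Constants transcribed from the quoted passage (k_off in s^-1, K_d in uM).
K_OFF = {
    ("WT", "mGDP"): 0.012, ("WT", "mGppNHp"): 0.001,
    ("T30N", "mGDP"): 0.149, ("T30N", "mGppNHp"): 0.004,
    ("Q70L", "mGDP"): 0.025, ("Q70L", "mGppNHp"): 0.006,
}
K_D = {
    ("WT", "mGDP"): 0.267, ("WT", "mGppNHp"): 0.029,
    ("T30N", "mGDP"): 3.091,
    ("Q70L", "mGDP"): 0.061, ("Q70L", "mGppNHp"): 0.026,
}
K_CAT_WT = 0.0015  # intrinsic GTP hydrolysis rate of TTN5WT, s^-1


def fold_change(numerator, denominator):
    """Ratio of two constants expressed in the same units."""
    return numerator / denominator


def implied_k_on(key):
    """Back-calculate k_on (uM^-1 s^-1) from K_d = k_off / k_on."""
    return K_OFF[key] / K_D[key]


comparisons = {
    "k_off mGDP, T30N vs WT": fold_change(K_OFF["T30N", "mGDP"], K_OFF["WT", "mGDP"]),
    "k_off T30N, mGDP vs mGppNHp": fold_change(K_OFF["T30N", "mGDP"], K_OFF["T30N", "mGppNHp"]),
    "k_off WT, mGDP vs mGppNHp": fold_change(K_OFF["WT", "mGDP"], K_OFF["WT", "mGppNHp"]),
    "K_d WT, mGDP vs mGppNHp": fold_change(K_D["WT", "mGDP"], K_D["WT", "mGppNHp"]),
    "K_d mGDP, T30N vs WT": fold_change(K_D["T30N", "mGDP"], K_D["WT", "mGDP"]),
    "WT, k_off mGDP vs k_cat": fold_change(K_OFF["WT", "mGDP"], K_CAT_WT),
}

for label, value in comparisons.items():
    print(f"{label}: {value:.1f}-fold")
print(f"implied k_on, WT with mGDP: {implied_k_on(('WT', 'mGDP')):.3f} uM^-1 s^-1")
```

      The ratios fall in the same range as the fold changes stated in the text; small deviations are expected because the published constants are rounded.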

      Another issue with the kinetic measurements is the significance levels. Line #198-201. The three proteins are claimed to have similar values and in the next line, the Q70L mutant is claimed to be high.

      Our response:

      Please see our response to the previous comment 9 and the corresponding changes in the text. We have provided extra explanations and references to clarify why the kinetic behavior of TTN5 is unusual in several respects (Lines 215-220).

      „First, all three TTN5 proteins, TTN5WT, TTN5T30N and TTN5Q70L, showed quite similar kcat values (0.0015 s-1, 0.0012 s-1, 0.0007 s-1; Figure 2G; Supplementary Figure S5). The GTP hydrolysis activity of TTN5Q70L was quite high (0.0007 s-1). This was unexpected because, as with most other GTPases, the glutamine mutations at the corresponding position drastically impair hydrolysis, resulting in a constitutively active GTPase in cells (Hodge et al. 2020, Matsumoto et al. 2021)."

      Provide data for conclusion in line#214-215

      Our response:

      We agree that a reference should be added after this sentence to make this sentence clearer (Line 228-231).

      "As shown in Figure 2H, isolated GST-TTN5 increasingly bonds GppNHp, indicating that the bound nucleotide is rapidly exchanged for free nucleotide (in this case GppNHp). This is not the case for classical GTPases, which remain in their inactive GDP-bound forms under the same experimental conditions (Walsh et al. 2019, Hodge et al. 2020)."

      How were the mutants studied here identified? random mutation or was it directed based on qualified assumptions?

      Our response:

      We used the T30N and the Q70L point mutations as such types of mutants had been reported to confer specific phenotypes in these well-conserved amino acid positions in multiple other small GTPases (Erickson et al. 1997, Ghosh et al. 1999, Radhakrishna et al. 1999, Jung and Rösner 2002, Kuemmerle and Zhou 2002, Wittmann et al. 2003, Nassar et al. 2010, Huang et al. 2013, Chang and Colecraft 2015, Fisher et al. 2020, Shirazi et al. 2020). In particular, these positions affect the interaction between small GTPases and their respective guanine nucleotide exchange factor (GEF; T30N) or on GTP hydrolysis (Q70L). We introduced the mutants and described their potential effect on the GTPase cycle in the introduction and cited exemplary literature. Please see also our response to comment 6 and the proposed text changes (Line 142-151).

      Could more simplification be provided for definition of Kon/Koff values. And can these values be compared between mutants directly?

      Our response:

      We introduce kon and koff in the modified Figure 2D, E, and they are described in the figure legends. Moreover, we present the data for calculations in Supplementary Figures S3 and S4, where we again define the values in the respective figure legends.

      Data provided are not convincing to claim that both the mutant forms have lower association with the Golgi.

      Our response:

      Our conclusion is that both YFP-TTN5 and YFP-TTN5Q70L fluorescence signals tend to colocalize more with the Golgi-marker signals compared to YFP-TTN5T30N signals as deduced from the centroid-based colocalization method (Lines 404-405).

      "Hence, the GTPase-active TTN5 forms are likely more present at cis-Golgi stacks compared to TTN5T30N."

      The Pearson coefficients of all three YFP-TTN5 constructs were nearly identical, but we could identify differences in overlapping centers between the YFP and mCherry channel. 48 % of the GmMan1-mCherry fluorescent cis-Golgi stacks were overlapping with signal of YFP-TTN5Q70L, while for YFP-TTN5T30N an overlap of only 31 % was detected. This means that fewer cis-Golgi stacks colocalized with signals in the YFP-TTN5T30N mutant than in YFP-TTN5Q70L, which is the statement in our manuscript.

      In general the Authors should strongly reconsider the claims made in the manuscript. For eg. "This study lays the foundation for studying the functional relationships of this small GTPase" (line 125) is unqualified as this is true for every protein ever studied and published. Considering that TTN was not isolated/identified in this study for the first time this claim doesn't stand.

      Our response:

      We reformulated the sentence (Lines 123-124).

      "This study paves the way towards future investigation of the cellular and physiological contexts in which this small GTPase is functional."

      Line #185 - "characteristics of a dominant-negative...." What is this based on? From the text it is not clear what the parameters are. Considering that no complementation phenotypes have been presented, this is a far-fetched claim. Our response:

      Small GTPases in general are a well-studied protein family, and the T30N and Q70L mutations used here affect conserved amino acids and are commonly used for the characterization of Ras superfamily members. We added explanatory sentences with references to the text. The characteristics referred to in the above paragraph are based on the kinetic study.

      We modified the text as follows (Lines 186-197):

      „Third, the mGDP dissociation from TTN5T30N (koff = 0.149 s-1) was 12.5-fold faster than that of TTN5WT and 37-fold faster than the mGppNHp dissociation of TTN5T30N (koff = 0.004 s-1) (Figure 2D, E; Supplementary Figure S3H, S4G). Mutants of CDC42, RAC1, RHOA, ARF6, RAD, GEM and RAS GTPases, equivalent to TTN5T30N, display decreased nucleotide binding affinity and therefore tend to remain in a nucleotide-free state in a complex with their cognate GEFs (Erickson et al. 1997, Ghosh et al. 1999, Radhakrishna et al. 1999, Jung and Rösner 2002, Kuemmerle and Zhou 2002, Wittmann et al. 2003, Nassar et al. 2010, Huang et al. 2013, Chang and Colecraft 2015, Fisher et al. 2020, Shirazi et al. 2020). Since TTN5T30N exhibits fast guanine nucleotide dissociation, these results suggest that TTN5T30N may also act in either a dominant-negative or fast-cycling manner as reported for other GTPase mutants (Fiegen et al. 2004, Wang et al. 2005, Fidyk et al. 2006, Klein et al. 2006, Soh and Low 2008, Sugawara et al. 2019, Aspenström 2020)."

      The claims in Line #224-227 are exaggerated. Please tone down or delete. Our response:

      We rephrased the sentence (Lines 240-243).

      "Therefore, we propose that TTN5 exhibits the typical functions of a small GTPase based on in vitro biochemical activity studies, including guanine nucleotide association and dissociation, but emphasizes its divergence among the ARF GTPases by its kinetics."

      Line#488-489 - This conclusion is not really supported. At best Authors can claim that TTN5 is associated with trafficking components, but the functional relevance of this association is not determined. Our response:

      We toned down our statement (Lines 604-608).

      „The colocalization of FM4-64-labeled endocytosed vesicles with fluorescence in YFP-TTN5-expressing cells may indicate that TTN5 is involved in endocytosis and the possible degradation pathway into the vacuole. Our data on colocalization with the different markers support the hypothesis that TTN5 may have functions in vesicle trafficking."

      Minor comments:

      Line #95 - " This rolein vesicle....." - please clarify which role? Our response:

      We rephrased the sentence (Line 96-99).

      „These roles of ARF1 and SAR1 in COPI and II vesicle formation within the endomembrane system are well conserved in eukaryotes which raises the question of whether other plant ARF members are also involved in functioning of the endomembrane system."

      Line #168 - "we did not observed" please change to "not able to measure/quantify" Our response:

      We changed the text accordingly (Lines 169-171).

      „A remarkable observation was that we were not able to monitor the kinetics of mGppNHp association with TTN5T30N but observed its dissociation (koff = 0.026 s-1; Figure 2E)."

      Line #179 - is ARF3 human or Arabidopsis?

      Our response:

      The study of Fasano et al., 2022 is based on human ARF3 and we added the information to the text (Lines 180-181).

      "(...) very similar to the koff value of HsARF3 (Fasano et al. 2022)."

      Line #181 - compared to what is the 10-fold difference?

      Our response:

      The 10-fold difference is between the nucleotides mGDP and mGppNHp, for both TTN5WT and TTN5Q70L. We added the information on specific nucleotides to this sentence for a better understanding (Line 181-185).

      „Second, the koff values for mGDP and mGppNHp, respectively, were in a similar range between TTN5WT (0.012 s-1 mGDP and 0.001 s-1 mGppNHp) and TTN5Q70L (0.025 s-1 mGDP and 0.006 s-1 mGppNHp), respectively, but the koff values differed 10-fold between the two nucleotides mGDP and mGppNHp in TTN5WT (koff = 0.012 s-1 versus koff = 0.001 s-1; Figure 2D, E; Supplementary Figure S3G, I, S4F, H)."

      Lines #314-323 - are difficult to understand, consider reframing. Same goes for the conclusion following these lines.

      Our response:

      We added an explanation to these sentences for a better understanding (Line 392-405).

      „We performed an additional object-based analysis to compare overlapping YFP fluorescence signals in YFP-TTN5-expressing leaves with GmMan1-mCherry signals (YFP/mCherry ratio) and vice versa (mCherry/YFP ratio). We detected 24 % overlapping YFP fluorescence signals for TTN5 with Golgi stacks, while in YFP-TTN5T30N and YFP-TTN5Q70L-expressing leaves, signals only shared 16 and 15 % overlap with GmMan1-mCherry-positive Golgi stacks (Supplementary Figure S8B). Some YFP-signals did not colocalize with the GmMan1 marker. This effect appeared more prominent in leaves expressing YFP-TTN5T30N and less for YFP-TTN5Q70L, compared to YFP-TTN5 (Figure 5B-D). Indeed, we identified 48 % GmMan1-mCherry signal overlapping with YFP-positive structures in YFP-TTN5Q70L leaves, whereas 43 and only 31 % were present with YFP fluorescence signals in YFP-TTN5 and YFP-TTN5T30N-expressing leaves, respectively (Supplementary Figure S8B), indicating a smaller amount of GmMan1-positive Golgi stacks colocalizing with YFP signals for YFP-TTN5T30N. Hence, the GTPase-active TTN5 forms are likely more present at cis-Golgi stacks compared to TTN5T30N."

      Authors might consider a longer BFA treatment (3-4 h) to see clearer ER-Golgi fusion (BFA bodies).

      Our response:

      We performed additional BFA treatments for HA3-TTN5-expressing Arabidopsis seedlings followed by whole-mount immunostaining and for YFP-TTN5-expressing Arabidopsis lines. In both experiments we could obtain the typical BFA bodies. We included the NEW data in NEW Figure 4B, C.

      **Referees cross-commenting**

      I agree with both my co-reviewers that the manuscript needs substantial improvement in its cell-biology-based experiments and conclusions thereof. I think the consensus of all reviewers points to weakness in the in-planta experiments, which needs to be addressed to understand and characterize TTN5, which is the main goal of the manuscript.

      Reviewer #3 (Significance (Required)):

      Significance: The manuscript has general significance in understanding the role of small GTPases which are understudied. Although the manuscript does not advance the field of either intracellular trafficking or organization it holds significance in attempting to characterize proteins involved, which is a prerequisite for further functional studies.

      Our response:

      Thank you for your detailed analysis of our manuscript and positive assessment. Our study is an advance in the plant vesicle trafficking field.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: In this paper, Dresselhaus et al (2023) investigate the possibility that known cargoes of extracellular vesicles (EVs) released at the Drosophila neuromuscular junction have cell-autonomous functions rather than functions specifically conferred as a condition of their release in EVs, in vivo. To do so, authors focus their studies on use of Tsg101-KD, a mutant of the ESCRT-I machinery, of the ESCRT EV biogenesis pathway, and are able to show that for some endogenously-expressed, fluorescently-tagged cargoes, fluorescence intensity in the pre-synaptic compartment is significantly elevated (Syt4 and Evi) and the postsynaptic intensity in the muscle is significantly decreased (Syt4, Evi, APP, and Nrg).

      We note that throughout our study, we detected endogenous Nrg with a well-characterized monoclonal antibody, not a fluorescent tag. We and others previously demonstrated that endogenous Nrg detected by this antibody is trafficked from neurons into EVs, using the same pathways as other EV cargoes such as Syt4, APP and Evi (Blanchette et al., 2022; Enneking et al., 2013; Walsh et al., 2021). Thus, the EV trafficking phenotypes in our study are consistent across fluorescently tagged cargo (endogenous knockin for Syt4 and GAL4/UAS-driven for APP and Evi), as well as for untagged, endogenous Nrg, thus controlling for effects of either overexpression or tagging.

      These findings suggest that these cargoes become trapped in the endosomal system (colocalizing with early, late, and recycling endosomal compartments), rather than undergoing secretion in EVs targeting post-synaptic muscle and glia as usual. This phenotype is recapitulated for select cargoes using mutants of both early and late components of ESCRT pathway machinery. They further characterize the Tsg101 mutant, demonstrating co-occurrence of an autophagic flux defect, but as the cargo phenotype is present without induction of the autophagic flux defect for their Hrs mutants, authors suggest the overlapping role of Tsg101 in autophagy is independent of its role in the ESCRT pathway/ EV secretion. Subsequently, they use previously defined functional phenotypes of the Evi (number of active zones, number of boutons, number of developmentally-arrested ghost boutons) and Syt-4 (number of transient ghost boutons and mEJPs) cargoes to show a minimal dependence on cargo delivery via ESCRT-derived EVs for these cargoes to carry out their synaptic growth and plasticity functions in vivo. However, it should be noted that for Evi/ Wg cargo, there is a slight increase in developmentally-arrested ghost boutons suggesting the cargo may not be entirely independent of EV-mediated cargo delivery. Finally, authors express an anti-GFP proteasome-directed nanobody using motor neuron or muscle-specific drivers and find that Syt4-GFP cargo doesn't enter muscle cytoplasm as fluorescence is maintained and cargo is not degraded by the muscle proteasome. While authors suggest this as evidence of EV-mediated transfer for cargo proteostasis, it is not explicitly shown that Syt4 cargo is, in fact, trafficked and degraded by the lysosome or hypothesized how Syt4 function or post-synaptic localization may be carried out independently of EVs.

      We have added new data showing that Syt4 is taken up by glial and muscle phagocytosis (Fig. 7), and included in the discussion several possible interpretations for how Syt4 activity is carried out independently of its traffic into EVs. Indeed we believe it is more likely to function in the presynaptic neuron rather than the postsynaptic muscle.

      Major comments:

      R1.1 It is difficult to evaluate the findings of this study without knowing the extent of ESCRT pathway impairment. Please provide data quantifying the degree of knockdown/ mutant expression for each ESCRT component (i.e., western blot)

      To address the reviewer’s request to specifically measure the degree of knockdown in the RNAi lines, we tested all available reagents. Unfortunately no Drosophila Tsg101 antibody exists and we did not receive a reply to our requests for a Shrub antibody. An Hrs antibody exists, but we found that none of three available Hrs RNAi lines depleted Hrs signal, or caused a phenotype similar to the HrsD28 point mutant, suggesting that they are not effective at knocking down the protein. Therefore, we were unable to specifically measure the level of depletion in motor neurons for RNAi of Tsg101, Shrub, or Hrs.

      However, we can make a strong argument that our knockdowns were sufficiently effective to answer the questions in our study. We used RNAi as only one of several complementary tools to manipulate ESCRT function (i.e. we also used loss-of-function mutants (HrsD28/Deficiency) and dominant negative mutants (Vps4DN)). These mutants caused a comparable and severe loss of EVs to RNAi (Fig 2): therefore the extent of depletion in the RNAi experiments was sufficient to cause a similarly severe phenotype as genomic or DN mutations, meeting the definition of a bona fide loss-of-function. We also know, since we used these complementary strategies, that the phenotypes we observe are very unlikely to be due to off-target effects of the RNAi.

      More importantly, what is directly relevant for our subsequent functional experiments is to know the extent of EV depletion, which we have explicitly measured throughout the paper. It is unclear what additional insights would be gained by knowing whether the strong Tsg101 and Shrub RNAi phenotypes are due to incomplete versus complete knockdown, given that we do measure the extent of EV depletion under these conditions. Further, we note that tsg101 null mutants die as first instar larvae (Moberg et al., 2005), raising the possibility that a more complete knockdown in neurons would be lethal early in development and make our study impossible. Indeed HrsD28 is an early stop that preserves the VHS and FYVE domains but truncates the C-terminal ⅔ of the protein. Its (occasional) survival to third instar indicates that it may be a severe hypomorph rather than a null.

      We have added a sentence in the text (p12 line 21-25) to clarify that we do not know the exact extent of knockdown for our RNAi experiments, but that by genetic definitions, they meet the criteria of a loss-of-function manipulation.

      R1.2 Loss of ESCRT machinery likely disrupts the release of small EVs to a significant extent; however, the authors do not show that EV release is entirely lost, only that 1) cargoes are backed up in the endosomal system due to endosomal dysfunction and 2) fluorescence of cargoes in the postsynaptic compartment is diminished. To claim that ESCRT-derived EVs with the relevant cargoes are lost, the authors should perform immunogold labelling with TEM. This would provide direct evidence that the cargoes examined here are packaged in ILVs, and that the ILVs are of a size (~50-150nm) consistent with exosomes (which should really be referred to as small extracellular vesicles (sEVs) per the minimal information for studies of extracellular vesicles (MISEV 2018 [https://doi.org/10.1080/20013078.2018.1535750])). Additionally, EM would show the loss of cargo packaging and provide information about where these cargoes localize in the presence of ESCRT mutants/loss-of-function.

      EM (including some limited immunoEM) studies requested by Reviewer 1 have previously been performed in this system by us and by the Budnik and Verstreken labs (Koles et al., 2012; Korkut et al., 2009; Korkut et al., 2013; Lauwers et al., 2018; Walsh et al., 2021). MVBs at the NMJ contain ~50-100 nm ILVs, and can often be seen proximal to or fusing with the plasma membrane. Mutants such as Hsp90 that block this fusion also block EV release, arguing that these MVBs are the source of EV (Lauwers et al., 2018). By immunoEM, the EV cargo Evi localizes to MVBs (Koles et al., 2012). ~50-200 nm structures containing immunogold against Evi were also observed in the subsynaptic reticulum between the neuron and the muscle, as well as in membrane compartments in the muscle cytoplasm (Koles et al., 2012; Korkut et al., 2009). Thus, the criteria requested by the reviewer have previously been established in this system.

      In response to the reviewer’s request to show that these structures are altered in ESCRT mutants, we attempted immunoEM experiments in the Tsg101KD condition. However, similar to the previously published results (Koles et al., 2012; Korkut et al., 2009), immunoEM in thick tissue such as Drosophila larval fillets is quite challenging, and we found it very difficult to retain immunogenicity together with excellent fixation and preservation of membrane structures, such that we could rigorously measure compartment morphology and size. Even if we did achieve good structural preservation, exosomes are ambiguous in complex membrane-rich tissues, since cross-sections through the extensively infolded muscle membrane (e.g. see Fig 3B) are very similar in size to EVs.

      As an alternative and more robust approach, we used STED microscopy, with a resolution of ~50nm, where we could conduct a rigorous and properly powered study of directly labeled EV cargoes (New data in Fig. S1). We show that postsynaptic Nrg and APP-GFP are found in structures with a mean diameter of ~125 nm, consistent with small EVs or exosomes, and these are strongly depleted in the Tsg101KD animals (to similar levels as antibody background far from the site of EV accumulation), as expected. Note that we are able to detect particles significantly smaller than 125 nm in the distribution, suggesting that the resolution of our system is sufficient to measure EV width.

      We also note that several of these cargoes are detected via an intracellular tag (Syt4, APP, Evi) or antibody against an intracellular domain (Nrg), so by topology they must be membrane-bound in the EVs rather than cleaved from the cell surface. We and others have previously shown that this postsynaptic signal is entirely derived from the presynaptic neuron, by using neuronal UAS-expression of a tagged protein, by neuronal RNAi of the endogenous gene, or by the tissue-specific tagging approach in the current manuscript (Fig. S4). We have also previously shown that these puncta contain the tetraspanin Sunglasses (CG12143/Tsp42Ej), which is an EV marker (Walsh et al., 2021). We have added new data to our manuscript (Fig. S1A) to show that neuronally-derived tetraspanin EVs are depleted upon Tsg101KD. Therefore, the reviewer’s point “2) fluorescence of cargoes in the postsynaptic compartment is diminished.” is the most direct and sensitive test of trans-synaptic cargo transfer, and is the precise parameter that we are trying to manipulate to test the functions of this transfer.

      We believe that light microscopy showing loss of presynaptically-derived cargoes in the postsynaptic region is the best and most direct argument for loss of EV secretion, compared to the ambiguity of EM. It is also exactly the method that led to the proposal for the signaling function of EVs in previous work, which our current manuscript is revisiting. We are now using improved tests of that original hypothesis by examining it in light of additional membrane trafficking mutants (and finding that it no longer holds up). Overall, given the preponderance of evidence from the preceding literature and our studies indicating that (1) these cargoes are indeed in EVs and (2) we see a strong enough depletion of transsynaptic transfer to challenge the hypothesis that EVs serve signaling functions (see R1.3 response below), we are reluctant to spend more time attempting immunoEM which is not likely to resolve membrane structures.

      To address the point of EV terminology used in our manuscript, we think it is very unlikely that the postsynaptic structures are not exosomes. The criteria defined by MISEV for exosomes is that they are endosomally-derived from MVBs, ideally with the EV “caught in the act of release” upon fusion with the plasma membrane. As noted above, cargoes such as Syt4 and Evi are observed by immunoEM in MVBs, and these can be found in the process of fusing with the plasma membrane (i.e. caught in the act of release) (Koles et al., 2012; Korkut et al., 2009; Korkut et al., 2013; Lauwers et al., 2018). Mutants that block MVB fusion also block EV release at the NMJ (Lauwers et al., 2018). These EVs require ESCRT for their formation and are trapped in endosomes rather than the plasma membrane upon ESCRT depletion (this study). They depend on multiple components of the endosomal system (Rab GTPases, retromer) for their formation (Koles et al., 2012; Walsh et al., 2021). Taken together, it seems to us that there is sufficient data to argue that these are exosomes. However, as the reviewers requested, we have called them EVs in the revised paper (and only suggest they are exosomes in the discussion).

      R1.3 Other biogenesis pathways utilize multivesicular bodies to generate EVs, most prominently the nSMase2/ceramide synthesis pathway (which operates in an ESCRT-independent manner). It is possible that this pathway compensates when there are defects in the canonical ESCRT pathway. Thus, it is imperative for the authors to show that the cargo secretion no longer occurs in the presence of ESCRT mutations/loss-of-function. The authors should also use nSMase2 pathway mutants to see if the phenotypes in cargo trafficking (i.e., pre/ post-synaptic protein levels) are recapitulated.

      The reviewer asked us to show that cargo secretion does not occur in the ESCRT mutants. We reiterate that at the limits of detection of our assay, we see a very strong depletion of secretion, and that EV cargo levels are not distinguishable from background (Figure S1). Perhaps Reviewer 1’s concern is that since it would never be possible to show that we have depleted EVs completely (i.e. below the level of detection of our assays), that it is not possible to challenge the hypothesis that EV traffic is required for the proposed signaling functions of EVs. Indeed, they mention in their overall assessment “as it is unknown if minor sources of cargo+ EVs are sufficient in maintaining functional phenotype”. We do have some information on this, as described in the manuscript (p3 lines 41-43; p7 lines 25-31; p11 lines 27-30) and as follows: The critical argument against this concern is that other trafficking mutants with residual levels of EVs (rab11 or nwk) do show loss of signaling function (Blanchette et al., 2022; Korkut et al., 2013). Therefore residual EVs, even at the lower level of detection of our assay, are not enough to support signaling. The main difference is that in nwk and rab11 mutants the levels of the cargo in the donor presynaptic neuron are also strongly depleted, unlike in the ESCRT mutants. This strongly suggests that the cargoes are signaling from the presynaptic compartment, rather than in EVs. We have added the nwk mutant to show this baseline in Figure 2A,D. Similarly, our new results showing that hrs mutants retain Wg signaling while Tsg101 mutants do not, despite a similar degree of EV depletion (new data with more cargoes in Figure 2A-F), argue that residual EVs do not account for the lack of disruption of signaling. Finally, we have been transparent in our discussion that trace amounts of EVs could still exist, including by alternative pathways, but are unlikely to provide function (p11 lines 25-33).

      We agree that it might be an interesting future mechanistic direction to ask if the SMase pathway works with or in parallel to the ESCRT pathway (both have been suggested in the literature). However, we do not believe that this is essential for the current work: The SMase pathway is unlikely to be “compensating”, since EVs are already very strongly depleted with ESCRT disruption alone. We also note that SMase depletion may also affect other trafficking pathways (Back et al., 2018; Choezom and Gross, 2022; Niekamp et al., 2022), and therefore might not provide any clarifying information if it did disrupt signaling. In summary, we believe the depletion we see in single ESCRT mutants is sufficient to (1) establish the role of ESCRT in EV traffic in this system, and (2) test the role of transsynaptic transfer in signaling functions of cargoes.

      R1.4 The authors' findings support that cargo trafficking is affected by widespread endosomal dysfunction but doesn't cleanly prove that 1) synaptic sEV release is lost and 2) that cargo-specific sEVs are lost. As previously mentioned, loss of cargo+ ILVs in MVEs by TEM could demonstrate this, but another useful approach would be to include in vitro Drosophila primary neuronal culture/ EV isolation and mass spec/proteomic characterization studies as proof of concept. According to widely agreed upon guidelines in the EV field, the authors should directly characterize their EV population to show 1) the appropriate size distribution associated with exosomes/sEVs, 2) the presence of traditional EV markers (i.e., tetraspanins), 3) changes in overall EV count by ESCRT mutants, and 4) decreased levels of cargo(es) of interest in the presence of ESCRT mutants/loss-of-function. In vitro experiments would be particularly helpful for quantifying the degree of loss of cargo-specific EVs with each ESCRT mutant. These experiments could also investigate the possibility that cargoes are secreted in nSMase2/ Ceramide-derived EVs, by showing that EV cargo levels are unaffected in nSMase mutants.

      Our data already show loss of cargo-specific EVs, defined by puncta of several independent specific cargoes in the extraneuronal space and postsynaptic muscle. To further substantiate this, we have directly characterized our EV population and shown a distribution of ~125 nm extraneuronal structures containing the transmembrane cargoes Nrg and APP (by STED) as well as Evi, Syt4 and the EV marker tetraspanin (by confocal microscopy). This addresses the (1) size distribution, (2) EV marker and (3) count criteria. All these markers (cargoes and tetraspanins) are severely depleted from the postsynaptic area in the ESCRT mutants, satisfying the (4) decreased levels criteria. As noted above, we and others have repeatedly demonstrated that these postsynaptic puncta are derived from neurons, and since we are detecting the intracellular domain in all cases, must be membrane-bound. Others have previously shown by EM that several of these markers are surrounded by membrane and derived from neuronal MVBs (see R1.2). Note that we do not believe that ESCRT mutants must necessarily cleanly show enlarged endosomes without ILVs or a class E vps compartment - instead stalled endosomes appear to be targeted for autophagy in heterogeneous intermediates (Fig 3).

      We do not believe that turning to a heterologous system (e.g. cultured primary Drosophila neurons, which do not even form functional synapses) is usefully translatable to results in neurons in vivo. Data from our lab and many other systems has shown that EV biogenesis and release pathways are highly cell-type specific (p9 lines 8-12), and also differ in different regions of neurons (e.g. synapses vs soma) (Blanchette and Rodal, 2020). Further, keeping the experimental setup of the original EV signaling hypothesis is a prerequisite for our improved tests of this hypothesis. We do note that APP, Evi and Syt4 have been demonstrated by us and others to be released from Drosophila S2 cells in EVs defined by differential centrifugation, sucrose gradient buoyancy, electron microscopy and mass spectrometry (Koles et al., 2012; Korkut et al., 2009; Korkut et al., 2013; Walsh et al., 2021). However even if we did measure the precise change in EV number and cargoes upon ESCRT manipulation in these heterologous cells, it would not allow us to conclude that the same quantitative change was happening in the motor neurons of interest in vivo, which is the information we need to conduct our tests of cargo signaling function. All we would learn is whether ESCRT was required in that cell type, which would not be informative for our study.

      We appreciate that EV researchers working in cell culture systems often use a set of approaches including bulk isolation, EM, and mass spectrometry. Our system does not allow for these approaches, but provides complementary strengths of single EV characterization, in vivo relevance with functional assays, and a wealth of genetic tools. MISEV itself states that it does not provide a set of agreed-upon rules that can be applied generically to any experiment. We agree with the MISEV statement that we should use the best available assays for the system under investigation.

      R1.5 During functional tests of Evi+ motor neurons lacking generation of Evi+ EVs, there is a slight defect observed, namely the increased formation of developmentally arrested ghost boutons when Evi secretion in sEVs is lost. As mentioned, Evi is a transporter of Wg and it is possible for Wg to be transmitted between cells via normal diffusion. Thus, some basal levels of Wg may be reaching the muscle when its transfer via sEVs is abolished, and these basal levels may be sufficient to phenocopy the WT in the number of active zones and boutons. Is it possible that this element of Evi/ Wg function is dose-dependent and thus reliant on the extra Evi/ Wg transferred via sEVs? If possible, the authors should use a Wnt-signaling pathway reporter (i.e., fluorescently tagged Beta-Catenin) to measure the levels of Wnt signaling activity in the muscle when Evi/Wg+ EVs are present vs. abolished. If the degree of Wnt signaling (readout would be intensity of fluorescent reporter) is decreased without Evi+ sEVs, there may be a dose-dependent response. Otherwise, please more clearly disclose the partial loss of Evi function without Evi+ sEVs or state the intact function of Evi without sEVs as speculative.

      We agree that Wg is likely to be reaching the muscle in the absence of Evi exosomes via conventional secretory mechanisms, and have conducted new experiments to test this hypothesis (Fig. 5). In Drosophila muscles, Wg does not signal via a conventional β-catenin pathway. Instead, neuronally-derived Wg activates cleavage of its receptor Fz2, resulting in translocation of a Fz2 C-terminal fragment into the nucleus (Mathew et al., 2005; Mosca and Schwarz, 2010). We did attempt to directly measure Wg (using antibodies or knockins) and though we were able to detect a specific presynaptic signal, the background noise throughout the postsynaptic muscle was too high for a sensible quantification. In response to the reviewer’s question (and also R2.6), we collaborated with the laboratory of Timothy Mosca to test Fz2 nuclear import in Tsg101 and Hrs mutants (new Figure 5F-G). Strikingly, we found that Hrs mutants, despite being extremely sickly, have normal nuclear import of Frizzled. We also confirmed that Hrs mutants have dramatically depleted levels of all EV cargoes examined, including Evi (Figure 2A-F). On the other hand we found that Tsg101 knockdowns have dramatically reduced Wg signaling (and a concomitant defect in postsynaptic development). We do not rule out (but think it is unlikely) that very small amounts of EVs could be present in hrs but not tsg101 mutants. A more parsimonious interpretation is that additional membrane trafficking defects in the Tsg101 mutants (which are beyond the scope of this study to explore in detail) block an alternative mode of Wg release, perhaps conventional secretion. The fact that Hrs mutants, despite showing similar depletion of Evi EVs, do not have a signaling defect strongly argues that EV release per se is not required for Wg signaling.

      R1.6 To support the authors' hypothesis that Syt4 transmission via EVs is a proteostatic mechanism, the authors should determine whether Syt4 cargo localizes to lysosomal compartments in muscle, glia, or both. Otherwise, the proteostatic degradation of Syt4 via EVs is speculative.

      Our data suggest that EVs serve as one of several parallel proteostatic mechanisms for presynaptic cargoes. We have added new data to the manuscript to emphasize the advance our work makes in our understanding of these mechanisms, and have emphasized this in the discussion (p 11-12, lines 46-5).


      1. Degradation of neuronally derived EVs in glia and muscles. Previous work has shown that EV cargoes such as Evi can be found in compartments in the muscle cytoplasm, and that α-HRP-positive puncta are taken up and degraded by glial and muscle phagocytosis (Fuentes-Medel et al., 2009). These α-HRP-positive structures, despite colocalizing with EV cargoes Syt4, Nrg and APP (Walsh et al., 2021), were not previously connected to EVs. We have added new data showing that muscle or glial-specific RNAi of the phagocytic receptor Draper leads to the accumulation of EVs containing Syt4 (new Figure 7G-H). Together with our finding (Figure 7A-F) that Syt4 is not significantly detected in the muscle cytoplasm, these results indicate that the main destination for transsynaptic transfer is phagocytosis by the recipient cell. We have not been able to convincingly detect EV cargoes in the endolysosomal system of muscles, even in mutants disrupting lysosomal traffic, likely because the small number of EVs released by neurons (even over days of development) are drastically diluted in the much larger muscle cell.
      2. Compensatory endosomophagy in the neuron. When EV release is blocked in Hrs or Tsg101 mutants, we observe an induction of autophagy in the neuron (Figure 3B, E-G). However, in the absence of ESCRT manipulation, autophagy mutants do not accumulate EVs (Figure 3C, D; S2H-I). This suggests that autophagy is a compensatory mechanism that is induced in the absence of EV release.
      3. Retrograde transport to cell bodies: We previously found that disruption of neuronal dynactin leads to accumulation of EV cargoes in presynaptic terminals (Blanchette et al., 2022), suggesting that retrograde transport is a mechanism for removal of these cargoes from synapses. Interestingly, EV release is not increased in these conditions, indicating that the retrogradely transported compartment represents a late endosome without ILVs, or an MVB that cannot fuse with the plasma membrane.

      R1.7 Please discuss alternate modes of cargo transfer from the presynaptic compartment to the postsynaptic compartment that may be utilized when EV-mediated transfer is abolished (i.e., cytonemes or tunneling nanotubules).

      We have added these possibilities to the discussion (p11 line 31), though we note that we do not observe any such structures, or indeed any Syt4 in the muscle cytoplasm, and there is no current evidence for such transsynaptic structures in this system. Conventional secretion of Wg into the extracellular space and signaling through its transmembrane receptor Frizzled2 can account for Wg signaling in the absence of exosomes.

      R1.8 OPTIONAL: Investigate the mechanism of Syt4+ sEV fusion with the postsynaptic compartment (direct fusion with the plasma membrane, receptor-mediated fusion, endocytosis and unpacking, or endocytosis and degradation).

      We note that the Budnik lab has already shown that HRP-positive EVs released by NMJs are taken up by glia and muscles (Fuentes-Medel et al., 2009), and we have added data showing that this also applies for Syt4 (Fig. 7). Our data are not consistent with Syt4 fusing with recipient cell membranes or entering the muscle cytoplasm. Further investigation of this mechanism is beyond the scope of this project.

      Given that several fundamental questions have yet to be answered regarding the biogenesis pathways and machinery utilized for EV-mediated cargo secretion, and the necessity for further TEM studies and/or work with primary cultures to characterize ILVs and EVs, >6 months is estimated to perform the necessary experiments that may require learning/ optimizing new systems.

      Minor comments:

      R1.9 Please clarify the choice of using Tsg101 KD in place of mutants of other ESCRT machinery (i.e., Hrs). Especially as when the Tsg101 mutant was characterized, you found major defects in autophagic flux that were not present for HrsD28/Df.

      Tsg101 RNAi was selected since it provides a neuron-autonomous knockdown, eliminating the complications of mutant effects in other tissues. These animals are also relatively healthy as third instar larvae compared to genomic mutants tsg1012 (L1 lethal) and HrsD28 or motor-neuron driven Vps4DN (where L3 larvae are rare). This made it easier to recover enough larvae to properly power experiments, and alleviated concerns that general sickness is contributing to the phenotype (though note that neuronal Tsg101KD does result in pupal lethality). Finally, we were unable to effectively knock down Hrs by RNAi (see R1.1). To extend our studies beyond Tsg101, we have included additional experiments in the revised manuscript showing that HrsD28 animals, despite being quite unhealthy, still retain Syt4-dependent functional plasticity (See R2.5 and R3.4) and Wg signaling.

      R1.10 Please clarify why the specific method in the experiment in Fig. 4E-J was chosen. As Syt4 is a transmembrane protein, it likely undergoes degradation via the lysosome, like other membrane-bound proteins. Is it known whether the proteasome-directed nanobody is sufficient to pull Syt4 from membrane-bound compartments to undergo degradation in the proteasome? Would it make more sense to use a lysosome-directed nanobody?

      The GFP tag on Syt4 is cytosolic rather than lumenal. Our data show that when we express the proteasome-directed nanobody presynaptically, it efficiently degrades membrane-associated Syt4-GFP (Fig. 7B). Therefore we expect that this tool should be similarly effective on membrane-associated Syt4-GFP if it were exposed to the muscle cytoplasm. We have confirmed that it is effective in the muscle against DLG-GFP (Fig. S5A).

      R1.11 Please provide further methodological information regarding the sample preparation for live imaging of axons to generate kymographs found in Fig. S3.

      Additional details have been provided on p14 lines 10-24 and p15 lines 31-37.

      R1.12 In Figure 1I and 1J, include representative image and quantification of Syt4-GFP pre- and post-synaptic intensity for HrsD28/Df for consistency with ShrubKD and Vps4DN in Figure 1K-P.

      We generated and tested HrsD28; Syt4-GFP (Fig 2A,D), and HrsD28; Evi-GFP strains (Fig 2B-E). All EV cargoes exhibited a dramatic post-synaptic depletion in Hrs mutants, similar to the other ESCRT manipulations.

      R1.13 In Figure 2H, please provide a cell type marker or HRP mask with a merged image for image clarity.

      This image shows neuronal cell bodies in the ventral ganglion, which are densely packed relative to each other. The cell type specificity is provided by the motor neuron driver. We did not use a cell type marker or individually mask cells for analysis, but instead quantified intensity over the whole field of view. We can manually trace cell bodies in this image if requested, but it would not represent our ROI for analysis.

      R1.14 In Figure 4B, please provide quantification for the differences between 1) WT Mock and Tsg101 MOCK and 2) WT Stim and Tsg101KD Stim to show that upon stimulation, WT and Tsg101 undergo the same increase in the number of ghost boutons/ NMJ in Muscle 4.

      We have added these statistical comparisons to the graph (Fig. 6B)

      R1.15 In Figure 3 G and H, use consistent scale bars to compare between temperatures.

      We have removed the Shrub data at 20°C as it did not provide additional insight to the manuscript.

      Reviewer #1 (Significance (Required)):

      General assessment (Strengths):

      -Use of Drosophila NMJ model system consistent with others in the field and exceptional harnessing of genetic tools for mutations across the ESCRT pathway (-0, -I, -III, etc.)

      -Identification of ESCRT pathway mutants that do not deplete pre-synaptic cargo levels but generate endosomal dysfunction, indicative of a possible decrease in secretion of cargoes via EVs

      -Implementing functional characterization of Evi/ Wg and Syt4 cargoes, consistent with previous work in the field; highly reproducible

      -Sufficiently thorough investigation of the cross-regulation of autophagy and EV biogenesis by Tsg101

      General assessment (Weaknesses):

      -Lack of investigation of known ESCRT-independent pathways/ genes involved in the generation of sEVs (i.e., nSMase2/ Ceramide) especially as it is unknown if minor sources of cargo+ EVs are sufficient in maintaining functional phenotype

      See R1.3 for comments on this point

      -Lack of sEV characterization and validation of EVs derived from mutant

      We have added STED data to measure EV size, and described the challenges in EV membrane measurements by EM in the in vivo system.

      -Does not show the loss of cargoes of interest on EVs from mutants other than through back-up of cargoes in the presynaptic endocytic pathway (Rab7, Rab5, Rab11)

      We strongly disagree with this comment. We have explicitly measured the loss of numerous cargoes in postsynaptic structures that have been rigorously established to be EVs in this and previous publications. Our findings are not limited to back-up of presynaptic structures.

      -Rigorous investigation of the claim that Evi and Syt4 are released via EVs for proteostatic means is missing. Authors should demonstrate the degradation of EV cargoes by recipient cells (either muscle OR glia)

      We have added new data and discussion on multiple and compensatory proteostatic pathways.

      -If EV-mediated cargo transfer is not required, authors should investigate alternate modes of cargo transfer more rigorously (i.e., diffusion of Wg, suggest/ test hypotheses for mechanism of Syt4 function or transfer).

      We have included discussion of alternate modes of transfer for Wg (i.e. conventional secretion). By contrast, for Syt4 we believe it is acting in the donor cell without transfer, and have included alternate interpretations of the previous literature that had suggested its function in muscles.

      Advance: -Compared with other recent in vivo studies of EVs where donor EVs are loaded with a cargo, such as Cre, which uniquely identifies recipient cells through Cre recombination-mediated expression of a fluorescent reporter (Zomer et al 2015, Cell), this study relies on the readout of fluorescently tagged cargo in the recipient cells to represent transfer via EVs. While numerous studies in the Drosophila field focus on the same small set of known EV cargoes at the NMJ (Koles et al., 2012; Gross et al., 2012; Korkut et al., 2013; Korkut et al., 2009; Walsh et al., 2021), there is a noticeable lack of EV characterization based on MISEV (i.e. TEM of EVs, size distribution, enrichment of well-known EV markers [https://doi.org/10.1080/20013078.2018.1535750]) that would significantly strengthen the work and make it more widely accepted in the EV field.

      As mentioned above, many of these criteria (including EV size and enrichment of known EV markers) are already established in the previous literature for this system. As requested, we have also added similar data to our revised manuscript.

      -In this study, the use of ESCRT machinery mutants is proven as a new technical method in delineating the role of EV cargoes in cell-autonomous versus EV-dependent functions. This is the first study, to my knowledge, that has leveraged mutants from both early and late ESCRT complexes for the study of EVs in Drosophila. Additionally, the finding that some cargoes may be able to carry out their signaling functions, independent of transfer via EVs, provides key mechanistic insight into one possible role of EVs as proteostatic shuttles for cargo. This work also begins to address a fundamental question in the field, which is to delineate roles that EVs actually carry out in physiological conditions, compared to the many roles that have been shown possible in vitro.

      We appreciate the reviewer’s insight into the impact of our work.

      Audience: -Basic research (endosomal biology, ESCRT pathway, cell signaling, neurodevelopment)

      -Specialized (Drosophila, Neurobiology; Extracellular Vesicles)

      -This article will be of interest to basic scientists in the field of endosomal trafficking and extracellular vesicle biology as well as those studying the nervous system in Drosophila melanogaster. As the field of extracellular vesicle biology has broad implications in the spread of pathogenic cargoes in cancer and neurodegenerative disease, the basic biology associated with EVs has some translational relevance.

      Expertise (Keywords):

      -ESCRT and nSMase2 EV biogenesis pathways

      -EV characterization in vitro/ live imaging studies

      -EV release and uptake

      -Neuronal and glial cell biology

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This manuscript addresses the role of exosome secretion in neuromuscular junction development in Drosophila, a system that has been proposed to depend on exosomes. In particular, delivery of Wingless via exosomes has been proposed to promote structural organization of the synapse. Previously, however, the studies that proposed this model targeted the cargoes themselves, rather than targeting exosome biogenesis or secretion. In this new study, exosome biogenesis is targeted via knockdown of the ESCRT components Hrs, TSG101, and Chmp4. The authors find that some previously ascribed functions are not inhibited by these knockdowns. In particular, formation of active zones, as defined by BRP-positive puncta (total and per micrometer), and total bouton numbers. It does look like there is a partial defect in BRP-positive puncta per micrometer, but it is not significant. For ghost bouton formation, there is a similar increase in evi-mutant and ESCRT-KD NMJs (with some subtle differences depending on abdominal segment and temperature). They also examine the role of Syt4, which has been proposed to be transferred from nerve to muscle cells at the junction and to regulate mEJP frequency after stimulation. They found no difference in mEJP frequency after stimulation between WT and TSG101-KD animals, although they did not have a positive control with inhibition of Syt4. They did do an elegant experiment to demonstrate that most of extracellularly transferred Syt4 does not reach the muscle cytoplasm. Overall, it is an interesting paper, mostly well controlled and rigorous, and well-written. It is an important contribution to the EV and NMJ fields. The data should provoke reconsideration of some of the functions that were previously ascribed to exosome transfer at the NMJ. However, I do think that there are some overly strong statements and the functions of the exosomes at the synapse were quite narrowly examined. 
      For example, the title of the paper is pretty strong and the abstract does not say which functions were or were not affected by TSG101 KD. There are also a couple of experiments that would enhance the manuscript. Some specific suggestions are below:

      R2.1 Title: "ESCRT disruption provides evidence against signaling functions for synaptic exosomes" seems a bit broad -- only evi/Wg and Syt4 functions were examined at NMJ synapses, not all signaling functions of all exosomes at all synapses. Something like, "ESCRT disruption provides evidence against signaling functions for exosome-carried evi/Wg and Syt4 at the neuromuscular junction" seems a bit more reasonable.

      We are open to changing the title to: “ESCRT disruption provides evidence against transsynaptic signaling functions for some extracellular vesicle cargoes” though we prefer to leave it as is since “provides evidence against” is already fairly understated.

      R2.2 Abstract: the description of the actual data is very little, just one sentence saying that "many" of the signaling functions are retained with ESCRT depletion. I think a bit more focus on the actual data is warranted.

      We have edited the abstract to include more detail on the signaling phenotypes.

      R2.3 Results section:

      Fig 3: What does A2 and A3 mean for the graphs in c,d,e, g, h? Please specify in figure legend.

      We have described in the figure legends that A2 and A3 refer to specific abdominal segments in the larvae.

      R2.4 The sentence "Further, active zones in Tsg101KD appeared morphologically normal by TEM (Fig.2B)." is confusing to me. What do you mean by that? Are you referring to the following two sentences about feathery DLG and SSR? But the feathery DLG I presume is in Fig 3, where that staining is. And I also don't know what feathery DLG means -- it should be pointed out in the appropriate image.

      Presynaptic active zones are defined by an electron-dense T-shaped pedestal at sites of synaptic vesicle release, and can be seen in the TEM in what is now Figure 3B, marked as AZ. We have also labeled AZ by immunofluorescence (Fig. 5A) and they appear normal.

      By contrast, Dlg primarily labels the postsynaptic apparatus associated with the infoldings of the muscle membrane. In control animals, Dlg immunostaining is relatively tightly and smoothly clustered within ~1µm of the presynaptic neuron. By contrast, in Evi mutants, there are wisps of Dlg-positive structures extending from the bouton periphery. We have added arrows in what is now Fig. 5C to indicate the feathery structures.

      R2.5 Fig 4 addresses Syt4 function. However, there is no positive control inhibiting Syt4 to see if there is a change. Just comparison of WT and TSG101. It seems like this positive control is in order.

      We have added the positive control (Fig. 6E-F) reproducing the previously reported result that Syt4 mutants lack the high-frequency stimulation-induced increase in mEPSP frequency (HFMR). We have also added new data on HrsD28 genomic mutants. Despite the fact that few of these larvae survive and they are quite unhealthy, they still exhibit robust HFMR, similar to the Tsg101KD larvae, strongly supporting our hypothesis.

      R2.6 Discussion: I think some discussion of what ghost boutons are, and of the possible significance of the enhanced ghost bouton formation phenotype in evi and ESCRT mutants, would be helpful.

      We have added more discussion on the ghost bouton phenotype (p11 lines 5-14), especially in light of our new findings that Hrs and Tsg101 mutants may distinguish alternative modes of Wg secretion (see R1.5)

      R2.7 Also, in the Discussion, it is mentioned that Wg probably gets secreted in the ESCRT mutants -- presumably this accounts for the discrepancy between evi mutants and the ESCRT mutants. An experiment to actually test this would greatly enhance the manuscript.

      We have added this experiment as addressed in R1.5

      Reviewer #2 (Significance (Required)):

      Overall, it is an interesting paper, mostly well controlled and rigorous, and well-written. It is an important contribution to the EV and NMJ fields. The data should provoke reconsideration of some of the functions that were previously ascribed to exosome transfer at the NMJ. However, I do think that there are some overly strong statements and the functions of the exosomes at the synapse were quite narrowly examined. For example, the title of the paper is pretty strong and the abstract does not say which functions were or were not affected by TSG101 KD.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Dresselhaus et al. investigates signaling functions for synaptic exosomes at the Drosophila NMJ. Exosomes are widely seen in vivo and in vitro. They are clearly sufficient to induce signaling responses in vitro, but whether they normally fulfill signaling functions in vivo has not been rigorously addressed. The authors make use of several mutants that block exosome release to test whether exosome release is important for two distinct signaling pathways: the Evi/Wg pathway and the Syt4 signaling pathway. Both pathways have been implicated in neuron to muscle signaling. Surprisingly, the authors find scant evidence that exosome release is required for either pathway. They convincingly show that knockdown of Tsg101 (an ESCRT-I component) does not phenocopy many synaptic phenotypes of either wg or syt4. Instead, they propose that in vivo, exosomes may serve as a proteostatic mechanism, as a mechanism for the neuron to dispose of unwanted/damaged proteins.

      Specific comments are below:

      R3.1 Loss of Tsg101 has been linked to upregulated MAPK stress signaling pathways and autophagy. Thus, it's possible that activating such compensatory mechanisms in Tsg101 knockdown animals could mask phenotypes associated with specific loss of EV cargoes such as Wg or Syt4. Indeed, the authors demonstrate that loss of Tsg101 and Hrs have very different effects on synaptic autophagy. To provide additional evidence that Wg or Syt4 signaling is independent of EV release, it would be good to check for wg/syt4 phenocopy in additional ESCRT complex mutants. I understand they did a bit with Shrub knockdown at low temperature in Figure 3, but the temperature-dependence of the ghost bouton phenotype clouds the interpretation. Could the authors try a motorneuron driver with a more restricted phenotype to overcome the lethality issues, or alternatively use one of their other ESCRT component mutants? This is obviously the central claim of the manuscript, and it would be strengthened by carrying out phenotypic analysis in mutants other than the Tsg101 RNAi line.

      As noted for R2.5, we have added HFMR experiments for the HrsD28 genomic mutant, and found that despite being very unhealthy, they exhibit robust HFMR similar to Tsg101KD. We also confirmed dramatic depletion of Syt4 EVs in the HrsD28 mutant. Thus, the preserved Syt4 signaling function in ESCRT mutants with depleted EV Syt4 is not restricted to Tsg101, and does not depend on the co-occurring autophagy phenotype.

      R3.2 In Figure 1, the authors show that neuronal Tsg101 RNAi dramatically reduces "postsynaptic" levels of exosome cargoes at the L3 stage to argue that exosome release is blocked in this mutant. While this seems very likely at the L3 stage, it is unclear when Tsg101 levels are reduced and thus when exosome release is impaired in this background. This is important because we don't know when these signaling pathways act. For example, it is possible that the critical period for Wg and Syt4 signaling is during the L1 stage, and that Tsg101 knockdown is incomplete at that stage. It is important to assay exosome release at earlier larval stage, particularly when RNAi is the method used to reduce gene function.

      We have conducted this experiment. We noted accumulation of cargoes in Tsg101KD L1 larvae, indicating that the RNAi is effective early in development. However, we do not find many EVs in either wild-type or Tsg101KD first instar larvae (red is α-HRP, green is Syt4-GFP). This argues that it is unlikely that EV-mediated signaling has a critical period earlier in development. It is likely that the accumulation of EVs that we observe trapped in the muscle membrane reticulum in third instar larvae were laid down over days or hours of development. We do not propose to include these data in the manuscript unless the editors and reviewers prefer that we do so.

      R3.3 If the Syt4 and Evi exosomes do not serve major signaling roles and are in fact neuronal waste, it seems likely they are phagocytosed by glia. Are non-neuronal Syt4/Evi levels increased when glial phagocytosis is blocked (e.g., in draper mutants)?

      As mentioned above, the Budnik lab previously showed that uptake and degradation of postsynaptic α-HRP-positive structures depends on glial and muscle phagocytosis. α-HRP recognizes a number of neuronally-derived glycoproteins (Snow et al., 1987). Though the Budnik lab had not previously linked these structures to EVs, we do know that they very strongly colocalize with known EV cargoes and depend on the exact same membrane traffic machinery for release, arguing that some α-HRP antigen proteins are also EV cargoes (Blanchette et al., 2022). To close this loop, we have added data showing that Syt4-positive EVs also depend on Draper for their clearance (Fig 7).

      R3.4 For the HFMR experiment, it would be good to see the syt4-dependent phenotype as a positive control.

      As mentioned for R2.5, we have added the Syt4 positive control (Figure 6E,F), which fails to show HFMR as expected.

      R3.5 In the abstract, the authors state that, "the cargoes are likely to function cell autonomously in the motorneuron". Isn't it alternatively possible that these proteins (wg in particular) could signal to the muscle in a non-exosome dependent pathway?

      Yes, we believe that Wg is likely released by another mechanism (perhaps conventional secretion). As noted for R1.5 and R2.6, we have added new data in Fig. 5 showing that Frizzled nuclear import IS NOT disrupted in Hrs mutants, despite dramatic loss of Evi EVs. Interestingly Frizzled nuclear import (and postsynaptic development) IS altered in neuronal Tsg101KD larvae, which disrupt additional membrane trafficking pathways beyond EV release (see Fig. 3). This is particularly interesting in light of the normal Syt4 signaling in Tsg101KD larvae, and supports the hypothesis that Syt4 can function without leaving the neuron, while Wg must be released, albeit not via Hrs-dependent EV formation. Another (less parsimonious) interpretation is that very small amounts of Wg release in the Hrs mutant are sufficient to promote Frizzled nuclear import.

      Reviewer #3 (Significance (Required)):

      This is an important paper that is well-organized and logically presented. It makes a clear and largely compelling case against major signaling roles for exosomes at this synapse. The authors should be commended for publishing this work, which demands a re-evaluation of proposed key roles for exosomes at the fly NMJ. Given the intense interest in exosomes in neurobiology, this paper will be of great interest to neuronal cell biologists working across systems.

      We thank the reviewer for their appreciation of the impact of our work on the field.

      Back, M.J., H.C. Ha, Z. Fu, J.M. Choi, Y. Piao, J.H. Won, J.M. Jang, I.C. Shin, and D.K. Kim. 2018. Activation of neutral sphingomyelinase 2 by starvation induces cell-protective autophagy via an increase in Golgi-localized ceramide. Cell Death Dis. 9:670.

      Blanchette, C.R., and A.A. Rodal. 2020. Mechanisms for biogenesis and release of neuronal extracellular vesicles. Curr Opin Neurobiol. 63:104-110.

      Blanchette, C.R., A.L. Scalera, K.P. Harris, Z. Zhao, E.C. Dresselhaus, K. Koles, A. Yeh, J.K. Apiki, B.A. Stewart, and A.A. Rodal. 2022. Local regulation of extracellular vesicle traffic by the synaptic endocytic machinery. J. Cell Biol. 10.1083/jcb.202112094.

      Choezom, D., and J.C. Gross. 2022. Neutral sphingomyelinase 2 controls exosome secretion by counteracting V-ATPase-mediated endosome acidification. J Cell Sci. 135.

      Enneking, E.M., S.R. Kudumala, E. Moreno, R. Stephan, J. Boerner, T.A. Godenschwege, and J. Pielage. 2013. Transsynaptic coordination of synaptic growth, function, and stability by the L1-type CAM Neuroglian. PLoS Biol. 11:e1001537.

      Fuentes-Medel, Y., M.A. Logan, J. Ashley, B. Ataman, V. Budnik, and M.R. Freeman. 2009. Glia and muscle sculpt neuromuscular arbors by engulfing destabilized synaptic boutons and shed presynaptic debris. PLoS Biol. 7:e1000184.

      Koles, K., J. Nunnari, C. Korkut, R. Barria, C. Brewer, Y. Li, J. Leszyk, B. Zhang, and V. Budnik. 2012. Mechanism of evenness interrupted (Evi)-exosome release at synaptic boutons. J Biol Chem. 287:16820-16834.

      Korkut, C., B. Ataman, P. Ramachandran, J. Ashley, R. Barria, N. Gherbesi, and V. Budnik. 2009. Trans-synaptic transmission of vesicular Wnt signals through Evi/Wntless. Cell. 139:393-404.

      Korkut, C., Y. Li, K. Koles, C. Brewer, J. Ashley, M. Yoshihara, and V. Budnik. 2013. Regulation of postsynaptic retrograde signaling by presynaptic exosome release. Neuron. 77:1039-1046.

      Lauwers, E., Y.C. Wang, R. Gallardo, R. Van der Kant, E. Michiels, J. Swerts, P. Baatsen, S.S. Zaiter, S.R. McAlpine, N.V. Gounko, F. Rousseau, J. Schymkowitz, and P. Verstreken. 2018. Hsp90 Mediates Membrane Deformation and Exosome Release. Mol Cell. 71:689-702 e689.

      Mathew, D., B. Ataman, J. Chen, Y. Zhang, S. Cumberledge, and V. Budnik. 2005. Wingless signaling at synapses is through cleavage and nuclear import of receptor DFrizzled2. Science. 310:1344-1347.

      Moberg, K.H., S. Schelble, S.K. Burdick, and I.K. Hariharan. 2005. Mutations in erupted, the Drosophila ortholog of mammalian tumor susceptibility gene 101, elicit non-cell-autonomous overgrowth. Dev Cell. 9:699-710.

      Mosca, T.J., and T.L. Schwarz. 2010. The nuclear import of Frizzled2-C by Importins-beta11 and alpha2 promotes postsynaptic development. Nat Neurosci. 13:935-943.

      Niekamp, P., F. Scharte, T. Sokoya, L. Vittadello, Y. Kim, Y. Deng, E. Sudhoff, A. Hilderink, M. Imlau, C.J. Clarke, M. Hensel, C.G. Burd, and J.C.M. Holthuis. 2022. Ca(2+)-activated sphingomyelin scrambling and turnover mediate ESCRT-independent lysosomal repair. Nat Commun. 13:1875.

      Snow, P.M., N.H. Patel, A.L. Harrelson, and C.S. Goodman. 1987. Neural-specific carbohydrate moiety shared by many surface glycoproteins in Drosophila and grasshopper embryos. J Neurosci. 7:4137-4144.

      Trajkovic, K., C. Hsu, S. Chiantia, L. Rajendran, D. Wenzel, F. Wieland, P. Schwille, B. Brugger, and M. Simons. 2008. Ceramide triggers budding of exosome vesicles into multivesicular endosomes. Science. 319:1244-1247.

      Walsh, R.B., E.C. Dresselhaus, A.N. Becalska, M.J. Zunitch, C.R. Blanchette, A.L. Scalera, T. Lemos, S.M. Lee, J. Apiki, S. Wang, B. Isaac, A. Yeh, K. Koles, and A.A. Rodal. 2021. Opposing functions for retromer and Rab11 in extracellular vesicle traffic at presynaptic terminals. J Cell Biol. 220:e202012034.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Recommendations For The Authors):

      In this revision the authors address some of the key concerns, including clarification of the balanced nature of the RL driven pitch changes and conducting analyses to control for the possible effects of singing quantity on their results. The paper is much improved but still has some sources of confusion, especially around Fig. 4, that should be fixed. The authors also start the paper with a statistically underpowered minor claim that seems unnecessary in the context of the major finding. I recommend the authors may want to restructure their results section to focus on the major points backed by sufficient n and stats.

      Major issues.

      (1) The results section begins very weak - a negative result based on n=2 birds and then a technical mistake of tube clogging re-spun as an opportunity to peek at intermittent song in the otherwise muted birds. The logic may be sound but these issues detract from the main experiment, result, analysis, and interpretation. I recommend re-writing this section to home in on, from the outset, the well-powered results. How much is really gained from the n=2 birds that were muted before ANY experience? These negative results may not provide enough data to make a claim. Nor is this claim necessary to motivate what was done in the next 6 birds. I recommend dropping the claim.

      We thank the reviewer for the recommendation. We moved the information to the Methods.

      (2) Fig. 4 is very important yet remains very confusing, as detailed below.

      Fig. 4a. Can the authors clarify if the cohort of WNd birds that give rise to the positive result in Fig 4 ever experienced the mismatch in the absence of ongoing DAF reinforcement pre-deafening? Neither Fig. 4a nor the text clearly specifies this. This is important because we know that there are day timescale delays in LMAN-dependent bias away from DAF and consolidation into the HVC-RA pathway (Andalman and Fee, 2009). Thus, if birds experienced mismatch pre-deafening in the absence of DAF, then an early learning phase in Area X could be set in place. Then deafening occurs, but these weight changes in X could result in LMAN bias that expresses only days later, independent of auditory feedback. Such a process would not require an internal model as the authors are arguing for here. It would simply arise from delays in implementing reinforcement-driven feedback. If the birds in Fig 4 always had DAF on before deafening, then this is not an issue. But if the birds had hours of singing with DAF off before deafening, and therefore had the opportunity to associate DA error signals with the targeted time in the song (e.g. pauses on the far-from-target renditions (Duffy et al, 2022)), then the return-to-baseline would be expected to be set in place independent of auditory feedback. Please clarify exactly if the pitch-contingent DAF was on or off in the WNd cohort in the hours before deafening. In Fig. 3b it looks like the answer is yes but I cannot find this clearly stated in the text.

      We did not provide DAF-free singing experience to the birds in Fig. 4 before deafening. Thus, according to the reviewer, the concern does not apply.

      Note that we disagree with the reviewer’s premise that there is a ‘day timescale delay in LMAN-dependent bias away from DAF and consolidation into the HVC-RA pathway’. More recent data reveal immediate consolidation of the anterior forebrain bias without a night-time effect (Kollmorgen, Hahnloser, Mante 2020; Tachibana, Lee, Kai, Kojima 2022). Thus, the single bird in (Andalman and Fee 2009) seems to be somewhat of an outlier.

      Hearing birds can experience the mismatch regardless of whether they experience DAF-free singing (provided their song was sufficiently shifted): even the renditions followed by white noise can be assessed with regards to their pitch mismatch, so that DAF imposes no limitation on mismatch assessment.

      We disagree with their claim that no internal model would be needed in case consolidation was delayed in Area X. If indeed, Area X stores the needed change and it takes time to implement this change in LMAN, then we would interpret the change in Area X as the plan that birds would be able to implement without auditory feedback. Because pitch can either revert (after DAF stops) or shift further away (when DAF is still present), there is no rigid delay that is involved in recovering the target, but a flexible decision making of implementing the plan, which in our view amounts to using a model.

      Fig 4b. Early and Late colored dots in legend are both red; late should be yellow? Perhaps use colors that are more distinct - this may be an issue of my screen but the two colors are difficult to discern.

      We used colors yellow to red to distinguish different birds and not early and late. We modified the markers to improve visual clarity: Early is indicated with round markers and late with crosses.

      Fig 4b. R, E, and L phases are only plotted for 4c; not in 4b. But the figure legend says that R, E and L are on both panels.

      In Fig. 4b E and L are marked with markers because they are different for different birds. In Fig. 4c the phases are the same for all birds and thus we labeled them on top. We additionally marked R in Fig. 4b as in Fig. 4c.

      Fig 4e. Did the color code switch? In the rest of Fig 4, DLO is red and WND is blue. Then in 4e it swaps. Is this a typo in the caption? Or are the colors switched? Please fix this; it's very confusing.

      Thank you for pointing out the typo in the caption. We corrected it.

      The y axes in Fig 4d-e are both in std of pitch change - yet they have different ylim, which makes it difficult to compare them by eye. Is there a reason for this? Can the authors make the ylim the same for Fig 4d-e?

      We added dashed lines to clarify the difference in ylim.

      Fig 4d-e is really the main positive finding of the paper. Can the authors show an example bird that showcases this positive result, plotted as in Fig 3b? This will help the audience clearly visualize the raw data that go into the d' analyses and get a more intuitive sense of the magnitude of the positive result.

      We added example birds to figure 4, one for WNd and one for dLO.

      Please define 'late' in Fig.4 legend.

      Done

      Minor

      Define NRP in the text with an example. Is an NRP of 100 where the bird was before the withdrawal of reinforcement?

      We added the sentence to the results:

      "We quantified recovery in terms of NRP to discount for differences in the amount of initial pitch shift, where NRP = 0% corresponds to complete recovery and NRP = 100% corresponds to pitch values before withdrawal of reinforcement (R) and thus no recovery."
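
      The two anchor points in this definition can be illustrated with a minimal sketch. It assumes that NRP is a linear rescaling between the baseline (target) pitch and the pitch just before withdrawal of reinforcement; the exact formula is not stated in this response, so the function name, its parameters, and the linear form are illustrative assumptions rather than the authors' implementation.

```python
def normalized_residual_pitch(pitch, baseline_pitch, pitch_at_r):
    """Return NRP in percent under an assumed linear normalization.

    pitch          -- current pitch value (e.g. in Hz)
    baseline_pitch -- pitch before any reinforcement (hypothetical name)
    pitch_at_r     -- pitch just before reinforcement is withdrawn (R)

    0%   -> pitch is back at baseline (complete recovery)
    100% -> pitch is unchanged since R (no recovery)
    """
    initial_shift = pitch_at_r - baseline_pitch
    if initial_shift == 0:
        # Without an initial shift there is nothing to normalize by.
        raise ValueError("no initial pitch shift; NRP is undefined")
    return 100.0 * (pitch - baseline_pitch) / initial_shift


# A bird shifted from 600 Hz to 620 Hz that has returned to 610 Hz
# has recovered half of its shift under this assumed definition.
half_recovered = normalized_residual_pitch(610.0, 600.0, 620.0)
```

      Dividing by the initial shift is what lets birds with different amounts of pitch shift be compared on one scale, which matches the stated purpose of the measure.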

      Reviewer #3 (Recommendations For The Authors):

      The use of "hierarchically lower" to refer to the flexible process is confusing to me, and possibly to many readers. Some people think of flexible, top-down processes as being _higher_ in a hierarchy. Regardless, it doesn't seem important, in this paper, to label the processes in a hierarchy, so perhaps avoid using that terminology.

      We reformulated the paragraph using ‘nested processes’ instead of hierarchical processes.

      In the statement "a seeming analogous task to re-pitching of zebra finch song, in humans, is to modify developmentally learned speech patterns", a few suggestions: it is not clear whether "re-pitching" refers to planning or feedback-dependent learning (I didn't see it introduced anywhere else). And if this means planning, then it is not clear why this would be analogous to "humans modifying developmentally learned speech patterns". As you mentioned, humans are more flexible at planning, so it seems re-pitching would _not_ be analogous (or is this referring to the less flexible modification of accents?).

      We changed the sentence to:

      "Thus, a seeming analogous task to feedback-dependent learning of zebra finch song, in humans, is to modify developmentally learned speech patterns."

    1. Reviewer #2 (Public Review):

      Summary:

      The physiology and behaviour of animals are regulated by a huge variety of neuropeptide signalling systems. In this paper, the authors focus on the neuropeptide ion transport peptide (ITP), which was first identified and named on account of its effects on the locust hindgut (Audsley et al. 1992). Using Drosophila as an experimental model, the authors have mapped the expression of three different isoforms of ITP (Figures 1, S1, and S2), all of which are encoded by the same gene.

      The authors then investigated candidate receptors for isoforms of ITP. Firstly, Drosophila orthologs of G-protein coupled receptors (GPCRs) that have been reported to act as receptors for ITPa or ITPL in the insect Bombyx mori were investigated. Importantly, the authors report that ITPa does not act as a ligand for the GPCRs TkR99D and PK2-R1 (Figure S3). Therefore, the authors investigated other putative receptors for ITPs. Informed by a previously reported finding that ITP-type peptides cause an increase in cGMP levels in cells/tissues (Dircksen, 2009, Nagai et al., 2014), the authors investigated guanylyl cyclases as candidate receptors for ITPs. In particular, the authors suggest that Gyc76C may act as an ITP receptor in Drosophila.

      Evidence that Gyc76C may be involved in mediating effects of ITP in Bombyx was first reported by Nagai et al. (2014) and here the authors present further evidence, based on a proposed concordance in the phylogenetic distribution of ITP-type neuropeptides and Gyc76C (Figure 2). Having performed detailed mapping of the expression of Gyc76C in Drosophila (Figures 3, S4, S5, S6), the authors then investigated if Gyc76C knockdown affects the bioactivity of ITPa in Drosophila. The inhibitory effect of ITPa on leucokinin- and diuretic hormone-31-stimulated fluid secretion from Malpighian tubules was found to be abolished when expression of Gyc76C was knocked down in stellate cells and principal cells, respectively (Figure 4). However, as discussed below, this does not provide proof that Gyc76C directly mediates the effect of ITPa by acting as its receptor. The effect of Gyc76C knockdown on the action of ITPa could be an indirect consequence of an alteration in cGMP signalling.

      Having investigated the proposed mechanism of ITPa in Drosophila, the authors then investigated its physiological roles at a systemic level. In Figure 5 the authors present evidence that ITPa is released during desiccation and accordingly, overexpression of ITPa increases survival when animals are subjected to desiccation. Furthermore, knockdown of Gyc76C in stellate or principal cells of Malpighian tubules decreases survival when animals are subjected to desiccation. However, whilst this is correlative, it does not prove that Gyc76C mediates the effects of ITPa. The authors investigated the effects of knockdown of Gyc76C in stellate or principal cells of Malpighian tubules on (i) survival when animals are subjected to salt stress and (ii) time taken to recover from chill coma. It is not clear, however, why animals over-expressing ITPa were not also tested for (i) survival when subjected to salt stress and (ii) time taken to recover from chill coma. In Figures 6 and S8, the authors show the effects of Gyc76C knockdown in the female fat body on metabolism, feeding-associated behaviours and locomotor activity, which are interesting. Furthermore, the relevance of the phenotypes observed to potential in vivo actions of ITPa is explored in Figure 7. The authors conclude that "increased ITPa signaling results in phenotypes that largely mirror those seen following Gyc76C knockdown in the fat body, providing further support that ITPa mediates its effects via Gyc76C." Use of the term "largely mirror" seems inappropriate here because there are opposing effects, e.g. decreased starvation resistance in Figure 6A versus increased starvation resistance in Figure 7A. Furthermore, as discussed above, the results of these experiments do not prove that the effects of ITPa are mediated by Gyc76C because the effects reported here could be correlative, rather than causative.

      Lastly, in Figures 8, S9, and S10 the authors analyse publicly available connectomic data and single-cell transcriptomic data to identify putative inputs and outputs of ITPa-expressing neurons. These data are a valuable addition to our knowledge of ITPa-expressing neurons, but they do not address the core hypothesis of this paper - namely that Gyc76C acts as an ITPa receptor.

      Strengths:

      (1) The main strengths of this paper are (i) the detailed analysis of the expression and actions of ITP and the phenotypic consequences of over-expression of ITPa in Drosophila, and (ii) the detailed analysis of the expression of Gyc76C and the phenotypic consequences of knockdown of Gyc76C expression in Drosophila.

      (2) Furthermore, the paper is generally well-written and the figures are of good quality.

      Weaknesses:

      (1) The main weakness of this paper is that the data obtained do not prove that Gyc76C acts as a receptor for ITPa. Therefore, the following statement in the abstract is premature: "Using a phylogenetic-driven approach and the ex vivo secretion assay, we identified and functionally characterized Gyc76C, a membrane guanylate cyclase, as an elusive Drosophila ITPa receptor." Further experimental studies are needed to determine if Gyc76C acts as a receptor for ITPa. In the section of the paper headed "Limitations of the study", the authors recognise this weakness. They state "While our phylogenetic analysis, anatomical mapping, and ex vivo and in vivo functional studies all indicate that Gyc76C functions as an ITPa receptor in Drosophila, we were unable to verify that ITPa directly binds to Gyc76C. This was largely due to the lack of a robust and sensitive reporter system to monitor mGC activation." It is not clear what the authors mean by "the lack of a robust and sensitive reporter system to monitor mGC activation". The discovery of mGCs as receptors for ANP in mammals was dependent on the use of assays that measure GC activity in cells (e.g. by measuring cGMP levels in cells). Furthermore, more recently cGMP reporters have been developed. The use of such assays is needed here to investigate directly whether Gyc76C acts as a receptor for ITPa. In summary, insufficient evidence has been obtained to conclude that Gyc76C acts as a receptor for ITPa. Therefore, I think there are two ways forward, either:
      (a) The authors obtain additional biochemical evidence that ITPa is a ligand for Gyc76C.
      or
      (b) The authors substantially revise the conclusions of the paper (in the title, abstract, and throughout the paper) to state that Gyc76C MAY act as a receptor for ITPa, but that additional experiments are needed to prove this.

      (2) The authors state in the abstract that a phylogenetic-driven approach led to their identification of Gyc76C as a candidate receptor for ITPa. However, there are weaknesses in this claim. Firstly, the hypothesis that Gyc76C may be involved in mediating effects of ITPa was first proposed ten years ago by Nagai et al. 2014, so this surely was the primary basis for investigating this protein. Nevertheless, investigating if there is correspondence in the phylogenetic distribution of ITP-type and Gyc76C-type genes/proteins is a valuable approach to addressing this issue. Unfortunately, the evidence presented is rather limited in scope. Essentially, the authors report that they only found ITP-type and Gyc76C-type genes/proteins in protostomes, but not in deuterostomes. What is needed is a more fine-grained analysis at the species level within the protostomes. Thus, are there protostome species in which both ITP-type and Gyc76C-type genes/proteins have been lost? Furthermore, are there any protostome species in which an ITP-type gene is present but a Gyc76C-type gene is absent, or vice versa? If there are, this would argue against Gyc76C being a receptor for ITPa. In this regard, it is noteworthy that in Figure 2A there are two ITP-type precursors in C. elegans, but there are no Gyc76C-type proteins shown in the tree in Figure 2B. Thus, what is needed is a more detailed analysis of protostomes to investigate if there really is correspondence in the phylogenetic distribution of Gyc76C-type and ITP-type genes at the species level.

      (3) The manuscript would benefit from a more comprehensive overview and discussion of published literature on Gyc76C in Drosophila, both as a basis for this study and for interpretation of the findings of this study.

    1. Author response:

      We thank eLife and the reviewers for the thoughtful summary and valuable review of our manuscript. We largely agree with the summary and review and have provided our responses to the comments below. We believe BADGERS is a significant new tool for identifying associated risk factors for complex diseases, and the associations we observed in the analysis provide insights into the genetic basis of Alzheimer's disease.

      Reviewer #1 (Public Review):

      The major aim of the paper was a method for determining genetic associations between two traits using common variants tested in genome-wide association studies. The work includes a software implementation and application of their approach. The results of the application of their method generally agree with what others have seen using similar AD and UKB data.

      The paper has several distinct portions. The first is a method for testing genetic associations between two or more traits using genome-wide association tests statistics. The second is a python implementation of the method. The last portion is the results of their method using GWAS from AD and UK Biobank.

      We thank the reviewer for the conclusion and positive comments.

      Regarding the method, it seems like it has similarities to LDSC, and it is not clear how it differs from LDSC or other similar methods. The implementation of the method used python 2.7 (or at least was reportedly tested using that version) that was retired in 2020. The implementation was committed between Wed Oct 3 15:21:49 2018 to Mon Jan 28 09:18:09 2019 using data that existed at the time so it was a bit surprising it used python 2.7 since it was initially going to be set for end-of-life in 2015. Anyway, trying to run the package resulted in unmet dependency errors, which I think are related to an internal package not getting installed. I would expect that published software could be installed using standard tooling for the language, and, ideally, software should have automated testing of key portions.

      We thank the reviewer for their comments. To clarify, the primary difference between our proposed method, BADGERS, and LDSC lies in their respective objectives and applications. LDSC is designed to estimate heritability and genetic correlations between traits by utilizing GWAS summary statistics, thereby aiding in the elucidation of the genetic architecture of complex traits and diseases. Conversely, BADGERS is specifically developed to explore causal relationships between risk factors, such as biomarkers, and diseases of interest. It employs genetic variants as variables to deduce causality, thereby addressing the challenges of confounding and reverse causation that are common in observational studies. Although BADGERS utilizes the LD reference panel derived from LDSC, the LD reference panel is used to obtain the predicted trait expression. The ultimate goal is to focus on linking biobank traits with Alzheimer’s disease and building causal relationships instead of identifying genetic architecture.

      Regarding the technical aspects mentioned, we acknowledge the concerns about the use of Python 2.7 and the issues encountered during the package installation. We are in the process of updating the software to ensure compatibility with current versions of Python and to enhance the installation process with standard tooling and automated testing for a more user-friendly experience. We have provided tests for each portion of the software so the user can test if the software is working properly.

      Regarding the main results, they find what has largely been shown by others using the same data or similar data, which adds prima facie validity to the work. The portions of the work dealing with AD subgroups, pathology, biomarkers, and cognitive traits are of interest. I was puzzled why the authors suggested surprise regarding parental history and high cholesterol not being associated with MCI or cognitive composite scores, since this would seem like the likely fallout of selection of the WRAP cohort. The discussion paragraph that started "What's more, environmental factors may play a big role in the identified associations." confused me. I think what the authors are referring to is how selection, especially in a biobank dataset, can induce correlations, which is not what I think of as an environmental effect.

      We thank the reviewer very much for their comment. We're glad that our findings align with existing research using similar data, increasing the validity of our work and the proposed BADGERS algorithm. Your point about the lack of association between parental history, high cholesterol, and mild cognitive impairment (MCI) or cognitive composite scores in the WRAP cohort is well-taken. We agree that the selection criteria of the WRAP cohort may influence these findings, as it consists of individuals with a specific risk profile for Alzheimer's disease. This selection could indeed mitigate the observed association between these factors and cognitive outcomes, which we initially found surprising.

      Regarding the environmental factors, we appreciate your clarification and understand the confusion. Our intention was to discuss the potential for selection bias and confounding factors in biobank datasets for the identified associations, which might not necessarily be direct environmental effects.

      Overall, the work has merit, but I am left without a clear impression of the improvement in the approach over similar methods. Likewise, the results are interesting, but similar findings are described with the data that was used in the study, which are over 5 years old at the time of this review.

      We thank the reviewer a lot for their endorsement of the BADGERS framework. We believe that our method, BADGERS, improves on existing approaches by effectively linking genetic data with the detailed phenotypic information in biobanks and large disease GWAS. This enhances our ability to detect associations without needing individual-level data, offering clearer insights while reducing issues like reverse causality and confounding factors.

      Even though the IGAP dataset is over five years old, it remains one of the largest publicly available datasets for Alzheimer’s Disease. Likewise, the UK Biobank is one of the largest publicly available human traits datasets, which researchers continue to use. These datasets' continued utility demonstrates their value in the research community. Additionally, the versatility of the BADGERS framework makes it suitable for future research investigating the relationship between human traits and various diseases using different datasets.

      Reviewer #2 (Public Review):

      Summary:

      Yan, Hu, and colleagues introduce BADGERS, a new method for biobank-wide scanning to find associations between a phenotype of interest, and the genetic component of a battery of candidate phenotypes. Briefly, BADGERS capitalizes on publicly available weights of genetic variants for a myriad of traits to estimate polygenic risk scores for each trait, and then identify associations with the trait of interest. Of note, the method works using summary statistics for the trait of interest, which is especially beneficial for running in population-based cohorts that are not enriched for any particular phenotype (ie. with few actual cases of the phenotype of interest).

      Here, they apply BADGERS on Alzheimer's disease (AD) as the trait of interest, and a battery of circa 2,000 phenotypes with publicly available precalculated genome-wide summary statistics from the UK Biobank. They run it on two AD cohorts, to discover at least 14 significant associations between AD and traits. These include expected associations with dementia, cognition (educational attainment), and socioeconomic status-related phenotypes. Through multivariate modelling, they distinguish between (1) clearly independent components associated with AD, from (2) by-product associations that are inflated in the original bivariate analysis. Analyses stratified according to APOE inclusion show that this region does not seem to play a role in the association of some of the identified phenotypes. Of note, they observe overlap but significant differences in the associations identified with BADGERS and other Mendelian randomization (MR), hinting at BADGERS being more powerful than classical top variant-based MR approaches. They then extend BADGERS to other AD-related phenotypes, which serves to refine the hypotheses about the underlying mechanisms accounting for the genetic correlation patterns originally identified for AD. Finally, they run BADGERS on a pre-clinical cohort with mild cognitive impairment. They observe important differences in the association patterns, suggesting that this preclinical phenotype (at least in this cohort) has a different genetic architecture than general AD.

      We thank the reviewer a lot for the conclusion and positive comments.

      Strengths:

      BADGERS is an interesting new addition to a stream of attempts to "squeeze" biobank data beyond pure association studies for diagnosis. Increasingly available biobank cohorts do not usually focus on specific diseases. However, they tend to be data-rich, opening for deep explorations that can be useful to refine our knowledge of the latent factors that lead to diagnosis. Indeed, the possibility of running genetic correlation studies in specific sub-settings of interest (e.g. preclinical cohorts) is arguably the most interesting aspect of BADGERS. Classical methods like LDSC or two-sample MR capitalize on publicly available summary statistics from large cohorts, or having access to individual genotype data of large cohorts to ensure statistical power. Seemingly, BADGERS provides a balanced opportunity to dissect the correlation between traits of interest in settings with small sample size in which other methods do not work well.

      We thank the reviewer a lot for the conclusion and positive comments.

      Weaknesses:

      However, the increased statistical power is only hinted at, and, for instance, they do not explore if LDSC would have identified these associations. Although I suspect that is the case, this evidence is important to ensure that the abovementioned balance is right. Finally, as discussed by the authors, the reliance on polygenic risk scoring necessarily undermines the causality evidence gained through BADGERS. In this sense, BADGERS provides an alternative to strict instrumental-variable based analysis, which can be particularly useful to generate new mechanistic hypotheses.

      We thank the reviewer a lot for the comments. We understand the importance of comparing BADGERS to other methods. The comparison with LDSC, while not directly relevant to the causal inference aims of BADGERS, is indeed an interesting aspect to consider for future studies. In this paper, we focused on comparing BADGERS with Mendelian Randomization (MR), which shares its causal inference objective.

      As a result, BADGERS identified a total of 48 traits that reached Bonferroni-corrected statistical significance. In contrast, MR-IVW only identified nine traits with Bonferroni-corrected statistical significance. Among these nine traits, seven were also identified by BADGERS. This demonstrates that BADGERS has higher power in detecting causal relationships.
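
      For readers less familiar with the criterion used in this comparison, the following is a minimal sketch of Bonferroni-corrected significance. It is a generic illustration, not the BADGERS or MR-IVW code; the function names, trait names, and p-values are hypothetical, and the number of tests in the actual analysis is not restated here.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_traits(p_values, alpha=0.05):
    """Return the names of traits whose p-value passes the Bonferroni threshold.

    p_values maps a trait name to its p-value; every entry counts as one test.
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(name for name, p in p_values.items() if p < threshold)


# Hypothetical p-values for four traits (illustration only).
example = {"trait_a": 1e-6, "trait_b": 0.004, "trait_c": 0.03, "trait_d": 0.2}
hits = significant_traits(example)
```

      Because the threshold shrinks with the number of traits tested, two methods run on the same trait panel can be compared by the number of traits each one carries past that threshold, which is the comparison reported above.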

      Regarding the use of polygenic risk scoring, we agree that it holds challenges in directly inferring causality. While BADGERS offers an innovative way to explore genetic correlations and can help generate new hypotheses about disease mechanisms, it does not replace the causal inferences that can be drawn from instrumental-variable-based analyses. Instead, it should be viewed as a complementary tool that can illuminate potential genetic relationships and guide further causal investigations.

      In summary, after 15 years of focus on diagnosis that would require having individual access to large patient cohorts, BADGERS can become an excellent tool to dig into trait heterogeneity, especially if it turns out to be more powerful than other available methodologies.

      We thank the reviewer a lot for the conclusion and positive comments.

    1. Since the main goal of this study was to capture the experiences of Asian American girls, I did not include most of the other Basement Group students in my research. There may be gender, ethnic, and/or racial differences that are not reflected in this study. As an exception, I talked with Savannah and Meli, two Salvadoran immigrant girls who were close friends with the Asian American girls and part of the core members of the Basement Community. Their perspectives helped deepen my understanding of the experiences of the main participants

      I think step-by-step studies that control variables are important. The rigor and objectivity of our research depend on the specific details of the research subjects we attend to. In the future, these could also be counted on a larger scale, which would strengthen the objectivity of the study as a whole.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The authors' findings are primarily rooted in a series of well-conducted in vitro experiments using two CML cell lines, K562 and MEG-01. While the findings are interesting and novel, further work to corroborate these findings in primary CML samples would have greatly strengthened the potential real-world relevance of these discoveries. The authors appear to have some PBMCs from primary CML patients and a BM sample from a Ph+ ALL in which they performed western blot analyses (Fig 1). Couldn't these samples have been used to at least confirm some of the key discoveries? For example, the neddylation of BCR-ABL, or; sensitivity of primary leukemic cells to RAPSYN knockdown, and/or; phosphorylation of RAPSYN by SRC?

      We agree with your points and really appreciate your comments. To demonstrate the clinical relevance, we have conducted a series of experiments to address your concerns.

      (1) After thorough optimization of the transduction process, we were able to show that shRNA-mediated gene silencing of RAPSYN impaired the growth of primary CML samples. These additional data are presented as Figure 1D in the revised manuscript with its corresponding figure legend and description, lines 136-141.

      (2) We invested considerable time and effort in addressing the “key discoveries”, despite the great technical difficulty of the task. With 5 mL (ethical approval) of PBMCs on hand, we managed to confirm BCR-ABL neddylation by IP in samples from two newly acquired CML patients. The results are presented in Figure 2F in the revised manuscript with its corresponding figure legend and description, lines 186-187.

      (2) The authors initially interrogated a fairly dated (circa 2009) microarray-based primary dataset to show that the increase in RAPSYN is primarily a post-transcriptional event, as mRNA levels are not different between healthy and CML samples. It would be interesting to see whether differences might be more readily seen in more recent RNA-seq datasets from CML patients, given the well-known differences in sensitivity between the two platforms. Additionally, I wonder if there would be transcriptional signatures of increased NEDDylation (or RAPSYN-induced NEDDylation) that could be interrogated in primary samples? Furthermore, there are proteomics datasets of CML cells made resistant to TKIs (through in vitro selection experiments) that could be interrogated for independent validation of the authors' discoveries. For example: from K562 cells, PMID: 30730747 or PMID: 34922009).

      Thank you very much for your constructive comments. Based on your suggestion, we have 1) analyzed the mRNA level of RAPSYN in RNA-seq datasets GSE13159 (2009), GSE138883 (2020) and GSE140385 (2020), which showed no difference between CML patients and healthy donors. We have included the results in Figure 1-figure supplement 1A and in the revised manuscript (lines 123-127); 2) examined the RNA levels of RAPSYN-related neddylation enzymes, including E1 (NAE1), E2 (UBE2M), NEDD8 and NEDP1 in these databases, and likewise found no significant differences in these neddylation-related genes between CML patients and healthy donors (Supplementary Figure 2C, lines 168-172).

      We have also analyzed the proteomics datasets from PMID: 30730747 and PMID: 34922009 according to your suggestion. Unfortunately, no information on RAPSYN expression is available in these datasets. To avoid overlooking anything, we examined all CML-related proteomics datasets from 2002 to 2024, and again found no information about protein expression of RAPSYN. Consequently, our finding of higher RAPSYN expression in the PBMCs of Ph+ patients appears to be the first observation of its kind. We also believe that our results should be more clinically relevant than those, if any, from cells obtained by in vitro selection.

      Reviewer #2 (Public Review):

      Most of the conclusions drawn in this paper are well supported by data, but some aspects of the data need to be clarified and extended:

      (1) The authors propose that targeting RAPSYN in Ph+ leukemia could have a high therapeutic index, suggesting that inhibition of RAPSYN may lead to cytotoxicity in Ph+ leukemia with high specificity and minimal side effects. To substantiate this assertion, the authors should investigate the impact on cell viability upon RAPSYN knockdown in non-Ph leukemic cell lines or HS-5 cells (similar to Figure 1C), despite their lower RAPSYN protein levels.

      We appreciate your valuable comments. When we used shRNA to knock down the expression of RAPSYN in HS-5 cells, it did not affect their growth. We have included the data in Figure 1C, modified its figure legend, and added a corresponding description, lines 136-141.

      (2) The authors intriguingly show that the protein levels of RAPSYN are significantly enriched in Ph+ patient samples and cell lines (Figure 1A, B), even though the mRNA levels remain unchanged (Supplementary Figure 1 A-C). This observation merits a clear explanation in the context of the presented results. The data in the manuscript does imply a feedforward loop mechanism (Figure 7), where BCR-ABL activates SRC, which subsequently stabilizes RAPSYN, which in turn helps protect BCR-ABL from c-CBL-mediated degradation. If this is the working hypothesis, it would be beneficial for the reader to see supporting evidence.

      Thank you very much for pointing out the issue. We have realized the inappropriateness of Figure 7, which was originally placed as a summarizing figure. To avoid potential confusion, this figure has been deleted, which does not affect the results and conclusions of this study. In addition, the differences between mRNA levels and protein expression are addressed in our response to Reviewer #1.

      (3) The authors present compelling evidence to suggest that RAPSYN may possess direct NEDD8-ligase activity on BCR-ABL. To strengthen this claim, it may be valuable to conduct further assays involving a ligase-deficient mutant, such as C366A, beyond its use in Figure 2J. Incorporating this mutant into the in vitro assay illustrated in Figure 2K, for instance, could offer substantial validation for the claim. In addition, showing whether the ligase-deficient mutant is capable of phenocopying the phosphorylation-mutant Y336F, as showcased in Figures 5E, F, and 6D, F, would be beneficial.

      We are grateful for your comments. In the manuscript, we have provided sufficient data to support the direct neddylation of BCR-ABL by RAPSYN, as you commented “The authors present compelling evidence to suggest that RAPSYN may possess direct NEDD8-ligase activity on BCR-ABL.”. Cys366 was previously demonstrated to be the catalytic residue essential for the E3 activity of RAPSYN (Li et al. 2016, PMID: 27839998), and the phosphorylation at Tyr336 was thoroughly verified by site-directed mutagenesis and by treatment with the SRC-specific inhibitor saracatinib in the present cellular experiments. Therefore, while we fully respect your opinions, we do not think it is necessary to perform tedious in vitro reactions for expected negative results, which was the reason we did not conduct enzymatic reactions with known inactive mutants, such as C366A and Y336F, in the first place.

      (4) The observations presented in Figures 6 C-G require additional clarification. Notably, there are discrepancies in relative cell viability effects in K562 cells, and to some extent in MEG-01 cells, under conditions that are indicated as being either identical or highly similar. For instance, this inconsistency is observable when comparing the left panels of Figure 6C and 6D in the case of NC overexpression + shSRC#2, and the left panels of Figure 6E and 6G with NC overexpression or shNC, respectively. Listing potential causes of these discrepancies would strengthen the overall validity of the findings and their subsequent interpretation.

      Thank you for your comments; we apologize for the confusion. To allow a meaningful comparison, we have revised the Methods section “Preparation of stable RAPSYNWT, RAPSYNY336F or SRC expression cell lines” (lines 625-627) and reorganized Figure 6 to reflect the differences in the negative controls. We first used the LV6 (EF-1a/Puro; OE-NC1) vector for the overexpression of RAPSYNWT and SRC. Because of the low expression level obtained with LV6 and the long time required for subsequent selection, we switched to LV18 (CMV/Puro; OE-NC2) for the overexpression of RAPSYNY336F. Since the sensitivities of K562/MEG01-OE-NC cells to shSRC transduction in Figure 6C (now revised to K562/MEG01-OE-NC1) and 6D (now revised to K562/MEG01-OE-NC2) were noticeably different, we have separated RAPSYNWT and RAPSYNY336F cells into 6C and 6D, each with its own corresponding empty vector as negative control, instead of merging the results into a single figure with one OE-NC negative control. In addition, because K562/MEG01 cells reacted differently to saracatinib treatment after transduction with the empty vector, we have also distinguished the negative controls as OE-NC1 in Figure 6E, OE-NC2 in Figure 6F and shNC in Figure 6G. In short, transducing K562/MEG01 cells with different expression vectors and viral particles caused the discrepancies in the cell viability experiments, which we have clarified by reorganizing Figure 6 in the revision.

      (5) Throughout the manuscript, immunoblots which showcase immunoprecipitations of BCR-ABL or His-BCR-ABL depict poly-neddylation (e.g. Figures 2E-M, 3D-G, and 5A-E) and poly-ubiquitination (e.g. Figures 3D-G) patterns/smears where these patterns seem to extend below the molecular weight of BCR-ABL. To enhance clarity, it would be valuable for the authors to provide an explanation in the text or the figure legend for this observation. Is it reflective of potential degradation of BCR-ABL or is there another explanation behind it?

      Thank you for your valuable comments. After carefully checking the original immunoblots, we have ascertained that the BCR-ABL protein band was at 250 kDa and that the smeared bands appearing above 250 kDa were likely caused by the conjugation of NEDD8 (neddylation) or ubiquitin (ubiquitination) onto BCR-ABL. As for the modified BCR-ABL signal at a lower molecular weight than expected, further investigation is required to determine whether it is a common feature, as previously reported (Mao, J., et al, 2010, PMID: 21118980), or reflects possible degradation during the modification process or sample preparation. We have corrected the labeling of the figures in the revised manuscript.

      Reviewer #1 (Recommendations For The Authors):

      (1) It would really nail the real-world relevance of these nice findings if the authors are able to confirm some aspects of their cell line-based discoveries in publicly available 'omics datasets generated from primary CML samples. I have suggested some of these in the public review as well.

      Alternatively, if they are able to investigate samples from murine CML models (eg. BALB/c CML models), it would represent a step towards real-world relevance.

      Thank you very much for your constructive comments. Following your suggestion, we have examined and analyzed RAPSYN mRNA and protein in updated, publicly available datasets, as described in our public response.

      (2) The Discussion repeats some of the information already presented in the Introduction (for example, lines 311-327 of the merged document, or lines 349-358). I would urge the authors to instead expand more about how RAPSYN might be upregulated at the post-transcriptional level, or its potential post-translational regulation by SRC-mediated phosphorylation.

      Thanks for your constructive suggestion. We have rewritten this part according to your suggestion and marked the changes in red in the revised manuscript, lines 319-325 and lines 351-378.

      (3) There are instances of clunky phrases/grammatical mistakes in the manuscript which detract from its readability (eg: lines 142-143: "...empty body transduced shRAPSN#3 or K562 cells into...."; lines 163-164: "Despite AChR subunits α7, M2, M3, and M4 were expressed in all tested cells, no change..."; line 178: "Preeminent BCR-ABL neddylation was detected in..."). A closer proof-reading of the final manuscript is advisable.

      We appreciate the valuable comments. We have made changes for improvement, which are marked in red in the revised manuscript, lines 145-147, lines 166-168 and line 185.

      (4) The western blot in Fig 5C (particularly the control "OE-NC" of K562) looks drastically different from the corresponding control lanes in Figs 5A and 5B. Similarly, the cell viability curves presented in Fig 6D and 6F (for both K562 and MEG-01, control conditions) look very different from the corresponding curves in Figs 6A and 6B.

      We appreciate your valuable comments. Because we accidentally used images with different exposure times, the western blots in Fig 5C (particularly the control "OE-NC" of K562) look very different from the corresponding control lanes in Figs 5A and 5B. We have replaced them with images taken at the same exposure time in the revised manuscript.

      To make this clear to readers, we have revised the Methods section “Preparation of stable RAPSYNWT, RAPSYNY336F or SRC expression cell lines” (lines 625-627) and the related figure legends to reflect the differences.

      We have addressed the discrepancy in cell viability in our public response.

      Reviewer #2 (Recommendations For The Authors):

      In reviewing your study, I must insist that the completeness and robustness of your work would significantly benefit from a more exhaustive listing of the antibodies used for immunoblotting and immunoprecipitation within the Materials and Methods section. A number of antibodies have been accounted for, however, crucial ones targeting BCR-ABL, c-CBL, Ubiquitin, NEDD8, HA, Myc, and others appear to be omitted. To maintain rigorous scientific standards, I strongly encourage you to include these.

      We appreciate your comments. We have carefully checked the Methods section and added detailed information on the antibodies used for immunoblotting and immunoprecipitation in the revised manuscript, lines 502-516.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      We are very grateful to the reviewers for their positive appraisal of the manuscript and for their useful comments and suggestions. Below are our answers and corresponding modifications of the manuscript.


      Reviewer #1

      1 - Figures 1&4 focus on JU1264 as the primary double-sensitive strain. However, the authors built their RILs with HK104 by crossing with JU1498 in Figures 7&8. In the results section and/or methods, the authors should provide some justification for this strain switch. Alternatively, the equivalent analysis of Figure 1 focusing on JU1498 would be valuable to demonstrate that the effects of both viruses on fitness are similar to JU1264. I am not recommending that the JU1264xHK104 crosses be performed or that Figures 7&8 be repeated with JU1264xHK104 lines, but that more explanation for strain selection for RIL generation should be provided.

      JU1264 and JU1498 are the strains where SANTV and LEBV were found, respectively. The experiments were performed over the years by different authors and were designed to answer different questions. JU1264 was the strain where the first virus was found and was used as a doubly sensitive strain in Figure 1 and the small RNA experiment. The main reason we chose JU1498 for genetic crosses to discover the genetic basis of LEBV sensitivity is that LEBV was detected and isolated from JU1498. Note that the JU1264 and JU1498 strains come from France and are in the same isotype group at CaeNDR (see also Figure 3) so the two strains may be interchangeable (although we cannot be sure).

      We added in the text concerning the RIL construction: "We chose to use JU1498 as the LEBV-sensitive strain as it was the original strain in which LEBV was discovered."

      2-The authors reasonably claim that the resistance of tropical strains like AF16 could be due to blocking viral entry or early inhibition of replication before the small RNA response is activated. Could the authors test this by directly microinjecting virus (in combination with a dye as a control for successful injection) into the intestine? I understand this could not be done on a scale that would allow for small RNA sequencing, but one could perform small-scale FISH to determine if LEBV or SANTV are replication-competent if the entry barrier is artificially overcome. Such an experiment may require considerable technical development. It may be beyond the scope/timing of this specific study, but it is worth considering to gain some insight into the possible resistance mechanisms observed.

      Although the suggested experiment is in principle a great approach, it is difficult to perform without losing animals during the FISH staining. In addition, in this manuscript we are not specifically searching for the resistance mechanisms of AF16 but are trying to present a wider perspective on viral infections of C. briggsae and their specificity. We performed small RNA analysis for AF16 together with the sensitive strains and therefore commented on the lack of a small RNA response in AF16 compared to the sensitive strains. We thus consider that setting up intestinal injections at this point would be arduous and beyond the scope of this manuscript.

      Minor Comments: Line 78 - provide the full genus name for Caenorhabditis elegans at first appearance, as done for Caenorhabditis briggsae

      This was modified.

      Line 117 - The description of cul-6 could also reference Bakowski et al. 2014. This study is referenced more generally as a player in proteostasis a few lines below but could be more explicitly tied to cul-6-mediated resistance to ORV (Bakowski et al. 2014 - see Fig. 7A)

      This section focuses on the use of natural polymorphisms, but we added this reference, which is indeed key for the effect of cul-6 knockdown on viral infection in C. elegans.

      Line 197-198 - The authors could consider adding sequences for FISH probes as part of Table S2. This information could add value to the present study even if previously listed in Frézal et al.

      We actually removed them from an earlier version since these sequences are already published: here and in further work, it seems preferable to refer to the primary study where these probes were designed.

      Line 263 - Were embryos obtained by bleaching of gravid adults, or was an egg lay performed, and the embryos were collected from plates? This is potentially an important distinction and should be clarified briefly in the methods.

      In the section “Preparation of small RNA libraries”, we obtained embryos by bleaching gravid adults.

      We changed the first sentence to “Gravid hermaphrodites from uninfected cultures (AF16, HK104 and JU1264) were harvested using M9 solution, then bleached and washed twice using nuclease-free water. Embryo concentrations were estimated by counting embryos under the dissecting microscope and diluted to 2 embryos per mL of nuclease-free water. 200 embryos of each strain (AF16, HK104 and JU1264) were then plated onto 55 mm NGM plates seeded with E. coli OP50.” We also added “The embryos were obtained by bleaching gravid hermaphrodites.” to the Figure S5 legend.

      Line 330 - Provide justification for using JU1498 to make these RILs (see comment above).

      We added this sentence in the Results section: "We chose to use JU1498 as the LEBV-sensitive strain as it was the original strain in which LEBV was discovered."

      Line 446 - Refer to the methods section for full clarity on the role of FISH in this set of experiments or reword for improved clarity. At first read-through, this phrasing made me expect some FISH experiments associated with Fig. 1, which does not appear to be the case.

      We did perform FISH experiments as a control to confirm that the cultures were infected, as explained in the Methods. We removed this mention from the Results section.

      Line 478 - The supplementary figure callouts are misaligned with the provided documents. S2A in the text appears to refer to S3A RT-qPCR results.

      Changed.

      Line 483 - Similar to above, the text suggests serial dilutions should refer to S4, not S3.

      Changed.

      Line 498 - Modify the text to 'Figure 2C and Figure 3' for clarity.

      Changed.

      Line 531,535 - viRNAs are defined in line 535 but this should be moved to 531 above at first appearance in the text.

      Changed.

      Line 593 - Typo in 'Logarithm of Odds?'

      Corrected.

      Line 621-624 - I recommend the authors include the data for the LEBV control experiments with NIL strains, either as a supplementary table, an additional panel for Fig. 6, or represented as done in Figure 8.

      We removed this sentence.

      Line 625-632 - How many total genes are represented in the QTL on IV? The reasoning behind testing rde-11 and rsd-2 is sound, but readers might want to know other potential candidates within this region (perhaps something the authors could also speculate on in the discussion). A similar comment applies for # genes in the QTLs on II and III.

      We added in Table S7 the list of detected SNPs and short indels in the chromosome IV region and now indicate in the text "among them over 2700 SNPs and short indels (Table S7)." We added Table S11 with the polymorphisms in the chromosome II QTL region. We note that these tables do not include possible structural variants. Because the chromosome III QTL is weak, we did not provide a list for it, but the data can now be found using CaeNDR.

      Line 991-992 - Figure 1B - LEBV, SANTV, and co-infection effects on body size are mentioned but not quantified. Has this phenotype been quantified elsewhere? If so, the authors should reference it in the results section or Fig. 1 legend. Alternatively, body size could be quantified as part of this study and added to Fig. 1.

      Because we do not have a large amount of data on body size, we removed "Body size quantification” from Figure 1B legend.

      Line 1001 - There is a typo in the first sentence; the period after LEBV should be removed.

      Small suggestion: Figure 2A - While described in the methods, I recommend that the authors briefly reiterate in the figure legend that the white/yellow boxes are intended to indicate serial chunking for clarity.

      We removed the typo and explained the agar chunk representation in the figure legend: "The transfer by chunking a piece of agar is indicated by beige rectangles cut out from one plate and transferred to the next plate."

      Line 1034 - Small formatting note for Figure 4B - percentages of reads mapping to RNA1 and RNA2 appear underneath gridlines for the graph which obscures visibility and is inconsistent with the other graphs presented.

      This was modified and is indeed clearer.

      Line 1094 - Figure S1 - this analysis could be strengthened by RT-qPCR represented as fold change in viral load instead of, or in addition to, the agarose gel image (like Fig. S3). Doing so would also allow for the normalization of eft-2 control across individual samples (e.g.: particularly low eft-2 amplification in ED3073). However, these results are sufficiently convincing that LEBV does not replicate in C. elegans, but a more quantitative approach is recommended if feasible for the authors. Alternatively, an additional figure panel and/or repeat of this analysis with C. elegans infected with ORV would also be beneficial as an additional control.

      We do not understand how we can estimate a viral load by a ratio when we do not seem to see any significant amplification. Of course, a RT-qPCR would provide a finite Ct value and a ratio but they are likely to be meaningless. The ED3073 sample did not amplify for eft-2 either and calculating a ratio of high Ct values in a RT-qPCR would be misleading. We could remove the two ED3073 lanes but prefer to leave them.
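
      The concern about ratios of high Ct values can be illustrated with a minimal sketch. It assumes the common 2^-ΔΔCt calculation with 100% amplification efficiency, which the response does not state was used, and every Ct value below is hypothetical rather than taken from the study.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative quantity by the 2^-ddCt method (assumes 100% primer efficiency)."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values for illustration only.
# Robust amplification: the ratio reflects a real 5-cycle difference.
well_amplified = fold_change(20.0, 18.0, 25.0, 18.0)

# Near the detection limit, a one-cycle wobble in either direction
# flips the apparent result from "2-fold up" to "2-fold down".
near_limit_a = fold_change(38.0, 37.0, 39.0, 37.0)
near_limit_b = fold_change(39.0, 37.0, 38.0, 37.0)

print(well_amplified, near_limit_a, near_limit_b)
```

      In this toy case the two near-limit estimates differ four-fold although they differ only by single-cycle noise, which is why a ratio computed from late, unreliable Ct values carries little information.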

      Line 1112 - "Experiments using RNA2 primers gave similar results" - if this data isn't included in the study, this text should be removed.

      Removed.

      Line 1141 - Figure S6 - For full transparency, the authors could consider including HK104 infected with LEBV to show minimal (zero) reads align to the RNA1/RNA2 segments using scales consistent with JU1264 infected with LEBV (S6C)

      The proportions of reads mapping (0%) are provided in Figure 4A and the supplementary tables. We do not show the distribution of antisense 22G and sense 23nt reads along the LEBV genome for the HK104 (co)infections for the following reasons. 0% of these reads map to LEBV in the HK104 monoinfection, and only 0.02% of antisense 22G reads do so in the coinfection. Moreover, the 23nt reads mapping to LEBV-RNA2 in the HK104 coinfection (16.54%; 1931 reads) correspond to a 41 bp region with 85% nucleotide similarity between SANTV-RNA2 and LEBV-RNA2. Overall, the few 23nt (+) reads mapping to LEBV in the HK104 coinfection are most likely a spillover of the HK104 antiviral response to JUv1264 entry into the intestinal cells.

      Reviewer #2

      Main points: 1. In figure 1C and D, is more than 1 biological replicate performed? Ideally multiple independent infections would be performed which would increase confidence in these experiments, but minimally the authors should make clear that this data was from an experiment only performed once. The conclusion from the life span assays is unlikely to change, but given the variance of the brood size assays within replicates, the conclusions that LEBV infection reduces the brood size is weakly supported.

      We added “Panels C-D correspond to a single experiment (see Methods).” to the legend of Figure 1. We changed the wording to "LEBV and especially the co-infection appeared to lower brood size." We do not have data for independent experiments.

      If the authors want to claim that there is a defect in viral entry in the resistant strains, they should perform infections experiments at an earlier time point that could capture viral invasion. In C. elegans with Orsay virus these experiments have been done as early as 18 hours by FISH. https://journals.plos.org/plospathogens/article?id=10.1371/journal.ppat.1011120 The way the assays are currently set up, if the infection was cleared it wouldn't be observed.

      The strongest point that indicates that the virus does not replicate is the small RNA experiment, in which the animals were collected on the initial plate inoculated with the virus. We think that our wording was careful:

      We further amended it:

      • in Results " The animals were collected for sRNA sequencing on the plates onto which the viral inoculate was added and where they were constantly exposed to the virus".

      • in Discussion " Indeed, as we did not assay viral entry by sensitive FISH or RT-PCR at early timepoints, it is possible that the viruses are cleared without production of small RNAs."

      The evidence that the region on chromosome III contributes to susceptibility is weak. The analysis in figure 5B does not identify this region and it is not clear to me how to read the scale in figure 5C to determine that a region on chromosome III is significant.

      We added in the Figure legend: "with a LOD score of 10.5, above the threshold calculated by simulations (see Methods)." and detailed the method in the Methods section (see reply to Reviewer 3 below).

      In figure 6 using a more appropriate statistical test such as one way ANOVA with multiple hypothesis testing is necessary to determine if there is a difference between JU2832 and JU2916. It would be helpful if the authors could add more discussion of the evidence that they feel that supports this region being involved in susceptibility.

      We do not think that an ANOVA is appropriate to analyze these proportions, whose residuals cannot be normally distributed; we therefore used a generalized linear model, taking genotype and block (day of experiment) into account. This was only explained in the legend and is now explained in the Methods section as well. Perhaps the reviewer is suggesting that we use a global analysis with strain as a factor. We could do this, but we do not think that it applies well to this situation: here we test a specific hypothesis for each one-QTL strain. We have corrected for multiple testing as explained next. The legend now reads: " The significance p values were obtained in a generalized linear model (glm) taking independent experimental blocks and infection replicates into account, testing NILs against their relevant background parent. The p values using the two strains testing for the QTL on chromosome IV and those using the two-QTL strain JU2832 are corrected for multiple testing." In addition, we now provide p values rather than three stars, which reinforces the point (they are very low).
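
      As an illustration of what correcting a small family of p values involves, here is a minimal sketch of the Holm step-down adjustment. The response does not say which correction method was applied, so the choice of Holm is an assumption, and the p values shown are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p values, in the original order."""
    m = len(p_values)
    # Visit the tests from the smallest to the largest raw p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p values for three NIL-versus-parent comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print(adjusted)
```

      Holm is used here only because it is simple and never less powerful than a plain Bonferroni correction; any other family-wise or false-discovery procedure could be substituted.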

      Minor points 1. In figure 1B it would be helpful to provide more information on the animals chosen to display. Are these representative examples or extreme examples?

      These are representative examples. This detail was added in the legend.

      In figure 2B, adding a legend for the colored dots would be helpful.

      We had indicated: "Dots are replicates within a block, with 100 animals scored per replicate (see Table S4 for the detailed results and Figure S2 and Methods for the experimental design). Experimental blocks are represented by colors and the bar indicates the grand mean of the blocks."

      3. In figure 2C, the definitions for a strain to be labeled as belonging to each category should be provided.

      The categorization method is now explained in the Methods section. In addition, Figure 2C legend now refers to Table S4 for the category of each strain.

      4. Could the data in figure 2 be used for genome-wide association mapping and compared to the RIL QTL experiments? Adding comment on this would be helpful to understanding the usefulness of this data.

      There are too few strains here to test genome-wide for association. If we had the causative SNP, it would be interesting to assess its frequency but this is beyond the focus and scope of this work, which focused on the outlier phenotype of the HK104 strain.

      5. In figure 4b, in HK104 LEBV the numbers in top right corner are not defined.

      We added to the legend of Figure 4B: “For the HK104 infection with LEBV, the number of read counts is provided in the top right corner to signal their rarity compared to ca. 107 in the other conditions. See Table S5 for all read counts.”

      6. Line 1001 remove period from "LEBV.of" and add period after isolates.

      Removed.

      Reviewer #3

      Major comments

      • The authors provide most data in both a processed and raw format, which is helpful. In two cases (data from 3 DPI, line 492 and LEBV infections in the AF16xHK104 NILs, line 621), the authors state their results, but the data seems not to be provided in the document (at least no direct reference is provided). These are supporting results and do not affect the main conclusions, nevertheless providing the data in form of a table or supplementary figure would be required. Generally, it may help to include a data availability statement to have a combined overview of where data can be found.

      As noted by the reviewer, we tried to provide the data in raw format, but did not judge it necessary when the experiment had two datapoints that are provided in the text. We added the number of animals in the instance where it was missing.

      Minor comments • Line 97-126: Here the manuscript fully focuses on the work in C. elegans. It would be interesting to make clear links to the work in C. briggsae (e.g. mention if homologs are present). The paragraph in line 127 clarifies advantages of studying viral infection in C. briggsae compared to C. elegans. It may be logical to place this information early in the text.

      We added a sentence to link the C. elegans work and C. briggsae.

      • Line 166 and results from this experiment: Is the LEBV-SANTV mixture consisting of 50uL of both viruses or a total of 50uL (so 25uL of both)? This is also important for the interpretation of results.

      To clarify, we changed to: “50 µl ... of an equivolume mix of SANTV and LEBV”.

      • Line 167: The text says the culture is maintained for 4 days, but then also mentions day 5. Figure 2 clarifies the experimental setup later, but the text could be clearer here.

      Thank you for noticing this. We changed the 4 to 7.

      • Line 172: What are the nine starter cultures?

      The nine starting cultures were those obtained as described in the paragraph preceding this line in the manuscript. From a plate of infected animals (five L4 larvae), we propagated the infected population by chunking over 3 plates (day 3) and 3*3 plates (day 5). To make this point clear, we have added above: "to generate for the following experiments nine starter cultures for each of the four conditions "

      • Line 185: 'Infection of the set of C. briggsae natural isolates'. From the text it is not clear what set the authors refer to.

      We changed to "a set" and refer to Figure 2B and Table S4 in the sentence below for the list of natural isolates.

      • Line 223: 'The proportion of infected animals were overall higher in Batch3 but the qualitative results are similar'. It is unclear why this statement is here instead of in the result section and it is also not clear what the authors mean by the second part of the sentence.

      We moved the sentence to Results and changed it to: " The proportion of infected animals were overall higher in Batch 3 but the relative results of the different strains were similar for the three batches."

      • Line 326: Is 'the same method as above' using FISH or RT-qPCR?

      Changed to "using FISH as above".

      • Line 382: What do the authors mean by 'two cross directions'?

      We removed this mention as the method is better explained in the next sentence.

      • Line 454-458: The data presented here does not appear well integrated in the storyline. It does not fit under the subheading. Perhaps it would be a better fit under the subheading of line 462?

      We moved it below the subheading.

      • Line 478: Reference to Fig S2 should be reference to Fig S3

      Changed.

      • Line 483: Reference to Fig S3 should be reference to Fig S4

      Changed.

      • Line 540-544: The sentence reads as a contradiction (C. elegans defends itself using RNAi, C. briggsae blocks viral infection during entry). As a result, the sentence reads as if RNAi is not of much antiviral importance in C. briggsae, but that cannot be concluded from this data. I am not sure if this is what the authors aim to suggest, but another word choice (e.g. changing 'whereas' and 'this does not seem the case for C. briggsae') may be considered.

      We changed the wording to " whereas the C. elegans N2 reference strain allows for viral entry and defends itself against ORV via its small RNA response (Félix et al. 2011; Ashe et al. 2013; Shirayama et al. 2014; Coffman et al. 2017), in the tested resistant C. briggsae strains, the viruses appeared to be blocked at entry or at early steps of the viral cycle."

      • Line 585 and 592: There are two QTL approaches being applied and referred to as 'the one- and two-QTL analyses'. The description in this part is rather technical and the terminology is not clear. As a result, for readers not familiar with QTL mapping, the biological interpretation may become obscured.

      We now explain in Methods: " ...scanning each pair of positions for several models, including single-QTL, full, additive and epistatic. The significance threshold LOD score of each model was estimated via 1,000 permutation tests with a coefficient of risk α=0.05. The threshold was 4.91 for the additive model and 6.09 for the full model. The LOD score of each pair of positions is represented by a color scale in Figure 5C. The combination of the chromosomes III and IV QTLs had a LOD score of 10.5 in the full and additive models. No epistatic interaction was detected. The LOD score of the single-QTL model comparison was below the threshold."
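
      For readers unfamiliar with permutation-based thresholds, the idea can be sketched as follows. This is a simplified single-marker scan on simulated data, not the two-dimensional scan described above; the function names, the marker coding and the simulated phenotypes are all hypothetical and chosen only for illustration.

```python
import math
import random


def lod_score(genotypes, phenotypes):
    """LOD for one biallelic marker: (n / 2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_null == 0.0:
        return 0.0  # no phenotypic variance to explain
    if rss_marker == 0.0:
        return math.inf  # degenerate case: the marker explains everything
    return (n / 2.0) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold: the (1 - alpha) quantile of the maximum
    LOD obtained after shuffling phenotypes relative to genotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        max_lods.append(max(lod_score(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return max_lods[index]


# Simulated example: 60 lines, 5 markers, marker 0 has a strong effect.
rng = random.Random(1)
n_lines = 60
markers = [[rng.choice("AB") for _ in range(n_lines)] for _ in range(5)]
phenotypes = [(5.0 if g == "B" else 0.0) + rng.gauss(0.0, 1.0) for g in markers[0]]

threshold = permutation_threshold(markers, phenotypes, n_perm=200)
observed = lod_score(markers[0], phenotypes)
print(f"threshold={threshold:.2f}, observed LOD at marker 0={observed:.2f}")
```

      Taking the maximum LOD over all markers in each permutation is what makes the threshold genome-wide: a position is declared significant only if its LOD exceeds what the best position achieves by chance in 95% of shuffled datasets.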

      • Line 659: The authors end the section about natural genetic variation in the response to SANTV with candidate genes and a CRISPR experiment. As the authors identify a small genetic region associated with LEBV susceptibility, it would be interesting to hear about any candidate genes in this region.

      There are still many genes and more importantly, many polymorphisms in this region (ca. 700 single-nucleotide polymorphisms and short indels). Because structural variants are difficult to call (long-read sequencing has not been performed on the parents), we had preferred to abstain from providing a list of polymorphisms that would be incomplete and preferentially point towards SNPs. However, because of the reviewer's query, we now provide it in Table S11.

      • Line 674: The authors make use of HK104 strain in this study as it is the exception in their dataset that provides resistance against LEBV, but not SANTV. Possibly, the genetic variation linked to viral susceptibility uncovered using HK104 may therefore be relatively uncommon in C. briggsae. The implications of this choice and option for other studies using different genotypes could be interesting to discuss in this short paragraph.

      The aim here is to discover why HK104 is specifically resistant to one virus and not the other. There is a possibility of uncovering a specific mechanism that is present in only two or three strains of our 40-strain dataset, but we find this specificity particularly interesting, regardless of its prevalence. We explore in the Discussion which of the two crosses may reveal the specificity.

      • Line 774: The IPR is already described on abbreviated in line 742.

      As a reader, we prefer having the abbreviation explained twice than not understanding it.

      • Overall, to reach a broader audience, the manuscript can expand explanations in the discussion. E.g. statements in line 695 and 773, refer to previous observations, but do not explain them in enough detail to understand parallels between this and previous studies without prior knowledge.

      We added some explanations, specifically for lines 695 and 773 (of previous version).

      • Figure 2: Only HK104 is labelled in the figure, it would be useful to also see HK105 as this strain is also explicitly mentioned in the text.

      We now included HK105 and strains that are used in further experiments.

      • Figure 2: It is not clear from the results or methods how strains are designated into a certain class. The figure legend says variability in the data is taken into account and that is why some strains are close to each other, yet distinct in class, but how this works is not described.

      We now explain our criteria. See above in the response to Reviewer 2.

      • Figure S3: The strain JU1264 and JU1498 are mentioned thrice (as '2', 'rep' and 'ref'). These annotations should be clarified.

      These explanations were indeed missing. We now explain them in the figure legend.

      • Figure S4: The figure would benefit from a division in panels per strain to facilitate comparisons across strains.

      Indeed. We now added a division in panels per strain.

      • Figure S4: Have the authors correlated viral loads with the number of infected animals? This could result in addition information if not all individuals are infected equally.

      We have not done so in this precise experiment but preferred to use the number of infected animals in most other experiments, in particular because it is less subject to outlier effects.

      • Figure S4: Could the authors clarify the meaning of JU1264 Rep?

      It is explained in the legend: "The undiluted viral preparations on JU1264 are used to normalize and are indicated as "JU1264 1/1". A separate replicate was performed and indicated as "JU1264 Rep"."

      • Figure 8: The meaning of the stars in this figure is a bit confusing and the description of these stars in the legend is not clear.

      Indeed. We changed the legend to: " ***: p<0.001 comparing JU4034 with its parent strain HK104 using a generalized linear model."
    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, Qiu and colleagues examined the effects of preovulatory (i.e., proestrous or late follicular phase) levels of circulating estradiol on multiple calcium and potassium channel conductances in arcuate nucleus kisspeptin neurons. Although these cells are strongly linked to a role as the "GnRH pulse generator," the goal here was to examine the physiological properties of these cells in a hormonal milieu mimicking late proestrus, the time of the preovulatory GnRH-LH surge. Computational modeling is used to manipulate multiple conductances simultaneously and support a role for certain calcium channels in facilitating a switch in firing mode from tonic to bursting. CRISPR knockdown of the TRPC5 channel reduced overall excitability, but this was only examined in cells from ovariectomized mice without estradiol treatment. The patch clamp experiments are comprehensive and overall solid but a direct demonstration of the role of these conductances in being necessary for surge generation (or at least having a direct physiological consequence on surge properties) is lacking, substantially reducing the impact of the findings.

      Strengths:

      (1) Examination of multiple types of calcium and potassium currents, both through electrophysiology and molecular biology.

      (2) Focus on arcuate kisspeptin neurons during the surge is relatively conceptually novel as the anteroventral periventricular nucleus (AVPV) kisspeptin neurons have received much more attention as the "surge generator" population.

      (3) The modeling studies allow for direct examination of manipulation of single and multiple conductances, whereas the electrophysiology studies necessarily require examination of each current in isolation. The construction of an arcuate kisspeptin neuron model promises to be of value to the reproductive neuroendocrinology field.

      We thank the reviewer for recognizing our comprehensive examination of Kiss1-ARH neurons through electrophysiological, molecular and computational modeling of their activity during the preovulatory surge, which, as the reviewer pointed out, is “conceptually novel.” With the addition of new experiments, we will bolster our argument that Kiss1-ARH neurons transition from synchronized firing to burst firing through E2-mediated regulation of channel expression. We will address the weaknesses as follows:

      Weaknesses:

      (1) The novelty of some of the experiments needs to be clarified. This reviewer's understanding is that prior experiments largely used a different OVX+E2 treatment paradigm mimicking periods of low estradiol levels, whereas the present work used a "high E2" treatment model. However, Figures 10C and D are repeated from a previous publication by the same group, according to the figure legend. Findings from "high" vs. "low" E2 treatment regimens should be labeled and clearly separated in the text. It would also help to have direct comparisons between results from low E2 and high E2 treatment conditions.

      We will revise Figures 10C and 10D to include new findings on Tac2 and Vglut2 expression in OVX and E2-treated Kiss1ARH neurons. We did show the previously published data (Qiu, eLife 2018) to contrast with Figures 10E, F showing the downregulation of TRPC5 and GIRK2 channels following E2 treatment. Most importantly, our E2 treatment regime is clearly stated in the Methods and is exactly the same as that used previously (Qiu, eLife 2016 and Qiu, eLife 2018) for the induction of the LH surge in OVX mice (Bosch, Molecular and Cellular Endocrinology 2013).

      (2) In multiple places, links are made between the changes in conductances and the transition from peptidergic to glutamatergic neurotransmission. However, this relationship is never directly assessed. The data that come closest are the qPCR results showing reduced Tac2 and increased Vglut2 mRNA, but in the figure legend, it appears that these results are from a prior publication using a different E2 treatment regimen.

      In the revised Figure 1, we will now include a clear depiction of the transition from synchronized firing driven by NKB signaling in OVX females to burst firing driven by glutamate in E2-treated females. We have used the same E2 treatment paradigm as previously published (Qiu, eLife 2018).

      (3) Similarly, no recordings of arcuate-AVPV glutamatergic transmission are made so the statements that Kiss1ARH neurons facilitate the GnRH surge via this connection are still only conjecture and not supported by the present experiments.

      Using a horizontal hypothalamic slice preparation, we have shown that Kiss1-ARH neurons excite GnRH neurons via Kiss1ARH glutamatergic input to Kiss1AVPV neurons (summarized in Fig. 12, Qiu, eLife 2016). We do not think that it is necessary to repeat these experiments in the current manuscript.

      (4) Figure 1 is not described in the Results section and is only tenuously connected to the statement in the introduction in which it is cited. The relevance of panels C and D is not clear. In this regard, much is made of the burst firing pattern that arises after E2 treatment in the model, but this burst firing pattern is not demonstrated directly in the slice electrophysiology examples.

      We will revise Figure 1 to include new whole-cell, current clamp recordings documenting the burst firing in response to glutamate in E2-treated, OVX females.

      (5) In Figure 3, it would be preferable to see the raw values for R1 and R2 in each cell, to confirm that all cells were starting from a similar baseline. In addition, it is unclear why the data for TTA-P2 is not shown, or how many cells were recorded to provide this finding.

      Before initiating photo-stimulation for each Kiss1-ARH neuron, we adjust the resting membrane potential to -70 mV, as noted in each panel in Figure 3, through current injections. We will include new findings on the effects of the T-channel blocker TTA-P2 on slow EPSP in the revised Figure 3. The number of cells tested with each calcium channel blocker is depicted in each of the bar graphs summarizing the effects of the blockers.

      (6) In Figure 5, panel C lists 11 cells in the E2 condition but panel E lists data from 37 cells. The reason for this discrepancy is not clear.

      In Figure 5E, we measured the L-, N-, P/Q and R channel currents after pretreatment with TTA-P2 to block the T-type current, whereas in Figure 5C, we measured the current without TTA-P2.

      (7) In all histogram figures, it would be preferable to have the data for individual cells superimposed on the mean and SEM.

      In all revised Figures we will include the individual data points for the individual neurons.

      (8) The CRISPR experiments were only performed in OVX mice, substantially limiting interpretation with respect to potential roles for TRPC5 in shaping arcuate kisspeptin neuron function during the preovulatory surge.

      The TRPC5 channels are most important for generating slow EPSPs when expression of NKB is high in the OVX state. Conversely, the glutamatergic response becomes more significant when the expression of NKB and TRPC5 channel are muted. Therefore, the CRISPR experiments were specifically conducted in OVX mice to maximize the effects.

      (9) Furthermore, there are no demonstrations that the CRISPR manipulations impair or alter the LH surge.

      In this manuscript, our focus is on the cellular electrophysiological activity of the Kiss1ARH neurons in OVX and E2-treated females. Exploration of CRISPR manipulations related to the LH surge is certainly slated for future experiments, but these in vivo experiments are beyond the scope of these comprehensive cellular electrophysiological and molecular studies.

      (10) The time of day of slice preparation and recording needs to be specified in the Methods.

      We will provide the times of slice preparation and recordings in the revised Methods and Materials.

      Reviewer #2 (Public Review):

      Summary:

      Kisspeptin neurons of the arcuate nucleus (ARC) are thought to be responsible for the pulsatile GnRH secretory pattern and to mediate feedback regulation of GnRH secretion by estradiol (E2). Evidence in the literature, including the work of the authors, indicates that ARC kisspeptin coordinate their activity through reciprocal synaptic interactions and the release of glutamate and of neuropeptide neurokinin B (NKB), which they co-express. The authors show here that E2 regulates the expression of genes encoding different voltage-dependent calcium channels, calcium-dependent potassium channels, and canonical transient receptor potential (TRPC5) channels and of the corresponding ionic currents in ARC kisspeptin neurons. Using computer simulations of the electrical activity of ARC kisspeptin neurons, the authors also provide evidence of what these changes translate into in terms of these cells' firing patterns. The experiments reveal that E2 upregulates various voltage-gated calcium currents as well as 2 subtypes of calcium-dependent potassium currents while decreasing TRPC5 expression (an ion channel downstream of NKB receptor activation), the slow excitatory synaptic potentials (slow EPSP) elicited in ARC kisspeptin neurons by NKB release and expression of the G protein-associated inward-rectifying potassium channel (GIRK). Based on these results, and on those of computer simulations, the authors propose that E2 promotes a functional transition of ARC kisspeptin neurons from neuropeptide-mediated sustained firing that supports coordinated activity for pulsatile GnRH secretion to a less intense firing in glutamatergic burst-like firing pattern that could favor glutamate release from ARC kisspeptin. The authors suggest that the latter might be important for the generation of the preovulatory surge in females.

      Strengths:

      The authors combined multiple approaches in vitro and in silico to gain insights into the impact of E2 on the electrical activity of ARC kisspeptin neurons. These include patch-clamp electrophysiology combined with selective optogenetic stimulation of ARC kisspeptin neurons, reverse transcriptase quantitative PCR, pharmacology, and CRISPR-Cas9-mediated knockdown of the Trpc5 gene. The addition of computer simulations for understanding the impact of E2 on the electrical activity of ARC kisspeptin cells is also a strength.

      The authors add interesting information on the complement of ionic currents in ARC kisspeptin neurons and on their regulation by E2 to what was already known in the literature. Pharmacological and electrophysiological experiments appear of the highest standards. Robust statistical analyses are provided throughout, although some experiments (illustrated in Figures 7 and 8) do have rather low sample numbers.

      The impact of E2 on calcium and potassium currents is compelling. Likewise, the results of Trpc5 gene knockdown do provide good evidence that the TRPC5 channel plays a key role in mediating the NKB-mediated slow EPSP. Surprisingly, this also revealed an unsuspected role for this channel in regulating the membrane potential and excitability of ARC kisspeptin neurons.

      We thank the reviewer for recognizing that the “pharmacological and electrophysiological experiments appear of the highest standards” and “the addition of the computer modeling for understanding the impact of E2 on the electrical activity of ARC kisspeptin cells is also a strength.” However, we agree with the reviewer that we need to provide a direct demonstration of “burst-like” firing of Kiss1-ARH neurons. We will address the weaknesses as follows:

      Weaknesses:

      The manuscript also has weaknesses that obscure some of the conclusions drawn by the authors.

      One has to do with the fact that "burst-like" firing that the authors postulate ARC kisspeptin neurons transition to after E2 replacement is only seen in computer simulations, and not in slice patch-clamp recordings. A more direct demonstration of the existence of this firing pattern, and of its prominence over neuropeptide-dependent sustained firing under conditions of high E2 would make a more convincing case for the authors' hypothesis.

      We will provide a more direct demonstration of the existence of this firing pattern in the whole-cell current clamp experiments in the revised Figure 1.

      In addition, and quite importantly, the authors compare here two conditions, OVX versus OVX replaced with high E2, that may not reflect the physiological conditions (the diestrous [low E2] and proestrous [high E2] stages of the estrous cycle) under which the proposed transition between neuropeptide-dependent sustained firing and less intense burst firing might take place. This is an important caveat to keep in mind when interpreting the authors' findings. Indeed, that E2 alters certain ionic currents when added back to OVX females, does not mean that the magnitude of these ionic currents will vary during the estrous cycle.

      We have published that the magnitude of the slow EPSP, which is TRPC5 channel mediated, varies throughout the estrous cycle in a manner similar to the difference found between OVX and E2-treated, OVX females (Figure 2, Qiu, eLife 2016). Moreover, TRPC5 channel mRNA expression, similar to the peptides, is downregulated by an E2 treatment (Figure 10 this manuscript) that mimics proestrus levels of the steroid (Bosch, Mol Cell Endocrinology 2013). Furthermore, the magnitude of ionic currents is directly proportional to the number of ion channels expressed in the plasma membrane, which we have found correlates with mRNA expression. Therefore, it is likely that the magnitude of these ionic currents will vary during the estrous cycle.
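
      The proportionality argument in this response can be made concrete with the standard biophysical relation I = N · Po · i, where N is the number of functional channels in the membrane, Po the open probability, and i the single-channel current. The sketch below is illustrative only: the function name and all numerical values are hypothetical and are not taken from the manuscript.

```python
def macroscopic_current_pa(n_channels, open_probability, unitary_current_pa):
    """Whole-cell current (pA) from the standard relation I = N * Po * i."""
    return n_channels * open_probability * unitary_current_pa


# Hypothetical values chosen only to illustrate the scaling; not measured data.
baseline = macroscopic_current_pa(n_channels=1000, open_probability=0.3,
                                  unitary_current_pa=0.5)

# Halving the number of expressed channels (e.g. after transcriptional
# downregulation) halves the current if Po and i are unchanged.
downregulated = macroscopic_current_pa(n_channels=500, open_probability=0.3,
                                       unitary_current_pa=0.5)

print(f"baseline: {baseline:.1f} pA, downregulated: {downregulated:.1f} pA")
```

      Under this relation, a change in channel number translates linearly into current magnitude only if open probability and unitary conductance stay constant, which is an assumption of the sketch rather than a result reported here.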

      Lastly, the results of some of the pharmacological and genetic experiments may be difficult to interpret as presented. For example, in Figure 3, although it is possible that blockade of individual calcium channel subtypes suppresses the slow EPSP through decreased calcium entry at the somato-dendritic compartment to sustain TRPC5 activation and the slow depolarization (as the authors imply), a reasonable alternative interpretation would be that at least some of the effects on the amplitude of the slow EPSP result from suppression of presynaptic calcium influx and, thus, decreased neurotransmitter and neuropeptide secretion. Along the same lines, in Figure 12, one possible interpretation of the observed smaller slow EPSPs seen in mice with mutant TRPC5 could be that at least some of the effect is due to decreased neurotransmitter and neuropeptide release due to the decreased excitability associated with TRPC5 knockdown.

      The reviewer raises a good point, but our previous findings clearly demonstrate that chelating intracellular calcium with BAPTA in whole-cell current clamp recordings abolishes the slow EPSP and persistent firing (Qiu, J. Neurosci 2021), which we have noted is the rationale for dissecting out the contribution of T, R, N, L and P/Q calcium channels to the slow EPSP in our current studies (revised Figure 3 will include the effects of T-channel blocker).

      However, to further bolster the argument for the post-synaptic contribution of the calcium channels to the slow EPSP and eliminate the potential presynaptic effects of calcium channel blockers on the postsynaptic slow EPSP amplitude, which may result from reduced presynaptic calcium influx and subsequently decreased neurotransmitter release, we will utilize an additional strategy. Specifically, we will measure the response to the externally administered TACR3 agonist senktide under conditions in which the extracellular calcium influx, as well as neurotransmitter and neuropeptide release, are blocked (new Figure 3).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      Following small molecule screens, this study provides convincing evidence that 7,8 dihydroxyflavone (DHF) is a competitive inhibitor of pyridoxal phosphatase. These results are important since they offer an alternative mechanism for the effects of 7,8 dihydroxyflavone in cognitive improvement in several mouse models. This paper is also significant due to the interest in the protein phosphatases and neurodegeneration fields.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Zink et al set out to identify selective inhibitors of the pyridoxal phosphatase (PDXP). Previous studies had demonstrated improvements in cognition upon removal of PDXP, and here the authors reveal that this correlates with an increase in pyridoxal phosphate (PLP; PDXP substrate and an active coenzyme form of vitamin B6) with age. Since several pathologies are associated with decreased vitamin B6, the authors propose that PDXP is an attractive therapeutic target in the prevention/treatment of cognitive decline. Following high throughput and secondary small molecule screens, they identify two selective inhibitors. They follow up on 7, 8 dihydroxyflavone (DHF). Following structure-activity relationship and selectivity studies, the authors then solve a co-crystal structure of 7,8 DHF bound to the active site of PDXP, supporting a competitive mode of PDXP inhibition. Finally, they find that treating hippocampal neurons with 7,8 DHF increases PLP levels in a WT but not PDXP KO context. The authors note that 7,8 DHF has been used in numerous rodent neuropathology models to improve outcomes. 7, 8 DHF activity was previously attributed to activation of the receptor tyrosine kinase TrkB, although this appears to be controversial. The present study raises the possibility that it instead/also acts through modulation of PLP levels via PDXP, and is an important area for future work.

      Strengths:

      The strengths of the work are in the comprehensive, thorough, and unbiased nature of the analyses revealing the potential for therapeutic intervention in a number of pathologies.

      Weaknesses:

      Potential weaknesses include the poor solubility of 7,8 DHF that might limit its bioavailability given its relatively low potency (IC50= 0.8 uM), which was not improved by SAR. However, the compound has an extended residence time and the co-crystal structure could aid the design of more potent molecules and would be of interest to those in the pharmaceutical industry. The images related to crystal structure could be improved.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors performed a screening for PDXP inhibitors to identify compounds that could increase levels of pyridoxal 5'- phosphate (PLP), the co-enzymatically active form of vitamin B6. For the screening of inhibitors, they first evaluated a library of about 42,000 compounds for activators and inhibitors of PDXP and secondly, they validated the inhibitor compounds with a counter-screening against PGP, a close PDXP relative. The final narrowing down to 7,8-DHF was done using PLP as a substrate and confirmed the efficacy of this flavonoid as an inhibitor of PDXP function. Physiologically, the authors show that, by acutely treating isolated wild-type hippocampal neurons with 7,8-DHF they could detect an increase in the ratio of PLP/PL compared to control cultures. This effect was not seen in PDXP KO neurons.

      Strengths:

      The screening and validation of the PDXP inhibitors have been done very well because the authors have performed crystallographic analysis, a counter screening, and mutation analysis. This is very important because such rigor has not been applied to the original report of 7,8 DHF as an agonist for TrkB. Which is why there is so much controversy on this finding.

      Weaknesses:

      As mentioned in the summary report the study may benefit from some in vivo analysis of PLP levels following 7,8-DHF treatment, although I acknowledge that it may be challenging because of the working out of the dosage and timing of the procedure.

      Reviewer #3 (Public Review):

      This is interesting biology. Vitamin B6 deficiency has been linked to cognitive impairment. It is not clear whether supplements are effective in restoring functional B6 levels. Vitamin B6 is composed of pyridoxal compounds and their phosphorylated forms, with pyridoxal 5-phosphate (PLP) being of particular importance. The levels of PLP are determined by the balance between pyridoxal kinase and phosphatase activities. The authors are testing the hypothesis that inhibition of pyridoxal phosphatase (PDXP) would arrest the age-dependent decline in PLP, offering an alternative therapeutic strategy to supplements. Published data illustrating that ablation of the Pdxp gene in mice led to increases in PLP levels and improvement in learning and memory trials are consistent with this hypothesis.

      In this report, the authors conduct a screen of a library of ~40k small molecules and identify 7,8-dihydroxyflavone (DHF) as a candidate PDXP inhibitor. They present an initial characterization of this micromolar inhibitor, including a co-crystal structure of PDXP and 7,8-DHF. In addition, they demonstrate that treatment of cells with 7,8 DHF increases PLP levels. Overall, this study provides further validation of PDXP as a therapeutic target for the treatment of disorders associated with vitamin B6 deficiency and provides proof-of-concept for inhibition of the target with small-molecule drug candidates.

      Strengths include the biological context, the focus on an interesting and under-studied class of protein phosphatases that includes several potential therapeutic targets, and the identification of a small molecule inhibitor that provides proof-of-concept for a new therapeutic strategy. Overall, the study has the potential to be an important development for the phosphatase field in general.

      Weaknesses include the fact that the compound is very much an early-stage screening hit. It is an inhibitor with micromolar potency for which mechanisms of action other than inhibition of PDXP have been reported. Extensive further development will be required to demonstrate convincingly the extent to which its effects in cells are due to on-target inhibition of PDXP.

      Recommendations for the authors:

      There is general agreement that the study represents an advance regarding the mechanisms of pyridoxal phosphatase and 7,8 DHF. From the reviewers' comments, several major questions and considerations are raised, followed by their detailed remarks:

      (1) More analysis of the solubility and dose of 7,8 DHF with regard to the 50% inhibition and the salt bridge of the B protomer, as raised by the reviewers.

      (2) Is there a possible involvement of another phosphatase?

      (3) Does 7,8 DHF cause an effect upon TrkB tyrosine phosphorylation?

      We thank the Reviewers and Editors for their fair and constructive comments and suggestions. We have performed additional experiments to address these questions and considerations. In addition, we have generated two new high-resolution (1.5 Å) crystal structures of human PDXP in complex with 7,8-DHF that substantially expand our understanding of 7,8-DHF-mediated PDXP inhibition. The scientist who performed this work for the revision of our manuscript has been added as an author (shared first authorship).

      We believe that the insights gained from these new data have further strengthened and improved the quality of our manuscript. Together, our data provide compelling evidence that 7,8-dihydroxyflavone is a direct and competitive inhibitor of pyridoxal phosphatase.

      Please find our point-by-point responses to the Public Reviews that are not addressed in the Recommendations for the Authors, and the Recommendations for the Authors below.

      Reviewer #2:

      As mentioned in the summary report the study may benefit from some in vivo analysis of PLP levels following 7,8-DHF treatment, although I acknowledge that it may be challenging because of the working out of the dosage and timing of the procedure.

      We agree that an in vivo analysis of PLP levels following 7,8-DHF treatment could be informative for the further evaluation of a possible mechanistic link between the reported effects of this compound and PDXP/vitamin B6. However, we currently do not have a corresponding animal experimentation permission in place and are unlikely to obtain such a permit within a reasonable time frame for this revision.

      Recommendations For The Authors:

      Reviewer #1:

      The work is already well-written, comprehensive, and convincing.

      Suggestions that could improve the manuscript.

      (1) Include a protein tyrosine phosphatase (PTP) in the selectivity analysis. One possibility is that 7,8 DHF acts on a PTP (such as PTP1B), leading to TrkB activation by preventing dephosphorylation. I note that a previous study has looked at SAR for flavones with PTP1B (PMID: 29175190), which is worth discussion.

      We thank the reviewer for bringing this interesting possibility to our attention. We were not aware of the SAR study for flavonoids with PTP1B by Proenca et al. but have now tested the effect of 7,8-DHF on PTP1B, referring to this paper. As shown in Figure 2d, PTP1B was not inhibited by 7,8-DHF at a concentration of 5 or 10 µM. At the highest tested concentration of 40 µM, 7,8-DHF inhibited PTP1B merely by ~20%. For comparison, compound C13 (3-hydroxy-7,8-dihydroxybenzylflavone-3’,4’dihydroxymethyl-phenyl), which emerged as the most active flavonoid in the SAR study by Proenca et al. inhibited PTP1B with an IC50 of 10 µM. Consistent with the results of these authors, our finding confirms that less polar substituents, such as O-benzyl groups at positions 7 and 8, and O-methyl groups at positions 3’ and 4’ of the flavone scaffold, are important for the ability of flavonoids to effectively inhibit PTP1B. We conclude that PTP1B inhibition by 7,8-DHF is unlikely to be a primary contributor to the reported cellular and in vivo effects of this flavone.

      In addition to PTP1B, we have now tested the effect of 7,8-DHF on the serine/threonine protein phosphatase calcineurin/PP2B, the DNA/RNA-directed alkaline phosphatase CIP, and three other metabolite-directed HAD phosphatases, namely NANP, NT5C1A and PNKP. PP2B, CIP and NANP were not inhibited by 7,8-DHF. Similar to PTP1B, PNKP activity was attenuated (~30%) only at 40 µM 7,8-DHF. In contrast, 7,8-DHF effectively inhibited NT5C1A (IC50 ~10 µM). NT5C1A is an AMP hydrolase expressed in skeletal muscle and heart. To our knowledge, a role of NT5C1A in the brain has not been reported. Based on currently available information, the inhibition of NT5C1A therefore appears unlikely to contribute to 7,8-DHF effects in the brain.

      The results of these experiments are shown in the revised Figure 2d. Taken together, the extended selectivity analysis of 7,8-DHF on a total of 12 structurally and functionally diverse protein- and nonprotein-directed phosphatases supports our initial conclusion that 7,8-DHF preferentially inhibits PDXP.

      (2) Line 144: It is unclear how fig 2c supports the statement here. Remove call out for clarity.

      Our intention was to highlight the fact that 7,8-DHF concentrations >12.5 µM could not be tested in the BLI assay (shown in Figure 2c) due to 7,8-DHF solubility issues under these experimental conditions. However, since this is discussed in the text, but not directly visible in Figure 2c, we agree with the Reviewer and have removed this call out.

      (3) Figure 3a. It is difficult to see the pink 7,8 DHF on top of the pink ribbon backbone. A better combination of colours could be used. Likewise in Figure 3b it is pink on pink again.

      We have improved the combination of colors to enhance the visibility of 7,8-DHF and have consistently color-coded murine and the new human PDXP structures throughout the manuscript.

      (4) Figure 3c and d. These are the two protomers I believe, but the colour coding is not present in 3c where the ribbon is now gray. Please choose colours that can be used to encode protomers throughout the figure.

      Please see response to point 3 above.

      (5) Figure 3f. I think this is the same protomer as 3c but a 180-degree rotation. Could this be indicated, or somehow lined up between the two figures for clarity? It would also be useful to have 3e in the same orientation as 3f, to better visualise the overlap with PLP binding. PLP and 7,8 DHF could be labelled similarly to the amino acids in 3f (the colour coding here is helpful).

      Please see response to point 3 above. We have substantially revised the structural figures and have used consistent color coding and the same perspective of 7,8-DHF in the PDXP active sites.

      (6) Figure 3g. The colours of the bars relating to specific mutations do not quite match the colours in Figure 3f, which I think was the aim and is very helpful.

      We have adapted the colours of the residues in Figure 3f (now Fig. 3b and additionally Fig. 3 – figure supplement 1e) so that they exactly match the colours of the bars in Figure 3g (now Fig. 3d).

      Reviewer #2:

      No further comments.

      Reviewer #3:

      Page 4: The authors describe 7,8DHF as a "selective" inhibitor of PDXP - in my opinion, they do not have sufficient data to support such a strong assertion. Reports that 7,8DHF may act as a TRK-B-agonist already highlight a potential problem of off-target effects. Does 7,8DHF promote tyrosine phosphorylation of TRK-B in their hands? The selectivity panel presented in Figure 2, focusing on 5 other HAD phosphatases, is much too limited to support assertions of selectivity.

      We agree with the Reviewer that our previous selectivity analysis with six HAD phosphatases was limited. To further explore the phosphatase target spectrum of 7,8-DHF, we have now analyzed six other enzymes: three other non-HAD phosphatases (the tyrosine phosphatase PTP1B, the serine/threonine protein phosphatase PP2B/calcineurin, and the DNA/RNA-directed alkaline phosphatase/CIP) and three other non-protein-directed C1/C0-type HAD phosphatases (NT5C1A, NANP, and PNKP). The C1-capped enzymes NT5C1A and NANP were chosen because we previously found them to be sensitive to small molecule inhibitors of the PDXP-related phosphoglycolate phosphatase PGP (PMID: 36369173). PNKP was chosen to increase the coverage of C0-capped HAD phosphatases (previously, only the C0-capped MDP1 was tested).

      We found that calcineurin, CIP and NANP were not inhibited by up to 40 µM 7,8-DHF. The activities of PTP1B and PNKP were attenuated (by ~20 and 30%, respectively) only at 40 µM 7,8-DHF. In contrast, 7,8-DHF effectively inhibited NT5C1A (IC50 ~10 µM). We have previously found that NT5C1A was sensitive to small-molecule inhibitors of the PDXP paralog PGP, although these molecules are structurally unrelated to 7,8-DHF (PMID: 36369173). NT5C1A is an AMP hydrolase expressed in skeletal muscle and heart (PMID: 12947102). To our knowledge, a role of NT5C1A in the brain has not been reported. Based on currently available information, the inhibition of NT5C1A therefore appears unlikely to contribute to 7,8-DHF effects in the brain. The results of these experiments are shown in the revised Figure 2d. Taken together, the extended selectivity analysis of 7,8-DHF on a total of 12 structurally and functionally diverse protein- and non-protein-directed phosphatases supports our initial conclusion that 7,8-DHF preferentially inhibits PDXP. To nevertheless avoid any overstatement, we have now also replaced “selective” by “preferential” in this context throughout the manuscript.

      We have not tested if 7,8-DHF promotes tyrosine phosphorylation of TRK-B. Being able to detect 7,8-DHF-induced TRK-B phosphorylation in our hands would not exclude an additional role for PDXP/vitamin B6-dependent processes. Not being able to detect TRK-B phosphorylation may indicate absence of evidence or evidence of absence. This would neither conclusively rule out a biological role for 7,8-DHF-induced TRK-B phosphorylation in vivo, nor contribute further insights into a possible involvement of vitamin B6-dependent processes in 7,8-DHF induced effects.

      Page 6: The authors report that they obtained only two PDXP-selective inhibitor hits from their screen; 7,8DHF and something they describe as FMP-1. For the latter, they state that it "was obtained from an academic donor, and its structure is undisclosed for intellectual property reasons". In my opinion, this is totally unacceptable. This is an academic research publication. If the authors wish to present data, they must do so in a manner that allows a reader to assess their significance; in the case of work with small molecules that includes the chemical structure. In my opinion, the authors should either describe the compound fully or remove mention of it altogether.

      We are unable to describe “FMP-1” because its identity has not been disclosed to us. The academic donor of this molecule informed us that they were not able to permit release of any details of its structure or general structural class due to an emerging commercial interest.

      We mentioned FMP-1 simply to highlight the fact that the screening campaign yielded more than one inhibitor. FMP-1 was also of interest due to its complete inhibition of PDXP phosphatase activity.

      Because the structure of this molecule is unknown to us, we have now removed any mention of this compound in the manuscript. For the same reason, we have removed the mention of the inhibitor hits “FMP-2” and “FMP-3” in Figure 2 – figure supplement 1 and Figure 2 – figure supplement 2. The number of PDXP inhibitor hits in the manuscript has been adapted accordingly.

      Page 7: The observed plateau at 50% inhibition requires further explanation. It is not clear how poor solubility of the compound explains this observation. For example, the authors state that "due to the aforementioned poor solubility of 7,8DHF, concentrations higher than 12.5µM could not be evaluated". Yet on page 8, they describe assays against the specificity panel at concentrations of compound up to 40µM. Do the analogues of 7,8DHF (Fig 2b) result in >50% inhibition at higher concentrations? Further explanation and data on the solubility of the compounds would be of benefit.

      We currently do not have a satisfactory explanation for the apparent plateau of ~50% PDXP inhibition by 7,8-DHF. Resolving this question will likely require other approaches, including computational chemistry such as molecular dynamics simulations, and we feel that this is beyond the scope of the present manuscript.

      We previously speculated that the limited solubility of 7,8-DHF may counteract a complete enzyme inhibition if higher concentrations of this molecule are required. Specifically, we referred to Todd et al., who have performed HPLC-UV-based solubility assays of 7,8-DHF (ref. 35). These authors found that immediately after 7,8-DHF solubilization, nominal 7,8-DHF concentrations of 5, 20 or 50 µM resulted in 0.5, 3.0 or 13 µM of 7,8-DHF in solution (i.e., 10, 15 or 26% of the respective nominal concentration). Seven hours later, 46, 26 or 26% of the respective nominal 7,8-DHF concentrations were found in solution. Hence, above a nominal concentration of 5 µM, 7,8-DHF solubility does not increase linearly with the input concentration, but plateaus at ~20% of the nominal concentration. This phenomenon could potentially contribute to the apparent plateau of human or murine PDXP inhibition by 7,8-DHF in vitro.
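
      The percentages quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the values stated in this response (attributed to ref. 35), the variable and function names are hypothetical, and the seven-hour concentrations are derived here from the quoted percentages rather than reported directly.

```python
# Values quoted in the response above (attributed to Todd et al., ref. 35).
nominal_um = [5.0, 20.0, 50.0]      # nominal 7,8-DHF concentrations (µM)
measured_t0_um = [0.5, 3.0, 13.0]   # measured in solution right after solubilization (µM)
percent_7h = [46.0, 26.0, 26.0]     # percent of nominal found in solution after seven hours


def percent_of_nominal(measured, nominal):
    """Return the measured concentration as a percentage of the nominal one."""
    return 100.0 * measured / nominal


def concentration_from_percent(percent, nominal):
    """Return the concentration implied by a percentage of the nominal value."""
    return nominal * percent / 100.0


# Percent of nominal immediately after solubilization.
percent_t0 = [
    round(percent_of_nominal(m, n), 1)
    for m, n in zip(measured_t0_um, nominal_um)
]

# Concentrations after seven hours, derived from the quoted percentages
# (an inference made here, not a value reported in the response).
conc_7h_um = [
    round(concentration_from_percent(p, n), 1)
    for p, n in zip(percent_7h, nominal_um)
]

for n, p0, c7 in zip(nominal_um, percent_t0, conc_7h_um):
    print(f"nominal {n:g} µM: {p0:g}% in solution at t0, about {c7:g} µM after 7 h")
```

      Under these assumptions, a tenfold increase in nominal concentration (5 to 50 µM) raises the derived seven-hour concentration by less than sixfold, which is consistent with the non-linear behaviour described above.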

      However, experiments performed during the revision of our manuscript show that the HAD phosphatase NT5C1A can be effectively inhibited by 7,8-DHF with an IC50 value of 10 µM (see revised Fig. 2). Together with the fact that the activity of the PDXP-Asn61Ser variant can be completely inhibited by 7,8-DHF (see Fig. 3d), we conclude that the reason for the observed plateau of PDXP inhibition is likely to be primarily structural, with Asn61 impeding 7,8-DHF binding. We have therefore removed the mention of the limited solubility of 7,8-DHF here. On p.14, we now say: “These data also suggest that Asn61 contributes to the limited efficacy of 7,8-mediated PDXP inhibition in vitro.”

      The solubility of 7,8-DHF is dependent on the specific assay and buffer conditions. In BLI experiments, interference patterns caused by binding of 7,8-DHF in solution to biotinylated PDXP immobilized on the biosensor surface are measured. In phosphatase selectivity assays, phosphatases are in solution, and the effect of 7,8-DHF on the phosphatase activity is measured via the quantification of free inorganic phosphate.

      In BLI experiments, we observed that the sensorgrams obtained with the highest tested 7,8-DHF concentration (25 µM) showed the same curve shapes as the sensorgrams obtained with 12.5 µM 7,8-DHF. This contrasts with the expected steeper slope of the curves at 25 µM vs. 12.5 µM 7,8-DHF. The same behavior was observed for the reference sensors (i.e., the SSA sensors that were not loaded with PDXP, but incubated with 7,8-DHF at all employed concentrations for referencing against nonspecific binding of 7,8-DHF to the sensors). The sensorgrams at 25 µM 7,8-DHF were therefore not included in the analysis (this is now specified in the Materials and Methods BLI section on p.27). To clarify this point, we now state that “As a result of the poor solubility of the molecule, a saturation of the binding site was not experimentally accessible” (p.7).

      In contrast, the phosphatase selectivity assays described on p.8 could be performed with nominal 7,8-DHF concentrations of up to 40 µM. Although the effective 7,8-DHF concentration in solution is expected to be lower (see ref. 35 and discussed above), the limited solubility of 7,8-DHF in phosphatase assays does not prevent the quantification of free inorganic phosphate. Nevertheless, we cannot exclude some interference with this absorbance-based assay (e.g., due to turbidity caused by insoluble compound). Indeed, 5,6-dihydroxyflavone and 5,6,7-trihydroxyflavone caused an apparent increase in PDXP activity at concentrations above 10 µM (see Figure 2b), which may be related to compound solubility issues. Alternatively, these flavones may activate PDXP at higher concentrations.

      We have tested the 7,8-DHF analogue 3,7,8,4’-tetrahydroxyflavone at concentrations of 70 and 100 µM. At concentrations >100 µM, the DMSO concentration required for solubilizing the flavone interferes with PDXP activity. PDXP inhibition by 3,7,8,4’-tetrahydroxyflavone was slightly increased at 70 µM compared to 40 µM (by ~18%) but plateaued between 70 and 100 µM. These results are now mentioned in the text (p.7): “The efficacy of PDXP inhibition by 3,7,8,4’-tetrahydroxyflavone was not substantially increased at concentrations >40 µM (relative PDXP activity at 40 µM: 0.46 ± 0.05; at 70 µM: 0.38 ± 0.15; at 100 µM: 0.37 ± 0.09; data are mean values ± S.D. of n=6 experiments).”

      Page 9: The authors report that PDXP crystallizes as a homodimer in which 7,8DHF is bound only to one protomer. Is the second protomer active? Does that contribute to the 50% inhibition plateau? If Arg62 is mutated to break the salt bridge, does inhibition go beyond 50%?

      We have no way to measure the activity of the second, inhibitor-free protomer in murine PDXP. We know that PDXP functions as a constitutive homodimer, and based on our current understanding, both protomers are active. We have previously shown that the experimental monomerization of PDXP (upon introduction of two point mutations in the dimerization interface) strongly reduces its phosphatase activity. Specifically, PDXP homodimerization is required for an inter-protomer interaction that mediates the proper positioning of the substrate specificity loop. Thus, homodimerization is necessary for effective substrate coordination and dephosphorylation (PMID: 24338687).

      In the murine structure, we observed that 7,8-DHF binding to the second subunit (the B-protomer) is prevented by a salt bridge between Arg62 and Asp14 of a symmetry-related A-protomer in the crystal lattice (i.e., this is not a salt bridge between Arg62 in the B-protomer and Asp14 in the A-protomer of a PDXP homodimer). As suggested, we have nevertheless tested the potential role of this salt bridge for the sensitivity of the PDXP homodimer to 7,8-DHF.

      The mutation of Arg62 is not suitable to answer this question, because this residue is involved in the coordination of 7,8-DHF (see Figure 3b), and the PDXP-Arg62Ala mutant is inhibitor resistant (see Figure 3d). We have therefore mutated Asp14, which is not involved in 7,8-DHF coordination. As shown in the new Figure 3 – figure supplement 1d, the 7,8-DHF-mediated inhibition of PDXP-Asp14Ala again reached a plateau at ~50%. This result suggests that while an Arg62-Asp14 salt bridge is stabilized in the murine crystal, it is not a determinant of the active site accessibility of protomer B in solution.

      To address this important question further, we have now also generated co-crystals of human PDXP bound to 7,8-DHF, and refined two structures to 1.5 Å. We found that in human PDXP, both protomers bind 7,8-DHF. These new, higher resolution data are now shown in the revised Figure 3 and its figure supplements, and we have moved the panels referring to the previously reported murine PDXP structure to the Figure 3 – figure supplement 1. Thus, both protomers of human PDXP, but only one protomer of murine PDXP, bind 7,8-DHF in the crystal structure, yet the 7,8-DHF-mediated inhibition of human and murine PDXP plateaus at ~50% under the phosphatase assay conditions (see Figure 2a). We conclude that 7,8-DHF binding efficiency in the PDXP crystal does not necessarily reflect its inhibitory efficiency in solution.

      Taken together, these data indicate that the apparent partial inhibition of murine and human PDXP phosphatase activity by 7,8-DHF in our in vitro assays is not explained by an exclusive binding of 7,8-DHF to just one protomer of the homodimer.

      Page 10-12; Is it possible to generate a mutant form of PDXP in which activity is maintained but inhibition is attenuated - an inhibitor-resistant mutant form of PDXP? Can such a mutant be used to assess on-target vs off-target effects of 7,8DHF in cells?

      This is an excellent point, and we agree with the Reviewer that such an approach would provide further evidence for cellular on-target activity of 7,8-DHF. Indeed, the verification of the PDXP-7,8-DHF interaction sites has led to the generation of catalytically active, inhibitor-resistant PDXP mutants, such as Tyr146Ala and Glu148Ala (Fig. 3d). However, the biochemical analysis of such mutants in primary hippocampal neurons is a very difficult task.

      Primary hippocampal neurons are derived from pooled, isolated hippocampi of mouse embryos and are subsequently differentiated for 21 days in vitro. The resulting cellular yield is typically low and variable, and the viability (and contamination of the respective cultures with e.g. glial cells) varies from batch to batch. Although such cell preparations are suitable for electrophysiological or immunocytochemical experiments, they are far from ideal for biochemical studies. A meaningful experiment would require the efficient expression of a catalytically active, but inhibitor-resistant PDXP-mutant in PDXP-KO neurons. In parallel, PDXP-KO cells reconstituted with PDXP-WT (at phosphatase activity levels comparable with the PDXP mutant cells) would be needed for comparison. Unfortunately, the generation of (a) sufficient numbers of (b) viable cells that (c) efficiently express (d) functionally comparable levels of PDXP-WT or -mutant for downstream analysis (PLP/PL-levels upon inhibitor treatment) is currently not possible for us.

      Human iPSC-derived (hippocampal) spheroids are at present no alternative, due to the necessity of generating PDXP-KO lines first, and the difficulties with transfecting/transducing them. Such a system would require extensive validation. We have attempted to use SH-SY5Y cells (a metastatic neuroblastoma cell line), but PDXK expression in these cells is modest and they produce too little PLP. We therefore feel that this question is beyond the scope of our current study.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      Summary:

      The evolution of non-shivering thermogenesis is of fundamental importance to understand. Here, in small mammals, the contractile apparatus of the muscle is shown to increase energy expenditure upon a drop in ambient temperature. Additionally, in the state of torpor, small hibernators did not show an increase in energy expenditure under the same challenge.

      Strengths:

      The authors have conducted a very well-planned study that has sampled the muscles of large and small hibernators from two continents. Multiple approaches were then used to identify the state of the contractile apparatus, and its energy expenditure under torpor or otherwise.

      Weaknesses:

      There was only one site of biopsy from the animals used (leg). It would be interesting to know if non-shivering thermogenesis is something that is regionally different in the animal, given the core body and distal limbs have different temperatures.

      We thank the reviewer for their time and effort in reviewing our manuscript. Furthermore, we agree that it would be of interest to perform similar experiments upon different muscle sites in these animals. This is of particular interest as in some mammals, such as mice, distal limbs do not shiver and therefore non-shivering thermogenesis may play a more prominent role in heat regulation. A paper from Aydin et al. demonstrated that when shivering muscles (soleus) were prevented from undergoing non-shivering thermogenesis via knock-out of UCP1 and were then exposed to cold temperatures, the force production of these muscles was significantly reduced due to prolonged shivering [1]. These results do suggest that even in shivering muscle, non-shivering thermogenesis plays a key role in the generation of heat for survival and for the maintenance of muscle performance. Furthermore, there is evidence from garden dormice that muscle temperature during torpor is slightly warmer than abdominal temperature and slightly cooler than heart temperature, which is 7-8°C warmer than abdominal temperature, suggesting the existence of non-shivering thermogenesis in skeletal and cardiac muscles (Giroud et al. in prep) [2]. We have added this information and reference into our discussion to reflect this important point (Discussion, paragraph 6, “As the biopsies which were used…”).

      Reviewer #2:

      Summary:

      The authors utilized (permeabilized) fibers from muscle samples obtained from brown and black bears, squirrels, and Garden dormice, to provide interesting and valuable data regarding changes in myosin conformational states and energetics during hibernation and different types of activity in summer and winter. Assuming that myosin structure is similar between species then its role as a regulator of metabolism would be similar and not different, yet the data reveal some interesting and perplexing differences between the selected hibernating species.

      Strengths:

      The experiments on the permeabilized fibers are complementary, sophisticated, and well-performed, providing new information regarding the characteristics of skeletal muscle fibers between selected hibernating mammalian species under different conditions (summer, interarousal, and winter).

      The studies involve complementary assessments of muscle fiber biochemistry, sarcomeric structure using X-ray diffraction, and proteomic analyses of posttranslational modifications.

      Weaknesses:

      It would be helpful to put these findings on permeabilized fibers into context with the other anatomical/metabolic differences between the species to determine the relative contribution of myosin energetics (with these other contributors) to overall metabolism in these different species, including factors such as fat volume/distribution.

      We thank the reviewer for the time and effort they have put into reviewing our paper and are grateful for the helpful suggestions, which we believe enhance our work (please see below for detailed answers to the critiques).

      Reviewer #3:

      Summary and strengths:

      The manuscript, "Remodelling of skeletal muscle myosin metabolic states in hibernating mammals", by Lewis et al, investigates whether myosin ATP activity may differ between states of hibernation and activity in both large and small mammals. The study interrogates (primarily) permeabilized muscle strips or myofibrils using several state-of-the-art assays, including the mant-ATP assay to investigate ATP utilization of myosin, X-ray diffraction of muscles, proteomics studies, metabolic tests, and computational simulations. The overall data suggests that ATP utilization of myosin during hibernation is different than in active conditions.

      A clear strength of this study is the use of multiple animals that utilize two different states of hibernation or torpor. Two large animal hibernators (Eurasian Brown Bear, American Black Bear) represent large animal hibernators that typically undergo prolonged hibernation. Two small animal hibernators (Garden Dormouse, 13 Lined Ground Squirrel) undergo torpor with more substantial reductions in heart rate and body temperature, but whose torpor bouts are interrupted by short arousals that bring the animals back to near-summer-like metabolic conditions.

      Especially interesting, the investigators analyze the impact that body temperature may have on myosin ATP utilization by performing assays at two different temperatures (8 and 20 degrees C, in 13 Lined Ground Squirrels).

      The multiple assays utilized provide a more comprehensive set of methods with which to test their hypothesis that muscle myosins change their metabolic efficiency during hibernation.

      We thank this reviewer for the effort and time they have put into carefully reviewing our manuscript and have taken on board their valuable suggestions to improve our manuscript (please see below for detailed answers to the critiques).

      Suggestions and potential weaknesses:

      While the samples and assays provide a robust and comprehensive coverage of metabolic needs and testing, the data is less categorical. Some of these may be dependent on sample size or statistical analysis while others may be dependent on interpretation.

      (1) Statistical Analysis

      (1a) The results of this study often cannot be assessed properly due to a lack of clarity in the statistical tests.

      For example, the results related to the large animal hibernators (Figure 1) do not describe the statistical test (in the text of the results, methods, or figure legends). (Similarly for figure 6 and Supplemental Figure 1). Further, it is not clear whether or when the analysis was performed with paired samples. As the methods described, it appears that the Eurasian Brown Bear data should be paired per animal.

      We thank the reviewer for these important points and have added information on the statistical tests used to each figure legend where it was previously missing. Details on the statistical testing used for Figure 6 are listed in the Methods section, paragraph 18, “All statistical analysis of TMT derived protein expression data…”

      (1b) The statistical methods state that non-parametric testing was utilized "where data was unevenly distributed". Please clarify when this was used.

      We have now clarified all statistical tests used in the figure legends.

      (1c) While there are two different myosin isoforms, the isoform may be considered a factor. It is unclear why a one-way ANOVA is generally used for most of the mant-ATP chase data.

      The reviewer is right, in our analysis, we haven’t considered ‘myosin isoforms’ as a factor. One of the main reasons for that is because we have decided to treat fibres expressing different myosin heavy chain isoforms as totally separated entities (not interconnected).

      (1d) While the technical replicates on studies such as the mant-ATP chase assay are well done, the total biological replicates are small. A consideration of the sample power should be included.

      Unfortunately, obtaining additional biological samples from these unique species is challenging. Hence, we have added a statement in the Discussion section. This statement focuses on the potential benefits of increasing sample size to increase statistical power (Discussion, paragraph 2, “In contrast to our study hypothesis…”).

      (1e) An analysis of the biological vs statistical significance should be considered, especially for the mant-ATP chase data from the American Black Bear, where there appear to be shifts between the summer and winter data.

      We agree that it is important to be careful when drawing conclusions from data based only on p-values. We agree that the modest differences observed in these data on the American Black Bear, whilst not significant, are worth noting, and we have added these considerations into the manuscript (Discussion, paragraph 2, “In contrast to our study hypothesis…”).

      (2) Consistency of DRX/SRX data.

      (2a) The investigators performed both mant-ATP chase and x-ray diffraction studies to investigate whether myosin heads are in an "on" or "off" state. The results of these two studies do not appear to be fully consistent with each other, which should not be a surprise. The recent work of Mohran et al (PMID 38103642) suggests that the mant-ATP-predicted SRX:DRX proportions are inconsistent with the position of the myosin heads. The discussion appears to lack a detailed assessment of this prior work and lack a substantive assessment contrasting the differing results of the two assays in the current study. i.e. why the current study's mant-ATP chase and x-ray diffraction results differ.

      Prior works on skeletal muscle (observing discrepancies between the mant-ATP chase assay and X-ray diffraction) are rather scarce. Adding a comprehensive discussion about this may be beyond the scope of the current study and would distract the reader from the main topic. For this reason, we have not added any section. Note that we have other manuscripts in preparation that are specifically dedicated to the discrepancy.

      (2b) The discussion of the current study's x-ray diffraction data relating to the I_1,1/I_1,0 ratio and how substantially different this is to the M6 results merits discussion. i.e. how can myosin both be more primed to contract during IBA versus torpor (according to intensity ratio), but also have less mass near the thick filament (M6).

      The I1,1/I1,0 ratio indicates a subtle mass shift towards the myosin thick filament whilst the M6 spacing shows a more compliant thick filament. These results are not incompatible and rely on interpretation of the X-ray diffraction patterns. To avoid any confusion and avoid distracting the reader from the main topic, we have decided not to speculate there.

      (3) Possible interactions with Heat Shock Proteins

      Heat Shock Proteins (HSPs), such as HSP70, have been shown to be differential during torpor vs active states. A brief search of HSP and myosin reveals HSPs related to thick filament assembly and Heat Shock Cognate 70 interacting with myosin binding protein C. Especially given the author's discussion of protein stability and the potential interaction with myosin binding protein C and the SRX state, the limitation of not assessing HSPs should be discussed. (While HSP's relation to thick filament assembly might conceivably modify the interpretation of the M3 x-ray diffraction results, this reviewer acknowledges that possibility as a leap.)

      The reviewer raises an interesting and potentially important point about the potential impact of HSPs and their interaction with the thick filament during hibernation. We have added a section to the discussion of this manuscript regarding this, with particular emphasis on HSP70 acting as a chaperone for myosin binding protein. However, we feel that it is important to point out that HSPs have only been shown to interact with MYBPC3, a cardiac isoform of this protein which is not present in skeletal muscle [3] (Discussion, paragraph 5, “Of potential further interest to the regulation of myosin…”).

      Despite the above substantial concerns/weaknesses, this reviewer believes that this manuscript represents a valuable data set.

      Other comments related to interpretation:

      (4) The authors briefly mention the study by Toepfer et al [Ref 25] and that it utilizes cardiac muscles. The manuscript would benefit from increased discussion regarding the possible differences in energetics between cardiac and skeletal muscle in these states.

      As this manuscript focuses solely on skeletal muscle, we believe that introducing comparisons between cardiac and skeletal muscles would confuse the reader. These types of muscle differ substantially, for example in their regulation of SRX/DRX. Note that we are preparing a manuscript focusing on cardiac muscle and hibernation.

      (5) The author's analysis of temperature is somewhat limited.

      (5a) First, the authors use 20 degrees C (room temperature), not 37 degrees C, a more physiologic body temperature for large mammals. While it is true that limbs are likely at a lower temperature, 20 degrees C seems substantially outside of a normal range. Thus, temperature differences may have been minimized by the author's protocol.

      We agree that performing these single fiber studies at slightly higher temperatures might have better replicated the physiological conditions of these hind leg muscles in the analyzed animals. However, previous work has shown that the resting myosin dynamics are in fact stable at temperatures between 20-30 degrees Celsius in type I, type II and cardiac mammalian muscle fibers [4].

      (5b) Second, the authors discuss the possibility of myosin contributing to non-shivering thermogenesis. The magnitude of this impact should be discussed. The suggestion of myosin ATP utilization also implies that there is some basal muscle tone (contraction), as the myosin ATPase utilizes ATP to release from actin, before binding and hydrolyzing again. Evidence of this tone should be discussed.

      The reviewer is raising an interesting point and it would indeed be interesting to assess the magnitude of the impact and whether a basal muscle tone exists. Assessing the magnitude of the impact is not an easy task and would require very advanced simulations, in which we are unfortunately not experts. As for basal muscle tone, this is difficult to say, as myosin is not actually binding to actin but hydrolyzing ATP at a faster pace during hibernation. We therefore think that the relation between our data and basal muscle tone is unclear. Hence, we have decided not to discuss these points in the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This is a very interesting paper. I have some minor suggestions to help improve it.

      Is there any way to estimate the contribution of contractile apparatus to energy expenditure in reference to what is being generated at SERCA in the resting muscle under the various states examined?

      This is an interesting idea; however, as far as we know, this would be challenging experimentally (in the hibernating mammals) and difficult to achieve in a reliable manner.

      It is important to emphasize that while BAT has been traditionally seen to be the site of NST, the skeletal muscle is very important, especially in large mammals, where BAT is going to be a very small % of the body and unlikely to be able to adequately provide heat. The addition of the contractile apparatus to SERCA as a heat generator at rest is very important -- also, the activation of ryanodine receptor Ca2+ to increase the local [Ca2+] at SERCA to generate heat has also recently been shown and should be mentioned (Meizoso-Huesca et al 2022, PNAS; Singh et al 2023, PNAS) alongside the work of Bal et al 2012 etc...

      We have included these mechanisms and references into the manuscript discussion [5, 6]. Discussion, paragraph 4, “A critical difference between the large hibernators…”

      Are you able to report the likely proportion of type II fibers in the muscles you have sampled?

      The fiber type breakdown for all animals used in this study is reported in supplementary table 1.

      The sampling of muscle from the legs of live animals is sensible and convenient. Is it possible different muscles in the body have different levels of NST, changes in energy expenditure in torpor, and other states?

      As discussed in the public review we have added to the discussion of this manuscript to reflect upon this important point of potentially different results from different muscle sites in these animals.

      Reviewer #2 (Recommendations For The Authors):

      Is it likely that the proportion of fast and slow myosin-heavy chains within the selected sample of myofibers from the different mammals contributes to the overall differences in the energetics of different conformational states? In living animals, how does the relative contribution of the energetics from different muscle fiber types compare with the contribution from other organs to the overall regulation of metabolism during activities in summer, winter, or periods of intermittent arousal?

      Fiber types in mammals can be vastly different between species as well as having a considerable amount of plasticity to change within each species upon specific stimuli. Furthermore, some mammals also have specific myosin heavy chain isoforms which have considerable expression, for example, myosin heavy chain 2B which is expressed in rodents such as mice but not larger mammals such as humans.

      In the manuscript, we demonstrate that there is no significant change in the ATP usage by myosin in resting muscle in any of the species which we examined (Fig 1 F, L; Fig 2 E, J). The relatively high mitochondrial density of type I fibers when compared to type II fibers may contribute to a higher overall requirement of energy storage primarily via lipid oxidation. However, mitochondrial respiration is heavily suppressed during hibernation, so questions remain over the overall energy demand in hibernating muscle beyond myosin [7]. The fact that myosin ATP demand is relatively preserved in hibernating muscle suggests that skeletal muscle may be a relatively energy-demanding organ even during hibernation; we speculate in the manuscript that this may be due to the requirement of maintaining muscular tone and function during this period of prolonged immobilization. This may be of relevance when one considers the almost complete shutdown of organs involved with food intake and breakdown, such as the stomach and liver, during hibernation. Furthermore, heart rate and breathing rates are vastly suppressed. Altogether, whilst it is difficult at this point to make an accurate estimate of energy demands between the different organs of hibernators, our data point to skeletal muscle being a relatively high energy demand organ during these periods. When considering the difference between fiber types, again our data suggest that both type I and type II fibers have relatively similar energy demands during hibernation.

      The supplementary data are quite revealing as to how the myosin isoform composition is stable in some species but highly plastic in others in response to the same environmental/metabolic challenges. Why is the myosin heavy chain isoform (I and II) composition stable for brown bears but not for black bears between summer and winter? This is very interesting. For the Ground squirrel, there is remarkable plasticity between myosin heavy chain isoforms ( I and II) between summer, interbout arousal, and torpor. Yet in the Garden Dormouse, the myosin heavy chain isoform (I and II) composition is stable between these three activity states. The inconsistencies between and within species are perplexing and worthy of closer interrogation.

      The measurements and role of myosin energetics in different conformational states are interesting but need to be explained in context with other metabolic regulators for these hibernating mammals, especially because some species show remarkable plasticity whereas others show remarkable stability. For example, compare brown and black bears which show differences in the response of myosin composition to activity, interbout arousal, and torpor. Ground squirrels show remarkable plasticity in myosin isoform composition between activity states (and likely metabolic differences), but the Garden Dormouse has a remarkably stable myosin isoform composition during the three metabolic/environmental challenges. What mechanisms facilitate these modifications in some but not other mammals, even those of similar size? The differences are very interesting, worthy of follow-up, and may well contribute to further understanding the significance of the energetics of different myosin conformational states.

      We agree that the changes seen between these species are very interesting and worthy of further investigation. It would be of further interest to look at methods which would allow for even deeper phenotyping, such as single fiber proteomics, to allow for the assessment of the percentage of hybrid fibers and fibers undergoing any fiber type switch during hibernating periods. Our results do show a modest, albeit not significant, increase in the number of type I muscle fibers in 13-lined ground squirrels and Garden dormice during torpor, which is consistent with previous studies [8]. Previous studies have demonstrated that lower temperatures may promote a shift towards more oxidative type I muscle fibers in mammals [9]. This could be an explanation for why we see this specifically in the smaller hibernators; however, as we demonstrate and discuss, these lower temperatures are vital for the survival of these smaller mammals during hibernation, so it would be inconsistent to hypothesize that these shifts are for heat-production purposes. Further studies, particularly those with a higher sample size, are warranted to understand the relevance of these shifts. It would also be of interest to examine fiber type percentages during the progression of these long hibernating periods to observe if these changes are progressive.

      As for the triggers and mechanisms which facilitate these changes to myosin dynamics, this is under current investigation by the field. One which may be of particular relevance to the changes seen during hibernation is that of steroid hormones: previous research has demonstrated that steroid hormone levels in male and female bears change differentially [10]. This may be of relevance as the steroid hormone estradiol has been shown to slow the resting myosin ATP turnover via the binding of myosin RLC [11]. Considering these studies, future work which looks at hibernating animals of each sex as different groups may be fruitful.

      Reviewer #3 (Recommendations For The Authors):

      i. PDF Pg 8- Results- 'Myosin temperature sensitivity is lost in relaxed skeletal muscles fibers of hibernating Ictidomys tridecemlineatus.': An extra comma appears to be placed between "temperature, decrease".

      ii. PDF Pg 9- Results- 'Hyper-phosphorylation of Myh2 predictably stabilizes myosin backbone in hibernating Ictidomys tridecemlineatus.' (last paragraph): A parenthesis needs to be closed upon the first reference to "supplemental figures 2 and 3".

      iii. PDF Pg 15- Methods- 'Samples collection and cryo-preservation'- The authors use the term "individuals" in the 2nd line. Consider using "subjects".

      iv. PDF Pg 15- Methods- 'Samples collection and cryo-preservation' (2nd paragraph)- define "subadult" in approximate months or years.

      v. PDF Pg 15- Methods- 'Samples collection and cryo-preservation' (2nd paragraph)- The authors state that brown bears were located in "February and again ... in late June". Was this order of operations always held? If so, a comment about how the potential ageing from the hibernation (especially if sub-adult transitions to adulthood in this period) should be included.

      All samples were collected during the subadult period of the lifespan of each bear, and therefore we do not think that there would be a potential aging effect, considering that the lifespan of this species is 20-30 years.

      vi. PDF Pg 15- Methods- 'Samples collection and cryo-preservation' (3rd paragraph)- The justification for deprivation of feeding of black bears 24 hours prior to euthanasia should be included. A comment on how this might impact post-translational modifications or gene expression should be included.

      Animals are starved beforehand to prevent aspiration during euthanasia. Considering these samples are to be compared to animals which have not consumed food or water for five months, the relative impact on PTMs and gene expression would be considered negligible.

      vii. PDF Pg 17- Methods- 'Mant-ATP chase experiments' (just after normalized fluorescence equation): The "Where" may be lowercase.

      viii. PDF Pg 17- Methods- 'Mant-ATP chase experiments' (last paragraph): The protocol for myosin staining, along with the antibody identification (source, catalog number) should be included.

      ix. PDF Pg 18- Methods- 'Post-translational Modification Peptide mapping': Define the makeup of the acrylamide gel and/or the source and catalog number.

      x. PDF Pg 18- Methods- 'Post-translational Modification Peptide mapping': The authors state that "Gel bands were washed..." Please specify which protein bands and if multiple bands (i.e. multiple isoforms) were isolated.

      We thank this reviewer for their careful reading of our manuscript; we have made the changes above where relevant.

      Reference list

      (1) Aydin, J., et al., Nonshivering thermogenesis protects against defective calcium handling in muscle. FASEB J, 2008. 22(11): p. 3919-24.

      (2) Stickler, S., Regional body temperatures and fatty acid compositions in hibernating garden dormice: a focus on cardiac adaptions. 2022, Vienna: Vienna. p. v, 49 pages, illustrations.

      (3) Glazier, A.A., et al., HSC70 is a chaperone for wild-type and mutant cardiac myosin binding protein C. JCI Insight, 2018. 3(11).

      (4) Walklate, J., et al., Exploring the super-relaxed state of myosin in myofibrils from fast-twitch, slow-twitch, and cardiac muscle. Journal of Biological Chemistry, 2022. 298(3).

      (5) Meizoso-Huesca, A., et al., Ca²⁺ leak through ryanodine receptor 1 regulates thermogenesis in resting skeletal muscle. Proceedings of the National Academy of Sciences, 2022. 119(4): p. e2119203119.

      (6) Singh, D.P., et al., Evolutionary isolation of ryanodine receptor isoform 1 for muscle-based thermogenesis in mammals. Proceedings of the National Academy of Sciences, 2023. 120(4): p. e2117503120.

      (7) Staples, J.F., K.E. Mathers, and B.M. Duffy, Mitochondrial Metabolism in Hibernation: Regulation and Implications. Physiology, 2022. 37(5): p. 260-271.

      (8) Xu, R., et al., Hibernating squirrel muscle activates the endurance exercise pathway despite prolonged immobilization. Exp Neurol, 2013. 247: p. 392-401.

      (9) Yu, J., et al., Effects of Cold Exposure on Performance and Skeletal Muscle Fiber in Weaned Piglets. Animals (Basel), 2021. 11(7).

      (10) Frøbert, A.M., et al., Differential Changes in Circulating Steroid Hormones in Hibernating Brown Bears: Preliminary Conclusions and Caveats. Physiol Biochem Zool, 2022. 95(5): p. 365-378.

      (11) Colson, B.A., et al., The myosin super-relaxed state is disrupted by estradiol deficiency. Biochemical and biophysical research communications, 2015. 456(1): p. 151-155.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review): Weaknesses:

      However, the molecular mechanisms leading to NPC dysfunction and the cellular consequences of resulting compartmentalization defects are not as thoroughly explored. Results from complementary key experiments using western blot analysis are less impressive than microscopy data and do not show the same level of reduction. The antibodies recognizing multiple nucleoporins (RL1 and Mab414) could have been used to identify specific nucleoporins that are most affected, while the selection of Nup98 and Nup107 is not well explained.

      The results for the Western blots are less impressive than single nuclei imaging analysis because the protocol for isolating brain nuclei is heterogeneous and includes non-neuronal cells. For this reason, we selected specific nucleoporins for Western blot studies to complement the nonspecificity of pan-NPC antibodies, for which the detection is based on the glycosylated moieties. We reasoned that a combination of pan-NPC and select NUPs will give the strongest complementary validation for the mutant phenotype. We have discussed the rationale of NUP selection in the discussion. In brief, we selected NUP107 as it is a major component of the Y-scaffold complex and is a long-lived subunit of the NPCs (Boehmer et al., 2003; D'Angelo et al., 2009). NUP98 is a mobile nucleoporin and is associated with the central pore, nuclear basket and cytoplasmic filaments. Both NUPs have been implicated in degenerative disorders (Eftekharzadeh et al., 2018; Wu et al., 2001).

      There is also no clear hypothesis on how Aβ pathology may affect nucleoporin levels and NPC function. All functional NCT experiments are based on reporters or dyes, although one would expect widespread mislocalization of endogenous proteins, likely affecting many cellular pathways.

      We agree that the interaction between Aβ pathology and the NPC remains a work in progress. We decided to rigorously characterize Aβ-mediated deficits in App KI neurons – using different approaches and in more than one animal model – before moving on to explore mechanisms in subsequent studies, which we think deserve more extensive experiments. We seek your understanding and have included in the discussion possible mechanisms for direct and indirect Aβ-mediated disruption of NPCs. We have also included an additional study to show the disruption in the localization of an endogenous nucleocytoplasmic protein – CRTC1 (cAMP Regulated Transcriptional Coactivator), which is a CREB coactivator responsive to neural activity. We observed that under both basal and tetrodotoxin-silenced conditions, there is much higher CRTC1 in the nucleus in App KI neurons relative to WT. This reflects the compromised permeability barrier that we observed via FRAP studies (Supplementary Figure S15).

      The second part of this manuscript reports that in App KI neurons, disruption in the permeability barrier and nucleocytoplasmic transport may enhance activation of key components of the necrosome complex that include receptor-interacting kinase 3 (RIPK3) and mixed lineage kinase domain-like (MLKL) protein, resulting in an increase in TNFα-induced necroptosis. While this is of potential interest, it is not well integrated in the study. This potential disease pathway is not shown in the very simple schematic (Fig. 8) and is barely mentioned in the Discussion section, although it would deserve a more thorough examination.

      The study of necroptosis is meant to showcase a single cellular pathway that requires nucleocytoplasmic transport for activation, is compromised, and is relevant for AD. We agree there is much more to explore in this pathway but feel it is outside the scope of this study. We have included a new illustration that models how damage to NPCs and the permeability barrier results in enhanced vulnerability of App KI neurons to necroptosis (Supplementary Figure S12).

      Reviewer #2 (Public Review):

      (1) Adding statistics and comparisons between wild-type changes at different times/ages to determine if the nuclear pore changes with time in wild-type neurons. The images show differences in the Nuclear pore in neurons from the wild-type mice, with time in culture and age. However, a rigorous statistical analysis is lacking to address the impact of age/development on NUP function. Although the authors state that nuclear pore transport is reported to be altered in normal brain aging, the authors either did not design their experiments to account for the normal aging mechanisms or overlooked the analysis of their data in this light.

      All our quantifications and statistical comparisons in neuron cocultures are time-matched between WT and App KI neurons, and thus independent of age and maturity of the neurons in culture. The accelerated loss of NUP expression is evident across all time groups. However, we cannot compare across age groups in cultured neurons as the time-matched WT and App KI samples for each time point were processed and imaged separately as neurons matured over time (Fig. 1B-C). An experiment must be done simultaneously across all age groups to compare age-related effects for WT and App KI neurons in order to account for time-dependent changes. Given the unique challenges of studying “aging” in culture systems, we opted to be more conservative in our interpretation of the results and as such, we were careful to describe the accelerated nuclear pore deficits in App KI neurons relative to time-matched WT expression and to speculate on their relationship to normal brain aging only in the discussion section. We seek your understanding in this matter. That said, we are able to capture the decline of the NPC in histology of brain sections and observed a statistically significant drop in WT NUP levels in animal sections across age groups, where we quantified and compared the raw nuclear intensities from brain sections that were processed and imaged simultaneously across independent experiments (Fig. 1D-E). We have included a statement in the results section to highlight that point.

      (2) Add experiments to assess the contribution of wild-type beta-amyloid accumulation with aging. It was described in 2012 (Guix FX, Wahle T, Vennekens K, Snellinx A, Chávez-Gutiérrez L, Ill-Raga G, Ramos-Fernandez E, Guardia-Laguarta C, Lleó A, Arimon M, Berezovska O, Muñoz FJ, Dotti CG, De Strooper B. 2012. Modification of γ-secretase by nitrosative stress links neuronal ageing to sporadic Alzheimer's disease. EMBO Mol Med 4:660-673, doi:10.1002/emmm.201200243) and 2021 (Burrinha T, Martinsson I, Gomes R, Terrasso AP, Gouras GK, Almeida CG. 2021. Upregulation of APP endocytosis by neuronal aging drives amyloid-dependent synapse loss. J Cell Sci 134. doi:10.1242/jcs.255752), 28 DIV neurons are senescent and accumulate beta-amyloid42. In addition, beta-amyloid 42 accumulates normally in the human brain (Baker-Nigh A, Vahedi S, Davis EG, Weintraub S, Bigio EH, Klein WL, Geula C. 2015. Neuronal amyloid-β accumulation within cholinergic basal forebrain in ageing and Alzheimer's disease. Brain 138:1722-1737. doi:10.1093/brain/awv024), thus, it would be important to determine if it contributes to NUP dysfunction. Unfortunately, the authors tested the Abeta contribution at div14 when wild-type Abeta accumulation was undetected. It would enrich the paper and allow the authors to conclude about normal aging if additional experiments were performed, namely, treating 28Div neurons with DAPT and assessing if NUP is restored.

      Your point is well-noted. We are intrigued at the potential contribution of WT Aβ to the decline in NUPs and NPC but decided to focus on mutant Aβ for this manuscript. We have observed negligible MOAB2-positive Aβ signals in WT neurons across all age groups (data not shown) but acknowledge the potential contributions of aging toward a reduction in NPC function. Instead, we have included a section in the discussion to highlight the aging-related expression of Aβ in WT neurons and a subset of the citations above to indicate a possible link with normal decay of NPCs.

      Reviewer #3 (Public Review):

      Weaknesses:

      (1) It does not consider the relationship of the findings here to other published work on the intraneuronal perinuclear and nuclear accumulation of amyloid in other transgenic mouse models and in humans.

      We have updated the discussion to further elaborate on intraneuronal and perinuclear accumulation of amyloid and how that relates to our NPC phenotype.

      (2) It appears to presume that soluble, secreted Abeta is responsible for the effect rather than the insoluble amyloid fibrils.

      At present, our data cannot fully discount the role of fibrils or other forms of Aβ causing the NPC deficits, but our studies do show that external presence of Aβ (e.g. addition of synthetic oligomeric Aβ or App KI conditioned media) leads to intracellular accumulation and NPC dysfunction. We are aware that endogenous formation of fibrils could also contribute to the NPC dysfunction but refrained from drawing any conclusions without further studies. We have stated this in the discussion.

      (5) It is not clear when the alteration in NUP expression begins in the KI mice as there is no time at which there is no difference between NUP expression in KI and Wt and the earliest time shown is 2 months. If NUP expression is decreased from the earliest times at birth, then this makes the significance of the observation of the association with amyloid pathology less clear.

      The phenotype we observed early in neuronal cultures and in very young animals is subtle, and in all our studies the severity of the NUP phenotypes consistently correlates with elevated intracellular Aβ. We expect that by looking at earlier/younger neurons, the deficits will not be present. However, neurons before DIV7 are immature, and hence we chose not to include those in our observations. In animals, we observed Aβ expression in neuronal soma in young mice (2 mo.), but it is not clear when the deficits manifest and how early to look. While the NUP expression is reduced at an early stage, we speculate in the discussion that cellular homeostatic mechanisms can compensate for any compromised nuclear functions and maintain viability to the point where age-dependent degradation of cellular mechanisms will eventually lead to progression of AD.

      Reviewer #1 (Recommendations For The Authors):

      While the App KI model is suitable for modeling one key aspect of human AD, the use of the term "AD neurons" throughout the manuscript is misleading and should be avoided when describing experiments with "App KI neurons".

      Noted and corrected.

      The claim that Aβ pathology causes NPC dysfunction via reduced nucleoporin protein expression would be stronger if it was better supported by biochemical evidence based on western blots (WBs) to complement the strong microscopy data. The results shown in Figure 2H show a very weak effect compared to microscopy data that does not appear to match the quantification (e.g. Lamin-B1 staining appears reduced after 2 months in WB but not the graph). It is also not clear why nuclear fractionation is required. WB analyses with RL1 and MAB414 (that recognizes multiple FG-Nups in ICCs and WBs) would help identify Nups that are most affected by Aβ pathology.

      The weaker Western blot result is due to the heterogeneity of the nuclei we isolated from the whole brain, which include non-neuronal cells. We reasoned that isolating the nuclear fraction would give us a cleaner Western blot with fewer background bands as the input lysate is more specific. We also decided to use antibodies against specific NUPs as a way to complement the pan-NPC antibodies that detect glycosylation-enriched epitopes in the nucleus. We reasoned that Western blot identification of individual subunits should provide complementary and stronger evidence for the reduction of NUPs at the peptide level. Overall, we used four different nuclear pore antibodies (RL1, Mab414, NUP98, NUP107) to demonstrate the same mutant phenotype in App KI neurons.

      While the observed NCT defects are discussed in detail, the authors do not present any potential mechanisms to be tested, how intracellular Aβ may impact NPCs. Does Aβ pathology affect nucleoporin expression or stability?

      We have observed the presence of Aβ adjacent to the nuclear membrane and also in the cytosol via high resolution confocal microscopy (Supplementary Figure S14). Our primary goal in this paper is to provide convincing evidence – using different assays and in more than one mouse model – for the reduction of NUPs and lower NPC counts. We feel the mechanistic details of Aβ-driven NPC disruption require more extensive experimentation that is more suitable for subsequent publications.

      The very simple schematic just represents the loss of compartmentalization, without illustrating more complex concepts. It would also be improved by representing the outer and inner nuclear membrane fusing around the NPCs with a much wider perinuclear space between the membranes. As shown now, the nuclear envelope almost looks like a single membrane, while >60kDa proteins are shown at a similar size as the 125MDa NPC.

      We have updated the illustration along with a new schematic for necroptosis (Supplementary Figure S12). We have refrained from giving specific details of the damage to the nuclear pore complex because the nature of these deficits is not yet clear.

      Misspelling of "Hoechst" as "Hochest" in several figures (Fig. 1, 2, S5, S7).

      Noted and corrected.

      Reviewer #2 (Recommendations For The Authors):

      (1) Additional data analysis is required concerning the wild-type controls. The figures show clear differences in the wild-type neurons with time in culture (referring to figures 1A, 1B, 1C; 2A, 2B, 2C, 2D,6E, 6F, 6G, s4) and in different ages (2E, 2F, 2G, 5B, 5C, 5D). The data analysis is shown for knockin vs the time-matched wild-type condition. The effect of time in wild-type neurons/mice should also be analyzed. All the data is suggested to be normalized to 7 DIV/2month wild-type neurons/mice. Were these experiments done with different time points of the same culture? This would be the best to conclude on the effect of time.

      We have noted a decline of NUPs in WT neurons over time in primary cultures and in animal sections. This is not surprising since the NPC and nuclear signaling pathways deteriorate with age (Liu and Hetzer, 2022; Mertens et al., 2015). However, we are unable to do a direct comparison across age groups in cultured neurons as the time-matched WT and App KI neuronal samples for each time point were processed and imaged separately as neurons matured over time (Fig. 1B-C). Hence, we perform statistical analysis for each time-matched WT and App KI neurons. To be clear, multiple independent experiments across different cultures were performed at each time point. Given the inherent challenges of studying aging in culture systems, we opted to be more conservative in our interpretation of the results and as such, we were careful to describe the accelerated nuclear pore deficits in App KI neurons relative to WT levels without inferring the effect of time and speculate its relationship to normal brain aging only in the discussion section. That said, we are able to capture the decline of the nuclear pore complex across different age groups in histology of brain sections where we observed a drop in WT NUP levels in animal sections when we quantified and compared the raw nuclear intensities from brain sections that were processed and imaged simultaneously across independent experiments (Fig. 1D-E).

      Similarly, in Figure 2H, why aren't 2 months compared with 14 months? Why were these ages chosen? 2 months is a young adult, and 14 months is a middle-aged adult. To conclude, aging should have included an age between 18 and 24 months old.

      As with cultures, we isolated age-matched WT and App KI animals separately. We chose 2 and 14 months as they represent young and middle-aged adults, and we wanted to showcase the nuclear pore deficits induced by the presence of Aβ without drawing a conclusion on the effects of age or time. That said, we do show histology of brain sections at 18 months of age with individual NUPs. We agree that the temporal aspects of NPC loss in WT neurons are interesting; however, given our experimental parameters, we cannot draw conclusions across different age groups at the moment.

      In Figure 3, statistics between wild type should have been included.

      Similar to the above comment, samples were processed and imaged independently across different groups, hence we cannot compare the datapoints across time.

      (4) Additional quantification: The intensity of MOAB2 at 2 and 13 months should be measured as in Figure 3C.

      Intracellular Aβ signal in 2-mo. old App KI mice is diffuse throughout the soma, but in older animals it is punctate. This observation was similarly described by Lord et al. for tgAPPArcSwe mice (Lord et al., 2006). We have included a confocal micrograph of MOAB-2 immunocytochemistry of a 13-mo. App KI brain section in supplemental figures (Supplementary Figure S13). We found it challenging to differentiate whether the signal is localized intracellularly or as an extracellular aggregate. Regardless, the differences in the quality and uneven distribution of Aβ signal make any direct comparison of soma intensity across the different age groups harder to interpret in the context of the mutant phenotype.

      (5) Additional experiments: Because primary neurons differentiate, mature, and age with time in culture, it is necessary to control for the developmental stage of your cultures. This could be done by analyzing neuronal markers such as doublecortin for neuronal precursors, MAP2 (or Tau) for dendritic/axonal maturation, synapsin for synaptic maturation, and accumulation of senescence-associated beta-galactosidase (SA-Beta-Gal) as an aging marker.

      As part of the maintenance of cultures, we stain cultures for axodendritic markers (e.g. MAP2), glial cell distribution (e.g. GFAP), excitatory vs. inhibitory neuronal subpopulations (e.g. Gad65) and synaptic markers (e.g. PSD95) to ensure that growth, survival and viability of neurons are not compromised (data not shown). These markers for maturity are routinely tracked to ensure proper development. We also test the health of the cultures (e.g. apoptosis, necrosis) and look for cytoskeletal disruption or fragmentation of neuronal processes.

      (6) Additional methods: The quantification of Abeta intensity in Figure 3 is not clearly explained in the methods. Was the intensity measured per field, per cell body?

      The quantifications for Aβ are done for each MAP2-positive cell body, and we have included that statement in the methods.

      (7) Missing in discussion integration and references to these papers:

      a. Mertens J, Paquola ACM, Ku M, Hatch E, Böhnke L, Ladjevardi S, McGrath S, Campbell B, Lee H, Herdy JR, Gonçalves JT, Toda T, Kim Y, Winkler J, Yao J, Hetzer MW, Gage FH. 2015. Directly Reprogrammed Human Neurons Retain Aging-Associated Transcriptomic Signatures and Reveal Age-Related Nucleocytoplasmic Defects. Cell Stem Cell 17:705-718. doi:10.1016/j.stem.2015.09.001

      b. Guix FX, Wahle T, Vennekens K, Snellinx A, Chávez-Gutiérrez L, Ill-Raga G, Ramos-Fernandez E, Guardia-Laguarta C, Lleó A, Arimon M, Berezovska O, Muñoz FJ, Dotti CG, De Strooper B. 2012. Modification of γ-secretase by nitrosative stress links neuronal ageing to sporadic Alzheimer's disease. EMBO Mol Med 4:660-673. doi:10.1002/emmm.201200243

      c. Burrinha T, Martinsson I, Gomes R, Terrasso AP, Gouras GK, Almeida CG. 2021. Upregulation of APP endocytosis by neuronal aging drives amyloid-dependent synapse loss. J Cell Sci 134. doi:10.1242/jcs.255752

      d. Baker-Nigh A, Vahedi S, Davis EG, Weintraub S, Bigio EH, Klein WL, Geula C. 2015. Neuronal amyloid-β accumulation within cholinergic basal forebrain in ageing and Alzheimer's disease. Brain 138:1722-1737. doi:10.1093/brain/awv024

      We have cited a subset of the papers in the discussion section and also expanded the discussion to include the possibility of time-dependent changes for Aβ expression in WT neurons.

      Reviewer #3 (Recommendations For The Authors):

      Specific comments:

      (1) Fig. 1D,E. Fig. 2E, F. This shows the change in NUP IR with time for the APP-KI, but there is also a difference between Wt and KI from the earliest time shown. How early is this difference apparent? From birth? The study should go back to the earliest time possible as the timing of the staining for NUP is important to correlate this with other events of intraneuronal Abeta and amyloid IR. Is the difference between 4 and 7-month ko mice in Figures 2G and 2F statistically significant? If not, perhaps we need a larger N to determine the timing accurately.

      The point is well taken. We have not examined the WT and App KI brains before 2-mo. of age. At this early time point, the extracellular amyloid deposits are very low but intracellular Aβ can be readily detected in neuronal soma. We expect that as the animal ages, the Aβ inside cells will directly impact the NPC mutant phenotype, but it is unclear how early this phenotype manifests in animals and when we should look. To be clear, in less mature neurons (DIV7), the phenotype is very subtle and can only be observed via high resolution microscopy. The differences between 4-7 mo. old animals (Fig. 2F and G) in terms of severity of the reduction cannot be assessed as the age-matched animals for each time point were processed separately, but at each time point, we observed a significant reduction of NPC relative to WT. Nevertheless, in Figure 1E, we performed immunohistochemistry experiments with pan-NPC antibodies and quantified raw intensities to show a difference between 4/7-mo. with 13-mo. old animals.

      (2) Similarly, the increase in Abeta IR is only shown for cultured neurons and only a single time point of 2 months is shown for CA1 in KI brain. Since a major point is that the decrease in NUP IR is correlated with an increase in Abeta IR, a more convincing approach would be to stain for both simultaneously in KI brain, especially since Abeta IR is quite sensitive to conformational variation between APP, Abeta, and aggregated forms and whether they are treated with denaturants for "antigen retrieval". The entire brain hemisphere should be shown as the pathology is not limited to CA1. There are many different Abeta antibodies that are specific to the amyloid state so it should be possible to come up with a set of antibodies and conditions that work for both Abeta and NUP staining.

      The intracellular Aβ signal in 2-mo. old App KI mice is diffuse throughout the soma, but in older animals it is punctate. We have included a confocal micrograph of MOAB-2 immunocytochemistry of a 13-mo. App KI brain section (Supplementary Figure S13). We did not quantify Aβ as it was challenging to differentiate whether the signal is intracellular Aβ or amyloid β plaques. Regardless, the differences in the quality and uneven distribution of Aβ signal make any direct comparison of soma intensity across the different age groups much harder to interpret.

      (3) Figure 3A. The staining with MOAB 2 and 82E1 appears qualitatively different with 82E1 exhibiting larger perinuclear puncta. Both antibodies appear to stain puncta inside the nucleus consistent with previously published reports of intranuclear amyloid IR. If these are flattened images, then 3D Z stacks should be shown to clarify this. Figure 3H shows what appears to be Abeta immunofluorescence quantitation in DAPT-treated cells, but the actual images are apparently not shown. The details of this experiment aren't clear or what antibody is used, but this may not be Abeta as many APP fragments that are not Abeta also react with antibodies like MOAB2.

      Since 82E1 detects a larger epitope (aa1-16 as compared to 1-4 in MOAB-2), it is possible some forms of Aβ are differentially detected inside the cell. MOAB-2 is shown to detect the different forms of Aβ40 and 42, with a stronger selectivity for the latter. However, it is not known to react with APP or APP/CTFs (Youmans et al., 2012). DAPT-treated cells were processed and imaged as with other experiments in figure 3 using MOAB-2 antibodies to detect Aβ. We have included that information in the figure legends.

      To image the cell, we collect LSM800 confocal stacks and use IMARIS software to render the nucleus as a 3D object prior to quantifying the intensity or coverage. In this way, we are capturing and quantifying the entire volume of the nucleus and not just a single plane. The majority of the signal for MOAB-2-positive Aβ consists of punctate signals in the cytosol, with a subset adjacent to the nucleus (Supplementary Figure S14; Airyscan; single plane). We also detected MOAB-2 signals coming from within the nucleus. The nature of this interaction between Aβ and the nuclear membrane/perinuclear space/nucleoplasm remains unclear.

      (4) P20 L12. "We demonstrate an Aβ-driven loss of NUP expression in hippocampal neurons both in primary cocultures and in AD mouse models" It isn't clear that exogenous or extracellular Abeta drives this in living animals. All the data that demonstrate this is derived from cell culture and things may be very different (eg. Soluble Abeta concentration) in vivo. It is OK to speculate that the same thing happens in vivo, but to say it has been demonstrated in vivo is not correct.

      We have rewritten the opening statement of the paragraph to narrowly define our observations in the context of App KI. We understand the caveats of our studies in primary cultures, but we have done our due diligence by studying the phenomenon in different assays, using at least four different nuclear pore antibodies, and in more than one mouse model to show the deficits. We mentioned Aβ-driven loss but did not conclude which Aβ peptide (e.g. 40 vs. 42) or form (e.g. fibrillar) drives the deficits. However, we have shown some data indicating that oligomers, but not monomers, as well as extracellular Aβ, can accumulate in the soma and trigger NPC deficits. We also state in the discussion that other possible mechanisms of action, mainly via indirect interactions of Aβ with the cell, could result in the deficits.

      (5) P21, L21 "Inhibition of γ-secretase activity prevented cleavage of mutant APP and generation of Aβ, which led to the partial restoration of NUP levels". What the data actually shows is that treatment of the cells with DAPT led to partial restoration of NUP levels. Other studies have shown that DAPT is a gamma secretase inhibitor, so it is reasonable to suspect that the effect is due to gamma secretase activity, but the substrates and products are assumed rather than measured, so a little caution is a good idea here. For example, CTF alpha is also a substrate, producing P3, which is not considered abeta. The products Abeta and P3 also typically are secreted, where they can be further degraded. Abeta and P3 can also aggregate into amyloid, so whether the effect is really due to Abeta per se as a monomer or Abeta-containing aggregates isn't clear.

      The point is noted. DAPT inhibition of γ-secretase can affect more than one substrate, as the complex can cleave multiple substrates. However, we have measured Aβ intensity which increases with DAPT, and while a single experiment is insufficient to show direct Aβ involvement, we have performed other experiments that show a correlation between Aβ levels inside the soma and the degree of NPC reduction. This includes the direct application of synthetic Aβ42 oligomers. We agree the data cannot fully exclude the involvement of other γ-secretase cleavage products, but we feel there is strong enough evidence that Aβ, in whatever form, is at least a partial, if not the main, driver of these deficits.

      (6) Discussion. The authors point to "intracellular Abeta" as a potential causative agent for decreased NUP expression and function and cite a number of papers reporting intracellular Abeta. (D'Andrea et al., 2001; Iulita et al., 2014; Kimura et al., 2003; LaFerla et al., 1997; Oddo et al., 2003b; Takahashi et al., 2004; Wirths et al., 2001). Most of these papers report immunoreactivity with Abeta antibodies and argue about whether this is really Abeta40 or 42 and not APP or APP-CTF immunoreactivity. What is missing from these papers and the discussion in this manuscript is that this is not just soluble Abeta, but Abeta amyloid of the same type that ends up in plaques because it has the same immunoreactivity with Abeta amyloid fibril-specific antibodies and even the classical anti-Abeta antibodies 6E10 and 4G8 after antigen retrieval as shown in papers by Pensalfini, et al., 2014 and Lee, et al., 2022 (1,2) who describe the evolution of neuritic plaques and their amyloid core beginning inside neurons. The term "dystrophic neurite" is a misnomer because the structures that resemble "neurites" morphologically are actually autophagic vesicles packed with Abeta and APP immunoreactive material which has the detergent insolubility properties of amyloid plaques. See (1,2). The apparent intranuclear IR of MOAB2 and 82E1 mentioned in comment 3 is relevant here. In Lee et al., the 3D serial section EM reconstruction of one of these neurons with perinuclear and nuclear amyloid shows abundant amyloid fibrils in the remnant of the nucleus. The nuclear envelope appears to break down as evidenced by the redistribution of NeuN immunoreactivity (Pensalfini et al.,) and other nuclear markers and the EM evidence (Lee et al.,). These papers are also improperly cited as evidence for a hypothetical intracellular source for soluble Abeta.

      We have devoted a section of the discussion to highlighting some of these findings in the context of Pensalfini et al. 2014 and Lee et al. 2022. Lee et al. tested multiple animal strains to observe the Panthos structures but did not use the App KI mouse model. Since none of our experiments directly tested their observations (e.g. perinuclear fibrils or acidity of autophagic vesicles) in App KI, we decided to take a more conservative approach in our interpretations by framing the NPC deficits without specifying the nature of the intracellular Aβ. We note in the discussion that it is entirely possible that App KI animals also show the same Panthos phenotypes and the perinuclear accumulation of Aβ, which results in damaged NUPs. To test that, the Panthos phenotype must first be established in App KI mice.

      (7) The authors also cite the work of Ditaranto et al., 2001 and Ji et al., 2002 for Aβ-induced lysosomal leakage from these vesicular structures but overlook the original publications on Abeta-induced lysosomal leakage by Yang et al., (3) who further show that this is correlated with aggregation of Abeta42 upon internalization which also leads to the co-aggregation of APP and APP-CTFs in a detergent-insoluble form (4) and pulse-chase studies demonstrate that metabolically-labeled APP ultimately ends up as insoluble Abeta that have "ragged" N-termini (5). This work seems relevant to the results reported here as the perinuclear amyloid that the authors report here is likely to be the same insoluble, aggregated APP and APP-CTF-containing amyloid as that reported in references 1 and 2.

      We have included the literature references in the discussion, highlighting the possibility of lysosomal leakage contributing to the NPC damage.

      Minor points.

      (1) P2, L28 "permeability barrier facilities passive" should be 'facilitates'.

      (2) P7, L24 "homogenate and grounded for 5 additional strokes" One of the peculiarities of English is that the past tense of grind is ground. Grounded means something else.

      (3) P8, L9 "For synthetic Aβ experiments," Abeta what? 42? 40? It makes a difference and if it is Abeta42, you should be specific in the rest of the text where it is used.

      (4) P11, L14. "To determine if Aβ can trigger changes in nuclear structure and function" It seems a little early to start by presupposing that it is Abeta that triggers changes in nuclear structure and function. It sounds like you are starting out with a bias.

      (5) P11, L16,17 "While Aβ pathology is robustly detected in App KIs" At some point in the manuscript, either here or in the introduction, it would be useful to include a couple of sentences about what the pathology is in these mice along with the timing of the development of the pathology to compare with the results presented here. There are several types of amyloid deposits, "neuritic" plaques, diffuse plaques, and cerebrovascular amyloid. This is important because the early "neuritic" plaques are intraneuronal at least early on before the neuron dies. See (1,2).

      (6) P19, L10. "LMB is an inhibitor or CRM-1 mediated" should be of

      All minor points have been addressed in the manuscript and figures.

      References

      (1) Pensalfini, A., Albay, R., 3rd, Rasool, S., Wu, J. W., Hatami, A., Arai, H., Margol, L., Milton, S., Poon, W. W., Corrada, M. M., Kawas, C. H., and Glabe, C. G. (2014) Intracellular amyloid and the neuronal origin of Alzheimer neuritic plaques. Neurobiol Dis 71C, 53-61

      (2) Lee, J. H., Yang, D. S., Goulbourne, C. N., Im, E., Stavrides, P., Pensalfini, A., Chan, H., Bouchet-Marquis, C., Bleiwas, C., Berg, M. J., Huo, C., Peddy, J., Pawlik, M., Levy, E., Rao, M., Staufenbiel, M., and Nixon, R. A. (2022) Faulty autolysosome acidification in Alzheimer’s disease mouse models induces autophagic build-up of Abeta in neurons, yielding senile plaques. Nat Neurosci 25, 688-701

      (3) Yang, A. J., Chandswangbhuvana, D., Margol, L., and Glabe, C. G. (1998) Loss of endosomal/lysosomal membrane impermeability is an early event in amyloid Aβ1-42 pathogenesis. J. Neurosci. Res. 52, 691-698

      (4) Yang, A. J., Knauer, M., Burdick, D. A., and Glabe, C. (1995) Intracellular A beta 1-42 aggregates stimulate the accumulation of stable, insoluble amyloidogenic fragments of the amyloid precursor protein in transfected cells. J Biol Chem 270, 14786-14792

      (5) Yang, A., Chandswangbhuvana, D., Shu, T., Henschen, A., and Glabe, C. G. (1999) Intracellular accumulation of insoluble, newly synthesized Aβn-42 in APP transfected cells that have been treated with Aβ1-42. J. Biol. Chem. 274, 20650-20656

      References

      Boehmer, T., Enninga, J., Dales, S., Blobel, G., and Zhong, H. (2003). Depletion of a single nucleoporin, Nup107, prevents the assembly of a subset of nucleoporins into the nuclear pore complex. Proc Natl Acad Sci U S A 100, 981-985.

      D'Angelo, M.A., Raices, M., Panowski, S.H., and Hetzer, M.W. (2009). Age-dependent deterioration of nuclear pore complexes causes a loss of nuclear integrity in postmitotic cells. Cell 136, 284-295.

      Eftekharzadeh, B., Daigle, J.G., Kapinos, L.E., Coyne, A., Schiantarelli, J., Carlomagno, Y., Cook, C., Miller, S.J., Dujardin, S., Amaral, A.S., et al. (2018). Tau Protein Disrupts Nucleocytoplasmic Transport in Alzheimer's Disease. Neuron 99, 925-940 e927.

      Liu, J., and Hetzer, M.W. (2022). Nuclear pore complex maintenance and implications for age-related diseases. Trends Cell Biol 32, 216-227.

      Lord, A., Kalimo, H., Eckman, C., Zhang, X.Q., Lannfelt, L., and Nilsson, L.N. (2006). The Arctic Alzheimer mutation facilitates early intraneuronal Abeta aggregation and senile plaque formation in transgenic mice. Neurobiol Aging 27, 67-77.

      Mertens, J., Paquola, A.C., Ku, M., Hatch, E., Bohnke, L., Ladjevardi, S., McGrath, S., Campbell, B., Lee, H., Herdy, J.R., et al. (2015). Directly Reprogrammed Human Neurons Retain Aging-Associated Transcriptomic Signatures and Reveal Age-Related Nucleocytoplasmic Defects. Cell stem cell 17, 705-718.

      Wu, X., Kasper, L.H., Mantcheva, R.T., Mantchev, G.T., Springett, M.J., and van Deursen, J.M. (2001). Disruption of the FG nucleoporin NUP98 causes selective changes in nuclear pore complex stoichiometry and function. Proc Natl Acad Sci U S A 98, 3191-3196.

      Youmans, K.L., Tai, L.M., Kanekiyo, T., Stine, W.B., Jr., Michon, S.C., Nwabuisi-Heath, E., Manelli, A.M., Fu, Y., Riordan, S., Eimer, W.A., et al. (2012). Intraneuronal Abeta detection in 5xFAD mice by a new Abeta-specific antibody. Molecular neurodegeneration 7, 8.

  4. Apr 2024
    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to express our gratitude to the reviewers for their suggestions and critiques as we continually strive to enhance the quality of the manuscript. We improved it by incorporating the reviewers’ suggestions, changing the content and numbering of figures (Figs 1, 3S1 were edited; 4 figures were moved to supplemental materials), and adding several analyses suggested by the reviewers along with accompanying figures (1S2, 1S3) and tables (1 and 2). These analyses include investigating the link between freezing behavior and 44-kHz calls as well as their sound mean power and duration. We have also added detailed information about the experiments performed and expanded the description and discussion of the results. Finally, we added information about 44-kHz calls reported by another group, in a report that was inspired by our findings.

      Below is the point-by-point response to the reviewers’ comments.

      Reviewer #1 (Public Review):

      Olszyński and colleagues present data showing variability from canonical "aversive calls", typically described as long 22 kHz calls rodents emit in aversive situations. Similarly long but higher-frequency (44 kHz) calls are presented as a distinct call type, including analyses both of their acoustic properties and animals' responses to hearing playback of these calls. While this work adds an intriguing and important reminder, namely that animal behavior is often more variable and complex than perhaps we would like it to be, there is some caution warranted in the interpretation of these data. The authors also do not provide adequate justification for the use of solely male rodents. With several reported sex differences in rat vocal behaviors this means caution should be exercised when generalizing from these findings.

      We fully agree that our data should be interpreted with caution and we followed the Reviewer’s suggestions along these lines (see below). Also, we appreciate the suggestion to explore the prevalence of 44-kHz calls in female subjects, which would indeed represent an important and intriguing extension of our research. However, due to present financial constraints, we can only plan such experiments. To address the comment, we have added the sentence: “Here we are showing introductory evidence that 44-kHz vocalizations are a separate and behaviorally-relevant group of rat ultrasonic calls. These results require further confirmations and additional experiments, also in form of repetition, including research on female rat subjects.”

      It is important to note that the data presented in the current manuscript originate primarily from previously conducted experiments. These earlier experiments employed male subjects only, owing to established evidence that the female estrus cycle significantly influences ultrasonic vocalization (Matochik et al., 1992). Controlling for the estrus cycle would require a greater number of female subjects than males, which would not only increase animal suffering but also escalate the demands on human labor and financial costs.

      Firstly, the authors argue that the shift to higher-frequency aversive calls is due to an increase in arousal (caused by the animals having received multiple aversive foot shocks towards the end of the protocols). However, it cannot be ruled out that this shift would be due to factors such as the passage of time and increase in fatigue of the animals as they make vocalizations (and other responses) for extended periods of time. In fact the gradual frequency increase reported for 22 kHz calls and the drop in 44 kHz calls the next day in testing is in line with this.

      Answer: We would like to point out that the “increased-arousal” hypothesis, declared in the manuscript, is only a hypothesis – as reflected by the wording used. However, we changed the beginning of the sentence in question from “It could be argued” to “We would like to propose a hypothesis” to emphasize the speculative aspect of the proposed explanation behind the increase of 44-kHz ultrasonic emissions.

      Also, we do agree that other factors could contribute to the increased emission of 44-kHz calls. These factors could include: heightened fear, stress/anxiety, annoyance/anger, disgust/boredom, grief/sadness, despair/helplessness, and weariness/fatigue. We list these potential factors in the discussion. Also, we added: “It is not possible, at this stage, to determine which factors played a decisive role. Please note that the potential contribution of these factors is not mutually exclusive”. However, we propose a list of arguments supporting the idea that 44-kHz vocalizations communicate an increased negative emotional state. Among these arguments are the conclusions drawn from additional analyses, mostly inspired by the fatigue hypothesis proposed by Reviewer #1. In particular, we investigated changes in the sound mean power and duration of 22-kHz and 44-kHz calls. Specifically, we showed that the mean power of 44-kHz vocalizations did not change, and was higher than that of 22-kHz vocalizations (Fig. 1S2EF).

      Finally, Reviewer #1 listed “the gradual frequency increase reported for 22 kHz calls and the drop in 44 kHz calls the next day” as arguments for the fatigue hypothesis. We do not agree that the “increase” should be interpreted as a sign of fatigue (producing and maintaining higher-frequency calls requires greater effort from the vocalizer, a point on which we elaborated in the manuscript). We are also not sure which “drop in 44 kHz calls” the Reviewer is referring to (we assume it refers to fewer 44-kHz calls during testing vs. training; we suppose that the levels of arousal are lower in the test due to the shorter session time and lack of shocks, which additionally contributes to fear extinction).

      Secondly, regarding the analysis where calls were sorted using DBSCAN based on peak frequency and duration, it is not surprising that the calls cluster based on frequency and duration, i.e. the features that are used to define the 44 kHz calls in the first place. Thus presenting this clustering as evidence of them being truly distinct call types comes across as a circular argument.

      Answer: The DBSCAN sorting results were meant to convey that, as the clustering ε value (which sets the degree of cluster separation) was changed, the 44-kHz vocalizations remained distinct while the 22-kHz and various short-call clusters merged. In other words, 44-kHz calls remained separate from long 22-kHz, short 22-kHz and 50-kHz vocalizations, which all consolidated into one common cluster. As a result, in this mathematical analysis, 44-kHz vocalizations remained distinct without applying human biases. Additionally, frequency and duration are the two most common features used to define all types of calls (Barker et al., 2010; Silkstone & Brudzynski, 2019a, 2019b; Willey & Spear, 2013). In summary, we did not expect the analysis to isolate the 44-kHz calls, and we were surprised by this result.
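
      The behavior described here (some clusters merging as ε grows while one group stays apart) can be illustrated with a minimal DBSCAN sketch. Everything in it is an assumption made for illustration: the points are invented toy values in arbitrary scaled units of (peak frequency, duration), and the ε and min_pts settings are not the values used in the manuscript.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: return one cluster label per point (NOISE = outlier)."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i (the point itself included).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


# Invented toy data: (scaled peak frequency, scaled duration).
short_low = [(0, 0), (1, 0), (0, 1), (1, 1)]
short_high = [(4, 0), (5, 0), (4, 1), (5, 1)]
long_low = [(0, 4), (1, 4), (0, 5), (1, 5)]
long_high = [(9, 9), (10, 9), (9, 10), (10, 10)]
points = short_low + short_high + long_low + long_high

fine = dbscan(points, eps=1.5, min_pts=3)    # small eps: four separate groups
coarse = dbscan(points, eps=3.5, min_pts=3)  # larger eps: three groups merge
print(len(set(fine)), len(set(coarse)))
```

      In this toy layout the long, high-frequency group stays in its own cluster at both settings only because it is far from the other points in both features; whether real recordings behave the same way depends on the data and on how the features are scaled.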

      The sparsity of calls in the 30-40 kHz range (shown in the individual animal panels in Figure 2C) could in theory be explained by some bioacoustics properties of rat vocal cords, without necessarily the calls below and above that range being ethologically distinct.

      Answer: We respectfully disagree with the argument regarding sparsity. It is important to note that, during prolonged fear conditioning experiments, we observed an increased incidence of 44-kHz calls (Fig. 1E-G) of up to >19% (Fig. 1S2AB) of the total ultrasonic vocalizations during specific inter-trial intervals. Also, it is possible that, in the observed experimental circumstances, almost every fifth call could be attributed to the vocal apparatus as an artifact of its functioning (assuming we are interpreting the Reviewer’s argument correctly). While we do not believe this to be the case, we acknowledge the importance of considering such a hypothesis.

      The behavioral response to call playback is intriguing, although again more in line with the hypothesis that these are not a distinct type of call but merely represent expected variation in vocalization parameters. Across the board animals respond rather similarly to hearing 22 kHz calls as they do to hearing 44 kHz calls, with occasional shifts of 44 kHz call responses to an intermediate between appetitive and aversive calls. This does raise interesting questions about how, ethologically, animals may interpret such variation and integrate this interpretation in their responses. However, the categorical approach employed here does not address these questions fully.

      Answer: We are unsure of the Reviewer’s critique in this paragraph and will attempt to address it to the best of our understanding. Our finding of up to >19% of long, seemingly aversive 44-kHz calls, at a frequency in the defined appetitive ultrasonic range (usually >32 kHz), is unexpected rather than “expected”. We would agree that aversive call variation is expected, but not in the appetitive frequency range.

      Kindly note the findings by Saito et al. (2019), which claim that frequency band plays the main role in rat ultrasonic perception. It is possible that the higher peak frequency of 44-kHz calls may be a strong factor in their perception by rats, which is, however, modified by the longer duration and the lack of modulation.

      Also, in our experience, it is quite challenging to demonstrate different behavioral responses of naïve rats to pre-recorded 22-kHz (aversive) vs. 50-kHz (appetitive) vocalizations. We would therefore expect demonstrating a difference in response to two distinct, potentially aversive calls, i.e., 22-kHz vs. 44-kHz calls, to be even more difficult (to our knowledge, a comparable experiment with short vs. long 22-kHz ultrasonic vocalizations has not been done before).

      Therefore, we do not take lightly the surprising and interesting finding that “animals respond rather similarly to hearing 22 kHz calls as they do to hearing 44 kHz calls, with occasional shifts of 44 kHz call responses to an intermediate between appetitive and aversive calls”. We would rather put this description in analogous words: “the rats responded similarly to hearing 44-kHz calls as they did to hearing aversive 22-kHz calls, especially regarding heartrate change, despite the 44-kHz calls occupying the frequency band of appetitive 50-kHz vocalizations” and “other responses to 44-kHz calls were intermediate, they fell between response levels to appetitive vs. aversive playback” – which we added to the Discussion.

      Finally, we acknowledge that our findings do not present a finite and complete picture of the discussed aspects of behavioral responses to the presented ultrasonic stimuli (44-kHz vocalizations). Therefore, we have incorporated the Reviewer’s suggestion in the discussion. The added sentence reads: “Overall, these initial results raise further questions about how, ethologically, animals may interpret the variation in hearing 22-kHz vs. 44-kHz calls and integrate this interpretation in their responses.”

      In sum, rather than describing the 44kHz long calls as a new call type, it may be more accurate to say that sometimes aversive calls can occur at frequencies above 22 kHz. Individual and situational variability in vocalization parameters seems to be expected, much more so than all members of a species strictly adhering to extremely non-variable behavioral outputs.

      Answer: The surprising fact that there are presumably aversive calls that lie beyond the commonly applied thresholds, i.e. >32 kHz, while sharing some characteristics with 22-kHz calls, is the main finding of the current publication. Whether they are ultimately classified as a new type or subtype (i.e. a separate category), or grouped with 22-kHz vocalizations into a supergroup of aversive calls, is of secondary importance and remains to be discussed with other researchers in the field.

      However, we would argue, by way of comparison, that 22-kHz calls occur at durations of <300 ms and also >300 ms, and are usually referred to in the literature as short and long 22-kHz vocalizations, respectively (not introduced with a description that “sometimes 22-kHz calls can occur at durations below 300 ms”). These are then regarded and investigated as separate groups or classes, usually referred to as two different “types” (e.g., Barker et al., 2010) or “subtypes” (e.g., Brudzynski, 2015). Analogously, 44-kHz vocalizations can also be regarded as a separate type or a subtype of 22-kHz calls. The problem with the latter is that 22-kHz vocalizations are traditionally and predominantly defined by the 18–32 kHz frequency band (Araya et al., 2020; Barroso et al., 2019; Browning et al., 2011; Brudzynski et al., 1993; Hinchcliffe et al., 2022; Willey & Spear, 2013).
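
      The conventional threshold scheme referred to above can be written out as a short sketch. It assumes only the cut-offs mentioned in this exchange (an 18–32 kHz band for 22-kHz calls, a 300 ms boundary between short and long calls, and >32 kHz for the appetitive band); the function name, the handling of a call lasting exactly 300 ms, and the "unclassified" label are hypothetical choices, and this is not the scoring procedure used in the manuscript.

```python
def classify_call(peak_khz, duration_s):
    """Assign a call to a conventional category from two features.

    Hypothetical helper: thresholds follow the 18-32 kHz / 300 ms / >32 kHz
    conventions quoted in the text; boundary handling is an assumption.
    """
    if 18 <= peak_khz <= 32:
        # Calls of exactly 0.3 s are treated as long here (assumption).
        return "short 22-kHz" if duration_s < 0.3 else "long 22-kHz"
    if peak_khz > 32:
        # The conventional scheme has no duration criterion above 32 kHz.
        return "50-kHz"
    return "unclassified"


# Illustrative (invented) calls: peak frequency in kHz, duration in seconds.
for peak, dur in [(22, 1.0), (25, 0.1), (55, 0.05), (44, 1.0)]:
    print(peak, dur, classify_call(peak, dur))
```

      Under this rule a long, flat call peaking near 44 kHz lands in the same bin as brief 50-kHz calls, because nothing above 32 kHz is separated by duration. That is the classification problem the paragraph above describes.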

      Reviewer #2 (Public Review):

      Olszyński et al. claim that they identified a "new-type" ultrasonic vocalization around 44 kHz that occurs in response to prolonged fear conditioning (using foot-shocks of relatively high intensity, i.e. 1 mA) in rats. Typically, negative 22-kHz calls and positive 50-kHz calls are distinguished in rats, commonly by using a frequency threshold of 30 or 32 kHz. Olszyński et al. now observed so-called "44-kHz" calls in a substantial number of subjects exposed to 10 tone-shock pairings, yet call emission rate was low (according to Fig. 1G around 15%, according to the result text around 7.5%).

      Answer: We thank the Reviewer for praising the strengths. Please note that Figure 1G refers to the 10-trial Wistar rats during the delay fear conditioning session, in which 44-kHz calls constituted 14.1% of ultrasonic vocalizations. The 7.5% figure in the results refers to the total of vocalizations analyzed across all animal groups used in fear conditioning experiments. These values have been updated in the current version of the manuscript. Also, please note that 44-kHz calls constituted up to 19.4% of calls, on average, in one of the ITIs during the fear conditioning session. However, the prevalence of aversive calls, and of 44-kHz vocalizations in particular, varied between individual rats; we added the text: “for n = 3 rats, 44-kHz vocalizations accounted for >95% of all calls during at least one ITI (e.g., 140 of total 142, 222 of 231, and 263 of 265 tallied 44-kHz calls), and in n = 9 rats, 44-kHz vocalizations constituted >50% of calls in more than one ITI.” See also further for the description of the array of experiments analyzed and the prevalence/percentage of 44-kHz calls encountered (Tab. 1, Fig. 1S3).
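
      As a quick arithmetic check, the three per-ITI tallies quoted above can be converted to percentages. The counts are taken directly from the quoted sentence; the variable names are illustrative only.

```python
# (44-kHz calls, all calls) for the three example ITIs quoted in the text.
tallies = [(140, 142), (222, 231), (263, 265)]

# Share of 44-kHz calls in each ITI, in percent, rounded to one decimal.
shares = [round(100 * k / n, 1) for k, n in tallies]

for (k, n), share in zip(tallies, shares):
    print(f"{k} of {n}: {share}%")

# Each example should exceed the >95% threshold stated in the quote.
print(all(share > 95 for share in shares))
```

      All three examples come out above 95% (roughly 98.6%, 96.1% and 99.2%), consistent with the wording of the added sentence.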

      Weaknesses: I see a number of major weaknesses.

      While the descriptive approach applied is useful, the findings have only focused importance and scope, given the low prevalence of "44 kHz" calls and limited attempts made to systematically manipulate factors that lead to their emission. In fact, the data presented appear to be derived from reanalyses of previously conducted studies in most cases and the main claims are only partially supported. While reading the manuscript, I got the impression that the data presented here are linked to two or three previously published studies (Olszyński et al., 2020, 2021, 2023). This is important to emphasize for two reasons:

      (1) It is often difficult (if not impossible) to link the reported data to the different experiments conducted before (and the individual experimental conditions therein). While reanalyzing previously collected data can lead to important insight, it is important to describe in a clear and transparent manner what data were obtained in what experiment (and more specifically, in what exact experimental condition) to allow appropriate interpretation of the data. For example, it is said that in the "trace fear conditioning experiment" both single- and group-housed rats were included, yet I was not able to tell what data were obtained in single- versus group-housed rats. This may sound like a side aspect, however, in my view this is not a side aspect given the fact that ultrasonic vocalizations are used for communication and communication is affected by the social housing conditions.

      Answer: In preparing the current manuscript, we indeed used data collected during fear conditioning experiments which were described previously (Olszyński et al., 2021; Olszyński et al., 2022). Please note, however, that vocalization behavior during the fear conditioning itself was not the main subject of these publications. Our previous publications (Olszyński et al., 2020; Olszyński et al., 2021; Olszyński et al., 2022) primarily present ultrasonic-vocalization data from the playback part of the experiments, whereas here we analyze recordings obtained during the fear conditioning part; thus we are analyzing new, i.e., not yet analyzed, parts of previously published studies. Also, we have performed additional experiments.

      In the first version of the current manuscript, we did not attempt to demonstrate exactly which calls were recorded in which conditions, as the focus was to demonstrate that 44-kHz calls were emitted in several different fear-conditioning experiments. Also, as the experiments were not performed simultaneously and the results come from different experimental situations, we would prefer not to compare these results directly.

      However, in the current version of the manuscript, we have introduced an additional reference system, based on Tab. 1, to indicate more clearly which rats were used in each analysis, e.g. the group of “Wistar rats that underwent 10 trials of fear conditioning” is described as “Tab. 1/Exp. 1-3/#2,4,8,13; n = 46”, i.e., these are the rats listed in rows 2, 4, 8, and 13 of Tab. 1.

      We have also tried to unify the analyses, in terms of rats used, as much as possible. Finally, we have also introduced Fig. 1S3 to demonstrate the prevalence of 44-kHz calls in all experiments analyzed with the note that “the experiments were not performed in parallel”.

      Regarding the Reviewer’s concerns about analyzing single- and pair-housed rats together, we have examined the ultrasonic vocalizations emitted and the freezing behavior in these two groups.

      • Ultrasonic vocalizations. When comparing the number of vocalizations, their duration, peak frequency and latency to first occurrence, both across all call types together and for each type separately (short 22-kHz, long 22-kHz, 44-kHz, 50-kHz), the only difference observed was in the peak frequency of 50-kHz vocalizations (50.7 ± 2.8 kHz for paired vs. 61.8 ± 3.1 kHz for single rats; p = 0.0280, Mann-Whitney). Since 50-kHz calls are not the subject of the current publication, we did not investigate this difference further. Also, this difference was not observed during playback experiments (Olszyński et al., 2020, Tab. 1).

      • Freezing. There were no differences between single- and pair-housed groups in freezing behavior, both in the time before first shock presentation and during fear conditioning training (Mann-Whitney).

      In summary, since the two groups did not differ in relevant ultrasonic features and freezing, we decided to present the results obtained from these rats together. However, we agree with the Reviewer, and it is possible that social housing conditions may in fact affect the emission of 44-kHz vocalizations, which could be a subject of another project – involving, e.g., larger experimental groups observed under hypothesis-oriented and defined conditions.

      (2) In at least two of the previously published manuscripts (Olszyński et al., 2021, 2023), emission of ultrasonic vocalizations was analyzed (Figure S1 in Olszyński et al., 2021, and Fig. 1 in Olszyński et al., 2023). This includes detailed spectrographic analyses covering the frequency range between 20 and 100 kHz, i.e. including the frequency range, where the "new-type" ultrasonic vocalization, now named "44 kHz" call, occurs, as reflected in the examples provided in Fig. 1 of Olszyński et al. (2023). In the materials and methods there, it was said: "USV were assigned to one of three categories: 50-kHz (mean peak frequency, MPF >32 kHz), short 22-kHz (MPF of 18-32 kHz, <0.3 s duration), long 22-kHz (MPF of 18-32 kHz, >0.3 s duration)". Does that mean that the "44 kHz" calls were previously included in the count for 50-kHz calls? Or were 44 kHz calls (intentionally?) left out? What does that mean for the interpretation of the previously published data? What does that mean for the current data set? In my view, there is a lack of transparency here.

      Answer: As mentioned above, we indeed used data collected during fear conditioning experiments which were described previously (Olszyński et al., 2021; Olszyński et al., 2022). However, in these publications, ultrasonic vocalizations emitted during playback experiments were the main subject, while the ultrasonic calls emitted during fear conditioning (performed before the playback) were only analyzed in a preliminary way. As a result, the 44-kHz vocalizations analyzed in the current manuscript were not included in the previous analyses. In particular, in Olszyński et al. (2021), we counted the overall number of ultrasonic vocalizations before the fear conditioning session to determine the basal ultrasonic emissions (Fig. S1). Then, in our next article (Olszyński et al., 2022), we again analyzed the number of all ultrasonic vocalizations before fear conditioning (Fig. S1) and restricted the analysis of vocalizations during fear conditioning to 22-kHz calls (Tab. S1 and S2).

      Also, we re-reviewed all the data used in our previous playback publications. Overall, 44-kHz calls were extremely rare in playback parts of the experiments. There were no 44-kHz calls in the playback data used in Olszyński et al. (2022) and Olszyński et al. (2020). In Olszyński et al. (2021), one rat produced eight 44-kHz calls. These 44-kHz calls constituted 0.03% of all vocalizations analyzed in the experiment (8/24888) and were included in the total number of calls analyzed (but not in the 50-kHz group); they were not described in further detail in that publication.
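
      The reviewer's concern can be made concrete with a minimal sketch of the three-category rule quoted in the comment above. The sketch is illustrative only and is not the authors' analysis code: the `Call` type, the function name, the "unclassified" label for calls below 18 kHz, and the treatment of a call lasting exactly 0.3 s are all assumptions. It shows that a rule based on mean peak frequency alone would place a long, flat call of about 44 kHz in the 50-kHz category, which is why the answer above specifies how such calls were actually handled.

```python
from dataclasses import dataclass


@dataclass
class Call:
    mean_peak_khz: float  # mean peak frequency (MPF) in kHz
    duration_s: float     # call duration in seconds


def classify_three_way(call: Call) -> str:
    """Apply the three-category rule quoted in the reviewer's comment.

    Assumptions (not stated in the quoted rule): calls with MPF below
    18 kHz are labelled "unclassified", and a call lasting exactly
    0.3 s is treated as long.
    """
    if call.mean_peak_khz > 32.0:
        return "50-kHz"
    if call.mean_peak_khz >= 18.0:
        return "short 22-kHz" if call.duration_s < 0.3 else "long 22-kHz"
    return "unclassified"


# Hypothetical example calls, not measured data.
flat_high_call = Call(mean_peak_khz=44.0, duration_s=1.2)
alarm_call = Call(mean_peak_khz=22.0, duration_s=1.2)
print(classify_three_way(flat_high_call))
print(classify_three_way(alarm_call))
```

      Separating such calls from 50-kHz calls therefore needs at least one further criterion, for example duration or frequency modulation; which criterion is appropriate is a question for the manuscript's methods, not something this sketch settles.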

      Moreover, whether the newly identified call type is indeed novel is questionable, as also mentioned by the authors in their discussion section. While they wrote in the introduction that "high-pitch (>32 kHz), long and monotonous ultrasonic vocalizations have not yet been described", they wrote in the discussion that "long (or not that long (Biały et al., 2019)), frequency-stable high-pitch vocalizations have been reported before (e.g. Sales, 1979; Shimoju et al., 2020), notably as caused by intense cholinergic stimulation (Brudzynski and Bihari, 1990) or higher shock-dose fear conditioning (Wöhr et al., 2005)" (and I wish to add that to my knowledge this list provided by the authors is incomplete). Therefore, I believe, the strong claims made in abstract ("we are the first to describe a new-type..."), introduction ("have not yet been described"), and results ("new calls") are not justified.

      Answer: We would argue that 44-kHz vocalizations were indeed reported but not described. As far as we are aware, an in-depth analysis of the properties and experimental circumstances of emission of long, high-frequency calls has not yet been performed. The researchers we cite observed calls that were, at least to a degree, similar to the ones we observed, as we mentioned in the discussion section. However, since these previously reported vocalizations were not fully described, we can only guess that they may be similar to ours. We speculate that, perhaps like us, these researchers unknowingly recorded 44-kHz calls in their experiments and may also be able to describe them more extensively when re-analyzing their data, as we have done here.

      Possibly, reports on vocalizations similar to the 44-kHz calls that we observed were difficult to find because of the canonical and accepted definitions of ultrasonic vocalization types. Biały et al. (2019) allocated them to the 22-kHz group, perhaps because their calls were often of a step variation having both low and high components. Shimoju et al. (2020) grouped them with 50-kHz vocalizations because they appeared during stroking of rats held vertically; this procedure was compared to tickling, which usually elicits appetitive calls.

      Reviewer #2 states that there are other publications that would complete the list. We are aware of other articles authored by the same team as Shimoju et al. (2020) with different first authors. However, they report findings similar to those of the cited article. Otherwise, we would gladly cite a more complete list of publications showing atypical, long, monotonous high-frequency vocalizations, similar to those observed in our experiments. Therefore, we would argue that ultrasonic vocalizations which were long, flat, high in frequency, and repeatedly occurring in a defined behavioral situation, have not been reported before. However, concerning the strong claims of novelty of our finding, we toned them down where we found this was warranted.

      In general, the manuscript is not well written/ not well organized, the description of the methods is insufficient, and it is often difficult (if not impossible) to link the reported data to the experiments/ experimental conditions described in the materials and methods section.

      Answer: The description of the methods has been adjusted and expanded. We added the requested link to each particular experiment in the form “Tab. 1/Exp. nos./# nos.”, which shows, each time, which experiments and experimental groups were analyzed. The list of the experiments and groups is given in Tab. 1.

      For example, I miss a clear presentation of basic information: 1) How many rats emitted "44 kHz" calls (in total, per experiment, and importantly, also per experimental condition, i.e. single- versus group-housed)?

      Answer: We now clearly show which experiments were performed and how many animals were tested in each condition (Tab. 1), while the prevalence of 44-kHz calls amongst experimental conditions and animal groups is shown in Fig. 1S3. Also, we included information regarding the number of animals and treatment of each group of rats when reporting results. For example, we state that:

      (1a) “53 of all 84 conditioned Wistar rats (Tab. 1/Exp. 1-3/#2,4,6-8,13, Figs 1B, 1E, 1S1BC) displayed” 44-kHz vocalizations – as a general assessment; these numbers differ from those in the first version of the manuscript, which referred only to Wistar rats conditioned 6 or 10 times.

      (1b) “From this group of rats (n = 46), n = 41 (89.1%) emitted long 22-kHz calls, and 32 of them (69.6%) emitted 44-kHz calls” – this time referring only to 10-times conditioned Wistar rats as the biggest group that could be analyzed together (Figs 1F, 1G, 1S2A).

      (1c) “for n = 3 rats, 44-kHz vocalizations accounted for >95% of all calls during at least one ITI (e.g., 140 of total 142, 222 of 231, and 263 of 265 tallied 44-kHz calls), and in n = 9 rats, 44-kHz vocalizations constituted >50% of calls in more than one ITI.”

      (2) Out of the ones emitting "44 kHz" calls, what was the prevalence of "44 kHz" calls (relative to 22- and 50-kHz calls, e.g. shown as percentage)?

      Answer: The prevalence of 44-kHz vocalizations in all investigated experiments and groups is shown in Fig. 1S3CD. Also, more information regarding the percentage of 44-kHz calls is shown in Fig. 1S2AB, where we calculated the distribution of 44-kHz versus 22-kHz calls in Wistar rats, in 10-trial fear conditioning, across the length of the session.

      Additionally, the values are listed in the sentence regarding all Wistar rats which underwent 10 trials of fear conditioning: “these vocalizations were less frequent following the first trial (1.2 ± 0.4% of all calls), and increased in subsequent trials, particularly after the 5th (8.8 ± 2.8%), through the 9th (19.4 ± 5.5%, the highest value), and the 10th (15.5 ± 4.9%) trials, where 44-kHz calls gradually replaced 22-kHz vocalizations in some rats (Fig. 1F, 1S2B, Video 1; comp Fig. 1D vs. 1E).”

      (3) How did this ratio differ between experiments and experimental conditions?

      Answer: The prevalence of 44-kHz vocalizations in all experimental conditions is shown in Fig. 1S3. However, the direct comparison of results obtained in different conditions was not the goal of the present work. Also, we would argue that such direct comparisons of results from different experiments would not be valid. These experiments were done with different groups of animals, at different times, with different timetables of experimental manipulations.

      However, we are comfortable stating that:

      • There were more 44-kHz vocalizations during fear conditioning training than testing in all fear-conditioned Wistar rats;

      • We observed more 44-kHz vocalizations in Wistar rats compared to SHR.

      (4) Was there a link to freezing? Freezing was apparently analyzed before (Olszyński et al., 2021, 2023) and it would be important to see whether there is a correlation between "44-kHz" calls and freezing. Moreover, it would be important to know what behavior the rats are displaying while such "44-kHz" calls are emitted? (Note: Even not all 22-kHz calls are synced to freezing.) All this could help to substantiate the currently highly speculative claims made in the discussion section ("frequency increases with an increase in arousal" and "it could be argued that our prolonged fear conditioning increased the arousal of the rats with no change in the valence of the aversive stimuli"). Such more detailed analyses are also important to rule out the possibility that the "new-type" ultrasonic vocalization, the so-called "44 kHz" call, is simply associated with movement/ thorax compression.

      Answer: We analyzed freezing behavior and its association with ultrasonic emissions. The emission of 44-kHz vocalizations was associated with freezing. The results are now described and presented in the manuscript, i.e., Tab. 2, its legend and the description in Results: “Freezing during the bins of 22-kHz calls only (p < 0.0001, for both groups) and during 44-kHz calls only bins (p = 0.0003) was higher than during the first 5 min baseline freezing levels of the session. Also, the freezing associated with emissions of 44-kHz calls only was higher than during bins with no ultrasonic vocalizations (p = 0.0353), and it was also 9.9 percentage points higher than during time bins with only long 22-kHz vocalizations, but the difference was not significant (p = 0.1907; all Wilcoxon)” and “To further investigate this potential difference, we measured freezing during the emission of randomly selected single 44-kHz and 22-kHz vocalizations. The minimal freezing behavior detection window was reduced to compensate for the higher resolution of the measurements (3, 5, 10, or 15 video frames were used). There was no difference in freezing during the emission of 44-kHz vs. 22-kHz vocalizations for ≥150-ms-long calls (3 frames, p = 0.2054) and for ≥500-ms-long calls (5 frames, p = 0.2404; 10 frames, p = 0.4498; 15 frames, p = 0.7776; all Wilcoxon, Tab. 2B).”

      Please note that the general observation that "frequency increases with an increase in arousal" is not our claim but a general rule derived from a large body of observations and proposed by others (Briefer et al., 2012); we changed the wording of this statement to: “frequency usually increases with an increase in arousal (Briefer et al., 2012)”.

      The figures currently included are purely descriptive in most cases - and many of them are just examples of individual rats (e.g. majority of Fig. 1, all of Fig. 2 to my understanding, with the exception of the time course, which in case of D is only a subset of rats ("only rats that emitted 44-kHz calls in at least seven ITI are plotted" - is there any rationale for this criterion?)), or, in fact, just representative spectrograms of calls (all of Fig. 3, with the exception of G, all of Fig. 4).

      Answer: Please note that the former figures 2, 4, 6, and 8 have now been moved to supplementary figures 1S1, 2S1, 3S1, and 4S1 to better organize the presentation of data. Figures 1, 3, 5, 7 are now 1, 2, 3, 4, respectively. With regard to presenting data from individual rats, this was to show the general patterns of ultrasonic-call distributions observed. Showing the full data set as seen in Fig. 5A (now Fig. 3A) would obscure the readability of the graph without using mathematical clustering techniques such as DBSCAN.

      Concerning Reviewer #2’s question regarding the criterion of “minimum seven ITI”, we selected the highest vocalizers by taking animals above the 75th percentile of the number of ITI with 44-kHz calls. However, in the current version of the manuscript, we decided to omit this part of the analysis and the accompanying part of the figure, since it did not provide any additional informative value (apart from employing a questionable criterion).
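
      The selection rule described above (animals above the 75th percentile of the number of ITI with 44-kHz calls) can be sketched as follows. This is a hedged illustration, not the authors' procedure: the function name, the rat identifiers, the counts, and the choice of the "inclusive" percentile method are all assumptions, and a different percentile definition could shift the cutoff.

```python
from statistics import quantiles


def top_vocalizers(iti_counts, percentile=75):
    """Return (threshold, rats) for rats strictly above the given percentile.

    iti_counts maps a rat identifier to the number of ITI in which that
    rat emitted 44-kHz calls. The "inclusive" method is an assumption;
    the percentile definition used in the manuscript is not stated.
    """
    values = sorted(iti_counts.values())
    cut_points = quantiles(values, n=100, method="inclusive")
    threshold = cut_points[percentile - 1]
    selected = sorted(rat for rat, n in iti_counts.items() if n > threshold)
    return threshold, selected


# Hypothetical counts for nine rats, not data from the study.
example_counts = {f"r{i + 1}": i for i in range(9)}
threshold, selected = top_vocalizers(example_counts)
print(threshold, selected)
```

      With these made-up counts the cutoff falls at six ITI, so only rats with seven or more ITI are retained. The example also shows why such a criterion is data-dependent: the threshold moves with the sample, which is one reason a fixed "at least seven ITI" rule is hard to justify a priori.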

      Moreover, the differences between Fig. 5 and Fig. 6 are not clear to me. It seems Fig. 5B is included three times - what is the benefit of including the same figure three times?

      Answer: We hope that designating Fig. 6 as supplementary to Fig. 5 (now Figs 3S1 and 3, respectively) will make interpreting them more streamlined. Fig. 6A (now Fig. 3S1A) is a more detailed look at the information presented in Fig. 5B (now Fig. 3B), with spectrogram images of ultrasonic vocalizations from different areas of the plot. Also, Fig. 3B (former Fig. 5B) was removed from Fig. 3S1B (former Fig. 6B).

      A systematic comparison of experimental conditions is limited to Fig. 7 and Fig. 8, the figures depicting the playback results (which led to the conclusion that "the responses to 44-kHz aversive calls presented from the speaker were either similar to 22-kHz vocalizations or in between responses to 22-kHz and 50-kHz playbacks", although it remains unclear to me why differences were seen b e f o r e the experimental manipulation, i.e. the different playback types in Fig. 8B).

      Answer: There were indeed instances of such before-differences. Such differences were observed in our previous studies (Olszyński et al., 2020, Tabs S9-12; Olszyński et al., 2021, Tab. S7; Olszyński et al., 2022, Tabs S4, S9, S13, S17, S18) and were most likely due to the multiple comparisons performed. However, we think that the carry-over effect mentioned by Reviewer #2 (see below) also played a role.

      Related to that, I miss a clear presentation of relevant methodological aspects: 1) Why were some rats single-housed but not the others?

      Answer: As stated before, data were collected from our previous experiments, and the observation of 44-kHz vocalizations in fear conditioning was an emergent discovery made when we decided to analyze ultrasonic recordings from fear conditioning procedures. Single-housed animals were part of our experiment comparing the effects of fear conditioning and social situation on the perception of ultrasonic playback, as described in Olszyński et al. (2020). Aside from this experiment, all other rats were housed in pairs.

      (2) Is the experimental design of the playback study not confounded? It is said that "one group (n = 13) heard 50-kHz appetitive vocalization playback while the other (n = 16) 22-kHz and 44-kHz aversive calls". How can one compare "44 kHz" calls to 22- and 50-kHz calls when "44 kHz" calls are presented together with 22-kHz calls but not 50-kHz calls? What about carry-over effects? Hearing one type of call most likely affects the response to the other type of call. It appears likely that rats are a bit more anxious after hearing aversive 22-kHz calls, for example. Therefore, it would not be very surprising to see that the response to "44 kHz" calls is more similar to 22-kHz calls than 50-kHz calls.

      Of note, in case of the other playback experiment it is just said that rats "received appetitive and aversive ultrasonic vocalization playback" but it remains unclear whether "44 kHz" calls are seen as appetitive or aversive. Later it says that "rats were presented with two 10-s-long playback sets of either 22-kHz or 44-kHz calls, followed by one 50-kHz modulated call 10-s set and another two playback sets of either 44-kHz or 22-kHz calls not previously heard" (and I wonder what data set was included in the figures and how - pooled?). Again, I am worried about carry-over effects here. This does not seem to be an experimental design that allows to compare the response to the three main call types in an unbiased manner.

      Answer: We apologize for the confusing and overly brief original description of the playback experiments. We wanted to avoid confusion associated with including several additional playback signals (please note that some are not related to the current comparisons and include different 50-kHz ultrasonic subtypes and two different subtypes of short 22-kHz calls). We lengthened the description of these playback experiments in the current version.

      In general, including more than one type of ultrasonic calls as playback has a risk of a carry-over effect as well as a habituation effect (the responses become weak). However, it greatly reduces the number of required animals. Finally, regarding the first experiment, we chose 3 playbacks to compare the rats’ reactions, as this was the most conservative choice we thought of.

      We would like to highlight that we wanted to compare specifically the rats’ responses to 22-kHz vs. 44-kHz playback (as well as the effects of playback of different subtypes of 50-kHz calls, which is not the subject of the current work). Therefore, we would argue that the design of both experiments is actually unbiased regarding this key comparison (responses to 22-kHz vs. 44-kHz playback). In both experiments, 22-kHz and 44-kHz playbacks were included in the same sequences of stimuli and counterbalanced regarding their order (i.e., taking into account possible carry-over effects), and presented to the same rats. We regarded the group of rats that heard 50-kHz recordings as a baseline/control, since we know from previous playback studies what reactions to expect from rats exposed to these vocalizations (and 22-kHz playback), while in the second experiment, we reduced the 50-kHz playback to one set in order to minimize possible habituation to multiple playbacks.

      We agree that the design of both experiments does not allow for full comparison of the effects of aversive playbacks to 50-kHz playback. Also, we agree that some carry-over effects could play a role. It was mentioned in the discussion: “Please factor in potential carryover effects (resulting from hearing playbacks of the same valence in a row) in the differences between responses to 50-kHz vs. 22/44-kHz playbacks, especially, those observed before the signal (Fig. 4AB).” However, we would still argue that the observed lack of difference in heart-rate response (Fig. 4A) and the differences regarding the number of 50-kHz calls emitted (e.g., Fig. 4S1F) are free of the constraints raised by Reviewer #2.

      We acknowledge that our studies do not give a complete picture of 44-kHz ultrasonic perception in relation to other ultrasonic bands and, given the possibility, we would like to perform more in-depth and focused experiments to study this aspect of 44-kHz calls in the future.

      Finally, regarding the second experiment, the description of the rats now includes that they “received 22-kHz, 44-kHz, and 50-kHz ultrasonic vocalization playback”, while the description of the experiment itself includes: “Responses to the pairs of playback sets were averaged”.

      Of note, what exactly is meant by "control rats" in the context of fear conditioning is also not clear to me. One can think of many different controls in a fear conditioning experiment.

      More concrete information is needed.

      Answer: This information was included in our previous publications. It is now also provided in the methods section of the current version of the manuscript. In general, control rats were subjected to the same procedures but did not receive electric shocks.

      Literature included in the answers

      Araya, E. I., Baggio, D. F., Koren, L. O., Andreatini, R., Schwarting, R. K. W., Zamponi, G. W., & Chichorro, J. G. (2020). Acute orofacial pain leads to prolonged changes in behavioral and affective pain components. Pain, 161(12), 2830-2840. https://doi.org/10.1097/j.pain.0000000000001970

      Barker, D. J., Root, D. H., Ma, S., Jha, S., Megehee, L., Pawlak, A. P., & West, M. O. (2010). Dose-dependent differences in short ultrasonic vocalizations emitted by rats during cocaine self-administration. Psychopharmacology (Berl), 211(4), 435-442. https://doi.org/10.1007/s00213-010-1913-9

      Barroso, A. R., Araya, E. I., de Souza, C. P., Andreatini, R., & Chichorro, J. G. (2019). Characterization of rat ultrasonic vocalization in the orofacial formalin test: Influence of the social context. Eur Neuropsychopharmacol, 29(11), 1213-1226. https://doi.org/10.1016/j.euroneuro.2019.08.298

      Biały, M., Podobinska, M., Barski, J., Bogacki-Rychlik, W., & Sajdel-Sulkowska, E. M. (2019). Distinct classes of low frequency ultrasonic vocalizations in rats during sexual interactions relate to different emotional states. Acta Neurobiol Exp (Wars), 79(1), 1-12. https://www.ncbi.nlm.nih.gov/pubmed/31038481

      Briefer, E. F., Padilla de la Torre, M., & McElligott, A. G. (2012). Mother goats do not forget their kids' calls. Proc Biol Sci, 279(1743), 3749-3755. https://doi.org/10.1098/rspb.2012.0986

      Browning, J. R., Browning, D. A., Maxwell, A. O., Dong, Y., Jansen, H. T., Panksepp, J., & Sorg, B. A. (2011). Positive affective vocalizations during cocaine and sucrose self administration: a model for spontaneous drug desire in rats. Neuropharmacology, 61(1-2), 268-275. https://doi.org/10.1016/j.neuropharm.2011.04.012

      Brudzynski, S. M. (2015). Pharmacology of Ultrasonic Vocalizations in adult Rats: Significance, Call Classification and Neural Substrate. Curr Neuropharmacol, 13(2), 180-192. https://doi.org/10.2174/1570159x13999150210141444

      Brudzynski, S. M., & Bihari, F. (1990). Ultrasonic vocalization in rats produced by cholinergic stimulation of the brain. Neurosci Lett, 109(1-2), 222-226. https://doi.org/10.1016/0304-3940(90)90567-s

      Brudzynski, S. M., Bihari, F., Ociepa, D., & Fu, X. W. (1993). Analysis of 22 kHz ultrasonic vocalization in laboratory rats: long and short calls. Physiol Behav, 54(2), 215-221. https://doi.org/10.1016/0031-9384(93)90102-l

      Hinchcliffe, J. K., Jackson, M. G., & Robinson, E. S. (2022). The use of ball pits and playpens in laboratory Lister Hooded male rats induces ultrasonic vocalisations indicating a more positive affective state and can reduce the welfare impacts of aversive procedures. Lab Anim, 56(4), 370-379. https://doi.org/10.1177/00236772211065920

      Matochik, J. A., White, N. R., & Barfield, R. J. (1992). Variations in scent marking and ultrasonic vocalizations by Long-Evans rats across the estrous cycle. Physiol Behav, 51(4), 783-786. https://doi.org/10.1016/0031-9384(92)90116-j

      Olszyński, K. H., Polowy, R., Małż, M., Boguszewski, P. M., & Filipkowski, R. K. (2020). Playback of Alarm and Appetitive Calls Differentially Impacts Vocal, Heart-Rate, and Motor Response in Rats. iScience, 23(10), 101577. https://doi.org/10.1016/j.isci.2020.101577

      Olszyński, K. H., Polowy, R., Wardak, A. D., Grymanowska, A. W., & Filipkowski, R. K. (2021). Increased Vocalization of Rats in Response to Ultrasonic Playback as a Sign of Hypervigilance Following Fear Conditioning. Brain Sci, 11(8). https://doi.org/10.3390/brainsci11080970

      Olszyński, K. H., Polowy, R., Wardak, A. D., Grymanowska, A. W., Zieliński, J., & Filipkowski, R. K. (2022). Spontaneously hypertensive rats manifest deficits in emotional response to 22-kHz and 50-kHz ultrasonic playback. Prog Neuropsychopharmacol Biol Psychiatry, 120, 110615. https://doi.org/10.1016/j.pnpbp.2022.110615

      Saito, Y., Tachibana, R. O., & Okanoya, K. (2019). Acoustical cues for perception of emotional vocalizations in rats. Scientific Reports, 9(1), 10539.

      Sales, G. D. (1979). Strain Differences in the Ultrasonic Behavior of Rats (Rattus norvegicus). Am Zool, 19(2), 513-527. https://www.jstor.org/stable/3882331

      Shimoju, R., Shibata, H., Hori, M., & Kurosawa, M. (2020). Stroking stimulation of the skin elicits 50-kHz ultrasonic vocalizations in young adult rats. J Physiol Sci, 70(1), 41. https://doi.org/10.1186/s12576-020-00770-1

      Silkstone, M., & Brudzynski, S. M. (2019a). The antagonistic relationship between aversive and appetitive emotional states in rats as studied by pharmacologically-induced ultrasonic vocalization from the nucleus accumbens and lateral septum. Pharmacology Biochemistry and Behavior, 181, 77-85. https://doi.org/10.1016/j.pbb.2019.04.009

      Silkstone, M., & Brudzynski, S. M. (2019b). Intracerebral injection of R-(-)-Apomorphine into the nucleus accumbens decreased carbachol-induced 22-kHz ultrasonic vocalizations in rats. Behavioural Brain Research, 364, 264-273. https://doi.org/10.1016/j.bbr.2019.01.044

      Willey, A. R., & Spear, L. P. (2013). The effects of pre-test social deprivation on a natural reward incentive test and concomitant 50 kHz ultrasonic vocalization production in adolescent and adult male Sprague-Dawley rats. Behav Brain Res, 245, 107-112. https://doi.org/10.1016/j.bbr.2013.02.020

      Wöhr, M., Borta, A., & Schwarting, R. K. (2005). Overt behavior and ultrasonic vocalization in a fear conditioning paradigm: a dose-response study in the rat. Neurobiol Learn Mem, 84(3), 228-240. https://doi.org/10.1016/j.nlm.2005.07.004

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      Additional considerations:

      The discussion of the "perfect fifth" and the proposition that this observation could be evidence of an evolutionary mechanism underlying it is rather far-fetched, especially for being presented in the Results section (with no supporting non-anecdotal evidence).

      Answer: We agree with Reviewer #1. The text was modified and the word “evolutionary” was deleted. Instead, we expanded on the possible reason for the prevalence of the perfect fifth in the current version of the manuscript; we added that the prevalence of the perfect fifth: “could be explained by the observation that all physical objects capable of producing tonal sounds generate harmonic vibrations, the most prominent being the octave, perfect fifth, and major third (Christensen, 1993, discussed in Bowling and Purves, 2015).”

      It is not clear why Sprague-Dawleys were used as "receivers" in the playback experiment, when presumably the calls were recorded from Wistars and SHRs. While this does not critically impact the conclusions, within the species rats should be able to respond appropriately to calls made by rats of different genetic backgrounds, it adds an unnecessary source of variance.

      Answer: Sprague-Dawley rats were used to test another normotensive strain of rats. Regarding the Reviewer’s main point, we beg to differ, as we think that it is worth testing playback stimuli in different strains. Using different stimuli for different rat strains would add unnecessary variance, and it seemed logical to use the same recordings to test effects in different strains. Please note that, in spite of this additional variance, the results of both playback experiments are, in general, similar, which may point to a universal effect of 44-kHz playback across rat strains.

      It is pertinent to note that for the trace fear conditioning experiment, the rats had previously been exposed to a vocalization playback experiment. While such a pre-exposure is unlikely to be a very strong stressor, the possibility for it to influence the vocal behaviors of these rats in later experiments cannot be ruled out. It is also not clear what the control rats in this experiment experienced (home cage only?), nor what they were used for in analyses.

      Answer: In the current version of the manuscript, we have described in greater detail all the experiments performed and analyzed. We would like to emphasize that both delay and trace fear conditioning experiments with radiotelemetric transmitters were not performed specifically to elicit any particular response during fear conditioning; rather, our observation of 44-kHz vocalizations emerged as a result of re-examining the audio recordings. As a result, this work summarizes our observations of 44-kHz calls from several different experiments. It is relevant to note that 44-kHz vocalizations were observed “in rats which were exposed to vocalization playback experiment”, in rats before the playback experiments, as well as in naïve rats, without transmitters implanted, trained in fear conditioning (Tab. 1/Exp. 1-3).

      Our main message is that 44-kHz vocalizations were present in several experiments, with different conditions and subjects, while we are not attempting to compare in detail the results across the different experiments. In other words, we agree that pre-exposure to playback (and even more likely, transmitter implantation) could influence, but is not necessary for, 44-kHz ultrasonic emissions by the rats. To demonstrate this, we added a prolonged fear conditioning group with naïve Wistar rats (Exp. 3) to verify the emission of 44-kHz calls in the absence of those experimental factors.

      We modified the methods section to clarify the circumstances under which these discoveries were made, such as including the information regarding the control rats in trace fear conditioning. In particular we mention that: “Control rats were subjected to the exact same procedures but did not receive the electric shock at the end of trace periods”.

      For Figure 1A-E, only example call distributions from individual rats are shown. It would perhaps be more informative to see the full data set displayed in this manner, with color/shape codes distinguishing individuals if desired.

      Answer: Please note that Fig. 1S1 shows more examples of ultrasonic call distribution. Showing all the data would make it more difficult to read and interpret. The problem is partly addressed in Fig. 3A.

      It is not clear what is presented in Figure 2D vs. E, i.e. panel D is shown only for "selected rats" but the legend does not clarify how and why these rats were selected. It is also not clear why the legend reports p-values for both Friedman and Wilcoxon tests; the latter is appropriate for paired data which seems to be the case when the question is whether the call peak frequency alters across time, but the Friedman assumes non-paired input data.

      Answer: The question refers to the current Fig. 1S2C panel (former Fig. 2E panel) and the former Fig. 2D panel. The latter was not included in the current version of the manuscript, since both reviewers opposed the presentation of “selected rats” only (see above). The full description of the Fig. 1S2C panel is now in the results section together with p-values for the Friedman and Wilcoxon tests. We used the latter to investigate the difference between the first and the last ITI (selected paired data), and the Friedman test to investigate the presence of change within the chain of ten ITI, since it is a suitable test for a difference between two or more paired samples.
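
      The point that the Friedman test operates on paired (repeated-measures) data can be illustrated with a minimal sketch of its test statistic. This is an assumption-laden illustration rather than the analysis used in the manuscript: the function names and the toy numbers are hypothetical, no tie correction is applied, and no p-value is computed. Values are ranked within each rat across ITI, which is what makes the test a within-subject comparison.

```python
def average_ranks(row):
    """Rank the values in one row (1 = smallest); ties get their mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square statistic without tie correction.

    table[i][j] is the measurement for subject i (e.g. one rat) under
    repeated condition j (e.g. one ITI); every subject must have a value
    for every condition.
    """
    n = len(table)
    k = len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    sum_sq = sum(s * s for s in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Hypothetical peak frequencies (kHz) for three rats over three ITI.
toy_table = [[41.0, 42.5, 44.0], [40.2, 43.1, 43.9], [42.0, 42.8, 45.1]]
print(friedman_statistic(toy_table))
```

      In practice a statistics package would be used to obtain the p-value and handle ties; the sketch only shows that the input is one row per animal with one value per repeated measurement.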

      Reviewer #2 (Recommendations For The Authors):

      The weaknesses listed in the public review need to be addressed.

      Answer: We have done our best to address the weaknesses.

      Notes: 1) Page and line numbers would have been useful.

      Answer: We are including a separate manuscript version with page and line numbers.

      (2) English language needs to be improved.

      Answer: The text has been checked by two native English speakers (one with a scientific background). Both suggested only minor changes to improve the text, which we applied.

      (3) I am a bit unsure whether the comment about the Star Wars movie (1997) and the Game of Thrones series (2011) is supposed to be a joke.

      Answer: These are indeed two genuine examples of the perfect fifth in human music that we hope are easily recognizable and familiar to readers. Parts of the same examples of the perfect fifth can also be heard in the rat voice files provided.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      During the last decades, extensive studies (mostly neglected by the authors), using in vitro and in vivo models, have elucidated the five-step mechanism of intoxication of botulinum neurotoxins (BoNTs). The binding domain (H chain) of all serotypes of BoNTs binds polysialogangliosides and the luminal domain of a synaptic vesicle protein (which varies among serotypes). When bound to the synaptic membrane of neurons, BoNTs are rapidly internalized by synaptic vesicles (SVs) via endocytosis. Subsequently, the catalytic domain (L chain) translocates, a process triggered by the acidification of these organelles. Following translocation, the disulfide bridge connecting the H chain with the L chain is reduced by the thioredoxin reductase/thioredoxin system, and it is refolded by the chaperone Hsp90 on SV's surface. Once released into the cytosol, the L chains of different serotypes cleave distinct peptide bonds of specific SNARE proteins, thereby disrupting neurotransmission. In this study, Yeo et al. extensively revise the neuronal intoxication model, suggesting that BoNT/A follows a more complex intracellular route than previously thought. The authors propose that upon internalization, BoNT/A-containing endosomes are retro-axonally trafficked to the soma. At the level of the neuronal soma, this serotype then traffics to the endoplasmic reticulum (ER) via the Golgi apparatus. The ER SEC61 translocon complex facilitates the translocation of BoNT/A's LC from the ER lumen into the cytosol, where the thioredoxin reductase/thioredoxin system and HSP complexes release and refold the catalytic L chain. Subsequently, the L chain diffuses and cleaves SNAP25 first in the soma before reaching neurites and synapses.

      Strengths:

      I appreciate the authors' efforts to confirm that the newly established methods somehow recapitulate aspects of the BoNTs mechanism of action, such as toxin binding and uptake occurring at the level of active synapses. Furthermore, even though I consider the SNAPR approach inadequate, the genome-wide RNAi screen has been well executed and thoroughly analyzed. It includes well-established positive and negative controls, making it a comprehensive resource not only for scientists working in the field of botulinum neurotoxins but also for cell biologists studying endocytosis more broadly.

      Weaknesses:

      I have several concerns about the authors' main conclusions, primarily due to the lack of essential controls and validation for the newly developed methods used to assess toxin cleavage and trafficking into neurons. Furthermore, there is a significant discrepancy between the proposed intoxication model and existing studies conducted in more physiological settings. In my opinion, the authors have omitted over 20 years of work done in several labs worldwide (Montecucco, Montal, Schiavo, Rummel, Binz, etc.). I want to emphasize that I support changes in biological dogma only when these changes are supported by compelling experimental evidence, which I could not find in the present manuscript.

      We thank the reviewer for his reading and comments and for pointing out the discrepancy between our proposed model and the existing model. However, we respectfully disagree with the phrase “extensive studies have elucidated the five-step mechanism of intoxication…”. This sentence and those that follow imply that the model is well established and demonstrated. It also highlights how convinced the reviewer is of this previous model.

      We contest this model for theoretical reasons, and we contest the strength of the evidence that supports it. We previously included references to earlier work showing that the model is also being challenged by others. In light of the reviewer’s comments, we included more references in the introduction, where we also make our main theoretical concern explicit:

      “Arguably, the main problem of the model is its failure to propose a thermodynamically consistent explanation for the directional translocation of a polypeptide chain across a biological membrane. Other known instances of polypeptide membrane translocation such as the co-translational translocation into the ER indicate that it is an unfavorable process, which consumes significant energy (Alder and Theg 2003).”

      We also added the following text in the Discussion to address the reviewer’s concerns: “Our study contradicts the long-established model of BoNT intoxication, which is described in several reviews specifically dedicated to the subject 1–4. In short, these reviews support the notion that BoNT are molecular machines able to mediate their own translocation across membranes; this notion has convinced some cell biologists interested in toxins and retrograde traffic, who describe BoNT mode of translocation in their reviews 5,6.

      But is this notion well supported by data? A careful examination of the primary literature reveals that early studies indeed report that BoNTs form ion channels at low pH values 7,8. These studies have been extended by the use of patch-clamp 9,10. These works and others have led to various suppositions on how the toxin forms a channel and translocates the LC 1,11.

      However, only a single study claims to reconstitute in vitro the translocation of BoNT LC across membranes 12. In this paper, the authors report using a system of artificial membranes separating two aqueous compartments. They load the toxin in the cis compartment and measure the protease activity in the trans compartment after incubation. However, when the experimental conditions described are actually converted in terms of molarity, it appears that the cis compartment was loaded at 10e-8 M BoNT and that the reported translocated protease activity is equivalent to 10e-17 M (Figure 3D, 12). Thus, in this experiment, about 1 LC molecule in 100 million has crossed the membrane. Such an extremely low transfer rate does not tally with the extreme efficiency of intoxication in vivo, even while taking into account the difference between artificial and biological membranes.

      In sum, a careful analysis of the primary literature indicates that while there is ample evidence that BoNTs have the ability to affect membranes and possibly create ion channels, there is actually no credible evidence that these channels mediate translocation of the LC. As mentioned earlier, it is not clear how such a self-translocation mechanism would function thermodynamically. By contrast, our model proposes a mechanism without a thermodynamic problem, is consistent with current knowledge about other protein toxins, such as PE, Shiga and Ricin, and can help explain previously puzzling features of BoNT effects. It is worth noting that a similar self-translocation model was proposed for other protein toxins such as Pseudomonas exotoxin, which has a molecular organisation similar to that of BoNT (68). However, it has since been demonstrated that the PE toxins require cellular machinery, in particular in the ER, for intoxication (21,69,70).”

      Reviewer #2 (Public Review):

      Summary:

      The study by Yeo and co-authors addresses a long-lasting issue about botulinum neurotoxin (BoNT) intoxication. The current view is that the toxin binds to its receptors at the axon terminus by its HCc domain and is internalized in recycled neuromediator vesicles just after the release of the neuromediators. Then, the HCn domain assists the translocation of the catalytic light chain (LC) of the toxin through the membrane of these endocytic vesicles into the cytosol of the axon terminus. There, the LC cleaves its SNARE substrate and blocks neurosecretion. However, other views involving kinetic aspects of intoxication suggest that the toxin follows the retrograde axonal transport up to the nerve cell body and then back to the nerve terminus before cleaving its substrate.

      In the current study, the authors claim that the BoNT/A (isotype A of BoNT) not only progresses to the cell body but once there, follows the retrograde transport trafficking pathway in a retromer-dependent fashion, through the Golgi apparatus, until reaching the endoplasmic reticulum. Next, the LC dissociates from the HC (a process not studied here) and uses the translocon Sec61 machinery to retro-translocate into the cytosol. Only then, does the LC traffic back to the nerve terminus following the anterograde axonal transport. Once there, LC cleaves its SNARE substrate (SNAP25 in the case of BoNT/A) and blocks neurosecretion.

      To reach their conclusion, Yeo and co-authors use a combination of engineered tools: a cell line able to differentiate into neurons (ReNcell VN), a reporter dual fluorescent protein derived from SNAP25, the substrate of BoNT/A (called SNAPR), the use of either native BoNT/A or a toxin to which three copies of fragment 11 of the reporter fluorescent protein Neon Green (mNG) are fused to the N-terminus of the LC (BoNT/A-mNG11x3), and finally ReNcell VN transfected with mNG1-10 (a protein consisting of the first 10 beta strands of the mNG).

      SNAPR is stably expressed all over in the ReNcell VN. SNAPR is yellow (red and green) when intact and becomes red only when cleaved by BoNT/A LC, the green tip being degraded by the cell. When the LC of BoNT/A-mNG11x3 reaches the cytosol in ReNcell VN transfected by mNG1-10, the complete mNG is reconstituted and emits a green fluorescence.

      In the first experiment, the authors show that the catalytic activity of the LC appears first in the cell body of neurons where SNAPR is cleaved first. This phenomenon starts 24 hours after intoxication and progresses along the axon towards the nerve terminus during an additional 24 hours. In a second experiment, the authors intoxicate the ReNcell VN transfected by mNG1-10 using the BoNT/A-mNG11x3. The fluorescence appears also first in the soma of neurons, then diffuses in the neurites in 48 hours. The conclusion of these two experiments is that translocation occurs first in the cell body and that the LC diffuses in the cytosol of the axon in an anterograde fashion.

      In the second part of the study, the authors perform a siRNA screen to identify regulators of BoNT/A intoxication. Their aim is to identify genes involved in intracellular trafficking of the toxin and translocation of the LC. Interestingly, they found positive and negative regulators of intoxication. Regulators could be regrouped according to the sequential events of intoxication.

      Genes affecting binding to the cell-surface receptor (SV2) and internalization. Genes involved in intracellular trafficking. Genes involved in translocation such as reduction of the disulfide bond linking the LC to the HC and refolding in the cytosol. Genes involved in signaling such as tyrosine kinases and phosphatases. All these groups of genes may be consistent with the current view of BoNT intoxication within the nerve terminus. However, two sets of genes were particularly significant to reach the main conclusion of the work and definitely constitute an original finding important to the field. One set of genes consists of those of the retromer, and the other relates to the Sec61 translocon. This should indicate that once endocytosed, the BoNT traffics from the endosomes to the Golgi apparatus, and then to the ER. Ultimately, the LC should translocate from the ER lumen to the cytosol using the Sec61 translocon. The authors further control that the SV2 receptor for the BoNT/A traffics along the axon in a retromer-dependent fashion and that BoNT/A-mNG11x3 traverses the Golgi apparatus by fusing the mNG1-10 to a Golgi resident protein.

      Strengths:

      The findings in this work are convincing. The experiments are carefully done and are properly controlled. In the first part of the study, both the activity of the LC is monitored together with the physical presence of the toxin. In the second part of the work, the most relevant genes that came out of the siRNA screen are checked individually in the ReNcell VN / BoNT/A reporter system to confirm their role in BoNT/A trafficking and retro-translocation.

      These findings are important to the fields of toxinology and medical treatment of neuromuscular diseases by BoNTs. They may explain some aspects of intoxication such as slow symptom onset, aggravation, and appearance of central effects.

      Weaknesses:

      The findings antagonize the current view of the intoxication pathway that is sustained by a vast amount of observations. The findings are certainly valid, but their generalization as the sole mechanism of BoNT intoxication should be tempered. These observations are restricted to one particular neuronal model and engineered protein tools. Other models such as isolated nerve/muscle preparations display nerve terminus paralysis within minutes rather than days. Also, the tetanus neurotoxin (TeNT), whose mechanism of action involving axonal transport to the posterior ganglia in the spinal cord is well described, takes between 5 and 15 days. It is thus possible that different intoxication mechanisms co-exist for BoNTs or even vary depending on the type of neurons.

      Although the siRNA experiments are convincing, it would be nice to reach the same observations with drugs affecting the endocytic to Golgi to ER transport (such as Retro-2, golgicide or brefeldin A) and the Sec61 retrotranslocation (such as mycolactone). Then, it would be nice to check other neuronal systems for the same observations.

      We thank the reviewer for the careful reading of our manuscript and for the comments. The reference to “a vast amount of observations” echoes the argument of Reviewer 1 and is used to suggest that our study may not be applicable as a general mechanism.

      We respectfully disagree, as described above, and posit on the contrary that the model we propose is much more likely to be general than the model presented in current reviews, for the several reasons cited (see added text in Introduction and Discussion). While we agree that more work is needed to confirm the proposed mechanisms of BoNT translocation in other models, these experiments fall outside the scope of our study.

      The fact that nerve/muscle preparations show relatively fast kinetics of BoNT activity does not contradict our model. Our model reveals primarily the requirement for trafficking to the ER membranes. This ER targeting requires trafficking through the Golgi complex, in turn explaining the requirement for trafficking to the soma of neurons in the experimental system we used. However, in neuronal cells in vivo, Golgi bodies can be found along the length of the axon; thus BoNT may not always require trafficking to the soma of the affected cells. The time required for intoxication could thus vary greatly depending on the neuronal structural organisation.

      TeNT is proposed to transfer from excitatory neurons into inhibitory neurons before exerting its action. While the details of this fascinating mechanism remain to be explored, it clearly falls beyond the purview of this manuscript.

      Regarding the use of drugs, we agree that it would be a nice addition; unfortunately we are unable to perform such experiments at this stage. Setting up a large-scale siRNA screen for the BoNT mechanism of action is challenging, as it requires a special facility with controlled access and police authorisation (in Singapore) given the high toxicity of this molecule. Unfortunately, the authorisations have now lapsed.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Yao et al. investigates the intracellular trafficking of Botulinum neurotoxin A (BoNT/A), a potent toxin used in clinical and cosmetic applications. Contrary to the prevailing understanding of BoNT/A translocation into the cytosol, the study suggests a retrograde migration from the synapse to the soma-localized Golgi in neurons. Using a genome-wide siRNA screen in genetically engineered neurons, the researchers identified over three hundred genes involved in this process. The study employs organelle-specific split-mNG complementation, revealing that BoNT/A traffics through the Golgi in a retromer-dependent manner before moving to the endoplasmic reticulum (ER). The Sec61 complex is implicated in the retro-translocation of BoNT/A from the ER to the cytosol. Overall, the research challenges the conventional model of BoNT/A translocation, uncovering a complex route from synapse to cytosol for efficient intoxication. The findings are based on a comprehensive approach, including the introduction of a fluorescent reporter for BoNT/A catalytic activity and genetic manipulations in neuronal cell lines. The conclusions highlight the importance of retrograde trafficking and the involvement of specific genes and cellular processes in BoNT/A intoxication.

      Strengths:

      The major part of the experiments are convincing. They are well-controlled and the interpretation of their results is balanced and sensitive.

      Weaknesses:

      To my opinion, the main weakness of the paper is in the interpretation of the data equating loss of tGFP signal (when using the Red SNAPR assay) with proteolytic cleavage by the toxin. Indeed, the first step for loss of tGFP signal by degradation of the cleaved part is the actual cleavage. However, this needs to be degraded (by the proteasome, I presume), a process that could in principle be affected (in speed or extent) by the toxin.

      We thank the reviewer for his comments and careful reading of our manuscript.

      Regarding the read-out of the assay, we agree that the assay could be sensitive to alteration in the protein degradation pathway. We have added the following sentence in the Discussion to take it into account:

      “As noted by one reviewer, the assay may be sensitive to perturbation in the general rate of protein degradation, a consideration to keep in mind when evaluating the results of large scale screens.”

      While this may be valid for some hits in the general list, it is important to note that the main hits have been shown to affect toxin trafficking by an independent, orthogonal assay based on the split GFP reconstitution.

      Recommendations to authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) To assess the activity of BoNT/A in neurons, Yeo et al. have generated a neuronal stem line referred to as SNAPR. This cell line stably expresses a chimeric reporter protein that consists of SNAP25 flanked at its N-terminus with a tagRFPT and at its C-terminus with a tagGFP. After exposure to BoNT/A, SNAP25 is cleaved and, the C-terminal tGFP-containing moiety is rapidly degraded. I have many doubts about the validity of the described method. Indeed, BoNT/A activity is analysed in an indirect way by quantifying the degradation of the GFP moiety generated after toxin cleavage (Fig. 2). In this regard, the authors should consider that their approach is dependent, not only on the toxin's metalloprotease activity but also on the functionality of the proteasome in neurons. Therefore, considering the current dataset, it is impossible to rule out the possibility that the progression of GFP signal loss from the soma to the neurite terminals may be attributed to the different proteasome activity in these compartments. Is it conceivable that the GFP fragment generated upon toxin cleavage degrades more rapidly in the soma in comparison to axonal terminals? This alternative explanation could challenge the conclusion drawn in Fig. 2.

      The reviewer’s alternative explanation disregards the experiments performed with the split-GFP complementation approach, which indicate translocation in the soma first. The split GFP reporter is not dependent on proteasome activity. It also disregards the genetic data implicating many genes involved in membrane retrograde traffic, which are also not consistent with the hypothesis of the reviewer. These gene depletions affect not only SNAPR degradation but also BoNT/A-mNG11 trafficking: thus, their effect cannot be attributed to a completely hypothetical spatially heterogeneous distribution of the proteasome.

      For this reason, I strongly suggest using a more physiological approach that does not depend on proteasomal degradation or on the expression of the sensor in neurons. The authors should consider performing a time course experiment following intoxication and staining BoNT/A-cleaved SNAP25 by using specific antibodies (see Antonucci F. et al., Journal of Neuroscience, 2008 or Rheaume C. et al., Toxins 2015).

      For the above reason, we do not agree with the pressing importance of confirming by a third method using specific antibodies, especially considering that BoNT is very difficult to detect in cells when incubated at physiological levels. Incidentally, the cited paper by Antonucci F. et al. documents long-distance retrograde traffic of BoNT/A, which is in line with our data.

      An alternative approach could involve the use of microfluidic devices that physically separate axons from cell bodies. Such a separation will allow us to test the authors' primary conclusion that SNAP25 is initially cleaved in the soma. The suggested experiments will also rule out potential overexpression artifacts that could influence the authors' conclusions when using the newly developed SNAPR approach. Without these additional experiments, the authors' main conclusion that SNAP25 is cleaved first in the neuronal soma rather than at the nerve terminal is inadequate.

      As discussed above, we disagree with the doubts raised by the reviewer: we present three types of evidence (SNAPR, split GFP and genetic hits) and they all point in the same direction. Thus, we respectfully doubt that a fourth approach would convince this reviewer. Of note, we have attempted to use microfluidic devices as suggested by the reviewer; however, the Ren-VM neurons were not able to extend axons long enough across the device.

      (2) To detect BoNT/A translocation into the cytosol, the authors have used a complementation assay by intoxicating ReNcell VM cell expressing a cytosolic HA-tagged split monomeric NeonGreen (Cyt-mNG1-10) with an engineered BoNT/A, where the catalytic domain (LC) was fused to mNG1-11. When drawing conclusions regarding the detection of cytosolic LC in the neuronal soma, the authors should highlight the limitations of this assay and explicitly describe them to the readers. Firstly, the authors need to investigate whether the addition of mNG1-11 to the LC affects the translocation process itself (by comparing with a WT, not tagged, LC).

      Additionally, from the data shown in Fig. 2C, it is evident that the Cyt-mNG1-10 is predominantly expressed in the cytosol and less detected in neurites. This raises the question of whether there might be a bias for the cell soma in this assay. To address this important concern, I suggest quantifying MFI per cell (Fig. 2D) taking into consideration the amount of HA-tagged Cyt-mNG1-10. Furthermore, I strongly suggest targeting mNG1-10 to synapses and performing a similar time course experiment to observe when LC translocation occurs at nerve terminals. Alternative experiments, to prove that BoNT/A requires retrograde trafficking before it can translocate, may be done to repeat the experiments shown in Fig. 2D in the presence of inhibitors (or by KD some of the hits identified as microtubule stabilizers) that should interfere with BoNT/A trafficking to the neuronal somata. Without these additional experiments, the authors' main conclusion that the BoNT/A catalytic domain is first detected in the neuronal soma rather than at the nerve terminal is very preliminary.

      As for the SNAPR assay, the reviewer is setting a very high bar of doubt. We respect his thoroughness and eagerness to question the new model. However, we note that a similar level of scrutiny is not applied to the prevalent competing model. Indeed, the evidence supporting the self-translocation model rests on a single in vitro experiment published in one panel, as we have explained in the Discussion (see above).

      (3) In the genome-wide RNAi screening, rather than solely assessing SV2 surface levels, it would have been beneficial to directly investigate BoNT/A binding to the neuronal membrane. For instance, this could have been achieved by using a GFP-tagged HC domain of BoNT/A. At present, the authors cannot exclude the possibility that among the 135 hits that did not affect SV2 levels, some might still inhibit BoNT/A binding to the neuronal surface. These concerns, already exemplified by B4CALT4 (which is known to be involved in the synthesis of GT1b), should be explicitly addressed in the main text.

      We agree with the reviewer that perturbation of BoNT binding is possible. We added the following text:

      “Network analysis reveals regulators of signaling, membrane trafficking and thioreductase redox state involved in BoNT/A intoxication

      Among the positive regulators of the screen, 135 hits did not influence significantly surface SV2 levels and are thus likely to function in post-endocytic processes (Supplementary Table 2). However, we cannot formally exclude that they could affect binding of BoNT to the cell surface independently of SV2.”

      (4) The authors should clearly state which reagents they have tried to use in order to explain the challenges they faced when directly testing the trafficking of BoNT/A. The accumulation of Dendra-SV2 bulbous structures at the neurite tips in VPS35-depleted cells could be interpreted as a sign of neuronal stress/death. Have the authors investigated other proteins that do not undergo retro-axonal trafficking in a retromer-dependent manner? This control is essential. In this regard, the use of a GFP-tagged HC domain of BoNT/A could prove to be quite helpful.

      We tried multiple commercially available antibodies against BoNT but we could not get a very good signal. The postdoc in charge of this project has now gone to greener pastures and we are not in a position to provide the details corresponding to these antibodies. We did not observe significant cell death after VPS35 knockdown at the time of the experiment; however, longer-term treatment might indeed result in toxicity.

      (5) Considering my concerns related to the SNAPR system and the complementation assay to study SNAP25 cleavage and BoNT/A trafficking, I suggest validating some of their major hits (ex. VPS34 and Sec61) by performing WB or IF analysis to examine the cleavage of endogenous SNAP25. Furthermore, the authors should test VPS35 depletion in the context of the experiments performed in Fig. 6G-H, by validating that this protein is essential for BoNT/A retrograde trafficking.

      The reviewer’s concerns are well noted but, as discussed above, the two systems we used are completely orthogonal. Thus, for the reviewer’s concerns to be valid, two completely independent artefacts would have to give rise to the same result. The alternative explanation is that BoNT/A translocates in the soma. Ockham’s razor dictates that the simplest explanation is the likeliest.

      (6) The introduction and the discussion section of this paper completely disregard more than 20 years of research conducted by several labs worldwide (Montecucco, Montal, Schiavo, Rummel, Binz, etc). The authors should make an effort to contextualize their data within the framework of these studies and address the significant discrepancies between their proposed intoxication model and existing research that clearly demonstrates BoNTs translocating upon the endocytic retrieval of SVs at presynaptic sites. Nevertheless, even assuming that the model proposed by the authors is accurate, numerous questions emerge. One such question is: How can the authors explain the exceptional toxicity of botulinum neurotoxin in an ex vivo neuromuscular junction preparation devoid of neuronal cell bodies (see Cesare Montecucco and Andreas Rummel's seminal studies)?

      Please see above in the answer to public reviews.

      (7) Scale bars should be added to all representative pictures.

      This has been done. Thank you for the thorough reading of our manuscript.

      Reviewer #2 (Recommendations For The Authors):

      (1) The title overstates the results. It may be indicated "in differentiated ReNcell VM".

      Title changed to: “Botulinum toxin intoxication requires retrograde transport and membrane translocation at the ER in RenVM neurons”

      (2) In the provided manuscript there are two Figure 2 and no Figure 3. This made the reading and understanding extremely difficult and should be corrected. As a result, the Figure legends do not fit the numbering. There are also discrepancies between some Figure panels (A, B, C, etc), the text, and the Legends. All this needs to be carefully checked.

      We apologize for the confusion, as the manuscript has gone through multiple rounds of revisions. We have carefully verified labels and legends.

      (3) The BoNT/A-mNG11x3 may introduce some bias that could be discussed. Would these additional peptides block LC translocation from synaptic vesicles in the nerve termini? In addition, the mNG peptides that are unfolded before complementation may direct LC towards Sec61. These aspects should be discussed.

      The comment would be valid if BoNT/A-mNG11x3 were the only approach used in the paper; however, the SNAPR reporter is used with native BoNT and shows data consistent with the split GFP approach.

      (4) In the Figure about SV2 (Fig 3 or 4): The authors did not locate SV2. The cells seem not to have the same differentiated phenotype as in Figure 1 and Figure 2/3A.

      We apologized above for the mislabeling. It is not clear what the question is here.

      (5) The authors should check whether BoNT/A wt cleaves the endogenous SNAP25 by western blot for instance in the original ReNcell VN before SNAPR engineering. This should be compared with wt SNAP25 cleavage by the BoNT/A-LC-mNG.

      It is likely that BoNT/A-LC-mNG11 should have similar activity as it is only adding a small peptide at the end of the LC. At any rate, it is not clear why this is so important since both molecules translocate in the cytosol, with the same kinetics and in the same subcellular locale.

      (6) Perhaps I did not understand. How can the authors exclude that what is observed is the kinetic overproduction of the reporter substrate SNAPR?

      The authors could use SLO toxin (PNAS 98, 3185-3190, 2001) to permeabilize the cells all along their body and axon to introduce BoNT/A or LC (wt) and observe synchronized SNAPR cleavage throughout the cells.

      The concept mentioned here is not very clear to us. Is the reviewer proposing that the SNAPR is produced much more efficiently at the tips of the neurites, so that its cleavage takes longer to be detected there and is apparent first in the soma? With all due respect, this is a strange hypothesis, at odds with what we know of protein dynamics in neurons (i.e. most proteins are largely made in the soma and transported or diffuse into the neurites).

      Again, the two orthogonal approaches, split GFP and the SNAPR reporter, use different constructs and methods, yet converge on similar results. Perhaps the incredulity of the reviewer might be more productively directed at the current data “demonstrating” the translocation of LC in the synaptic bouton?

      (7) The authors could also use an assay on neurotransmitter release monitoring by electrophysiology measurements to check the functional consequences of the kinetic diffusion of LC activity along the axon. Can the authors exclude that some toxin molecules translocate from the endocytic vesicles and block neurotransmission within minutes or a few hours?

      It is well established that inhibition of neurotransmission does not occur within minutes in vivo and in vitro, but rather within hours or even days. This kinetic delay is experienced by many patients and is one of the key argument against the current model of self-translocation at the synaptic vesicle level.

      Minor remarks

      Thank you for pointing out all these.

      (1) Please check typos. There are many. Check space before the parenthesis, between numbers and h (hours), reference style etc.

      Thank you. We have reviewed the text and tried to eliminate all these instances.

      (2) Line 90: The C of HC should be capitalized.

      Fixed

      (3) Line 107: add space between "neurons(Donato".

      Fixed

      (4) Line 109: space "72 h".

      Fixed

      (5) Line 115: a word is missing ? ...to show retro-axonal... ? Please clarify this sentence.

      Fixed

      (6) Figure 1E: does nm refer to nM (nanomolar)? Please correct. No mention of panel F.

      Fixed

      (7) Line 161: do you mean ~16 µm/h? Please correct.

      Fixed

      (8) Line 168, words are missing.

      Fixed, thank you

      We verified, using the HA tag, that Cyt-mNG1-10 was expressed; the expression was homogeneously distributed in differentiated neurons and we observed no GFP signal (Figure 2C).

      (9) Line 171: Isn't mNG 11 the eleventh beta strand of the neon green fluorescent protein, not alpha helix? Otherwise, can the authors confirm it acquires the shape of an alpha helix? Same at line 326.

      We have corrected the mistake; thanks for pointing it out.

      (10) Figure 2 is doubled. The legend of Fig 2 refers to Figure 3. There is no legend for Figure 2. Then, some figures are shifted in their numbering.

      Fixed

      (11) The fluorescence in the cell body must appear before the fluorescence in the axon due to higher volume. Please discuss.

      The fluorescence progresses in the neurite extensions in a centripetal fashion. The volume of the neurite near the cell body is not significantly different from that at the end of the neurite. Thus, the fluorescence data are consistent with translocation in the soma and not with an effect due to higher volume in the soma.

      (12) Figure 2D, right: the term intoxication is improper for this experiment. Rather, it is the presence of the BoNT/A-mNG11 that is detected. I believe the authors should be particularly careful about the use of terms: intoxication means blockade of neurosecretion, SNAPR cleavage means activity etc.

      While the reviewer is correct that it is the presence of BoNT/A-mNG11 that is detected, it remains that it is an active toxin, so the neurons are effectively intoxicated; as they are when we use the wild type toxin. We do not imply that we are measuring intoxication, but simply that the neurons are put into contact with a toxin.

      (13) Line 196: Should we read TXNRD1 is required for BoNT/A LC translocation? TXNRD1 in the current model of translocation is located in the cytoplasm and is supposed to play a role in the cleavage of the disulfide bond linking LC to HC. In the model proposed by this study, LC is translocated through the Sec61 translocon. In this case, I would assume that the protein disulfide isomerase (PDI) in the endoplasmic reticulum would reduce the LC-HC disulfide bond. In that case, TXNRD1 would not be required anymore. Please discuss.

      Why should we assume that a PDI is involved in the reduction of the LC-HC disulfide bond? In our previous studies on A-B toxins (PE and Ricin), different reduction systems seemed to be at play. There is no conceptual imperative to assume reduction in the ER because the Sec61 translocon is implicated. Reduction might occur on the cytosolic side by TXNRD1 or the effect of this reductase could be indirect.

      (14) The legend of Figure 4 (in principle Figure 5?) is not matching with the panels and panel entries are missing (Figure 4F in particular).

      Fixed

      (15) Figure 6 panels E and H, please match colors with legend (grey and another color).

      Not clear

      (16) Please indicate BoNT/A construct concentrations in all Figure legends.

      Done

      (17) Line 416: isn't SV2 also involved in epilepsy?

      Yes it is.

      (18) Line 433: as above, shouldn't the disulfide bond linking LC to HC be cleaved by PDI in the ER in this model (as for other translocating bacterial toxins) rather than by thioredoxin reductases in the cytoplasm? Please discuss.

      See above

      (19) Identification of vATPase in the screen could be consistent with the endocytic vesicle acidification model of translocation.

      Yes

      (20) Did the authors add KCl in screening controls without toxins? This should be detailed in the Materials and Methods. Could there be a KCl effect on the cells? KCl exposure for 48 hours may be highly stressful for cells. The KCl exposure should last only several minutes for toxin entry.

      We did not observe significant cell death under the cell culture conditions used. Cell viability was controlled at multiple stages, for instance using the number of nuclei.

      Reviewer #3 (Recommendations For The Authors):

      Main comments: (1) In Figure 1B: could you devise a means to prevent proteasomal degradation of the tGFP cleaved part to assess whether this is formed?

      We have also used a FRET assay after intoxication and obtained similar results.

      (2) Line 152: Where it reads "was not surprising", maybe I missed something, but to me, this is indeed surprising. If the toxin is rapidly internalized and translocated (therefore, it is able to cleave SNAP25), the fact that tGFP requires 48 hours to be degraded seems surprising to me. Or does it mean that the toxin also slows down the degradation of the tGFP fragment? So, how can you differentiate between the effect being on cleavage of the fragment or in tGFP degradation?

      The reviewer is correct; the “not” was a typo introduced during rewriting. The long delay between adding the toxin and observing cleavage was indeed surprising. Our interpretation is that it is trafficking that takes time: the split-GFP kinetics indicate that the toxin takes about 48 h to fill the entire cytosol (Fig. 2D).

      (3) Regarding the effect of Sec61G knockdown, is it possible that the observed effects are indirect and not due to the translocon being directly responsible for translocating the protein?

      As discussed in the last part of the Results, Sec61 knockdown blocks intoxication but does not prevent BoNT from reaching the lumen of the ER (Figure 6G,H). Thus, Sec61 “is instrumental to the translocation of BoNT/A LC into the neuronal cytosol at the soma.”

      Minor comments:

      (1) Fig. 3E: in the legend I think one of the NT3+ should be NT3-.

      Yes, thanks for spotting it

      (2) Would you consider adding Figure S4 as a main figure?

      Thanks for the suggestion

      (3) Please, check that all microscopy image panels have scale bars.

      Done

      (4) Figure 6B (bottom panels): why does it seem that there is a lot of mNeonGreen positive signal in regions that are not positive for HA? Shouldn't complementation keep HA in the complemented protein?

      Our assumption is that there is an excess of receptor protein (HA tag) over reconstituted protein (GFP protein), given the relatively low concentration of toxin being internalized and translocated.

      Refs:

      (1) Pirazzini M, Azarnia Tehran D, Leka O, Zanetti G, Rossetto O, Montecucco C. On the translocation of botulinum and tetanus neurotoxins across the membrane of acidic intracellular compartments. Biochim Biophys Acta. 2016 Mar;1858(3):467–474. PMID: 26307528

      (2) Pirazzini M, Rossetto O, Eleopra R, Montecucco C. Botulinum Neurotoxins: Biology, Pharmacology, and Toxicology. Pharmacol Rev. 2017 Apr;69(2):200–235. PMCID: PMC5394922

      (3) Dong M, Masuyer G, Stenmark P. Botulinum and Tetanus Neurotoxins. Annu Rev Biochem. Annual Reviews; 2019 Jun 20;88(1):811–837.

      (4) Rossetto O, Pirazzini M, Fabris F, Montecucco C. Botulinum Neurotoxins: Mechanism of Action. Handb Exp Pharmacol. 2021;263:35–47. PMCID: 6671090

      (5) Williams JM, Tsai B. Intracellular trafficking of bacterial toxins. Curr Opin Cell Biol. 2016 Aug;41:51–56. PMCID: PMC4983527

      (6) Mesquita FS, van der Goot FG, Sergeeva OA. Mammalian membrane trafficking as seen through the lens of bacterial toxins. Cell Microbiol. 2020 Apr;22(4):e13167. PMCID: PMC7154709

      (7) Hoch DH, Romero-Mira M, Ehrlich BE, Finkelstein A, DasGupta BR, Simpson LL. Channels formed by botulinum, tetanus, and diphtheria toxins in planar lipid bilayers: relevance to translocation of proteins across membranes. Proc Natl Acad Sci U S A. 1985 Mar;82(6):1692–1696. PMCID: PMC397338

      (8) Donovan JJ, Middlebrook JL. Ion-conducting channels produced by botulinum toxin in planar lipid membranes. Biochemistry. 1986 May 20;25(10):2872–2876. PMID: 2424493

      (9) Fischer A, Montal M. Single molecule detection of intermediates during botulinum neurotoxin translocation across membranes. Proc Natl Acad Sci U S A. 2007 Jun 19;104(25):10447–10452. PMCID: PMC1965533

      (10) Fischer A, Nakai Y, Eubanks LM, Clancy CM, Tepp WH, Pellett S, Dickerson TJ, Johnson EA, Janda KD, Montal M. Bimodal modulation of the botulinum neurotoxin protein-conducting channel. Proc Natl Acad Sci U S A. 2009 Feb 3;106(5):1330–1335. PMCID: PMC2635780

      (11) Fischer A, Montal M. Crucial role of the disulfide bridge between botulinum neurotoxin light and heavy chains in protease translocation across membranes. J Biol Chem. 2007 Oct 5;282(40):29604–29611. PMID: 17666397

      (12) Koriazova LK, Montal M. Translocation of botulinum neurotoxin light chain protease through the heavy chain channel. Nature structural biology. 2003. p. 13–18. PMID: 12459720

      (13) Moreau D, Kumar P, Wang SC, Chaumet A, Chew SY, Chevalley H, Bard F. Genome-wide RNAi screens identify genes required for Ricin and PE intoxications. Dev Cell. 2011 Aug 16;21(2):231–244. PMID: 21782526

      (14) Bassik MC, Kampmann M, Lebbink RJ, Wang S, Hein MY, Poser I, Weibezahn J, Horlbeck MA, Chen S, Mann M, Hyman AA, Leproust EM, McManus MT, Weissman JS. A systematic mammalian genetic interaction map reveals pathways underlying ricin susceptibility. Cell. 2013 Feb 14;152(4):909–922. PMCID: PMC3652613

      (15) Tian S, Muneeruddin K, Choi MY, Tao L, Bhuiyan RH, Ohmi Y, Furukawa K, Furukawa K, Boland S, Shaffer SA, Adam RM, Dong M. Genome-wide CRISPR screens for Shiga toxins and ricin reveal Golgi proteins critical for glycosylation. PLoS Biol. 2018 Nov;16(11):e2006951. PMCID: PMC6258472

    1. Author response:

      We would like to thank the reviewers for their helpful comments. We note that both reviews are strongly supportive with comments including, “a biophysical tour de force” (rev #1), “the study is exemplary” (rev #2), and “represents a roadmap for future work” (rev #2). Below we respond to each reviewer comment.

      Reviewer #1

      This study provides a detailed and quantitative description of the allosteric mechanisms resulting in the paradoxical activation of BRAF kinase dimers by certain kinase inhibitors. The findings provide a much needed quantitative basis for this phenomenon and may lay the foundation for future drug development efforts aimed at the important cancer target BRAF. The study builds on very evidence obtained by multiple independent biophysical methods.

      Summary:

      The authors quantitatively describe the complex binding equilibria of BRAF and its inhibitors resulting in some cases in the paradoxical activation of BRAF dimer when bound to ATP competitive inhibitors. The authors use a biophysical tour de force involving FRET binding assays, NMR, kinase activity assays and DEER spectroscopy.

      We are gratified by the reviewer’s supportive summary.

      Strengths:

      The strengths of the study are the beautifully conducted assays that allow for a thorough characterization of the allostery in this complex system. Additionally, the use of 19F-NMR and DEER spectroscopy provides important insights into the details of the process. The resulting model for binding of inhibitors and dimerization (Fig. 4) is very helpful.

      Weaknesses:

      This is a complex system and its communication is inherently challenging. It might be of interest to the broader readership to understand the implications of the model for drug development and therapy.

      We agree with the reviewer that this is a complicated system. With regard to inhibitor development, a key insight is that designing aC-in state inhibitors that avoid paradoxical activation may be non-trivial because these molecules not only induce dimers but also tend to bind the second dimer subunit more weakly than the first, due to allosteric asymmetry and/or inherently different affinities for each RAF isoform. We feel the full implications for future therapeutic development are an extensive topic that is beyond the scope of our work, which is focused on the properties of current inhibitors.

      Recommendations for the author:

      The experimental work, analysis and resulting model are excellent. I had some difficulty following the complex model in some instances and it may be useful to review the description of the model and see whether it can be made more palatable to the broader readership. I think it would be useful to discuss the model presented in reference 40 (Kholodenko) and to compare it to the presented model here.

      We regret any confusion with regard to the nature of the model. Our analysis was built upon the model developed by Boris Kholodenko, as reported in his 2015 Cell Reports paper. This provided the theoretical framework which, combined with our experimental data, allowed us to parameterize the model and obtain experimental values for the equilibrium constants and allosteric coupling factors.

      Reviewer #2

      This manuscript combines elegant biophysical solution measurements to address paradoxical kinase activation by Type II BRAF inhibitors. The novel findings challenge prevailing models, through experiments that are rigorous and carefully controlled. The study is exemplary in the breadth of strategies it uses to address protein kinase dynamics and inhibitor allostery.

      Summary:

      This manuscript uses FRET, 19F-NMR and DEER/EPR solution measurements to examine the allosteric effects of a panel of BRAF inhibitors (BRAFi). These include first-generation aC-out BRAFi, and more recent Type I and Type II aC-in inhibitors. Intermolecular FRET measurements quantify Kd for BRAF dimerization and inhibitor binding to the first and second subunits. Distinct patterns are found between aC-in BRAFi, where Type I BRAFi bind equally well to the first and second subunits within dimeric BRAF. In contrast, Type II BRAFi show stronger affinity for the first subunit and weaker affinity for the second subunit, an effect named "allosteric asymmetry". Allosteric asymmetry has the potential for Type II inhibitors to promote dimerization while favoring occupancy of only one subunit (BBD form), leading to enrichment of an active dimer.

      Measurements of in vitro BRAF kinase activity correlate amazingly well with the calculated amounts of the half site-inhibited BBD forms with Type II inhibitors. This suggests that the allosteric asymmetry mechanism explains paradoxical activation by this class of inhibitors. DEER/EPR measurements further examine the positioning of helix aC. They show systematic outward movement of aC with Type II inhibitors, relative to the aC-in state with Type I inhibitors, and further show that helix aC adopts multiple states and is therefore dynamic in apo BRAF. This makes a strong case that negative cooperativity between sites in the BRAF dimer can account for paradoxical kinase activation by Type II inhibitors by creating a half site-occupied homodimer, BBD. In contrast, Type I inhibitors and aC-out inhibitors do not fit this model, and are therefore proposed to be explained by previously proposed models involving negative allostery between subunits in BRAF-CRAF heterodimers, RAS priming, and transactivation.
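
      The negative-cooperativity argument summarized above can be illustrated with a generic two-site binding polynomial. The sketch below is not the authors' fitted model, which also couples inhibitor binding to dimerization. It assumes a pre-formed dimer, a known free inhibitor concentration, and hypothetical microscopic dissociation constants `kd1` (first site) and `kd2` (second site); all function and parameter names are illustrative.

```python
def dimer_occupancy(inhibitor, kd1, kd2):
    """Fractions of a pre-formed dimer with 0, 1, or 2 inhibitor molecules bound.

    Assumptions (not taken from the source): the dimer is already formed,
    `inhibitor` is the free inhibitor concentration, kd1 is the microscopic
    dissociation constant of the first site and kd2 that of the second site,
    all in the same concentration units.
    """
    w_apo = 1.0
    w_half = 2.0 * inhibitor / kd1           # two equivalent ways to fill one site
    w_full = inhibitor ** 2 / (kd1 * kd2)    # both sites filled
    z = w_apo + w_half + w_full              # binding polynomial
    return w_apo / z, w_half / z, w_full / z


def peak_half_occupied(kd1, kd2):
    """Maximum half-occupied fraction, reached at inhibitor = sqrt(kd1 * kd2)."""
    x = (kd1 * kd2) ** 0.5
    return dimer_occupancy(x, kd1, kd2)[1]


if __name__ == "__main__":
    # Hypothetical constants in arbitrary concentration units.
    print(peak_half_occupied(1.0, 1.0))    # equal affinity for both sites
    print(peak_half_occupied(1.0, 100.0))  # second site 100-fold weaker
```

      With equal affinities the half-occupied species never exceeds 50% of the dimers in this toy model, whereas a 100-fold weaker second site lets it reach about 91%. This is the qualitative behavior attributed to Type II inhibitors in the summary.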

      Strengths:

      This study integrates orthogonal spectroscopic and kinetic strategies to characterize BRAF dynamics and determine how it impacts inhibitor allostery. The unique combination of approaches presented in this study represents a road map for future work in the important area of protein kinase dynamics. The work represents a worthy contribution not only to the field of BRAF regulation but protein kinases in general.

      Weaknesses:

      Some questions remain regarding the proposed model for Type II inhibitors and its comparison to Type I and aC-out inhibitors that would be useful to clarify. Specifically, it would be helpful to address whether the activation of BRAF by Type II inhibitors, while strongly correlated with BBD model predictions in vitro, also depends on CRAF via BRAF-CRAF in cells and therefore overlaps with the mechanisms of paradoxical activation by Type I and aC-out inhibitors.

      We agree with the reviewer that this is a worthy question to be pursued. However, given the substantial experimental effort required for such an endeavor, and the highly supportive nature of the reviewer comments, including that “This is a strong manuscript that I feel is well above the bar for publication”, we believe this effort is more appropriate for a future study.

      This is a strong manuscript that I feel is well above the bar for publication. Nevertheless, it is recommended that the authors consider addressing the following points in order to support their major conclusions.

      (1) Fig 3D shows similar effects of Type II and Type I inhibitors in the biphasic increase of cellular pMEK/pERK. From this, the authors argue that Type II inhibitors are explained by negative allostery in the BRAF homodimer (based on Fig 2E), while Type I inhibitors are not. But it seems possible that despite the terrific correlation between BBD and BRAF kinase activities measured in vitro, CRAF is still important to explain pathway activation in cells. It also seems conceivable that the calculated %BBD between different Type II inhibitors may not correlate as well with their effects on pathway activation in cells. These possibilities should be addressed.

      We agree with the reviewer that it is likely that CRAF contributes to paradoxical activation by type II inhibitors in cells. It is also likely that other cellular factors such as RAS-priming and membrane recruitment play a role in activation. However, we note that for the type II inhibitors there is good agreement between the biophysical predictions and the concentration regimes in which activation is observed in cells, suggesting that these predictions are capturing a key part of the activation process that occurs in cells.

      (2) In Fig 2A, is it possible to report the activity of dimeric BRAF-WT in the absence of inhibitor? This would help confirm that the maximal activity measured after titrating inhibitor is indeed consistent with the predicted %BBD population, which would be expected to have half of the specific activity of BB.

      In principle, it is possible to determine the catalytic activity of apo dimers (BB) by combining our model predictions for the concentration of BB dimers and our activity measurements. However, because the activity assays are performed at nanomolar kinase concentrations, whereas the baseline dimerization affinity of BRAF is in the micromolar range, the observed activity of apo BRAF arises from a small subpopulation of dimers (on the order of 4 percent under the conditions of our experiments) and is therefore difficult to define accurately. As a result, we deemed it more suitable to compare our results to published activity measurements derived from 14-3-3-activated dimers which should represent fully dimerized BRAF. This analysis, as reported in Figure 2E, suggests that the BBD activity is approximately half of that of BB.
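
      The point about a small dimer subpopulation follows from simple monomer-dimer mass action. The sketch below assumes a plain M + M ⇌ D equilibrium with no inhibitor present. The concentrations used are hypothetical values chosen only to sit in the "nanomolar kinase, micromolar dimerization constant" regime described above; they are not the authors' measured parameters.

```python
import math


def dimer_fraction(total, kd_dimer):
    """Fraction of protomers found in dimers for M + M <=> D.

    Uses kd_dimer = [M]^2 / [D] together with the mass balance
    total = [M] + 2 [D]. Both arguments share the same concentration units.
    """
    # Positive root of 2 [M]^2 + kd_dimer [M] - kd_dimer * total = 0
    monomer = kd_dimer * (math.sqrt(1.0 + 8.0 * total / kd_dimer) - 1.0) / 4.0
    return 1.0 - monomer / total


if __name__ == "__main__":
    # Hypothetical numbers: 50 nM kinase, 2.5 uM (2500 nM) dimerization constant.
    print(f"{dimer_fraction(50.0, 2500.0):.1%}")
```

      For these illustrative inputs the dimer fraction is a few percent, the same order as the roughly 4 percent quoted above. When the total concentration is far below the dimerization constant, the fraction is approximately 2 × total / kd_dimer.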

      (3) The 19F-NMR experiments make a good case for broadening of the helix aC signal in the BRAF dimer. From this, the study proposes that after inhibitor binds one subunit, the second unoccupied subunit retains dynamics. It would be useful to address this experimentally, if possible. For example, can the 19F-NMR signal be measured in the presence of inhibitor, to support the prediction that the unoccupied subunit is indeed dynamic and samples multiple conformations as in apo BRAF?

      We agree with the reviewer that it would be interesting to determine the dynamic response of BRAF to inhibitor binding. However, this is a challenging undertaking due to the biochemical heterogeneity that occurs at subsaturating inhibitor concentrations. For example, at any given inhibitor concentration, BRAF exists as a mixture of monomers, apo dimers, dimers with one inhibitor molecule, and dimers with two inhibitor molecules bound. This makes it challenging to relate the 19F NMR signal to a single biochemical state. Addressing this would require a substantial experimental effort that we feel is beyond the scope of this study.

    1. Author response:

      Reviewer 1:

      The paper “Quantifying gliding forces of filamentous cyanobacteria by self-buckling” combines experiments on freely gliding cyanobacteria, buckling experiments using two-dimensional V-shaped corners, and micropipette force measurements with theoretical models to study gliding forces in these organisms. The aim is to quantify these forces and use the results to perhaps discriminate between competing mechanisms by which these cells move. A large data set of possible collision events is analyzed, buckling events evaluated, and critical buckling lengths estimated. A line elasticity model is used to analyze the onset of buckling and estimate the effective (viscous type) friction/drag that controls the dynamics of the rotation that ensues post-buckling. This value of the friction/drag is compared to a second estimate obtained by consideration of the active forces and speeds in freely gliding filaments. The authors find that these two independent estimates of friction/drag correlate with each other and are comparable in magnitude. The experiments are conducted carefully, the device fabrication is novel, the data set is interesting, and the analysis is solid. The authors conclude that the experiments are consistent with the propulsion being generated by adhesion forces rather than slime extrusion. While consistent with the data, this conclusion is inferred.

      We thank the reviewer for the positive evaluation of our work.

      Summary:

      The paper addresses important questions on the mechanisms driving the gliding motility of filamentous cyanobacteria. The authors aim to understand these by estimating the elastic properties of the filaments, and by comparing the resistance to gliding under a) freely gliding conditions, and b) in post-buckled rotational states. Experiments are used to estimate the propulsion force density on freely gliding filaments (assuming over-damped conditions). Experiments are combined with a theoretical model based on Euler beam theory to extract friction (viscous) coefficients for filaments that buckle and begin to rotate about the pinned end. The main results are estimates for the bending stiffness of the bacteria, the propulsive tangential force density, the buckling threshold in terms of the length, and estimates of the resistive friction (viscous drag) providing the dissipation in the system and balancing the active force. It is found that experiments on the two bacterial species yield nearly identical values of f (albeit with rather large variations). The authors conclude that the experiments are consistent with the propulsion being generated by adhesion forces rather than slime extrusion.

      We appreciate this comprehensive summary of our work.

      Strengths of the paper:

      The strengths of the paper lie in the novel experimental setup and measurements that allow for the estimation of the propulsive force density, critical buckling length, and effective viscous drag forces for movement of the filament along its contour – the axial (parallel) drag coefficient, and the normal (perpendicular) drag coefficient (I assume this is the case, since the post-buckling analysis assumes the bent filament rotates at a constant frequency). These direct measurements are important for serious analysis and discrimination between motility mechanisms.

      We thank the reviewer for this positive assessment of our work.

      Weaknesses:

      There are aspects of the analysis and discussion that may be improved. I suggest that the authors take the following comments into consideration while revising their manuscript.

      The conclusion that adhesion via focal adhesions is the cause for propulsion rather than slime protrusion is consistent with the experimental results that the frictional drag correlates with propulsion force. At the same time, it is hard to rule out other factors that may result in this (friction) viscous drag - (active) force relationship while still being consistent with slime production. More detailed analysis aiming to discriminate between adhesion vs slime protrusion may be outside the scope of the study, but the authors may still want to elaborate on their inference. It would help if there was a detailed discussion on the differences in terms of the active force term for the focal adhesion-based motility vs the slime motility.

      We appreciate this critical assessment of our conclusions. Of course we are aware that many different mechanisms may lead to similar force/friction characteristics, and that a definitive conclusion on the mechanism would require the combination of various techniques, which is beyond the scope of this work. Therefore, we were very careful in formulating the discussion of our findings, refraining, in particular, from a singular conclusion on the mechanism but instead indicating “support” for one hypothesis over another, and emphasizing “that many other possibilities exist”.

      The most common competing hypotheses for bacterial gliding suggest that the propulsion forces are provided by slime extrusion at the junctional pore complex [A1], rhythmic contraction of fibrillar arrays at the cell wall [A2], focal adhesion sites connected to intracellular motor-microtubule complexes [A3], or modified type-IV pilus apparatuses [A4]. For the slime extrusion hypothesis, which is still widely held today, one would rather expect an anticorrelation of force and friction: more slime extrusion would generate more force, but also enhance lubrication. The other hypotheses are more consistent with the trend we observed in our experiments, because both pili and focal adhesion require direct contact with a substrate. How contraction of fibrillar arrays would micromechanically couple to the environment is not clear to us, but direct contact might still facilitate force transduction. Please note that these hypotheses were all postulated without any mechanical measurements, solely based on ultra-structural electron microscopy and/or genetic or proteomic experiments. We see our work as complementary to that, providing a mechanical basis for evaluating these hypotheses.

      We agree with the referee that narrowing down this discussion to focal adhesion should have been avoided. We rewrote the concluding paragraph (page 8):

      “…it indicates that friction and propulsion forces, despite being quite variable, correlate strongly. Thus, generating more force comes, inevitably, at the expense of added friction. For lubricated contacts, the friction coefficient is inversely proportional to the thickness of the lubricating layer (Snoeijer et al., 2013), and we conjecture active force and drag both increase due to a more intimate contact with the substrate. This supports mechanisms like focal adhesion (Mignot et al., 2007) or a modified type-IV pilus (Khayatan et al., 2015), which generate forces through contact with extracellular surfaces, as the underlying mechanism of the gliding apparatus of filamentous cyanobacteria: more contacts generate more force, but also closer contact with the substrate, thereby increasing friction to the same extent. Force generation by slime extrusion (Hoiczyk and Baumeister, 1998), in contrast, would lead to the opposite behavior: more slime generates more propulsion, but also reduces friction. Besides fundamental fluid-mechanical considerations (Snoeijer et al., 2013), this is rationalized by two experimental observations: i. gliding velocity correlates positively with slime layer thickness (Dhahri et al., 2013) and ii. motility in slime-secretion deficient mutants is restored upon exogenous addition of polysaccharide slime. Still, we emphasize that many other possibilities exist. One could, for instance, postulate a regulation of the generated forces to the experienced friction, to maintain some preferred or saturated velocity.”

      Can the authors comment on possible mechanisms (perhaps from the literature) that indicate how isotropic friction may be generated in settings where focal adhesions drive motility? A key aspect here would probably be estimating the extent of this adhesion patch and comparing it to a characteristic contact area. Can lubrication theory be used to estimate characteristic areas of contact (knowing the radius of the filament, and assuming a height above the substrate)? If the focal adhesions typically cover areas smaller than this lubrication area, it may suggest the possibility that bacteria essentially present a flat surface insofar as adhesion is concerned, leading to a transversely isotropic response in terms of the drag. Of course, we will still require the effective propulsive force to act along the tangent.

      We thank the referee for suggesting to estimate the dimensions of the contact region. Both pili and focal adhesion sites would be of sizes below one micron [A3, A4], much smaller than the typical contact region in the lubricated contact, which is on the order of the filament radius (few microns). So indeed, isotropic friction may be expected in this situation [A5] and is assumed frequently in theoretical work [A6–A8]. Anisotropy may then indeed be induced by active forces [A9], but we are not aware of measurements of the anisotropy of friction in bacterial gliding.

      For a more precise estimate using lubrication theory, rheology and extrusion rate of the secreted polysaccharides would have to be known, but we are not aware of detailed experimental characterizations.

      We extended the paragraph in the buckling theory on page 5 regarding the assumption of isotropic friction:

      “We use classical Kirchhoff theory for a uniform beam of length L and bending modulus B, subject to a force density $\vec{b} = -f\,\vec{t} - \eta\,\vec{v}$, with an effective active force density f along the tangent $\vec{t}$, and an effective friction proportional to the local velocity $\vec{v}$, analogous to existing literature (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). Presumably, this friction is dominated by the lubrication drag from the contact with the substrate, filled by a thin layer of secreted polysaccharide slime which is much more viscous than the surrounding bulk fluid. Speculatively, the motility mechanism might also comprise adhering elements like pili (Khayatan et al., 2015) or foci (Mignot et al., 2007) that increase the overall friction (Pompe et al., 2015). Thus, the drag due to the surrounding bulk fluid can be neglected (Man and Kanso, 2019), and friction is assumed to be isotropic, a common assumption in motility models (Fei et al., 2020; Tchoufag et al., 2019; Wada et al., 2013). We assume…”
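
      The link between the beam model and a measurable buckling length can be sketched from dimensional analysis alone. The ratio B/f has units of length cubed, so a critical length must scale as (B/f)^(1/3) times a dimensionless prefactor that depends on boundary conditions. In the sketch below, the function names, the numerical inputs, and the prefactor value are assumptions made for illustration; the prefactor in particular should be taken from the paper's own analysis rather than from this example.

```python
def critical_buckling_length(bending_modulus, force_density, prefactor):
    """Critical length L_c = (prefactor * B / f) ** (1/3) for self-buckling.

    B / f has units of length cubed, so the cube root follows from
    dimensional analysis. The dimensionless prefactor depends on the
    boundary conditions and must be supplied by the caller.
    """
    if bending_modulus <= 0 or force_density <= 0 or prefactor <= 0:
        raise ValueError("all arguments must be positive")
    return (prefactor * bending_modulus / force_density) ** (1.0 / 3.0)


def active_force_density(bending_modulus, critical_length, prefactor):
    """Invert the same relation: f = prefactor * B / L_c ** 3."""
    if bending_modulus <= 0 or critical_length <= 0 or prefactor <= 0:
        raise ValueError("all arguments must be positive")
    return prefactor * bending_modulus / critical_length ** 3


if __name__ == "__main__":
    # Hypothetical SI inputs: B in N*m^2, f in N/m.
    # The prefactor 30.57 is an assumed value for a filament pinned at its
    # leading end; it is not stated in the text above.
    B, f, c = 1.0e-16, 1.0e-3, 30.57
    L_c = critical_buckling_length(B, f, c)
    print(f"L_c = {L_c * 1e6:.0f} micrometres")
    print(f"recovered f = {active_force_density(B, L_c, c):.3e} N/m")
```

      Written this way, the inversion shows why a measured critical length together with an independently measured bending modulus is enough to estimate the active force density. The cubic dependence also means that uncertainty in the critical length is amplified threefold in the force estimate.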

      We also extended the discussion regarding the outcome of isotropic friction (page 7):

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutations deficient of slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015 ). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019 ), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter was dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      I am not sure why the authors mention that the power of the gliding apparatus is not rate-limiting. The only way to verify this would be to put these in highly viscous fluids where the drag of the external fluid comes into the picture as well (if focal adhesions are on the substrate-facing side, and the upper side is subject to ambient fluid drag). Also, the friction referred to here has the form of a viscous drag (no memory effect, and thus not viscoelastic or gel-like), and it is not clear if forces generated by adhesion involve other forms of drag such as chemical friction via temporary bonds forming and breaking. In quasi-static settings and under certain conditions such as the separation of chemical and elastic time scales, bond friction may yield overall force proportional to local sliding velocities.

      We agree with the referee that the origin of the friction is not easily resolved. Lubrication yields an isotropic force density that is proportional to the velocity, and the same could be generated by bond friction. Importantly, both types of friction would be assumed to be predominantly isotropic. We explicitly referred to lubrication drag because it has been shown that mutations deficient of slime extrusion do not glide [A4].

      Assuming, in contrast, that in free gliding, friction with the environment is not rate limiting, but rather the internal friction of the gliding apparatus, i.e., the available power, we would expect a rather different behavior during early-buckling evolution. During early buckling, the tangential motion is stalled, and the dynamics is dominated by the growing buckling amplitude of filament regions near the front end, which move mainly transversely. For geometric reasons, in this stage the (transverse) buckling amplitude grows much faster than the rear part of the filament advances longitudinally. Thus that motion should not be impeded much by the internal friction of the gliding apparatus, but by external friction between the buckling parts of the filament and the ambient. The rate at which the buckling amplitude initially grows should be limited by the accumulated compressive stress in the filament and the transverse friction with the substrate. If the latter were much smaller than the (longitudinal) internal friction of the gliding apparatus, we would expect a snapping-like transition into the buckled state, which we did not observe.

      In our paper, we do not intend to evaluate the exact origin of the friction; quantifying the gliding force is the main objective. A linear force-velocity relation agrees with our observations. A detailed analysis of friction in cyanobacterial gliding would be an interesting direction for future work.

      To make these considerations clearer, we rephrased the corresponding paragraph on page 7 & 8:

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutations deficient of slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015 ). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019 ), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter was dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      For readers from a non-fluids background, some additional discussion of the drag forces, and the forms of friction would help. For a freely gliding filament if f is the force density (per unit length), then steady gliding with a viscous frictional drag would suggest (as mentioned in the paper) f ∼ v! L η||. The critical buckling length is then dependent on f and on B the bending modulus. Here the effective drag is defined per length. I can see from this that if the active force is fixed, and the viscous component resulting from the frictional mechanism is fixed, the critical buckling length will not depend on the velocity (unless I am missing something in their argument), since the velocity is not a primitive variable, and is itself an emergent quantity.

      We are not sure what “f ∼ v! L η||” means, possibly the spelling was corrupted in the forwarding of the comments.

      We assumed an overdamped motion in which the friction force density f_f (per unit length of the filament) is proportional to the velocity v0, i.e. f_f ∼ η v0, with a friction coefficient η. Overdamped means that the friction force density is equal and opposite to the propulsion force density, so the propulsion force density is f ∼ f_f ∼ η v0. The total friction and propulsion forces can be obtained by multiplication with the filament length L, which is not required here. In this picture, v0 is an emergent quantity and f and η are assumed as given and constant. Thus, by observing v0, f can be inferred up to the friction coefficient η. Therefore, by using two descriptive variables, L and v0, with known B, the primitive variable η can be inferred by logistic regression, and f then follows from the overdamped equation of motion.
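      The chain of inference described above can be illustrated with a minimal Python sketch. It only uses the two relations quoted in this response, Lc = (30.5722 B/f)^(1/3) and f = η v0. The function names and all numerical values are hypothetical and chosen for illustration; they are not taken from the measurements.

```python
BUCKLING_CONSTANT = 30.5722  # dimensionless threshold f L^3 / B quoted in the manuscript


def critical_length(bending_modulus, force_density):
    """Critical self-buckling length Lc = (30.5722 B / f)^(1/3)."""
    return (BUCKLING_CONSTANT * bending_modulus / force_density) ** (1.0 / 3.0)


def force_density_from_critical_length(bending_modulus, critical_len):
    """Invert the buckling threshold: f = 30.5722 B / Lc^3."""
    return BUCKLING_CONSTANT * bending_modulus / critical_len ** 3


def friction_coefficient(force_density, free_velocity):
    """Overdamped force balance f = eta * v0, solved for eta."""
    return force_density / free_velocity


# Hypothetical illustrative inputs in SI units (assumptions, not data from the paper).
B = 1.0e-16    # bending modulus, N m^2
Lc = 150.0e-6  # critical length, m
v0 = 2.0e-6    # free gliding velocity, m/s

f = force_density_from_critical_length(B, Lc)  # force density, N/m
eta = friction_coefficient(f, v0)              # friction coefficient, N s/m^2
```

      In this sketch v0 is an observed input, while f and η are the inferred quantities, which mirrors the roles the variables play in the argument above.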

      To clarify this, we revised the corresponding section on page 5 of the paper:

      “The substrate contact requires lubrication from polysaccharide slime to enable bacteria to glide (Khayatan et al., 2015). Thus we assume an overdamped motion with co-linear friction, for which the propulsion force f and the free gliding velocity v0 of a filament are related by f = η v0, with a friction coefficient η. In this scenario, f can be inferred both from the observed Lc ∼ (f/B)^(−1/3) and, up to the proportionality coefficient η, from the observed free gliding velocity. Thus, by combining the two relations, one may expect also a strong correlation between Lc and v0. In order to test this relation for consistency with our data, we include v0 as a second regressor, by setting x = (L−Lc(v0))/∆Lc in Equation 1, with Lc(v0) = (η v0/(30.5722 B))^(−1/3), to reflect our expectation from theory (see below). Now, η rather than f is the only unknown, and its ensemble distribution will be determined in the regression. Figure 3 E,F show the buckling behavior…”

      Reviewer 2:

      In the presented manuscript, the authors first use structured microfluidic devices with gliding filamentous cyanobacteria inside in combination with micropipette force measurements to measure the bending rigidity of the filaments.

      Next, they use triangular structures to trap the bacteria with the front against an obstacle. Depending on the length and rigidity, the filaments buckle under the propulsive force of the cells. The authors use theoretical expressions for the buckling threshold to infer propulsive force, given the measured length and stiffnesses. They find nearly identical values for both species, f ∼ (1.0 ± 0.6) nN/µm, nearly independent of the velocity.

      Finally, they measure the shape of the filament dynamically to infer friction coefficients via Kirchhoff theory. This last part seems a bit inconsistent with the previous inference of propulsive force. Before, they assumed the same propulsive force for all bacteria and showed only a very weak correlation between buckling and propulsive velocity. In this section, they report a strong correlation with velocity, and report propulsive forces that vary over two orders of magnitude. I might be misunderstanding something, but I think this discrepancy should have been discussed or explained.

      We regret the misunderstanding of the reviewer regarding the velocity dependence, which indicates that the manuscript should be improved to convey these relations correctly.

      First, in the Buckling Measurements section, we did not assume the same propulsion force for all bacteria. The logistic regression yields an ensemble median for Lc (and thus an ensemble median for f ), along with the width ∆Lc of the distribution (and thus also the width of the distribution of f ). Our result f ∼ (1.0 ± 0.6) nN/µm indicates the median and the width of the distribution of the propulsion force densities across the ensemble of several hundred filaments used in the buckling measurements. The large variability of the forces found in the second part is consistently reflected by this very wide distribution of active forces detected in the logistic regression in the first part.
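      To make the role of the median Lc and the width ∆Lc concrete, here is a minimal Python sketch of the regression variable. It assumes that Equation 1 of the manuscript, which is not reproduced in this response, has the standard logistic form p = 1/(1 + exp(−x)); that form, the function names and the example numbers are assumptions made for illustration only. The expression for Lc(v0) is the one quoted from the manuscript.

```python
import math


def buckling_probability(length, critical_len, width):
    """Assumed standard logistic form with x = (L - Lc) / dLc."""
    x = (length - critical_len) / width
    return 1.0 / (1.0 + math.exp(-x))


def critical_length_from_velocity(eta, v0, bending_modulus):
    """Lc(v0) = (eta * v0 / (30.5722 * B))^(-1/3), as quoted from the manuscript."""
    return (eta * v0 / (30.5722 * bending_modulus)) ** (-1.0 / 3.0)


# Hypothetical values in SI units (assumptions, not fitted parameters).
eta = 5.0e2    # friction coefficient, N s/m^2
v0 = 2.0e-6    # free gliding velocity, m/s
B = 1.0e-16    # bending modulus, N m^2
dLc = 30.0e-6  # width of the critical-length distribution, m

Lc = critical_length_from_velocity(eta, v0, B)
p_short = buckling_probability(0.5 * Lc, Lc, dLc)
p_long = buckling_probability(2.0 * Lc, Lc, dLc)
```

      A filament with L = Lc buckles with probability one half in this form, and ∆Lc sets how quickly the probability rises with length, so a wide distribution of forces across the ensemble shows up as a large ∆Lc.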

      We made small modifications to the buckling theory paragraph to clarify that, in the first part, a distribution of forces rather than a constant value is inferred (page 6):

      “Inserting the population median and quartiles of the distributions of bending modulus and critical length, we can now quantify the distribution of the active force density for the filaments in the ensemble from the buckling measurements. We obtain nearly identical values for both species, f ∼ (1.0±0.6) nN/µm, where the uncertainty represents a wide distribution of f across the ensemble rather than a measurement error.”

      The same holds, of course, when inferring the distribution of the friction coefficients (page 5):

      “The substrate contact requires lubrication from polysaccharide slime to enable bacteria to glide (Khayatan et al., 2015). Thus we assume an overdamped motion with co-linear friction, for which the propulsion force f and the free gliding velocity v0 of a filament are related by f = η v0, with a friction coefficient η. In this scenario, f can be inferred both from the observed Lc ∼ (f/B)^(−1/3) and, up to the proportionality coefficient η, from the observed free gliding velocity. Thus, by combining the two relations, one may expect also a strong correlation between Lc and v0. In order to test this relation for consistency with our data, we include v0 as a second regressor, by setting x = (L−Lc(v0))/∆Lc in Equation 1, with Lc(v0) = (η v0/(30.5722 B))^(−1/3), to reflect our expectation from theory (see below). Now, η rather than f is the only unknown, and its ensemble distribution will be determined in the regression. Figure 3 E,F show the buckling behavior…”

      The (naturally) wide distribution of force (and friction) leads to a distribution of Lc as well. However, due to the small exponent of magnitude 1/3 in the buckling threshold Lc ∼ f^(−1/3), the distribution of Lc is not as wide as the distributions of the individually inferred f or η. This is visualized in panel G of Figure 3, plotting Lc as a function of v0 (v0 is equivalent to f, up to a proportionality coefficient η). The natural length distribution, in contrast, is very wide. Therefore, the buckling propensity of a filament is most strongly characterized by its length, while force variability, which alters Lc of the individual, plays a secondary role.
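      As a rough worked estimate of this narrowing, one can propagate small relative variations through the buckling threshold quoted in the manuscript. The first-order expansion below is only a sketch: it treats the spread as small, which is a crude assumption for a distribution as wide as the one reported here.

```latex
\[
L_c = \left(\frac{30.5722\,B}{f}\right)^{1/3}
\quad\Longrightarrow\quad
\frac{\delta L_c}{L_c} \approx \frac{1}{3}\,\frac{\delta B}{B} - \frac{1}{3}\,\frac{\delta f}{f} .
\]
% At fixed B, a relative spread of the force density of
% \delta f / f \approx 0.6 (from f \sim (1.0 \pm 0.6) nN/\mu m) gives
\[
\left|\frac{\delta L_c}{L_c}\right| \approx \frac{1}{3} \times 0.6 = 0.2 .
\]
```

      Under these assumptions, a relative spread of about 60% in f corresponds to a relative spread of only about 20% in Lc.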

      In order to clarify this, we edited the last paragraph of the Buckling Measurements section on page 5 of the manuscript:

      “…Within the characteristic range of observed velocities (1 − 3 µm/s), the median Lc depends only mildly on v0, as compared to its rather broad distribution, indicated by the bands in Figure 3 G. Thus a possible correlation between f and v0 would only mildly alter Lc. The natural length distribution (cf. Appendix 1—figure 1 ), however, is very broad, and we conclude that growth rather than velocity or force distributions most strongly impacts the buckling propensity of cyanobacterial colonies. Also, we hardly observed short and fast filaments of K. animale, which might be caused by physiological limitations (Burkholder, 1934 ).”

      Second, in the Profile analysis section, we did not report a correlation between force and velocity. As can be seen in Figure 4—figure Supplement 1, neither the active force nor the friction coefficient, as determined from the analysis of individual filaments, show any significant correlation with the velocity. This is also written in the discussion (page 7):

      We see no significant correlation between L or v0 and f or η, but the observed values of f and η cover a wide range (Figure 4 B, C and Figure 4—figure Supplement 1 ).

      Note that this is indeed consistent with the logistic regression: Using v0 as a second regressor did not significantly reduce the width of the distribution of Lc as compared to the simple logistic regression, indicating that force and velocity are not strongly correlated.

      In order to clarify this in the manuscript, we modified that part (page 7):

      “…We see no significant correlation between L or v0 and f or η, but the observed values of f and η cover a wide range (Figure 4 B,C and Figure 4— figure Supplement 1 ). This is consistent with the logistic regression, where using v0 as a second regressor did not significantly reduce the width of the distribution of critical lengths or active forces. The two estimates of the friction coefficient, from logistic regression and individual profile fits, are measured in (predominantly) orthogonal directions: tangentially for the logistic regression where the free gliding velocity was used, and transversely for the evolution of the buckling profiles. Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic…”

      From a theoretical perspective, not many new results are presented. The authors repeat the well-known calculation for filaments buckling under propulsive load and arrive at the literature result of buckling when the dimensionless number (f L^3/B) is larger than 30.6 as previously derived by Sekimoto et al in 1995 [1] (see [2] for a clamped boundary condition and simulations). Other theoretical predictions for pushed semi-flexible filaments [1–4] are not discussed or compared with the experiments. Finally, the authors use molecular dynamics type simulations similar to [2–4] to reproduce the buckling dynamics from the experiments. Unfortunately, no systematic comparison is performed.

      [1] Ken Sekimoto, Naoki Mori, Katsuhisa Tawada, and Yoko Y Toyoshima. Symmetry breaking instabilities of an in vitro biological system. Physical Review Letters, 75(1):172, 1995.

      [2] Raghunath Chelakkot, Arvind Gopinath, Lakshminarayanan Mahadevan, and Michael F Hagan. Flagellar dynamics of a connected chain of active, polar, brownian particles. Journal of The Royal Society Interface, 11(92):20130884, 2014.

      [3] Rolf E Isele-Holder, Jens Elgeti, and Gerhard Gompper. Self-propelled worm-like filaments: spontaneous spiral formation, structure, and dynamics. Soft Matter, 11(36):7181–7190, 2015.

      [4] Rolf E Isele-Holder, Julia Jäger, Guglielmo Saggiorato, Jens Elgeti, and Gerhard Gompper. Dynamics of self-propelled filaments pushing a load. Soft Matter, 12(41):8495–8505, 2016.

      We thank the reviewer for pointing us to these publications, in particular the work by Sekimoto et al., of which we were not aware. We agree with the referee that the calculation is straightforward (basically known since Euler, up to modified boundary conditions). Our paper focuses on experimental work; the molecular dynamics simulations were included mainly as a consistency check and were not intended to generate the beautiful post-buckling patterns observed in references [2–4]. However, such shapes do emerge in filamentous cyanobacteria, and with the data provided in our manuscript, simulations can be quantitatively matched to our experiments, which will be covered by future work.

      We included the references in the revision of our manuscript, and a statement that we do not claim priority on these classical theoretical results.

      Introduction, page 2:

      “…Self-buckling is an important instability for self-propelling rod-like micro-organisms to change the orientation of their motion, enabling aggregation or the escape from traps (Fily et al., 2020; Man and Kanso, 2019; Isele-Holder et al., 2015; Isele-Holder et al., 2016). The notion of self-buckling goes back to work of Leonhard Euler in 1780, who described elastic columns subject to gravity (Elishakoff, 2000). Here, the principle is adapted to the self-propelling, flexible filaments (Fily et al., 2020; Man and Kanso, 2019; Sekimoto et al., 1995) that glide onto an obstacle. Filaments buckle if they exceed a certain critical length Lc ∼ (B/f)^(1/3), where B is the bending modulus and f the propulsion force density…”

      Buckling theory, page 5:

      “…The buckling of gliding filaments differs in two aspects: the propulsion forces are oriented tangentially instead of vertically, and the front end is supported instead of clamped. Therefore, with L < Lc all initial orientations are indifferently stable, while for L > Lc, buckling induces curvature and a resultant torque on the head, leading to rotation (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). Buckling under concentrated tangential end-loads has also been investigated in literature (de Canio et al., 2017; Wolgemuth et al., 2005), but leads to substantially different shapes of buckled filaments. We use classical Kirchhoff theory for a uniform beam of length L and bending modulus B, subject to a force density b⃗ = −f t⃗ − η v⃗, with an effective active force density f along the tangent t⃗, and an effective friction proportional to the local velocity v⃗, analogous to existing literature (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995)…”

      Further on page 6:

      “To derive the critical self-buckling length, Equation 5 can be linearized for two scenarios that lead to the same Lc: early-time small amplitude buckling and late-time stationary rotation at small and constant curvature (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). […] Thus, in physical units, the critical length is given by Lc = (30.5722 B/f)^(1/3), which is reproduced in particle based simulations (Appendix Figure 2) analogous to those in Isele-Holder et al. (2015, 2016).”

      Discussion, page 7 & 8:

      “…This, in turn, has dramatic consequences on the exploration behavior and the emerging patterns (Isele-Holder et al., 2015, 2016; Abbaspour et al., 2021; Duman et al., 2018; Prathyusha et al., 2018; Jung et al., 2020): (L/Lc)^3 is, up to a numerical prefactor, identical to the flexure number (Isele-Holder et al., 2015, 2016; Duman et al., 2018; Winkler et al., 2017), the ratio of the Peclet number and the persistence length of active polymer melts. Thus, the ample variety of non-equilibrium phases in such materials (Isele-Holder et al., 2015, 2016; Prathyusha et al., 2018; Abbaspour et al., 2021) may well have contributed to the evolutionary success of filamentous cyanobacteria.”

      Reviewer 3:

      Summary:

      This paper presents novel and innovative force measurements of the biophysics of gliding cyanobacteria filaments. These measurements allow for estimates of the resistive force between the cell and substrate and provide potential insight into the motility mechanism of these cells, which remains unknown.

      We thank the reviewer for the positive evaluation of our work. We have revised the manuscript according to their comments and detail our replies and modifications next to the individual points below.

      Strengths:

      The authors used well-designed microfabricated devices to measure the bending modulus of these cells and to determine the critical length at which the cells buckle. I especially appreciated the way the authors constructed an array of pillars and used it to do 3-point bending measurements and the arrangement the authors used to direct cells into a V-shaped corner in order to examine the length at which the cells buckled. By examining the gliding speed of the cells before buckling events, the authors were able to determine how strongly the buckling length depends on the gliding speed, which could be an indicator of how the force exerted by the cells depends on cell length; however, the authors did not comment on this directly.

      We thank the referee for the positive assessment of our work. Importantly, we do not see a significant correlation between buckling length and gliding speeds, and we also do not see a correlation with filament length, consistent with the assumption of a propulsion force density that is more or less homogeneously distributed along the filament. Note that each filament consists of many metabolically independent cells, which renders cyanobacterial gliding a collective effort of many cells, in contrast to gliding of, e.g., myxobacteria.

      In response also to the other referees’ comments, we modified the manuscript to reflect more on the absence of a strong correlation between velocity and force/critical length. We modified the Buckling measurements section on page 5 of the paper:

      “The substrate contact requires lubrication from polysaccharide slime to enable bacteria to glide (Khayatan et al., 2015). Thus we assume an overdamped motion with co-linear friction, for which the propulsion force f and the free gliding velocity v0 of a filament are related by f = η v0, with a friction coefficient η. In this scenario, f can be inferred both from the observed Lc ∼ (f/B)^(−1/3) and, up to the proportionality coefficient η, from the observed free gliding velocity. Thus, by combining the two relations, one may expect also a strong correlation between Lc and v0. In order to test this relation for consistency with our data, we include v0 as a second regressor, by setting x = (L−Lc(v0))/∆Lc in Equation 1, with Lc(v0) = (η v0/(30.5722 B))^(−1/3), to reflect our expectation from theory (see below). Now, η rather than f is the only unknown, and its ensemble distribution will be determined in the regression. Figure 3 E, F show the buckling behavior…”

      Further, we edited the last paragraph of the Buckling measurements section on page 5 of the manuscript:

      “Within the characteristic range of observed velocities (1 − 3 µm/s), the median Lc depends only mildly on v0, as compared to its rather broad distribution, indicated by the bands in Figure 3 G. Thus a possible correlation between f and v0 would only mildly alter Lc. The natural length distribution (cf. Appendix 1—figure 1 ), however, is very broad, and we conclude that growth rather than velocity or force distributions most strongly impacts the buckling propensity of cyanobacterial colonies. Also, we hardly observed short and fast filaments of K. animale, which might be caused by physiological limitations (Burkholder, 1934 ).”

      We also rephrased the corresponding discussion paragraph on page 7:

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutations deficient of slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015 ). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019 ), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter was dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      Weaknesses:

      There were two minor weaknesses in the paper.

      First, the authors investigate the buckling of these gliding cells using an Euler beam model. A similar mathematical analysis was used to estimate the bending modulus and gliding force for Myxobacteria (C.W. Wolgemuth, Biophys. J. 89: 945-950 (2005)). A similar mathematical model was also examined in G. De Canio, E. Lauga, and R.E Goldstein, J. Roy. Soc. Interface, 14: 20170491 (2017). The authors should have cited these previous works and pointed out any differences between what they did and what was done before.

      We thank the reviewer for pointing us to these references. The paper by Wolgemuth is theoretical work, describing A-motility in myxobacteria by a concentrated propulsion force at the rear end of the bacterium, possibly stemming from slime extrusion. This model was refuted a little later by [A3], who demonstrated that focal adhesion along the bacterial body, and thus a distributed force, powers A-motility, a mechanism that has by now been investigated in great detail (see [A10]). The paper by De Canio et al. contains a thorough theoretical analysis of a filament that is clamped at one end and subject to a concentrated tangential load on the other. Since both models comprise a concentrated end-load rather than a distributed propulsion force density, they describe a substantially different motility mechanism, leading also to substantially different buckling profiles. Consequently, these models cannot be applied to cyanobacterial gliding.

      We included both citations in the revision and pointed out the differences to our work in the introduction (page 2):

      “…A few species appear to employ a type-IV-pilus related mechanism (Khayatan et al., 2015; Wilde and Mullineaux, 2015), similar to the better-studied myxobacteria (Godwin et al., 1989; Mignot et al., 2007; Nan et al., 2014; Copenhagen et al., 2021), which are short, rod-shaped single cells that exhibit two types of motility: S (social) motility based on pilus extension and retraction, and A (adventurous) motility based on focal adhesion (Chen and Nan, 2022) for which also slime extrusion at the trailing cell pole was earlier postulated as mechanism (Wolgemuth et al., 2005). Yet, most gliding filamentous cyanobacteria do not exhibit pili and their gliding mechanism appears to be distinct from myxobacteria (Khayatan et al., 2015).”

      And in Buckling theory, page 5:

      “…The buckling of gliding filaments differs in two aspects: the propulsion forces are oriented tangentially instead of vertically, and the front end is supported instead of clamped. Therefore, with L < Lc all initial orientations are indifferently stable, while for L > Lc, buckling induces curvature and a resultant torque on the head, leading to rotation (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). Buckling under concentrated tangential end-loads has also been investigated in literature (de Canio et al., 2017; Wolgemuth et al., 2005), but leads to substantially different shapes of buckled filaments.”

      The second weakness is that the authors claim that their results favor a focal adhesion-based mechanism for cyanobacterial gliding motility. This is based on their result that friction and adhesion forces correlate strongly. They then conjecture that this is due to more intimate contact with the surface, with more contacts producing more force and pulling the filaments closer to the substrate, which produces more friction. They then claim that a slime-extrusion mechanism would necessarily involve more force and lower friction. Is it necessarily true that this latter statement is correct? (I admit that it could be, but is it a requirement?)

      We thank the referee for raising this interesting question. Our claim regarding slime extrusion is based on three facts: i. mutations deficient of slime extrusion do not glide, but start gliding as soon as slime is provided externally [A4]. ii. A positive correlation between speed and slime layer thickness was observed in Nostoc [A11]. iii. The fluid mechanics of lubricated sliding contacts is very well understood and predicts a decreasing resistance with increasing layer thickness.

      We included these considerations in the revision of our manuscript (page 8):

      “…it indicates that friction and propulsion forces, despite being quite variable, correlate strongly. Thus, generating more force comes, inevitably, at the expense of added friction. For lubricated contacts, the friction coefficient is inversely proportional to the thickness of the lubricating layer (Snoeijer et al., 2013), and we conjecture active force and drag both increase due to a more intimate contact with the substrate. This supports mechanisms like focal adhesion (Mignot et al., 2007) or a modified type-IV pilus (Khayatan et al., 2015), which generate forces through contact with extracellular surfaces, as the underlying mechanism of the gliding apparatus of filamentous cyanobacteria: more contacts generate more force, but also closer contact with the substrate, thereby increasing friction to the same extent. Force generation by slime extrusion (Hoiczyk and Baumeister, 1998), in contrast, would lead to the opposite behavior: More slime generates more propulsion, but also reduces friction. Besides fundamental fluid-mechanical considerations (Snoeijer et al., 2013), this is rationalized by two experimental observations: i. gliding velocity correlates positively with slime layer thickness (Dhahri et al., 2013) and ii. motility in slime-secretion deficient mutants is restored upon exogenous addition of polysaccharide slime. Still we emphasize that many other possibilities exist. One could, for instance, postulate a regulation of the generated forces to the experienced friction, to maintain some preferred or saturated velocity.”

      Related to this, the authors use a model with isotropic friction. They claim that this is justified because they are able to fit the cell shapes well with this assumption. How would assuming a non-isotropic drag coefficient affect the shapes? It may be that it does equally well, in which case, the quality of the fits would not be informative about whether or not the drag was isotropic or not.

      The referee raises another very interesting point. Given the typical variability and uncertainty in experimental measurements (cf. error Figure 4 A), a model with a slightly anisotropic friction could be fitted to the observed buckling profiles as well, without significant increase of the mismatch. Yet, strongly anisotropic friction would not be consistent with our observations.

      Importantly, however, we did not conclude that friction is isotropic based on the fit quality, but based on a comparison between free gliding and early buckling (Figure 4 D). In early buckling, the dominant motion is in the transverse direction, while longitudinal motion is insignificant for geometric reasons. Thus, independent of the underlying model, it is mostly the transverse friction coefficient that is inferred. In contrast, free gliding is a purely longitudinal motion, and thus only the friction coefficient for longitudinal motion can be inferred. These two friction coefficients are compared in Figure 4 D. Still, the scatter of those data would allow a certain anisotropy to be fitted within the error margins. What we can exclude based on our observations is the case of strongly anisotropic friction. If there is neither an ab-initio reason for anisotropy nor a measurement that indicates it, we prefer to stick with the simplest assumption. We carefully chose our wording in the Discussion as “mainly isotropic” rather than “isotropic” or “fully isotropic”.
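
      The comparison between free gliding and early buckling can be written compactly. The following is only a sketch under assumed notation: the symbols for the longitudinal and transverse coefficients, the unit tangent and the force balance are introduced here for illustration and are not taken from the manuscript.

```latex
% Assumed anisotropic friction law for a filament with unit tangent \mathbf{t}
% moving at local velocity \mathbf{v} (force per unit length):
\mathbf{f}_{\mathrm{fric}}
  = -\left[\eta_{\parallel}\,\mathbf{t}\mathbf{t}^{\top}
  + \eta_{\perp}\left(\mathbf{1}-\mathbf{t}\mathbf{t}^{\top}\right)\right]\mathbf{v}

% Free gliding (purely longitudinal, speed v_0): propulsion balances friction
f = \eta_{\parallel}\, v_0
  \quad\Rightarrow\quad \eta_{\parallel} = f / v_0

% Early buckling (mostly transverse motion): the fitted coefficient is
\eta \approx \eta_{\perp}

% "Mainly isotropic" friction then corresponds to
f / v_0 \approx \eta
  \quad\Longleftrightarrow\quad \eta_{\parallel} \approx \eta_{\perp}
```

      In this reading, nearly identical values of f/v and η are what one would expect if the two coefficients are approximately equal, while the scatter of the data limits how much anisotropy can be excluded.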

      We added a small statement to the Discussion on page 7 & 8:

      “... Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutations deficient of slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015 ). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019 ), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter was dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces ...”

      Recommendations for the authors

      The discussion regarding how the findings of this paper imply that cyanobacteria filaments are propelled by adhesion forces rather than slime extrusion should be improved, as this conclusion seems questionable. There appears to be an inconsistency with a buckling force said to be only weakly dependent on the gliding velocity, while its ratio with the velocity correlates with a friction coefficient. Finally, data and source code should be made publicly available.

      In the revised version, we have modified the discussion of the force-generating mechanism according to the reviewers' suggestions. The perception of an inconsistency in the velocity dependence of the buckling force was based on a misunderstanding, as we detailed in our reply to the referee. We revised the corresponding section to make it clearer. Data and source code have been uploaded to a public data repository.

      Reviewer #2 (recommendations for the authors)

      Despite eLife policy, the authors do not provide a Data Availability Statement. For the presented manuscript, data and source code should be provided “via trusted institutional or third-party repositories that adhere to policies that make data discoverable, accessible and usable.” https://elifesciences.org/inside-elife/51839f0a/for-authors-updates-to-elife-s-data-sharing-policies

      Most of the issues in this reviewer’s public review should be easy to correct, so I would strongly support the authors to provide an amended manuscript.

      We added the Data Availability Statement in the amended manuscript.

      References

      [A1] E. Hoiczyk and W. Baumeister. “The junctional pore complex, a prokaryotic secretion organelle, is the molecular motor underlying gliding motility in cyanobacteria”. In: Curr. Biol. 8.21 (1998), pp. 1161–1168. doi: 10.1016/s0960-9822(07)00487-3.

      [A2] N. Read, S. Connell, and D. G. Adams. “Nanoscale Visualization of a Fibrillar Array in the Cell Wall of Filamentous Cyanobacteria and Its Implications for Gliding Motility”. In: J. Bacteriol. 189.20 (2007), pp. 7361–7366. doi: 10.1128/jb.00706-07.

      [A3] T. Mignot, J. W. Shaevitz, P. L. Hartzell, and D. R. Zusman. “Evidence That Focal Adhesion Complexes Power Bacterial Gliding Motility”. In: Science 315.5813 (2007), pp. 853–856. doi: 10.1126/science.1137223.

      [A4] Behzad Khayatan, John C. Meeks, and Douglas D. Risser. “Evidence that a modified type IV pilus-like system powers gliding motility and polysaccharide secretion in filamentous cyanobacteria”. In: Mol. Microbiol. 98.6 (2015), pp. 1021–1036. doi: 10.1111/mmi.13205.

      [A5] Tilo Pompe, Martin Kaufmann, Maria Kasimir, Stephanie Johne, Stefan Glorius, Lars Renner, Manfred Bobeth, Wolfgang Pompe, and Carsten Werner. “Friction-controlled traction force in cell adhesion”. In: Biophysical journal 101.8 (2011), pp. 1863–1870.

      [A6] Hirofumi Wada, Daisuke Nakane, and Hsuan-Yi Chen. “Bidirectional bacterial gliding motility powered by the collective transport of cell surface proteins”. In: Physical Review Letters 111.24 (2013), p. 248102.

      [A7] Joël Tchoufag, Pushpita Ghosh, Connor B Pogue, Beiyan Nan, and Kranthi K Mandadapu. “Mechanisms for bacterial gliding motility on soft substrates”. In: Proceedings of the National Academy of Sciences 116.50 (2019), pp. 25087–25096.

      [A8] Chenyi Fei, Sheng Mao, Jing Yan, Ricard Alert, Howard A Stone, Bonnie L Bassler, Ned S Wingreen, and Andrej Kosmrlj. “Nonuniform growth and surface friction determine bacterial biofilm morphology on soft substrates”. In: Proceedings of the National Academy of Sciences 117.14 (2020), pp. 7622–7632.

      [A9] Arja Ray, Oscar Lee, Zaw Win, Rachel M Edwards, Patrick W Alford, Deok-Ho Kim, and Paolo P Provenzano. “Anisotropic forces from spatially constrained focal adhesions mediate contact guidance directed cell migration”. In: Nature communications 8.1 (2017), p. 14923.

      [A10] Jing Chen and Beiyan Nan. “Flagellar motor transformed: biophysical perspectives of the Myxococcus xanthus gliding mechanism”. In: Frontiers in Microbiology 13 (2022), p. 891694.

      [A11] Samia Dhahri, Michel Ramonda, and Christian Marliere. “In-situ determination of the mechanical properties of gliding or non-motile bacteria by atomic force microscopy under physiological conditions without immobilization”. In: PLoS One 8.4 (2013), e61663.

    1. Author response:

      eLife assessment

      This study provides valuable evidence indicating that Syngap1 regulates the synaptic drive and membrane excitability of parvalbumin- and somatostatin-positive interneurons in the auditory cortex. Since haplo-insufficiency of Syngap1 has been linked to intellectual disabilities without a well-defined underlying cause, the central question of this study is timely. However, the support for the authors' conclusions is incomplete in general and some parts of the experimental evidence are inadequate. Specifically, the manuscript requires further work to properly evaluate the impact on synaptic currents, intrinsic excitability parameters, and morphological features.

      We are happy that the editors found that our study provides valuable evidence and that the central question is timely. We thank the reviewers for their detailed comments and suggestions. Below, we provide a point-by-point answer (in blue) to the specific comments and indicate the changes to the manuscript and the additional experiments we plan to perform to answer these comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study is designed to assess the role of Syngap1 in regulating the physiology of the MGE-derived PV+ and SST+ interneurons. Syngap1 is associated with some mental health disorders, and PV+ and SST+ cells are the focus of many previous and likely future reports from studies of interneuron biology, highlighting the translational and basic neuroscience relevance of the authors' work.

      Strengths of the study are using well-established electrophysiology methods and the highly controlled conditions of ex vivo brain slice experiments combined with a novel intersectional mouse line, to assess the role of Syngap1 in regulating PV+ and SST+ cell properties. The findings revealed that in the mature auditory cortex, Syngap1 haploinsufficiency decreases both the intrinsic excitability and the excitatory synaptic drive onto PV+ neurons from Layer 4. In contrast, SST+ interneurons were mostly unaffected by Syngap1 haploinsufficiency. Pharmacologically manipulating the activity of voltage-gated potassium channels of the Kv1 family suggested that these channels contributed to the decreased PV+ neuron excitability by Syngap insufficiency. These results therefore suggest that normal Syngap1 expression levels are necessary to produce normal PV+ cell intrinsic properties and excitatory synaptic drive, albeit, perhaps surprisingly, inhibitory synaptic transmission was not affected by Syngap1 haploinsufficiency.

      Since the electrophysiology experiments were performed in the adult auditory cortex, while Syngap1 expression was potentially affected since embryonic stages in the MGE, future studies should address two important points that were not tackled in the present study. First, what is the developmental time window in which Syngap1 insufficiency disrupted PV+ neuron properties? Albeit the embryonic Syngap1 deletion most likely affected PV+ neuron maturation, the properties of Syngap-insufficient PV+ neurons do not resemble those of immature PV+ neurons. Second, whereas the observation that Syngap1 haploinsufficiency affected PV+ neurons in auditory cortex layer 4 suggests auditory processing alterations, MGE-derived PV+ neurons populate every cortical area. Therefore, without information on whether Syngap1 expression levels are cortical area-specific, the data in this study would predict that by regulating PV+ neuron electrophysiology, Syngap1 normally controls circuit function in a wide range of cortical areas, and therefore a range of sensory, motor and cognitive functions. These are relatively minor weaknesses regarding interpretation of the data in the present study that the authors could discuss.

      We agree with the reviewer on the proposed open questions, which we will certainly discuss in the revised manuscript we are preparing. We do have experimental evidence suggesting that Syngap1 mRNA is expressed by PV+ and SST+ neurons in different cortical areas, during early postnatal development and in adulthood; therefore, we agree that it will be important, in future experiments, to tackle the question of when the observed phenotypes arise.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors investigated how partial loss of SynGap1 affects inhibitory neurons derived from the MGE in the auditory cortex, focusing on their synaptic inputs and excitability. While haploinsufficiency of SynGap1 is known to lead to intellectual disabilities, the underlying mechanisms remain unclear.

      Strengths:

      The questions are novel.

      Weaknesses:

      Despite the interesting and novel questions, there are significant concerns regarding the experimental design and data quality, as well as potential misinterpretations of key findings. Consequently, the current manuscript fails to contribute substantially to our understanding of SynGap1 loss mechanisms and may even provoke unnecessary controversies.

      Major issues:

      (1) One major concern is the inconsistency and confusion in the intermediate conclusions drawn from the results. For instance, while the sEPSC data indicates decreased amplitude in PV+ and SOM+ cells in cHet animals, the frequency of events remains unchanged. In contrast, the mEPSC data shows no change in amplitudes in PV+ cells, but a significant decrease in event frequency. The authors conclude that the former observation implies decreased excitability. However, traditionally, such observations on mEPSC parameters are considered indicative of presynaptic mechanisms rather than changes of network activity. The subsequent synapse counting experiments align more closely with the traditional conclusions. This issue can be resolved by rephrasing the text. However, it would remain unexplained why the sEPSC frequency shows no significant difference. If the majority of sEPSC events were indeed mediated by spiking (which is blocked by TTX), the average amplitudes and frequency of mEPSCs should be substantially lower than those of sEPSCs. Yet, they fall within a very similar range, suggesting that most sEPSCs may actually be independent of action potentials. But if that was indeed the case, the changes of purported sEPSC and mEPSC results should have been similar.

      We understand the reviewer’s perspective; indeed, we asked ourselves the very same question regarding why the sEPSC and mEPSC frequencies fall within a similar range when we analysed neuron means (bar graphs). We have already recorded sEPSCs followed by mEPSCs from several PV neurons (control and cHet) and are in the process of analyzing the data. We will add these data to the revised version of the manuscript. We will also rephrase the manuscript to present multiple potential interpretations of the data.

      We hope that we have correctly interpreted the reviewer's concern. However, if the question is why sEPSC amplitude but not frequency is affected in cHet vs. control mice, then the reviewer’s comment is perhaps based on the assumption that the amplitude and frequency of miniature events should be lower for all events compared to those observed for spontaneous events. However, it is essential to note that changes in the mean amplitude of sEPSCs are primarily driven by alterations in large sEPSCs (>9-10 pA, as shown in the cumulative probability plot in Fig. 1b, right), with smaller ones being relatively unaffected. Consequently, a reduction in sEPSC amplitude may not necessarily result in a significant decrease in frequency, since the reduced amplitudes likely remain above the detection threshold of 3 pA. This could explain the lack of a significant change in the average inter-event interval of sEPSCs (as depicted in Fig. 1b, left).
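
      The detection-threshold argument can be illustrated with a toy calculation. The sketch below uses made-up amplitudes, a hypothetical 30% reduction and a hypothetical 10 s recording; only the 3 pA threshold and the roughly 9 pA boundary for "large" events come from the response. It shows that shrinking only the large events lowers the mean amplitude while leaving the number of detected events, and hence the frequency, unchanged.

```python
# Illustrative numbers only: amplitudes (pA) are made up, not taken from the manuscript.
DETECTION_THRESHOLD_PA = 3.0   # detection threshold quoted in the response
LARGE_EVENT_PA = 9.0           # lower bound of "large" events (>9-10 pA in the response)


def detected(amplitudes, threshold=DETECTION_THRESHOLD_PA):
    """Return the events whose amplitude is at or above the detection threshold."""
    return [a for a in amplitudes if a >= threshold]


def shrink_large_events(amplitudes, factor, cutoff=LARGE_EVENT_PA):
    """Scale only events above `cutoff` by `factor`; smaller events are untouched."""
    return [a * factor if a > cutoff else a for a in amplitudes]


def summarize(amplitudes, duration_s):
    """Return (event frequency in Hz, mean amplitude in pA) of the detected events."""
    events = detected(amplitudes)
    return len(events) / duration_s, sum(events) / len(events)


control = [4.0, 5.0, 6.0, 7.0, 8.0, 10.0, 12.0, 15.0, 20.0, 30.0]
reduced = shrink_large_events(control, factor=0.7)  # hypothetical 30% reduction

freq_ctrl, amp_ctrl = summarize(control, duration_s=10.0)
freq_red, amp_red = summarize(reduced, duration_s=10.0)
print(f"control: {freq_ctrl:.2f} Hz, mean {amp_ctrl:.2f} pA")
print(f"reduced: {freq_red:.2f} Hz, mean {amp_red:.2f} pA")
```

      The frequency would only drop if the reduction pushed some events below the threshold, which does not happen here because the affected events start well above it.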

      If the question is whether we should see the same parameters affected by the genetic manipulation in both sEPSC and mEPSC, then another critical consideration is the involvement of the releasable pool in mEPSCs versus sEPSCs. Current knowledge suggests that activity-dependent and -independent release may not necessarily engage the same pool of vesicles or target the same postsynaptic sites. This concept has been extensively explored (reviewed in Kavalali, 2015). Consequently, while we may have traditionally interpreted activity-dependent and -independent data assuming they utilize the same pool, this is no longer accurate. The current discussion in the field revolves around understanding the mechanisms underlying such phenomena. Therefore, comparisons between sEPSCs and mEPSCs may not yield conclusive data but rather speculative interpretations. For a rigorous analysis, particularly in this context involving thousands of events, it is essential to assess these data sets (mEPSCs vs sEPSCs) separately and provide cumulative probability curves. This approach allows for a more comprehensive understanding of the underlying distributions and helps to elucidate any potential differences between the two types of events. We will rephrase the text, and as mentioned above, add additional data, to better reflect these considerations.

      (2) Another significant concern is the quality of synapse counting experiments. The authors attempted to colocalize pre- and postsynaptic markers Vglut1 and PSD95 with PV labelling. However, several issues arise. Firstly, the PV labelling seems confined to soma regions, with no visible dendrites. Given that the perisomatic region only receives a minor fraction of excitatory synapses, this labeling might not accurately represent the input coverage of PV cells. Secondly, the resolution of the images is insufficient to support clear colocalization of the synaptic markers. Thirdly, the staining patterns are peculiar, with PSD95 puncta appearing within regions clearly identified as somas by Vglut1, hinting at possible intracellular signals. Furthermore, PSD95 seems to delineate potential apical dendrites of pyramidal cells passing through the region, yet Vglut1+ partners are absent in these segments, which are expected to be the marker of these synapses here. Additionally, the cumulative density of Vglut2 and Vglut1 puncta exceeds expectations, and it's surprising that subcortical fibers labeled by Vglut2 are comparable in number to intracortical Vglut1+ axon terminals. Ideally, N(Vglut1)+N(Vglut2) should be equal or less than N(PSD95), but this is not the case here. Consequently, these results cannot be considered reliable due to these issues.

      We apologize, as it appears that the images we provided have caused confusion. The selected images represent a single focal plane of a confocal stack, which was visually centered on the PV cell somata. We chose just one confocal plane because we thought it showed more clearly the apposition of presynaptic and postsynaptic immunolabeling around the somata. In the revised version of the manuscript, we will provide higher magnification images, which will clearly show how we identified and selected the region of interest for the quantification of colocalized synaptic markers. In our confocal stacks, we can also identify PV immunolabeled dendrites and colocalized vGlut1/PSD95 or vGlut2/PSD95 puncta on them; but these do not appear in the selected images because, as explained, only one focal plane, centered on the PV cell somata, was shown.

      We acknowledge the reviewer's point that in PV+ cells the majority of excitatory inputs are formed onto dendrites; however, we focused on the somatic excitatory inputs to PV cells, because despite their lower number, they produce much stronger depolarization in PV neurons than dendritic excitatory inputs (Hu et al., 2010; Norenberg et al., 2010). Further, quantification of perisomatic putative excitatory synapses is more reliable since, by using PV immunostaining, we can visualize the soma and larger primary dendrites, but smaller, higher-order dendrites are not always detectable. Of note, PV-positive somata receive more excitatory synapses than SST-positive and pyramidal neuron somata, as found by electron microscopy studies in the visual cortex (Hwang et al., 2021; Elabbady et al., 2024).

      Regarding the comment on the density of vGlut1 and vGlut2 puncta, the numbers appear high and similar between the two markers because we present normalized data (cHet normalized to their control values for each set of immunolabelling) to clearly represent the differences between genotypes. This information is present in the legends, but we apologize for not clearly explaining it in the Methods section. We will provide a more detailed explanation of our methods in the revised manuscript.

      Briefly, immunostained sections were imaged using a Leica SP8-STED confocal microscope with a 63x objective (NA 1.4) at 1024 x 1024 pixels, z-step = 0.3 μm, stack size of ~15 μm. Images were acquired from the auditory cortex from at least 3 coronal sections per animal. All confocal parameters were kept constant throughout the acquisition of an experiment. All images shown in the figures are from a single confocal plane. To quantify the number of vGlut1/PSD95 or vGlut2/PSD95 putative synapses, images were exported as TIFF files and analyzed using Fiji (ImageJ) software. We first manually outlined the profile of each PV cell soma (identified by PV immunolabeling). At least 4 innervated somata were selected in each confocal stack. We then used a series of custom-made macros in Fiji as previously described (Chehrazi et al., 2023). After applying background subtraction (rolling value = 10) and Gaussian blur (σ value = 2) filters, the stacks were binarized and vGlut1/PSD95 or vGlut2/PSD95 puncta were independently identified around the perimeter of a targeted soma in the focal plane with the highest soma circumference. Puncta were quantified after filtering particles for size (between 0 and 2 μm²) and circularity (between 0 and 1). Data quantification was done by investigators blind to the genotype, and data are presented normalized to control values for each experiment.
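
      The last two steps of this procedure, particle filtering and normalization to control, can be sketched as follows. This is a minimal illustration and not the custom Fiji macros referred to above: the function names, the per-particle measurements and the puncta counts are hypothetical, and circularity is computed with the usual 4π·area/perimeter² definition.

```python
import math

# Hypothetical per-particle measurements: (area in um^2, perimeter in um).
# All names and numbers are illustrative; the Fiji macros are not reproduced here.


def circularity(area, perimeter):
    """Circularity 4*pi*area / perimeter^2, capped at 1 (1 = perfect circle)."""
    if perimeter <= 0:
        return 0.0
    return min(1.0, 4.0 * math.pi * area / perimeter ** 2)


def filter_puncta(particles, max_area=2.0, min_circ=0.0, max_circ=1.0):
    """Keep particles with 0 < area <= max_area and circularity within the given range."""
    kept = []
    for area, perimeter in particles:
        c = circularity(area, perimeter)
        if 0.0 < area <= max_area and min_circ <= c <= max_circ:
            kept.append((area, perimeter))
    return kept


def normalize_to_control(values, control_values):
    """Express each value as a fraction of the mean of the matched control values."""
    control_mean = sum(control_values) / len(control_values)
    return [v / control_mean for v in values]


particles = [(0.5, 2.6), (1.8, 5.0), (3.1, 7.0)]  # the last one exceeds the size limit
kept = filter_puncta(particles)
print(f"{len(kept)} of {len(particles)} particles kept")

# Hypothetical puncta counts per soma for one experiment
control_counts = [10, 12, 14]
chet_counts = [9, 6, 12]
normalized = normalize_to_control(chet_counts, control_counts)
print(normalized)
```

      Because every experiment is divided by its own control mean, control groups for different markers all centre on 1, which is why normalized vGlut1 and vGlut2 values look similar in magnitude regardless of the underlying absolute densities.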

      (3) One observation from the minimal stimulation experiment was concluded by an unsupported statement. Namely, the change in the onset delay cannot be attributed to a deficit in the recruitment of PV+ cells, but it may suggest a change in the excitability of TC axons.

      We agree with the reviewer; please see our answer to the next point.

      (4) The conclusions drawn from the stimulation experiments are also disconnected from the actual data. To make conclusions about TC release, the authors should have tested release probability using established methods, such as paired-pulse changes. Instead, the only observation here is a change in the AMPA components, which remained unexplained.

      We agree with the reviewer and we will perform additional paired-pulse ratio experiments at different intervals. We will rephrase the discussion and our interpretation and potential hypothesis according to the data obtained from this new experiment.

      (5) The sampling rate of CC recordings is insufficient to resolve the temporal properties of the APs. Therefore, the phase-plots cannot be interpreted (e.g. axonal and somatic AP components are not clearly separated), raising questions about how AP threshold and peak were measured. The low sampling rate also masks the real derivative of the AP signals, making them apparently faster.

      We acknowledge that a higher sampling rate could offer a more detailed analysis of the action potential waveform. However, in the context of action potential analysis, it is acceptable to use sampling rates ranging from 10 kHz to 20 kHz (Golomb et al., 2007; Stevens et al., 2021; Zhang et al., 2023), which are considered adequate in the context of the present study. Indeed, our study aims to evaluate "relative" differences in the electrophysiological phenotype when comparing groups following a specific genetic manipulation. A sampling rate of 10 kHz is commonly employed in similar studies, including those conducted by our collaborator and co-author S. Kourrich (e.g., Kourrich and Thomas 2009, Kourrich et al., 2013), as well as others (Russo et al., 2013; Ünal et al., 2020; Chamberland et al., 2023).

      Despite being acquired at a lower sampling rate than potentially preferred by the reviewer, our data clearly demonstrate significant differences between the experimental groups, especially for parameters that are negligibly or not affected by the sampling rate used here (e.g., #spikes/input, RMP, Rin, Cm, Tm, AP amplitude, AP latency, AP rheobase).

      Regarding the phase-plots, we agree that a higher sampling rate would have resulted in smoother curves and more accurate absolute values. However, the differences were sufficiently pronounced to discern the relative variations in action potential waveforms between the experimental groups.
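
      How the sampling rate enters a phase plot can be illustrated with a toy example. The sketch below computes dV/dt by central differences for a synthetic, Gaussian-shaped "spike"; the waveform, its width and the three sampling rates are assumptions chosen for illustration and do not represent recorded data or the analysis used in the manuscript.

```python
import math


def phase_plot(voltage_mv, sampling_rate_hz):
    """Return (V, dV/dt) pairs with dV/dt in mV/ms, using central differences."""
    dt_ms = 1000.0 / sampling_rate_hz
    points = []
    for i in range(1, len(voltage_mv) - 1):
        dvdt = (voltage_mv[i + 1] - voltage_mv[i - 1]) / (2.0 * dt_ms)
        points.append((voltage_mv[i], dvdt))
    return points


def synthetic_spike(sampling_rate_hz, duration_ms=4.0, width_ms=0.2):
    """Toy Gaussian-shaped 'spike' (100 mV above a -70 mV baseline); not real data."""
    dt_ms = 1000.0 / sampling_rate_hz
    n = int(round(duration_ms / dt_ms)) + 1
    return [
        -70.0 + 100.0 * math.exp(-((i * dt_ms - 2.0) / width_ms) ** 2)
        for i in range(n)
    ]


peaks = {}
for rate in (10_000, 20_000, 50_000):
    pts = phase_plot(synthetic_spike(rate), rate)
    peaks[rate] = max(dvdt for _, dvdt in pts)
    print(f"{rate} Hz: {len(pts)} points, max dV/dt = {peaks[rate]:.0f} mV/ms")
```

      For this toy waveform the estimated maximum dV/dt depends on the sampling interval, so absolute values from phase plots are sensitive to the sampling rate, whereas a comparison between groups acquired with identical settings is affected equally on both sides.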

      A related issue is that the Methods section lacks essential details about the recording conditions, such as bridge balance and capacitance neutralization.

      We did indeed perform bridge balance and capacitance neutralization before starting every recording. We will add this information to the Methods.

      (6) Interpretation issue: One of the most fundamental measures of cellular excitability, the rheobase, was differentially affected by cHet in BCshort and BCbroad. Yet, the authors concluded that the cHet-induced changes in the two subpopulations are common.

      We are uncertain if we have correctly interpreted the reviewer's comment. While we observed distinct impacts on the rheobase (Fig. 7d and 7i), there seems to be a common effect on the AP threshold (Fig. 7c and 7h), as interpreted and indicated in the final sentence of the results section for Figure 7 (page 12). If our response does not address the reviewer's comment adequately, we would greatly appreciate it if the reviewer could rephrase their feedback.

      (7) Design issue:

      The Kv1 blockade experiments are disconnected from the main manuscript. There is no experiment that shows the causal relationship between changes in DTX and cHet cells. It is only an interesting observation on AP halfwidth and threshold. However, how they affect rheobase, EPSCs, and other topics of the manuscript are not addressed in DTX experiments.

      Furthermore, Kv1 currents were never measured in this work, nor was the channel density tested. Thus, the DTX effects are not necessarily related to changes in PV cells, which can potentially generate controversies.

      While we acknowledge the reviewer's point that Kv1 currents and density weren't specifically tested, an important insight provided by Fig. 5 is the prolonged action potential latency. This delay is significantly influenced by slowly inactivating subthreshold potassium currents, namely the D-type K+ current. It's worth noting that D-type current is primarily mediated by members of the Kv1 family. The literature supports a role for Kv1.1-containing channels in modulating responses to near-threshold stimuli in PV cells (Wang et al., 1994; Goldberg et al., 2008; Zurita et al., 2018). However, we recognize that besides the Kv1 family, other families may also contribute to the observed changes.

      To address this concern, we will revise our interpretation. We will opt for the more accurate term "D-type K+ current" and only speculate about the involved channel family in the discussion. It is not our intention to open unnecessary controversy, but to present the data we obtained. We believe this approach and rephrasing the discussion as proposed will prevent unnecessary controversy and instead foster fruitful discussions.

      (8) Writing issues:

      Abstract:

      The auditory system is not mentioned in the abstract.

      One statement in the abstract is unclear. What is meant by "targeting Kv1 family of voltage-gated potassium channels was sufficient..."? "Targeting" could refer to altered subcellular targeting of the channels, simple overexpression/deletion in the target cell population, or targeted mutation of the channel, etc. Only the final part of the Results revealed that it was none of the above; rather, these channels were blocked selectively.

      We agree with the reviewer and we will rephrase the abstract accordingly.

      Introduction:

      There is a contradiction in the introduction. The second paragraph describes in detail the distinct contribution of PV and SST neurons to auditory processing. But at the end, the authors state that "relatively few reports on PV+ and SST+ cell-intrinsic and synaptic properties in adult auditory cortex". Please be more specific about the unknown properties.

      We agree with the reviewer and we will rephrase more specifically.

      (9) The introduction emphasizes the heterogeneity of PV neurons, which certainly influences the interpretation of the results of the current manuscript. However, the initial experiments did not consider this and handled all PV cell data as a pooled population.

      In the initial experiments, we handled all PV cell data together because we wanted to be rigorous and avoid introducing assumptions or biases about the different PV cell subtypes, which in later experiments we distinguished based on intrinsic properties alone. We will make this point clear in the revised manuscript.

      (10) The interpretation of the results strongly depends on unpublished work, which potentially provides the physiological and behavioral contexts about the role of GABAergic neurons in SynGap-haploinsufficiency. The authors cite their own unpublished work, without explaining the specific findings and relation to this manuscript.

      We agree with the reviewer and apologize for the lack of clarity. Our unpublished work is in revision right now. We will provide more information and update references in the revised version of this manuscript.

      (11) The introduction of Sholl analysis experiments mentions SOM staining; however, there are no such data about this cell type in the manuscript.

      We apologize for the error; we will replace SOM with SST (SOM and SST are two commonly used acronyms for somatostatin-expressing interneurons).

      Reviewer #3 (Public Review):

      This paper compares the synaptic and membrane properties of two main subtypes of interneurons (PV+, SST+) in the auditory cortex of control mice vs mutants with Syngap1 haploinsufficiency. The authors find differences at both levels, although predominantly in PV+ cells. These results suggest that altered PV-interneuron functions in the auditory cortex may contribute to the network dysfunction observed in Syngap1 haploinsufficiency-related intellectual disability. The subject of the work is interesting, and most of the approach is direct and quantitative, which are major strengths. There are also some weaknesses that reduce its impact for a broader field.

      (1) The choice of mice with conditional (rather than global) haploinsufficiency makes the link between the findings and Syngap1 relatively easy to interpret, which is a strength. However, it also remains unclear whether an entire network with the same mutation at a global level (affecting also excitatory neurons) would react similarly.

      The reviewer raises an interesting and pertinent open question which we will address in the discussion of the revised paper.

      (2) There are some (apparent?) inconsistencies between the text and the figures. Although the authors appear to have used a sophisticated statistical analysis, some datasets in the illustrations do not seem to match the statistical results. For example, neither Fig 1g nor Fig 3f (eNMDA) reach significance despite large differences.

      We respectfully disagree; we do not think the text and figures are inconsistent. In the cited examples, a large apparent difference in mean values does not reach significance due to the large variability in the data; further, we did not exclude any data points, because we wanted to be rigorous. In particular, for Fig. 1g, statistical analysis shows a significant increase in the inter-mEPSC interval (*p=0.027, LMM) when all events are considered (cumulative probability plots), while there is no significant difference in the inter-mEPSC interval for the inter-cell mean comparison (inset, p=0.354, LMM). The inter-cell mean comparison does not show a difference with the Mann-Whitney test either (p=0.101; the data are not normally distributed, hence the choice of the Mann-Whitney test). For Fig. 3f (eNMDA), the higher mean value for the cHet versus the control is driven by two data points which are particularly high, while the other data points overlap with the control values. The Mann-Whitney test also shows no statistical difference (p=0.174).

      In the manuscript, discussion of the data is based on the results of the LMM analysis, which takes into account both the number of cells and the number of mice from which these cells were recorded. We chose this statistical approach because it does not rely on the assumption that cells recorded from the same mouse are independent observations. In the supplemental tables, we provided the results of the statistical analysis done with both LMM and the most commonly used Mann-Whitney test (for data that are not normally distributed) or t-test (for normally distributed data), for each data set.
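
      The rationale for an LMM can be made explicit with a generic random-intercept model. This is only a sketch of a standard formulation; the exact model specification used in the manuscript is not given in this response, and all symbols below are introduced for illustration.

```latex
% y_{ij}: measurement from cell j recorded in mouse i
% g_i:    genotype indicator of mouse i (0 = control, 1 = cHet)
% u_i:    random intercept shared by all cells of mouse i
y_{ij} = \beta_0 + \beta_1 g_i + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)

% Cells from the same mouse share u_i and are therefore correlated:
\operatorname{Corr}(y_{ij}, y_{ik})
  = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2},
\qquad j \neq k
```

      In such a model the genotype effect is tested through the fixed-effect coefficient for genotype, while the between-mouse variance absorbs the similarity of cells recorded from the same animal; a Mann-Whitney test or t-test on cells treats all cells as independent.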

      Also, the legend to Fig 9 indicates the presence of "a significant decrease in AP half-width from cHet in absence or presence of a-DTX", but the bar graph does not seem to show that.

      We apologize for our lack of clarity. In the legend of Fig. 9, we reported the statistical comparisons between 1) cHet mice in the absence of a-DTX and control mice and 2) cHet mice in the presence of a-DTX and control mice. We will rephrase the description of the results and the figure legend to avoid confusion.

      (3) The authors mention that the lack of differences in synaptic current kinetics is evidence against a change in subunit composition. However, in some Figures, for example, 3a, the kinetics of the recorded currents appear dramatically different. It would be important to know and compare the values of the series resistance between control and mutant animals.

      We agree with the reviewer that there appears to be a qualitative difference in eNMDA decay between conditions, although quantified eNMDA decay itself is similar between groups. We used a cutoff of 15% for the series resistance (Rs), which is considerably more stringent than the cutoffs typically used in electrophysiology, the vast majority of which lie between 20 and 30%. To address this concern, we re-examined Rs and compared it between groups; we found no difference in Rs for eAMPA (13.2±0.5 in WT n=16 cells, 7 mice vs 13.7±0.3 in cHet n=14 cells, 7 mice, p=0.432, LMM) or eNMDA (12.7±0.7 in WT n=6 cells, 3 mice vs 13.8±0.7 in cHet n=6 cells, 5 mice, p=0.231, LMM). Thus, the apparent qualitative difference in eNMDA decay stems from inter-cell variability rather than inter-group differences. In particular, the discrepancy between the trace (Fig. 3a) and the data (Fig. 3f, right) arises because the higher but non-significant decay rate is driven by a couple of very high values (Fig. 3f, right). In the revised manuscript, we will show traces that better represent our findings.

      (4) A significant unexplained variability is present in several datasets. For example, the AP threshold for PV+ includes points between -50-40 mV, but also values at around -20/-15 mV, which seems too depolarized to generate healthy APs (Fig 5c, Fig7c).

      We acknowledge the variability in AP threshold data, with some APs appearing too depolarized to generate healthy spikes. However, we meticulously examined each AP that spiked at these depolarized thresholds and found that other intrinsic properties (such as Rin, Vrest, AP overshoot, etc.) all indicate that these cells are healthy. Therefore, to maintain objectivity and provide unbiased data to the community, we opted to include them in our analysis. It's worth noting that similar variability has been observed in other studies (Bengtsson Gonzales et al., 2020; Bertero et al., 2020).

      Further, we conducted a significance test on AP threshold excluding these potentially unhealthy cells and found that the significant differences persist. After removing two outliers from the cHet group with values of -16.5 and 20.6 mV, we obtain: -42.6±1.01 mV in control, n=33, 15 mice vs -36.2±1.1 mV in cHet, n=38 cells, 17 mice, ***p<0.001, LMM. Thus, whether these cells are included or excluded, our interpretations and conclusions remain unchanged.
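
      The kind of sensitivity check described here (summary statistics with and without the most depolarized cells) can be sketched as follows. This is only an illustration: the threshold values and the -25 mV exclusion cutoff are hypothetical, and the sketch reports mean ± SEM rather than repeating the LMM.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


def drop_above(values, cutoff_mv):
    """Keep only values at or below the cutoff (i.e. drop depolarized outliers)."""
    return [v for v in values if v <= cutoff_mv]


# Hypothetical AP thresholds in mV; the last two mimic depolarized outliers.
thresholds_mv = [-41.0, -38.5, -36.0, -35.2, -37.8, -16.5, -20.6]

for label, data in (
    ("all cells", thresholds_mv),
    ("outliers removed", drop_above(thresholds_mv, cutoff_mv=-25.0)),
):
    mean, sem = mean_sem(data)
    print(f"{label}: {mean:.1f} ± {sem:.1f} mV (n={len(data)})")
```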

      We would like to clarify that these data have not been corrected for the junction potential. We will add this information in the revised version.

      (5) I am unclear as to how the authors quantified colocalization between VGluts and PSD95 at the low magnification shown in Supplementary Figure 2.

      We apologize for our lack of clarity. Although the analysis was done at high resolution, the figures were focused on showing multiple PV somata receiving excitatory inputs. We will add higher magnification figures and more detailed information in the methods of the revised version. Please also see our response to reviewer #2.

      (6) The authors claim that "cHet SST+ cells showed no significant changes in active and passive membrane properties", but this claim would seem to be directly refused by the data of Fig 8f. In the absence of changes in either active or passive membrane properties shouldn't the current/#AP plot remain unchanged?

      While we acknowledge the theoretical expectation that changes in intrinsic parameters should correlate with alterations in neuronal firing, the absence of differences in the parameters analyzed in this study should not overshadow the clear and significant decrease in firing rate observed in cHet SST+ cells. This decrease serves as a compelling indication of reduced intrinsic neuronal excitability. It's certainly possible that other intrinsic factors, not assessed in this study, may have contributed to this effect. However, exploring these mechanisms is beyond the scope of our current investigation. We will rephrase the discussion and add this limitation of our study in the revised version.

      (7) The plots used for the determination of AP threshold (Figs 5c, 7c, and 7h) suggest that the frequency of acquisition of current-clamp signals may not have been sufficient, this value is not included in the Methods section.

      This study utilized a sampling rate of 10 kHz, which is a standard rate for action potential analysis in the present context. We will describe more extensively the technical details in the method section of the revised manuscript we are preparing. While we acknowledge that a higher sampling rate could have enhanced the clarity of the phase plot, our recording conditions, as detailed in our response to Rev#2/comment#5, were suitable for the objectives of this study.
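
      For readers unfamiliar with phase plots, the sketch below shows how a 10 kHz sampling rate translates into a dV/dt estimate and an AP threshold. It is an illustration under stated assumptions, not the procedure used in the study: the 20 mV/ms slope criterion, the function names and the voltage trace are all hypothetical.

```python
def phase_plot(voltage_mv, sampling_rate_hz=10_000):
    """Return (V, dV/dt) pairs, with dV/dt in mV/ms from forward differences."""
    dt_ms = 1000.0 / sampling_rate_hz  # 0.1 ms per sample at 10 kHz
    return [
        (voltage_mv[i], (voltage_mv[i + 1] - voltage_mv[i]) / dt_ms)
        for i in range(len(voltage_mv) - 1)
    ]


def ap_threshold(voltage_mv, sampling_rate_hz=10_000, slope_criterion=20.0):
    """Voltage at the first sample whose dV/dt reaches the criterion, else None."""
    for v, dvdt in phase_plot(voltage_mv, sampling_rate_hz):
        if dvdt >= slope_criterion:
            return v
    return None


# Hypothetical membrane potential samples (mV) around spike onset.
trace_mv = [-65.0, -64.5, -63.0, -60.0, -40.0, 10.0]
print(ap_threshold(trace_mv))
```

      With 0.1 ms between samples, the fast rising phase of a spike is covered by only a few points, which is why a higher sampling rate would give a smoother phase plot.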

      Reference list

      Bengtsson Gonzales C, Hunt S, Munoz-Manchado AB, McBain CJ, Hjerling-Leffler J (2020) Intrinsic electrophysiological properties predict variability in morphology and connectivity among striatal Parvalbumin-expressing Pthlh-cells. Scientific Reports, 10, 15680. https://doi.org/10.1038/s41598-020-72588-1

      Bertero A, Zurita H, Normandin M, Apicella AJ (2020) Auditory long-range parvalbumin cortico-striatal neurons. Frontiers in Neural Circuits, 14, 45. http://doi.org/10.3389/fncir.2020.00045

      Chamberland S, Nebet ER, Valero M, Hanani M, Egger R, Larsen SB, Eyring KW, Buzsáki G, Tsien RW (2023) Brief synaptic inhibition persistently interrupts firing of fast-spiking interneurons. Neuron, 111, 1264–1281. http://doi.org/10.1016/j.neuron.2023.01.017

      Chehrazi P, Lee KKY, Lavertu-Jolin M, Abbasnejad Z, Carreño-Muñoz MI, Chattopadhyaya B, Di Cristo G (2023) The p75 neurotrophin receptor in preadolescent prefrontal parvalbumin interneurons promotes cognitive flexibility in adult mice. Biological Psychiatry, 94, 310–321. https://doi.org/10.1016/j.biopsych.2023.04.019

      Elabbady L, Seshamani S, Mu S, Mahalingam G, Schneider-Mizell C, Bodor AL, Bae JA, Brittain D, Buchanan J, Bumbarger DJ, Castro MA, Dorkenwald S, Halageri A, Jia Z, Jordan C, Kapner D, Kemnitz N, Kinn S, Lee K, Li K…Collman F (2024) Perisomatic features enable efficient and dataset wide cell-type classifications across large-scale electron microscopy volumes. bioRxiv, https://doi.org/10.1101/2022.07.20.499976

      Goldberg EM, Clark BD, Zagha E, Nahmani M, Erisir A, Rudy B (2008) K+ Channels at the axon initial segment dampen near-threshold excitability of neocortical fast-spiking GABAergic interneurons. Neuron, 58, 387–400. https://doi.org/10.1016/j.neuron.2008.03.003

      Golomb D, Donner K, Shacham L, Shlosberg D, Amitai Y, Hansel D (2007) Mechanisms of firing patterns in fast-spiking cortical interneurons. PLoS Computational Biology, 3, e156. http://doi.org/10.1371/journal.pcbi.0030156

      Hu H, Martina M, Jonas P (2010). Dendritic mechanisms underlying rapid synaptic activation of fast-spiking hippocampal interneurons. Science, 327, 52–58. http://doi.org/10.1126/science.1177876

      Hwang YS, Maclachlan C, Blanc J, Dubois A, Petersen CH, Knott G, Lee SH (2021). 3D ultrastructure of synaptic inputs to distinct gabaergic neurons in the mouse primary visual cortex. Cerebral Cortex, 31, 2610–2624. http://doi.org/10.1093/cercor/bhaa378

      Kavalali E (2015) The mechanisms and functions of spontaneous neurotransmitter release. Nature Reviews Neuroscience, 16, 5–16. https://doi.org/10.1038/nrn3875

      Kourrich S, Thomas MJ (2009) Similar neurons, opposite adaptations: psychostimulant experience differentially alters firing properties in accumbens core versus shell. Journal of Neuroscience, 29, 12275–12283. http://doi.org/10.1523/JNEUROSCI.3028-09.2009

      Kourrich S, Hayashi T, Chuang JY, Tsai SY, Su TP, Bonci A (2013) Dynamic interaction between sigma-1 receptor and Kv1.2 shapes neuronal and behavioral responses to cocaine. Cell, 152, 236–247. http://doi.org/10.1016/j.cell.2012.12.004

      Norenberg A, Hu H, Vida I, Bartos M, Jonas P (2010) Distinct nonuniform cable properties optimize rapid and efficient activation of fast-spiking GABAergic interneurons. Proceedings of the National Academy of Sciences, 107, 894–9. http://doi.org/10.1073/pnas.0910716107

      Stevens SR, Longley CM, Ogawa Y, Teliska LH, Arumanayagam AS, Nair S, Oses-Prieto JA, Burlingame AL, Cykowski MD, Xue M, Rasband MN (2021) Ankyrin-R regulates fast-spiking interneuron excitability through perineuronal nets and Kv3.1b K+ channels. Elife, 10, e66491. http://doi.org/10.7554/eLife.66491

      Russo G, Nieus TR, Maggi S, Taverna S (2013) Dynamics of action potential firing in electrically connected striatal fast-spiking interneurons. Frontiers in Cellular Neuroscience, 7, 209. https://doi.org/10.3389/fncel.2013.00209

      Ünal CT, Ünal B, Bolton MM (2020) Low-threshold spiking interneurons perform feedback inhibition in the lateral amygdala. Brain Structure and Function, 225, 909–923. http://doi.org/10.1007/s00429-020-02051-4

      Wang H, Kunkel DD, Schwartzkroin PA, Tempel BL (1994) Localization of Kv1.1 and Kv1.2, two K channel proteins, to synaptic terminals, somata, and dendrites in the mouse brain. The Journal of Neuroscience, 14, 4588-4599. https://doi.org/10.1523/JNEUROSCI.14-08-04588.1994

      Zhang YZ, Sapantzi S, Lin A, Doelfel SR, Connors BW, Theyel BB (2023) Activity-dependent ectopic action potentials in regular-spiking neurons of the neocortex. Frontiers in Cellular Neuroscience, 17. https://doi.org/10.3389/fncel.2023.1267687

      Zurita H, Feyen PLC, Apicella AJ (2018) Layer 5 callosal parvalbumin-expressing neurons: a distinct functional group of GABAergic neurons. Frontiers in Cellular Neuroscience, 12, 53. https://doi.org/10.3389/fncel.2018.00053

    1. For example, the person in charge of the donations seemed to be overwhelmed and could not always answer questions we may have regarding delivery. She always sounded frustrated whenever she was asked questions because she indicated no one informed her of what and where things were going. I think if they provided her with more information, she would feel more comfortable answering questions as well as feeling more motivated to seek out the answers.

      insightful observation

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      General response of the authors to the editor and the reviewers:

      We thank the reviewers for their feedback, input and questions as these have helped us to (hopefully) improve the manuscript. We have rewritten several sections of the manuscript, moved methodological descriptions from the Results to the Methods section, and added imaging data for two cytoskeletal proteins, Shot and Cofilin/Twinstar, which confirm the predicted differential DV expression. Because the changes to the text were extensive, we did not mark them by track changes (the manuscript would have been illegible), but would be happy to provide an additional version that includes the tracked changes.

      We provide below the point-by-point response to each question and comment made by the reviewers. Our text is in blue.



      __Reviewer #1__

      __Evidence, reproducibility and clarity__

      __Summary__

      This manuscript investigated changes in the proteome and phosphoproteome during dorsoventral axis specification in the Drosophila embryo. To model the three regions in the embryo that are relevant for DV axis development, the authors used specific mutations to enrich for a single type of cells (ventral, lateral, or dorsal). The detected proteins and phosphopeptides were clustered according to the region of expression. There were differences between the protein and corresponding phosphopeptide abundance, suggesting that phosphorylation is a regulatory modification in DV axis establishment. Two different mutations that both result in a ventralized phenotype were found to change marker protein expression in different ways. Using inhibition of microtubule polymerization, this study also investigated the role of microtubules in epithelial folding.

      __Major comments__

      1. Generally, there is a lack of significance testing throughout the manuscript. Simply reporting fold changes can be misleading, if these changes are not significant. Examples:

      2. Rigor of the proteomics evidence showing changes for the expected markers is insufficient because no statistical evaluation is provided. Specifically, in Fig. 1D and Suppl Fig 2: are the fold changes statistically significant?

      3. Data in Fig. 4F, 5F need to be assessed for significance. There are other instances in the manuscript where significance should be tested.

      We did ANOVA testing for all proteome and phosphoproteome data, and the outcome of these analyses is reported in Supplementary Tables 2 and 3. We have added references to significance throughout, wherever possible and relevant and have included a table that summarizes all p values for all comparisons in all of the figures (Supplementary Table 2). However, note that we do our clustering independent of statistical significance, i.e., we include all values, as we explain in the manuscript.

      It is difficult to see the value of the obtained dataset for the community, in part because the data are analyzed by a linear model and cluster assignment developed by the authors, which is a somewhat arbitrary representation. Perhaps the authors could explain how their data could be used by other researchers, and maybe even develop an accessible portal for interacting with the data.

      We do provide the entire set of data in a formatted Excel Table as Supplementary Tables 3 and 4, which contain common pairwise comparisons and ANOVA tests that allow a researcher without a strong proteomics background to explore the data, and we also provide the raw proteomics datasets deposited in PRIDE, so any interested colleague can re-analyse them in the manner that suits their purposes best.

      We analysed the data in the way we did because it takes account of the knowledge from genetics that we have of all these cell populations. This also allowed us to include the important set of proteins and phosphosites that are completely absent from all but one mutant genotype, and would therefore have dropped out of the statistical analyses.

      For example, what does it mean biologically that a protein is a member of a specific cluster shown in Fig. 3C? Is there a predictive value in such an assignment, and how does it relate to the main question of DV axis regulation? An example of a novel insight obtained for specific protein(s) would be useful to illustrate the utility of this analysis.

      The clusters represent groups of proteins that are present at higher or lower abundance in subsets of cell populations. So, for example, being present in cluster 5 (Fig. 3C) means that this protein is predicted to be more abundant in the mesoderm than elsewhere (which includes being detected ONLY in the mesoderm, like Snail). This clustering is therefore our way of finding new proteins that conform to these groups.

      We provide here the immunostainings of two cytoskeleton-associated proteins that our proteomic analyses predicted to be more abundant in the ectoderm (Cluster 6: dorsal+lateral):

      • The actin-microtubule crosslinker Short-stop (Shot), which is seen to be reduced in the mesoderm.
      • The actin-severing protein Cofilin/Twinstar, which was also found downregulated in the mesoderm in the work cited in Ref. 10 (Gong L. et al., Development 2004). The staining shows that cofilin-GFP is abundant in the entire subapical region of ectodermal cells, but strongly reduced in ventral furrow cells, where it is only retained in a few apical membrane blebs. These proteins are targets for functional analyses in follow-up work.

      [Imaging Data for Reviewers]

      Figure: Physical cross-sections of fixed embryos showing the enrichment of proteins in the ectoderm (cluster 6: DL). Dorsal is top, ventral is bottom. Scale bar: 50 µm. Top panel: staining for short-stop (shot; cyan / grayscale) and snail (yellow) in embryos expressing gap43-mCherry. Bottom panel: staining for discs large (dlg, magenta) and GFP (green / grayscale) in embryos expressing cofilin-GFP (Kyoto protein trap for Cofilin/Twinstar).

      Overall, at present the study appears to have limited novelty and mechanistic insight. The data generally align with prior expectations, but it is unclear how this work advances the field.

      We were reassured that the data align with previous studies, but as we state in the text, they go well beyond these valuable and important studies in several dimensions. We had made the following assumptions:

      1. DV patterning mutants recapitulate biological qualities of DV cell populations and the differential expression of DV fate determinants, as confirmed in Fig. 1 and Fig. 3D.
      2. The differential regulation of the proteomes and phosphoproteomes across DV patterning mutants recapitulates the abundances of proteins and phosphosites within DV cell populations of a wildtype embryo. We confirmed this in Fig. 3A and Fig. 5C with the implementation of a linear model for the abundances of detected proteins and phosphosites. The resulting analysis revealed new avenues for future functional studies, as intended. Most of the work on cell shape regulation at the gastrulation stage has focused on actomyosin and a subset of cell adhesion molecules. We have identified networks of proteins and phosphoproteins that may also control gastrulation (Fig. 6 and Supplementary Fig. 5), including microtubules, which were significantly enriched in networks of phosphoproteins (Fig. 7 and Supplementary Fig. 6).

      For example, the observed differences between marker proteins in Toll10B vs. spn27A data seem to confirm previous suggestions that spn27A has a stronger ventralizing effect.

      This suggestion was made by colleagues who had unpublished observations on a limited number of gene expression patterns that supported their contention. A correlation analysis (see figure below) of our results now shows that proteins with a restricted dorso-ventral pattern change more in spn27Aex mutants than in Toll10B. If we look at known mesodermal genes such as Snail, Twist, Mdr49 and CG4500, we find them at higher abundance in spn27Aex than in Toll10B, while the ectodermal genes Egr, Zen, Dtg, Tsg, Bsk, and Ptr are reduced more strongly in spn27Aex than in Toll10B. This takes the prior observation of a stronger ventralization of spn27Aex from an anecdotal to a systematic analysis.

      [Correlation analyses available for reviewers]

      Cross-correlation between the fold changes (FCs) in Toll10B/WT vs. spn27Aex/WT for all proteins detected in wildtype, Toll10B and spn27Aex. Each dot is a protein. The green line is the 'identity' function (slope = 1) that would be expected if the FCs for each protein in both ventralized mutants were exactly the same. A set of proteins with restricted dorso-ventral distribution are highlighted in yellow: mesodermal (ventral) and blue: ectodermal (dorsal).
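
      The comparison described in this caption can be sketched in a few lines. The abundances below are hypothetical and the function names are illustrative; the sketch only shows how log2 fold changes against wildtype are computed, correlated between the two mutants, and checked against the identity line (equal fold change in both mutants).

```python
import math


def log2_fc(mutant, wildtype):
    """log2 fold change of a mutant abundance relative to wildtype."""
    return math.log2(mutant / wildtype)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def stronger_in_second(fc_first, fc_second):
    """Fraction of proteins whose |log2 FC| is larger in the second mutant."""
    hits = sum(abs(b) > abs(a) for a, b in zip(fc_first, fc_second))
    return hits / len(fc_first)


# Hypothetical abundances per protein: (wildtype, Toll10B, spn27Aex).
abundances = {
    "ventral_marker_1": (10.0, 20.0, 35.0),
    "ventral_marker_2": (8.0, 14.0, 30.0),
    "dorsal_marker_1": (12.0, 6.0, 2.0),
    "dorsal_marker_2": (9.0, 5.0, 3.0),
    "uniform_protein": (15.0, 15.5, 14.8),
}
fc_toll = [log2_fc(t, wt) for wt, t, _ in abundances.values()]
fc_spn = [log2_fc(s, wt) for wt, _, s in abundances.values()]
print(f"r = {pearson(fc_toll, fc_spn):.2f}")
print(f"larger |FC| in spn27Aex: {stronger_in_second(fc_toll, fc_spn):.0%}")
```

      Points above the identity line for ventral markers and below it for dorsal markers would correspond to the stronger ventralization of spn27Aex discussed above.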

      The role of microtubules in epithelial folding in the embryo has also been demonstrated before.

      The role of microtubules in epithelial folding in the *Drosophila* embryo has indeed been examined in three previous studies that studied dorsal fold formation (Ref. 35: Takeda et al., NCB 2018), ventral furrow formation (VFF; Ref. 36: Ko et al., JCB 2019), and salivary gland invagination (Booth et al., Dev Cell 2014). These data reveal diverse and non-conservative functional requirements, ranging from acto-myosin contractility during apical constriction (Booth et al. 2014), force transmission and repair of the supracellular contractile network (but not apical constriction per se, Ko et al. 2019), to the generation of expansile forces during cell shape homeostasis (Takeda et al. 2018). In light of this potentially broad functional spectrum, we sought to compare three epithelial folds that form within the context of gastrulation: ventral furrow, cephalic furrow and dorsal folds. We confirmed that the initiation of VFF was normal, but the final invagination failed, as per Ko et al. 2019, while dorsal fold initiation did not occur (extending conclusions from Takeda et al. 2018). In contrast, cephalic furrow formation, though delayed, did not require microtubules. We also revealed a novel commonality of MT function. Specifically, prior to the initiation of all three epithelial folds, proper nuclear positioning requires MTs. We additionally discovered novel membrane abnormalities in two distinct types of blebs during ventral furrow and dorsal fold formation, respectively. Thus, our data provide insights into the roles of microtubules during epithelial folding that go beyond prior work.

      The shown phosphorylation changes (if they are significant) for Toll and Cactus are difficult to explain. In Suppl Fig 2B, E: why is Toll more phosphorylated in the lateralized than in ventralized embryos? (the provided reference 20 does not seem to clarify this).

      These changes are indeed significant (Toll-S871: Vtl vs. WT p = 0.01, Vsp vs. WT p = 0.002; Cactus-S463: Vsp vs. WT p = 0.03; see Supplementary Figure 2B and Supplementary Table 2).
      
      We have corrected Ref. 20 (Shen B. and Manley J.L., Development 1998). Ref. 20 only shows that Tl is phosphorylated by Pelle (Ref. 20: Fig. 6A); neither the exact position of the Tl phosphosite(s) nor the function of Tl phosphorylation was explored in that article. A hallmark of Toll-Like Receptor (TLR) regulation is that these receptors are subject to tyrosine phosphorylation, which has been widely connected to the regulation of the binding of adaptor proteins to the cytoplasmic tail of TLRs. Both our finding of Serine phosphorylation in Tl and its differential phosphorylation across cell populations are new, but since we do not know what this particular Serine phosphorylation site does in TLRs in general, we cannot speculate on the meaning of it occurring more in lateral than in ventral cells. In Ref. 20, the authors speculate that Tl phosphorylation by Pelle regulates the association between Tl and Pelle, which then enables Dorsal translocation to the nucleus. It might also be part of a feedback regulation loop, but this is entirely speculative.
      

      Also, certain Cactus phosphorylations appear higher in dorsalized and ventralized embryos, but not in lateralized embryos. Are such changes expected and do they make sense biologically? It is unclear why these phosphorylation data are used to validate the success of the approach.

         The three Cactus phosphosites S463, S467 and S468 were identified and characterised in the work cited in Ref. 19 (Liu Z.P. et al., Genes and Development, 1997), and we used these sites to validate that our approach was sensitive enough to detect known phosphosites in proteins that act on the dorso-ventral patterning pathway specifically at the point of gastrulation (Stage 6 of embryonic development). We also reported in this manuscript the detection of known phosphosites within the Rho-pathway (Fig. 5E,F, Myosin Light Chain: T21, S22; Cofilin: S3).
      
         Liu Z.P. et al. reported that these three sites map to the Cactus PEST domain, which is required for Cactus degradation in the mesoderm (Belvin M. et al, Genes and Development 1995).  Liu Z.P. et al. also showed that mutating these phosphosites impairs Cactus turnover without affecting the ability of Cactus to bind Dorsal. We can only speculate that the differential phosphorylation across dorso-ventral embryonic cell populations is associated with the regulation of Cactus turnover. Consistent with this, we find Cactus downregulated 1.5 log2 fold in ventralized embryos derived from *spn27Aex/def* mothers. Furthermore, there are a number of signalling pathways that act both in the dorsal and the ventral-lateral domain (e.g., rhomboid/EGF), so it is not surprising to find modifications that are shared by these regions.
      

      The rationale to use a diffusion algorithm for data analysis is not clear. How would the analysis differ if diffusion was not used?

      Phosphoproteomics data are often sparse and noisy for a number of reasons (technical: low abundance of phosphorylated peptides compared to other peptides in the cell; biological: not all phosphosites are functional). Network diffusion is commonly used across data types to boost the signal-to-noise ratio. For example, if from a list of 10 phosphosites, 5 all fall in the same network region or process, and the rest are randomly distributed in the network, chances are that the first region is more representative of the regulated process in that dataset. Using network propagation, the signal coming from the first 5 phosphosites would give a higher score to that network region, marking it as the predominant signal. Our specific implementation, which uses the semantic similarity between nodes to model the edges in the network, further boosts the functional signal by preferentially including nodes that have a higher functional similarity to the initial phosphosites. Our approach therefore allows us to identify the processes that are predominantly ‘active’ in our dataset. We refer the reviewer to our recent preprint for more evidence that this strategy boosts the signal-to-noise ratio in phosphoproteomic datasets and further prioritises more functional phosphosites (https://www.biorxiv.org/content/10.1101/2023.08.07.552249v1). Had this approach not been used, and had we based the identification of relevant processes only on the list of phosphosites, we would have obtained more spurious terms in our functional enrichment analysis. The above preprint also shows that other methods, such as the Prize Collecting Steiner Forest algorithm, perform worse for phosphoproteomics data.
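
      The intuition behind network propagation can be illustrated with a generic random walk with restart on a toy graph. This is a sketch of the general technique only, not our implementation: the graph, the seed set, the restart probability and the function name are hypothetical, and the edge weights here are uniform rather than derived from semantic similarity.

```python
def propagate(adjacency, seeds, restart=0.5, iterations=50):
    """Random walk with restart on an undirected weighted graph.

    adjacency maps each node to a dict of {neighbour: edge weight};
    seeds is the set of nodes carrying the initial signal.
    """
    nodes = list(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    degree = {n: sum(adjacency[n].values()) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbour m passes on its score in proportion to edge weight.
            incoming = sum(
                scores[m] * w / degree[m]
                for m, w in adjacency[n].items()
                if degree[m] > 0
            )
            updated[n] = restart * start[n] + (1 - restart) * incoming
        scores = updated
    return scores


# Toy network: A, B, C form a dense region; D and E hang off it.
graph = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 1.0, "B": 1.0, "D": 1.0},
    "D": {"C": 1.0, "E": 1.0},
    "E": {"D": 1.0},
}
scores = propagate(graph, seeds={"A", "B"})
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 3))
```

      Nodes close to several seeds (here C) accumulate a higher score than distant ones (E), which is how clustered hits come to dominate over isolated ones.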

      Generally, the discussion of enriched GO categories presented in Fig. 6 is not rigorous, and it is unclear what biological insight is provided by this figure, probably because the categories are extremely diverse and not clustered in a meaningful way. Despite stating that the work on microtubules came out as a result of proteomic analysis, there is no connection between proteomic data (e.g., data shown in Fig. 6) and microtubule analysis in Fig. 7.

      The connection is between the __phosphoproteomic__ data and the microtubules. The reviewer is correct that there is little connection with microtubules at the proteomic level. Only the diffused network analyses performed on the phosphoproteomic data pointed in this direction. We have improved the writing about this point.
      

      The Discussion section touches on areas of differential protein degradation and mRNA regulation; however, these data are not presented in Results or Figures and so it is difficult to assess the relevance of this analysis.

           We present these data in Figure 6A,B. The network analyses of the clusters showed significant enrichment of cellular component terms that are connected with protein turnover and mRNA regulation. We have added a reference to figure 6 in the Discussion for clarity.
      

      There is insufficient citation of prior literature throughout the manuscript: many statements are lacking proper references.

      We have corrected the mistakes and added missing references.

      Proteomics data should be deposited into a standard repository that is a member of the ProteomeXchange Consortium, such as PRIDE, etc.

      All proteomics and phosphoproteomics data have been uploaded to PRIDE. The raw files for the two experiments were deposited under separate identifiers:

      Proteome: Identifier PXD046050 (Reviewer account details: reviewer_pxd046050@ebi.ac.uk, pw: coJ9otiX).

      Phosphoproteome: Identifier PXD046192 (Reviewer account details: reviewer_pxd046192@ebi.ac.uk, pw: nvkbwClp).

      We have included a statement of raw data availability in the revised version of the manuscript with the PRIDE access information.

      __Minor comments__

      The text has several typos and should be proof-read, and references to figures and tables should be checked, as some of these are not correct.

      We have corrected typos, references to figures and tables in the revised version of the manuscript.

      The genotypes for the mutations used in this study should be accompanied by citation describing identification of these mutations and the resulting phenotypes. It would also be helpful to describe the nature of these alleles (molecular lesion, gain vs loss of function, etc.). Some of this information is included in the Discussion, but it would be useful for the reader to learn this early on, when the chosen genotypes are presented.

      All this information is and was provided in the methods section and in Table 1, including stock numbers and sources of the stocks. Please see 'Methods, Drosophila genetics and embryo collections'.

      Fig. 2G,H - the X axis should be clearly labeled as logarithmic.

      We introduced the log2 label in the X-axis of Fig. 2G,H and any other panel in which this was not expressly made clear.

      In Fig. 2G the locations of lines showing fold changes for Twist and Snail seem incorrect. In Fig. 2H the dotted line does not appear to correspond to 50% of the number of phosphosites.

      We apologise for these errors, both have been corrected in the revised version of the manuscript.

      Fig. 5D can be improved by adding letters for the coloured clusters.

      We have labelled the clusters in Fig. 3B and Fig. 5D to ease the identification of biologically relevant clusters.

      It is unclear if any specific additional insight was obtained using SILAC, the authors may want to discuss this approach and outcomes more.

      SILAC has been widely used to deal with the inherent variability of proteomic analyses by introducing a metabolically labelled standard; in our case, w1118 flies fed with SILAC yeast were used as the standard. Because the inherent variability is larger in phosphoproteomic experiments (because protein identification is based on phosphorylated peptides only, see Methods), we used SILAC labelling only in the phosphoproteomic experiment.



      __Reviewer #2__

      Evidence, reproducibility and clarity


      The present article by Gomez et al. describes a deep proteomics analysis of the proteome and phosphoproteome of embryos mutated for key genes involved in the dorso-ventral axis in Drosophila melanogaster. Overall, this is a nice article showing new insight into this developmental process. The results are mainly descriptive, yet identify potential new players in the definition of the dorso-ventral axis.

      The generation of mutants for genes found up- or down-regulated in each mutant strain would be a significant addition to this manuscript. But I think in its current form the data brings enough new information on this particular developmental step and would be of interest for the fly community.

      My main concern is that the manuscript can be difficult to read and overly convoluted at times even for experts in the field. I would suggest the author move some methodological explanations from the results to the methods section to further detail the goals of some results sections.

      We have followed these suggestions and hope we have made the manuscript more easily readable.

      As an example, the goal of part 3) « A linear model for quantitative interpretation of the proteomes » is not clear to me. Are the authors comparing the abundance of a protein in the WT versus a theoretical WT in order to determine which fractions of mesoderm, lateral ectoderm and dorsal region are actually present in the WT? (...)

      Yes, in part, but the main purpose was to compare how well the theoretical WT, as ‘reconstituted’ from the mutants, corresponds to the observed actual WT (for which we have at least approximate values).

      The question that we faced when we started these calculations was: what is the ‘correct’ fraction (or proportion) we should use to weight each protein (or phosphosite) measurement in the mutants. Theoretically, these values should be those that result in the best match between the theoretical WT and the measured WT abundance of each protein (or phosphosite). We knew from actual measurements only the mesodermal fraction, which was determined to be ~20% of the cross-sectional area (Ref. 21: Rahimi, N., et al Dev. Cell. 2016). The neuroectoderm and ectoderm fractions were estimated to be approx. 40% each (Ref.: 22, Jazwinska, A et al. Development 1999), but we lacked an exact number. The systematic exploration of these proportions led us to conclude that indeed both the neuroectoderm and ectoderm fractions should be around 40% each, provided the mesoderm is fixed at 20%. Thus, we used these fractions: D: 0.4 L: 0.4 V: 0.2 for our follow-up analyses.
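      The weighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the abundance values are hypothetical and are not taken from the study, and the scan is a simple grid search rather than the procedure actually used. The theoretical WT for each protein is computed as a weighted sum of its abundance in the dorsalised, lateralised and ventralised mutants, and the dorsal fraction is scanned with the ventral (mesodermal) fraction fixed at 0.2.

```python
# Hypothetical illustration: all abundance values below are made up.

def reconstituted_wt(dorsal, lateral, ventral, w_d=0.4, w_l=0.4, w_v=0.2):
    """Theoretical WT abundance of one protein as a weighted sum over the mutants."""
    return w_d * dorsal + w_l * lateral + w_v * ventral


def best_dorsal_fraction(proteins, measured_wt, w_v=0.2, steps=80):
    """Scan the dorsal fraction (lateral = 1 - w_v - dorsal) and return the value
    that minimises the summed squared error against the measured WT."""
    best = None
    for i in range(steps + 1):
        w_d = (1.0 - w_v) * i / steps
        w_l = 1.0 - w_v - w_d
        err = sum(
            (reconstituted_wt(d, l, v, w_d, w_l, w_v) - wt) ** 2
            for (d, l, v), wt in zip(proteins, measured_wt)
        )
        if best is None or err < best[1]:
            best = (w_d, err)
    return best[0]


# Made-up (dorsal, lateral, ventral) abundances for four proteins.
proteins = [(10.0, 2.0, 1.0), (1.0, 8.0, 3.0), (4.0, 4.0, 12.0), (0.5, 6.0, 2.0)]
# Synthetic "measured" WT generated with the 0.4 / 0.4 / 0.2 weights.
measured_wt = [reconstituted_wt(d, l, v) for d, l, v in proteins]

print(round(best_dorsal_fraction(proteins, measured_wt), 2))
```

      Because the synthetic WT values were generated with the 0.4 / 0.4 / 0.2 weights, the scan recovers a dorsal fraction of 0.4; with real measurements the minimum would only be approximate.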

      (...) Or are they using it as a reference to obtain a fold change for the different proteins quantified (in this case why not use the WT?)?

      Yes, again, in part: as a reference for the EXPECTED fold changes, as would be predicted from the WT.

      Since we have moved some of the details of this approach from the main text to the methods section, we have also revised the remaining text and hope it is now clearer.

      The proteomics data must be deposited in a public repository. I did not see it stated in the methods section.

      All proteomics and phosphoproteomics data have been uploaded to PRIDE; see further comments above in response 13.

      The version of the uniprot database is quite old (2016) so is the version of MaxQuant used in this study. Any reasons for that (other than that the analysis was performed in 2016)?

      That is indeed the reason.

      The data were run on different MS platforms, how did the authors account for the variability in MS signals? What samples were run on which MS platform? Were the WT embryos ran on both?

      We measured three replicates, and all five genotypes (four mutant genotypes plus wildtype) for each of the replicates were measured on the same instrument. Specifically, for the whole proteome analyses, replicate one and three of all genotypes were measured on the QExactive Plus instrument and replicate 2 of all genotypes were measured on a QExactive HF-x instrument, as were the phosphoproteomes. So, indeed, the wildtype was measured on both instruments. We thus did not observe instrument-specific bias in the PCA analysis for the proteome data.

      We have added this in more detail to the method section:

      “Samples of replicate one and three were measured on the QE-Plus system and replicate two was measured on the QE-HF-x system.

      For phosphoproteome analysis, (…) Samples of all three replicates were measured on the QEx-HFx system. We added trial samples measured on the QEx-Plus system to increase the phosphosite coverage using the match between runs algorithm.”

      In the methods section the authors mention that a high-pH reverse phase fractionation was performed? How many fractions of High-pH reverse phase separation were injected per sample? Was this separation performed for all the samples?

      We have adjusted the Methods section regarding the high-pH fractionation by adding the following sentence: “Fractions were collected every 60s in a 96 well plate over 60 min gradient time collecting a total number of 8 fractions per sample.“

      Why did the authors used label-free (proteome) and SILAC (phosphoproteome) quantification methods?

      See our response to reviewer #1, point 19.

      Why is the threshold based on the Q3 of the standard deviation (if I got it right) ? Couldn't they be calculated directly on the distribution of the ratio?

      We could also have done it that way.

      However, we also wanted to take into account the variation between the replicates, i.e., the quality of the individual measurements. We therefore devised the procedure we used, in which the standard deviation of the individual technical replicates enters the calculation. Had we calculated the threshold directly from the ratio of the averages, the variability between replicates would have been ignored, and we considered it more appropriate to take the more conservative approach. As it turns out, the cut-off would have been very similar had we calculated it the way the referee suggests.

      Page 6: The supplementary figure 2E refers to the protein Cactus and the text to CKII, please modify one or the other to avoid any confusion. Page 7: A dot is missing at the end of the following sentence « if used with the assumed weightings for the populations »

      We have corrected these sentences.

      Page 19: Replace SppedVac by SpeedVac

      We have corrected the error in the manuscript and thank the reviewer for the detailed inspection.

      Page 8: why not using a z-score with thresholds directly instead of a -1/+1/0 system and then using the z-score?

      Because we wanted to compare the relative changes over wt between mutants (i.e. the similarity between 1 0 0 and 0 -1 -1) rather than the relationship of their absolute values to the wt, and to assign proteins with similar relationships into the same dorso-ventral regulation categories.

      The text states this (previously in main text, now in methods):

      “The reason for this is that this method takes into account that value sets that represent similar relative differences between the mutants (for example, 0 -1 -1 vs. 1 -1 -1 or 1, 0, 0) are biologically more similar to each other than the raw values indicate. The z-scores for all of these cases would be 1.1547 -0.5774 -0.5774.”
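      The values quoted above can be checked with a short Python sketch. It assumes that the z-score is computed per protein across the three mutant values using the sample standard deviation (n - 1), which is the convention that reproduces the quoted numbers; the function name is illustrative only.

```python
from statistics import mean, stdev


def z_scores(values):
    """Z-score each value against the mean and sample standard deviation of the set."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [round((v - m) / s, 4) for v in values]


for pattern in [(0, -1, -1), (1, -1, -1), (1, 0, 0)]:
    print(pattern, z_scores(pattern))
# Each pattern prints [1.1547, -0.5774, -0.5774]
```

      All three patterns collapse onto the same z-score vector, which is why proteins with these different raw codes end up in the same regulation category.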

      In the abstract it is mentioned that 3,399 proteins are differentially regulated at the proteome level versus 1,699 significantly deregulated at a 10 % FDR in the main text (page 5). Is there a reason for this discrepancy? Same comment for the phosphopeptides.

      We now see the need to clarify this point better, and we have edited the text accordingly.

      The second number refers to those proteins that show statistically significant changes based on ANOVA (1699 proteins).

      The first number (3398; note that the number 3399 in the abstract was a typo, now corrected) includes all proteins that were detected in at least 1 replicate in the wildtype (5883/6111) minus those that do not change between the genotypes (2156/6111) and minus all those that change in the same direction in all mutants (329).

      This includes proteins that are automatically excluded from ANOVA, i.e., those that are detected only in the wildtype (35/6111 proteins) or in two or more genotypes but only in 1 technical replicate, as well as ANOVA-negative ones.

      As we stated, we did this because it “allows us to include the important group of proteins that show a ‘perfect’ behaviour, like dMyc and WntD, in that they are undetectable in the mutants that correspond to the regions in the normal embryo where these genes are not expressed.”. This 'regulated' set consists of those proteins that exceed the |0.5| fold threshold.
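      The corrected count of 3398 follows directly from the numbers given in this response. The snippet below is only an arithmetic check; the variable names are ours.

```python
detected_in_wt = 5883   # proteins detected in at least 1 wildtype replicate
unchanged = 2156        # proteins that do not change between the genotypes
same_direction = 329    # proteins that change in the same direction in all mutants

regulated = detected_in_wt - unchanged - same_direction
print(regulated)  # 3398
```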

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This review is a list of many individual critiques. It is unclear what the expertise of the reviewer is (they do not provide the answer to that question in the review form, unlike the other referees), but several of the criticisms are unfounded. Three of the PIs of this work are researchers with extensive experience in Drosophila genetics and early development but are nevertheless confounded by some of the comments made by this referee.

      The mutants do not completely "flatten" the embryos.

      We do not claim that they do. Nor are the ventral, lateral and dorsal regions in the normal embryo completely ‘flat’ or homogeneous. But the mutants are good representations of the major fates in these regions, as a wealth of published literature from the last 30 years indicates.

      For instance, Tl10B broadly expresses snail but also expresses sog in the head. (i.e. Fig 1B - sog and sna expression in Figure 1B mutant backgrounds looks odd.) The sog expression likely relates to a deficiency specific effect.

      This ‘sensitive’ area is well known also from other genetic conditions – e.g. partial loss of dorsal and indeed in Spn27A mutants. It is therefore not specific to the Tl10B deficiency but says something about gene interactions in this region. Thus, this cannot be a deficiency-specific effect.

      Is sog seen in a Toll10B/+ mutant background?

      Yes, it is, and more frequently than in Toll10B/Def.

      The deficiency used for the Toll10B experiment is Df(3R)ro80b which is quite large and deletes 14+ genes.

      True. However, this does not matter: the mothers are heterozygous, so the genes are not missing, they are present in one wildtype copy! And these mothers are then mated with wildtype fathers, so if expression of these genes were needed in the embryo, then there would be another full wt copy of each. We appreciate that maternal effect genetics can be difficult to follow, but this is all work that has been done a long time ago, and is not the point of this paper at all.

      The deficiency used for the spn27A experiment is Df(2L)BSC7 and removes 4+ genes.

      Again, this would only matter if these were maternal effect genes that were needed for the establishment of the dorso-ventral axis, and they are not.

      Furthermore, the gd9 allele may not be a complete loss of function.

      It may not be – but what matters is the well characterized phenotype which has been shown to represent dorsal cell types.

      It is possible that the Toll10B allele picked up an accessory dominant mutation.

      This again would only matter if it was a dominant AND maternal effect mutation that affects the DV axis in the embryo – and there are very few of these known. And nothing in our analysis of these embryos, with which we have been working on and off over 3 decades and therefore know very well, indicates that our current stock is any different from those we have seen in the past.

      Unfortunately, these mutant phenotypes that affect DV and AP patterning mean that conclusions cannot be made that changes in protein relate to DV patterning.

      We simply do not understand this statement.

      Why do the mutant phenotypes (gene expression patterns and cell morphologies representative of the ventral, lateral and dorsal cell populations) not mean that the proteins downstream of the fate changes correspond to the cell fates?

      To get a better view of the ventralized phenotype, the authors should repeat the analysis by ectopically expressing Toll10B using the Gal4-UAS system; UAS-activate Toll transgenes are available.

      All Gal4-UAS maternal drivers, even the best and the strongest, result in mosaic expression. Our lab has extensive experience with this system and we know that, for example, the homogeneous, high levels of twist or snail expression that we see in spn or Tl10B embryos cannot be achieved with GAL4.

      Fig 1C-F - due to combined AP and DV effects seen with ventralizing mutants, it is important that the authors confirm that cross-section views relate to the middle to posterior of the embryo.

      We confirm this.

      Costaining with anti-Kr or -Caudal would help to ensure they are assaying the correct AP domain for pure DV effects.

      In our view, this is an unnecessary experiment. We know where the middle of the embryo is. If the reviewer does not believe us when we say we are showing a section from the middle, they can see that the sections are not from the end region by, for example, the cell number and the section angles.

      The authors refer to reference [60] for stages but there is no information regarding morphological criteria used under the microscope to stage the embryos.

      We have now added more detail in the methods section:

      Briefly, using a Zeiss binocular, the embryos were individually hand-selected on wet agar, which made the embryos semi-transparent, allowing the assessment of a range of morphological features, of which at least some are visible in each of the mutants:

      • Yolk distance to embryonic surface: distinguishes between early (stage 5a) and late cellularisation (stage 5b).
      • Yolk distribution within the embryo: identification of large embryonic movements of the germ band (e.g., initiation of germ band extension, marking the initiation of stage 7). In DV patterning mutants this is seen as twisting of the embryo.
      • Change in the outline of the dorsal-posterior region: polar cell movement from the posterior most region of the embryo (stage 5a/b) to stage 6a/b.
      • Formation of the cephalic and dorsal folds: identification of stage 6 (initiation of cephalic fold) and stage 7 (dorsal folds).

      The combined use of these morphological criteria, together with the synchronised egg collections, allows accurate staging of wild type and mutant embryos.

      Furthermore, what is stage 6a,b? Stage 6 is not typically divided in two stages nor is it clear what a,b relate to.

      We used a generally accepted standard for staging embryos: Campos-Ortega J.A. and Hartenstein V. ‘The embryonic development of Drosophila melanogaster’ book (ref. Nº 60). In this book, they describe the morphological criteria that can be followed in living embryos for proper staging. These stages, with these exact names, are shown on pages 11 and 12 of the 1997 edition (2nd edition).

      According to the published timetable of Drosophila development by Foe et al. 1993 (not cited), gastrulating embryos are 200 min or 3 hr 20'. It's unclear if this is the stage that was assayed.

      Foe is a beautiful paper, but we did not cite it because the commonly used nomenclature predates it (Campos-Ortega and Hartenstein 1985).

      In addition, timing depends on temperature whereas morphological criteria do not.

      The mutant embryos likely develop at different rates relative to wildtype. It seems important to provide details about the staging of embryos. If the mutant embryos take longer to gastrulate, for instance, might that also be a factor that impacts the proteome.

      As described above, we used a combination of criteria to accurately judge staging. DV patterning embryos could in principle develop faster or slower than wildtype. We performed synchronised egg collections (Methods: Embryo collections) for 15’. Therefore, any developmental timing defect would have become evident based on a difference in the number of embryos entering stage 6 and 7 at the point of visual inspection of the collections. This was not the case.

      How many replicates for each genotype? In the text it states, "replicates from the same genotype clustered together (Fig. 2E)....." Similar vague reference for phosphoproteome follows (Fig 2F). It is then stated that it was impossible to determine the experimental source for this variation. Could it relate to differences in timing of samples?

      We had given the numbers of replicates in the figure legend but have now also included them in the methods section for more clarity. We did 3 replicates for each genotype in each experiment, with the exception of gd9 and spn27aex mutants, for which we did 2 biological replicates each with 3 replicates, making a total of 6 replicates for these genotypes in the proteomic experiment. We have included an additional clarification in figure legend 2. The number of replicates per genotype per experiment can also be seen from the correlation matrices shown in Fig. 2E and 2F, in which the replicates are shown individually. The measurements for each replicate for each genotype within each experiment were reported in Supplementary Tables 2 and 3, 'description' tabs of the worksheets.

      The lengthy discussion of ratio estimation on page 7 should be streamlined and made more clear. Are the authors throwing out data and only keeping samples that support their model? This seems like overfitting - if I am understanding correctly, you are selecting the samples that support the "majority of proteins fit the linear model" but this isn't necessarily the case.

      No, this is a misunderstanding. We do not select data.

      We have rephrased this section, but to explain here briefly: We do not select any samples, we state that the majority of proteins fit the theoretical model (and that is not even surprising, because any protein that does not change across the populations will automatically fit the model). We then discuss why some might NOT fit the model. The model doesn’t need to be supported, it simply is a calculation that allows us to stratify the data.

      They call this the 'correct' manner (see section 4 page 7) but it seems like a working model and presumptuous to imply that it is the correct way.

      We explained in the text why we refer to this as ‘correct’. It is a matter of definition, not presumption, and we even used quotes to be clear about this. ‘Correct’ indicates a combination of values that is consistent with the biological model that the DV mutants are good representations of the corresponding embryonic cell populations in a wild type embryo. We do not in any way ‘throw out’ other data, we just note they don’t fit that model. Clarifications on the concept for the model have been added in various places in the text.

      Figure 3C - it is confusing to use a circular diagram to show DV inferred position of the 14 clusters as their position on the circle does not correspond to where they are expressed on the embryos. Perhaps a stacked bar graph for 6 different domains would be better.

      This figure does not show positions of clusters. It is simply a pie chart, as is stated in the figure legend and as can be seen by the numbers and the corresponding sizes of the sectors. We have tried a stacked representation (shown below), but find it no clearer and have therefore stuck with this very common way of representing quantities, and in particular, proportions. We use the same representation with the same colour schemes in all subsequent figures, so proportions can be compared across figures.

      It is very hard to follow the text on page 9.

      We have rephrased this section.

      It is very hard to see the gene expression patterns shown in Fig 4A with the color scheme/scale used.

      We appreciate this colour scheme does not correspond to the commonly used dark colour on a light background which would mimic histochemistry to show gene expression. The ‘inferno’ colour scheme was used because it allows better quantitative comparisons between subtly different patterns. However, to make these figures more similar to the types of in situ hybridisations that embryologists are used to seeing, we now use a different representation.

      In general, Figure 4 is uninterpretable - in particular, what do the numbers mean on the greyscale circle plots in panel D?

      We apologize for having failed to explicitly include the explanation for this in the figure legend. The reader will notice that these numbers add up to the number in the circle to the left, and the numbers indicate the number of proteins showing perfect matches (white), partial overlaps (grey) and mismatches (black). We have improved the graphic representation and added an explanation in the figure legend.

      Figure 5A. Why wasn't protein abundance and phosphosites identified from an individual, identical sample?

      This was because of the way the project developed over the course of the research, and the protein part was originally intended only as a proof of concept, with the intended focus being the phosphoproteome. We later decided to include a full analysis of the proteome, but did not consider it worthwhile and necessary to repeat the entire laborious and expensive experiment with both analyses being done from the same samples.

      How can one be sure that the phosphosites were correctly assigned if the proteins were not detected in the proteome but they were only identified in the phosphosite analysis?

      We are not sure we understand this question. The phosphoproteomic analysis identifies phosphopeptides of proteins that in turn allow one to identify the protein itself and the amino acid in that peptide that is phosphorylated. So the identification is done only WITHIN the phosphoproteomic analysis and does not relate directly to the proteomic analysis. This explains why we found some phosphopeptides for which we did not detect the full host protein in the proteomic analysis.

      Thus, if a protein was detected only in either of the experiments, this fact doesn’t modify the validity of the result, because the identification was done individually for each experiment.

      Page 16 - much discussion about the difference between Spn27A and Toll10b/def mutant background. One has half as much Toll receptor. The phenotype of Toll10b/+ should be examined.

      Both genotypes have been extensively examined in the past. Tl10B/def has only one copy of the gene from the mother, and the mutant protein is constitutively active. By putting it over a deficiency, we (and others in the past) made sure that the exclusive source for Tl signalling is from this gain of function Tl allele, and that the wildtype receptor, which would still be activated by the natural ligand in a graded pattern along the DV axis, does not confound the result.

      The Tl10B/+ combination creates a less ventralized phenotype which is not more similar to that of spn27Aex/def but in fact less similar.

      Page 12 - hard to follow the discussion of modeling (?) presented in Figure 6. The results (bottom of page 12 - #1 "most networks are enriched for cellular components associated with regulation of gene expression" and page 13 #2 - "cytoskeleton emerges as a major target of regulation") seem vague and unsubstantiated. Rhabdomere, P granule, micropyle, autophagosome?

      We agree with the reviewer that there are many cellular components that are enriched in the diffused network analyses, many of them unrelated to morphogenesis. We had highlighted this finding on page 12, paragraph 3. Nevertheless, we have rephrased the statements as ‘the heat maps illustrate that most of the enriched cellular components in both experiments were highly enriched with cellular components associated with DNA and RNA metabolism or the regulation of gene expression.’ and have now included numbers.

      We think ‘a major target’ for phosphorylation does in fact apply to the cytoskeleton, and we had already supplied the number to substantiate this in the manuscript (14/62).

      Readers will be able to evaluate these network analyses based on their own fields of interest or particular questions they may wish to address. We haven’t excluded any cellular component terms.

      Figure 7 seems like a separate study.

      Why were the phosphopeptides investigated to determine if they relate to phosphorylated proteins? Phosphoantibodies could have been generated for a subset. Instead the manuscript pivots to analysis of microtubules.

      We are reporting here one example of a proof-of-concept study that we carried out, chosen based on our own research interests and on available tools and reagents. There are clearly many other avenues that could have been explored and that others may want to explore, but that go well beyond this report. We have made this more explicit in the text.

      Page 14 - discussion first paragraph. Please cite ref[10] when discussing the "previous study" otherwise the reader will not understand which study you are referring to until the next paragraph.

      We have moved the reference from its current position to the one suggested by the reviewer.

      • In general, the study would benefit from more attention to references and citations of prior work. A comparison of this work to the Gong et al. Development 2004 study should be made earlier.

      This work is cited very early on, namely in the introduction.

      • The authors start off saying that no other study has looked at proteins from a spatial perspective.

      We are unsure what the reviewer refers to. We say precisely the opposite: we indicate that studies have been performed to look at differences in cell populations, including that by the lab of Jon Minden (Gong et al), a highly respected former co-author of one of the current authors (ML). We do state that the technologies at the time did not allow the same depth and temporal resolution as the methods that are available nowadays. For instance, Gong et al. used an excellent and original approach at the time, which however did not detect Snail and Twist in the ventralized mutants.

      The only time we say ‘no other study’ is about ‘region-specific post-translational regulation of proteins’ - though we do state in the discussion that Gong et al would have detected some of these cases because they used 2D gels.

      • Along these lines, there is another more recent proteomic study from Beati et al. Fly 2020 using similarly staged embryos. How do these other experiments compare to the current ones? As they apparently analyzed proteome and phosphopeptides from an identical sample, are the authors' new data using separate samples consistent?

      This study is actually about a later stage (stage 8 embryos, post-gastrulation). Again, an excellent study, but not directly relevant to our current analysis. It validates the use of SILAC in Drosophila, although it is not the first study to do this. Furthermore, it looks at a different question and biological process using a mutant, htl, to understand the effect of FGF signalling.

      • Furthermore, Adam Martin's lab has been studying microtubule action along the dorsoventral axis (Denk-Lobnig et al 2021) and this work is not cited.

      Denk-Lobnig et al 2021 is about spatial patterns of myosin and actin and how that is governed genetically on the ventral side of the embryo, pertaining primarily to ventral furrow formation. It does not analyse microtubules nor dorsal-ventral cell populations.

      It is possible there may be some confusion with another excellent study from Adam Martin’s lab, in which the role of microtubules is analysed. But this is exclusively in the ventral furrow, and the study did not look at the effect of microtubule depolymerisation on nuclear positioning nor membrane behaviour. We cite this work extensively (Ref.: 36, Ko et al. JCB 2019) and we compare our results to that paper. However, our work here goes beyond this study in that it looks at all cells along the DV axis.

      General comments:

      Typos throughout. For example, page 4 section heading "dorso-ventral cell..."

      We have scanned the entire document for typos.

      Font size extremely small - for example see Figure 1A gene names, and 1F magnified view.

      We have adjusted the fonts in the main figures.

      Scale bars not shown when showing magnified views. For example, see Fig 1E.

      We have added these.

      Reviewer #3 (Significance (Required)): This study by Gomez et al. uses a proteomic-centered approach to study proteomes associated with cell populations in the embryo that they argue relate to different positions along the dorso-ventral axis. They generate a proteomic resource, though it was unclear how anyone could use the data they produce. There is no searchable database and we have to trust that the authors will ultimately provide such a resource to the community.

      All proteomics and phosphoproteomics data have been uploaded to PRIDE. Also see responses to the other referees’ queries about this point.

      There is the potential for interesting insights but the work is not presented in a way that is accessible or useful. The presentation needs significant improvement.

      We have improved the presentation of the results, as suggested by all reviewers.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Strengths:

      This work (almost didactically) demonstrates how to develop, calibrate, validate and analyze a comprehensive, spatially resolved, dynamical, multicellular model. Testable model predictions of (also non-monotonic) emergent behaviors are derived and discussed. The computational model is based on a widely-used simulation platform and shared openly such that it can be further analyzed and refined by the community.

      Weaknesses:

      While the parameter estimation approach is sophisticated, this work does not address issues of structural and practical non-identifiability (Wieland et al., 2021, DOI:10.1016/j.coisb.2021.03.005) of parameter values, given just tissue-scale summary statistics, and does not address how model predictions might change if alternative parameter combinations were used. Here, the calibrated model represents one point estimate (column "Value" in Suppl. Table 1) but there is specific uncertainty of each individual parameter value and such uncertainties need to be propagated (which is computationally expensive) to the model predictions for treatment scenarios.

      We thank the reviewer for the excellent suggestions and observations. The CaliPro parameterization technique we applied emphasizes finding a robust parameter space rather than a global optimum. To address structural non-identifiability, we computed partial rank correlation coefficients at each iteration of the calibration process to ensure that each parameter had a relevant influence on model outputs. We also found ranges of parameter values that met the passing criteria but produced inconsistent outcomes when tested in replicate. This led us to narrow the parameters further into a single parameter set that retained stochastic variability but did not vary so much between replicate runs as to be unreliable. Additional discussion on this point has been added to lines 623-628. We acknowledge that there are likely other parameter sets or model rules that would produce similar outcomes, but the main purpose of the model was to better understand the system and make new predictions, which our calibration scheme allowed us to accomplish.

      Regarding practical non-identifiability, we acknowledge that there are some behaviors that are not captured in the model because those behaviors were not specifically captured in the calibration data. To ensure that the behaviors necessary to answer the aims of our paper were included, we used multiple different datasets and calibrated with multiple different output metrics. We believe we have identified the appropriate parameters to recapitulate the dominating mechanisms underlying muscle regeneration. We have added additional discussion on practical non-identifiability to lines 621-623.
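      For readers unfamiliar with the partial rank correlation screening mentioned above, a minimal sketch follows. It is not the CaliPro implementation: it handles one parameter of interest and a single other parameter, using the first-order partial correlation formula on rank-transformed values, and all names and numbers are hypothetical.

```python
from math import sqrt


def ranks(values):
    """Average ranks (1-based); tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def prcc(param, output, other_param):
    """Rank correlation of param with output after removing the rank-linear
    effect of one other parameter (first-order partial correlation)."""
    rx, ry, rz = ranks(param), ranks(output), ranks(other_param)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up sampled parameter values and a model output that depends
# monotonically (but nonlinearly) on the decay rate only.
decay = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
diffusion = [3.0, 1.0, 4.0, 1.5, 5.0, 9.0, 2.0, 6.0]
output = [d ** 3 for d in decay]

print(round(prcc(decay, output, diffusion), 3))
```

      A coefficient near +1 or -1 indicates that the output responds monotonically to the parameter once the other parameter is accounted for; with more than two parameters, the usual approach is to regress the ranks on all remaining parameters and correlate the residuals.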

      Suggested treatments (e.g. lines 484-486) are modeled as parameter changes of the endogenous cytokines (corresponding to genetic mutations!) whereas the administration of modified cytokines with changed parameter values would require a duplication of model components and interactions in the model such that cells interact with the superposition of endogenous and administered cytokine fields. Specifically, as the authors also aim at 'injections of exogenously delivered cytokines' (lines 578, 579) and propose altering decay rates or diffusion coefficients (Fig. 7), there needs to be a duplication of variables in the model to account for the coexistence of cytokine subtypes. One set of equations would have unaltered (endogenous) and another one have altered (exogenous or drugged) parameter values. Cells would interact with both of them.

      Our perturbations did not include delivery of exogenous cytokines and instead focused on microenvironmental changes in cytokine diffusion and decay rates or in specific cytokine concentration levels. For example, the purpose of the VEGF delivery perturbation was to test how an increase in VEGF concentrations would alter regeneration outcome metrics, with the assumption that the delivered VEGF would act in the same manner as the endogenous VEGF. We have clarified the purpose of the simulations on line 410. We agree that it would be worthwhile to explore whether model predictions change when endogenous and exogenous cytokines are represented separately; however, we did not explore this type of scenario.

      This work shows interesting emergent behavior from nonlinear cytokine interactions but the analysis does not provide insights into the underlying causes, e.g. which of the feedback loops dominates early versus late during a time course.

      Indeed, analyzing the model to fully understand the time-varying interactions between the multiple feedback loops is a challenge in and of itself, and we appreciate the opportunity to elaborate on our approach to addressing this challenge. First: the crosstalk/feedback between cytokines and the temporal nature was analyzed in the heatmap (Fig. 6) and lines 474-482. Second: the sensitivity of cytokine parameters to specific outputs was included in Table 9 and full-time course sensitivity is included in Supplemental Figure 2. Further correlation analysis was also included to demonstrate how cytokine concentrations influenced specific output metrics at various timepoints (Supplemental Fig. 3). We agree that further elaboration of these findings is required; therefore, we added lines 504-509 to discuss the specific mechanisms at play with the combined cytokine interactions. We also added more discussion (lines 637-638) regarding future work that could develop more analysis methods to further investigate the complex behaviors in the model.

      Reviewer #2 (Public Review):

      Strengths:

      The manuscript identified relevant model parameters from a long list of biological studies. This collation of a large amount of literature into one framework has the potential to be very useful to other authors. The mathematical methods used for parameterization and validation are transparent.

      Weaknesses:

      I have a few concerns which I believe need to be addressed fully.

      My main concerns are the following:

      (1) The model is compared to experimental data in multiple results figures. However, the actual experiments used in these figures are not described. To me as a reviewer, that makes it impossible to judge whether appropriate data was chosen, or whether the model is a suitable descriptor of the chosen experiments. Enough detail needs to be provided so that these judgements can be made.

      Thank you for raising this point. We created a new table (Supplemental table 6) that describes the techniques used for each experimental measurement.

      (2) Do I understand it correctly that all simulations are done using the same initial simulation geometry? Would it be possible to test the sensitivity of the paper results to this geometry? Perhaps another histological image could be chosen as the initial condition, or alternative initial conditions could be generated in silico? If changing initial conditions is an unreasonably large request, could the authors discuss this issue in the manuscript?

      We appreciate your insightful question regarding the initial simulation geometry in our model. The initial configuration of the fibers/ECM/microvascular structures was kept consistent but the location of the necrosis was randomly placed for each simulation. Future work will include an in-depth analysis of altered histology configuration on model predictions which has been added to lines 618-621. We did a preliminary example analysis by inputting a different initial simulation geometry, which predicted similar regeneration outcomes. We have added Supplemental Figure 5 that provides the results of that example analysis.

      (3) Cytokine knockdowns are simulated by 'adjusting the diffusion and decay parameters' (line 372). Is that the correct simulation of a knockdown? How are these knockdowns achieved experimentally? Wouldn't the correct implementation of a knockdown be that the production or secretion of the cytokine is reduced? I am not sure whether it's possible to design an experimental perturbation which affects both parameters.

      We appreciate that this important question has been posed. Yes, in order to simulate the knockout conditions, the cytokine secretion was reduced/eliminated. The diffusion and decay parameters were also adjusted to ensure that the concentration within the system was reduced. Lines 391-394 were added to clarify this assumption.

      (4) The premise of the model is to identify optimal treatment strategies for muscle injury (as per the first sentence of the abstract). I am a bit surprised that the implemented experimental perturbations don't seem to address this aim. In Figure 7 of the manuscript, cytokine alterations are explored which affect muscle recovery after injury. This is great, but I don't believe the chosen alterations can be done in experimental or clinical settings. Are there drugs that affect cytokine diffusion? If not, wouldn't it be better to select perturbations that are clinically or experimentally feasible for this analysis? A strength of the model is its versatility, so it seems counterintuitive to me to not use that versatility in a way that has practical relevance. - I may well misunderstand this though, maybe the investigated parameters are indeed possible drug targets.

      Thank you for your thoughtful feedback. The first sentence (lines 32-34) of the abstract was revised to focus on beneficial microenvironmental conditions to best reflect the purpose of the model. The clinical relevance of the cytokine modifications is included in the discussion (lines 547-558) with additional information added to lines 524-526. For example, two methods to alter diffusion experimentally are: antibodies that bind directly to the cytokine to prevent it from binding to its receptor on the cell surface and plasmins that induce the release of bound cytokines.

      (5) A similar comment applies to Figure 5 and 6: Should I think of these results as experimentally testable predictions? Are any of the results surprising or new, for example in the sense that one would not have expected other cytokines to be affected as described in Figure 6?

      We appreciate the opportunity to clarify the basis for these perturbations. The perturbations included in Figure 5 were designed to mimic the conditions of a published experiment that delivered VEGF in vivo (Arsic et al. 2004, DOI:10.1016/J.YMTHE.2004.08.007). The perturbation input conditions and experimental results are included in Table 8, and Supplemental Table 6 has been added to include experimental data and a method description of the perturbation. The results of this analysis provide both validation and new predictions, because some of the outputs were measured in the experiments while others were not. The additional output metrics and timepoints that were not collected in the experiment allow for a deeper understanding of the dynamics and mechanisms leading to the changes in muscle recovery (lines 437-454). These model outputs can provide the basis for future experiments; for example, they highlight which time points would be more important to measure and even provide predicted effect sizes that could be the basis for a power analysis (lines 639-640).

      Regarding Figure 6, the published experimental outcomes of cytokine KOs are included in Table 8. The model allowed comparison of different cytokine concentrations at various timepoints when other cytokines were removed from the system due to the KO condition. The experimental results did not provide data on the impact on other cytokine concentrations but by using the model we were able to predict temporally based feedback between cytokines (lines 474-482). These cytokine values could be collected experimentally but would be time consuming and expensive. The results of these perturbations revealed the complex nature of the relationship between cytokines and how removal of one cytokine from the system has a cascading temporal impact. Lines 533-534 have been added to incorporate this into the discussion.

      (6) In figure 4, there were differences between the experiments and the model in two of the rows. Are these differences discussed anywhere in the manuscript?

      We appreciate your keen observation and the opportunity to address these differences. The model did not match experimental results for CSA output in the TNF KO and anti-inflammatory nanoparticle perturbation or TGF levels with the macrophage depletion. While it did align with the other experimental metrics from those studies, it is likely that there are other mechanisms at play in the experimental conditions that were not captured by simulating the downstream effects of the experimental perturbations. We have added discussion of the differences to lines 445-454.

      (7) The variation between experimental results is much higher than the variation of results in the model. For example, in Figure 3 the error bars around experimental results are an order of magnitude larger than the simulated confidence interval. Do the authors have any insights into why the model is less variable than the experimental data? Does this have to do with the chosen initial condition, i.e. do you think that the experimental variability is due to variation in the geometries of the measured samples?

      Thank you for your insightful observations and questions. The lower model variability is attributed to the larger sample size of model simulations compared to experimental subjects. Running 100 simulations narrows the confidence interval (average 2.4 and max 3.3) compared to the experiments, which typically had a sample size of less than 15. If the number of simulations is reduced to 15, the stochasticity within the model results in a larger confidence interval (average 7.1 and max 10). There are also several possible confounding variables in the experimental protocols (e.g., variations in injury, different animal subjects for each timepoint) that are kept constant in the model simulation. We have added discussion of this point to the manuscript (lines 517-519). Future work with the model will examine how variations in conditions, such as initial muscle geometry and injury, alter regeneration outcomes and overall variability. This discussion has been incorporated into lines 640-643.
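
      The dependence of confidence-interval width on the number of runs can be sketched as follows. This is a minimal illustration under a normal approximation with a fixed, hypothetical standard deviation; it is not the authors' analysis and does not attempt to reproduce the reported interval values.

```python
from math import sqrt
from statistics import NormalDist

def ci_half_width(sd, n, level=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level=0.95
    return z * sd / sqrt(n)

sd = 1.0  # hypothetical run-to-run standard deviation (arbitrary units)
half_width_100 = ci_half_width(sd, 100)
half_width_15 = ci_half_width(sd, 15)

# For equal SD, the interval widens by sqrt(100 / 15) when n drops from 100 to 15.
ratio = half_width_15 / half_width_100
```

      Under these assumptions the interval at n = 15 is about 2.6 times wider than at n = 100; a t-based interval would widen slightly more at the smaller sample size.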

      (8) Is figure 2B described anywhere in the text? I could not find its description.

      Thank you for pointing that out. We have added a reference for Fig. 2B on line 190.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The model code seems to be available from https://simtk.org/projects/muscle_regen but that website requests member status ("This is a private project. You must be a member to view its contents.") and applying for membership could violate eLife's blind review process. So, this reviewer liked to but couldn't run the model her/himself. To eLife: Can the authors upload their model to a neutral server that reviewers and editors can access anonymously?

      The code has been made publicly available on the following sites:

      SimTK: https://simtk.org/docman/?group_id=2635

      Zenodo: https://zenodo.org/records/10403014

      GitHub: https://github.com/mh2uk/ABM-of-Muscle-Regeneration-with-MicrovascularRemodeling

      Line 121 has been updated with the new link and the additional resources were added to lines 654-657.

      (2) The muscle regeneration field typically studies 2D cross-sections and the present model can be well compared to these other 2D models but cells as stochastic and localized sources of diffusible cytokines may yield different cytokine fields in 3D vs. 2D. I would expect more broadened and smoothened cytokine fields (from sources in neighboring cross-sections) than what the 2D model predicts based on sources just within the focus cross-section. Such relations of 2D to 3D should be discussed.

      We thank the reviewer for the excellent suggestions and observations. It has been reported in other CompuCell3D models (Sego et al. 2017, DOI:10.1088/1758-5090/aa6ed4) that the convergence of diffusion solutions between 2D and 3D model configurations had similar outcomes, with the 3D simulations presenting excessive computational cost without contributing any noticeable additional accuracy. Similarly, other cell-based ABMs that incorporate diffusion mechanisms (Marino et al. 2018, DOI:10.3390/computation6040058) have found that 2D and 3D versions of the model both predict the same mechanisms and that the 2D resolution was sufficient for determining outcomes. Lines 615-618 were added to elaborate on this topic.

      (3) Since the model (and title) focuses on "nonlinear" cytokine interactions, what would change if cytokine decay would not be linear (as modeled here) but saturated (with nonlinear Michaelis-Menten kinetics as ligand binding and endocytosis mechanisms would call for)?

      Thank you for raising an intriguing point. The model includes a combination of cytokine decay as well as ligand binding and endocytosis mechanisms that can be saturated. For a cytokine-dependent model behavior to occur, the cytokines necessary to induce that action had to reach a minimum threshold. Once that threshold was reached, that amount of the cytokine would be removed at that location to simulate ligand-receptor binding and endocytosis. These ligand binding and endocytosis mechanisms behave in a saturated way, removing a set amount when above a certain threshold or a defined ratio when under the threshold. Lines 313-315 were revised to clarify this point. There were certain concentrations of cytokines where we saw a plateau in outputs, likely as a result of reaching a saturation threshold (Supplemental Fig. 3). In future work, more robust mathematical simulation of binding kinetics of cytokines (e.g., using ODEs) could be included.
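
      One way to read the saturating removal rule described above is sketched below: a fixed amount is removed when the local concentration is at or above a threshold, and a fixed fraction is removed otherwise. The function name, parameter names, and numerical values are hypothetical and chosen only for illustration; the actual rule and values are those given in the manuscript and model code.

```python
def cytokine_uptake(concentration, threshold, fixed_amount, ratio):
    """Amount of cytokine removed at one lattice site in one step (illustrative).

    At or above `threshold`, uptake saturates at `fixed_amount`;
    below it, uptake is proportional to the local concentration.
    """
    if concentration >= threshold:
        removed = fixed_amount
    else:
        removed = ratio * concentration
    # Never remove more than is present at the site.
    return min(removed, concentration)

# Hypothetical values: threshold 4, saturated uptake 2, sub-threshold ratio 0.5.
above = cytokine_uptake(10.0, threshold=4.0, fixed_amount=2.0, ratio=0.5)
below = cytokine_uptake(2.0, threshold=4.0, fixed_amount=2.0, ratio=0.5)
```

      With these illustrative values, choosing `fixed_amount` equal to `ratio * threshold` makes the rule continuous at the threshold, which gives a piecewise-linear stand-in for a Michaelis-Menten-like saturation curve.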

      (4) Limitations of the model should be discussed together with an outlook for model refinement. For example, fiber alignment and ECM ultrastructure may require anisotropic diffusion. Many of the rate equations could be considered with saturation parameters etc. There are so many model assumptions. Please discuss which would be the most urgent model refinements and, to achieve these, which would be the most informative next experiments to perform.

      We appreciate your thoughtful consideration of the model's limitations and the need for a comprehensive discussion on model refinements and potential future experiments. The future direction section was expanded to discuss additional possible model refinements (lines 635-643) and additional possible experiments for model validation (lines 630-634).

      (5) It is not clear how the single spatial arrangement that is used affects the model predictions. E.g. now the damaged area surrounds the lymphatic vessel but what if the opposite corner was damaged and the lymphatic vessel is deep inside the healthy area?

      Thank you for highlighting the importance of considering different spatial arrangements in the model and its potential impact on predictions. We previously tested model perturbations that included specifying the injury surrounding the lymphatic vessel versus on the side opposite the vessel. Since this paper focuses more on cytokine dynamics, we plan to include this perturbation, along with other injury alterations, in a follow-on paper. We added more context about this in the future efforts section lines 640-643.

      (6) It seems that not only parameter values but also the initial values of most of the model components are unknown. The parameter estimation strategy does not seem to include the initial (spatial) distributions of collagen and cytokines and other model components. Please discuss how other (reasonable) initial values or spatial arrangements will affect model predictions.

      We appreciate your thoughtful consideration of unknown initial values/spatial arrangements and their potential influence on predictions. Initial cytokine levels prior to injury had a low relative concentration compared to levels post injury and were assumed to be negligible. The initial spatial distribution of cytokines was not defined as an input (except in knockout simulations); instead, cytokines are secreted from cells (with baseline resident cell counts defined from the literature). The distribution of cytokines is an emergent behavior that results from the cell behaviors within the model. The collagen distribution is altered in response to clearance of necrosis by the immune cells (decreased collagen with necrosis removal) and subsequent secretion of collagen by fibroblasts. The secretion of collagen from fibroblasts was included in the parameter estimation sweep (Supplemental Table 1).

      We are working on further exploring the model sensitivity to altered spatial arrangements and have added this to the future directions section (lines 618-621), as well as provided Supplemental Figure 5 to demonstrate that model outcomes are similar with altered initial spatial arrangements.

      (7) Many details of the CC3D implementation are missing: overall lattice size, interaction neighborhood order, and "temperature" of the Metropolis algorithm. Are the typical adhesion energy terms used in the CPM Hamiltonian and if so, then how are these parameter values estimated?

      Thank you for bringing attention to the missing details regarding the CC3D implementation in our manuscript. We have included supplemental information providing greater detail for CPM implementation (Lines 808-854). We also added two additional supplemental tables for describing the requested CC3D implementation details (Supplemental Table 4) and adhesion energy terms (Supplemental Table 5).

      (8) Extending the model analysis of combinations of altered cytokine properties, which temporal schedules of administration would be of interest, and how could the timing of multiple interventions improve outcomes? Such a discussion or even analysis would further underscore the usefulness of the model.

      In response to your valuable suggestion, lines 558-562 were added to discuss the potential of using the model as a tool to perturb different cytokine combinations at varying timepoints throughout regeneration. In addition, this is also included in future work in lines 636-637.

      (9) The CPM is only weakly motivated, just one sentence on lines 142-145 which mentions diffusion in a misleading way as the CPM just provides cells with a shape and mechanical interactions. The diffusion part is a feature of the hybrid CompuCell3D framework, not the CPM.

      Thank you for bringing up this distinction. We removed the statement regarding diffusion and updated lines 143-146 to focus on CPM representation of cellular behavior and interactions. We also added a reference to supplemental text that includes additional details on CPM.

      (10) On lines 258-261 it does not become clear how the described springs can direct fibroblasts towards areas of low-density collagen ECM. Are the lambdas dependent on collagen density?

      Thank you for highlighting this area for clarification. The fibroblasts form links with low collagen density ECM and then are pulled towards those areas based on a constant lambda value. The links between the fibroblast and the ECM will only be made if the collagen is below a certain threshold. We added additional clarification to lines 260-264.

      (11) On line 281, what does the last part in "Fibers...were regenerating but not fully apoptotic cells" mean? Maybe rephrase this.

      The last part of that line indicates that there were some fibers surrounding the main injury site that were damaged but still had healthy portions, indicating that they were impacted by the injury and are regenerating but did not become fully apoptotic like the fiber cells at the main site of injury. We rephrased this line to indicate that the nearby fibers were damaged but not fully apoptotic.

      (12) Lines 290-293 describe interactions of cells and fields with localized structures (capillaries and lymphatic vessel). Please explain in more detail how "capillary agents...transport neutrophiles and monocytes" in the CPM model formalism. Are new cells added following rules? How is spatial crowding of the lattice around capillaries affecting these rules? Moreover, how can "lymphatic vessel...drain the nearby cytokines and cells"? How is this implemented in the CPM and how is "nearby" calculated?

      We appreciate your detailed inquiry into the interactions of cells and fields with localized structures. The neutrophils and monocytes are added to the simulation at the lattice sites above capillaries (within the cell layer, Fig. 2B) and undergo chemotaxis up their respective gradients. Recruited neutrophils and monocytes are randomly distributed among the healthy capillaries that do not have an immune cell at the capillary location (a modeling artifact that is a byproduct of only having one cell per lattice site). This approach helped to prevent an abundance of crowding at certain capillaries. Because immune cells in the simulation are sufficiently small, chemotactic gradients are sufficiently large, and the simulation space is sufficiently large, we do not see aggregation of recruited immune cells in the CPM.

      The lymphatic vessel uptakes cytokines at lattice locations corresponding to the lymphatic vessel and will remove cells located in lattice sites neighboring the lymphatic vessel. In addition, we have included a rule in our ABM to encourage cells to migrate towards the lymphatic vessel utilizing CompuCell3D External Potential Plugin. The influence of this rule is inversely proportional to the distance of the cells to the lymphatic vessel.

      We have updated lines 294-298 and 305-309 to include the above explanation.

      (13) Tables 1-4 define migration speeds as agent rules but in the typical CPM, migration speed emerges from random displacements biased by chemotaxis and other effects (like the slope of the cytokine field). How was the speed implemented as a rule while it is typically observable in the model?

      We appreciate your inquiry regarding the implementation of migration speeds. To determine the lambda parameters (Table 7) for each cell type, we tested each in a simplified control simulation with a concentration gradient for the cell to move towards. We tuned the lambda parameters within this simulation until the cell velocity output by the model aligned with the cell velocity reported in the literature for each cell type (Tables 1-4). We have incorporated clarification on this to lines 177-180.

      (14) Line 312 shows the first equation with number (5), either add eqn. (1-4) or renumber.

      We have revised the equation number.

      (15) Typos: Line 456, "expect M1 cell" should read "except M1 cell".

      Line 452, "thresholds above that diminish fibroblast response (Supplemental Fig 3)." remains unclear, please rephrase.

      Line 473, "at 28." should read "at 28 days.".

      Line 474, is "additive" correct? Was the sum of the individual effects calculated and did that match?

      Line 534, "complexity our model" should read "complexity in our model".

      We have corrected the typos and clarified line 452 (updated line 594) to indicate that the TNF-α concentration threshold results in diminished fibroblast response. We updated the terminology on line 474 (updated line 512) to indicate that there was a synergistic effect with the combined perturbation.

      (16) Table 7 defines cell target volumes with the same value as their diameter. This enforces a strange cell shape. Should there be brackets to square the value of the cell diameter, e.g. Value=(12µm)^2 ?

      The target volume parameter values were selected to reflect the relative differences in average cell diameter as reported in the literature; however, there are no parameters that directly enforce a diameter for the cells in the CPM formalism separate from the volume. We have observed that these relative cell sizes allow the ABM to effectively reproduce cell behaviors described in the literature. Single cells that are too large in the ABM would be unable to migrate far enough per time step to carry out cell behaviors, and cells that are too small in the CPM would be unstable in the simulation environment and not persist in the simulation when they should. We removed the units for the cell shape values in Table 7 since the target volume is a relative parameter and does not directly represent µm.

      (17) Table 7 gives estimated diffusion constants but they appear to be too high. Please compare them to measured values in the literature, especially for MCP-1, TNF-alpha and IL-10, or relate these to their molecular mass and compare to other molecules like FGF8 (Yu et al. 2009, DOI:10.1038/nature08391).

      We utilized a previously published estimation method (Filion et al. 2004, DOI:10.1152/ajpheart.00205.2004) for estimating cytokine diffusivity within the ECM. This method incorporates the molecular masses and accounts for the combined effects of the collagen fibers and glycosaminoglycans. The paper acknowledged that the estimated value is faster than experimentally determined values, but that this was a result of the less-dense matrix composition which is more reflective of the tissue environment we are simulating in contrast to other reported measurements which were done in different environments. Using this estimation method also allowed us to more consistently define diffusion constants versus using values from the literature (which were often not recorded) that had varied experimental conditions and techniques (such as being in zebrafish embryo Yu et al. 2009, DOI:10.1038/nature08391 as opposed to muscle tissue). This also allowed for recalculation of the diffusivity throughout the simulation as the collagen density changed within the model. Lines 318-326 were updated to help clarify the estimation method.

      (18) Many DOIs in the bibliography (Refs. 7,17,20,31,40,47...153) are wrong and do not resolve because the appended directory names are not allowed in the DOI, just with a journal's URL after resolution.

      Thank you for bringing this to our attention. The incorrect DOIs have been corrected.

      Reviewer #2 (Recommendations For The Authors):

      Minor comments:

      (9) On line 174, the authors say "We used the CC3D feature Flip2DimRatio to control the number of times the Cellular-Potts algorithm runs per mcs." What does this mean? Isn't one monte carlo timestep one iteration of the Cellular Potts model? How does this relate to physical timescales?

      We appreciate your attention to detail and thoughtful question regarding the statement about the use of the CC3D feature Flip2DimRatio. Lines 175-177 were revised to simplify the meaning of Flip2DimRatio. That parameter alters the number of times the Cellular-Potts algorithm is run, which is the limiting factor for cell movement. The physical timescale is kept to a 15-minute timestep but a high Flip2DimRatio allows more flexibility and stability to allow the cells to move faster in one timestep.

      (10) Has the custom MATLAB script to process histology images into initial conditions been made available?

      The Matlab script along with CC3D code for histology initialization with documentation has been made available with the source code on the following sites:

      SimTK: https://simtk.org/docman/?group_id=2635

      Zenodo: https://zenodo.org/records/10403014

      GitHub: https://github.com/mh2uk/ABM-of-Muscle-Regeneration-with-MicrovascularRemodeling

      (11) Equation 5 is provided without a reference or derivation. Where does it come from and what does it mean?

      Thank you for highlighting the diffusion equation and seeking clarification on its origin and significance. Lines 318-326 were revised to clarify where the equation comes from. This is a previously published estimation method that we applied to calculate the diffusivity of the cytokines considering both collagen and glycosaminoglycans.

      (12) Line 326: "For CSA, experimental fold-change from pre-injury was compared with fold-change in model-simulated CSA". Does this step rely on the assumption that the fold change will not depend on the CSA? If so, is this something that is experimentally known, or otherwise, can it be confirmed by simulations?

      We appreciate the opportunity to clarify our rationale. The fold change was used as a method to normalize the model and experiment so that they could be compared on the same scale. Yes, this step relies on the assumption that fold change does not depend on pre-injury CSA. Experimentally, it is difficult to determine the impact of initial fiber morphology on the regeneration time course. This fold change allows us to compare percent recovery, which is a common metric used to assess muscle regeneration outcomes experimentally. Lines 340-343 were revised to clarify.
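
      The normalization described above can be sketched as below. The function name and all CSA values are hypothetical and only illustrate how two time courses with different pre-injury CSA end up on the same scale.

```python
def fold_change(series, baseline):
    """Express each value in `series` relative to the pre-injury `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return [value / baseline for value in series]

# Hypothetical CSA time courses in different units or absolute sizes.
experiment = fold_change([2000.0, 1000.0, 1800.0], baseline=2000.0)
simulation = fold_change([500.0, 250.0, 450.0], baseline=500.0)
```

      Both illustrative series normalize to the same fold-change values (1.0, 0.5, 0.9), which is exactly the situation the assumption of baseline-independent recovery describes.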

      (13) Line 355: "The final passing criteria were set to be within 1 SD for CSA recovery and 2.5 SD for SSC and fibroblast count" Does this refer to the experimental or the simulated SD?

      The model had to fit within those experimental SDs. Lines 371-372 were edited to specify that we are referring to the experimental SD.
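
      A passing check of this kind might look like the following sketch. The SD multipliers (1 for CSA recovery, 2.5 for SSC and fibroblast count) come from the text quoted above; the function names, dictionary keys, and all experimental and simulated numbers are hypothetical.

```python
# SD multipliers taken from the passing criteria quoted above.
N_SD = {"CSA": 1.0, "SSC": 2.5, "fibroblast": 2.5}

def within_sd(sim_value, exp_mean, exp_sd, n_sd):
    """True if the simulated value lies within n_sd experimental SDs of the mean."""
    return abs(sim_value - exp_mean) <= n_sd * exp_sd

def run_passes(sim, exp):
    """Check every metric of one simulated run against (mean, SD) experimental data."""
    return all(
        within_sd(sim[metric], exp[metric][0], exp[metric][1], n_sd)
        for metric, n_sd in N_SD.items()
    )

# Hypothetical experimental (mean, SD) pairs and two hypothetical simulated runs.
exp_data = {"CSA": (0.90, 0.05), "SSC": (40.0, 8.0), "fibroblast": (25.0, 4.0)}
good_run = {"CSA": 0.93, "SSC": 55.0, "fibroblast": 30.0}
bad_run = {"CSA": 0.80, "SSC": 55.0, "fibroblast": 30.0}

good_result = run_passes(good_run, exp_data)
bad_result = run_passes(bad_run, exp_data)
```

      In this toy example the second run fails only because its CSA value falls more than one experimental SD from the mean, even though its cell counts pass the looser 2.5 SD criterion.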

      (14) "Following 8 iterations of narrowing the parameter space with CaliPro, we reached a set that had fewer passing runs than the previous iteration". Wouldn't one expect fewer passing runs with any narrowing of the parameter space? Why was this chosen as the stopping criterion for further narrowing?

      We appreciate your observation regarding the statement about narrowing the parameter space with CaliPro. We started with a wide parameter space, expecting that certain parameters would give outputs that fall outside of the comparable data. So, when the parameter space was narrowed to enrich parts that give passing output, initially the number of passing simulations increased.

      Once we have narrowed the set of possible parameters into an ideal parameter space, further narrowing will cut out viable parameters, resulting in fewer passing runs. Therefore, we stopped narrowing once fewer simulations passed the criteria than had passed with the previous, wider parameter set. Lines 375-379 have been updated to clarify this point.
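
      The stopping rule described above can be sketched as a scan over the number of passing runs per calibration iteration. The function name and the pass counts are hypothetical; this is an illustration of the rule as stated, not the CaliPro implementation.

```python
def stopping_iteration(pass_counts):
    """Return the index of the first iteration with fewer passing runs
    than the iteration before it, or None if the count never drops."""
    for i in range(1, len(pass_counts)):
        if pass_counts[i] < pass_counts[i - 1]:
            return i
    return None

# Hypothetical pass counts: narrowing first enriches passing runs, then
# starts cutting out viable parameter values.
counts = [10, 25, 40, 38]
stop_at = stopping_iteration(counts)
```

      Here narrowing would stop at the fourth iteration (index 3), the first one where the number of passing runs decreases.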

      (15) Line 516: 'Our model could test and optimize combinations of cytokines, guiding future experiments and treatments." It is my understanding that this is communicated as a main strength of the model. Would it be possible to demonstrate that the sentence is true by using the model to make actual predictions for experiments or treatments?

      This is demonstrated by the combined cytokine alterations in Figure 7 and discussed in lines 509-513. We have also added in a suggested experiment to test the model prediction in lines 691-695.

      (16) Line 456, typo: I think 'expect' should be 'except'.

      Thank you for pointing that out. The typo has been corrected.

    1. All roads lead to progress.

      for - key insight - all roads lead to progress - progress trap - Prometheus complex - impulsive urge to invent

      Comment - This is fleshed out in the final three paragraphs of this article - I disagree with the closing sentence, however

      • “It’s not possible [to avoid invention],
        • because all knowledge is interconnected like a web,” Carlin told Big Think.
      • “If you walled off a certain part of it because you saw the potential downside,
        • you would get to the same outcome sort of in a roundabout way, right?
      • The connections might not be direct, like saying, ‘Oh, I see nuclear weapons in the distance; let’s go there,’

        • but we would go through the back door, and eventually we would discover everything around that thing.”
      • To bring Carlin’s analogy home,

        • we can think about the idea of artificial general intelligence, or AGI.
      • AGI is the point at which AI can perform a wide variety of tasks so competently
        • that it matches or exceeds human intelligence and performance.
      • Some people might see AGI as dangerous.
      • Others may see AGI as the savior of humanity.
      • But while we have debates and conversations,
        • we’re still marching toward AGI.
      • Scientists and programmers behind their computers are
        • solving “everything around that thing.”
      • Our hands and our brains will,
        • perhaps unconsciously,
      • drift toward the very thing we’re debating if we should do.

      • The Prometheus complex can be seen over and over again

        • in the history of science.
      • It is not simply that Edenic urge to eat the fruit or push the red button.
      • It’s the fact that
        • as the rational, intellectual part of ourselves wrestles with the decision,
        • a deeper, Promethean part of ourselves has pressed it already.
      • Thankfully, it usually turns out okay.

      comment - I disagree with the last line - If the meta-poly-perma-crisis is what is meant by "OK", then it is a very distorted use of that word. - Rather, this Promethean way of thinking and acting - compounded over the lifetime of human civilization - is EXACTLY what has brought us to the brink of civilizational disaster - and it may not turn out to be "ok"!

    1. We often think of software development as a ticket-in-code-out business, but this is really only a very small portion of the entire thing. Completely independently of the work done as a programmer, there exist users with different jobs they are trying to perform, and they may or may not find it convenient to slot our software into that job. A manager is not necessarily the right person to evaluate how good a job we are doing because they also exist independently of the user–software–programmer network, and have their own sets of priorities which may or may not align with the rest of the system.

      Software development as a conversation

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      The authors collected genomic information from public sources covering 423 eukaryote genomes and around 650 prokaryote genomes. Based on pre-computed CDS annotation, they estimated the frequency of alternative splicing (AS) as a single average measure for each genome and computed correlations with this measure and other genomic properties such as genome size, percentage of coding DNA, gene and intergenic span, etc. They conclude that AS frequency increases with genome complexity in a somewhat directional trend from "lower" organisms to "higher" organisms.

      Strengths:

      The study covers a wide range of taxonomic groups, both in prokaryotes and eukaryotes.

      Weaknesses:

      The study is weak both methodologically and conceptually. Current high throughput sequencing technologies, coupled with highly heterogeneous annotation methods, can observe cases of AS with great sensitivity, and one should be extremely cautious of the biases and rates of false positives associated with these methods. These issues are not addressed in the manuscript. Here, AS measures seem to be derived directly from CDS annotations downloaded from public databases, and do not account for differing annotation methods or RNA sequencing depth and tissue sample diversity.

      We are aware of the bias that may exist in annotation files. Since the source of noise can be highly variable, we have assumed that most of the data has a similar bias. However, we agree with the reviewer that we could perform some analysis to test for these biases and their association with different methodologies. Thus, we will measure the uncertainty present in the data. On the one hand, we will be more explicit about the data limitations and the biases they can generate in the results. On the other hand, while analyzing the false positives in the data is out of our scope, we will perform a statistical test to detect possible biases regarding different methods of sequencing and annotation, and types of organisms (model or non-model organisms). If the test is positive, we will proceed, as far as possible, to normalize the data or to estimate a confidence interval.

      Here, AS measures seem to be derived directly from CDS annotations downloaded from public databases, and do not account for differing annotation methods or RNA sequencing depth and tissue sample diversity.

      Beyond taking into account the differential bias that may exist in the data, we do not consider our AS measure to be problematic. The NCBI database is one of the most reliable databases available to date and is continuously updated by the entire scientific community. So, the use of this data and the corresponding procedures for deriving the AS measure are perfectly acceptable for a comparative analysis on such a large global scale. Furthermore, the proposal of a new genome-level measure of AS that allows comparison of species spanning the whole tree of life is part of the novelty of the study. We understand that small-scale studies require high specificity about the molecular processes involved. However, that is not the case here, where we are dealing with a large-scale problem. On the other hand, as we have previously mentioned, we agree with the reviewer that we should analyze the degree of uncertainty in the data to better interpret the results.

      There is no mention of the possibility that AS could be largely caused by random splicing errors, a possibility that could very well fit with the manuscript's data. Instead, the authors adopt early on the view that AS is regulated and functional, generally citing outdated literature.

      There is no question that some AS events are functional, as evidenced by strongly supported studies. However, whether all AS events are functional is questionable, and the relative fractions of functional and non-functional AS are unknown. With this in mind, the authors should be more cautious in interpreting their data.

      Many studies suggest that most of the AS events observed are the result of splicing errors and are therefore neither functional nor conserved. However, we still have limited knowledge about the functionality of AS. Just because we don’t have a complete understanding of its functionality, doesn’t mean there isn’t a fundamental cause behind these events. AS is a highly dynamic process that can be associated with processes of a stochastic nature that are fundamental for phenotypic diversity and innovation. This is one of the reasons why we do not get into a discussion about the functionality of AS and consider it as a potential measure of biological innovation. Nevertheless, we agree with the reviewer’s comments, so we will add a discussion about this issue with updated literature and look at any possible misinterpretation of the results.

      The "complexity" of organisms also correlates well (negatively) with effective population size. The power of selection to eliminate (slightly) deleterious mutations or errors decreases with effective population size. The correlation observed by the authors could thus easily be explained by a non-adaptive interpretation based on simple population genetics principles.

      We appreciate the reviewer's observation. We are well aware of M. Lynch's theory on the role of the effective population size and its eventual correlation with genomic parameters, but we want to emphasize that our objective is not to find an adaptive or non-adaptive explanation of the evolution of AS, but rather to reveal it. Nevertheless, as the reviewer suggests, we will look at the correlation between AS and the effective population size and discuss a possible non-adaptive interpretation.

      The manuscript contains evidence that the authors might benefit from adopting a more modern view of how evolution proceeds. Sentences such as "... suggests that only sophisticated organisms optimize alternative splicing by increasing..." (L113), or "especially in highly evolved groups such as mammals" (L130), or the repeated use of "higher" and "lower" organisms need revising.

      As the reviewer suggests, we will proceed with the corresponding linguistic corrections.

      Because of the lack of controls mentioned above, and because of the absence of discussion regarding an alternative non-adaptive interpretation, the analyses presented in the manuscript are of very limited use to other researchers in the field. In conclusion, the study does not present solid conclusions.

      Reviewer #2 (Public Review):

      Summary:

      In this contribution, the authors investigate the degree of alternative splicing across the evolutionary tree and identify a trend of increasing alternative splicing as you move from the base of the tree (here, only prokaryotes are considered) towards the tips of the tree. In particular, the authors investigate how the degree of alternative splicing (roughly speaking, the number of different proteins made from a single ORF (open reading frame) via alternative splicing) relates to three genomic variables: the genome size, the gene content (meaning the fraction of the genome composed of ORFs), and finally, the coding percentage of ORFs, meaning the ratio between exons and total DNA in the ORF. When correlating the degree of alternative splicing with these three variables, they find that the different taxonomic groups have a different correlation coefficient, and identify a "progressive pattern" among metazoan groups, namely that the correlation coefficient mostly increases when moving from flowering plants to arthropods, fish, birds, and finally mammals. They conclude that therefore the amount of splicing that is performed by an organismal group could be used as a measure of its complexity.

      Weaknesses:

      While I find the analysis of alternative splicing interesting, I also find that it is a very imperfect measure of organismal complexity and that the manuscript as a whole is filled with unsupported statements. First, I think it is clear to anyone studying evolution over the tree of life that it is the complexity of gene regulation that is at the origin of much of organismal structural and behavioral complexity. Arguably, creating different isoforms out of a single ORF is just one example of complex gene regulation. However, the complexity of gene regulation is barely mentioned by the authors.

      We disagree with the reviewer that our measure of AS is imperfect. Just as we responded to the first reviewer, we will quantify the uncertainty in the data and correct for differential biases caused by annotation and sequencing methods. Thus, beyond correcting relevant biases in the data, we consider that our measure is adequate for a comparative analysis at a global scale. A novelty of our study is the proposal of a genome-level measure of AS that takes into account data from the entire scientific community.

      We also want to emphasize that we assume from the beginning that AS may reflect some kind of biological complexity; it is not a conclusion drawn from the results. An argument in favor of such an assumption is that AS is associated with stochastic processes that are fundamental for phenotypic diversity and innovation. Of course, we agree with the reviewer that it is not the only mechanism behind biological complexity, so we will emphasize this in the manuscript. In addition, we will be more explicit about the assumptions and objectives, and will correct any unsupported statements.

      Further, it is clear that none of their correlation coefficients actually show a simple trend (see Table 3). According to these coefficients, birds are more complex than mammals for 3 out of 4 measures.

      An evolutionary trend is broadly defined as the gradual change in some characteristic of organisms as they evolve or adapt to a specific environment. In our context, we define an evolutionary trend as the gradual change in genome composition and its association with AS across the main taxonomic groups. If we look at Figure 4 and Table 3, we can conclude that there is a progressive trend. We will be more precise about how we define an evolutionary trend and correct any possible misinterpretation of the results. On the other hand, we do not assume that mammals should be more complex than birds. First, we will emphasize that our results show that birds have the highest values of such a trend. Second, after reading the reviewer’s comments, we have decided that we will perform an additional analysis to correct for differences in the taxonomic group sizes, which will allow us to have more confidence in the results.

      It is also not clear why the correlation coefficient between alternative splicing ratio and genome length, gene content, and coding percentage should display such a trend, rather than the absolute value. There are only vague mechanistic arguments.

      The study analyzes the relationship of AS with genomic composition for the large taxonomic groups. We assume that significant differences in these relationships are indicators of the presence of different mechanisms of genome evolution. However, we agree with the reviewer that a correlation does not imply a causal relation, so we will be more cautious when interpreting the results.

      To quantify the relationships, we use correlation coefficients, the slopes of such correlations, and the relation of variability. Although the absolute values of AS are also illustrated in Table 4, we consider them less informative on their own than when we include how AS relates to the genomic composition. For example, we observe that plants have a different genome composition and relation with AS compared to animals, which suggests that they follow different mechanisms of genome evolution. On the other hand, we observe a trend in animals, where high values of AS are associated with a large percentage of introns and a percentage of intergenic DNA of about 50% of the genome.

      Much more troubling, however, is the statement that the data supports "lineage-specific trends" (lines 299-300). Either this is just an ambiguous formulation, or the authors claim that you can see trends *within* lineages.

      We agree with the reviewer that this statement is not correct, so we will proceed to correct it.

      The latter is clearly not the case. In fact, within each lineage, there is a tremendous amount of variation, to such an extent that many of the coefficients given in Table 3 are close to meaningless. Note that no error bars or p-values are presented for the values shown in Table 3. Figure 2 shows the actual correlation, and the coefficient for flowering plants there is given as 0.151, with a p-value of 0.193. Table 3 seems to quote r=0.174 instead. It should be clear that a correlation within a lineage or species is not a sign of a trend.

      The reviewer has not interpreted the results in Table 3 correctly. It is precisely the variation of the genome variables that we are measuring. Having standardized these values by the mean values, we compared the variability between groups, which is the result shown in Table 3. In this case, there are no associated error bars or p-values. On the other hand, we agree that a correlation is not a sign of a trend. But the relations of variability, together with the results obtained in Figure 3, are indicators of a trend. As we mentioned before, we will analyze whether the variation in group sizes is causing a bias in the results.

      There are several wrong or unsupported statements in the manuscript. Early on, the authors state that the alternative splicing ratio (a number greater or equal to one that can be roughly understood as the number of different isoforms per ORF) "quantifies the number of different isoforms that can be transcribed using the same amount of information" (lines 51-52). But in many cases, this is incorrect, because the same sequence can represent different amounts of information depending on the context. So, if a changed context gives rise to a different alternative splice, it is because the genetic sequence has a different meaning in the changed context: the information has changed.

      We agree that some statements are not well supported, so we will revise them.

      In line 149, the authors state that "the energetic cost of having large genomes is high". No citation is given, and while such a statement seems logical, it does not have very solid support.

      We will also revise the bibliography and support our statements with updated references.

      If there was indeed a strong selective force to reduce genome size, we would not see the stunning diversity of genome sizes even within lineages. This statement is repeated (without support) several times in the manuscript, apparently in support of the idea that mammals had "no choice" to increase complexity via alternative splicing because they can't increase it by having longer genomes. I don't think this reasoning can be supported.

      We agree with the reviewer on this issue, so we will carefully revise the statements that indirectly (or directly) assume the action of selective forces on the genome composition.

      Even more problematic is the statement that "the amount of protein-coding DNA seems to be limited to a size of about 10MB" (line 219). There is no evidence whatsoever for this statement.

      In Figure 1A we observe a one-to-one relationship between the genome size and the amount of coding DNA. However, in multicellular organisms, although the genome size increases, we observe that the amount of coding DNA does not increase by more than 10Mb, which suggests the presence of some genomic limitation. Of course, this is not an absolute or general statement, but rather a suggestion. We are only describing our results.

      The reference that is cited (Choi et al 2020) suggests that there is a maximum of 150GB in total genome size due to physiological constraints. In lines 257-258, the authors write that "plants are less restricted in terms of storing DNA sequences compared to animals" (without providing evidence or a citation).

      We will revise the bibliography and add updated references.

      I believe this statement is made due to the observation that plants tend to have large intergenic regions. But without examining the functionality of these intergenic regions (they might host long non-coding RNA stretches that are used to regulate the expression of other genes, for example) it is quite adventurous to use such a simple measure as being evidence that plants "are less restricted in terms of storing DNA sequences", whatever that even means. I do not think the authors mean that plants have better access to -80 freezers. The authors conclude that "plant's primary mechanism of genome evolution is by expanding their genome". This statement itself is empty: we know that plants are prone to whole genome duplication, but this duplication is not, as far as we understand, contributing to complexity. It is not a "primary mechanism of genome evolution".

      We will revise these statements.

      In lines 293-294, the authors claim that "alternative splicing is maximized in mammalian genomes". There is no evidence that this ratio cannot be increased. So, to conclude (on lines 302-303) that alternative splicing ratios are "a potential candidate to quantify organismal complexity" seems, based on this evidence, both far-fetched and weak at the same time.

      Our results show the highest values of AS in mammals, but we understand that the results are limited by the availability and accuracy of data, which we will emphasize in the manuscript. As we previously mentioned, we will also analyze the uncertainty in the data and carry out the appropriate corrections.

      I am also not very comfortable with the data analysis. The authors, for example, say that they have eliminated from their analysis a number of "outlier species". They mention one: Emmer wheat because it has a genome size of 900 Mb (line 367). Since 900MB does not appear to be extreme, perhaps the authors meant to write 900 Gb. When I consulted the paper that sequenced Triticum dicoccoides, they noted that 14 chromosomes are about 10GB. Even a tetraploid species would then not be near 900Gb. But more importantly, such a study needs to state precisely which species were left out, and what the criteria are for leaving out data, lest they be accused of selecting data to fit their hypothesis.

      The reviewer is right, we wanted to say 900Mb, which is approximately 7.2Gb. We made a mistake in nomenclature. This value is extreme compared to the typical values, so it generates large deviations when applying measures of central tendency and dispersion. We want to obtain mean values that are representative of most species composing the taxonomic groups, so we find it appropriate to exclude all outlier values in the study. Nevertheless, we will rigorously specify the criteria that we have used to select the data.

      I understand that Methods are often put at the end of a manuscript, but the measures discussed here are so fundamental to the analysis that a brief description of what the different measures are (in particular, the "alternative splicing ratio") should be in the main text, even when the mathematical definition can remain in the Methods.

      We agree with the reviewer, so we will add a brief description of the genomic variables at the beginning of the Results section.

      Finally, a few words on presentation. I understand that the following comments might read differently after the authors change their presentation. This manuscript was at the border of being comprehensible. In many cases, I could discern the meaning of words and sentences in contexts but sometimes even that failed (as an example above, about "species-specific trends", illustrates). The authors introduced jargon that does not have any meaning in the English language, and they do this over and over again.

      Note that I completely agree with all the comments by the other reviewer, who alerted me to problems I did not catch, including the possible correlation with effective population size: a possible non-adaptive explanation for the results.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2023-02154

      Corresponding author(s): Marco, Galardini

      1. General Statements

      We have carefully read the comments put forward by the two reviewers and we have produced a revised version of the manuscript that we believe addresses all the concerns expressed by the reviewers. In short, we have validated our approach against experimentally derived epistatic coefficients, compared our mutual information (MI) method against one that uses direct coupling analysis (DCA), and experimentally tested three interactions in the spike RBD that we have predicted and which emerged only in summer 2023, thus demonstrating the potential predictive power of this approach. We have also carefully reworded the manuscript to acknowledge the inherent limitation of a method based on MI to identify epistatic interactions. We believe that the revised manuscript is now more robust with these new in-silico and in-vitro validations, and more direct in exposing the advantages (speed) and caveats (higher false-positives) of this approach.

      Note: the line numbers referenced in the responses to reviewers below refer to the document in which the changes are highlighted.

      Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors inferred the pairwise epistasis through the Mutual Information provided by the spydrpick algorithm. They claim that the MIs could serve as a real-time identification of the epistatic interactions with the SARS-CoV-2 genomes due to the fast inference and high sensitivities.

      Major comments:

      1. The authors take a data-driven approach to infer the Mutual Information as the epistatic interactions between mutations at different sites across SARS-CoV-2 genomes. However, it would be better to specify why this metric is reliable as a representation of the pairwise epistatic interactions, and to give any theoretical explanations that support this.

      We agree that readers should be better informed on why MI can be used to estimate epistatic interactions from genomic data. We have therefore expanded the introduction (lines 93-98), methods (lines 540-543) and discussion (lines 453-457) sections to provide a proper theoretical and practical foundation for the use of an MI-based method. Furthermore, we have expanded the results section to add one additional in-silico validation (lines 244-249, Supplementary Figure 5, and updated Supplementary Figure 8) and an in-vitro one (Figure 5, see also reply to comment 2 from reviewer #2), which we believe give strong support to the MI-based method.
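      For readers unfamiliar with the quantity being discussed, the following minimal sketch computes plain mutual information between two alignment columns. It is an assumption-laden illustration, not the spydrpick implementation: it omits the sequence weighting mentioned in the responses and any other step of that tool, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Plain, unweighted mutual information (in bits) between two alignment columns.

    col_a and col_b hold the allele observed at each site, one entry per sequence.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi
```

      Two perfectly co-varying biallelic sites (for example `"AAGG"` and `"CCTT"`) give 1 bit, and two statistically independent sites give 0. A high value only says that the sites co-vary in the sample; it does not by itself show that the co-variation is caused by epistasis.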

      2. The authors claimed that the DCA method requires more computational resources and more time to complete. However, with a proper filtering procedure, the computational time could be reduced heavily. An example is Physical Review E 106 (4), 044409, 2022, in which DCA was used to investigate the real-time pairwise interactions (month-to-month). There the DCA results were compared with the correlation analysis. It would be nice to have comparisons of the inferred interactions between MIs and other methods.

      We agree that our MI-based approach should be compared against DCA-based methods. The original manuscript had in fact one such comparison (for the 2023-03 dataset, Figure 3C), which indicated a strong correlation between the two methods. To make this result more robust, we have computed the DCA values for the complete time-series dataset and measured the correlation with the MI values (Supplementary Figure 4).

      We observed a relatively high correlation in estimated values between the two methods, with the exception of three time points, i.e., 2020-11, 2023-02 and 2023-03. We can explain these lower correlations with the low overall sequence diversity observed in the early phase of the pandemic (2020-11) and with the different weighting scheme of our approach, which would significantly alter the dataset when compared to the one used by the DCA method, especially towards the later timepoints (see also the reply to reviewer #2, comment 3, section iv). When those three timepoints are excluded, the two methods show a high degree of correlation, implying that they are comparably suitable in detecting coevolutionary signals.

      We have also used the 2nd order coefficients derived from experimental data in Moulana et al., 2022 (10.1038/s41467-022-34506-z) to validate both approaches (see methods, lines 624-631).

      The panels that we have combined to create the new Supplementary Figure 5 indicate how both approaches (MI for panels A and C, and DCA for panels B and D) correctly recover the interactions with a 2nd order epistatic coefficient > 0.15, based on the odds-ratio metric. Our MI-based approach has, however, a higher recall across multiple time points, which is especially visible when comparing panels A and B. The DCA-based method did correctly identify known epistatic interactions, but did so only at sporadic timepoints, even though the distribution of the underlying variants did not change significantly month to month. We believe that the higher recall of the MI-based method has a higher value for genomic epidemiology, at least for SARS-CoV-2.

      3. In Figure 1C, the authors show that their spydrpick algorithm provides more pairwise MIs for longer distances, where the outliers are denser than those with short distances. How do we explain this phenomenon?

      We thank the reviewer for bringing this point up; we actually think that our data shows the opposite, meaning that we observe a higher proportion of close interactions when normalizing by the number of possible interactions. If we take an arbitrary distance threshold of 1,000 bases to define "close" vs. "distant" interactions, we observe 194 and 280 interactions, respectively. It is true that distant interactions are more numerous, but the space of possible interactions is orders of magnitude larger for "distant" interactions, simply because there are more sites from which interactions can originate. As a crude estimate, we can use the combinations between 1,000 sites (499,500 possible interactions) vs. those between 28,903 sites (the full SARS-CoV-2 genome length of 29,903 bp minus 1,000; 417,677,253 possible interactions). Based on these estimates, "close" interactions are proportionally far more frequent than "distant" ones, even though fewer of them were observed in absolute terms.
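      The crude normalization in this response can be reproduced in a few lines. The counts and site numbers are those quoted above; the variable names are illustrative, and treating the "close" and "distant" pair spaces as simple combinations follows the response's own approximation.

```python
from math import comb

# Observed interactions quoted in the response (threshold: 1,000 bases).
close_observed = 194
distant_observed = 280

# Crude sizes of the two pair spaces, as in the response.
close_possible = comb(1_000, 2)             # pairs among 1,000 sites
distant_possible = comb(29_903 - 1_000, 2)  # pairs among 28,903 sites

# Fraction of possible pairs that were actually detected in each class.
close_rate = close_observed / close_possible
distant_rate = distant_observed / distant_possible

# How much more often a "close" pair is detected than a "distant" one.
enrichment = close_rate / distant_rate
print(f"close: {close_rate:.2e}, distant: {distant_rate:.2e}, ratio: {enrichment:.0f}x")
```

      Under this approximation, close pairs are detected several hundred times more often per possible pair than distant ones, which is the sense in which the data show a higher proportion of close interactions.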

      Minor comments:

      4. The explanation of Fig. 1E could be more detailed. For example, the grey dots in Fig. 1E are marked as "other", and this category dominates. Why?

      We thank the reviewer for pointing out a section where more clarity was needed. We have added the following sentence to the figure legend: "The category "other" indicates positions which are not known to have an impact on affinity to ACE2, immune escape or otherwise flagged as MOI/MOC.". This indicates that predicted interactions involving a site classified as "other" are either false positives or previously undiscovered interactions.

      5. On line 210, the authors mentioned that the weights of the old sequences are lower "at around six months (120 days)". It would be better to specify why six months is 120 days instead of 180 days.

      We have corrected this mistake and indicated 4 months. We thank the reviewer for spotting this error.

      Referees cross-commenting

      I agree with what Reviewer #2 presented in the Consults Comments. The authors should present the reasons why MIs can be interpreted as the epistatic interactions between sites, as both of us mentioned this point. I checked the other revision points raised by Reviewer #2. They would definitely be helpful for enhancing the quality of the manuscript.

      Reviewer #1 (Significance (Required)):

      The work in the current manuscript is interesting and presented nicely. However, the theoretical foundations for interpreting the MIs as epistatic interactions should be illustrated. Otherwise, the tools would be useful for SARS-CoV-2 and other potential pandemics caused by different viruses.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript proposes an approach to identify epistatic interactions in the SARS-CoV-2 genome using the large amount of genomic data which accumulated during the COVID pandemic. They argue that due to a relatively low computational cost, this can be done online in any ongoing pandemic nowadays (i.e. in the situation where the viral spreading and evolution are closely monitored by massive sequencing). In principle, this is interesting, but in my opinion the manuscript has some strong problems and will require major rewriting:

      1) Contrary to the claims of the manuscript, detected correlation does not necessarily imply epistatic couplings:

      • Even in a totally neutral setting, mutations may occur together by chance, and expand due to genetic drift or when encountering a susceptible population. Equally, two independent mutations may spread in different geographic regions, without the double mutant ever arising. Both cases lead to non-zero mutual information.

      • In evolution, frequently driver and passenger mutations are observed, in particular in settings of relatively high mutation rate. The passenger will rise in frequency with the driver, without any epistatic coupling.

      • The very unequal sequencing across geographic areas will enhance certain variants and leave others undetected. Even if the authors avoid double counting of identical sequences, more small variation is detected when sequencing deeper. The Omicron variant illustrates an extreme case here: it combined a large number of mutations, never detected before, but epistasis is not the most likely explanation, but rather lack of monitoring of the evolutionary path from the ancestral variants to Omicron.

      • MI has been criticised because it overestimates the effect of indirect correlations, in particular in dense epistatic networks. The situation in the spike protein in Fig. 1B seems very dense.

      Currently the manuscript does not make any effort to disentangle any of these effects.

      Following these comments (and those of reviewer 1), we have made a number of changes to the manuscript in order to provide more context on how MI can be used to estimate epistatic interactions and on the inherent limitations of this approach. In particular, we have expanded the introduction (lines 93-98), methods (lines 540-543) and discussion (lines 453-457) sections in a way that we believe exposes the limitations of the approach. Despite these limitations, we still believe that an MI-based approach strikes a good balance between speed, ease of implementation, and sensitivity. To further demonstrate this point we have added two additional validations to our results: the first one (in-silico) uses estimated 2nd order epistatic coefficients derived from experimental data (Moulana et al., 2022, 10.1038/s41467-022-34506-z), and the second (in-vitro) uses our own experimental data on three predicted interactions. The results of the new in-silico validation have been described in the reply to comment #2 from reviewer 1; in short they show how the MI-based method has sensitivity and specificity comparable to the DCA-based method, and most importantly they allow the recovery of known epistatic interactions across the time period in which they have appeared. The results of the in-vitro validation are discussed in the reply to the next comment from this reviewer, as they directly address the predictive power of our approach: in short, we show how we could also validate these predictions. We think that these new results clearly show how, despite its limitations, the MI-based approach is able to identify bona-fide epistatic interactions, with the advantage of being a simple method to implement and with the possibility of being run in real time. For a more detailed discussion of the merits of the MI-based approach over DCA, see the reply to comment #3 from this reviewer.
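
      As a concrete reference for the quantity under discussion, the following minimal sketch computes the plain, unweighted mutual information between two alignment columns. It is not the weighted implementation used in the manuscript; the toy columns and function names are made up for illustration. It shows two properties relevant to this exchange: two mutations that always travel together (for example a driver and a passenger) reach the maximum possible MI whether or not they are epistatically coupled, and MI can never exceed the entropy of either site.

```python
from collections import Counter
from math import log2


def entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())


def mutual_information(col_x, col_y):
    """Plug-in MI estimate: H(X) + H(Y) - H(X, Y)."""
    joint = list(zip(col_x, col_y))
    return entropy(col_x) + entropy(col_y) - entropy(joint)


# Toy columns (illustrative only): 8 sequences, one symbol per sequence
site_a = ["A"] * 6 + ["G"] * 2      # minor allele G in 2 of 8 sequences
site_b = ["C"] * 6 + ["T"] * 2      # minor allele T always co-occurs with G
site_c = ["C", "T"] * 4             # varies independently of site_a

mi_linked = mutual_information(site_a, site_b)
mi_independent = mutual_information(site_a, site_c)

print(f"H(site_a)               = {entropy(site_a):.3f} bits")
print(f"MI(site_a, site_b)      = {mi_linked:.3f} bits")
print(f"MI(site_a, site_c)      = {mi_independent:.3f} bits")
```

      In this toy case the linked pair has an MI equal to the entropy of site_a, which is the upper bound, while the independent pair has an MI of zero up to rounding.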

      2) What are the predictive capacities of the approach? Mutual information is bounded from above by the individual site entropies. So high MI can be detected only in highly mutated sites, i.e. in sites that are for sure already under monitoring. In fact, the sites in Fig. 1B with many links reflect the overall profile of variant frequencies in single sites (i.e. a totally non-epistatic measure) available on Nextstrain, and extracted from the same data sources.

      The discussion of the results is very anecdotal and it is not clear to me in how far there is any real prediction in the paper, which might surprise and trigger observation or further analyses.

      There is an entire line of related research in estimating and exploiting epistatic couplings in HIV evolution (A Chakraborty, M. Kardar, J. Barton, M MacKay and others) - not cited in the manuscript but relevant for the question how to detect epistatic couplings and what they are good for.

      We thank the reviewer for pointing out relevant literature we had not covered in the original manuscript, and which can be used to indicate how epistatic interaction signals can be leveraged when studying viruses. We have added citations to these studies in the introduction (lines 76-78) to provide a better background for our own study. Regarding the broader concern of showing the predictive power of our approach, we had a similar concern after the manuscript was submitted, and we had already planned a "blind" in-vitro validation to put our approach to the test. In order to make this validation as "blind" as possible, we expanded the dataset to include sequences until August 2023. We then selected interactions within the spike RBD with confidence level O4 in at least the last 4 time points and with one position already flagged as either "affinity", "escape" or "other MOI/MOC".

      We then selected the top three interactions (446-460, 446-486 and 452-490) for our validation, as they have an outlier confidence O4 in at least the 4 time points, and lower or no prediction before. We also added the known 498-501 interaction as a control (Figure 5, panel B).

      We then focused on selecting a set of non-synonymous substitutions to test for their potential epistatic interactions. We decided to select 6 substitutions affecting the 3 predicted interactions based on their frequency in the time points after the cutoff of the original manuscript, shown in Figure 5, panel C.

      Of those, L452R/F490S and G446S/F486V are anti-correlated in their frequency and virtually never observed together in our dataset, G446S/F486S is observed at low frequency (87 samples after 2023-05), and G446S/N460H is virtually never observed (5 samples). We chose the anti-correlated pairs to test the potential of the MI method to explain this "avoidance" phenomenon, and the low frequency pairs as a way to test an early warning system for mutation signatures that might rise in the future. We then planned to test the impact of the individual variants and of the double variants, both in the wild-type background and in the Q498R/N501Y background as a crude model for the Omicron variant.

      We then used a pseudovirus assay to test mutated RBDs across two phenotypes: infectivity (i.e. the ability to infect Vero B4 cells) and immune escape (i.e. antibody neutralization curves). We then tested for the presence of epistatic interactions for the double mutants in both backgrounds using a simple linear model (see Methods, lines 711-727). The results of these in-vitro assays are summarized below (Figure 5, panel E for infectivity, F for immune escape).

      Double mutants with a significant (p-value -10) interaction have been highlighted with an asterisk. We confirmed the epistatic interaction for the Q498R/N501H pair, both for its effect on infectivity and on immune escape. For both anti-correlated pairs we found a significant interaction in the infectivity assay, and for G446S/F486V also in immune escape. In particular, we found that, on the one hand, the G446S/F486V pair induced a large drop in infectivity in the Q498R/N501H background, while the immune escape profile of the double mutant was fairly similar to that of the single G446S variant, thus compensating for the loss of escape shown by the F486V variant alone. On the other hand, we observed the opposite for the L452R/F490S pair in terms of infectivity, with the pair showing a large and significant increase in infectivity in the Q498R/N501H background. The double mutant had a slightly better immune escape profile than the single mutants, although the difference was not significant. From these observations we can hypothesize that the G446S/F486V pair is anti-correlated because of its strong defect in infectivity; we cannot apply the same reasoning to the L452R/F490S pair, whose absence from circulating variants could be ascribed to stochasticity in population dynamics or interactions with other variants. We observed a similar impact of the G446S/F486S and G446S/N460H pairs on infectivity as for G446S/F486V; based on these results we could estimate that variants carrying these pairs might have a fitness disadvantage. The inability of unsupervised methods (MI or DCA based) to predict the direction of the effect of course makes it difficult to decide which of the two pairs should be added to a "watchlist", but it would potentially reduce the number of interactions to be tested. We believe that the results of this admittedly small-scale in-vitro validation demonstrate the potential of the MI-based approach to flag emerging interactions worthy of further study. Recent advances in the scalability of molecular assays (e.g. 10.1101/2024.03.08.584176) could then be coupled with a real-time system such as the one we describe in our manuscript to filter out the more relevant interactions. We have added this forward-looking observation to the discussion as well (lines 465-474).
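
      To make the idea of testing a double mutant for epistasis with a linear model more tangible, here is a minimal sketch. The exact model used in the manuscript is specified in its Methods and is not reproduced here; this example assumes a saturated two-factor model on log-scaled measurements, in which the interaction coefficient equals the double-mutant mean minus the two single-mutant means plus the background mean. All readings are invented toy numbers, and no significance test is performed.

```python
from statistics import mean


def interaction_term(background, single_a, single_b, double):
    """Interaction coefficient of y = b0 + bA*A + bB*B + bAB*A*B.

    For a saturated 2x2 design this least-squares coefficient reduces to
    mean(double) - mean(single_a) - mean(single_b) + mean(background).
    A value near zero means the two substitutions act additively.
    """
    return mean(double) - mean(single_a) - mean(single_b) + mean(background)


# Hypothetical log-scaled infectivity readings (three replicates each)
background = [0.0, 0.1, -0.1]
mutant_a = [0.5, 0.6, 0.4]
mutant_b = [0.3, 0.2, 0.4]
double_additive = [0.8, 0.9, 0.7]      # what pure additivity would predict
double_epistatic = [-0.4, -0.5, -0.3]  # much lower than the additive expectation

additive_effect = interaction_term(background, mutant_a, mutant_b, double_additive)
epistatic_effect = interaction_term(background, mutant_a, mutant_b, double_epistatic)

print(f"interaction (additive double):  {additive_effect:+.2f}")
print(f"interaction (epistatic double): {epistatic_effect:+.2f}")
```

      A clearly non-zero interaction term, judged against replicate noise, is what would flag a pair as epistatic in such a design; its sign indicates whether the double mutant does better or worse than the additive expectation.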

      3) The authors say that more involved methods like the Direct Coupling Analysis with Pseudolikelihood maximisation would be too slow for the analysis, but several papers show the contrary. The paper by Zeng et al. (Ref. [39]) does so very early in the pandemic in 2020, and another uncited paper of the same authors (Physical Review 2022) uses a nearly identical approach to study the time evolution of epistatic couplings (extractions from Gisaid at several times). As one of their results, they show that their approach is not only feasible, but delivers more stable results than simpler correlation measures like MI.

      We thank the reviewer for pointing out a relevant reference we had missed in the initial manuscript. At a general level Zeng et al. take a similar approach to what we have described, namely to divide the data according to the isolation date to look for temporal trends. We however see a few differences that we think are in favor of the approach we describe:

      1- Our manuscript covers the time period after the emergence of the Omicron variant, in which epistatic interactions are known and have been characterized and validated experimentally, a crucial requirement for validation. We have also conducted an in-vitro validation on a selected set of predicted interactions (see the reply to the previous comment), which indicates that the method is sound and predictive.

      2- We have prepared a cumulative time-series dataset, meaning that each month introduces new sequences on top of the ones already selected from the previous time points. To the best of our knowledge the Zeng et al. dataset has "insulated" sequences at each month. We believe our approach has the advantage of allowing for a higher recall, as it includes a representation of extinct lineages, which may increase diversity at key loci and thus boost the signal. As described in the original manuscript and in the reply to this reviewer's comments "iv" and "v", we have added a weighting scheme in order to reduce the influence of older sequences and increase the relevance of smaller lineages.

      3- While we have not tested the DCA implementation used by Zeng et al., and we cannot therefore directly comment on its scalability, we have encountered serious limitations when scaling up the popular plmc C implementation developed by the lab of Deborah Marks. In particular we were unable to successfully run it for datasets with more than ~300k sequences, encountering segmentation faults.

      Regarding the third point, while this meant that we could not test the DCA approach on the full dataset, we could still manage to apply it on the time series data, focusing exclusively on the spike (S) gene. As shown above in the reply to reviewer 1's comment #2, the two methods have a high correlation and are both able to recover known interactions, although with the DCA method having a lower recall. Taken together we believe that the MI-based approach we describe is robust enough to be considered when a tradeoff between speed, ease of implementation and sensitivity has to be struck, which we believe may be the case for a rapid response during a potential future pandemic. We have added more details to the part of the discussion in which the comparison with the DCA-based methods was made to point out how those are still feasible with very large collections of sequences (lines 444-448).

      It would therefore be essential that the authors strongly revise their manuscript to show the reliability of the results, the predictive value of the predicted couplings, and the originality and robustness of the approach.

      We believe that our responses to both reviewers have addressed these concerns, and as a result we have provided a more nuanced view on the use of MI-based methods in the prediction of epistatic interactions in pandemic viruses. Our wording has been modified to make sure that readers interested in replicating our approach are aware of its strengths (speed, ease of implementation) and limitations.

      Furthermore, there are some minor issues in the formulations, which should be corrected

      i) "the virus has differentiated into a number of lineages, almost all of which have taken over the whole population..." This is wrong. SARS-CoV-2 has always been very heterogeneous, with diverse variants circulating (the authors use millions of non-redundant sequences), and only very few have become VOIs or VOCs at some point. This image of competition between multiple coexisting strains is much closer to clonal interference than what the authors describe (even if clonal interference does not rely on population structure, which has always been an important element in COVID).

      We thank the reviewer for pointing out this error in our observation. We have changed "almost all" to "some", which we agree is more accurate.

      ii) The authors say that pseudolikelihood methods would require "aggressive subsampling". This is not true, in machine learning massive training data are frequently used in the context of batch learning, i.e. in each learning epoch a "batch" is sampled from the full data. This leads to stochasticity in learning, but all data are eventually used.

      We have reformulated this sentence (lines 85-90) to indicate how batch learning could also be used to make certain methods scalable, with the caveat that they would be more complicated to implement.

      iii) The authors say that they also download a phylogenetic tree, but I do not see where it is used.

      As indicated in the methods section, we have used the phylogenetic tree for two purposes:

      1- To single out high quality sequences from the raw MSA (line 515)

      2- To compute the weight of each sequence in the final MSA, as described in lines 540-549

      iv) The authors use sequence weights as implemented in Ref. [31]. There a weighting at sequence similarity threshold of 90% is used. I would expect that there are no SARS-CoV-2 genomes having accumulated more than 10% of nucleotide mutations, i.e. the weighting procedure would be without any effect.

      We realized that the sequence weighting scheme we have used is not described in Pensar et al. (10.1093/nar/gkz656), but rather in the implementation of the spydrpick algorithm used by the panaroo software (Tonkin-Hill et al., 10.1186/s13059-020-02090-4). This weighting scheme is based on the more granular metric that is the patristic distance of each sequence from the root of the tree, divided at each branching point by the number of its terminal leaves. In practical terms this means that sequences belonging to smaller lineages (i.e. with fewer observed samples) will have a larger weight, regardless of a discrete sequence similarity threshold, as was done in the original implementation. We have updated the methods section to clearly indicate that the weighting scheme is that first shown in the panaroo software package (line 543).

      v) The authors estimate that they need 10,000-100,000 sequences to estimate MI, but find the epistatic coupling in spike residues 498-501 as soon as 6 double mutants are present, which is a frequency of about 1e-4. The corresponding entropies should be low and in consequence the MI, too.

      We thank the reviewer for raising this point, which prompted us to devise a way to better illustrate the sequence weighting scheme we have used. As a side note we also discovered that the number of Omicron sequences at the 2021-11 time point was actually 7, and not 6 as stated throughout the original manuscript, an error we have now fixed. As described in the methods section we have combined two weights in the time-series analysis: the first one, described in the response to the previous comment, is based on the "density" of the phylogenetic tree, which deflates the contribution of "denser" regions of the tree, and the second reduces the relevance of older sequences. The two weights are then combined multiplicatively. As a result the "real" (i.e. effective) number of sequences harboring a particular double mutation will be different from the number obtained by just counting their occurrences.

      As shown in Supplementary Figure 3, the combination of both weights (first column) leads to an increased effective number of sequences for "younger" samples and those that come from "sparser" regions of the overall phylogenetic tree. This is particularly evident for the middle row (2021-11); the light orange dot, which indicates sequences belonging to the first Omicron lineage to appear in the dataset (BA.1), has an actual N of 7, but an effective N of ~100 (exact value 86), thanks to its "novelty" both in the tree (middle panel) and in terms of time (right panel). We again thank the reviewer for raising this point, which led us to generate this visualization, which will hopefully clarify the rationale for the weighting strategy we have used for most readers.

      vi) The authors say that the public health toll of COVID has been "balanced" by scientific discovery - I would urge the authors to avoid such formulations, which sound cynical.
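
      The difference between the actual and the effective number of sequences can be sketched as follows. The only element taken from the response is that a tree-based weight and a time-based weight are multiplied per sequence; the weight values themselves are hypothetical and are not derived from the authors' phylogeny, their time-decay function, or their normalization.

```python
def effective_n(tree_weights, time_weights):
    """Effective sequence count: sum over sequences of tree weight x time weight."""
    return sum(w_tree * w_time for w_tree, w_time in zip(tree_weights, time_weights))


# Hypothetical example: 7 recent sequences from a sparse region of the tree
new_lineage_tree = [10.0] * 7     # up-weighted: few closely related samples
new_lineage_time = [1.0] * 7      # recent samples keep full weight

# Hypothetical example: 1,000 older sequences from a densely sampled lineage
old_lineage_tree = [0.1] * 1000   # down-weighted: densely sampled region
old_lineage_time = [0.5] * 1000   # older samples count for less

n_new = effective_n(new_lineage_tree, new_lineage_time)
n_old = effective_n(old_lineage_tree, old_lineage_time)

print(f"new lineage: actual N = 7, effective N = {n_new:.0f}")
print(f"old lineage: actual N = 1000, effective N = {n_old:.0f}")
```

      With these made-up weights a handful of novel sequences can contribute more to the weighted counts than a much larger set of older, densely sampled ones, which is the qualitative behaviour described for the BA.1 sequences.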

      We agree with the reviewer that this comment might sound cynical and tone-deaf, and have reformulated to indicate that the impact of the pandemic has coincided with an accelerated pace of applied scientific discovery.

      Referees cross-commenting

      Both reports bring up very similar points (points 1 of both reports, point 2 of Reviewer #1 vs. my point 3) but add partially complementary questions (point 3 of Reviewer #1, my point 2), both related to the interpretation of the data. My report is more severe, but reading the ms I am convinced that the paper requires serious revision. So the reports seem coherent but with different degrees of recommendations. However, none of the comments of one reviewer contradicts the other reviewer.

      Reviewer #2 (Significance (Required)):

      While the paper asks interesting questions and wants to make use of the quite unique data which have accumulated during the COVID pandemic, the above mentioned problems raise important questions about the manuscript. It would be essential that the authors strongly revise their manuscript to show the reliability of the results, the predictive value of the predicted couplings, and the originality and robustness of the approach.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is a follow-up study to the authors' previous eLife report about the roles of an alpha-arrestin called protein thioredoxin interacting protein (Txnip) in cone photoreceptors and in the retinal pigment epithelium. The findings are important because they provide new information about the mechanism of glucose and lactate transport to cone photoreceptors and because they may become the basis for therapies for retinal degenerative diseases.

      Strengths:

      Overall, the study is carefully done and, although the analysis is fairly comprehensive with many different versions of the protein analyzed, it is clearly enough described to follow. Figure 4 greatly facilitated my ability to follow, understand and interpret the study. The authors have appropriately addressed a few concerns about statistical significance and the relationship between their findings and previous studies of the possible roles of Txnip on GLUT1 expression and localization on the surfaces of RPE cells.

      We are delighted that Reviewer #1 is satisfied with this revised version.

      Reviewer #2 (Public Review):

      The hard work of the authors is much appreciated. With overexpression of the α-arrestin Txnip in the RPE, in cones, and in both combined, the authors show a potential gene-agnostic treatment that can be applied to retinitis pigmentosa. Furthermore, since Txnip is related to multiple intracellular signaling pathways, this study is of value for research into the mechanism of secondary cone dystrophy as well.

      There are a few areas in which the article may be improved through further analysis and application of the data, as well as some adjustments that should be made to clarify specific points in the article.

      Strengths

      • The follow-up study builds on innovative ground by exploring the impact of TxnipC247S and its combination with HSP90AB1 knockdown on cone survival, offering novel therapeutic pathways.

      • Testing of different Txnip deletion mutants provides a nuanced understanding of its functional domains, contributing valuable insights into the mechanism of action in RP treatment.

      • The findings regarding GLUT1 clearance and the differential effects of Txnip mutants on cone and RPE cells lay the groundwork for targeted gene therapy in RP.

      Weaknesses

      • The focus on specific mutants and overexpression systems might overlook broader implications of Txnip interactions and its variants in the wider context of retinal degeneration.

      Txnip is not expressed in WT or RP cones, as described in our previous study (Xue et al., 2021, eLife), so we could not perform loss of function assays. We thus chose overexpression, and assayed various alleles, based upon the literature, as we describe in our manuscript.

      • The study's reliance on cell count and GLUT1 expression as primary outcomes misses an opportunity to include functional assessments of vision or retinal health, which would strengthen the clinical relevance.

      In our previous study, we demonstrated that the optomotor response of Txnip-treated RP mice improved (Xue et al., 2021, eLife). Also, as described in our previous Txnip study, as well as an independent study (Xue et al., 2021, eLife; Xue et al., 2023, PNAS), ERG assays of Txnip-treated RP cones were no different than the controls. Other therapies that prolong RP cone survival and the optomotor response in our lab also failed to save the ERG, suggesting that there are other pathways that need to be addressed, e.g. the visual cycle. A combination therapy addressing multiple problems is one of our goals.

      • The paper could benefit from a deeper exploration of why certain treatments (like Best1-146 Txnip.C247S) do not lead to cone rescue and the potential for these approaches to exacerbate disease phenotypes through glucose shortages.

      This system is more complicated than we currently understand, and more work needs to be done.

      • Minor inconsistencies, such as the missing space in text references and the need for clarification on data representation (retinas vs. mice), should be addressed for clarity and accuracy.

      The missing spaces are added.

      We described the strategy of injecting the same mouse in each eye, one eye with control and one with the experimental vector. However, the following sentence has been added to the Materials and Methods to better assist the reader:

      “In almost all experiments, other than as noted, one eye of the mouse was treated with control (AAV8-RedO-H2BGFP, 2.5 × 10⁸ vg/eye), and the other eye was treated with the experimental vector plus AAV8-RedO-H2BGFP, 2.5 × 10⁸ vg/eye.”

      • The observation of promoter leakage and potential vector tropism issues raise questions about the specificity and efficiency of the gene delivery system, necessitating further discussion and validation.

      The following sentences have been added to the Results. We do not think this phenomenon affects the practice of the experiments or the interpretation of the results in this study.

      “To enable automated cone counting and trace the infection, we co-injected an AAV (AAV8-RedO-H2BGFP-WPRE-bGHpA) encoding an allele of GFP fused to histone 2B (H2BGFP), which localized to the nucleus. As the red opsin promoter was used to express this gene, H2BGFP was seen in cone nuclei, but not in the RPE, if AAV8-RedO-H2BGFP-WPRE-bGHpA was injected alone. However, when an AAV that expressed in the RPE, i.e. AAV8-Best1-Sv40intron-(Gene)-WPRE-bGHpA, was co-injected with AAV8-RedO-H2BGFP-WPRE-bGHpA, H2BGFP was expressed in the RPE, along with expression in cones (Figure 2A). We speculate that this is due to concatenation or recombination of the two genomes, such that the H2BGFP comes under the control of the RPE promoter. This may be due to the high copy number of AAV in the RPE, as it did not happen in the reverse combination, i.e. AAV with an RPE promoter driving GFP and a cone promoter driving another gene, perhaps due to the observation that the AAV genome copy number is ≈10-fold lower in cones than in the RPE (Wang et al., 2020).”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Summary:

      This paper provides a straightforward mechanism of how mycobacterial cAMP level is increased under stressful conditions and shows that the increase is important for the survival of the bacterium in animal hosts. The cAMP level is increased by decreasing the expression of an enzyme that degrades cAMP.

      We thank the reviewer for these extremely encouraging comments.

      Strengths:

      The paper shows that under different stresses the response regulator PhoP represses a phosphodiesterase (PDE) that degrades cAMP specifically. Identification of PhoP as a regulator of cAMP is significant progress in understanding Mtb pathogenesis, as increase in cAMP apparently increases bacterial survival upon infection. On the practical side, reduction of cAMP by increasing PDE can be a means to attenuate the growth of the bacilli. The results have wider implications since PhoP is implicated in controlling diverse mycobacterial stress responses and many bacterial pathogens modulate host cell cAMP level. The results here are straightforward, internally consistent, and of both theoretical and applied interests. The results also open considerable future work, especially how increases in cAMP level help to increase survival of the pathogen.

      Weaknesses:

      It is not clear whether PhoP-PDE Rv0805 is the only pathway to regulate cAMP level under stress.

      Reviewer 1 (Recommendations for the authors):

      (1) L.1: "maintenance of" or 'regulating'- I thought change in cAMP level upon stress is the whole point of the paper. Also, can replace "intracellular survival" with 'survival in host macrophages' if you want to be more specific.

      We agree with the reviewer, and therefore, we have now replaced “maintenance of” with “regulating cAMP level” in the title. However, we feel more comfortable with “intracellular survival” rather than the more specific ‘survival in host macrophages’, as we have also performed animal experiments to demonstrate the ‘in vivo’ effect in mouse lung and spleen.

      (2) L.26: ---requires the bacterial virulence regulator –

      The suggested change has been made to the text.

      (3) L.30: Replace "phoP locus since the" with 'PhoP since this'. (The product, not the locus, is the regulator). The same comment for l.113.

      We agree with the reviewer. The suggested changes have been made to the text.

      (4) L.31: Change represtsor to repressor.

      We are sorry for the embarrassing spelling mistake. We have rectified the mistake in the revised version.

      (5) L.32: "hydrolytically degrades" or hydrolyses? (lytic and degrade sound like tautology). Same comment for l.117.

      We agree. The suggested change has been made to the text in both places of the revised manuscript.

      (6) L.35: I would also suggest changing "intra-mycobacterial" to 'intra bacterial' because you are talking about one bacterium here. The same change is recommended in l.29.

      Following reviewer’s recommendation, we have made the changes in the revised manuscript.

      (7) L.37: bacillus unless use of the plural form is the norm in the field.

      We agree. The suggested change has been made to the text.

      (8) L.43: Delete "intracellular" and change "intracellular" to host in l.44.

      The suggested changes have been made to the text.

      (9) L.66: --that a burst--

      We have corrected the mistake in the revised manuscript.

      (10) L.76: Receptor or receptor?

      We have corrected the mistake in the revised manuscript.

      (11) L.86: -- mechanisms of regulation of mycobacterial cAMP level. (homeostasis needs to be introduced first, and not used in the concluding statement for the first time).

      The suggested changes have been made to the text.

      (12) L.96: "essential" or 'a requirement'. (reduction is not the same as elimination)

      We understand the reviewer’s concern. However, several studies have independently established that phoPR remains an essential requirement for mycobacterial virulence.

      (13) L.97: Moreover, a mutant

      The suggested change has been made to the text.

      (14) L.113: --locus since PhoP has been –

      The suggested change has been made to the text.

      (15) L.119: mechanism or manner? (you are stating a fact, not a mechanism)

      We agree. We have now replaced ‘mechanism’ with ‘manner’ in the revised manuscript.

      (16) L.130: --lacking copies of both phoP and phoR (I am assuming you don't have two copies of each gene)

      We understand the reviewer’s concern. For better clarity, we have now clearly mentioned that the phoPR-KO mutant lacks both the single copies of phoP and phoR genes.

      (17) L.156: Indicate why GroEL2? - cells as another cytoplasmic protein, GroEL2 was also undetectable

      We have now mentioned it in the secretion experiments that mycobacterial cells did not undergo autolysis. To prove this point, we have used cytoplasmic GroEL2 as a marker protein. The absence of detectable GroEL2 in the culture filtrates (CFs) suggests absence of autolysis. To this end, we have modified the sentence in the revised manuscript (duplicated below):

      “Fig. 1C confirms absence of autolysis of mycobacterial cells as GroEL2, a cytoplasmic protein, was undetectable in the culture filtrates (CF).”

      (18) L.266: May delete "Together". Start with These data--, which would draw more attention to integrated view. In l.268-270, a reminder that intracellular pH is acidic in the normal course would enhance the physiological significance of the present results.

      We agree. We have made the suggested changes to the text. In view of the second comment of the reviewer, we have modified the text (duplicated below):

      “These data represent an integrated view of our results suggesting that PhoP-dependent repression of rv0805 regulates intra-mycobacterial cAMP level. In keeping with these results, activated PhoP under acidic pH conditions significantly represses rv0805, and intracellular mycobacteria most likely utilize a higher level of cAMP to effectively mitigate stress for survival under hostile environment including acidic pH of the phagosome.”

      (19) L.272: Delete "and intracellular survival" (?) (I am assuming the survival is due to stress tolerance; also the section talks about stress only). No period in l.273.

      Following reviewer’s recommendations, the suggested changes have been made to the text.

      (20) L.295: Start the sentence thus: It appears that at least one of ---. (This would put more emphasis on the inference)

      We agree. We have now incorporated the recommended changes in the revised version.

      (21) L.301: No parenthesis.

      The parenthesis has been removed in the revised manuscript.

      (22) L.306: Together already implies these. Either delete Together (which I would prefer) or say 'Together, the results suggest that strains expressing wild type and mutant----properties, and the results are

      We agree. We have now deleted ‘Together’ in the revised manuscript.

      (23) L.311: These results support our view that higher---- (to avoid repetition of l.266)

      We agree. We have now incorporated the suggested change in the revised manuscript.

      (24) L.316: Using or with?

      We think “with” goes well with the statement.

      (25) L.329: Rephrase thus: Effect of intra-bacterial cAMP level on in vivo--

      The recommended change has been made to the text.

      (26) L.333: I would use ~, if you want to indicate about.

      We agree. We have now used ‘~’ in the revised version. Changes were incorporated in lines 328, 330 and 333 of the revised manuscript.

      (27) L.350: Change "somewhat functionally" to phenotypically?

      We thank the reviewer for this suggestion. We have changed “somewhat functionally” to “phenotypically” in the revised manuscript.

      (28) L.361: Change "is connected to" to 'regulates'.

      The suggested change has been made to the text.

      (29) L.365: ACs (to be parallel with PDEs)

      We agree. The suggested change has been made to the text.

      (30) L.366: delete "very" (let the readers decide how recent from the reference date).

      The suggested change has been made to the text.

      (31) L.382: level remained unknown before the present study.

      The recommended change has been made to the text.

      (32) L.399: add at the end of the sentence 'under stress'. Also, represent, not represents.

      The recommended changes have been made to the text.

      (33) L.560 and 571: Section headings formatted differently from the rest. Similar problem in l.900.

      We have rectified the issue and all of the section headings are now formatted in the same style.

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript, the authors have presented new mechanistic details to show how intracellular cAMP levels are maintained linked to the phosphodiesterase enzyme which in turn is controlled by PhoP. Later, they showed the physiological relevance linked to altered cAMP concentrations.

      Strengths:

      Well thought out experiments. The authors carefully planned the experiments well to uncover the molecular aspects of it diligently.

      We thank the reviewer for these extremely encouraging comments.

      Weaknesses:

      Some fresh queries were made based on the author's previous responses and hope to get satisfactory answers this time.

      We provide below a point-by-point response to the fresh queries.

      (2) Line 134: please describe the complementation strain features as it is mentioned for the first time (plasmid, copy number, promoter etc.) in the manuscript. Especially under NO stress what could be the authors' justification regarding the high cAMP concentration in the complementation strain?

      As recommended by the reviewer, the details of construction of the complemented strain have been incorporated in the 'Materials and Methods' section of the revised manuscript (duplicated below): "To complement phoPR expression, pSM607 containing a 3.6-kb DNA fragment of M. tuberculosis phoPR including 200-bp phoP promoter region, a hygromycin resistance cassette, attP site and the gene encoding phage L5 integrase, as detailed earlier (Walters et al., 2006) was used to transform phoPR mutant to integrate at the L5 attB site."

      To address the reviewer's other concern, we have now included the following sentence in the 'Results' section of the revised manuscript (duplicated below): "A higher cAMP level in the complemented strain under NO stress is possibly attributable to reproducibly higher phoP expression in the complemented mutant under specific stress condition (Khan et al., 2022)."

      Reference: Khan et al. (2022) Convergence of two global regulators to coordinate expression of essential virulence determinants of Mycobacterium tuberculosis. eLife 2022, 11:e80965.

      New query: The complemented gene (in pSM607 plasmid) becomes a single copy after chromosomal integration, so it should ideally behave like a WT strain. How could authors still justify the high cAMP concentration under NO stress?

      We agree with the reviewer. We are unable to provide a cogent justification regarding this result. We speculate that PhoP is strikingly activated under NO stress by a non-canonical mechanism and strongly represses rv0805 expression. As a result, there is a significantly higher cAMP concentration in case of the complemented mutant under NO stress.

      (13) Line 292: There is a difference between red and green bars. Authors should do statistical analysis and then comment on whether overexpression of WT and mutant pde are different or similar, to me they are different; also, explain why the WT-Rv0805 strain is different than the phoPR-KO strain in the context of cell wall metabolism.

      As recommended by the reviewer, we have now included statistical significance of the data in the revised version, and modified the text accordingly in the manuscript.

      New query: Authors are asked to put a statistical significance test between WT-Rv0805 and WT-Rv0805M.

      We have included it in the modified figure. Also, to explain it we incorporated new text in the legend to Fig. 4C of the revised manuscript (duplicated below):

      “Note that similar to phoPR-KO, WT-Rv0805 shows a comparably higher sensitivity to CHP relative to WT bacilli. However, WT-Rv0805M expressing a mutant Rv0805, shows a significantly lower sensitivity to CHP relative to WT-Rv0805, as measured by the corresponding CFU values.”

      (14) Line 299-303: Authors should explain how the colocalization % are calculated. Also, in the figure 4D merge panel please highlight the difference.

      As suggested by the reviewer, we have now explained the methodology used to calculate percent colocalization in greater details. Also, we have modified Figure 4D to highlight the difference between samples shown in merge panel. Please see our response to comment # 33 from the Reviewer 1.

      New query: In the figure legend it should be mentioned that the white arrow indicates non-co-localization which is visibly higher in WT and WT-Rv0805M.

      We thank the reviewer for this very important suggestion. We have now included the following text in the legend to Fig. 4D of the revised manuscript.

      “White arrowheads in the merge panels indicate non-colocalization, which remains higher in WT-H37Rv and WT-Rv0805M relative to phoPR-KO or WT-Rv0805.”

  5. inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net
    1. The older you get, the worse it is

      I did not previously think about how age impacts the way you experience poverty, but I can see how this may be true. When we are little kids, we are still unaware of a lot of the different facets of identity that set us apart, and are more likely to be open minded to more things-- we are still very easily impressionable. As we become older though, and cliques form, there is a way clearer understanding of what may be deemed as cool or desirable for teenagers and what is embarrassing or something to be ashamed of.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendation for the authors):

      (1) On a few occasions, I found that the authors would introduce a concept, but provide evidence much later on. For example, in line 57, they introduced the idea that feedback timing modulates engagement of the hippocampus and striatum, but they provided the details much later on around line 99. There are a few instances like these, and the authors may want to go through the manuscript critically to bridge such gaps to improve the flow of reading.

      First, we thank the reviewer for acknowledging the contribution of our study and the methodological choices. We acknowledge the concern raised about the flow of information in the introduction. We have critically reviewed the manuscript, especially on writing style and overall structure, to ensure a smoother transition between the introduction of concepts and the provision of supporting evidence. In the case of the concept of feedback timing and memory systems, lines 46-58 first introduce the concept enhanced with evidence regarding adults, and we then pick up the concept around line 103 again to relate it to children and their brain development to motivate our research question. To further improve readability, we have included an outline of what to expect in the introduction. Specifically, we added a sentence in line 66-68 that provides an overview of the different paragraphs: “We will introduce the key parameters in reinforcement learning and then we review the existing literature on developmental trajectories in reinforcement learning as well as on hippocampus and striatum, our two brain regions of interest.”

      This should prepare the reader better when to expect more evidence regarding the concepts introduced. We included similar “road-marker” outline sentences in other occasions the reviewer commented on, to enhance consistency and readability.

      (2) I am curious as to how they think the 5-second delay condition maps onto real-life examples, for example in a classroom setting feedback after 5 seconds could easily be framed as immediate feedback.

      The authors may want to highlight a few illustrative examples.

      Thank you for asking about the practical implications of a 5-second delay condition, which may be very relevant to the reader. We have modified the introduction example in line 39-41 towards the role of feedback timing in the classroom to point out its practical relevance early on: “For example, children must learn to raise their hand before speaking during class. The teacher may reinforce this behavior immediately or with a delay, which raises the question whether feedback timing modulates their learning”.

      We have also expanded a respective discussion point in lines 720-728 to pick up the classroom example and to illustrate how we think timescale differences may apply: “In scenarios such as in the classroom, a teacher may comment on a child’s behavior immediately after the action or some moments later, on par with our experimental manipulation of 1 second versus 5 seconds. Within such a short range of delay in teachers’ feedback, children’s learning ability during the first years of schooling may function equally well and depend on the striatal-dependent memory system. However, we anticipate that the reliance on the hippocampus will become even more pronounced when feedback is further delayed for a longer time. Children’s capacity for learning over longer timescales relies on the hippocampal-dependent memory system, which is still under development. This knowledge could help to better structure learning according to their development.”

      (3) In the methods section, there are a few instances of task description discrepancies which make things a little bit confusing, for example, line 173 reward versus punishment, or reward versus null elsewhere e.g. line 229. In the same section, line 175, there are a few instances of typos.

      We appreciate your attention to detail in pointing out discrepancies in task descriptions and typos in the method section. We have revised the section, corrected typos, and now phrased the learning outcomes consistently as “reward” and “punishment”.

      (4) I wasn't very clear as to why the authors did not compute choice switch probability directly from raw data but implemented this as a model that makes use of a weight parameter. The former would be much easier and more straightforward for data plotting, especially for uninformed readers, i.e., people who do not have backgrounds in computational modelling.

      Thank you for asking for clarification on the calculation of switching behavior. Indeed, in the behavioral results, switching behavior was directly calculated from the raw data. We have now stressed this in the methods in lines 230-235, also by naming win-stay and lose-shift as “proportions” instead of as “probabilities”: “As a first step, we calculated learning outcomes directly from the raw data, which were learning accuracy, win-stay and lose-shift behavior as well as reaction time.

      Learning accuracy was defined as the proportion to choose the more rewarding option, while win-stay and lose-shift refer to the proportion of staying with the previously chosen option after a reward and switching to the alternative choice after receiving a punishment, respectively.”

      In contrast to the raw data switching behavior, the computational heuristic strategy model indeed uses a weight for a relative tendency of switching behavior. We have also stressed the advantage of the computational measure and its difference to the raw data switching behavior in lines 248-252 and believe that the reader can now clearly distinguish between the raw data and the computational results: “Note that these model-based outcomes are not identical to the win-stay and lose-shift behavior that were calculated from the raw data. The use of such model-based measure offers the advantage in discerning the underlying hidden cognitive process with greater nuance, in contrast to classical approaches that directly use raw behavioral data.”
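      The raw-data outcomes described in this response (learning accuracy, win-stay and lose-shift proportions) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the coding of outcomes as 1 (reward) and 0 (punishment), and the assumption that all trials belong to a single cue are hypothetical choices made for illustration only.

```python
def switching_proportions(choices, outcomes):
    """Return (win_stay, lose_shift) proportions from raw trial data.

    choices:  sequence of chosen option labels, one per trial
    outcomes: sequence of 1 (reward) or 0 (punishment); this coding is assumed
    """
    win_total = win_stay = lose_total = lose_shift = 0
    # Pair each trial's choice and outcome with the choice on the next trial.
    for prev_choice, prev_outcome, choice in zip(choices, outcomes, choices[1:]):
        if prev_outcome == 1:
            win_total += 1
            win_stay += int(choice == prev_choice)
        else:
            lose_total += 1
            lose_shift += int(choice != prev_choice)
    nan = float("nan")
    return (
        win_stay / win_total if win_total else nan,
        lose_shift / lose_total if lose_total else nan,
    )


def learning_accuracy(choices, better_option):
    """Proportion of trials on which the more rewarding option was chosen."""
    return sum(c == better_option for c in choices) / len(choices)


# Toy data, invented for illustration.
choices = ["A", "A", "B", "B", "A"]
outcomes = [1, 0, 0, 1, 1]
print(switching_proportions(choices, outcomes))
print(learning_accuracy(choices, "A"))
```

      In a task with several cues, the same calculation would presumably be applied per cue (or per condition) before averaging, since "staying" is only meaningful between consecutive trials of the same cue.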

      (5) I agree with the authors' assertion that both inverse temperature and outcome sensitivity parameters may lead to non-identifiability issues, but I was not 100% convinced about their modelling approach exclusively assessing a different family of models (inv temperature versus outcome sensitivity). Here, I would like to make one mid-way recommendation. They may want to redefine the inverse temperature term in terms of reaction time, i.e., B = exp(s + g(RT - mean(RT))), where s and g are free parameters (see Webb, 2019), and keep the outcome sensitivity parameter in the model with bounds [0,2] so that the interpretation could be % increase or decrease in actual outcome. Personally, in tasks with binary outcomes i.e. [0,1: null vs reward] I do not think outcome sensitivity parameters higher than 2 are interpretable as these assign an inflated coefficient to outcomes.

      We appreciate the mid-way recommendation regarding the modeling approach for inverse temperature and outcome sensitivity parameters. We have carefully revised our analysis approach by considering alternative modeling choices. Regarding the suggestion to redefine the inverse temperature in terms of reaction time by B = exp(s + g(RT - mean(RT))), we unfortunately were not able to identify the reference Webb (2019), nor did we find references to the suggested modeling approach. Any further information that the reviewer could provide will be greatly appreciated. Regardless, we agree that including reaction times through the implementation of drift-diffusion modeling may be beneficial. However, changing the inverse temperature model in such a way would necessitate major changes in our modeling approach, which unfortunately would result in non-convergence issues in our MCMC pipeline using Rstan. Hence, this approach goes beyond the scope of the manuscript. Nonetheless, we have decided to mention the use of a drift-diffusion model, along with other methodological considerations, as a future recommendation for disentangling outcome sensitivity from inverse temperature in lines 711-712: “Future studies might shed new light by examining neural activations at both task phases, by additionally modeling reaction times using a drift-diffusion approach, or by choosing a task design that allows independent manipulations of these phases and associated model parameters, e.g., by using different reward magnitudes during reinforcement learning, or by studying outcome sensitivity without decision-making.”

      Regarding the upper bound of outcome sensitivity, we agree that traditionally, limiting the parameter values at 2 is the choice for the parameter to be best interpretable. During model fitting, we had experienced non-convergence issues and ceiling effects in the outcome sensitivity parameter when fixing the inverse temperature at 1. The non-convergence issue was not resolved when we fixed the inverse temperature at 15.47, which was the group mean of the winning inverse temperature family. Model convergence was only achieved after increasing the outcome sensitivity upper bound to 20, with inverse temperature again fixed at 1. Since this model also performed well during parameter and model recovery, we argue that the parameter is nevertheless meaningful, despite the more extreme trial-to-trial value fluctuations under higher outcome sensitivity. We described our choice for this model in the methods section in lines 282-288: “Even though outcome sensitivity is usually restricted to an upper bound of 2 to not inflate outcomes at value update, this configuration led to ceiling effects in outcome sensitivity and non-converging model results. Further, this issue was not resolved when we fixed the inverse temperature at the group mean of 15.47 of the winning inverse temperature family model. It may be that in children, individual differences in outcome sensitivity are more pronounced, leading to more extreme values. Therefore, we decided to extend the upper bound to 20, parallel to the inverse temperature, and all our models converged with Rhat < 1.1.”
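      The identifiability concern discussed in this exchange can be made concrete with a small sketch. It assumes a generic Rescorla-Wagner learner with a two-option softmax, which may differ from the authors' exact model: values start at zero, the chosen option is updated as Q += alpha * (rho * r - Q), and outcomes are coded +1 (reward) and -1 (punishment). The function name and all parameter values are hypothetical. Under these assumptions, learned values scale linearly with the outcome sensitivity rho, so the product of inverse temperature and rho is all that the choice probabilities depend on.

```python
import math


def choice_prob_trace(trials, alpha, beta, rho):
    """Probability of choosing option 0 on each trial for a fixed trial history.

    trials: list of (choice, outcome) with choice in {0, 1} and
            outcome in {+1, -1} (assumed coding of reward / punishment)
    alpha:  learning rate
    beta:   inverse temperature of the two-option softmax
    rho:    outcome sensitivity, scaling the outcome at value update
    """
    q = [0.0, 0.0]
    probs = []
    for choice, outcome in trials:
        # Two-option softmax written as a logistic function of the value difference.
        probs.append(1.0 / (1.0 + math.exp(-beta * (q[0] - q[1]))))
        # Rescorla-Wagner update of the chosen option with a scaled outcome.
        q[choice] += alpha * (rho * outcome - q[choice])
    return probs


# Toy trial history, invented for illustration.
trials = [(0, 1), (1, -1), (0, 1), (0, -1), (1, 1), (0, 1)]

# Doubling rho while halving beta leaves every choice probability unchanged.
p_a = choice_prob_trace(trials, alpha=0.3, beta=2.0, rho=1.0)
p_b = choice_prob_trace(trials, alpha=0.3, beta=1.0, rho=2.0)
print(all(math.isclose(x, y) for x, y in zip(p_a, p_b)))
```

      This is why one of the two parameters has to be fixed when the other is estimated freely, as in the two model families described in the response.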

      (6) I think the authors reporting optimal parameters for the model is very important (line 464), but the learning rate they report under stable contingencies is much higher than LRs reported by for example Behrens et al 2007, LRs around 0.08 for the optimal learning behaviour. The authors may want to discuss why their task design calls for higher learning rates.

      Thank you for appreciating our optimal parameter analysis, and for the recommendation to discuss why optimal learning rates in our task design may call for higher learning rates compared to those reported in some other studies. As largely articulated in Zhang et al (2020; primer piece by one of our co-authors), the optimal parameter combination is determined by several factors, such as the reward schedule (e.g., 75:25 vs 80:20), task design (e.g., no reversal, one reversal, vs multiple reversal) and number of trials (e.g., 80 vs 100 vs 120). Notably, in these task-related regards, our task is different from Behrens et al. (2007), which hinders a quantitative comparison among the optimal parameters in the two tasks. We have now included more details in our discussion in lines 643-656: “However, the differences in learning rate across studies have to be interpreted with caution. The differences in the task and the analysis approach may limit their comparability. Task properties such as the trial number per condition differed across studies. Our study included 32 trials per cue in each condition, while in adult studies, the trials per condition ranged from 28 to 100. Optimal learning rates in a stable learning environment were at around 0.25 for 10 to 30 trials; another study reported a lower optimal learning rate of around 0.08 for 120 trials. This may partly explain why in our case of 32 trials per condition and cue, the task called for a relatively high optimal learning rate of 0.29, while in other studies, optimal learning rates may be lower. Regarding differences in the analysis approach, the hierarchical Bayesian estimation approach used in our study produces more reliable results in comparison to maximum likelihood estimation, which had been used in some of the previous adult studies and may have led to biased results towards extreme values. Taken together, our study underscores the importance of using longitudinal data to examine developmental change as well as the importance of simulation-based optimal parameters to interpret the direction of developmental change.”
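      The idea of simulation-based optimal parameters mentioned in this response can be sketched as follows. This is a simplified illustration, not the authors' simulation: it assumes a stable two-option task with a hypothetical 75:25 reward schedule, outcomes coded +1 and -1, a Rescorla-Wagner learner with a softmax choice rule, and hypothetical function names, candidate learning rates and inverse temperature. The "optimal" learning rate is simply the candidate with the highest simulated accuracy for a given number of trials.

```python
import math
import random


def simulate_accuracy(alpha, beta, n_trials, p_reward=(0.75, 0.25),
                      n_sims=2000, seed=0):
    """Mean proportion of choices of the better option (option 0).

    p_reward is an assumed stable reward schedule; outcomes are coded
    +1 (reward) and -1 (punishment) for illustration.
    """
    rng = random.Random(seed)
    n_correct = 0
    for _ in range(n_sims):
        q = [0.0, 0.0]
        for _ in range(n_trials):
            # Softmax choice between two options.
            p_choose_0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
            choice = 0 if rng.random() < p_choose_0 else 1
            outcome = 1.0 if rng.random() < p_reward[choice] else -1.0
            # Rescorla-Wagner update of the chosen option.
            q[choice] += alpha * (outcome - q[choice])
            n_correct += int(choice == 0)
    return n_correct / (n_sims * n_trials)


def best_learning_rate(candidates, beta, n_trials):
    """Candidate learning rate with the highest simulated accuracy."""
    return max(candidates, key=lambda a: simulate_accuracy(a, beta, n_trials))


candidates = [0.01, 0.05, 0.1, 0.2, 0.3, 0.5, 0.8]
for n_trials in (32, 120):
    print(n_trials, best_learning_rate(candidates, beta=5.0, n_trials=n_trials))
```

      Because the result depends on the reward schedule, the number of trials and the fixed inverse temperature, the learning rate selected by such a sketch should not be expected to reproduce the values of 0.29 or 0.08 quoted above; it only shows how the dependence on task properties can be examined.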

      (7) The authors may want to report degrees of freedom in t-tests so that it would be possible to infer the final sample size for a specific analysis, for example, line 546.

      We appreciate the recommendation to include degrees of freedom, which are now added in all t-test results, for example in line 579: “Episodic memory, as measured by individual corrected object recognition memory (hits - false alarms) of confident (“sure”) ratings, showed a trend toward better memory for items shown in the delayed feedback condition (β = .009, SE = .005, t(df = 137) = 1.80, p = .074, see Figure 5A).”

      (8) I'm not sure why reductions in lose shift behaviour are framed as an improvement between 2 assessment points, e.g. line 578. It all depends on the strength of the contingency so a discussion around this point should be expanded.

      We acknowledge that a reduction in lose-shift behavior only reflects improvements under certain conditions where uncertainty is low and the learning contingencies are stable, which is the case in our task. We have added Supplementary Material 4 to illustrate the optimality of win-stay and lose-shift proportions from model simulation and to confirm that children’s longitudinal development was indeed towards more optimal switching behavior. In the manuscript, we refer to these results in lines 488-490: “We further found that the average longitudinal change in win-stay and lose-shift proportion also developed towards more optimal value-based learning (Supplementary Material 4).”

      (9) If I'm not mistaken, the authors reframe a trend-level association as weak evidence. I do not think this is an accurate framing considering the association is strictly non-significant, therefore should be omitted line 585.

      We thank the reviewer for the point regarding the interpretation of a trend-level association as weak evidence. We changed our interpretation, corrected in lines 581-585: “The inclusion of poor learners in the complete dataset may have weakened this effect because their hippocampal function was worse and was not involved in learning (nor encoding), regardless of feedback timing. To summarize, there was inconclusive support for enhanced episodic memory during delayed compared to immediate feedback, calling for future study to test the postulation of a selective association between hippocampal volume and delayed feedback learning.” as well as lines 622-623: “Contrary to our expectations, episodic memory performance was not enhanced under delayed feedback compared to immediate feedback.”

      Reviewer # 2 (Public Review):

      We thank the reviewer for acknowledging the strength of our study and pointing out its weaknesses.

      Weaknesses:

      There were a few things that I thought would be helpful to clarify. First, what exactly are the anatomical regions included in the striatum here?

      We appreciate the clarification question regarding the anatomical regions included in the striatum. The striatum included ventral and dorsal regions, i.e., accumbens, caudate and putamen. We have now specified the anatomical regions that were included in the striatum in lines 211-212: “We extracted the bilateral brain volumes for our regions of interest, which were striatum and hippocampus. The striatum regions included nucleus accumbens, caudate and putamen.”

      Second, it was mentioned that for the reduced dataset, object recognition memory focused on "sure" ratings. This seems like the appropriate way to do it, but it was not clear whether this was also the case for the full analyses in the main text.

      Thank you for pointing out that in the full dataset analysis, the use of “sure” ratings for object recognition memory was previously not mentioned. Including only “sure” ratings was used consistently across analyses. This detail is now described under methods in lines 332-333: “Only confident (“sure”) ratings were included in the analysis, which were 98.1 % of all given responses.”

      Third, the children's fitted parameters were far from optimal; is it known whether adults would be closer to optimal on the task?

      We thank the reviewer for the question on whether adult learning rates in the task have been reported to be more optimal than those of the children in our study. This indeed seems to be the case, and we added this point in our discussion in lines 639-643: “Adult studies that examined feedback timing during reinforcement learning reported average learning rates ranging from 0.12 to 0.34, which are much closer to the simulated optimal learning rate of 0.29 than children’s average learning rates of 0.02 and 0.05 at wave 1 and 2 in our study. Therefore, it is likely that individuals approach adult-like optimal learning rates later during adolescence.”

      The main thing I would find helpful is to better integrate the differences between the main results reported and the many additional results reported in the supplement, for example from the reduced dataset when excluding non-learners. I found it a bit challenging to keep track of all the differences with all the analyses and parameters. It might be helpful to report some results in tables side-by-side in the two different samples. And if relevant, discuss the differences or their implication in the Discussion. For example, if the patterns change when excluding the poor learners, in particular for the associations between delayed feedback and hippocampal volume, and those participants were also those less well fit by the value-based model, is that something to be concerned about and does that affect any interpretations? What was not clear to me is whether excluding the poor learners at one extreme simply weakens the general pattern, or whether there is a more qualitative difference between learners and non-learners. The discussion points to the relevance of deficits in hippocampal-dependent learning for psychopathology and understanding such a distinction may be relevant.

      We appreciate the feedback that it might seem challenging to keep track of differences between the analyses of the full and the reduced dataset. We have now gathered all the analyses for the reduced dataset in Supplementary Material 6, with side-by-side tables for comparison to the full dataset results. Whenever there were differences between the results, they were pointed out in the results section, see lines 557-560: “In the results of the reduced dataset, the hippocampal association to the delayed learning score was no longer significant, suggesting a weakened pattern when excluding poor learners (Supplementary Material 6). It is likely that the exclusion reduced the group variance for hippocampal volume and delayed learning score in the model.” and lines 579-581: “Note that in the reduced dataset, delayed feedback predicted enhanced item memory significantly (Supplementary Material 6).”

      The found differences were further included in our discussion in lines 737-740 in the context of deficits in hippocampal-dependent learning and psychopathology: “Interestingly, poor learners showed relatively less value-based learning in favor of stronger simple heuristic strategies, and excluding them modulated the hippocampal-dependent associations to learning and memory in our results. More studies are needed to further clarify the relationship between hippocampus and psychopathology during cognitive and brain development.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) There appears to be a flaw in the exploration of cortical inputs. The authors never show that HFS of cortical inputs has no effect in the absence of thalamic stimulation. It appears that there is a citation showing this, but I think it would be important to show this in this study as well.

      We understand that the reviewer would like us to induce an HFS protocol on cortical input and then test if there is any change in synaptic strength in thalamic input. We have done this experiment which shows that without a footshock, high-frequency stimulation (HFS) of the cortical inputs did not induce synaptic potentiation on the thalamic pathway (Extended Data Fig. 4d).

      (2) It is somewhat confusing that the authors refer to the cortical input as driving heterosynaptic LTP, but this is not shown until Figure 4J, that after non-associative conditioning (unpaired shock and tone) HFS of the cortex can drive freezing and heterosynaptic LTP of thalamic inputs.

      We agree with the reviewer that it is in Figure 4j and Figure 5b,c that we show electrophysiological evidence for cortical input driving heterosynaptic LTP. It was only for consistency of terminology that we initially used behavioral evidence as the proxy for heteroLTP (Figure 3c).

      …, the authors are 'surprised' by this outcome, which appears to be what they predict.

      We removed the phrase “To our surprise”.

      (3) 'Cortex' as a stimulation site is vague. The authors have coordinates they used, it is unclear why they are not using standard anatomical nomenclature.

      We replaced “cortex” with “auditory/associative cortex”.

      (4) The authors' repeated use of homoLTP and heteroLTP to define the input that is being stimulated makes it challenging to understand the experimental detail. While I appreciate this is part of the goal, more descriptive words such as 'thalamic' and 'cortical' would make this much easier to understand.

      We agree with the reviewer that a phrase such as “an LTP protocol on thalamic and cortical inputs” would be more descriptive. We chose the words “homoLTP” and “heteroLTP” only to clarify (for the readers) the physiological relevance of these protocols. We thought by using “thalamic” and “cortical” readers may miss this point. However, when for the first time we introduce the words “homoLTP” and “heteroLTP”, we describe which stimulated pathway each refers to.

      Reviewer #2 (Public Review):

      (1) …The experimental schemes in Figs. 1 and 3 (and Fig. 4e and extended data 4a,b) show that one group of animals was subjected to retrieval in the test context at 24 h, then received HFS, which was then followed by a second retrieval session. With this design, it remains unclear what the HFS impacts when it is delivered between these two 24 h memory retrieval sessions.

      We understand that the reviewer has raised the concern that the increase in freezing we observed after the HFS protocol (e.g., Fig. 1b, the bar labeled as Wth+24hHFSth) could be caused or modulated by the recall prior to the HFS (Fig. 1a, top branch). To address this concern, in a new group of mice, 24 hours after weak conditioning, we induced the HFS protocol, followed by testing (that is, no testing prior to the HFS protocol). We observed that homoLTP was as effective in mice that were tested prior to the induction protocol as in those that were not (Fig. 1b, Extended Data Fig. 1d,e).

      It would be nice to see these data parsed out in a clean experimental design for all experiments (in Figs 1, 3, and 4), that means 4 groups with different treatments that are all tested only once at 24 h, and the appropriate statistical tests (ANOVA). This would also avoid repeating data in different panels for different pairwise comparisons (Fig 1, Fig 3, Fig 4, and extended Fig 4).

      While we understand the benefit of the reviewer’s suggestion, the current presentation of the data was chosen to match the flow of the text and the delivery of the information throughout the manuscript. We think it is unlikely that the retrieval test prior to the HFS impacts its effectiveness, as confirmed by the homosynaptic HFS data (Extended Data Fig. 1d,e). It is beyond the scope of the current manuscript to investigate the mechanisms and manipulations related to reconsolidation and retrieval effects.

      (2) … It would be critical to know if LFPs change over 24 h in animals in which memory is not altered by HFS, and to see correlations between memory performance and LFP changes, as two animals displayed low freezing levels. … They would suggest that thalamo-LA potentiation occurs directly after learning+HFS (which could be tested) and is maintained over 24 h.

      We have performed the experiment where we recorded the evoked LFP 2hrs and 24hrs following the weak conditioning protocol. We observed that a weak conditioning protocol that was not followed by an optical LTP protocol on the cortical inputs failed to produce synaptic potentiation of the thalamic inputs (tested 2hrs and 24hrs after the LTP protocol; Extended Data Fig. 5d,e).

      (3) The statistical analyses need to be clarified. All statements should be supported with statistical testing (e.g. extended data 5c, pg 7 stats are missing). The specific tests should be clearly stated throughout. For ANOVAs, the post-hoc tests and their outcomes should be stated. In some cases, 2-way ANOVAs were performed, but it seems there is only one independent variable, calling for one-way ANOVA.

      All the statistical analyses have been revised and the post-hoc tests performed after the ANOVAs are mentioned in the relevant figure legends.
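
      The distinction the reviewer draws can be illustrated with a minimal sketch of a one-way ANOVA, the test that applies when treatment group is the only independent variable. The group names and freezing values below are hypothetical and are not data from the study; the function only computes the F statistic and its degrees of freedom, so a p-value and post-hoc comparisons would still require a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples.

    Each element of `groups` is the list of observations for one level
    of the single factor (here: one treatment group).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical freezing percentages for three illustrative groups.
hypothetical_freezing = {
    "group_a": [10.0, 20.0, 30.0],
    "group_b": [20.0, 30.0, 40.0],
    "group_c": [50.0, 60.0, 70.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(hypothetical_freezing.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

      A two-way ANOVA would only be appropriate if a second factor (for example, time point crossed with treatment) were part of the design.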

      Reviewer #2 (Recommendations For The Authors):

      The wording "transient" and "persistent" used here in the context of memory seems a bit misleading, as only one timepoint was assessed for memory recall (24 h), at which the memory strength (freezing levels) seem to change.

      As the reviewer mentioned, we have tested memory recall only at one time point. For this reason, throughout the text we used “transient” exclusively to refer to the experience (receiving footshock) and not to the memory. We replaced “persistence” with “stabilization” where it refers to a memory (“the induction of plasticity influences the stabilization of the memory”).

      For the procedures in which the CS and US were not paired, the term "unpairing" is used (which is probably the more adequate one), but the term "non-associative conditioning" appears in the text, which seems a bit misleading, as this term may have another connotation. There is also literature that an unpairing of CS and US could lead to the formation of a safety memory to the CS, that may be disrupted by HFS stimulation.

      We replaced "non-associative" with “unpaired”.

      Validation of viral injection sites for all experiments: Only representative examples are shown, it would be nice to see all viral expression sites.

      For this manuscript, we have used 155 mice. For this reason, including the injection sites for all the animals in the manuscript is not feasible. Except for the mice that have been excluded (please see the exclusion criteria added in the methods), the expression pattern we observed was consistent across animals and therefore the images shown are true representatives.

      Extended Data 1b: Please explain what N, U, W, and S behavioral groups mean. To what groups mentioned in the text (pg 2,3) do these correspond?

      The requested clarifications are implemented in the figure legend.

      Please elaborate on the following aspects of your methods and approaches:

      • Please explain if the protocol for HFS to manipulate behavior was the same as the one used for the LTP experiments (Fig 1d, Fig 4j) and was identical for homo/hetero inputs from thal and ctx?

      We used the same HFS protocol for all the HFS inductions. We included this information in the methods section.

      • Please state when the HFS was given in respect to the conditioning (what means immediately before and after?) and in which context it was given. Were animals subjected to HFS exposed to the context longer (either before or after the conditioning while receiving HFS) than the other groups? When the HFS was given in another context (for the 24 h group)- how was this controlled for?

      Requested information has been added to the methods section. The control and intervention groups were treated in the same way.

      • When were the footshocks given in the anesthetized recordings (Fig. 4j) and how was the temporal relationship to the HFS? Was the timing the same as for the HFS in the behavioral experiments?

      Requested information has been added to the methods section.

      • Please add information on how the LFP was stimulated and how the LFP- EPSP slope was determined in in vivo recordings, likewise for the whole cell recordings of EPSPs in Fig. 5d-f.

      Requested information has been added to the methods section.

      Here, the y-Axis in Fig. 5e should be corrected to EPSP slope rather than fEPSP slope if these are whole-cell recordings.

      This has been corrected.

      • Please include information if the viral injections and opto-manipulations were done bilateral or unilateral and if so in which hemisphere. Likewise, indicate where the LFP recordings were done.

      Requested information has been added to the methods section.

      • Were there any exclusion criteria for animals (e.g. insufficient viral targeting or placement of fibers and electrodes), other than the testing of the optical CS for adverse effects?

      Requested information has been added to the methods section.

      Statistics: In addition to clarifying analytical statistics, please clarify n-numbers for slice recordings (number of animals, number of slices, and number of cells if applicable).

      Requested information has been added to the methods section.

      It would be nice to scrutinize the results in extended data 4b. The freezing levels with U+24h HFS show a strong trend towards an increase, the effect size may be similar to immediate HFS (Fig 4f and extended data 4a) if n was increased.

      We agree with the reviewer. To address this point, we added “HomoLTP protocol when delivered 24hrs later, produced an increase in freezing; however, the value was not statistically significant.” To show this point, we used the same scale for freezing in Extended Data Fig. 4a and b.

      In the final experiment (Fig. 5a-c), Fig. 5b seems to show results from only one animal, but behavioral results are from 4 animals (Fig 5c). It would be helpful to see the quantification of potentiation in each animal.

      The results (now with error bars) include all mice.

      Please spell out the abbreviation "STC".

      Now, it is spelled out.

      Page 8 last sentence of the discussion does not seem to fit there.

      The sentence has been removed.

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors did not determine how WTh affects Th-LA synapses, as field EPSPs were recorded only after HFS. WTh was required for the effects of HFS, as HFS alone did not produce CR in naïve and/or unpaired controls. As such the effects of the WTh protocol on synaptic strength must be investigated.

      We have performed the experiment where we recorded the evoked LFP 2hrs and 24hrs following the weak conditioning protocol. We observed that a weak conditioning protocol that was not followed by an optical LTP protocol on the cortical inputs failed to produce synaptic potentiation of the thalamic inputs (tested 2hrs and 24hrs after the LTP protocol; Extended Data Fig. 5d,e).

      (2) The authors provide some evidence that their dual opsin approach is feasible, particularly the use of sustained yellow light to block the effects of blue light on ChrimsonR. However, this validation was done using single pulses making it difficult to assess the effect of this protocol on Th input when HFS was used. Without strong evidence that the optogenetic methods used here are fault-proof, the main conclusions of this study are compromised. Why did the authors not use a protocol in which fibers were placed directly in the Ctx and Th while using soma-restricted opsins to avoid cross-contamination?

      We understand that the reviewer raises the possibility that our dual-opsin approach, although effective with single pulses, may fail in higher frequency stimulation protocols (10Hz and 85Hz). To address this concern, in a new group of mice we applied our approach to 10Hz and 85Hz stimulation protocols. We show that our approach is effective in single-pulse as well as in 10Hz and 85Hz stimulation protocols (Fig. 2d-h).

    1. Author response:

      The following is the authors’ response to the current reviews.

      We sincerely appreciate the reviewer’s dedication to evaluating our manuscript and raising essential considerations regarding the classification of the migration behavior we described. While the reviewer suggests that this behavior aligns with the concept of itinerancy, we contend that it represents a distinct phenomenon, albeit with similarities, as both involve the non-breeding movements of birds. We acknowledge that our manuscript did not adequately address this distinction and have considered the reviewer’s feedback. In our response, we clarify the difference between the described phenomenon and itinerancy. Our revised manuscript will include a new section in the Discussion to address this issue comprehensively.

      In the first part of the review, the reviewer emphasizes that the pattern we are describing is consistent with itinerancy. Regardless of the terminology used, we want to highlight the existence of two different types of migratory behavior, both of which involve movement in non-breeding areas.

      The first type, called itinerancy, was first described by Moreau in 1972 in “The Palaearctic-African Bird Migration Systems.” As noted by the reviewer, this behavior involves an alternation of stopovers and movements between different short-term non-breeding residency areas. They usually occur in response to food scarcity in one part of the non-breeding range, causing birds to move to another part of the same range. These movements typically cover distances of 10 to 100 kilometers but are neither continuous nor directional. Moreau (1972) defined itinerancy as prolonged stopovers, normally lasting several months, primarily in tropical regions. He noted observations of certain species disappearing from his study areas in sub-Saharan Africa in December and others appearing, suggesting they may have multiple home ranges during the non-breeding season. Subsequent research, as mentioned by the reviewer, has confirmed itinerancy in many species, particularly among Palaearctic-African migrants in sub-Saharan Africa. In particular, the Montagu’s Harrier has been extensively studied in this regard. The reviewer rightly points out that our study does not include recent findings on this species. In our revised version, we will include references to recent studies, such as those by Trierweiler et al. (2013, Journal of Animal Ecology, 82:107-120) and Schlaich et al. (2023, Ardea, 111:321-342), which show that Montagu’s Harrier has an average of 3-4 home ranges separated by approximately 200 kilometers. These studies suggest that the species spends approximately 1.5 months at each site, with the most extended period typically observed at the last site before migrating to the breeding grounds.

      In the second type, birds undertake a post-breeding migration, arrive in their non-breeding range, and then gradually move in a particular direction throughout the season. This continuous directional movement covers considerable distances and continues throughout the non-breeding period. In our study, this movement covered about 1000 km, comparable to the total migration distance of Rough-legged Buzzards of about 1500 km. As observed in our research, these movements are influenced by external factors such as snow cover. In such cases, the progression of snow cover in a south-westerly direction during winter can prevent birds from finding food, forcing them to continue migrating in the same direction. In essence, this movement represents a prolonged phase of the migration process but at a slower pace. Similar behavior has been documented in buzzards, as reported by Strandberg et al. (2009, Ibis 151:200-206). Although several transmitters in their study stopped working in mid-winter, the authors observed a phenomenon they termed ‘prolonged autumn migration.’

      In the second part of the review, the reviewer questions the need to distinguish between the two behaviors we have discussed. However, we believe these behaviors differ in their structure (with the first being intermittent and often non-directional, whereas the second is continuous and directional) and in their causes (with the first being driven by seasonal food resource cycles and the second by advancing snow cover). We therefore argue that it is worth distinguishing between them. To differentiate these forms of non-breeding movement, we propose to use ‘itinerancy’ for the first type, as described initially by Moreau in 1972, and introduce a separate term for the second behavior. Although ‘slow directional itinerancy’ could be considered, we find it too cumbersome.

      Moreover, ‘itinerancy’ in the literature refers not only to non-breeding movements but also to the use of different nesting sites, e.g., Lislevand et al. (2020, Journal of Avian Biology: e02595), reinforcing its association with movements between multiple sites within habitats. We, therefore, propose that the second behavior be given a distinct name. We acknowledge the reviewer’s point that we did not adequately address this distinction in the Discussion and plan to include a separate section in our paper’s revised version.

      In the third part of his review, the reviewer suggests an alternative title. Another reviewer, Dr Theunis Piersma, suggested the current title during the first round of reviewing, and we have chosen his version.

      In the fourth part of the review, the reviewer questions whether it is appropriate to discuss the conservation aspect of this study. This type of non-breeding movement raises concerns about accurately determining non-breeding ranges and population dynamics for species that exhibit this behavior. We believe that accurate determination of range and population dynamics is critical to conservation efforts. While this may be less important for species breeding in Europe and migrating to Africa, for which monitoring breeding territories is more feasible, it’s essential for Arctic and sub-Arctic breeding species. Large-scale surveys in these regions have historically been challenging and have become even more so with the end of Arctic cooperation following Russia’s war with Ukraine (Koivurova, Shibata, 2023). For North America and Europe, non-breeding abundance is typically estimated once per season in mid-winter. In North America, these are the so-called Christmas counts (which take place once at the end of December), and in Europe, they are the IWC counts mentioned by the reviewer (as follows from their official website - “The IWC requires a single count at each site, which should be repeated each year. The exact dates vary slightly from region to region, but take place in January or February”). Because of such a single count in mid-winter, non-breeding habitats occupied in autumn and spring will be listed as ‘uncommon’ at best, while south-western habitats where birds are only present in mid-winter will be listed as ‘common.’ However, the situation will be reversed if we consider the time birds spend in these habitats.
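
      The contrast between a single mid-winter count and time spent in each part of the range can be sketched as follows. The monthly calendar is a hypothetical illustration of the pattern described above (north-eastern habitats used in autumn and spring, south-western habitats only in mid-winter); it is not derived from the tracking data, and the region names and function names are assumptions made for the example.

```python
from collections import Counter

# Hypothetical calendar: where one bird is found in each non-breeding month.
occupancy = {
    "Oct": "north-east",
    "Nov": "north-east",
    "Dec": "north-east",
    "Jan": "south-west",
    "Feb": "south-west",
    "Mar": "north-east",
    "Apr": "north-east",
}


def single_count(calendar, survey_month):
    """Birds recorded per region when each site is counted once, in one month."""
    return Counter({calendar[survey_month]: 1})


def months_present(calendar):
    """Number of months the bird spends in each region over the season."""
    return Counter(calendar.values())


january_survey = single_count(occupancy, "Jan")
time_in_region = months_present(occupancy)

for region in ("north-east", "south-west"):
    print(region, "| January count:", january_survey[region],
          "| months present:", time_in_region[region])
```

      Under these assumed values, the January survey records the bird only in the south-west, while the time-based tally ranks the north-east first, which is the reversal described in the response.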

      The reviewer also highlights the introduction’s unconventional structure and information redundancy at the beginning. We have chosen this structure and provided basic explanations to improve readability for a wider audience, given eLife’s readership. At the same time, we will certainly take the reviewers’ feedback into account in the revised version. We plan to include the references to modern itinerancy research mentioned above and to add a section on itinerancy to the Discussion.

      We appreciate the reviewer’s input and sincerely thank them for their time and effort in reviewing our paper. While we may not fully agree on the classification of the behavior we describe, we value the opportunity to engage in discussion and believe that presenting arguments and counterarguments to the reader is beneficial to scientific progress.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      I much enjoyed reading this manuscript, that is, once I understood what it is about. Titles like "Conserving bird populations in the Anthropocene: the significance of non-breeding movements" are a claim to so-called relevance, they have NOTHING to do with the content of the paper, so once I understood that this paper was about the "Quick quick slow: the foxtrot migration of rough-legged buzzards is a response to habitat and snow" (an alternative title), it was becoming very interesting. So the start of the abstract as well as the introduction is very tedious, as clearly much trouble is taken here to establish reputability. In my eyes this is unnecessary: eLife should be interested in publishing such a wonderful description of such a wonderful migrant in a study that comes to grips with limiting factors on a continental scale!

      We sincerely appreciate your time and effort in reviewing our manuscript. Thank you for your appreciation of our study.

      We agree that the focus of the article should be changed from conservation to migration patterns. We have rewritten the Introduction and Discussion as suggested. We have added the application of this pattern including conservation at the end of the Discussion by completely changing Figure 5. We have also changed the title to the suggested one.

      Not sure that the first paragraph statements that seek to downplay what we know about wintering vs breeding areas are valid (although I see what purpose they serve). Migratory shorebirds have extensively been studied in the nonbreeding areas, for example, including movement aspects (see, as just one example, Verhoeven, M.A., Loonstra, A.H.J., McBride, A.D., Both, C., Senner, N.R. & Piersma, T. (2020) Migration route, stopping sites, and non breeding destinations of adult Black tailed Godwits breeding in southwest Fryslân, The Netherlands. Journal of Ornithology 162, 61-76) and there are very impressive studies on the winter biology of migrants across large scale (for example in Zwarts' Living on the Edge book on the Sahel wetlands). Think also about geese and swans and about seabirds!

      We have rewritten the first paragraph and it now talks about patterns of migratory behavior. We have also rewritten the second paragraph, now it is devoted to studies of movements in the non-breeding period. We explain how our pattern differs from those already studied and give references to the papers you mentioned.

      Directional movements in nonbreeding areas as a function of food (in this case locusts) have really beautifully been described by Almut Schlaich et al in JAnimEcol for Montagu's harriers.

      We have added Montagu's harrier example in the second paragraph of the Introduction and the Discussion. We have added a reference to Schlaich and to Garcia and Arroyo, who suggested that Montagu's harriers have long directional migrations during the non-breeding period.

      Once the paper starts talking buzzards, and the analyses of the wonderful data, all is fine. It is a very competent analysis with a description of a cool pattern.

      Thank you for your appreciation of our study. We hope the revised version is better and clearer.

      However, I would say that it is all a question of spatial scale. The buzzards here respond to changes in food availability, but there is not an animal that doesn't. The question is how far they have to move for an adequate response: in some birds movements of 100s of meters may be enough, and then anything to the scale of rough-legged buzzards.

      In the new version of the manuscript, we emphasize that this is a large distance (about 1000 km), comparable to the distance of the fall and spring migrations (about 1400 km) in lines 70-72 of the Introduction and 379-383 of the Discussion.

      And actually, several of the shorebirds I know best also do a foxtrot, such as red knots and bar-tailed godwits moulting in the Wadden Sea, then spending a few months in the UK estuaries, before returning to the Wadden Sea before the long migrations to Arctic breeding grounds. The publication of the rough-legged buzzard story may help researchers to summarize patterns such as this too. My problem with this paper is the framing. A story on the how and why of these continental movements in response to snow and other habitat features would be a grand contribution. Drop Anthropocene, and rethink whether foxtrot should be introduced as a hypothesis or a summary of cool descriptions. I prefer the latter, and recommend eLife to go with that too, rather than encourage "disconnected frames that seek 'respectability'". Good luck, Theunis Piersma

      We thank the reviewer again for his valuable comments and suggestions. We have changed the framing to the suggested one and removed the Anthropocene from the article.

      Reviewer #2 (Recommendations For The Authors):

      We sincerely appreciate the time and effort you have taken to review our manuscript. We have carefully considered all of your comments, including both public and author comments, and provided detailed responses to each of them below. In addition, we would like to address the most important public comments.

      We agree with the suggestion to shift the focus of the article from conservation to migration patterns. Accordingly, we have rewritten both the Introduction and Discussion sections to focus on migration behavior rather than conservation.

      However, we respectfully disagree with the suggestion that the migration patterns we describe are synonymous with itinerancy. We acknowledge that our original presentation may have been unclear and may have hindered full understanding. In the revised version, we provide a detailed analysis of migratory behavior in the Introduction that describes how our pattern differs from itinerancy. We also revisit this distinction in the Discussion section. We have also carefully revised Figure 1 to improve clarity and avoid potential misunderstandings.

      Regarding the applicability of the described migration pattern, we acknowledge that the Rough-legged Buzzard is not listed as an endangered species. However, we believe that our findings have practical implications. We have moved our discussion of this issue to the end of the Discussion section and have completely revised Figure 5. While the overall population of Rough-legged Buzzards is not declining, certain regions within its range are experiencing declines. We show that this decline does not warrant listing the species as endangered. Instead, it may represent a redistribution within the non-breeding range - a shift in range dynamics. We use the example of the Rough-legged Buzzard to illustrate this concept and emphasize the importance of considering such dynamics when assessing the conservation status of species in the future.

      We also acknowledge that the hypothesis of this form of behavior has been proposed previously for Montagu's Harrier, and we have included this information in the revised manuscript. In addition, we agree that the focus on the Anthropocene is unnecessary in this context and have therefore removed it.

      We believe that these revisions significantly improve the clarity and robustness of the manuscript, and we are grateful for your insightful comments and suggestions.

      As a general comment, please note that including line numbers (as it is the standard in any manuscript submission) would facilitate reviewers providing more detailed comments on the text.

      We apologize for this oversight and have added line numbers to our revised manuscript.

      Dataset: unclear what is the frequency of GPS transmissions. Furthermore, information on relative tag mass for the tracked individuals should be reported.

      We have included this information in our manuscript (L 157-163). We also refer to the study in which this dataset was first used and described in detail (L 164).

      Data pre-processing: more details are needed here. What data have been removed if the bird died? The entire track of the individual? Only the data classified in the last section of the track? The section also reports on an 'iterative procedure' for annotating tracks, which is only vaguely described. A piecewise regression is mentioned, but no details are provided, not even on what is the dependent variable (I assume it should be latitude?).

      Regarding the deaths. We only removed the data when the bird was already dead. We have corrected the text to make this clear (L 170).

      Regarding the iterative procedure. We have added a detailed description on lines 175-188.

      Data analysis: several potential issues here:

      (1) Unclear why sex was not included in all mixed models. I think it should be included.

      Our dataset contains 35 females and eight males. This ratio does not allow us to include sex in all models and adequately assess the influence of this factor. At the same time, because adult females disperse farther than males in some raptor species, we conducted a separate analysis of the dependence of migration distance on sex (Table S8) and found no evidence for this in our species. We have written a separate paragraph about this. This paragraph can be found on lines 356-360 of the new manuscript.

      (2) Unclear what is the rationale of describing habitat use during migration; is it only to show that it is a largely unsuitable habitat for the species? But is a formal analysis required then? Wouldn't be enough to simply describe this?

      Habitat use and snow cover determine the two main phases (quick and slow) of the pattern we describe. We believe that habitat analysis is appropriate in this case and that a simple description would be uninformative and would not support our conclusions.

      (3) Analysis of snow cover: such a 'what if' analysis is fine but it seems to be a rather indirect assessment of the effect of snow cover on movement patterns. Can a more direct test be envisaged relating e.g. daily movement patterns to concomitant snow cover? This should be rather straightforward. The effectiveness of this method rests on among-year differences in snow cover and timing of snowfall. A further possibility would be to demonstrate habitat selection within the entire non-breeding home range of an individual in relation snow cover. Such an analysis would imply associating presence-absence of snow to every location within the non-breeding range and testing whether the proportion of locations with snow is lower than the proportion of snow of random locations within the entire non-breeding home range (95% KDE) for every individual (e.g. by setting a 1/10 ratio presence to random locations).

      The proposed analysis will provide an opportunity to assess whether the Rough-legged Buzzard selects areas with the lowest snow cover, but will not provide an opportunity to follow the dynamics and will therefore give a misleading overall picture. This is especially true in the spring months. In March-April, Rough-legged Buzzards move northeast and are in an area that is not the most open to snow. At this time, areas to the southwest are more open to snow (this can be seen in Figure 4b). If we perform the proposed analysis, the control points for this period would be both to the north (where there is more snow) and to the south (where there is less snow) from the real locations, and the result would be that there is no difference in snow cover.

      A step-selection analysis could be used, as we did in our previous work (Curk et al 2020 Sci Rep) with the same Rough-legged Buzzard (but during migration, not winter). But this would only give us a qualitative idea, not a quantitative one - that Rough-legged Buzzards move from snow (in the fall) and follow snowmelt progression (in the spring).

      At the same time, our analysis gives a complete picture of snow cover dynamics in different parts of the non-breeding range. This allows us to see that if Rough-legged Buzzards remained at their fall migration endpoint without moving southwest, they would encounter 14.4% more snow cover (99.5% vs. 85.1%). Although this difference may seem small (14.4%), it holds significance for rodent-hunting birds, distinguishing between complete and patchy snow cover. Simultaneously, if Rough-legged Buzzards immediately flew to the southwest and stayed there throughout winter, they would experience 25.7% less snow cover (57.3% vs. 31.6%). Despite a greater difference than in the first case, it doesn't compel them to adopt this strategy, as it represents the difference between various degrees of landscape openness from snow cover.

      We write about this in the new manuscript on lines 385-394.
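
      The snow-cover figures quoted above are differences in percentage points, which a short calculation makes explicit. The sketch below uses only the four percentages given in the response; the function names and the labels "what-if" and "observed" are ours, and the assignment of each value to a scenario follows our reading of the two sentences rather than the underlying data.

```python
def point_difference(what_if, observed):
    """Difference in percentage points (what-if minus observed)."""
    return round(what_if - observed, 1)


def relative_change(what_if, observed):
    """Change relative to the observed value, in percent."""
    return round(100.0 * (what_if - observed) / observed, 1)


# Scenario 1: birds stay at the fall migration endpoint (what-if 99.5%)
# instead of moving south-west (observed 85.1%).
stay_points = point_difference(99.5, 85.1)
stay_relative = relative_change(99.5, 85.1)

# Scenario 2: birds fly straight to the south-west and stay there
# (what-if 31.6%) instead of moving gradually (observed 57.3%).
direct_points = point_difference(31.6, 57.3)
direct_relative = relative_change(31.6, 57.3)

print("stay at endpoint:", stay_points, "points,", stay_relative, "% relative")
print("direct to south-west:", direct_points, "points,", direct_relative, "% relative")
```

      Reporting the differences as percentage points avoids confusion with the relative changes, which are larger in magnitude in both scenarios.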

      Results: it is unclear whether the reported dispersion measures are SDs or SEs. Please provide details.

      For the date and coordinates of the start and end of the different phases of migration, we specified the mean, sd, and sample size. We wrote this in line 277. For the values of the parameters of the different phases of the migration (duration, distance, speed, and direction), we used the mean, the standard error of the mean, and the confidence interval (obtained using the ‘emmeans’ package). We have indicated this in lines 302-303 and the caption of Table 1 (L 315) and Figure 2 (L 293-294). For the values of habitat and snow cover experienced by the Rough-legged Buzzards, we used the mean and the error of the mean. We reported this on lines 322 and 337 and in Figures 3 (L 332-333) and 4 (L 355-356).

      Discussion: in general, it should be reshaped taking into account the comments. It is overlong, speculative and quite naive in several passages. Entire sections can be safely removed (I think it can be reduced by half without any loss of information). I provide some examples of the issues I have spotted below. For instance, the entire paragraph starting with 'Understanding....' is not clear to me. What do you mean by 'prohibited management' options? Without examples, this seems a rather general text, based on unclear premises when related to the specific of this study. Some statements are vague, derive from unsubstantiated claims, and unclear. E.g. "Despite their scarcity in these habitats, forests appear to hold significant importance for Rough-legged buzzards for nocturnal safety". I could not find any day-night analysis showing that they actually roost in forests during nighttime. Being a tundra species, it may well be possible that rough-legged buzzards perceive forests as very dangerous habitats and that they prefer instead to roost in open habitats. Analysing habitat use during day and night during the non-breeding period may be of help to clarify this. Furthermore, considering the fast migration periods, what is the flight speed during day and night above forests? Do these birds also migrate at night or do they roost during the night? Perhaps a figure visualizing day and night track segments could be of help (or an analysis of day vs. night flight speed) (there are several R packages to annotate tracks in relation to day and night). This is an example of another problematic statement: "The progression of snow cover in the wintering range of Rough-legged buzzards plays a significant role in their winter migration pattern." The manuscript does not contain any clear demonstration of this, as I wrote in my previous comments. Without such evidence, you must considerably tone down such assertions. 
      But since providing a direct link is certainly possible, I think that additional analyses would clearly strengthen your take-home message.

      The paragraph starting with "The quantification of environmental changes that could prove fatal to bird species presents yet another challenge for conservation efforts in an era of rapid global change." is quite odd. Take the following statement "For instance, the presence of small patches of woodland in the winter range might appear crucial to the survival of the Rough-legged buzzard. Elimination of these seemingly minor elements of vegetation cover through management actions could have dire consequences for the species.". It is based on the assumption that minor vegetation elements play a key role in the ecology of the species, without any evidence supporting this. Does it have any sense? I could safely say exactly the opposite and I would believe it might even be more substantiated.

      We agree with these comments.

      We have completely rewritten this section. As suggested, we have shortened it by removing statements that were not supported by the research. We have completely removed the statements about "prohibited management". We have also removed the statement that "forests appear to be of significant importance to Rough-legged buzzards for nocturnal safety" and everything associated with that statement, e.g. the statement about "small elements of vegetation cover", etc. We do believe that this statement is true in substance, but we also agree that it is not supported by the results and requires separate analysis. At the same time, we believe that this is a topic for a separate study and would be redundant here. Therefore, we leave it for a separate publication.

      Conclusion paragraph: I believe this severely overstates the conservation importance of this study. That the results have "crucial implications for conservation efforts in the Anthropocene, where rapidly changing environmental factors can severely impact bird migration" seems completely untenable to me. What is the evidence for such crucial implications? For instance, these results may suggest that climate change, because global warming is predicted to reduce snow cover in the non-breeding areas, might well be beneficial for populations of this species, by reducing non-breeding energy expenditure and improving non-breeding survival. I think statements like these are simply not necessary, and that the study should be more focused on the actual results and evidence provided.

      We have completely rewritten this section. We removed the reference to the Anthropocene and focused on migratory behavior and migration patterns.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      Connelly and colleagues provide convincing genetic evidence that importation from mainland Tanzania is a major source of Plasmodium falciparum lineages currently circulating in Zanzibar. This study also reveals ongoing local malaria transmission and occasional near-clonal outbreaks in Zanzibar. Overall, this research highlights the role of human movements in maintaining residual malaria transmission in an area targeted for intensive control interventions over the past decades and provides valuable information for epidemiologists and public health professionals.

      Reviewer #1 (Public Review):

      Zanzibar archipelago is close to achieving malaria elimination, but despite the implementation of effective control measures, there is still a low-level seasonal malaria transmission. This could be due to the frequent importation of malaria from mainland Tanzania and Kenya, reservoirs of asymptomatic infections, and competent vectors. To investigate population structure and gene flow of P. falciparum in Zanzibar and mainland Tanzania, they used 178 samples from mainland Tanzania and 213 from Zanzibar that were previously sequenced using molecular inversion probes (MIPs) panels targeting single nucleotide polymorphisms (SNPs). They performed Principal Component Analysis (PCA) and identity by descent (IBD) analysis to assess genetic relatedness between isolates. Parasites from coastal mainland Tanzania contribute to the genetic diversity in the parasite population in Zanzibar. Despite this, there is a pattern of isolation by distance and microstructure within the archipelago, and evidence of local sharing of highly related strains sustaining malaria transmission in Zanzibar that are important targets for interventions such as mass drug administration and vector control, in addition to measures against imported malaria.

      Strengths:

      This study presents important samples to understand population structure and gene flow between mainland Tanzania and Zanzibar, especially from the rural Bagamoyo District, where malaria transmission persists and there is a major port of entry to Zanzibar. In addition, this study includes a larger set of SNPs, providing more robustness for analyses such as PCA and IBD. Therefore, the conclusions of this paper are well supported by data.

      Weaknesses:

      Some points need to be clarified:

      (1) SNPs in linkage disequilibrium (LD) can introduce bias in PCA and IBD analysis. Were SNPs in LD filtered out prior to these analyses?

      Thank you for this point. We did not filter SNPs in LD prior to this analysis. In the PCA analysis in Figure 1, we did restrict to a single isolate among those that were clonal (high IBD values) to prevent bias in the PCA. In general, without selective forces at play, disequilibrium extends only over small distances (<5-10 kb). This is much less than the average spacing of the markers in the panel. Because LD between markers is therefore expected to be minimal, the conclusions drawn on relative levels and connections at high IBD are unlikely to be confounded by any effects of disequilibrium.

      (2) Many IBD algorithms do not handle polyclonal infections well, despite an increasing number of algorithms that are able to handle polyclonal infections and multiallelic SNPs. How were polyclonal samples handled for IBD analysis?

      Thank you for this point. We added lines 157-161 to clarify. This section now reads:

      “To investigate genetic relatedness of parasites across regions, identity by descent (IBD) estimates were assessed using the within sample major alleles (coercing samples to monoclonal by calling the dominant allele at each locus) and estimated utilizing a maximum likelihood approach using the inbreeding_mle function from the MIPanalyzer package (Verity et al., 2020). This approach has previously been validated as a conservative estimate of IBD (Verity et al., 2020).”

      Please see the supplement in (Verity et al., 2020) for an extensive simulation study that validates this approach.
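
      The coercion step described in the quoted passage can be illustrated with a minimal sketch. It is not the MIPanalyzer implementation; the function name, the input layout (one within-sample alternate-allele frequency per locus, `None` where a locus was not genotyped) and the handling of exact 0.5 ties are all assumptions made for illustration.

```python
def call_major_alleles(wsaf):
    """Coerce a possibly polyclonal sample to a monoclonal genotype.

    wsaf: list of within-sample alternate-allele frequencies in [0, 1],
          one per locus, with None where the locus was not genotyped.
    Returns a list with 1 (alternate allele dominant), 0 (reference allele
    dominant) or None (missing) per locus.
    """
    calls = []
    for freq in wsaf:
        if freq is None:
            calls.append(None)   # keep missing loci missing
        elif freq > 0.5:
            calls.append(1)      # alternate allele is the major allele
        else:
            calls.append(0)      # assumption: exact ties go to the reference
    return calls


# Hypothetical sample typed at five loci, one of them missing.
example_calls = call_major_alleles([0.0, 0.8, 0.3, None, 1.0])
```

      Calling only the dominant allele discards minority strains, which is why the approach is described as conservative.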

      Reviewer #1 (Recommendations For The Authors):

      (3) I think Supplementary Figures 8 and 9 are more visually informative than Figure 2.

      Thank you for your response. We performed the analysis in Figure 2 to show how IBD varies between different regions and is higher within a region than between.

      Reviewer #2 (Public Review):

      This manuscript describes P. falciparum population structure in Zanzibar and mainland Tanzania. 282 samples were typed using molecular inversion probes. The manuscript is overall well-written and shows a clear population structure. It follows a similar manuscript published earlier this year, which typed a similar number of samples collected mostly in the same sites around the same time. The current manuscript extends this work by including a large number of samples from coastal Tanzania, and by including clinical samples, allowing for a comparison with asymptomatic samples.

      The two studies made overall very similar findings, including strong small-scale population structure, related infections on Zanzibar and the mainland, near-clonal expansion on Pemba, and frequency of markers of drug resistance. Despite these similarities, the previous study is mentioned a single time in the discussion (in contrast, the previous research from the authors of the current study is more thoroughly discussed). The authors missed an opportunity here to highlight the similar findings of the two studies.

      Thank you for your insights. We appreciated the level of detail of your review and it strengthened our work. We have added sentences on lines 292-295, which now read:

      “A recent study investigating population structure in Zanzibar also found local population microstructure in Pemba (Holzschuh et al., 2023). Further, both studies found near-clonal parasites within the same district, Micheweni, and found population microstructure over Zanzibar.”

      Strengths:

      The overall results show a clear pattern of population structure. The finding of highly related infections detected in close proximity shows local transmission and can possibly be leveraged for targeted control.

      Weaknesses:

      A number of points need clarification:

      (1) It is overall quite challenging to keep track of the number of samples analyzed. I believe the number of samples used to study population structure was 282 (line 141), thus this number should be included in the abstract rather than 391. It is unclear where the number 232 on line 205 comes from; I failed to deduce this number from supplementary table 1.

      Thank you for this point. We have included 282 instead of 391 in the abstract. We added a statement in the results at lines 203-205 to clarify this point, which now reads:

      “PCA analysis of 232 coastal Tanzanian and Zanzibari isolates, after pruning 51 samples with an IBD of greater than 0.9 to one representative sample, demonstrates little population differentiation (Figure 1A).”
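
      One way to prune highly related samples to a single representative is sketched below. The response does not say how groups of related samples were formed, so the single-linkage grouping (union-find), the choice of the first-listed sample as representative, and all names are assumptions.

```python
def prune_related(samples, pairwise_ibd, threshold=0.9):
    """Keep one representative per group of highly related samples.

    samples: ordered list of sample identifiers.
    pairwise_ibd: dict mapping (sample_a, sample_b) to an IBD estimate.
    Samples linked by any chain of pairs with IBD > threshold form one group
    (single linkage, an assumption); the first-listed member is kept.
    """
    parent = {s: s for s in samples}

    def find(s):
        # Walk up to the group root, compressing the path on the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), ibd in pairwise_ibd.items():
        if ibd > threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    kept, seen_roots = [], set()
    for s in samples:
        root = find(s)
        if root not in seen_roots:
            seen_roots.add(root)
            kept.append(s)
    return kept


# Hypothetical data: s1, s2 and s3 are near-clonal, s4 is unrelated.
toy_ibd = {("s1", "s2"): 0.95, ("s2", "s3"): 0.92,
           ("s3", "s4"): 0.10, ("s1", "s4"): 0.05}
representatives = prune_related(["s1", "s2", "s3", "s4"], toy_ibd)
```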

      (2) Also, Table 1 and Supplementary Table 1 should be swapped. It is more important for the reader to know the number of samples included in the analysis (as given in Supplementary Table 1) than the number collected. Possibly, the two tables could be combined in a clever way.

      Thank you for this advice. Rather than switch to another table altogether, we appended two columns to the original table to better portray the information (see Table 1).

      Methods

      (3) The authors took the somewhat unusual decision to apply K-means clustering to GPS coordinates to determine how to combine their data into a cluster. There is an obvious cluster on Pemba islands and three clusters on Unguja. Based on the map, I assume that one of these three clusters is mostly urban, while the other two are more rural. It would be helpful to have a bit more information about that in the methods. See also comments on maps in Figures 1 and 2 below.

      Cluster 3 is a mix of rural/urban while clusters 2, 4 and 5 are mostly rural. This analysis was performed to see how IBD changes in relation to local context within different regions in Zanzibar, showing that IBD is higher within a locale than between locales.

      (4) Following this point, in Supplemental Figure 5 I fail to see an inflection point at K=4. If there is one, it will be so weak that it is hardly informative. I think selecting 4 clusters in Zanzibar is fine, but the justification based on this figure is unclear.

      The K-means clustering experiment was used to cluster a continuous space of geographic coordinates in order to compare genetic relatedness in different regions. We selected this inflection point based on the elbow plot, and also chose the number so as to obtain enough subsections of Zanzibar to compare genetic relatedness. This point is added to the methods at lines 174-178, which now read:

      “The K-means clustering experiment was used to cluster a continuous space of geographic coordinates in order to compare genetic relatedness in different regions. We selected K = 4 as the inflection point based on the elbow plot (Supplemental Figure 5) and based the number to obtain sufficient subsections of Zanzibar to compare genetic relatedness.”
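
      The quantity plotted in an elbow plot, the within-cluster sum of squares for each K, can be computed with a small sketch such as the one below. It is not the authors' code: it treats coordinates as planar points and uses a deterministic farthest-point initialisation, both assumptions made so the example is reproducible, and the toy coordinates are invented.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def centroid(group):
    """Mean point of a non-empty list of 2-D points."""
    n = len(group)
    return (sum(p[0] for p in group) / n, sum(p[1] for p in group) / n)


def kmeans_inertia(points, k, iters=50):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    # Deterministic farthest-point initialisation (an assumption).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sqdist(p, centers[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centre.
        updated = [centroid(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return sum(min(sqdist(p, c) for c in centers) for p in points)


# Invented coordinates forming three tight groups.
toy_points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (20, 1)]
elbow = {k: kmeans_inertia(toy_points, k) for k in (1, 2, 3, 4)}
```

      Plotting `elbow` against K shows where adding clusters stops reducing the sum of squares appreciably; when the bend is weak, as the reviewer notes for Supplemental Figure 5, the choice of K rests on practical grounds as well.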

      (5) For the drug resistance loci, it is stated that "we further removed SNPs with less than 0.005 population frequency." Was the denominator for this analysis the entire population, or were Zanzibar and mainland samples assessed separately? If the latter, as for all markers <200 samples were typed per site, there could not be a meaningful way of applying this threshold. Given data were available for 200-300 samples for each marker, does this simply mean that each SNP needed to be present twice?

      Population frequency is calculated based on the average within sample allele frequency of each individual in the population, which is an unbiased estimator. Within sample allele frequency can range from 0 to 1. Thus, in a population of 100 samples, if only one sample has an allele and it is at 0.1 within sample frequency, the population allele frequency would be 0.1/100 = 0.001. This allele falls below the 0.005 threshold and is removed, even though its prevalence would have been 1/100 = 0.01. This filtering is prior to any final summary frequency or prevalence calculations (see MIP variant Calling and Filtering section in the methods). This protects against errors occurring only at low frequency.
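
      The worked example in this response can be reproduced with a short sketch. The function names are hypothetical, and treating the 0.005 cut-off as "remove if strictly below" follows the wording "less than 0.005" quoted by the reviewer.

```python
def population_allele_frequency(wsaf):
    """Mean within-sample allele frequency across all samples."""
    return sum(wsaf) / len(wsaf)


def prevalence(wsaf):
    """Fraction of samples carrying the allele at any frequency above zero."""
    return sum(1 for f in wsaf if f > 0) / len(wsaf)


# 100 samples; a single sample carries the allele at a within-sample
# frequency of 0.1, as in the example above.
wsaf = [0.1] + [0.0] * 99

freq = population_allele_frequency(wsaf)  # 0.1 / 100
prev = prevalence(wsaf)                   # 1 / 100
keep = freq >= 0.005                      # alleles below the cut-off are removed
```

      Here the allele is filtered out on frequency even though one sample in a hundred carries it, which is the distinction between frequency and prevalence that the response draws.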

      Discussion:

      (6) I was a bit surprised to read the following statement, given Zanzibar is one of the few places that has an effective reactive case detection program in place: "Thus, directly targeting local malaria transmission, including the asymptomatic reservoir which contributes to sustained transmission (Barry et al., 2021; Sumner et al., 2021), may be an important focus for ultimately achieving malaria control in the archipelago (Björkman & Morris, 2020)." I think the current RACD program should be mentioned and referenced. A number of studies have investigated this program.

      Thank you for this point. We have added additional context and clarification on lines 275-280, which now reads:

      “Thus, directly targeting local malaria transmission, including the asymptomatic reservoir which contributes to sustained transmission (Barry et al., 2021; Sumner et al., 2021), may be an important focus for ultimately achieving malaria control in the archipelago (Björkman & Morris, 2020). Currently, a reactive case detection program within index case households is being implemented, but local transmission continues and further investigation into how best to control this is warranted (Mkali et al. 2023).”

      (7) The discussion states that "In Zanzibar, we see this both within and between shehias, suggesting that parasite gene flow occurs over both short and long distances." I think the term 'long distances' should be better defined. Figure 4 shows that highly related infections rarely span beyond 20-30 km. In many epidemiological studies, this would still be considered short distances.

      Thank you for this point. We have edited the text at lines 287-288 to indicate that highly related parasites mainly occur at the range of 20-30km, which now reads:

      “In Zanzibar, highly related parasites mainly occur at the range of 20-30km.”

      (8) Lines 330-331: "Polymorphisms associated with artemisinin resistance did not appear in this population." Do you refer to background mutations here? Otherwise, the sentence seems to repeat lines 324. Please clarify.

      We are referring to the list of Pfk13 polymorphisms stated in the Methods from lines 146-148. We added clarifying text on lines 326-329:

      “Although polymorphisms associated with artemisinin resistance did not appear in this population, continued surveillance is warranted given emergence of these mutations in East Africa and reports of rare resistance mutations on the coast consistent with spread of emerging Pfk13 mutations (Moser et al., 2021).”

      (9) Line 344: The opinion paper by Bousema et al. in 2012 was followed by a field trial in Kenya (Bousema et al, 2016) that found that targeting hotspots did NOT have an impact beyond the actual hotspot. This (and other) more recent finding needs to be considered when arguing for hotspot-targeted interventions in Zanzibar.

      We added a clarification on this point on lines 335-345, which now reads:

      “A recent study identified “hotspot” shehias, defined as areas with comparatively higher malaria transmission than other shehias, near the port of Zanzibar town and in northern Pemba (Bisanzio et al., 2023). These regions overlapped with shehias in this study with high levels of IBD, especially in northern Pemba (Figure 4). These areas of substructure represent parasites that differentiated in relative isolation and are thus important locales to target intervention to interrupt local transmission (Bousema et al., 2012). While a field cluster-randomized control trial in Kenya targeting these hotspots did not confer much reduction of malaria outside of the hotspot (Bousema et al. 2016), if areas are isolated pockets, which genetic differentiation can help determine, targeted interventions in these areas are likely needed, potentially through both mass drug administration and vector control (Morris et al., 2018; Okell et al., 2011). Such strategies and measures preventing imported malaria could accelerate progress towards zero malaria in Zanzibar.”

      Figures and Tables:

      (10) Table 2: Why not enter '0' if a mutation was not detected? 'ND' is somewhat confusing, as the prevalence is indeed 0%.

      Thank you for this point. We now report zero and also give the CI to provide more detail.

      (11) Figure 1: Panel A is very hard to read. I don't think there is a meaningful way to display a 3D-panel in 2D. Two panels showing PC1 vs. PC2 and PC1 vs. PC3 would be better. I also believe the legend 'PC2' is placed in the wrong position (along the Y-axis of panel 2).

      Supplementary Figure 2B suffers from the same issue.

      Thank you for your comment. A revised Figure 1 and Supplemental Figure 2 are included, where there are separate plots for PC1 vs. PC2 and PC1 vs. PC3.

      (12) The maps for Figures 1 and 2 don't correspond. Assuming Kati represents cluster 4 in Figure 2, the name is put in the wrong position. If the grouping of shehias is different between the Figures, please add an explanation of why this is.

      Thank you for this point. The districts with at least 5 samples present are plotted in the map in Figure 1B. In Figure 2, a totally separate analysis was performed, where all shehias were clustered into separate groups with k-means and the IBD values were compared between these clusters. These maps are not supposed to match, as they are separate analyses. Figure 1B is at the district level and Figure 2 is clustering shehias throughout Zanzibar.

      The figure legend of Figure 1B on lines 410-414 now reads:

      “B) A Discriminant Analysis of Principal Components (DAPC) was performed utilizing isolates with unique pseudohaplotypes, pruning highly related isolates to a single representative infection. Districts were included with at least 5 isolates remaining to have sufficient samples for the DAPC. For plotting the inset map, the district coordinates (e.g. Mainland, Kati, etc.) are calculated from the averages of the shehia centroids within each district.”

      The figure legend of Figure 2 on lines 417-425 now reads:

      “Figure 2. Coastal Tanzania and Zanzibari parasites have more highly related pairs within their given region than between regions. K-means clustering of shehia coordinates was performed using the geographic coordinates of all shehias present from the sample population to generate 5 clusters (colored boxes). All shehias were included to assay pairwise IBD between differences throughout Zanzibar. Pairwise comparisons of within cluster IBD (column 1 of IBD distribution plots) and between cluster IBD (columns 2-5 of IBD distribution plots) were done for all clusters. In general, within cluster IBD had more pairwise comparisons containing high IBD identity.”

      (13) Figure 2: In the main panel, please clarify what the lines indicate (median and quartiles?). It is very difficult to see anything except the outliers. I wonder whether another way of displaying these data would be clearer. Maybe a table with medians and confidence intervals would be better (or that data could be added to the plots). The current plots might be misleading as they are dominated by outliers.

      Thank you for this point; it greatly improved this figure. We now use a beeswarm plot, which shows all pairwise IBD values within each comparison group.

      (14) In the insert, the cluster number should not only be given as a color code but also added to the map. The current version will be impossible to read for people with color vision impairment, and it is confusing for any reader as the numbers don't appear to follow any logic (e.g. north to south).

      Thank you very much for these considerations. We changed the color coding to a color blind friendly palette and renamed the clusters to more informative names; Pemba, Unguja North (Unguja_N), Unguja Central (Unguja_C), Unguja South (Unguja_S) and mainland Tanzania (Mainland).

      (15) The legend for Figure 3 is difficult to follow. I do not understand what the difference in binning was in panels A and B compared to C.

      Thank you for this point. We have edited the legend to reflect these changes. The legend for Figure 3 on lines 427-433 now reads:

      “Figure 3. Isolation by distance is shown between all Zanzibari parasites (A), only Unguja parasites (B) and only Pemba parasites (C). Samples were analyzed based on geographic location, Zanzibar (N=136) (A), Unguja (N=105) (B) or Pemba (N=31) (C) and great-circle (GC) distances between pairs of parasite isolates were calculated based on shehia centroid coordinates. These distances were binned at 4km increments out to 12 km. IBD beyond 12km is shown in Supplemental Figure 8. The maximum GC distance for all of Zanzibar was 135km, 58km on Unguja and 12km on Pemba. The mean IBD and 95% CI are plotted for each bin.”
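
      The distance calculation and binning described in this legend can be sketched as follows. This is not the authors' code: the haversine formula, the mean Earth radius of 6371 km, the half-open bins [0, 4), [4, 8), [8, 12) and all names are assumptions, and the confidence interval is left out.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_ibd_by_bin(pairs, bin_km=4.0, max_km=12.0):
    """Average IBD per distance bin.

    pairs: iterable of (distance_km, ibd) for each pair of isolates.
    Returns {bin_start_km: mean_ibd}; pairs at or beyond max_km are skipped.
    """
    totals, counts = {}, {}
    for dist, ibd in pairs:
        if dist >= max_km:
            continue
        start = int(dist // bin_km) * bin_km
        totals[start] = totals.get(start, 0.0) + ibd
        counts[start] = counts.get(start, 0) + 1
    return {start: totals[start] / counts[start] for start in totals}


# Invented pairs: two in the first bin, one in the second, one beyond 12 km.
binned = mean_ibd_by_bin([(1.0, 0.2), (3.0, 0.4), (5.0, 0.1), (13.0, 0.9)])
```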

      (16) Font sizes for panel C differ, and it is not aligned with the other panels.

      Thank you for pointing this out. Figure 3 and Supplemental Figure 10 are adjusted with matching formatting for each plot.

      (17) Why is Kusini included in Supplemental Figure 4, but not in Figure 1?

      In Supplemental Figure 4, all isolates were used in this analysis and isolates with unique pseudohaplotypes were not pruned to a single representative infection. That is why there are additional isolates in Kusini. The legend for Supplemental Figure 4 now reads:

      “Supplemental Figure 4. PCA with highly related samples shows population stratification radiating from coastal Mainland to Zanzibar. PCA of 282 total samples was performed using whole sample allele frequency (A) and DAPC was performed after retaining samples with unique pseudohaplotypes in districts that had 5 or more samples present (B). As opposed to Figure 1, all isolates were used in this analysis and isolates with unique pseudohaplotypes were not pruned to a single representative infection.”

      (18) Supplemental Figures 6 and 7: What does the width of the line indicate?

      The sentence below was added to the figure legends of Supplemental Figures 6 and 7 and the legends of each network plot were increased in size:

      “The width of each line represents higher magnitudes of IBD between pairs.”

      (19) What was the motivation not to put these lines on the map, as in Figure 4A? This might make it easier to interpret the data.

      Thank you for this comment. For Supplemental Figures 8 and 9, we omitted the lines representing lower pairwise IBD in order to draw the reader's attention to the highly related pairs between and within shehias.

      Reviewer #2 (Recommendations For The Authors):

      (1) There is a rather long paragraph (lines 300-323) on COI of asymptomatic infections and their genetic structure. Given that the current study did not investigate most of the hypotheses raised there (e.g. immunity, expression of variant genes), and the overall limited number of asymptomatic samples typed, this part of the discussion feels long and often speculative.

      Thank you for your perspective. The key sections highlighted in this comment, regarding immunity and expression of variant genes, were shortened. This section on lines 300-303 now reads:

      “Asymptomatic parasitemia has been shown to be common in falciparum malaria around the globe and has been shown to have increasing importance in Zanzibar (Lindblade et al., 2013; Morris et al., 2015). What underlies the biology and prevalence of asymptomatic parasitemia in very low transmission settings where anti-parasite immunity is not expected to be prevalent remains unclear (Björkman & Morris, 2020).”

      (2) As a detail, line 304 mentions "few previous studies" but only one is cited. Are there studies that investigated this and found opposite results?

      Thank you for this comment. We added additional studies that did not find an association between clinical disease and COI. These changes are on lines 303-308, which now reads:

      “Similar to a few previous studies, we found that asymptomatic infections had a higher COI than symptomatic infections across both the coastal mainland and Zanzibar parasite populations (Collins et al., 2022; Kimenyi et al., 2022; Sarah-Matio et al., 2022). Other studies have found lower COI in severe vs. mild malaria cases (Robert et al., 1996) or no significant difference between COI based on clinical status (Earland et al. 2019; Lagnika et al. 2022; Conway et al. 1991; Kun et al. 1998; Tanabe et al. 2015)”

      (3) Table 2: Percentages need to be checked. To take one of several examples, for Pfk13-K189N a frequency of 0.019 for the mutant allele is given among 137 samples. 2/137 equals to 0.015, and 3/137 to 0.022. 0.019 cannot be achieved. The same is true for several other markers. Possibly, it can be explained by the presence of polyclonal infections. If so, it should be clarified what the total of clones sequenced was, and whether the prevalence is calculated with the number of samples or number of clones as the denominator.

      Thank you for this point. We mistakenly reported allele frequency instead of prevalence. An updated Table 2 is now in the manuscript. The method for calculating the prevalence is now at lines 148-151:

      “Prevalence was calculated separately in Zanzibar or mainland Tanzania for each polymorphism by the number of samples with alternative genotype calls for this polymorphism over the total number of samples genotyped and an exact 95% confidence interval was calculated using the Pearson-Klopper method for each prevalence.”
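
      The prevalence and exact interval described in the quoted passage can be illustrated with a standard-library sketch. The "Pearson-Klopper" interval presumably refers to the exact binomial interval more commonly called Clopper-Pearson; this sketch assumes so and solves for the bounds by bisection rather than reproducing whatever software the authors used. The counts in the example are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for k successes in n."""

    def solve(func, target):
        # func must increase monotonically with p on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(200):
            mid = (lo + hi) / 2
            if func(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. 1 - CDF = 1 - alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


# Illustrative counts: a mutation seen in 0 of 137 genotyped samples.
n_mutant, n_genotyped = 0, 137
prevalence_estimate = n_mutant / n_genotyped
ci_low, ci_high = exact_ci(n_mutant, n_genotyped)
```

      When no mutant samples are observed the lower bound is zero and the upper bound has the closed form 1 - (alpha/2)^(1/n), which gives a non-trivial interval even for a prevalence of 0%.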

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents useful findings regarding the role of formin-like 2 in mouse oocyte meiosis. The submitted data are supported by incomplete analyses, and in some cases, the conclusions are overstated. If these concerns are addressed, this paper would be of interest to reproductive biologists.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The presented study focuses on the role of formin-like 2 (FMNL2) in oocyte meiosis. The authors assessed FMNL2 expression and localization in different meiotic stages and subsequently, by using siRNA, investigated the role of FMNL2 in spindle migration, polar body extrusion, and distribution of mitochondria and endoplasmic reticulum (ER) in mouse oocytes.

      Strengths:

      Novelty in assessing the role of formin-like 2 in oocyte meiosis.

      Weaknesses:

      Methods are not properly described.

      Overstating presented data.

      It is not clear what statistical tests were used.

      My main concern is that there are missing important details of how particular experiments and analyses were done. The material and methods section are not written in the way that presented experiments could be repeated - it is missing basic information (e.g., used mouse strain, timepoints of oocytes harvest for particular experiments, used culture media, image acquisition parameters, etc.). Some of the presented data are overstated and incorrectly interpreted. It is not clear to me how the analysis of ER and mitochondria distribution was done, which is an important part of the presented data interpretation. I'm also missing important information about the timing of particular stages of assessed oocytes because the localization of both ER and mitochondria differs at different stages of oocyte meiosis. The data interpretation needs to be justified by proper analysis based on valid parameters, as there is considerable variability in the ER and mitochondria structure and localization across oocytes based on their overall quality and stage.

      Thank you for your comment. We regret the oversight of omitting critical information in the manuscript. In the revised manuscript, we have included essential details such as mouse strains, culture media, oocyte stages and statistical methods in the materials and methods section. Please find our detailed responses in the “Recommendations for the authors” part.

      Reviewer #2 (Public Review):

      Summary:

      This research involves conducting experiments to determine the role of Fmnl2 during oocyte meiosis I.

      Strengths:

      Identifying the role of Fmnl2 during oocyte meiosis I is significant.

      Weaknesses:

      The quantitative analysis and the used approach to perturb FMNL2 function are currently incomplete and would benefit from more confirmatory approaches and rigorous analysis.

      (1) Most of the results are expected. The new finding here is that FMNL2 regulates cytoplasmic F-actin in mouse oocytes, which is also expected given the role of FMNL2 in other cell types. Given that FMNL2 regulates cytoplasmic F-actin, it is very expected to see all the observed phenotypes. It is already established that F-actin is required for spindle migration to the oocyte cortex, extruding a small polar body and normal organelle distribution and functions.

      Thank you for your comment. Over the past two decades, the Arp2/3 complex (Nat Cell Biol 2011), Formin2 (Nat Cell Biol 2002, Nat Commun 2020), and Spire (Curr Biol 2011) were reported to be three key factors involved in this process. These factors regulate actin filaments in different ways. However, how they coordinate with each other in these subcellular events was still not fully clear. Our current study identified that FMNL2 played a critical role in coordinating these molecules for actin assembly in oocytes. Our findings demonstrate that FMNL2 interacts with both the Arp2/3 complex and Formin2 to facilitate actin-based meiotic spindle migration. Additionally, we discovered a novel role for FMNL2 in determining the distribution and function of the endoplasmic reticulum and mitochondria, which may in turn influence meiotic spindle migration in oocytes. Our results not only uncover the novel functions of FMNL2-mediated actin for organelle distribution, but also extend our understanding of the molecular basis for the unique meiotic spindle migration in oocyte meiosis.

      (2) The authors used Fmnl2 cRNA to rescue the effect of siRNA-mediated knockdown of Fmnl2. It is not clear how this works. It is expected that the siRNA will also target the exogenous cRNA construct (which should have the same sequence as endogenous Fmnl2) especially when both of them were injected at the same time. Is this construct mutated to be resistant to the siRNA?

      Thank you for your question. We regret any misunderstanding that may have been caused by the inappropriate description in our manuscript. In the rescue experiments, we initially injected FMNL2 siRNA into oocytes, followed by the microinjection of FMNL2 mRNA 18-20 hours later. In our previous experiments, we verified through Western blotting that endogenous FMNL2 is effectively suppressed 18-20 hours following the microinjection of FMNL2 siRNA. Additionally, we observed a significant increase in exogenous FMNL2 protein expression 2 hours after the injection of FMNL2 mRNA. We believe that the exogenous FMNL2 could compensate for the decrease caused by FMNL2 knockdown, and this approach has been adopted in many oocyte studies.

      (3) The authors used only one approach to knockdown FMNL2 which is by siRNA. Using an additional approach to inhibit FMNL2 would be beneficial to confirm that the effect of siRNA-mediated knockdown of FMNL2 is specific.

      Thank you for your question. Yes, specificity is always a concern for siRNA or morpholino microinjection because of off-target effects. Owing to technical limitations, we could not generate a knockout model, and there are no known inhibitors that specifically target FMNL2. To address this, we performed the rescue study with exogenous mRNA to confirm the effective knockdown of FMNL2. These measures provide reassurance regarding the credibility of the experimental outcomes, and this is also the general way to control for off-target effects of siRNA or morpholino.

      Reviewer #3 (Public Review):

      Summary:

      The authors focus on the role of formin-like protein 2 in the mouse oocyte, which could play an important role in actin filament dynamics. The cytoskeleton is known to influence a number of cellular processes from transcription to cytokinesis. The results show that downregulation of FMNL2 affects spindle migration with resulting abnormalities in cytokinesis in oocyte meiosis I.

      Weaknesses:

      The overall description of methods and figures is overall dismissively poor. The description of the sample types and number of replicate experiments is impossible to interpret throughout, and the quantitative analysis methods are not adequately described. The number of data points presented is unconvincing and unlikely to support the conclusions. On the basis of the data presented, the conclusions appear to be preliminary, overstated, and therefore unconvincing.

      Thank you for your comment. We regret the oversight of omitting critical information in the manuscript. In the revised manuscript, we have incorporated your suggestions for modification, particularly regarding the Materials and Methods section. Please see the detailed revision and responses in the “Recommendations for the authors” part.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for The Authors):

      My main concern is that there are missing important details of how particular experiments and analyses were done. The material and methods section is not written in the way that presented experiments could be repeated - it is missing basic information (e.g., used mouse strain, timepoints of oocytes harvest for particular experiments, used culture media, image acquisition parameters, etc.). Some of the presented data are overstated and incorrectly interpreted. It is not clear to me how the analysis of ER and mitochondria distribution was done, which is an important part of the presented data interpretation. I'm also missing important information about the timing of particular stages of assessed oocytes because the localization of both ER and mitochondria differs at different stages of oocyte meiosis. The data interpretation needs to be justified by proper analysis based on valid parameters, as there is considerable variability in the ER and mitochondria structure and localization across oocytes based on their overall quality and stage. My specific comments are listed below.

      (1) Information about statistical tests that were used needs to be provided for all quantification experiments.

      Thank you for your suggestion. Based on your suggestions, we revised the statistical analysis description in the Materials and Methods section. Additionally, we also included a description of the statistical methods in the legends of the relevant result figures.

      (2) I recommend replacing the plunger plots, used in most quantification data, with alternatives allowing evaluation of the distribution of the data (dot plots, box plots, whisker plots).

      Thank you for your suggestion. Following your suggestion, we replaced the plunger plots in Fig 2C, D, H, I and Fig3 B, C with dot plots.

      (3) Can the authors provide information about particular time points when were individual oocyte stages (GVBD, meiosis I, and meiosis II) harvested/used for immunofluorescence protein detection, western blotting, microinjection, and ER and mitochondria staining? Were the time points always the same in all presented experiments and experimental vs control group? If not, this needs to be clarified.

      Thank you for your suggestion. We used oocytes in the metaphase I (MI) stage for the statistical analysis of spindle migration, actin filament aggregation, endoplasmic reticulum localization, and mitochondrial localization. In the Western blot analysis, GV stage oocytes were utilized to evaluate the efficiency of knockdown and rescue experiments. The protein expression levels of Arp2, Formin2, INF2, Cofilin, Grp78, and Chop in different treatment groups were detected using MI-stage oocytes. In the revised version, we provided all the detailed information about the stages.

      (4) Figure 1B: Can the authors comment on why there is a missing representative image of MII oocyte FMNL2-Ab? I recommend including this in the figure to have a complete view of comparing overexpressed and endogenous FMNL2 localization in oocyte meiosis.

      Thank you for your suggestion. In the revised manuscript, we added immunostaining images of FMNL2 antibody in MII stage oocytes.

      (5) Figure 1C: The figure legend says, "FMNL2 and actin overlapped in cortex and spindle surrounding". In MI oocytes, there is usually no accumulated actin signal around the spindle, which is also true in the presented images, so there cannot be overlapping with the FMNL2 signal. The interpretation should be changed.

      We apologize for this inappropriate description that was used, and we deleted this sentence.

      (6) Figure 2B: What were the parameters of the "large" and "normal" polar bodies for performing the analysis?

      Thank you for your question. In order to assess the size of the polar body, we conducted a comparison between the diameter of the polar body and that of the oocyte. If the diameter of the polar body was found to be less than 1/3 of the oocyte's diameter, we categorized it as normal-sized polar body. Conversely, if the polar body's diameter exceeded 1/3 of the oocyte's diameter, we categorized it as a large polar body. We have included these details in the Results section of the manuscript.
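      The size criterion described above can be written as a short rule. The following is a minimal sketch, not the authors' analysis code: the function name and the use of micrometres are assumptions, and because the response does not say how a ratio of exactly 1/3 is handled, treating it as normal is also an assumption.

```python
def classify_polar_body(polar_body_diameter_um: float, oocyte_diameter_um: float) -> str:
    """Classify a polar body as 'normal' or 'large' from the diameter ratio.

    Hypothetical helper: a polar body whose diameter exceeds 1/3 of the
    oocyte diameter is called 'large', otherwise 'normal'.
    """
    if polar_body_diameter_um <= 0 or oocyte_diameter_um <= 0:
        raise ValueError("diameters must be positive")
    ratio = polar_body_diameter_um / oocyte_diameter_um
    # Assumption: a ratio of exactly 1/3 is counted as normal.
    return "large" if ratio > 1 / 3 else "normal"
```

      Because only the ratio matters, any length unit works as long as both diameters are measured in the same one.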

      (7) Figure 2F: Can the authors comment on what can be the second band in the rescue group?

      Thank you for your question. In the rescue experiment, we microinjected exogenous FMNL2-EGFP mRNA into the oocytes. As a result, compared to endogenous FMNL2, the protein size increased due to the addition of the EGFP tag, approximately 27 kDa. Hence, in the Western blot bands of the rescue group, the upper band represents the expression of exogenous FMNL2-EGFP, while the lower band corresponds to the expression of endogenous FMNL2. We have provided annotations in the revised Figure 2F to clarify this.

      (8) Can the authors comment on the variability of PBE between 2C and 2H in the FMNL2-KD groups? In panel C, the PBE in the KD group was 59.5 ± 2.82%; in panel H, the PBE in the KD group was 48.34 ± 4.2%, and in the rescue group, the PBE was 62.62 ± 3.6%. The rescue group has a similar PBE rate as the KD group in panel C. How consistent was the FMNL2 knockdown across individual replicates? Can the authors provide more details on how the rescue experiment was performed?

      Thank you for your question. We believe that the difference in PBE observed between Figure 2C and 2H for the FMNL2-KD group was due to the number of microinjections and the duration of in vitro arrest. The results shown in Figure 2C depict the outcome of a single injection of FMNL2 siRNA into GV stage oocytes, followed by 18 hours of in vitro arrest; the results shown in Figure 2H involve a subsequent additional injection of FMNL2-EGFP mRNA with another 2 hours of arrest. The two rounds of microinjection and the extended period of in vitro arrest both affect oocyte maturation rates.

      (9). Figure 2J and K: What groups were compared together? The used statistic needs to be properly described.

      Thank you for your question. The FMNL2-KD, FMNL3-KD, and FMNL2+3-KD groups were each compared to the Control group; therefore, a t-test was used for analysis. We have provided explanations in the revised manuscript.
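      To illustrate what a pairwise comparison against the control involves, the sketch below computes the statistic for an unpaired two-sample t-test. The response does not state which variant of the t-test was used, so the pooled-variance (Student's) form is an assumption, as are the function name and the example values in the usage note, which are not data from the study.

```python
import math
import statistics


def student_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for an unpaired, pooled-variance t-test.

    Assumption: equal variances in both groups (Student's t-test).
    Each group needs at least two replicate values.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two values")
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    dof = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("t is undefined when both groups have zero variance")
    return (mean_a - mean_b) / standard_error, dof
```

      With hypothetical replicate percentages, `student_t_statistic([80.0, 78.0, 82.0], [60.0, 58.0, 62.0])` returns the statistic and its degrees of freedom; the p-value would then be read from a t distribution with that many degrees of freedom.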

      (10) Figure 4B and C: Can the authors provide representative images without oversaturated actin signal?

      Thank you for your question. For the analysis of oocyte F-actin, we divide F-actin into cortical actin and cytoplasmic actin. Because of the contrast during imaging, the strong cortical actin signal interferes with the detection of cytoplasmic actin. It is therefore necessary to increase the scanning index, which overexposes the cortical actin signal. This allows better observation of the cytoplasmic signal.

      (11) Figure 4G + 5H: Can the authors comment on why they used as a housekeeping gene actin instead of tubulin, which was used in the rest of the WB experiments?

      Thank you for your question. In most of the western blot experiments conducted in this study, we used tubulin as a housekeeping gene. However, because of antibody supply and delivery times, we also used GAPDH and actin for some experiments. These housekeeping genes were all valid for the study.

      (12) Based on what parameters was ER considered normally or abnormally distributed, and what stages of oocytes were assessed?

      Thank you for your question. In this study, we employed oocytes at the MI stage for the analysis of ER localization. In the MI stage, the ER localizes around the spindle, which is regarded as the typical localization pattern. Oocytes in which the ER was dispersed throughout the cytoplasm or clustered were categorized as having aberrant ER positioning. We included relevant descriptions in the revised version of the manuscript.

      (13) Figure 5H: As a housekeeping gene was used actin - the quantification is labeled as a Grp78 to tubulin ratio.

      Thank you for pointing out the error. This is a label mistake and we corrected it.

      (14) Information about how JC-1 staining was done needs to be provided.

      Thank you for your careful reading. We included a description of JC-1 staining in the Materials and Methods section.

      (15). Line 231-232: "As shown in Figure 4A" - the text doesn't correspond to the figure.

      Thank you for pointing out the error. We revised this mistake in the revised manuscript by correcting "Fig3A" to "Fig4A."

      (16) Line 265: there is probably a missing word "Formin2".

      Thank you. We corrected the error and made the necessary changes in the revised manuscript.

      Reviewer #2 (Recommendations for The Authors):

      (1) Quantification and analysis:

      • Fig. 3B: The rate of spindle migration should be quantified based on the distance from the spindle to the cortex. Also, the orientation of the spindle (Z-position) needs to be taken into consideration.

      • Fig. 5C, D: It is unclear how the rate of ER distribution was calculated.

      • Western blot: In many experiments (such as Fig. 5H), the bands are saturated which will prevent accurate intensity measurements and quantifications.

      For spindle migration, we specifically focused on spindles exhibiting a distinctive spindle-like shape with clear bipolarity to eliminate any statistical discrepancies potentially caused by variations in Z-axis alignment. Our criterion for determining successful migration was based on the contact between the spindle pole and the cortical region of the oocyte. Therefore, we think that the rate reflects the phenotype better than the distance.

      For the examination of ER localization, Reviewer 1 also raised this issue. We utilized oocytes at the MI stage in this study. The ER localizes around the spindle in the MI stage. Oocytes in which the ER was dispersed throughout the cytoplasm or clustered were categorized as having aberrant ER positioning. We included relevant descriptions in the revised version of the manuscript.

      For the bands of the western blot results, during the experimental procedure we typically capture multiple images at different exposure levels (3-5 images). In the revised manuscript, we have replaced the inappropriate images with more suitable ones.

      (2) Given that all Immunoprecipitation experiments in this manuscript were performed on the whole ovary which contains more somatic cells than oocytes, the results do not necessarily reflect meiotic oocytes. Please consider this possibility during the interpretation.

      Thank you for your suggestion. Yes, we agree with you. In the revised manuscript, we made appropriate modifications to the relevant descriptions.

      (3) 351-365: The conclusion that Arp2/3 compensates for the decreased formin 2 in FMNL2 knockdown oocytes is a bit unconvincing. 1- In mouse oocytes, it is already known that Arp2/3 and formin 2 regulate different pools of F-actin nucleation. 2- The authors found an increase in Arp2/3 in FMNL2 knockdown oocytes compared to control oocytes without any change in cortical F-actin. Given that Arp2/3 is primarily promoting cortical F-actin, it is expected to see an increase in cortical F-actin in FMNL2 knockdown oocytes, which was not the case.

      Thank you for your question. Yes, previous studies showed that Formin2 localizes to the cytoplasm of oocytes and accumulates around the spindle, where it facilitates cytoplasmic actin assembly, while Arp2/3 is primarily responsible for actin assembly at the cortex region of oocytes. In invasive cells, FMNL2 is mainly localized at the leading edge of the cell and at lamellipodia and filopodia tips, where it improves cell migration ability in an actin-based manner (Curr Biol 2012). We showed that FMNL2 localized both at the spindle periphery and at the cortex, but depletion of FMNL2 did not affect cortex actin intensity. We think that FMNL2 and Arp2/3 both contribute to cortex actin dynamics: when FMNL2 decreased, ARP2 increased to compensate, which maintained the cortex actin level. In the revised manuscript, we have made modifications to avoid excessive extrapolation from our results, ensuring that our conclusions are presented in a more objective manner.

      (4) Lines 195-197: The spindle is initially formed soon after the GVBD, so there is no spindle during GVBD. Also, I can't see oocytes at anaphase I or telophase I in this figure. Please revise.

      Thank you for your suggestion. We apologize for the inappropriate descriptions that were used. In the revised manuscript, we have made modifications to the respective descriptions in the Results part.

      (5) Fig. 2E: It seems that the control oocyte is abnormal with mild cytokinesis defects. Please replace or delete it since this information is already included in Fig. 3A.

      Thank you for your suggestion. Based on our observations, during the extrusion of the first polar body in oocytes, there is a temporary occurrence of cellular morphological fragmentation due to cortical reorganization (11h in control oocyte from Fig 2E). However, after the extrusion of the first polar body, the oocyte morphology returns to normal. Figure 2E illustrates the meiotic division process of oocytes, while Figure 3A primarily focuses on the process of oocyte spindle migration. We think that it is better to retain both to present our results.

      Reviewer #3 (Recommendations for The Authors):

      In the case of the observed phenotype, the stage of GV is important. The phenotypes presented also occur in meiotic or developmentally incompetent oocytes. In addition, the images of GV oocytes appear as NSN, which also show the KD phenotype in Figs. 2 and 3.

      Thank you for your concern. As the oocyte grows, the proportion of SN-type oocytes gradually increases. When the oocyte diameter reaches 70-80 μm, the proportion of SN oocytes is approximately 52.7% (Mol Reprod Dev. 1995). In our study, oocytes in both the control and knockdown groups were collected at a diameter of around 80 μm, which corresponds to fully grown oocytes, predominantly in the SN phase. Since the collection period and size of the oocytes were consistent, we are confident that the phenotypic differences observed between the control and knockdown groups are solid and reliable.

      MII is absent in Fig. 1B.

      In the revised manuscript, we added immunostaining images of FMNL2 in MII stage oocytes.

      The result of KD is not convincing. Also, discuss whether the heterozygous effect of Fmnl2 deletion affects reproductive fitness.

      Thank you for your concern. In our investigation, because we were limited in setting up a knockout model, we employed siRNA to knock down FMNL2 expression. To address the risk of off-target effects, we performed a rescue experiment with exogenous mRNA, which we believe resolves this issue. When designing siRNA sequences, we ensured their specificity for binding to FMNL2 mRNA only, and we assessed the levels of FMNL2 and FMNL3 mRNA in oocytes after injection of FMNL2 siRNA. The results showed that, compared to the control group, the expression of FMNL2 mRNA decreased by approximately 70% 18 hours after FMNL2 siRNA injection, while the level of FMNL3 mRNA did not decrease.

      Fig. 2F rescue experiment with double bands. What bands are seen here? Did the authors inject tagged or untagged FMNL2? Or does endogenous FMNL2 appear higher in the sample after KD?

      Thank you for your question. In the rescue experiment, we microinjected exogenous FMNL2-EGFP mRNA into the oocytes. As a result, compared to endogenous FMNL2, the protein size increased due to the addition of the EGFP tag, approximately 27 kDa. Hence, in the Western blot bands of the rescue group, the upper band represents the expression of exogenous FMNL2-EGFP, while the lower band corresponds to the expression of endogenous FMNL2. We provided annotations in the revised Figure 2F to clarify this.

      Variability in mitochondria and ER distribution patterns is also known in healthy and developing oocytes, although the authors described only a single phenotype.

      Thank you for your concern. Yes, mitochondria and ER show dynamic localization at different stages of oocyte maturation. However, in this study we used MI-stage oocytes for the analysis of ER and mitochondria localization, and in the MI stage, both the ER and mitochondria localize around the spindle. This pattern is considered the normal localization. Several studies showed that dispersed or clustered localization contributes to maturation defects. We included relevant descriptions in the revised manuscript.

      What exactly is meant by input in the IP experiments? Why is the target missing in the input sample?

      Thank you for your question. When we subjected the input samples to electrophoresis on a single channel, all the analyzed proteins demonstrated normal expression, thereby confirming the viability of the input sample. However, upon simultaneous exposure with the IP samples, we observed a lack of clear signal for certain proteins in the input group. This is due to the excessive signal intensity resulting from protein enrichment in the IP group, which left the proteins in the input group underexposed.

      Explain the rationale for using, actin or tubulin as loading or normalization controls in the study focusing on the cytoskeleton.

      Thank you for your question. Actin and tubulin are both widely used as controls because of their stable expression. For actin, there are α-actin and β-actin isoforms. Formins and the Arp2/3 complex regulate the polymerization of α-actin and β-actin to form F-actin, not isoform expression. In our study, F-actin (the functional type) was examined. α-tubulin and β-tubulin are two subtypes of tubulin, and they interact with each other to form stable α/β-tubulin heterodimers. Changes in cytoskeleton dynamics do not change the expression of α/β-tubulin. Therefore, β-actin and α-tubulin could be used as normalization controls.

      Fig. 6E shows only , but the legend says *.

      Thank you for pointing out the error. We corrected the mistake in the revised manuscript.

      Spindle positioning appears to differ between control and KD. Does this affect the quantification of Fig. 6F? Adequate nomenclature should be used here.

      Thank you for your question. Yes, spindle positioning was affected by FMNL2 depletion. However, oocytes with a central spindle and those with a cortical spindle are both at the MI stage, and the JC-1 signal is not related to this difference. To avoid misunderstanding, we replaced the representative images and the corresponding description in Figure 6F.

      The description of the methods and legends should be significantly improved.

      Thank you for your suggestion. Reviewers 1 and 2 also raised a similar concern. We enriched the description of methods and legends in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their thoughtful comments. We were pleased that they thought our study was "well crafted and written", "important", and that it provides a "valuable resource for researchers studying color vision". They also expressed several constructive criticisms, concerning – among other things – the lack of details regarding experimental procedures and analysis, the challenge in relating retinal data to cortical recordings, and consistency of results across animals. In response to the reviewers’ comments and following their suggestions, we performed additional analyses, and substantially revised the paper:

      We added a section in the Discussion about "Limitations of the stimulus paradigm". In addition, we added a new Suppl. Figure that illustrates the effect of deconvolution of calcium traces on our results and clarified in the text why we use deconvolved signals for all analyses. The new Suppl. Figure also shows an additional analysis with a more conservative threshold of neuron exclusion.

      We now clarify how retinal signaling relates to our cortical results and rewrote the text to be more conservative regarding our conclusions.

      In addition, we added a new Suppl. Figure showing the key analyses from Figures 2 and 4 separately for each animal. We now mention consistency across animals in the Results section and clearly state which analyses were performed on data pooled across animals.

      We are positive that these additions address the issues raised by the reviewers. Please find our point-by-point replies to all comments below.

      eLife assessment

      Franke et al. explore and characterize the color response properties in the mouse primary visual cortex, revealing specific color opponent encoding strategies across the visual field. The data is solid; however, the evidence supporting some conclusions and details about some procedures are incomplete. In its current form, the paper makes a useful contribution to how color is coded in mouse V1. Significance would be enhanced with some additional analyses and resolution of some technical issues.

      We thank the reviewers for appreciating our manuscript and their thoughtful comments.

      Referee 1 (Remarks to the Author):

      Summary:

      In this study, Franke et al. explore and characterize the color response properties across the primary visual cortex, revealing specific color opponent encoding strategies across the visual field. The authors use awake-behaving 2P imaging to define the spectral response properties of visual interneurons in Layer 2/3. They find that opponent responses are more prominent at photopic light levels, and diversity in color opponent responses exists across the visual field, with green ON/ UV OFF responses being more strongly represented in the upper visual field. This is argued to be relevant for detecting certain features that are more salient when the chromatic space is used, possibly due to noise reductions.

      Strengths:

      The work is well crafted and written and provides a thorough characterization that reveals an uncharacterized diversity of visual properties in V1. I find this characterization important because it reveals how strongly chromatic information can modulate the response properties in V1. In the upper visual field, 25% of the cells differentially relay chromatic information, and one may wonder how this information will be integrated and subsequently used to aid vision beyond the detection of color per se. I personally like the last paragraph of the discussion that highlights this fact.

      We thank the reviewer for appreciating our manuscript.

      Weaknesses: One major point highlighted in this paper is the fact that Green ON/UV OFF responses are not generated in the retina. But glancing through the literature, I saw this is not necessarily true. Fig 1. of Joesch and Meister, a paper cited, shows this can be the case. Thus, I would not emphasize that this wasn’t present in the retina. This is a minor point, but even if the retina could not generate these signals, I would be surprised if the diversity of responses would only arise through feed-forward excitation, given the intricacies of cortical connectivity. Thus, I would argue that the argument holds for most of the responses seen in V1; they need to be further processed by cortical circuitries.

      We thank the reviewer for this comment. When analyzing available data from the retina using a similar center-surround color flicker stimulus (Szatko et al. 2020), we found that Green On/UV Off color opponency is very rare in the RF center of retinal ganglion cells (Suppl. Fig. 5). This suggests that center Green On/UV Off color opponency in V1 neurons is not inherited by the RF center of retinal neurons. However, we agree with the reviewer that retinal neurons might still contribute to V1 color opponency, for example by being center-surround color opponent (e.g. Joesch et al. 2016 and Szatko et al. 2020). We rephrased the text to acknowledge this fact.

      This takes me to my second point, defining center and surround. The center spot is 37.5 deg of visual angle, more than 1 mm of the retinal surface. That means that all retinal cells, at least half and most likely all of their surrounds will also be activated. Although 37.5 deg is roughly the receptive field size previously determined for V1 neurons, the one-to-one comparison with retinal recording, particularly with their center/surround properties, is difficult. This should be discussed. I assume that the authors tried a similar approach with sparse or dense checker white noise stimuli. If so, it would be interesting if there were better ways of defining the properties of V1 neurons on their complex/simple receptive field properties to define how much of their responses are due to an activation of the true "center" or a coactivation of the surround. Interestingly, at least some of the cells (Fig. 1d, cells 2 and 5) don’t have a surround. Could it be that in these cases, the "center" and "surround" are being excited together? How different would the overall statistics change if one used a full-field flicker stimulus instead of a center/surround stimulus? How stable are the results if the center/surround flicker stimulus is shifted? These results won’t change the fact that chromatic coding is present in the VC and that there are clear differences depending on their position, but it might change the interpretation. Thus, I would encourage you to test these differences and discuss them.

      Thanks for this comment. We agree with the reviewer that a one-to-one comparison of retina and V1 data is challenging, due to differences in both RF and stimulus size. We rephrased the Results text to clarify this point and now also mention it in the Discussion.

      To be able to record from many V1 neurons simultaneously, we used a stimulus size of 37.5 degree visual angle in diameter, which is slightly larger than center RFs of single V1 neurons. As the reviewer mentions, the disadvantage of this approach is that the stimulus is only roughly centered on the neurons’ center RFs. To reduce the impact of potential stimulus misalignment on our results, we used the following steps:

      For each recording, we positioned the monitor such that the mean RF across all neurons lies within the center of the stimulus field of view.

      We confirmed that this procedure results in good stimulus alignment for the large majority of recorded neurons within individual recording fields by using a sparse noise stimulus (Suppl. Fig. 1a-c). Specifically, we found that for 83% of tested neurons, more than two thirds of their center RF, determined by the sparse noise stimulus, overlapped with the center spot of the color noise stimulus.

      For analysis, we excluded neurons without a significant center STA, which may be caused by misalignment of the stimulus.

      Together, we believe these points strongly suggest that the center spot and the surround annulus of the noise stimulus predominantly drive center (i.e. classical RF) and surround (i.e. extraclassical RF), respectively, of the recorded V1 neurons. This is further supported by the fact that color response types identified using an automated clustering method were robust across mice (Suppl. Fig. 6c), indicating consistent stimulus centering.
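      The alignment check in the second step can be illustrated with a small sketch. This is not the analysis code used in the study: representing a center RF as a list of (x, y) positions in degrees of visual angle, the function names, and the default arguments are all assumptions made for illustration. Only the 37.5 degree spot diameter and the two-thirds criterion come from the text above.

```python
import math


def overlap_fraction(rf_positions_deg, spot_center_deg, spot_diameter_deg=37.5):
    """Fraction of a neuron's center-RF positions that lie inside the circular center spot."""
    if not rf_positions_deg:
        raise ValueError("rf_positions_deg must not be empty")
    radius = spot_diameter_deg / 2
    cx, cy = spot_center_deg
    inside = sum(
        1 for x, y in rf_positions_deg if math.hypot(x - cx, y - cy) <= radius
    )
    return inside / len(rf_positions_deg)


def fraction_well_aligned(rf_list, spot_center_deg, threshold=2 / 3):
    """Fraction of neurons whose RF overlap with the center spot exceeds the threshold."""
    if not rf_list:
        raise ValueError("rf_list must not be empty")
    overlaps = [overlap_fraction(rf, spot_center_deg) for rf in rf_list]
    return sum(1 for value in overlaps if value > threshold) / len(overlaps)
```

      In this sketch, `fraction_well_aligned` plays the role of the summary value quoted above (the share of tested neurons with more than two thirds of their center RF inside the center spot).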

      Nevertheless, we cannot exclude that the stimulus was misaligned for a subset of the recorded neurons used for analysis. We agree with the reviewer that such misalignment might have contributed to cells not having surround STAs, due to simultaneous activation of antagonistic center and surround RF components by the surround stimulus. While a full-field stimulus would get rid of the misalignment problem, it would not allow us to study color tuning in center and surround RF components separately. Instead, one could compare the results of our approach with an approach that centers the stimulus on individual neurons. However, we believe that performing these additional experiments is out of the scope of the current study.

      To acknowledge the experimental limitations of our study and the concerns brought up by the reviewer, we now explicitly mention the steps we perform to reduce the effects of stimulus misalignment in the Results section and discuss the problem of stimulus alignment in the Discussion. We believe these changes will help the reader to interpret our results.
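      For readers less familiar with the STAs referred to in this reply, the sketch below shows the general idea of a response-weighted stimulus average for a single stimulus channel. It is a generic illustration under stated assumptions, not the authors' pipeline: the function name, the one-dimensional stimulus, and the use of non-negative per-frame activity values (for example, deconvolved calcium signals) as weights are all assumptions.

```python
def spike_triggered_average(stimulus, response, n_lags):
    """Response-weighted average of the stimulus preceding each frame.

    stimulus: per-frame stimulus values for one channel (e.g. one color in one region).
    response: per-frame non-negative activity values, same length as stimulus.
    Returns a list of length n_lags; index 0 is the same frame as the response,
    index k is k frames earlier.
    """
    if len(stimulus) != len(response):
        raise ValueError("stimulus and response must have the same length")
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    sta = [0.0] * n_lags
    total_weight = 0.0
    # Start at n_lags - 1 so that every lag has a valid stimulus frame.
    for t in range(n_lags - 1, len(stimulus)):
        weight = response[t]
        if weight == 0:
            continue
        total_weight += weight
        for lag in range(n_lags):
            sta[lag] += weight * stimulus[t - lag]
    if total_weight == 0:
        return sta  # no activity: the average is left at zero
    return [value / total_weight for value in sta]
```

      A neuron with a flat (non-significant) average for the center channel would be one candidate for exclusion in the sense described above; how significance is assessed is not covered by this sketch.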

      Referee 2 (Remarks to the Author):

      Summary:

      Franke et al. characterize the representation of color in the primary visual cortex of mice and how it changes across the visual field, with a particular focus on how this may influence the ability to detect aerial predators. Using calcium imaging in awake, head-fixed mice, they characterize the properties of V1 neurons (layer 2/3) using a large center-surround stimulation where green and ultra-violet were presented in random combinations. Using a clustering approach, a set of functional cell-types were identified based on their preference to different combinations of green and UV in their center and surround. These functional types were demonstrated to have varying spatial distributions in V1, including one neuronal type (Green-ON/UV-OFF) that was much more prominent in the posterior V1 (i.e. upper visual field). Modelling work suggests that these neurons likely support the detection of predator-like objects in the sky.

      Strengths:

      The large-scale single-cell resolution imaging used in this work allows the authors to map the responses of individual neurons across large regions of the visual cortex. Combining this large dataset with clustering analysis enabled the authors to group V1 neurons into distinct functional cell types and demonstrate their relative distribution in the upper and lower visual fields. Modelling work demonstrated the different capacity of each functional type to detect objects in the sky, providing insight into the ethological relevance of color opponent neurons in V1.

      We thank the reviewer for appreciating our manuscript.

      Weaknesses:

      While the study presents solid evidence a few weaknesses exist, including the size of the dataset, clarity regarding details of data included in each step of the analysis and discussion of caveats of the work. The results presented here are based on recordings of 3 mice. While the number of neurons recorded is reasonably large (n > 3000) an analysis that tests for consistency across animals is missing. Related to this, it is unclear how many neurons at each stage of the analysis come from the 3 different mice (except for Suppl. Fig 4).

      Thank you for this comment. We apologize that the original manuscript did not clearly indicate the consistency of our results across animals. We have revised the manuscript in the following ways:

      We have added an additional Suppl. Figure, which shows the variability of the data within and across animals (Suppl. Fig. 4). Specifically, we show the distribution of color and luminance selectivity for (i) center and surround components of V1 RFs and (ii) for upper and lower visual field. This data is used for all analyses shown in Figures 2-4. The figure legend of this figure also states the number of neurons per animal.

      We now clearly state in the Results section that all analyses in the main figures were performed by pooling data across animals, and refer to the Suppl. Figures for consistency across animals.

      We believe these changes help the reader to interpret our results.

      Finally, the paper would greatly benefit from a more in depth discussion of the caveats related to the conclusion drawn at each stage of the analysis. This is particularly relevant regarding the caveats related to using spike triggered averages to assess the response preferences of ON-OFF neurons, and the conclusions drawn about the contribution of retinal color opponency.

      Thanks. We substantially revised the text to discuss caveats and limitations of the approach. For example, we added a section to the Discussion called "Limitations of the stimulus paradigm". In addition, we clarified how retinal signals relate to cortical ones and phrased our conclusions more conservatively.

      The authors provide solid evidence to support an asymmetric distribution of color opponent cells in V1 and a reduced color contrast representation in lower light levels. Some statements would benefit from more direct evidence such as the integration of upstream visual signals for color opponency in V1.

      Based on the comments from Reviewer 1, we have rephrased the statements regarding the integration of upstream visual signals for color opponency in V1. We think these revisions increase the clarity of the results and help the reader with interpretation.

      Overall, this study will be a valuable resource for researchers studying color vision, cortical processing, and the processing of ethologically relevant information. It provides a useful basis for future work on the origin of color opponency in V1 and its ethological relevance.

      Thanks! We thank the reviewer again for the helpful comments.

      Referee 3 (Remarks to the Author):

      This paper studies chromatic coding in mouse primary visual cortex. Calcium responses of a large collection of cells are measured in response to a simple spot stimulus. These responses are used to estimate chromatic tuning properties - specifically sensitivity to UV and green stimuli presented in a large central spot or a larger still surrounding region. Cells are divided based on their responses to these stimuli into luminance or chromatic sensitive groups. Several technical concerns limit how clearly the data support the conclusions. If these issues can be fixed, the paper would make a valuable contribution to how color is coded in mouse V1.

      We thank the reviewer for the helpful comments.

      Analysis: The central tool used to analyze the data is a "spike triggered average" of the responses to randomly varying stimuli. There are several steps in this analysis that are not documented, and hence evaluating how well it works is difficult. Central to this is that the paper does not measure spikes. Instead, measured calcium traces are converted to estimated spike rates, which are then used to estimate STAs. There are no raw calcium traces shown, and the approach to estimate spike rates is not described in any detail. Confirming the accuracy of these steps is essential for a reader to be able to evaluate the paper. Further, it is not clear why the linear filters connecting the recorded calcium traces and the stimulus cannot be estimated directly, without the intermediate step of estimating spike rates.

      Thank you for this comment. We have used the genetically encoded calcium sensor GCaMP6s in our recordings. This sensor is a very sensitive GCaMP6 variant, but also one with slow kinetics. To remove the effect of the slow sensor kinetics from recorded calcium responses, the recorded traces are commonly deconvolved with the impulse function of the sensor to obtain the deconvolved calcium traces. We now include this reasoning in the Results section. To illustrate the effect of the deconvolution, we added a new Suppl. Figure (Suppl. Fig. 2) showing raw calcium and deconvolved traces, and the STAs estimated from both types of traces. This illustrates that the results regarding neuronal color preferences are consistent across raw and deconvolved calcium traces.
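      As a rough illustration of the deconvolution step described above, the sketch below inverts a first-order autoregressive (single-exponential) model of the sensor. This is an assumption made for illustration only: the response does not state which deconvolution algorithm was used, and the function name `deconvolve_ar1`, the frame rate, and the decay constant are all hypothetical.

```python
import math


def deconvolve_ar1(trace, frame_rate_hz, decay_tau_s):
    """Invert an assumed AR(1) sensor model c[t] = gamma * c[t-1] + s[t].

    `decay_tau_s` is a hypothetical sensor decay constant; the response
    does not give one. Negative estimates are clipped to zero.
    """
    # Per-frame decay factor of the assumed exponential impulse response.
    gamma = math.exp(-1.0 / (frame_rate_hz * decay_tau_s))
    deconvolved = []
    previous = 0.0
    for value in trace:
        # Subtract the decayed remainder of the previous sample.
        deconvolved.append(max(0.0, value - gamma * previous))
        previous = value
    return deconvolved
```

      A trace that decays exactly with the assumed time constant after a single event would be reduced to that one event, which is the sense in which deconvolution removes the slow sensor kinetics.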

      We agree with the reviewer that the term STA might be confusing. We have replaced it with the term "event-triggered average" (ETA). In addition, we have replaced the phrase "estimated spike rate" with "deconvolved calcium trace" throughout the manuscript because the unit of the deconvolved traces is not interpretable in the way a spike rate (spikes per second) would be. In the revised version, we now clarify in the Methods section that we estimate the ETAs based on deconvolved calcium traces, which are correlated with, and serve as an approximation of, spike rate.
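      For readers unfamiliar with the term, an event-triggered average can be sketched as a response-weighted average of the stimulus history. The version below is a minimal sketch that assumes the deconvolved trace is used directly as the weight and that stimulus and response are sampled on the same frames; the function name and both assumptions are illustrative and are not taken from the manuscript.

```python
def event_triggered_average(stimulus, response, n_lags):
    """Weighted mean of the stimulus at lags 0..n_lags-1 before each sample.

    `response` stands in for a deconvolved calcium trace (arbitrary units);
    samples with non-positive values contribute nothing.
    """
    if len(stimulus) != len(response):
        raise ValueError("stimulus and response must have the same length")
    eta = [0.0] * n_lags
    total_weight = 0.0
    for t in range(n_lags - 1, len(stimulus)):
        weight = response[t]
        if weight <= 0:
            continue
        total_weight += weight
        for lag in range(n_lags):
            # eta[lag] accumulates the stimulus `lag` frames before sample t.
            eta[lag] += weight * stimulus[t - lag]
    if total_weight == 0:
        return eta
    return [value / total_weight for value in eta]
```

      In this sketch, one such kernel would be computed per stimulus condition (for example, UV center or green surround) by passing that condition's contrast sequence as `stimulus`.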

      A further issue about the STAs is that the inclusion criterion (correlation of predicted vs measured responses of 0.25) is pretty forgiving. It would be helpful to see a distribution of those correlation values, and some control analyses to check whether the STA is providing a sufficiently accurate measure to support the results (e.g. do the central results hold for the cells with the highest correlations).

      We thank the reviewer for this comment. To exclude noisy neurons from analysis, we used the following procedure:

      For each of the four stimulus conditions (center and surround for green and UV stimuli), kernel quality was measured by comparing the variance of the STA with the variance of the baseline, defined as the first 500 ms of the STA. Only cells with at least 10-times more variance of the kernel compared to baseline for UV or green center STA were considered for further analysis.

      We have added the distribution of quality values to a new Suppl. Figure (Suppl. Fig. 2d,e). We now also show the percentage of neurons above threshold, given different quality thresholds. Finally, we have repeated the analysis shown in Figure 2 for a much more conservative threshold, including only the top 25% of neurons (Suppl. Fig. 2e,f). We now mention this new analysis in the Methods and Results section.
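      The inclusion criterion described above can be written down compactly. The sketch below assumes that "variance of the STA" means the variance over the whole kernel and that the kernel is sampled at a fixed interval `dt_ms`; both points, and the function names, are assumptions made for illustration rather than details given in the response.

```python
from statistics import pvariance


def kernel_quality(kernel, dt_ms, baseline_ms=500.0):
    """Ratio of kernel variance to baseline variance.

    The baseline is the first `baseline_ms` of the kernel, as in the text;
    using the whole kernel in the numerator is an assumption.
    """
    n_baseline = int(round(baseline_ms / dt_ms))
    if n_baseline < 2 or n_baseline >= len(kernel):
        raise ValueError("baseline must span 2+ samples and be shorter than the kernel")
    baseline_variance = pvariance(kernel[:n_baseline])
    if baseline_variance == 0:
        return float("inf")
    return pvariance(kernel) / baseline_variance


def passes_inclusion(uv_center, green_center, dt_ms, threshold=10.0):
    """Keep a cell if its UV or green center kernel reaches the threshold."""
    return (kernel_quality(uv_center, dt_ms) >= threshold
            or kernel_quality(green_center, dt_ms) >= threshold)
```

      Raising `threshold` corresponds to the more conservative control analysis mentioned above, in which only the best-quality neurons are retained.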

      Limitations of stimulus choice: The paper relies on responses to a large (37.5 degree diameter) modulated spot and surrounding region. This spot is considerably larger than the receptive fields of both V1 cells and retinal ganglion cells. As a result, the spot itself is very likely to strongly activate both center and surround mechanisms, and responses of cells are likely to depend on where the receptive fields are located within the spot (and, e.g., how much of the true neural surround samples the center spot vs the surround region). The impact of these issues on the conclusions is considered briefly at the start of the results but needs to be evaluated in considerably more detail. This is particularly true for retinal ganglion cells given the size of their receptive fields (see also next point).

      We agree with the reviewer that the centering of the stimulus is critical and apologize if this point was not discussed sufficiently. To be able to record from many V1 neurons simultaneously, we used a stimulus size of 37.5 degree visual angle in diameter, which is slightly larger than center RFs of single V1 neurons. As the reviewer mentions, the disadvantage of this approach is that the stimulus is only roughly centered on the neurons’ center RFs. To reduce the impact of potential stimulus misalignment on our results, we have used different experimental and analysis steps and controls (see also second comment of Reviewer 1):

      For each recording, we positioned the monitor such that the mean RF across all neurons lies within the center of the stimulus field of view.

      We confirmed that this procedure results in good stimulus alignment for the large majority of recorded neurons within individual recording fields by using a sparse noise stimulus (Suppl. Fig. 1a-c). Specifically, we found that for 83% of tested neurons, more than two thirds of their center RF, determined by the sparse noise stimulus, overlapped with the center spot of the color noise stimulus.

      For analysis, we excluded neurons without a significant center STA, which may be caused by misalignment of the stimulus.

      We now mention those clearly in the Results section and added the limitations of our approach to the Discussion section.

      Comparison with retina: A key conclusion of the paper is that the chromatic tuning in V1 is not inherited from retinal ganglion cells. This conclusion comes from comparing chromatic tuning in a previously-collected data set from retina with the present results. But the retina recordings were made using a considerably smaller spot, and hence it is not clear that the comparison made in the paper is accurate. This issue may be handled by the analysis presented in the paper, but if so it needs to be described more clearly. The paper from which the retina data is taken argues that rod-cone chromatic opponency originates largely in the outer retina. This mechanism would be expected to be shared across retinal outputs. Thus it is not clear how the Green-On/UV-Off vs Green-Off/UV-On asymmetry could originate. This should be discussed.

      We agree with the reviewer that a one-to-one comparison of retina and V1 data is challenging, due to differences in both RF and stimulus size. We rephrased the Results text to clarify this point and now also mention it in the Discussion.

      When analyzing available data from the retina using a similar center-surround color flicker stimulus (Szatko et al. 2020), we found that Green On/UV Off color opponency is very rare in the RF center of retinal ganglion cells (Suppl. Fig. 5). This suggests that center Green On/UV Off color opponency in V1 neurons is not inherited from the RF center of retinal neurons. However, we agree with the reviewer that retinal neurons might still contribute to V1 color opponency, for example by being center-surround color opponent (e.g. Joesch et al. 2016 and Szatko et al. 2020). We rephrased the text to acknowledge this fact.

      Residual chromatic cells at low mesopic light levels: The presence of chromatically tuned cells at the lowest light level probed is surprising. The authors describe these conditions as rod-dominated, in which case chromatic tuning should not be possible. This again is discussed only briefly. It either reflects the presence of an unexpected pathway that amplifies weak cone signals under low mesopic conditions such that they can create spectral opponency or something amiss in the calibrations or analysis. Data collected at still lower light levels would help resolve this.

      Thank you for this comment. We call the lowest light level "low mesopic" and "rod-dominated" because the spectral contrast of V1 center responses in posterior recording fields is green-shifted for this light level (Fig. 3a). This is only expected if responses in the UV-cone dominant ventral retina are predominantly driven by rod photoreceptors. We now explain this rationale in the Results section. In addition, we mention in the Discussion that future studies are required to test whether cone signals need to be amplified for low light levels. While we agree with the reviewer that it would be exciting to use even lower light levels during recordings, we believe this is out of the scope of the current study due to the technical challenges involved in achieving scotopic stimulation.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We have revised the manuscript mainly in the following aspects: (1) data on the electrophysiological and behavioral responses of larvae and adults to trehalose have been added, and the related figures and text have been modified accordingly; (2) photos of the taste organs of larvae and adults indicating the position of the recorded sensilla have been added; (3) the potential off-target effects of GR knock-out on the expression of other GRs have been carefully explained and the relevant text revised; (4) the abstract has been revised to present the findings more technically in a limited number of words; (5) some experimental details in Materials and Methods and some new references have been added; (6) a new figure (Figure 8) summarizing the main findings of the study has been added.

      In the following, we respond to the reviewers’ comments and suggestions one by one. We hope that our answers will satisfy you and the three reviewers. We would also be very happy to receive further valuable advice from you.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The process of taste perception is significantly more intricate and complex in Lepidopteran insects. This investigation provides valuable insights into the role of Gustatory receptors and their dynamics in the sensation of sucrose, which serves as a crucial feeding cue for insects. The article highlights the differential sensitivity of Grs to sucrose and their involvement in feeding and insect behavior.

      Strengths:

      To support the notion of the differential specificity of Gr to sucrose, this study employed electrophysiology, ectopic expression of Grs in Xenopus, genome editing, and behavioral studies on insects. This investigation offers a fundamental understanding of the gustation process in lepidopteran insects and its regulation of feeding and other gustation-related physiological responses. This study holds significant importance in advancing our comprehension of lepidopteran insect biology, gustation, and feeding behavior.

      Thank you for your recognition of our research.

      Weaknesses:

      While this manuscript demonstrates technical proficiency, there exists an opportunity for additional refinement to optimize comprehensibility for the intended audience. Several crucial sugars have been overlooked in the context of electrophysiology studies and should be incorporated. Furthermore, it is imperative to consider the potential off-target effects of Gr knock-out on other Gr expressions. This investigation focuses exclusively on Gr6 and Gr10, while neglecting a comprehensive narrative regarding other Grs involved in sucrose sensation.

      We accept the reviewer's suggestion. Because trehalose is a main sugar in insect blood, and it is converted by insects after feeding on plant sugars, we have added the new data on electrophysiological and behavioral responses of larvae and adults of Helicoverpa armigera to trehalose (see Figure 1-2, Figure 1-figure supplement 1, Figure 2-figure supplement 1). Now, the total eight sugars include 2 pentoses (arabinose and xylose), 4 hexoses (fructose, fucose, galactose and glucose), and 2 disaccharides (sucrose and trehalose), which were chosen because they are mainly present in host-plants of H. armigera and/or representative in the structure and source of sugars.

      We fully agree with the reviewer and have taken into consideration the potential off-target effects of the CRISPR/Cas9 knockout of Gr6 and Gr10 on other GRs. To predict the potential off-target sites of the Gr6 and Gr10 sgRNAs used to establish homozygous mutants, we first used the online software Cas-OFFinder (http://www.rgenome.net/cas-offinder/) to search the genome of the wild-type cotton bollworm, with the number of mismatches set to less than or equal to 3. We found that the Gr10 sgRNA had no potential off-target site, and the Gr6 sgRNA had only one. Therefore, we designed primers according to the sequence of the potential off-target site of the Gr6 sgRNA, conducted PCR using genomic DNA of the homozygous mutant as a template, performed Sanger sequencing on the PCR products, and found that the potential off-target site did not differ from that of the wild type. In particular, because the Gr6 and Gr10 sgRNAs might produce off-target effects on other sugar receptor genes of H. armigera, we conducted the same off-target site analysis with the designed sgRNAs on each of the other eight sugar receptor genes, and found that there were no off-target sites on these receptor genes (see Line 254-256).
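      The mismatch criterion used in the off-target prediction can be illustrated with a short sketch. The code below is not Cas-OFFinder and does not reproduce its interface; it is a simplified, hypothetical scan that reports every window of a sequence within a given number of mismatches of a guide on either strand, and it ignores details a real tool handles, such as PAM constraints. All function names are illustrative.

```python
def reverse_complement(sequence):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(sequence))


def count_mismatches(first, second):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(first, second))


def find_candidate_sites(genome, guide, max_mismatches=3):
    """Return (position, strand, mismatches) for each near-match window."""
    hits = []
    width = len(guide)
    queries = (("+", guide), ("-", reverse_complement(guide)))
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        for strand, query in queries:
            mismatches = count_mismatches(window, query)
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return hits
```

      In such a scan, the intended target appears as a hit with zero mismatches, and any additional hit is a candidate off-target site that would then be checked experimentally, for example by PCR and Sanger sequencing as described above.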

      Reviewer #2 (Public Review):

      Summary:

      To identify sugar receptors and assess the capacity of these genes the authors first set out to identify behavioral responses in larvae and adults as well as physiological response. They used phylogenetics and gene expression (RNAseq) to identify candidates for sugar reception. Using first an in vitro oocyte system they assess the responses to distinct sugars. A subsequent genetic analysis shows that the Gr10 and Gr6 genes provide stage specific functions in sugar perception.

      Strengths:

      A clear strength of the manuscript is the breadth of techniques employed allowing a comprehensive study in a non-canonical model species.

      Thank you for your recognition of our research.

      Weaknesses:

      There are no major weaknesses in the study for the current state of knowledge in this species. Since it is much basic work to establish a broader knowledge, context with other modalities remains unknown. It might have been possible to probe certain contexts known from the fruit fly, which would have strengthened the manuscript.

      Thank you so much for your suggestion. Following this suggestion, we added some sentences on sugar sensing and the related behaviors of fruit fly larvae to the Introduction and Discussion sections (Line 68-71 in the Introduction section, Line 395-399 in the Discussion section).

      Reviewer #3 (Public Review):

      In this study, the authors combine electrophysiology, behavioural analyses, and genetic editing techniques on the cotton bollworm to identify the molecular basis of sugar sensing in this species.

      The larval and adult forms of this species feed on different plant parts. Larvae primarily consume leaves, which have relatively lower sugar concentrations, while adults feed on nectar, rich in sugar. Through a series of experiments (spanning electrophysiological recordings from both larval and adult sensillae, qPCR expression analysis of identified GRs from these sensillae, response profiles of these GRs to various sugars via heterologous expression in Xenopus oocytes, and evaluations of CRISPR mutants based on these parameters), the authors discovered that larvae and adults employ distinct GRs for sugar sensing. While the larva uses the highly sensitive GR10, the adult uses the less sensitive and broadly tuned GR6. This differential use of GRs is in keeping with their behavioral ecology.

      The data are cohesive and consistently align across the methodologies employed. They are also well presented and the manuscript is clearly written.

      Recommendations for the authors:

      While appreciating the quality of the work and its presentation, we have a few comments for the authors, should they wish to consider them, that would significantly improve the presentation of the work.

      Title: Could the authors please revisit their title to better reflect the main finding of their work?

      The title has been changed into “The larva and adult of Helicoverpa armigera use differential gustatory receptors to sense sugars”.

      Text: There are a few comments related to the text, and these are listed below:

      (1) Could the authors place their work in the context of what's known about sugar sensing in Drosophila larva and adult?

      In the Introduction section, we added the status of research on sugar perception in Drosophila larvae, pointing out "No external sugar-sensing mechanism in Drosophila larvae has yet been characterized." (Line 70-71); in the Discussion section, the research progress of sugar sensing in Drosophila adults and larvae was also summarized (Line 397-399).

      (2) For each results section, could the authors please include a sentence or two that interprets the data in the context of previously presented data?

      We accept the reviewer's suggestion. To make the Results easier for readers to follow, we have included, at the beginning of each part of the Results, a sentence that interprets the preceding data, while avoiding duplication.

      (3) Could the authors please provide details of the generation and screening of the CRISPR mutants?

      We have added more details on mutant establishment and screening in the Materials and Methods section (Line 722-726, 729-732).

      Figures: Could the authors please include images and schematics wherever possible? For example, a schematic depicting the position of the sense organs and one summarising the main findings of the studies.

      In Figure 1 we added the photo of each taste organ, on which the recorded sensilla were indicated. We also added a new figure, Figure 8, summarizing the main findings of the study.

      Choice of Sugars: Could the authors please justify their choice of sugars they have used in the analyses?

      In the first paragraph of the Results section of the article, we further explain the reasons for using the sugars in the study. “We first investigated the electrophysiological responses of the lateral and medial sensilla styloconica in the larval maxillary galea to eight sugars. These sugars were chosen because they are mostly found in host-plants of H. armigera or are representative in the structure and source of sugars.”

      In addition to this, there are several specific comments in the detailed reviewers comments below, which the authors could consider responding to.

      Reviewer #1 (Recommendations For The Authors):

      The article titled "Sucrose taste receptors exhibit dissimilarities between larval and adult stages of a moth" by Shuai-Shuai Zhang and colleagues provides an intriguing analysis. The authors have conducted a meticulously planned and executed study. However, I do have some inquiries.

      (1) What precisely does the term "differ" signify in the title? It can be expounded upon in terms of differing in expression or sensitivity. The title could benefit from being more informative. The authors should appropriately specify the insect species in the title of the paper. This would make it more comprehensible to readers. Merely mentioning the term "moth" does not provide any information about the model organism. Hence, it would be preferable to mention Helicoverpa armigera instead of using the generic term "moth" in the title.

      Thank you for your suggestions. We considered it better to emphasize that the receptors for sucrose are different, and we have accepted the suggestion of adding the name of the animal. The title has been changed into “The larva and adult of Helicoverpa armigera use differential gustatory receptors to sense sugars”.

      (2) The abstract is written in a simple and easily understandable manner, but it overlooks important findings from a technical standpoint.

      We have added some key experimental techniques to the Abstract to illustrate the important findings.

      (3) Almost all herbivorous insects are said to consume plants and utilize sucrose as a stimulus for feeding, as stated by the authors. Sucrose, glucose, and fructose sugar are among the commonly observed stimulants for feeding in numerous insects. It would be appropriate to incorporate not only sucrose but also glucose and fructose as feeding stimulants for almost all herbivorous insects.

      Thank you for your suggestion. Sucrose is the major sugar in plants, and its concentration varies greatly from tissue to tissue, while the concentration of the hexose sugars is much lower and the concentration does not change much. In Line 48, we state that sucrose, glucose, and fructose are feeding stimuli for herbivorous insects. From the previous studies, it seems that sucrose is the strongest, followed by fructose, and finally glucose. The cotton bollworm larvae showed no electrophysiological and behavioral response to glucose.

      (4) The reason why trehalose is not considered in the electrophysiology analysis is unclear. Given that trehalose is a major sugar in insects and plants, it would be intriguing to include it in the analysis.

      We have accepted the reviewer's suggestion, and supplemented the electrophysiological responses of taste organs in larvae and adults of Helicoverpa armigera to trehalose (Figure 1, Figure 1-Figure Supplement 1), and also tested the behavioral responses of the larvae and adults to trehalose (Figure 2, Figure 2-Figure Supplement 1). Therefore, all the related figures have been changed.

      (5) The author's intention regarding the co-receptor relationship between Gr5 and Gr6 (line 211) is unclear. If this is indeed the case, then the reason for considering Gr5 in further studies remains uncertain.

      We have changed the sentence as follows: “Since Gr5 was highly expressed with Gr6 in the proboscis and tarsi (Figure 3D-3E, Figure 3—figure supplement 1), we suspected that Gr5 and Gr6 might be expressed in the same cells, and then tested the response profile of their co-expression in oocytes.”

      (6) The homologous nature of Grs is emphasized by the authors. It is not specified how the author ensured that the guide RNA targeting Gr6 or Gr10 did not result in off-target effects on other Grs.

      Thank you so much for your suggestion. We have rewritten the relevant paragraph (Line 238-251), detailing our tests and the results on the potential off-target effects of knocking out GRs by CRISPR/Cas9: “In order to predict the potential off-target sites of sgRNA of Gr6 and Gr10, we used online software Cas-OFFinder (http://www.rgenome.net/cas-offinder/) to blast the genome of H. armigera, and the mismatch number was set to less than or equal to 3. According to the predicted results, the Gr10 sgRNA had no potential off-target region but Gr6 sgRNA had one. Therefore, we amplified and sequenced the potential off-target region of Gr6-/- and found there was no frameshift or premature stop codon in the region compared to WT (Figure 5—figure supplement 2). It is worth mentioning that there was no potential off-target region of Gr6 and Gr10 sgRNA in other sugar receptor genes of H. armigera, Gr4, Gr5, Gr7, Gr8, Gr9, Gr11 and Gr12. We further found there was no difference in the response to xylose of the medial sensilla styloconica among WT, Gr10-/- and Gr6-/- (Figure 5—figure supplement 2). Furthermore, WT, Gr10-/- and Gr6-/- did not show differences in the larval body weight, adult lifespan, and number of eggs laid per female (Figure 5—figure supplement 2). All these results suggest that no off-target effects occurred in the study.”

      (7) Is it possible that knocking out Gr10 is not compensated for by the overexpression of Gr6 or other sucrose sensing Grs? Similarly, would the vice versa scenario hold true?

      In the Discussion section, we have added some sentences to discuss this issue: “From our results, knocking out Gr10 or Gr6 is unlikely to be compensated by overexpression of other sugar GRs. One of our recent studies showed that Orco knockout had no significant effect on the expression of most OR, IR and GR genes in adult antennae of H. armigera, but some genes were up- or down-regulated (Fan et al., 2022).”

      (8) What was the rationale for selecting nine candidate GR genes for expression analysis?

      Based on the reviewer's suggestion, we expanded the relevant paragraphs to illustrate the rationale for selecting nine candidate GR genes for expression analysis: “To reveal the molecular basis of sugar reception in the taste sensilla of H. armigera, we first analyzed the putative sugar gustatory receptor genes based on the reported gene sequences of GRs in H. armigera and their phylogenetic relationship of D. melanogaster sugar gustatory receptors (Jiang et al., 2015; Pearce et al., 2017; Xu et al., 2017). Nine putative sugar GR genes, Gr4–12 were identified, and their full-length cDNA sequences were cloned (The GenBank accession number is provided in Appendix—Table S1).” (Line 155-161)

      (9) What is the potential reason for the difference between the major larval sugar receptors of Drosophila and Lepidopterans?

      The difference between the major larval sugar receptors of Drosophila and Lepidopterans is probably due to differences in the food their larvae feed on. Fruit fly larvae feed on rotten fruit, the main sugar of which is fructose. The larvae of Lepidoptera mainly feed on plants, and the main sugar is sucrose. In the Discussion section, we have added a sentence “This is most likely due to fruit fly larvae feeding on rotten fruits, which contain fructose as the main sugar.” (Line 399-401)

      (10) There is a disparity in GRs, specifically GR5 and GR6, between the female antenna, proboscis, and tarsi. What could be the possible justification and significance of this?

      Thank you so much for this question. We have added a sentence in the Discussion section, “In this study, the expression patterns of 9 sugar GRs in three taste organs of adult H. armigera show that there is a disparity in GRs, specifically GR5 and GR6, between the female antenna, tarsi and proboscis, which may be an evolutionary adaptation reflecting subtle differentiation in the function of these taste organs in adult foraging. Antennae and tarsi play a role in the exploration of potential sugar sources, while the proboscis plays a more precise role in the final decision to feed.” (Line 433-438)

      (11) I suggest that a visual representation illustrating the positioning of GSNs, particularly the lateral and medial sensilla, in both larva and adult stages would enhance the correlation with the results.

      In Figure 1 we added the photo of each taste organ and the position of the recorded sensilla, and also added a new figure, Figure 8 summarizing the main findings of the studies.

      (12) Further experiments can be conducted to elucidate the precise molecular mechanisms, particularly the downstream effects of GRs, in order to establish the specificity of GRs more convincingly.

      Thank you so much for your suggestion. We have discussed the further experiments in the Discussion section, “To elucidate the precise molecular mechanisms of sugar reception in H. armigera, it is necessary to compare a series of single, double and even multiple Gr knock-out lines and investigate the downstream effects of the GRs.” (Line 363-369)

      (13) Figure 6 caption: In Figure 6 (D to I), the percentage of PER is depicted. There is redundancy in the Y-axis title (Percentage of PER) and the legend. This appears to be repetitive. I suggest that it would be better to include the Y-axis title only in Figure D or in Figures D and G.

      We accept the suggestion. Figure 7 (not Figure 6) has been revised accordingly.

      (14) In Figures 6A and 6C, there is inconsistency in the colors used for WT, Gr6, and Gr10. This could potentially confuse the reader. I recommend using the same colors in both figures instead of using a blue color. Please specify how the authors calculated the feeding area in Figure 6.

      We accept the reviewer's suggestion and have changed the colors in Figure 7A, B. We have also added a detailed method for calculating the feeding area (Line 541-545).

      (15) In Two-choice tests, why did the authors use 0.01% Tween 80? Please provide comments on this.

      We used 0.01% Tween 80 to reduce the surface tension and increase the malleability of the solution. We have given a detailed explanation in the Methods section and cited the reference (Line 538-540).

      (16) It would be valuable if the authors could comment on the prospects of this study, considering that GRs play a vital role in controlling behavior and developmental pathways. What are the potential consequences of blocking or disrupting these receptors in terms of behavioral and developmental phenotypic deformities? Could this potentially lead to increased insect mortality?

      Thank you so much for your suggestions. In the last paragraph of the Discussion section, we have added the following perspectives, “Knockout of Gr10 or Gr6 led to a significant decrease in sugar sensitivity and food preference of the larvae and adults of H. armigera, respectively, which is bound to bring adverse consequences to survival and reproduction of the insects. Therefore, studying the molecular mechanisms underlying sugar perception in phytophagous insects may provide new insights into the behavioral ecology of this important and highly diverse group of insects, and measures blocking or disrupting sugar receptors could also have applications to control agricultural pests and improve crop yields worldwide” (Line 449-456).

      Reviewer #2 (Recommendations for The Authors):

      There are a few comments, that I feel would be beneficial to be addressed.

      • The authors used 7 different sugars for their experimental approach. While I agree that this is a sufficiently large collection for a study, I was wondering why they specifically chose these sugars; an explanatory section might be helpful for a reader to follow the reasoning.

      Following Reviewer 1's suggestion, we added trehalose, bringing the number of sugars in the experiments to 8. Trehalose is a main sugar in insect blood; insects produce it after feeding on plant sugars. The 8 sugars were chosen because they are present in host plants of H. armigera or are representative in terms of structure and source. They comprise 2 pentoses (arabinose and xylose), 4 hexoses (fructose, fucose, galactose and glucose), and 2 disaccharides (sucrose and trehalose).

      • It might be beneficial to provide some broader overview on the gustatory system in the cotton bollworm, particularly at the larval stage since this may not be common knowledge. Along these lines eg. the complexity of sensilla types, organs and overall number (or estimation) of neurons might be good to know, a graphical representation of the sense organs might be informative.

      In the Introduction section, we now give a more specific description of the sugar-sensitive GSNs in the taste system of the larva and adult of H. armigera, and cite the corresponding references.

      • Concerning phylogeny of GRs, it might be relevant to know how complete the genome information is and some more general background on GR diversity in the cotton bollworm.

      We agree. Accordingly, we obtained the putative sugar GRs from the previously published genome (Pearce et al. 2017) and the related annotation of GRs (Jiang et al. 2015, Xu et al. 2012). We have explained this in more detail in the new version of the manuscript, “We first analyzed the putative sugar gustatory receptor genes based on the genome data of H. armigera (Pearce et al. 2017), the reported gene sequences of sugar GRs in H. armigera and their phylogenetic relationship of D. melanogaster sugar gustatory receptors (Jiang et al. 2015, Xu et al. 2012). All nine putative sugar GR genes in H. armigera, Gr4–12 were validated, and their full-length cDNA sequences were cloned (The GenBank accession number is provided in Appendix—Table S1).” (Line 155-161).

      • Generation of mutants based on CRISPR is intriguing and a powerful step. While the techniques are well described in the method section, there is no information concerning efficiency or broader feasibility of the approach. I feel it would be quite interesting to learn about how feasible or laborious the approach is to generate mutants (e.g. number of initial injected eggs, the resulting F0 offspring, number of back-crosses, number of screened F1s ....).

      In the Materials and Methods section, we have added specific success rates for each step in the process of building the two mutants (Line 722-726, 729-732).

      Reviewer #3 (Recommendations For The Authors):

      I want to congratulate the authors on this very nice study and have only minor comments for them.

      (1) It would be very nice to include pictures of the larva and adult of H. armigera. It would also help to have schematics of where the sensilla they are recording from are.

      We have added photos of four taste organs on which the recorded sensilla are indicated (Figure 1), and pictures of the larva and adult on which the stimulation site is indicated (Figure 2).

      (2) A schematic summarising their findings, including the relevance to the animal's behavioural ecology, will greatly improve interpretations for the broader audience.

      A schematic summarizing the findings has been added.

      (3) The manner in which PIs are represented in figure 2A, B (among others) is confusing. Can the authors please plot the PI and not the feeding area? From the PI values listed beside the plot, it actually suggests that the larvae don't really show a preference. Could the authors please comment on this?

      Yes, sucrose has a significant stimulating effect on larval feeding, but the effect is not as large as predicted from the sensitivity of the sensillum. The main reasons are as follows: (1) many factors affect larval feeding, and sucrose is only one of them; (2) because the substrate leaf discs also contain sugar, the effect of the newly added sucrose may be reduced. After careful consideration, we think it is better to display the feeding area and PI together so that readers have a complete understanding of the data.

      (4) The heterologous expression experiments suggest that co-expression of GR6 with either GR10 or GR5 somehow suppress the response of the GR6 alone to fucose. Am I reading the data correctly? Why would this be? Perhaps the authors could discuss this. In this context, it would help to reproduce all the GR6 data together.

      Your interpretation is reasonable to a certain extent. The result of co-injection might be that Gr10 or Gr5 inhibited the response of Gr6. However, there is another possibility that the amount of Gr6 sRNA was diluted by co-injection of two GRs, resulting in a reduced response of Gr6 to fucose.

      (5) In general, for each results section, it would help to have a sentence or two that interprets the data in the context of previously presented data. This would help the reader digest the data and interpret it as they read along. Currently, the authors summarise the observations and leave all the interpretation to the discussion section.

      We accept the suggestion. In each part of the results, we have added a sentence to explain the above data, which will help readers to clarify the context of the research more easily.

      (6) Is the GR6 data in 4C not lined up correctly?

      Yes, it is right.

      (7) Line 228 suggests that the mutants were validating with qPCRs - I don't see that data.

      The mutants were not validated with qPCR. We used conventional PCR at the mRNA level to verify that the relevant sequences were indeed deleted in the mutants.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors present a detailed study of a nearly complete Entomophthora muscae genome assembly and annotation, along with comparative analyses among related and non-related entomopathogenic fungi. The genome is one of the largest fungal genomes sequenced, and the authors document the proliferation and evolution of transposons and the presence/absence of related genetic machinery to explore how this may have occurred. There has also been an expansion in gene number, which appears to contain many "novel" genes unique to E. muscae. Functionally, the authors were interested in CAZymes, proteases, circadian clock related genes (due to entomopathogenicity/ host manipulation), other insect pathogen-specific genes, and secondary metabolites. There are many interesting findings including expansions in trehalases, unique insulinase, and another peptidase, and some evidence for RIP in Entomophthoralean fungi. The authors performed a separate study examining E. muscae species complex and related strains. Specifically, morphological traits were measured for strains and then compared to the 28S+ITS-based phylogeny, showing little informativeness of these morpho characters with high levels of overlap.

      This work represents a big leap forward in the genomics of non-Dikarya fungi and large fungal genomes. Most of the gene homologs have been studied in species that diverged hundreds of millions of years ago, and therefore using standard comparative genomic approaches is not trivial and still relatively little is known. This paper provides many new hypotheses and potential avenues of research about fungal genome size expansion, entomopathogenesis in zygomycetes, and cellular functions like RIP and circadian mechanisms.

      Strengths:

      There are many strengths to this study. It represents a massive amount of work and a very thorough functional analysis of the gene content in these fungi (which are largely unsequenced and definitely understudied). Too often comparative genomic work will focus on one aspect and leave the reader wondering about all the other ways genome(s) are unique or different from others. This study really dove in and explored the relevant aspects of the E. muscae genome.

      The authors used both a priori and emergent properties to shape their analyses (by searching for specific genes of interest and by analyzing genes underrepresented, expanded, or unique to their chosen taxa), enabling a detailed review of the genomic architecture and content. Specifically, I'm impressed by the analysis of missing genes (pFAMs) in E. muscae, none of which are enriched in relatives, suggesting this fungus is really different not by gene loss, but by its gene expansions.

      Analyzing species-level boundaries and the data underlying those (genetic or morphological) is not something frequently presented in comparative genomic studies, however, here it is a welcome addition as the target species of the study is part of a species complex where morphology can be misleading and genetic data is infrequently collected in conjunction with the morphological data.

      Thank you for your careful reading of our work. We’re glad that you identified these areas as strengths.

      Weaknesses:

      The conclusions of this paper are mostly well supported by data, but a few points should be clarified.

      In the analysis of Orthogroups (OGs), the claim in the text is that E. muscae "has genes in multi-species OGs no more frequently than Entomophaga maimaiga. (Fig. 3F)" I don't see that in 3F. But maybe I'm really missing something.

      Thank you for catching this. You were, in fact, not missing anything at all. There was a mismatch between the data plotted in F and G and how the caption described these data. We very much apologize for the confusion that this must have caused. We have corrected these plots and also made changes to improve interpretability (see below).

      Also related, based on what is written in the text of the OG section, I think portions of Figure 3G are incorrect/ duplicated. First, a general question, related to the first two portions of the graph. How do "Genes assigned to an OG" and "Genes not assigned to an OG" not equal 100% for each species? The graph as currently visualized does not show that. Then I think the bars in portion 3 "Genes in species-specific OG" are wrong (because in the text it says "N. thromboides had just 16.3%" species-specific OGs, but the graph clearly shows that bar at around 50%). I think portion 3 is just a duplicate of the bars in portion 4 - they look exactly the same - and in addition, as stated in the text portion 4 "Potentially species-specific genes" should be the simple addition of the bars in portion 2 and portion 3 for each species.

      As mentioned above, we sincerely regret the error made in the plot and the confusion that this caused. F now reflects the percentage of orthogroups (OGs) that possess at least one representative from the indicated species (left) and the percentage of OGs that are species-specific (only possess genes from one species; right). The latter is a subset of the former. G now reflects the percentage of annotated genes that were assigned an OG, per species, as well as the inverse of this - genes that were not assigned to any OG. These should, and now do, sum to 100%. The “Within species-specific OG” data summed with the “Not assigned OG” data yields the “Potentially species-specific” data in the rightmost column.
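
      The accounting described for panel G can be sketched as follows. This is only an illustrative sketch: the function name, the dictionary keys, and the gene counts are hypothetical and are not taken from the manuscript or its scripts. It assumes every annotated gene is either in a multi-species OG, in a species-specific OG, or not assigned to any OG.

```python
def og_percentages(n_genes, n_in_multi_species_og, n_in_species_specific_og):
    """Per-species gene accounting (hypothetical helper, for illustration).

    Every annotated gene is assumed to fall into exactly one of three bins:
    a multi-species OG, a species-specific OG, or no OG at all.
    """
    n_assigned = n_in_multi_species_og + n_in_species_specific_og
    n_unassigned = n_genes - n_assigned
    if n_genes <= 0 or n_unassigned < 0:
        raise ValueError("counts are inconsistent with the total gene number")

    def pct(n):
        return 100.0 * n / n_genes

    return {
        "assigned_og": pct(n_assigned),
        "not_assigned_og": pct(n_unassigned),
        "within_species_specific_og": pct(n_in_species_specific_og),
        # Sum of the species-specific-OG bin and the unassigned bin.
        "potentially_species_specific": pct(n_in_species_specific_og + n_unassigned),
    }


# Made-up counts for a single species, only to show the bookkeeping.
example = og_percentages(n_genes=1000,
                         n_in_multi_species_og=600,
                         n_in_species_specific_og=150)
```

      With these made-up counts, the assigned and unassigned percentages add up to 100, and the "potentially species-specific" value equals the species-specific-OG percentage plus the unassigned percentage, which is the relationship the corrected panel is described as showing.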

      In the introduction, there is a name for the phenomenon of "clinging to or biting the tops of plants," it's called summit disease. And just for some context for the readers, summit disease is well-documented in many of these taxa in the older literature, but it is often ignored in modern studies - even though it is a fascinating effect seen in many insect hosts, caused by many, many fungi, nematodes (!), etc. This phenomenon has evolved many times. Nice discussions of this in Evans 1989 and Roy et al. 2006 (both of whom cite much of the older literature).

      You’re right. We have now clarified that this behavior is called “summit disease” and referenced the suggested articles, along with a more recent review.

      Reviewer #2 (Public Review):

      In their study, Stajich and co-authors present a new 1.03 Gb genome assembly for an isolate of the fungal insect parasite Entomophthora muscae (Entomophthoromycota phylum, isolated from Drosophila hydei). Many species of the Entomophthoromycota phylum are specialised insect pathogens with relatively large genomes for fungi, with interesting yet largely unexplored biology. The authors compare their new E. muscae assembly to those of other species in the Entomophthorales order and also more generally to other fungi. For that, they first focus on repetitive DNA (transposons) and show that Ty3 LTRs are highly abundant in the E. muscae genome and contribute to ~40% of the species' genome, a feature that is shared by closely related species in the Entomophthorales. Next, the authors describe the major differences in protein content between species in the genus, focusing on functional domains, namely protein families (pfam), carbohydrate-active enzymes, and peptidases. They highlight several protein families that are overrepresented/underrepresented in the E. muscae genome and other Entomophthorales genomes. The authors also highlight differences in components of the circadian rhythm, which might be relevant to the biology of these insect-infecting fungi. To gain further insights into E. muscae specificities, the authors identify orthologous proteins among four Entomophthorales species. Consistently with a larger genome and protein set in E. muscae, they find that 21% of the 17,111 orthogroups are specific to the species. To finish, the authors examine the consistency between methods for species delineation in the genus using molecular (ITS + 28S) or morphological data (# of nuclei per conidia + conidia size) and highlight major incongruences between the two.

      Although most of the methods applied in the frame of this study are appropriate with the scripts made available, I believe there are some major discrepancies in the datasets that are compared which could undermine most of the results/conclusions. More precisely, most of the results are based on the comparison of protein family content between four Entomophthorales species. As the authors mention on page 5, genome (transcriptome) assembly and further annotation procedures can strongly influence gene discovery. Here, the authors re-annotated two assemblies using their own methods and recovered between 30 and 60% more genes than in the original dataset, but if I understand it correctly, they perform all downstream comparative analyses using the original annotations. Given the focus on E. muscae and the small sample size (four genomes compared), I believe performing the comparisons on the newly annotated assemblies would be more rigorous for making any claim on gene family variation.

      Thank you for this comment. While we did compare gene model predictions for two of these assemblies to assess whether this difference could account for discrepancies in gene counts, completely reannotating all non-E. muscae datasets was outside the scope of this study. In our opinion, the total number of predicted genes in a genome is not the best representation of differences, since splitting or fusing gene models can inflate apparent differences; the orthology and domain counts are a more accurate assessment of the content. It’s possible that annotation differences may have inflated some gene family counts; however, we note that similar domain trends were observed in the closest species to E. muscae, Entomophaga maimaiga, suggesting that these differences were not sufficient to prevent us from detecting real biological signals. We look forward to continued improvement of our genome through additional sequencing and more clarity on the total gene content of E. muscae.

      The authors also investigate the putative impact of repeat-induced point mutation on the architecture of the large Entomophthorales genomes (for three of the eight species in Figure 1) and report low RIP-like dinucleotide signatures despite the presence of RID1 (a gene involved in the RIP process in Neurospora crassa) and RNAi machinery. They base their analysis on the presence of specific PFAM domains across the proteome of the three Entomophthorales species. In the case of RID1, the authors searched for a DNA methyltransferase domain (PF00145), however other proteins than RID1 bear such functional domain (DNMT family) so that in the current analysis it is impossible to say if the authors are actually looking at RID1 homologs (probably not, RID1 is monophyletic to the Ascomycota I believe). Similar comments apply to the analysis of components of the RNAi machinery. A more reliable alternative to the PFAM analysis would be to work with full protein sequences in addition to the functional domains.

      While we understand this concern regarding domain vs. full-length protein, the advantage of the domain search is that HMM-based searches are sensitive enough to detect more distantly related homologs. Entomophthoralean fungi are distantly related to the ascomycetes in which these mechanisms have been characterized, so we chose a broader search approach that may identify proteins that have a similar domain structure but are not necessarily homologs. These searches are presented in the manuscript as preliminary, but worth further investigation. However, our RID-based analysis did not identify convincing homologs for RID1 in the entomophthoralean fungi included in our investigation, and we reported low homology (i.e., 12-14%) between our orthogroup of interest and RID1. We have further edited this section to clarify our understanding that these candidates are not RID1 homologs. We had hoped to avoid this implication, but we felt this investigation and null result were worth reporting.
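
      For readers unfamiliar with "RIP-like dinucleotide signatures", the sketch below computes the dinucleotide ratios commonly used for this purpose in the wider literature: a product index (TpA/ApT), a substrate index ((CpA + TpG)/(ApC + GpT)), and their difference as a composite index. This is an assumption about what such an analysis typically involves, not a description of the pipeline used in the manuscript; the function name and the toy sequence are hypothetical.

```python
from collections import Counter


def rip_indices(seq):
    """Compute commonly used RIP dinucleotide indices for one DNA sequence.

    Illustrative only: overlapping dinucleotides are counted on the given
    strand, and a ratio with a zero denominator is returned as NaN.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    product = ratio(counts["TA"], counts["AT"])
    substrate = ratio(counts["CA"] + counts["TG"], counts["AC"] + counts["GT"])
    return {
        "product": product,
        "substrate": substrate,
        "composite": product - substrate,
    }


# Toy sequence (not real data), chosen so the counts are easy to check by hand.
toy = rip_indices("TATATAACGT")
```

      In this convention, a high product index together with a low substrate index is read as a RIP-like signature; in practice the indices would be computed over repeat copies or sliding windows rather than a single short string.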

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Specific points:

      Results:

      "1.03 Gb genome consisting of 7,810 contigs (N50 = 301.1 kb). Additional... resulted in a final contig count of 7,810 (N50 = 329.6 kb)" So you started and ended with the same contig count but a different N50? Is this a typo?

      Yes, this was a typo. Thank you for bringing this to our attention.
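
      As background on the statistic in question, N50 is the contig length L such that contigs of length L or longer together cover at least half of the assembly. The minimal sketch below uses made-up contig lengths (not values from the E. muscae assembly) and a hypothetical helper name to show how joining contigs changes both the contig count and the N50.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half the assembly.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assemblies with the same total size (360).
before = [100, 80, 60, 40, 20, 20, 20, 20]  # 8 contigs
after = [180, 60, 40, 20, 20, 20, 20]       # the two longest joined: 7 contigs
```

      In this toy case the N50 rises from 80 to 180 while the contig count drops from 8 to 7. Scaffolding or gap-closing steps that raise N50 usually lower the contig count as well, which is presumably why identical counts next to different N50 values stood out.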

      Figure 1D.

      The colors of Complete1x and Complete2x are too similar to tell them apart.

      The colors have been made more distinct.

      Figure 4B.

      I know C. rosea has been found from insects before, but it's mostly a mycoparasite and occasionally an endophyte, and has bioactivity against a lot of things. I just saw that it's listed as an entomopathogen, and I was surprised. Anyway, leave it as is if you want to, but it's definitely better studied and better known (Google Scholar) as a mycoparasite.

      Thanks for this comment. For the sake of including a more diverse representation of entomopathogenic fungi, we have opted to leave this as is.

      Full references (from the public comment)

      Evans, H.C., 1989. Mycopathogens of insects of epigeal and aerial habitats. Insect-fungus interactions, pp.205-238.

      Roy, H.E., Steinkraus, D.C., Eilenberg, J., Hajek, A.E. and Pell, J.K., 2006. Bizarre interactions and endgames: entomopathogenic fungi and their arthropod hosts. Annu. Rev. Entomol., 51, pp.331-357.

      Reviewer #2 (Recommendations For The Authors):

      I believe the manuscript could largely benefit from restructuring the results section to enhance clarity. The results section reads like a lot of descriptive back and forth, so that the reader lacks a clear rationale. The absence of a consistent dataset used for the different comparisons made all along the manuscript makes it hard to follow.

      Minor comments:

      (No line numbers were available so I refer to page numbers).

      p1

      • not sure about the use of "allied" to describe other fungal species in the title and after (sister species?).

      We didn’t want to use the word sister because not all of these species could be considered sister.

      • Genomic defence against transposable elements rather than "anti"?

      We have rephrased to genomic defense.

      p3

      • Extra parenthesis at Bronski et al.

      This is now corrected.

      • What does newly-available mean here?

      We mean recent. A lot of the datasets we used were very new, and we wanted to emphasize that point.

      • The back and forth between genomes and transcriptomes makes it hard to follow, would clarify from the beginning (in addition to the sequencing method - short vs long-read assemblies as in Figure 1B) or perhaps use a consistent dataset for all subsequent comparative analysis in the Entomophthorales.

      We have denoted our transcriptomic datasets in Fig 1C using parentheses.

      p5

      • Perhaps clarify that class II DNA transposons can also "copy" (single-strand excisions can be repaired by the host machinery).

      We have now included mention of “copy” as well as “jump” mechanisms of Class II transposons per your suggestion.

      p6

      • "beginning roughly concurrently", not clear what "began".

      This is now corrected.

      • "control" rather than "protect against"?

      We’ve changed “protect against” to “counter”.

      • I believe RIP has only been observed (experimentally) in a handful of fungal species, all from the Ascomycota phylum.

      Hood et al, 2005 found signatures of RIP in anther-smut fungus and Horns et al, 2012, found evidence of hypermutability across repeat elements within several Pucciniales species.

      • "RID1 contains two DNA_methylase domains", RID1 has one methyltransferase domain according to the reference Freitag et al, 2002.

      Thank you for drawing this to our attention. It is true RID1 has one methyltransferase region; however, the sequence deposited by Freitag et al, 2002 (AAM27408) is predicted by HMMer to have two adjacent Pfam DNA_methylase domains (i.e., PF00145). In this exploratory analysis, we tried to leverage this characteristic to identify candidate proteins of interest. We have reworded this section to clarify this.

      p8

      • Here and after I would use more informative titles for each paragraph.

      With the exception of the headings for the Pfam, CAZy and MEROPS analyses, we believe the other headings are informative. We appreciate this comment, but opt to leave the heading titles as they are.

      • I believe presenting the orthology analysis before the more in-depth protein family domain search.

      We leveraged the OG analysis mostly as a way to identify potentially unique genes in E. muscae, so we think the current order makes the most sense.

      p10

      • Figures 3F and G are confusing. The legend for Figure 3F mentions "OGs with >= 2 species" while the figure shows "multi-species OGs", and reads as redundant with the "species-specific" OGs. For the "OGs within species" do I understand it correctly that it represents the number of genes assigned to OGs for each species? If yes, the numbers are in contradiction with Figure 3G. And in Figure 3G shouldn't the sum of "genes assigned in OGs" and "genes not assigned in OGs" add up to 100? I'm probably missing something here, but I would clarify what the different sets of orthogroups are in the figure and in the text (perhaps adopting a pangenome-like nomenclature).

      Thanks for this comment. This legend, unfortunately, reflected an earlier version of the figure and was overlooked prior to submission. We have since amended this and sincerely apologize for the error on our part.

      p12

      • The whole first paragraph reads more like it should be part of an introduction/discussion.

      We’ve moved some of this paragraph to the discussion but left the background information necessary for the reader to understand why we were looking for homologs of wc and frq.

      p13

      • The last paragraph reads like discussion.

      We have revised this paragraph so it now reads: “Because E. muscae is an obligate insect-pathogen only living inside live flies, we investigate the presence of canonical entomopathogenic enzymes in the genome. We find that E. muscae appear to have an expanded group of acid-trehalases compared to other entomopathogenic and non-entomopathogenic Entomophthorales (Fig. 4A), which correlates with the primary sugar in insect blood (hemolymph) being trehalose (Thompson, 2003). The obligate insect-pathogenic lifestyle is also evident when comparing the repertoire of lipases, subtilisin-like serine proteases, trypsins, and chitinases in our focal species versus Zoopagomycota and Ascomycota fungi that are not obligate insect pathogens (Fig. 4B). Sordariomycetes within Ascomycota contains the other major transition to insect-pathogenicity within the kingdom Fungi (Araújo and Hughes, 2016). Based on our comparison of gene numbers, Entomophthorales possess more enzymes suitable for cuticle penetration than Sordariomycetes (Fig. 4B). In contrast, insect-pathogenic fungi within Hypocreales possess a more diverse secondary metabolite biosynthesis machinery as evidenced by the absence of polyketide synthase (PKS) and indole pathways in Entomophthorales (Fig. 4C).”

      p15 and 16

      • This all reads as redundant with the previous protein family domain analysis. I would try to merge them.

      Thank you for this comment, however we have opted to maintain the current structure.

      p18

      • In the first sentence, I'm not sure about what was performed here.

      This has been reworded to clarify.

      p20

      • Regarding the assembly, do I understand it correctly that a nuclear genome can be partially haploid / diploid?

      Thanks for your comment. The genome itself is, of course, some integer multiple of n, but based on BUSCO scores our assembly doesn’t appear to have completely collapsed into a haploid genome. We think it makes more sense here to say “partially haploid” than “partially diploid” so have altered this.

      p21

      • RIP has only been observed in a couple of Ascomycetes. RIP-like genomic signatures (GC bias) have been observed elsewhere.

      Hood et al, 2005 found signatures of RIP in anther-smut fungus and Horns et al, 2012, found evidence of hypermutability across repeat elements within several Pucciniales species.

      p23

      • Interesting that the peptidase A2B domain is found uniquely in E. muscae genome and is associated with Ty3 activity. Does the domain often overlap with annotated Ty3 in E. muscae genome? Or how come the domain is not present in other sister species with large genomes full of Ty3 transposons? Could it relate to a new active transposon in E. muscae specifically?

      Thanks for this comment. The domain-based analysis was only performed on the predicted transcriptome of the genome assembly, which does not include the repeat elements (e.g., Ty3). It could be that this peptidase reflects a new active transposon that’s specific to E. muscae, which would certainly be very interesting. We’ve now included this idea in the discussion.

      p26

      • In the case of fungal genomes, I would not advise masking the assembly for repeated sequences prior to gene annotation (in particular given the current focus on protein family variation).

      Thank you for this comment; however, we disagree with this assertion, as a typical approach for genome annotation in fungi and other eukaryotes is to soft-mask transposable elements before performing gene prediction to avoid over-prediction. While alternative approaches could compare annotations of masked and unmasked assemblies, masking is a recommended protocol for underlying tools like Augustus (10.1002/cpbi.57) and in general descriptions of genome annotation (10.1002/0471250953.bi0401s52). In our experience, false positive genes predicted through TE regions are likely to be more of a problem than false negatives from missed genes. Further, it seems appropriate to use a consistent approach to annotation throughout when including genomes from other sources (e.g., Joint Genome Institute annotated genomes), which also apply repeat masking before annotation. It seems most appropriate to use consistent methods when generating datasets to be used for comparative analyses. It is outside the scope of this project to reannotate all genomes with and without repeat masking.

      p27

      • Interrupted sentence at "Classification of DNA and LTR .. by similarity The".

      This was an unnecessary partial phrase, as the information on classification of elements via RepBase was given a few sentences above.

      p28

      • Enriched/depleted rather than "significantly different"?

      Thank you for this comment, however we have opted to maintain the current phrasing.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for a careful review of the manuscript and for their comments, which we address below.

      Reviewer #1:

      (1) …the authors could examine division in a population of cells with only one centrosome. Seeing some restoration of mitotic progression in the absence of SAC-dependent delays would suggest that even one centrosome with uninhibited Eg5 is sufficient to negate SAC-dependent delays, and would limit models for what exactly centrosomes contribute.

      We agree that the one-centrosome question (i.e. whether cells with a single centriole, and therefore a single centrosome, have the same SAC dependence) would be interesting to address. It is known that cells with a single centriole generated through centrinone treatment also have elongated mitoses, like cells lacking centrioles (see Chinen et al. 2021; compare Fig 2C to Fig 2D). We have tried this experiment in RPE-1 cells, with preliminary results confirming that there is a mitotic delay. It is not known whether this delay requires SAC activity, and we hope to address that in future work. In addition, we note that we show in Fig. 4b-c that cells with the normal centrosome number but with a single focus of microtubules due to Eg5 inhibition were also sensitive to MPS1 inhibition. This suggests that centrosome presence alone cannot overcome the requirement for SAC activity; rather, the centrosomes need to be able to separate in a timely fashion.

      Reviewer #2:

      (1) An example is how to interpret the effect of Aurora B inhibition, which does not block acentrosomal cell division. If Aurora B is required for SAC activity, it suggests this effect of MPS1 may be a function other than SAC. Given the complexity of the SAC, it would be informative to test other SAC components. Instead, the authors conclude that the mitotic delay caused by MPS1 is required for acentrosomal cell division. I don't think they have ruled out, or even addressed, other functions of MPS1.

      We agree that it is possible that functions of the MPS1 kinase other than those involved in the SAC could be important. Although we have not directly tested other SAC components, we did “mimic” SAC activity by delaying anaphase onset using APC/C inhibition while also inhibiting MPS1 (Fig. 2b-b’’). The fact that this restored division suggests that it is the SAC function of MPS1 kinase activity that is relevant to this delay. 

      (2) The authors find that when both the APC and MPS1 are inhibited, the cells eventually divide. These results are intriguing, but hard to interpret. The authors suggest that the failure to divide in MPS1-inhibited cells is because they enter anaphase, and then must back out. This is hard to understand and there are no data supporting some kind of aborted anaphase. Is the division observed with double inhibition some sort of bypass of the block caused by MPS1 inhibition alone? It is not clear why inhibition of APC causes increased cell division when MPS1 is inhibited.

      As described in the response to 1), we believe that reinstating the delay to anaphase onset by APC/C inhibition provided the time needed to establish a functional bipolar spindle even in the absence of the SAC, and that cells eventually overcome the proTAME block and proceed through mitosis, as observed in control cells in our experiments. We note that we chose concentrations of proTAME specifically for each cell line (RPE-1 and U2OS) that would result only in a temporary block, following on the work of Lara-Gonzalez and Taylor (2012), who reported similar findings for HeLa cells.

      (3) The authors characterize MTOC formation in these cells, which is also interesting. MTOCs are established after NEB in acentrosomal cells. Indeed, forming these MTOCs is probably a key mechanism for how these cells complete a division, like mouse oocytes.

      We agree that the observed intermediates of MTOCs are interesting and likely crucial to the mechanism of cell division in acentrosomal somatic cells. We are investigating further the differences and similarities between somatic cell MTOC formation in the absence of centrosomes and the naturally-occurring form of that process in oocytes.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      Seleit and colleagues set out to explore the genetics of developmental timing and tissue size by mapping natural genetic variation associated with segmentation clock period and presomitic mesoderm (PSM) size in different species of Medaka fish. They first establish the extent of variation between five different Medaka species in terms of organismal size, segmentation rate, segment size and presomitic mesoderm size, among other traits. They find that these traits are species-specific but strongly correlated. In a massive undertaking, they then perform developmental QTL mapping for segmentation clock period and PSM size in a set of ~600 F2 fish resulting from the cross of Oryzias sakaizumii (Kaga) and Oryzias latipes (Cab). Correlation between segmentation period and segment size was lost among the F2s, indicating that distinct genetic modules control these traits. Although the researchers fail to identify causal variants driving these traits, they perform proof of concept perturbations by analyzing F0 Crispants in which candidate genes were knocked out. Overall, the study introduces a completely new methodology (QTL mapping) to the field of segmentation and developmental tempo, and therefore provides multiple valuable insights into the forces driving evolution of these traits.

      Major comments:

      • The first sentence in the abstract reads "How the timing of development is linked to organismal size is a longstanding question". It is therefore disappointing that organismal size is not reported for the F2 hybrids. Was larval length measured in the F2s? If so, it should be reported. It is critical to understand whether the correlation between larval size and segmentation clock period is preserved in F2s or not, therefore determining if they represent a single or separate developmental modules. If larval length data were not collected, the authors need to be more careful with their wording.

      The question the reviewer raises here is indeed a very relevant one, and one that we were also curious about ourselves. While it was not logistically possible to grow the 600 F2 fish to adulthood, we did measure larval length in a subset of F2 hatchlings (n=72) to ask precisely the question the reviewer raises here. Our results (new Supplementary Figure 5) show that the correlation between larval length and segmentation timing (which we report across the Oryzias species) is absent in the F2s. This indeed argues that the traits represent separate developmental modules.

      In the current version of the paper, organismal size is often incorrectly equated to tissue size (e.g. PSM size, segment size). For example, in page 3 lines 33-34, the authors state that faster segmentation occurred in embryos of smaller size (Fig. 1D). However, Fig. 1D shows correlation between segmentation rate and unsegmented PSM area. The appropriate data to show would be segmentation rate vs. larval or adult length.

      The reviewer is correct. We have now linked the data more clearly to data we show in Supplementary Figure 1, which shows that adult length and adult mass are strongly correlated (S1A) and that adult mass is in turn strongly correlated with segmentation rate in the different Oryzias species (S1B). Additionally, main Figure 1B shows that larval length is correlated with PSM length. We have corrected the main text to reflect these relationships more clearly.

      • Is my understanding correct in that the her7-venus reporter is carried by the Cab F0 but not the Kaga F0? Presumably only F2s which carried the reporter were selected for phenotyping. I would expect the location of the reporter in the genome to be obvious in Figure 3J as a region that is only Cab or het but never Kaga. Can the authors please point to the location of the reporter?

      The reviewer is correct. Indeed, the location of our her7-venus KI is on chromosome 16, and the recombination patterns on this chromosome overwhelmingly show either Hom Cab (green) or Het Cab/Kaga (black). This is expected, as we selected fish carrying the her7-venus KI for phenotyping.

      • devQTL mapping in this study seems like a wasted opportunity. The authors perform mapping only to then hand pick their targets based on GO annotations. This biases the study towards genes known to be involved in PSM development, when part of the appeal of QTL mapping is precisely its unbiased nature and the potential to discover new functionally relevant genes. The authors need to better justify their rationale for candidate prioritization from devQTL peaks. The GO analysis should be shown as supplemental data. What criteria were used to select genes based on GO annotations?

      We have now commented on these valid points and outlined our rationale in more detail in the text (page 4, lines 20-30). Our rationale now also includes selection of differentially expressed genes (n=5 genes) that fall within segmentation timing devQTL hits (for more details see below). Essentially, while we ultimately focused on a proof of principle using known genes, these genes were not previously known to play a role in either setting the timing of segmentation or controlling the size of the PSM. Hence, we do think our strategy demonstrates "the potential to discover new functionally relevant genes", even though the genes themselves had previously been implicated in somitogenesis more generally. We have added the GO analysis as supplemental data as requested (new Supplementary Figure 7E).

      • Analysis of the predicted functional consequence of divergent SNPs (Fig. S6B, F) is superficial. Among missense variants, which genes harbor the most deleterious mutations? Which missense variants are located in highly conserved residues? Which genes carry variants in splice donors/acceptors? Carefully assessing the predicted effect of SNPs in coding regions would provide an alternative, less biased approach to prioritize candidate genes.

      We have now included our analysis of SNPs based on the Variant Effect Predictor (VEP) tool from Ensembl. This analysis ranks the predicted severity of each SNP's effect on protein structure and function (Impact: low, moderate, high) and annotates which variants can affect splice donors/acceptors. The VEP analysis for both phenotypes has been added to the manuscript as supplemental data (new Supplementary Data S2, S5).

      • Another potential way to prioritize candidate genes within devQTL peaks would be to use the RNA seq data. The authors should perform differential expression analysis between Kaga and Cab RNA-seq datasets. Do any of the differentially expressed genes fall within the devQTL peaks?

      As suggested, we have performed this additional experiment and report the RNA-seq differential expression analysis in new Supplementary Figure 7C-D. The analysis revealed 2606 differentially expressed genes in the PSM between Kaga and Cab, five of which were candidate genes from the devQTL analysis. We have now tested all of these (5 in total, 4 new and 1 previously targeted, adgrg1) for segmentation timing by CRISPR/Cas9 KO in the her7-venus background; none showed a timing phenotype (new Supplementary Figure 7F-F'). We provide the complete set of results in new Supplementary Figure 7 and Supplementary Data file 3 (DE-genes). All data were deposited in the publicly available repository BioStudies under accession number E-MTAB-13927.
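The prioritisation step described above (asking which differentially expressed genes fall within devQTL peaks) amounts to an interval-overlap query. The following is a minimal sketch of that idea only, not the analysis code used in the study. The function name `genes_in_peaks`, the gene names and all coordinates are hypothetical placeholders chosen for illustration.

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval with inclusive start and end coordinates."""
    chrom: str
    start: int
    end: int


def genes_in_peaks(de_genes: Dict[str, Interval], peaks: List[Interval]) -> List[str]:
    """Return the sorted names of genes whose locus overlaps at least one peak."""
    hits = []
    for name, locus in de_genes.items():
        for peak in peaks:
            same_chrom = peak.chrom == locus.chrom
            # Two inclusive intervals overlap when each starts before the other ends.
            overlaps = locus.start <= peak.end and peak.start <= locus.end
            if same_chrom and overlaps:
                hits.append(name)
                break
    return sorted(hits)


# Hypothetical placeholder peak and gene loci (NOT coordinates from the study).
peaks = [Interval("chr3", 1000, 5000)]
de_genes = {
    "geneA": Interval("chr3", 4500, 6000),   # partially inside the peak
    "geneB": Interval("chr3", 6000, 7000),   # same chromosome, outside the peak
    "geneC": Interval("chr10", 2000, 3000),  # different chromosome
}

hits = genes_in_peaks(de_genes, peaks)
print(hits)
```

In this sketch a gene counts as a hit if any part of its locus overlaps a peak; requiring the whole gene to lie inside the peak would be a stricter alternative.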

      • The use of crispants to functionally test candidate genes is inappropriate. Crispants do not mimic the effect of divergent SNPs and therefore completely fail to prove causality. While it is completely understandable that Medaka fish are not amenable to the creation of multiple knock-in lines where divergent SNPs are interconverted between species, better justification is needed. For instance, is there enough data to suggest that the divergent alleles for the candidate genes tested are loss of function? Why was a knockout approach chosen as opposed to overexpression?

      We agree with the reviewer that the CRISPR/Cas9 KO approach we followed does not address the causality of SNPs. Medaka does offer the genome editing capabilities to create tailored sequence modifications, so in principle this can be done. In practice, however, we reasoned that any given SNP will contribute only partially to the observed phenotypes, and combinatorial sequence edits are simply very laborious given the current state of the art in genome editing technologies. We therefore opted for an alternative proof-of-principle approach that aims "to discover new functionally relevant genes", not SNPs.

      • Along the same lines, now that two candidate genes have been shown to modulate the clock period in crispants (mespb and pcdh10b), the authors should at least attempt to knock in the respective divergent SNPs for one of the genes. This is of course optional because it would imply several months of work, but it would significantly increase the impact of the study.

      As above, this is in principle the correct rationale to follow, though it is very time-, cost- and labour-intensive. It is because of the latter practical consideration that we decided not to follow this option.

      Minor comments:

      • It would be highly beneficial to describe the ecological differences between the two Medaka species. For example, do the northern O. sakaizumii inhabit a colder climate than the southern O. latipes? Is food more abundant or easily accessible for one species compared to the other? What, if anything, has been described about each species' ecology?

      There are indeed differences in the ecology of the two species, with the northern O. sakaizumii inhabiting a colder climate than the southern O. latipes. In addition, it is known that the breeding season is shorter in the north than in the south, and northern species have been shown to have a faster juvenile growth rate than southern species. While it would be premature to link these ecological factors to the timing differences we observe, we can certainly speculate. A line to this effect has been added to the main text (page 5, lines 28-30).

      • The authors describe two different methods for quantifying segmentation clock period (mean vs. intercept). It is still unclear what is the difference between Figs. 3A (clock period), S4A (mean period) and S4B (intercept period). Is clock period just mean period? Are the data then shown twice? How do Fig. 3A and S4A differ?

      The clock period shown in all the main figures is the intercept period, which was also used for the devQTL analysis. Both measurements (mean and intercept) are indeed highly correlated and we include both in supplement for completeness.

      • devQTL as shorthand for developmental QTL should be defined in page 4 line 1 (where the term first appears), not later in line 12 of the same page.

      Noted and corrected, we thank the reviewer for spotting this error.

      • Python code for period quantification should be uploaded to Github and shared with reviewers.

      All period quantification code used in this study was obtained from the publicly available tool pyBOAT (https://www.biorxiv.org/content/10.1101/2020.04.29.067744v3). All code used in pyBOAT is available from the GitHub page of the creator of the tool (https://github.com/tensionhead/pyBOAT). Both are linked in the references and materials and methods sections.

      • RNA-seq data should be uploaded to a publicly accessible repository and the reviewer token shared with reviewers.

      We have uploaded all RNA-sequencing data to the public repository BioStudies under accession numbers E-MTAB-13927 and E-MTAB-13928. This information has now also been added to the materials and methods in the manuscript text.

      Why are the maintenance (27-28C) vs. imaging (30C) temperatures different?

      Medaka fish can physiologically tolerate a wide range of temperatures, i.e. 17-33C. The temperature of 30C was chosen for practical reasons: a slightly faster developmental rate enables higher sample throughput in overnight real-time imaging experiments.

      • For Crispants, control injections should have included a non-targeting sgRNA control instead of simply omitting the sgRNA.

      We agree that a non-targeting sgRNA control could be included, though we chose a different approach. For clarity, we now also include a control targeting Oca2, a gene involved in the pigmentation of the eye, to probe for any injection-related effect on timing and PSM size. As expected, 3 sgRNAs + Cas9 against Oca2 had no impact on timing or PSM size. These data are now shown in new Supplementary Figure 9 F-G'.

      It is difficult to keep track of the species and strains. It would be most helpful if Fig. S1 appeared instead in main figure 1.

      We agree and have included an overview of the phylogenetic relationship of all species and their geographical locales in new Figure 1 A-B.

      Significance

      • The study introduces a new way of thinking about segmentation timing and size scaling by considering natural variation in the context of selection. This new framing will have an important impact on the field.
      • Perhaps the most significant finding is that the correlation between segment timing and size in wild populations is driven not by developmental constraints but rather selection pressure, whereas segment size scaling does form a single developmental module. This finding should be of interest to a broad audience and will influence how researchers in the field approach future studies.
      • It would be helpful to add to the conclusion the authors' opinion on whether segmentation timing is a quantitative trait based on the number of QTL peaks identified.
      • The authors should be careful not to assign any causality to the candidate genes that they test in crispants.
      • The data and results are generally well-presented, and the research is highly rigorous.
      • Please note I do have the expertise to evaluate the statistical/bioinformatic methods used for devQTL mapping.

      Reviewer #2

      Evidence, reproducibility and clarity

      Seleit et al. investigate the correlation between segment size, presomitic mesoderm and the rhythm of periodic oscillations in the segmentation clock of developing medaka fish. Specifically, they aim to identify the genetic determinants for said traits. To do so, they employ a common garden approach and measure such traits in separate strains (F0) and in interbreedings across two generations (F1 and F2). They find that whereas presomitic mesoderm and segment size are genetically coupled, the tempo of her7 oscillations is not. Genetic mapping of the F0 and F2 progeny allows them to identify regions associated to said traits. They go on to perturb 7 loci associated to the segmentation clock and X related to segment size. They show that 2/7 have a tempo defect, and 2/ affect size.

      Major comments: The conclusions are convincing and well supported by the data. I think the work could be published as is in its current state, and no additional experiments that I can think of are needed to support the claims in the paper.

      Minor comments:

      • The authors could provide a more detailed characterization of the identified SNPs associated to the clock and to PSM size. For the segmentation clock, the authors identify 46872 SNPs, most of which correspond to non-coding regions and are associated to 57 genes. They narrow down their approach to those expressed in the PSM of Cab Kaga. Was the RNA selected from F1 hybrids? I wonder if this would impact the analysis for tempo and/or size in any way, as F2 are derived from these, and they show broader variability in the clock period than the F0 and F1 fishes.

      The RNA was obtained from the pure F0 strains, and we have now extended this analysis by deep bulk RNA sequencing and differential gene expression analysis. As also indicated to reviewer 1, this revealed 2606 differentially expressed genes in the unsegmented tails of Kaga and Cab embryos, some of which occurred in devQTL peaks. Based on this information, we expanded our list of CRISPR/Cas9 KOs by targeting all differentially expressed genes (5 in total, 4 new and 1 previously targeted) for segmentation timing, none of which showed a timing phenotype (new Supplementary Figure 7C-D). We provide the complete set of results in new Supplementary Figure 7 and Supplementary Data file 3 (DE-genes). All data were deposited in the publicly available repository BioStudies under accession number E-MTAB-13927.

      It would be good if the authors could discuss if there were any associated categories or overall functional relationships between the SNPs/genes associated to size. And what about in the case of timing?

      In the case of PSM size there were no clear GO terms or functional relationships between the genes that passed the significance threshold on chromosome 3.

      For the 35 genes related to segmentation timing, there were a number of GO enrichment terms directly related to somitogenesis. We have included the GO analysis in the new Supplementary Figure 7E.

      • Have any of the candidate genes or regulatory loci been associated to clock defects (57) or segment size (204) previously in the literature?

      To the best of our knowledge, none of the genes have been associated with clock or PSM size defects so far. It might be worthwhile to use our results to probe their function in other systems that enable higher-throughput functional analysis, such as newly developed organoid models.

      • When the authors narrow down the candidate list, it is not clear if the genes selected as expressed in the PSM are tissue specific. If they are, I wonder if genes with ubiquitous expression would be more informative to investigate tempo of development more broadly. It would be good if the authors could specifically discuss this point in the manuscript.

      We have not addressed the spatial expression pattern of the 35 identified PSM genes in this study, so we cannot speculate further. But the reviewer raises an important point: how the timing of individual processes (such as body axis segmentation) is linked at the organismal scale is indeed a fundamental, additional question that will be addressed in future studies. The in-vivo context we follow here would be ideal for such investigations.

      • Can the authors speculate mechanistically why mespb or pcdh10b accelerates the period of her7 oscillations?

      While we do not have a mechanistic explanation yet, an additional experiment we performed, i.e. bulk RNA sequencing on WT and mespb mutant tails, provided additional insight, and we have now added these data to the manuscript. This analysis revealed 808 differentially expressed genes between WT and mespb mutants. Interestingly, many of these affected genes are known to be expressed outside of the mespb domain, i.e. in the most posterior PSM (e.g. tbxt, foxb1, msgn1, axin2 and fgf8, amongst others). This indicates that the effect of mespb downregulation is widespread and possibly occurs at an earlier developmental stage, which requires more follow-up studies. These data are now shown in new Supplementary Figure 9A and Supplementary Data file S4. We now comment on this point in the revised manuscript.

      • Are there any size difference associated to the functionally validated clock mutants?

      We addressed this point directly and added this analysis as Supplementary Figure 9H-H'. While pcdh10b mutants do not show any detectable difference in PSM size, we find a small, statistically significant reduction in PSM size (area but not length) in mespb mutants. All of these data are now included in the revised manuscript.

      • Ref 27 shows a lack of correlation between body size and the segmentation period in various species of mammals. The work supports their findings, and it would be good to see this discussed in the text.

      We are not certain how best to compare our in-vivo results in externally developing fish embryos to in-vitro mammalian 2-D cell cultures. In our view, the correlation of embryo size, larval and adult size that we find in Oryzias might not necessarily hold in mammalian species, which would make a comparison more difficult. We do cite the work mentioned so the reader is pointed towards this interesting, complementary literature.

      Significance

      The work is quite remarkable in terms of the multigenerational genetic analysis performed. The authors have analysed >600 embryos from three separate generations to obtain quantitative data to answer their question (herculean task!). Moreover, they have associated this characterization to specific SNPs. Then, to go beyond the association, they have generated mutant lines and identified specific genes associated to the traits they set out to decipher.

      To my knowledge, this is the first project that aims to identify the genetic determinants for developmental timing. Recent work on developmental timing in mammals has focused on interspecies comparisons and does not provide genetic evidence or insight into how tempo is regulated in the genome. As for vertebrates, recent work from zebrafish has profiled temperature effects on cell proportions and developmental timing. However, the genetic approach of this work is quite elegant and neat.

      Conceptually, it is quite important and unexpected that overall size and tempo are not related. Body size, lifespan, basal metabolic rates and gestational period correlate positively and we tend to think that mechanistically they would all be connected to one another. This paper and Lazaro et al. 2023 (ref 27) are among the first in which this preconception is challenged in a very methodical and conclusive manner. I believe the work is a breakthrough for the field and this work would be interesting for the field of biological timing, for the segmentation clock community and more broadly for all developmental biologists.

      My field is quantitative stem cell biology and I work on developmental timing myself, so I acknowledge that I am biased in the enthusiasm for the work. It should be noted that as an expert in the field, I have identified instances where other work hasn't been as insightful or well developed in comparison to this piece. It is also worth noting that I am not an expert in fish development, phylogenetic studies or GWAS analyses, so I am not able to assess any pitfalls in that respect.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript explores the temporal and spatial regulation of vertebrate body axis development and patterning. In the early stages of vertebrate embryo development, the axial mesoderm (presomitic mesoderm - PSM) undergoes segmentation, forming structures known as somites. The exact genetic regulation governing somite and PSM size, and their relationship to the periodicity of somite formation remains unclear.

      To address this, the authors used two evolutionarily closely related Medaka species, Oryzias sakaizumii and Oryzias latipes, which, although having distinct characteristics, can produce viable offspring. Through analysis spanning parental (generation F0) and offspring (generations F1 and F2) generations, the authors observed a correlation between PSM and somite size. However, they found that size scaling does not correlate with the timing of somitogenesis.

      Furthermore, employing developmental quantitative trait loci (devQTL) mapping, the authors identified several new candidate loci that may play a role during somitogenesis, influencing timing of segment formation or segment size. The significance of these loci was confirmed through an innovative CRISPR-Cas9 gene editing approach.

      This study highlights that the spatial and temporal aspects of vertebrate segmentation are independently controlled by distinct genetic modular mechanisms.

      Major comments:

      1) In the main text page 3, lines 11 and 12, the authors state that the periodicity of the embryo clock of the F1 generation is the intermediate between the parental F0 lineages. However, the authors look only at the periodicity of the Cab strain (Oryzias latipes) segmentation clock. The authors should have a reporter fish line for the Kaga strain (Oryzias sakaizumii) to compare the segmentation clock of both parental strains and their offspring. Since it could be time consuming and laborious, I advise to alternatively rephrase the text of the manuscript.

      We agree that a careful distinction between segment-forming rate (measured based on morphology) and clock period (measured using the novel reporter we generated) is essential. We show that both measures correlate very well in Cab, in F0 as well as in F1 and F2 carrying the Cab allele. For Kaga F0, we can indeed only provide the rate of somite formation, which nevertheless allows comparison due to the strong correlation with the clock period that we have found. We have rephrased the text accordingly.

      2) It is evident that only a few F0 and F1 animals were analyzed in comparison with the F2 generation. Could the authors kindly explain whether and how this could bias or skew the observed results?

      We provide statistical evidence through the F-test of equality of variances that the variances between the F0, F1 and F2 samples are equal. Additionally, if we sub-sample and separate the F2 data into groups of 100 embryos (instead of all 638), we get the same distribution of the F2s. We therefore believe that this is sufficient evidence against a bias or skew in the results.
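The two checks described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the analysis code used in the study. The period values are simulated placeholders, the F1 sample size of 40 and the mean and spread of the simulated periods are arbitrary, and the helper names `variance_ratio` and `subsample_summaries` are hypothetical. The sketch computes only the F statistic (the ratio of sample variances) and its degrees of freedom; turning it into a p-value requires the F distribution, for example via `scipy.stats.f.sf`.

```python
import random
import statistics


def variance_ratio(a, b):
    """Return (F, df_num, df_den) with the larger sample variance in the numerator."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    if var_a >= var_b:
        return var_a / var_b, len(a) - 1, len(b) - 1
    return var_b / var_a, len(b) - 1, len(a) - 1


def subsample_summaries(values, group_size, seed=0):
    """Shuffle values, split them into non-overlapping groups of group_size,
    and return (mean, standard deviation) for each complete group."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    n_groups = len(shuffled) // group_size
    groups = [
        shuffled[i * group_size:(i + 1) * group_size]
        for i in range(n_groups)
    ]
    return [(statistics.mean(g), statistics.stdev(g)) for g in groups]


# Simulated placeholder clock periods in minutes (NOT the study's measurements).
rng = random.Random(1)
f2_periods = [rng.gauss(60.0, 3.0) for _ in range(638)]  # 638 F2 embryos, as in the text
f1_periods = [rng.gauss(60.0, 3.0) for _ in range(40)]   # arbitrary F1 sample size

f_stat, df_num, df_den = variance_ratio(f1_periods, f2_periods)
summaries = subsample_summaries(f2_periods, group_size=100)

print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) degrees of freedom")
for mean, sd in summaries:
    print(f"subsample mean = {mean:.2f}, sd = {sd:.2f}")
```

With 638 values and groups of 100, the sketch yields six complete groups and discards the remaining 38 values; comparing the per-group means and standard deviations gives a quick view of whether the smaller samples behave like the full set.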

      3) It would be interesting to create fish lines with the validated CRISPR-Cas9 gene manipulations in different genetic contexts (Cab or Kaga) to analyze the true impact on the segmentation clock and/or PSM & somite sizes.

      We agree with the reviewer that this would in principle be of interest; please see our earlier response to reviewer 1.

      4) Please add the results of the GO analysis as supplementary material.

      We have added the GO analysis in new Supplementary Figure 7E.

      Minor comments:

      1) In the main text, page 2, line 29, Supplementary Figure 1D should be referenced.

      We have added a clearer phylogeny and geographical location of the different species in new Figure 1 A-B, and reference it at the requested location.

      2) In the main text, page 2, line 32, the authors refer to Figure 1B, but it should be 1C.

      We have corrected the information.

      3) Regarding the topic "Correlation of segmentation timing and size in the Oryzias genus" the authors should also give information on the total time of development of the different Oryzias species, as well as the total number of formed somites.

      We follow this recommendation and have added this information in new Supplementary Figure 5. We also now include segment number measured in F2 embryos. We indeed view segmentation rate as a proxy for developmental rate, which however needs to be distinguished from total developmental time. The latter can be measured, for instance, by quantifying hatching time, which we did. These measurements show that Kaga, Cab and O. hubbsi embryos kept at a constant 28 degrees started hatching on the same day, while O. minutillus and O. mekongensis embryos started hatching one day earlier. We have not included this data in the manuscript because we think a distinction should be made between rate of development and total development time.

      4) In Figures 3A and B, please add info on the F1 lines for comparison.

      The information on F1 lines is provided in Supplementary Figure 3.

      5) Supplementary Figure 2F shows that the generation F1 PSM is similar to Cab F0, and not an intermediate between Kaga F0 and Cab F0. This is interesting and should be discussed.

      We show that the F1 PSM is indeed closer to the PSM of Cab than it is to the Kaga PSM. This is indeed intriguing and we have now commented on this point directly in the text.

      6) Supplementary Figures 6C to H are not mentioned either in the main text or in the extended information. Please add/mention accordingly.

      We have added references to both in the text.

      7) The order of Supplementary Figure 8 E to H and A to D appears to be not correct and not following the flow of the text. Please update/correct accordingly.

      We have updated the text accordingly.

      8) The authors should choose between "Fig.", "Fig", "fig.", "fig" or "Figure". All 'variants' can be found in the text.

      Noted, and updated. Fig. is used for main figures and fig. is used for supplementary figures.

      9) The color scheme of several figures (graphs with colored dots) should be revised. Several appear to be difficult to discern and analyze.

      We have enhanced the colours and increased the font on the figure panels. The colour panel was chosen to be colour-blind friendly.

      10) Please address/discuss following questions: What are the known somitogenesis regulating genes in Medaka? How do they correlate with the new candidates?

      The candidates we found and tested had not previously been implicated in regulating the tempo of segmentation or PSM size. For some of them, however, a role in somite formation had already been established, hence the enrichment for Somitogenesis in the GO analysis.

      Reviewer #3 (Significance (Required)):

      General assessment:

      This interesting manuscript describes a novel approach to study and find new players relevant to the regulation of vertebrate segmentation. By employing this innovative methodology, the authors could elegantly demonstrate that the segmentation clock periodicity is independent from the sizes of the PSM and forming somites. The authors were further able to find new genes that may be involved in the regulation of the segmentation clock periodicity and/or the size of the PSM & somites. A limitation of this study is the fact that the results mainly rely on differences between the two species. The integration of additional Medaka species would be beneficial and may help uncover relevant genes and genetic contexts.

      Advance:

      To my best knowledge this is the first time that such a methodology was employed to study the segmentation clock and axial development. Although the topic has been extensively studied in several model organisms, such as mice, chicken, and zebrafish, none of them correlated the size of the embryonic tissues and the periodicity of the embryo clock. This study brings novel technological and functional advances to the study of vertebrate axial development.

      Audience:

      This work is particularly interesting to basic researchers, especially in the field of developmental biology and represents a fresh new approach to study a core developmental process. This study further opens the exciting possibility of using a similar methodology to investigate other aspects of vertebrate development. It is a timely and important manuscript which could be of interest to a wider scientific audience and readership.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:* In this paper the authors explore the function of Syndecan in Drosophila stem cells focussing primarily on the intestinal stem cells. They use RNAi knockdown to conclude that Syndecan is required for long term stem cell maintenance as its knockdown results in apoptosis. They suggest that this effect is independent of LINC complex proteins but is associated with changes to nuclear morphology and DNA damage. They go on to show that a similar impact on nuclear shape can be seen in larval neuroblasts but not in stem cells of the female germline. *

      Major Comments: *The key conclusion that underpins the paper is that reduced Syndecan causes loss of stem cells. This is based entirely on evidence from cell-type specific RNAi using 3 independent RNAi lines. Overexpression has no phenotype and there is no analysis of loss of function mutants. SdcRNAi3 gives strong phenotypes that are statistically significant and is used throughout the paper. SdcRNAi2 gives comparatively moderate phenotypes which trend in the same direction but it is not clear if these are statistically significant (Fig S1). SdcRNAi line 1 appears to have very little effect (and if anything trends in the opposite direction in S1A). In addition, the knockdown efficiency of the three lines has not been assessed. Another possible concern given the dependence on RNAi3 is that the RNAi control line used is not an ideal match for the VDRC GD RNAi lines as it is in a different genetic background. In order to robustly draw conclusions: the phenotypes with RNAi lines 1 and 2 should be tested for significance; the extent of knockdown in each should be quantified either by qPCR in whole tissue knockdown, or by staining for protein levels if possible, to assess whether the variation in phenotypes is due to different knockdown levels. The use of a loss of function mutant in clones or tissue specific CRISPR-Cas9 KO or KD would also significantly increase confidence in the findings. *

      • Our qPCR data indicate that SdcRNAi3 produces the most efficient knockdown, whilst SdcRNAi1 generates the weakest knockdown. The new manuscript version will incorporate this data in figure S1. Knockdown efficacy of SdcRNAi 3 has also been previously reported (Eveland et al., 2016).

      • We apologise for omitting to add the statistical tests on phenotypic categories in figure S1A, this will be revised. We confirm that all Sdc RNAi phenotypic distributions are significantly different to that seen for age-matched controls (p- It should also be noted that despite weaker knockdowns with SdcRNAi1 and 2, we still observed statistically significant ISC depletion after 28 days of RNAi expression - we will add this data in figure S1. Overall, we are confident about Sdc’s role in maintaining intestinal stem cells.

      *Similarly, the evidence for a lack of LINC protein role in the phenotype relies on single RNAi lines without validation of knockdowns. The authors should ideally validate these lines in this system or reference other studies that have validated the lines in this or other contexts. *

      • The klarsicht RNAi line (BDSC 36721) and klaroid RNAi line (BDSC 40924) used in this study have been validated and used in other studies. (Falo-Sanjuan & Bray, 2022; Collins et al., 2017)

      • For Msp300 RNAi knockdown we have used two independent RNAi lines which gave similar results. We will amend the text to clarify these points. In addition, the line reported in the manuscript was previously validated (Dondi et al., 2021; Frost et al., 2016).

      Minor Comments: *The figures are generally very clear but some of the IF image panels are very small and require significant on-screen enlargement to be legible. In particular in Figure 1B the cross section views make it difficult to assess expression in the different cell types (and don't show very many cells), could this be shown in wholemount or as separated channels in a supplementary figure? In addition, it would strengthen the argument to include counterstains for markers of the different cell types (particularly to distinguish ISC/EB from EE). This could include esg-lacZ to mark ISC/EBs or prospero for EEs. However, if a broader view of these panels makes it clearer that all epithelial cells are expressing Syndecan this may not be essential. *

      • We are happy to incorporate larger fields of view, and co-immunostaining with different cell type markers.

      *Syndecan is referred to throughout as a stem cell regulator. This implies that in certain contexts or in response to certain stimuli its expression may be altered to elicit a stem cell response but no examples of this are shown. Moreover, only knockdown and not overexpression gives phenotypes suggesting its role may be as a required protein than a regulator. Either examples of its expression being modulated in homeostasis or in response to a challenge could be included or the wording could be amended. *

      • We agree with the reviewer and will amend the wording.

      *Expression of Syndecan in neuroblasts is described as data not shown, it would be better to include this for completeness. *

      • We will add this data in figure 4.

      *In addition to the intestinal validation of the Syndecan RNAi lines, validation of knockdown in the germline would be valuable to support the conclusions of Fig S4 given differences of knockdown in the germline with some RNAi lines (although inclusion of Dicer in the driver line should have overcome this). *

      • Sdc expression is very low in the germline, compared to the surrounding somatic cells, therefore we are not confident that we can detect differences in expression level after knockdown. We suggest adding a panel in figure S4 to show the low expression and adding a comment in the text.

      Reviewer #1 (Significance (Required)): *The study describes a potentially very interesting, novel link between Syndecan, nuclear shape and apoptosis in cycling cells that could have broad relevance. If fully validated this could have implications for other stem cell populations, including those in mammals and disease relevance in the context of cancer. The paper is fundamentally descriptive in nature and so the level of significance hinges on the strength of evidence and how interesting the phenotype itself is. At this stage the audience will be primarily in the areas of fundamental research in biology of the nucleus and cytoskeleton. Defining the mechanistic link between Syndecan and nuclear morphology will be a critical next step and while not essential for this study would significantly increase the likely interest in the paper. *

      • We thank the reviewer for these constructive comments. We agree that discovering the mechanistic links between Syndecan and nuclear morphology in future studies, in this and other model systems, will be relevant to many areas of biological research.

      *In terms of significance in stem cell biology the distinction between a regulator and a requirement to prevent stem cell apoptosis is important and the lack of evidence for a context in which Syndecan plays a regulatory role somewhat detracts from the breadth of impact. My field of expertise is in epithelial stem cell biology. *

      • We agree and will amend our wording.

      Reviewer #2 *(Evidence, reproducibility and clarity (Required)): ** Summary: Stem cell (SC) maintenance and proliferation are necessary for tissue morphogenesis and homeostasis. The basement membrane (BM) has been shown to play a key role in regulating stem cell behavior. In this work, the authors unravel a new connection between the receptor for BM components Syndecan (Sdc) and SC behavior, using Drosophila as model system. They show that Sdc is required for intestine stem cell (ISC) maintenance, as Sdc depletion results in their progressive loss. At a cellular level, they also find that Sdc depletion in ISCs affects cell survival, cell and nuclear shape, nuclear lamina and DNA damage. In addition, they show that the defects in shape are not related to cell death. They also find that Sdc depletion in neural stem cells also results in nuclear envelope remodeling during cell division. This is in contrast to what happens in female germline stem cells where Sdc does not seem to be required for their survival or maintenance. In general, I believe that this work unravels a connection between Sdc and stem cell behavior. However, I think the study is still at a preliminary stage, as how Sdc regulates different facets of stem cell behavior remains unclear.

      Major comments: 1. To clearly show that the cellular changes produced by loss of Sdc are not due to cell death, one should quantify the ISC area and shape of Sdc-depleted ISCs expressing DIAP1 and compare it to that of Sdc-depleted ISCs. As DIAP1 overexpression only partially rescues ISC loss due to Sdc depletion, one should show that the Sdc-depleted ISCs expressing DIAP1 that still show cellular changes are not dying, as overexpression of Diap1 might not be sufficient to completely rescue cell death in all Sdc-depleted ISCs. In fact, apoptosis in Sdc depleted guts and the ability of Diap1 overexpression to rescue cell death should be analyzed using markers of caspase activity, this will provide a better idea of the contribution of apoptosis to the phenotypes associated to Sdc depletion. *

      • We can, as suggested by the reviewer, quantify the area and shape of Sdc-depleted ISCs expressing DIAP1 and compare it to that of Sdc-depleted ISCs. However, our immunostainings with anti-Caspase 3 or Drice do not pick up apoptotic cells in the fly gut. This is not entirely unexpected, as apoptosis is unfortunately not easily detected in this tissue. In the absence of a positive readout of apoptosis, we will not be able to discriminate between apoptotic and non-apoptotic stem cells when quantifying area and shape and will only have global quantifications.

      • The authors show that ISC loss is associated with reduced cell density, suggesting that this is most likely due to failure in new cell production. What do they mean with cell production? Is this related to a problem in regulating cell division or to the fact that as some ISCs are lost by apoptosis there is progressively less ISCs or to a combination of both? I think that cell division should be monitored throughout time as well as cell death in ISCs.*

      • Based on esgF/O experiments (fig. 1D-F and S1C) where we can trace the production of new cells with GFP, we know that Sdc RNAi expression (i) impairs the appearance of newly differentiated cells in the tissue and (ii) results in the disappearance of progenitor cells (fig. S1C). Supporting these points, (i) we have observed PH3+ mitotic stem cells upon Sdc RNAi, so we are confident the cells are able to initiate cell division (see also fig. 2G), and (ii) we have occasionally noted in fixed samples stem cells looking like they were in the process of delaminating. Overall, the failure of cell production is likely related to problems with both completion of cell division and progressive stem cell loss. High resolution live imaging will in future give us a better insight into stem cell division dynamics/behaviour, however, the technical improvements required are beyond the scope of this project. In the meantime, we propose to clarify our statement in the text.

      • The authors report that in contrast to what happens when Sdc is eliminated from ISCs, its elimination from EEs results in an increase in the number of these cells. An explanation for this result is missing.*

      • Based on known roles of Syndecan in other Drosophila tissues (Johnson et al., 2004; Steigemann et al., 2004; Chanana et al., 2009; Schulz et al., 2011), we speculate that Syndecan may contribute to robo/slit signalling, which is an important regulator of EE activity in the Drosophila gut (Biteau & Jasper 2014; Zeng et al., 2015). We propose to amend the text to express this hypothesis.

      • The authors suggest that "Sdc function is unlikely to be fully accounted for by individual LINC complex proteins, although these proteins might act redundantly". Checking redundancy seems a straight forward experiment, which only requires the simultaneous expression of RNAis against several of these proteins. This would help to settle the implication of LINC complex proteins on Sdc function.*

      • To check redundancy, we propose to combine Klaroid RNAi with Msp300 or Klarsicht RNAis, and express two RNAis at a time in ISCs. We will then measure stem cell proportions and the proportion of ISCs with DNA damage.

      • Although quantification of DNA damage, by immunolabelling with gH2Av, reveals that knockdown of individual LINC complex components did not recapitulate the damage observed upon Sdc depletion (Fig.3G), the image shown in Fig.3F reflects much higher levels of gH2Av in Msp300 RNAi cells compared to Sdc RNAi cells. Authors should clarify this. *

      • Like the reviewer, we are intrigued by the higher levels of H2Av staining in the tissue, despite Msp300 knockdown in stem cells only (fig. 3F). It is worth noting that we observed this with two independent RNAi lines (we showed only one RNAi in the manuscript, but we will amend the text to indicate this). In fig. 3F, we will indicate with an arrow the only ISC that is H2Av positive, and mention in the text that the majority of DNA damage signal observed in the Msp300 RNAi condition is in enterocytes, not ISCs. We currently do not have an explanation for why loss of Msp300 in ISCs should cause DNA damage in neighboring cells.

      *In addition, the consequences of the simultaneous elimination of more than one component of the LINC complex on DNA damage should be analyzed. *

      • We agree, and as we check for redundancy (as in point 4), we will also immunostain the tissues for H2Av.

      • The authors claim that the fact that "DNA damage was found more frequently in Sdc-depleted ISCs with lamina invaginations compared to those without (Figure 3H), supports a model whereby the development of nuclear lamina invaginations precedes the acquisition of DNA damage". However, to me, these results show that there is a relation between these two phenotypes, but not that one precedes the other. In order to show which one is the possible cause and which the consequence, the authors should perform a time course of the appearance of each of these phenotypes.*

      • We agree with the reviewer that we should rephrase our statement to indicate a relationship between lamina invaginations and DNA damage, rather than a causality (as stated in fig. 3H).

      (In terms of performing a time course analysis, the difficulty is that after 3 days of Sdc RNAi expression, the apparent DNA damage (fig. 3G) corresponds to a very small proportion of stem cells, meaning that an exceptionally large sample size would be required to achieve robust statistical analysis.)

      • When studying the role of Sdc in neural stem cells, the authors show that elimination of Sdc in neuroblasts also affect nuclear envelope and shape. Furthermore, in this case, they also show that Sdc elimination affects cell division. To look for a more conserved role of Sdc in stem cell behavior, I believe the authors should also analyze whether Sdc elimination in neural stem cells results in an increase in DNA damage, as it is the case in ISCs.*

      • We will stain larval brains for H2Av to see if DNA damage is also observed following Sdc knockdown in neuroblasts.

      • When analyzing a possible role of Sdc in fGSCs, quantification of germline stem cells and gH2Av levels in control nosGal4 and nos>Sdc RNAi germaria should be done. In addition, it is not clear to me whether Sdc is in fact expressed in fGSCs.*


      • As mentioned in comments to reviewer 1, we will add a panel in figure S4 to show the low Sdc expression in fGSCs. We will also clarify in the text that we do not see any H2Av staining in the fGSCs (thus, there is nothing to quantify in this case).

      * The authors should show presence of Sdc in neuroblasts.*

      • Yes, we agree, as also mentioned in comments to reviewer 1.

      Reviewer #2 (Significance (Required)): *In general, although this work reveals that elimination of Sdc affects different aspects of intestinal and neural stem cell behavior, including cell survival, cell production, nuclear shape, nuclear lamina or DNA damage, their contribution to stem cell loss and interactions between them have not been analyzed in detail. The role of the basement membrane in stem cell behavior has been extensively studied. In particular, the role of syndecan in stem cell regulation has been primarily confined to cancer, muscle, neural and hematopoietic stem cells. Thus, the study presented here could extend the role of Sdc to intestinal stem cells and could potentially reveal a conserved role for Sdc in neural stem cell behavior. However, the problems with the data mentioned above hinder the assessment of the significance of this work. *

      • We thank the reviewer for their assessment and are glad that they also find that our study provides novel connections between Syndecan and the regulation of intestinal and neural stem cell behaviors. To strengthen our conclusions, we will include additional experiments or amend the text, as indicated above.

      Reviewer #3* (Evidence, reproducibility and clarity (Required)): ** Peer-review: The transmembrane protein Syndecan regulates stem cell nuclear properties and cell maintenance.

      In this work, the authors investigate the role of the transmembrane protein Syndecan (Sdc) in nuclear organisation and stem cell maintenance. They show that Sdc knockdown in intestinal stem cells (ISCs) results in a reduction of the ISC pool as well as of their progeny. They hypothesise that these ISCs might get eliminated via cell death; however, expression of the apoptotic inhibitor DIAP1 only rescued ISC loss by 50%. Hence, they suggest that apoptosis cannot account for the total decrease in ISCs observed upon Sdc loss. ISCs depleted of Sdc exhibited abnormal cytoplasmic and nuclear morphologies. As Sdc has previously been implicated in the abscission machinery in mammalian cultured cells, they tested if Sdc could be playing a similar role in the abscission of ISCs. However, ISCs were capable of undergoing cytokinesis. Next, they tested if Sdc depletion could be altering the linkage between the plasma membrane and the nucleus mediated by the Linker of Nucleoskeleton and Cytoskeleton (LINC) complex. However, individual knockdowns of the different components of the complex did not disrupt the nuclear morphology to the same extent as Sdc knockdown, suggesting that Sdc function may be independent of the LINC complex. Finally, they observed that Sdc-depleted ISCs exhibited DNA damage, suggesting that Sdc may play a role in DNA protection. The authors next tested if Sdc played similar roles in other stem cell types such as the female germline stem cells (fGSCs) and larval neural stem cells (NSCs). While Sdc appeared dispensable for fGSC maintenance, its depletion prolonged NSC divisions and altered the nuclear morphology of NSCs. Upon further investigation, they observed that the NSC's nuclear envelope was disrupted upon division, hence causing defects in the nuclear size ratio of NSCs and their progeny. This study provides interesting findings in the field and proves a new role for Sdc in the regulation of intestinal and neural stem cell maintenance. I would recommend this manuscript to be accepted if the authors address the following comments.

      Major comments: 1. In Figure 2A-B, Sdc RNAi should ideally have a UAS control transgene to match the number of UAS being expressed to that of Sdc RNAi, DIAP1. Otherwise, it is plausible that reduced RNAi expression of Sdc RNAi, DIAP1 animals is the cause of the partial rescue. Staining against cell death markers such as Dcp-1 or TUNEL might also quantify the number of cells undergoing cell death in each of the genotypes. *

      • As mentioned in comments to reviewer 2 (point 1), it is difficult to label apoptotic cells in the fly gut. However, we could set up an additional control to test that the partial rescue observed upon DIAP1 expression is not a result of Gal4 dilution.

      • "These phenotypes were observed both with and without DIAP1 expression (Figure 2C), indicating that these cell shapes are not caused by apoptosis." Misleading, as DIAP1 overexpression in the Sdc knockdown background only rescued apoptosis by 50%. Hence, it is possible that those cells undergoing morphological defects, protrusions and blebbing might still undergo death, also considering that those morphological changes are typically observed in apoptotic cells. Therefore, to rule apoptosis out, these cells should be shown to be negative for cell death markers. *

      • We agree, however, it is difficult to label apoptotic cells. We think that the quantification of shape and area (as suggested by reviewer 2, point 1) will clearly show that the cell shapes resulting from Sdc depletion are not caused by apoptosis.

      • Show if Sdc is expressed in fGSCs - the lack of phenotype caused by Sdc knockdown might be due to lack of expression of Sdc.*

      • As mentioned in comments to reviewers 1&2, we will add a panel in figure S4 to show the low sdc expression in fGSCs.

      • "After confirming the presence of Sdc in neuroblasts (data not shown)." Data should be shown. It would be of great interest for researchers if you showed a staining of different brain cell types (NBs, glia, neurons) and the Sdc expression patterns.*

      • As mentioned in comments to reviewers 1&2, we will add a panel in figure 4 to show sdc expression in NBs and the overall expression pattern.

      • You show how Sdc-depleted NBs have disrupted nuclear morphologies. However, does Sdc KD in NB lineages affect their ability to self-renew and generate differentiated progeny? Is the number of NBs and of their progeny cells altered as it is for ISCs?*

      • We propose to knock down Sdc in NBs and quantify brain size in 3rd instar larvae to test if the ability to generate progeny is affected.

      • Does protection against DNA damage in an Sdc knockdown background prevent the defects observed with the single knockdown and ISC elimination?*

      • This is a good question, and we should emphasize this point in the discussion. However, because of the multiple routes of DNA damage response, and the multiple lines needed to explore this connection, we feel that investigating this question is beyond this project.

      • Any idea what similarities between ISCs and NBs could account for why Sdc knockdown has effects in those systems, while no effect was observed in the germ cells?*

      • Besides the differences in expression level, we speculate that GSCs may have a different nuclear / lamina architecture which might reflect differences in how GSCs control the physical integrity of their nuclei. It is also possible that the differences observed between tissues reflect the way stem cells connect to their microenvironment. Notably, fGSCs rely extensively on E-Cadherin mediated adhesion with neighbouring cells, and it is possible that contact with the extracellular matrix is dispensable. We will consider these possibilities in the discussion.

      Minor comments:* ** 8. Lamina invaginations, for example in Figure 3 A, could be indicated with an arrow for easier detection. *

      • Thanks for this suggestion, we will amend the figure.

      Specify the type and location of NB imaged during live cell experiments.

      • The NBs were imaged in the brain lobes, and we did not distinguish between type I and II NBs. We will add a sentence in the method section to clarify.

      *Reviewer #3 (Significance (Required)): Expertise: Drosophila stem cells *

      • Many thanks for the constructive comments.
    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The data strongly suggest that iron depletion in urine leads to conditional essentiality of some genes. It would be informative to test the single gene deletions (Figure 3G) for growth in urine supplemented with iron, to determine how many of those genes support growth in urine due to iron limitation.

      We appreciate this suggestion. We have now included this suggested experiment as a new panel (Figure 5G).

      (2) Line 641. The authors raise the intriguing possibility that some mutants can "cheat" by benefitting from the surrounding cells that are phenotypically wild-type. Growing a fepA deletion strain in urine, either alone or mixed with wild-type cells, would address this question. Given that other mutants may be similarly "masked", it is important to know whether this phenomenon occurs.

      We thank the reviewer for this suggestion but believe that this would be very difficult to ascertain in K. pneumoniae as several redundant iron uptake systems exist. This would require significantly more time to construct sequential/combinatorial iron-uptake mutants to exactly determine this “cheating” and “masking” phenomenon and such work is beyond the scope of the current study.

      (3) In cases where there are disparities between studies, e.g., for genes inferred to be essential for serum resistance, it would be informative to test individual deletions for genes described as essential in only one study.

      We thank the reviewer for this suggestion, and we agree that deleting conditionally essential genes (i.e. serum resistance) could help identify discrepancies in methodology with other studies but this is beyond the scope of this study. Furthermore, we do not have these other strains readily available to us and importing these strains into Australia is challenging due to the strict import/quarantine laws.

      Reviewer #1 (Recommendations For The Authors)

      (4) Line 529. Why was 50 chosen as the read count threshold?

      This was chosen as the minimum threshold needed to exclude essential genes from the comparative analysis, as these can contribute false positive results where a change from, for example, 2 to 5 reads between conditions is considered a >2-fold change. We have updated the manuscript text to highlight this: “were removed from downstream analysis to exclude confounding essential genes and minimize the effect of stochastic mutant loss” (line 539).
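
      The arithmetic behind this reasoning can be illustrated with a short sketch. It is an illustration only and not the analysis code used for the study: the function names, gene names and read counts are hypothetical, the pseudocount is our own choice, and we assume for simplicity that a gene is removed when its read count in the control condition falls below the threshold of 50.

```python
import math

READ_THRESHOLD = 50  # read count threshold quoted in the response


def log2_fold_change(control_reads, test_reads, pseudocount=1):
    """Return log2(test / control), using a pseudocount to avoid division by zero."""
    return math.log2((test_reads + pseudocount) / (control_reads + pseudocount))


def filter_low_count_genes(counts, threshold=READ_THRESHOLD):
    """Keep genes whose control read count reaches the threshold (assumed rule)."""
    return {gene: pair for gene, pair in counts.items() if pair[0] >= threshold}


# Hypothetical (control, test) read counts per gene.
counts = {
    "geneA": (2, 5),      # near-essential gene: tiny counts, large apparent change
    "geneB": (400, 90),   # well-covered gene with a real drop in reads
    "geneC": (350, 360),  # well-covered gene with no meaningful change
}

kept = filter_low_count_genes(counts)

for gene, (control, test) in counts.items():
    status = "kept" if gene in kept else "removed"
    print(gene, round(log2_fold_change(control, test), 2), status)
```

      In this toy example the gene with 2 and 5 reads shows an apparent two-fold change that rests on only three additional reads, and the filter removes it before any comparison is made.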

      (5) The titles for Figure 5 and Figure 6 appear to be switched.

      Thank you, we have now corrected this error.

      (6) Line 381. "Forty-six of these regions contain potential open reading frames that could encode proteins". How is a potential ORF defined?

      This was based on submitting the selected 145 bp regions to BLASTx using default parameters and listing the top hit (if one was found). We have now edited the manuscript text to make this clearer (line 394).

      (7) Two previous TnSeq studies looking at Escherichia coli and Vibrio cholerae suggest that H-NS can prevent transposon insertion, leading to false positive essentiality calls. Is there any evidence of this phenomenon here? A/T content could be used as a proxy for H-NS occupancy.

      We thank the reviewer for this point and also agree that H-NS or other DNA-binding proteins could indeed lead to false-positive essentiality calls using TraDIS. Based on this, we have now included a sentence in the conclusion section mentioning this methodological caveat (Line 631). We believe that A/T content could potentially be used as a proxy for H-NS occupancy,

      Reviewer #2 (Recommendations For The Authors):

      (1) The authors may wish to reformat the manuscript by decanting a number of panels and figures as supplementary material. These include the panels related to the description of TraDIS (for example Fig 1D, 1E, 1F, 1G, Fig 2A, Fig 3C, 3D, 3E, 3F, Fig 5C, Fig 6D). This is a well-established method.

      We thank the reviewer for this suggestion but believe that these panels make the methodology and the resulting insertion plots easier to follow and allow other researchers, of varying expertise, to better understand this functional genetic screening technique.

      (2) The authors need to indicate how relevant the strain they have probed is. Is it a good reference strain of the KpI group?

      This is a great suggestion and we have now included a new figure illustrating the genetic context and relatedness of K. pneumoniae ECL8 within the KpI phylogroup (New Figure 3).

      (3) The authors need to provide an extensive comparison between the data obtained and those reported testing other Klebsiella strains. A Table identifying the common and different genes, as well as a figure, may suffice. I would encourage authors to also compare their data against E. coli and Salmonella. For example, igaA seems to be not essential in Klebsiella although data indicates it is in Salmonella.

      We thank the reviewer for their comment and appreciate that our data could be extended and compared to other relevant Enterobacteriaceae members. However, we believe this is beyond the scope of this study as the focus is more on K. pneumoniae.

      (4) None of the mutants tested further are complemented. Without these experiments, it cannot be rigorously claimed that these loci play any role in the phenotypes investigated.

      We agree that complementation is an important tenet for linking mutant phenotypes to specific gene loci. In this case, wbbY has already been complemented, and we believe complementation for an already known molecular mechanism would be redundant. Please refer to our response in point 6.

      We complemented isolated transposon mutants hns7::Tn5 and hns18::Tn5 with a mid-copy, IPTG-inducible plasmid. We observed a slight increase in serum susceptibility but not full rescue of the WT phenotype (i.e. serum susceptibility). We suspect that the imperfect rescue of the serum-resistance phenotype could be due to the expression levels and copy number of the complementing hns plasmid used. As hns is a known global regulator, its pleiotropic role is complex, and many aspects of stress response, metabolism or capsule could be affected in Klebsiella (doi.org/10.1186/1471-2180-6-72, doi.org/10.3389/fcimb.2016.00013). We have now described our complementation efforts in the text and have included a new supplementary figure (Figure S11).

      (5) The contribution of siderophores to survival in urine is not conclusively established. Authors may wish to test the transcription of relevant genes, and to assess whether the expression is fur dependent in urine. Also, authors may wish to identify the main siderophore needed for survival in urine by probing a number of mutants; this will allow us to assess whether there is a degree of selection and redundancy.

      We thank the reviewer for their comment and agree that siderophore uptake is important. We have now included an additional panel (Figure 5G) interrogating the importance of iron-uptake genes for growth in urine, which is iron-limited. We do appreciate that further experiments looking into the Fur regulon and siderophore biosynthesis would be interesting but believe this is outside the scope of this study.

      (6) The role of wbbY is intriguing, pointing towards the importance of high molecular weight O-polysaccharide. In this mutant background, the authors need to assess whether the expression of the capsule, and ECA is affected. Authors need also to complement the mutant. Which is the mechanism conferring resistance?

      We thank the reviewer for their comment and would like to mention that wbbY has already been shown to play a role in LPS profile/biosynthesis and serum resistance (10.3389/fmicb.2014.00608). Furthermore, BLAST analysis shows that the wbbY genes of NTUH-K2044 (the strain used in the aforementioned study) and ECL8 share 100% sequence identity and the same lps operon structure. Hence, we do not find it pertinent to complement this mutant, as we believe its molecular mechanism has already been established. We have now highlighted the results of that study more prominently in the text, along with how our screen was robust enough to also identify this gene for serum resistance.

      (7) hns and gnd mutants most likely will have their capsule affected. The authors need to assess whether this is the case. Which is the mechanism conferring resistance?

      As mentioned in point 6, we believe that the serum resistance phenotype is attributable to the LPS phenotype. Previous studies suggest that hns and gnd mutants would likely have differences in capsule, but because hns is pleiotropic and gnd is intercalated in or adjacent to the LPS/O-antigen biosynthesis genes, it would be difficult to delineate exactly which cell surface structure is involved.

      (8) The conclusion section can be shortened significantly as much of the text is a repetition of the results/discussion section.

      We thank the reviewer for their suggestion and have made edits to limit repetition in the conclusion section.

      Reviewer #3 (Public Review):

      Below I include several comments regarding potential weaknesses in the methodology used:

      • The study was done with biological duplicates. In vitro studies usually require 3 samples for performing statistically robust analysis. Thus, are two duplicates enough to reach reproducible results? This is important because many genes are analyzed which could lead to false positives. That said, I acknowledge that genes that were confirmed through targeted mutagenesis led to similar phenotypic results. However, what about all those genes with higher p and q values that were not confirmed? Will those differences be real or represent false positives? Could this explain the differences obtained between this and other studies?

      We thank the reviewer for their comment and apologize for the confusion: data were only pooled for the statistical analysis of gene essentiality. Here, two technical replicates of the input library were sequenced and the number of insertions per gene quantified (insertion index scores). These replicates had a correlation coefficient of r² = 0.955, and the insertions-per-gene data were pooled to give total insertion index scores to predict gene essentiality. For conditional analyses (growth in urine or serum), replicate data were not combined. As mentioned previously, differences between this and other studies could also be attributed to inherent genomic differences or to differences in experimental methodology, computational approaches, or the stringency of analysis used to categorize these genes.

      • Two approaches are performed to investigate genes required for K. pneumoniae resistance to serum. In the first approach, the resistance to complement in serum is investigated. And here a total of 356 genes were identified to be relevant. In contrast, when genes required for overall resistance to serum are studied, only 52 genes seem to be involved. In principle, one would expect to see more genes required for overall resistance to serum and within them identify the genes required for resistance to complement. So this result is unexpected. In addition, it seems unlikely that 356 genes are involved in resistance to complement. Thus, is it possible false positives account for some of the results obtained?

      We thank the reviewer for their comment and do believe false positives may account for some of the identified genes. Regarding the large contrast in gene numbers specifically, we believe this is due to the methodology, as alluded to in our conclusion section. For overall resistance to serum, we used a longer time point (180 min exposure) where fewer surviving mutants are recovered, hence fewer genes overall will be identified, whereas conditions with short killing windows will yield more (i.e. complement-mediated killing, 90 min exposure).

      Reviewer #3 (Recommendations For The Authors):

      • In Figure 4 it is shown that genes important for growth in urine include several that are required for enterobactin uptake. Moreover, an in vitro experiment shows that the complementation of urine with iron increases K. pneumoniae growth. It would have been informative to do a competition experiment between the WT and Fep mutants in urine supplemented with iron. This could demonstrate that the genes identified are only necessary for conditions in which iron is in limiting concentrations and confirm that the defect of the mutants is not due to other characteristics of urine.

      We appreciate this suggestion. We have now included a new panel (Figure 5G) addressing the supplementation of iron in urine for these select mutants.

      • Considering the results section, the title for Figure 6 seems to be more appropriate for Figure 5.

      Thank you, this has now been corrected.

      Other points:

      • Line 44: treat instead of treating

      Thank you, this has now been corrected.

      • Line 63: found that only 3 genes played a role instead of "found only 3 genes played a role"

      Thank you, this has now been corrected.

      • Line 105: is there any reason for only using males? Since UTIs are frequent in women? Why not use urine from women volunteers?

      Due to the accessibility of willing volunteers and human ethics application processes, only male samples were available. We are currently undertaking further studies to understand how male and female urine influences growth of uropathogens.

      • Line 105: since the urine was filter-sterilized, maybe the authors can comment that another point that is missing in urine - and that it may be important to study - will be the presence of the urine microbiome and how this affects growth of K. pneumoniae.

      We again thank the reviewer for this comment and have now edited the manuscript discussing how the absence of urine microbiome could affect growth (Line 659). As an aside, future studies in our lab are interested in looking at the role of commensal/microbiome co-interactions for essentiality/pathogenesis using TraDIS.

      • Line 116: I understand that the 8 healthy volunteers combined males and females

      Thank you, we have now edited this methods line to make this clearer.

      • Line 120: incubate in serum 90 min and 180 RPM shaking: any reasons for using these conditions, any reference supporting these conditions?

      Thank you for pointing this out, we were mirroring a previous K. pneumoniae serum-resistance study (doi.org/10.1128/iai.00043-).

      • Line 156: space after the dot.

      Thank you, we have now corrected this in the manuscript.

      • Line 164: resulting reads were mapped to the K. pneumoniae: what are the parameters used for mapping (e.g. % of identity...)?

      Thank you for bringing this to our attention. We have now stated in our manuscript that we used the default parameters of BWA-MEM for mapping, including the minimum seed length (default -k = 20 bp exact match).

      • Line 180: it will be good to upload to a repository the In-house scripts used or indicate the link beside the reference for those scripts.

      Our scripts are derived from the pioneering TraDIS study (doi: 10.1101/gr.097097.109). We are currently still optimizing our scripts and intend to upload these to be publicly available. However, in the meantime we are more than happy to share them with other parties upon request.

      • Line 191: why were genes classified as 12 times more likely to be situated in the left mode? Any particular reason for using this threshold?

      We opted for a more-stringent threshold for classifying essential genes, in keeping with previous and comparable studies (doi.org/10.1371/journal.pgen.1003834).

      • Line 209: do you mean Q-value of <0.05 instead of >0.05? How is this Q value calculated, and which specific tests are applied?

      Thank you for pointing out this Q value error; we have now corrected this in the manuscript. These values were generated using the biotradis tradis_comparison.R script, which uses the edgeR package. For further reading please see DOI: 10.1093/bioinformatics/btp616. The Q-values are P values corrected for multiple testing by the Benjamini-Hochberg method.
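
      As an illustration of the multiple-testing correction named above, the following is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. It is not the authors' pipeline (which relies on the R scripts cited in the response); the function name and the example P values are hypothetical and chosen only to show how Q-values are derived from P values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (Q-values) in input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that Q-values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        adjusted = p_values[idx] * m / rank
        running_min = min(running_min, adjusted)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-gene P values, for illustration only
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
example_q = benjamini_hochberg(example_p)
# Genes passing a Q-value threshold of < 0.05
significant = [i for i, q in enumerate(example_q) if q < 0.05]
```

      In this toy example only the first two genes pass Q < 0.05, even though four of the five raw P values are below 0.05, which is why the direction of the threshold matters.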

      • Line 212: again, which type of test is used? What about the urine growth analysis? The same type of tests were applied?

      Thank you for bringing this to our attention. We have now indicated in the referenced methods section which package was used for which dataset (i.e. urine or serum). Line 212 refers to our use of the AlbaTraDIS package, which builds on the biotradis toolkit, to identify gene commonalities/differences in the selected growth conditions, again using multiple-testing correction by the Benjamini-Hochberg method. For further reading, please refer to DOI: 10.1371/journal.pcbi.1007980.

      • Line 226: do the authors mean Sanger sequencing instead of SangerSanger sequencing?

      Thank you, we have now corrected this in the manuscript.

      • Line 239: does the WT strain contain another marker for differentiating this strain from the mutant? Or is the calculation of the number of WT CFUs done by subtracting the number of CFUs in media with antibiotics from the total number of CFUs in media without antibiotics? The former will be a more accurate method.

      The calculation was based on the latter assumption, “number of WT CFUs done by subtracting the number of CFUs in media with antibiotics from the total number of CFUs in media without antibiotics”. We have now updated the methods section to make this clearer.
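
      The subtraction described above can be written out as a short sketch. The function names, the guard against noisy counts, and the example plate counts are hypothetical and are not taken from the manuscript; the only step stated in the response is that WT CFUs equal the total CFUs on antibiotic-free medium minus the CFUs on antibiotic-containing medium.

```python
def wt_cfu_by_subtraction(total_cfu, resistant_cfu):
    """WT CFUs = CFUs without antibiotic minus CFUs with antibiotic."""
    if resistant_cfu > total_cfu:
        # Plating noise can push the mutant count above the total; flag it
        raise ValueError("antibiotic-plate count exceeds total count")
    return total_cfu - resistant_cfu


def mutant_to_wt_ratio(total_cfu, resistant_cfu):
    """Ratio of mutant (antibiotic-resistant) CFUs to inferred WT CFUs."""
    wt = wt_cfu_by_subtraction(total_cfu, resistant_cfu)
    if wt == 0:
        raise ValueError("no WT CFUs inferred; ratio is undefined")
    return resistant_cfu / wt


# Hypothetical counts (CFU/mL), for illustration only
wt_count = wt_cfu_by_subtraction(2.0e8, 5.0e7)
ratio = mutant_to_wt_ratio(2.0e8, 5.0e7)
```

      Because the WT count is inferred from two independent plate counts, its error combines the error of both, which is the accuracy concern the reviewer raises.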

      • Line 266: can you indicate approximately how many CFUs you have in this OD?

      Thank you, we have now also indicated an approximate CFU for this mentioned OD600 (OD600 of 1 = 7 × 10⁸ cells).

      • Line 309: besides indicating Figure 1D please indicate here Dataset S1 (the table where one can see the list of essential and non-essential genes). This table is shown afterwards but I think it will be more appropriate to show it at the beginning of the section.

      Thank you, we have now taken on this recommendation and have now edited the manuscript to also indicate Dataset S1 earlier.

      • Table 3. regarding the comparison of essential genes between different strains. I think it will be more clear if a Venn diagram was drawn including only genes that have homologs in all the studied strains (i.e. defining the core genome essentially).

      We would like to thank the reviewer for suggesting a Venn diagram and have now removed Table 3, which has been replaced with a new Figure 3.

      • Line 461: replicates were combined for downstream analyses? But are replicates combined for doing the statistical analysis? If so, how is the statistical analysis performed? How is the potential variability in the abundance in each library taken into account? An r of 0.9 is high but not perfect.

      Technical replicates of the sequenced input library were combined following identification of a correlation coefficient of r² = 0.955, for the calculation of insertion index scores used in gene essentiality analysis. While r² = 0.955 is not perfect, discrepancies here can be attributed to higher variance in insertion index scores when sampling small genes, as these are represented by fewer insertions and the stochastic absence of a single insertion event has a greater effect on the overall IIS. Replicate data were not pooled for statistical analysis of mutant fitness (growth in urine and serum).
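
      The argument about small genes can be made concrete with a short sketch. It assumes the common TraDIS convention that an insertion index score is the number of insertions in a gene divided by the gene length in base pairs; the response itself only states that insertions per gene were quantified and pooled. The gene names, lengths, and counts are hypothetical.

```python
def insertion_index(n_insertions, gene_length_bp):
    """Insertions per base pair of gene (assumed definition of the IIS)."""
    return n_insertions / gene_length_bp


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical genes: name -> (length in bp, insertions in replicate 1 and 2)
genes = {
    "geneA": (3000, 60, 58),  # large gene, many insertions
    "geneB": (150, 3, 2),     # small gene, few insertions
    "geneC": (1200, 0, 1),    # candidate essential gene
}

rep1 = [insertion_index(c1, length) for length, c1, _ in genes.values()]
rep2 = [insertion_index(c2, length) for length, _, c2 in genes.values()]
replicate_r2 = r_squared(rep1, rep2)

# Pool the replicates by summing insertions before normalising by length
pooled = {
    name: insertion_index(c1 + c2, length)
    for name, (length, c1, c2) in genes.items()
}

# Relative change in the score when a single insertion is missed
relative_loss = {name: 1 / c1 for name, (_, c1, _) in genes.items() if c1 > 0}
```

      Missing one insertion changes the score of the hypothetical 150 bp gene by a third but that of the 3,000 bp gene by less than 2%, which is the sampling effect the response invokes to explain why the replicate correlation is high but not perfect.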

      • Line 487: is there any control strain containing the kanamycin gene in a part of the genome that does not affect the growth of K. pneumoniae? This could be used to show that having the kanamycin gene does not provide any defect in urine growth.

      We thank the reviewer for this suggestion but argue that introduction of the kanamycin gene into each unique locus may result in various levels of gene fitness that would be incomparable to a single control strain. Instead, we culture the ECL8 mutant library in urine and ensure that its kinetics are comparable to the wildtype. As the library contains thousands of kanamycin cassettes uniquely positioned across most of the genome with no observable growth defect, we do not anticipate the presence or expression of the cassette to have an appreciable impact.

      • Line 569: in the methodology it was indicated that control cells were incubated in PBS for the same amount of time. I think this is an important control that is not cited in the results section. Please can you indicate?

      We apologise for this misunderstanding, which was due to how the methodology was written. The experiment did not sequence the PBS-incubated samples, as these were solely used as a check for viability of the K. pneumoniae ECL8 stock solution used.

      • Line 597: "Mutants in igaA are enriched in our experiments". Can you show this data?

      We have now included this as a supplementary figure (Figure S11A).

      • Line 615: when doing this calculation, I guess the authors take into account only genes that are also present in the other strains.

      That is correct, we were aiming to highlight the high conservation of “essential genes” among all the selected strains.

      • Line 627: why surprisingly? Because it is too low? Then indicate this.

      Thank you, we have now edited this sentence to indicate that.

      • Figure 4: please, for clarity, can you indicate the meaning of the colors in the figure itself besides indicating it in the figure legend?

      Thank you, we have now included a color legend in these figure panels for clarity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors build upon prior data implicating the secreted peptidoglycan hydrolase SagA produced by Enterococcus faecium in immunotherapy. Leveraging new strains with sagA deletion/complementation constructs, the investigators reveal that sagA is non-essential, with sagA deletion leading to a marked growth defect due to impaired cell division, and sagA being necessary for the immunogenic and anti-tumor effects of E. faecium. In aggregate, the study utilizes compelling methods to provide both fundamental new insights into E. faecium biology and host interactions and a proof-of-concept for identifying the bacterial effectors of immunotherapy response.

      We thank the Reviewers for their positive feedback on our manuscript. We also appreciate their helpful comments/critiques and have revised the manuscript as indicated below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Klupt, Fam, Zhang, Hang, and colleagues present a novel study examining the function of sagA in E. faecium, including impacts on growth, peptidoglycan cleavage, cell separation, antibiotic sensitivity, NOD2 activation, and modulation of cancer immunotherapy. This manuscript represents a substantial advance over their prior work, where they found that sagA-expressing strains (including naturally-expressing strains and versions of non-expressing strains forced to overexpress sagA) were superior in activating NOD2 and improving cancer immunotherapy. Prior to the current study, an examination of sagA mutant E. faecium was not possible and sagA was thought to be an essential gene.

      The study is overall very carefully performed with appropriate controls and experimental checks, including confirmation of similar densities of ΔsagA throughout. Results are overall interpreted cautiously and appropriately.

      I have only two comments that I think addressing would strengthen what is already an excellent manuscript.

      In the experiments depicted in Figure 3, the authors should clarify the quantification of peptidoglycans from cellular material vs supernatants. It should also be clarified whether sagA needs to be expressed endogenously within E. faecium, and whether ambient endopeptidases (perhaps expressed by other nearby bacteria or recombinant enzymes added) can enzymatically work on ΔsagA cell wall products to produce NOD2 ligands.

      We mentioned in the main text that peptidoglycan was isolated from bacterial sacculi and digested with mutanolysin for LC-MS analysis. We have now also included “mutanolysin-digested” sacculi in the Figure 3 legend as well.

      We have added the following text “We next evaluated live bacterial cultures with mammalian cells to determine their ability to activate the peptidoglycan pattern recognition receptor NOD2” and “our analysis of these bacterial strains” to indicate live cultures were evaluated for NOD2 activation.

      We have also added the following text “Our results also demonstrated that while many enzymes are required for the biosynthesis and remodeling of peptidoglycan in E. faecium, SagA is essential for generating NOD2 activating muropeptides ex vivo.”

      In the murine experiments depicted in Figure 4, because the bacterial intervention is being performed continuously in the drinking water, the investigators have not distinguished between colonization vs continuous oral dosing of the mice with peptidoglycans. While I do not think additional experimentation is required to distinguish the individual contributions of these 2 components in their therapeutic intervention, I do think the interpretation of their results should include this perspective.

      We have added the following text “We note that by continuous oral administration in the drinking water, live E. faecium and soluble muropeptides that are released into the media during bacterial growth may both contribute to NOD2 activation in vivo.” and revised the following text “Nonetheless, these results demonstrate SagA is not essential for E. faecium colonization, but required for promoting the ICI antitumor activity through NOD2 in vivo.”

      Reviewer #2 (Public Review):

      Summary:

      The gut microbiome contributes to variation in the efficacy of immune checkpoint blockade in cancer therapy; however, the mechanisms responsible remain unclear. Klupt et al. build upon prior data implicating the secreted peptidoglycan hydrolase SagA produced by Enterococcus faecium in immunotherapy, leveraging novel strains with sagA deleted and complemented. They find that sagA is non-essential, but sagA deletion leads to a marked growth defect due to impaired cell division. Furthermore, sagA is necessary for the immunogenic and anti-tumor effects of E. faecium. Together, this study utilizes compelling methods to provide fundamental new insights into E. faecium biology and host interactions, and a proof-of-concept for identifying the bacterial effectors of immunotherapy response.

      Strengths:

      Klupt et al. provide a well-written manuscript with clear and compelling main and supplemental figures. The methods used are state-of-the-art, including various imaging modalities, bacterial genetics, mass spectrometry, sequencing, flow cytometry, and mouse models of immunotherapy response. Overall, the data supports the conclusions, which are a valuable addition to the literature.

      Weaknesses:

      Only minor revision recommendations were noted.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      General comments - the number/type of replicates and statistics are missing from some of the figure panels. Please be sure to add these throughout - all main figure panels should have replicates. I've also noted some specific cases below.

      Abstract - sagA is non-essential, need to edit text at "essential functions".

      This change has been made.

      "small number of mutations" - specify how many in the text.

      We revised the text. “Small number” is changed to “11”.

      "under control of its native promoter" - what was the plasmid copy number? It looks clearly overexpressed in Figure 1d despite using a native promoter, although it's a bit hard to know for sure without a loading control.

      pAM401 has a p15A origin of replication; therefore, the plasmid copy number is ~20-30 copies (Lutz R. et al Nucleic Acids Res. 1997). Total protein was visualized by Stain-Free™ imaging technology (BioRad), serves as the protein loading control, and has been relabeled accordingly.

      "decrease levels of small muropeptides" - the asterisks are missing from Figure 3a.

      Green asterisks for peaks 2, 3, 7 and purple asterisks for peaks 13, 14 were added.

      The use of "Com 15 WT" in the figures is confusing - just replace it with "wt" and specify the strain in the text. Presumably, all of the strains are on the Com 15 background.

      “Com15 WT” was replaced with “WT” in figures and main text.

      Change 1d to 1b so that the panels are in order (reading left to right and then top to bottom).

      Figure 1 legend is missing a number of replicates and statistics for 1a.

      The number of replicates was added.

      Figure 1b - it's unclear to me what to look at here, could add arrows indicating the feature of interest and expand the relevant text.

      Arrows pointing to cell clusters were added.

      Figure 1d - what is "stain free"? It would be preferable to show a loading control using an antibody against a constitutive protein to allow for normalization of the loading control.

      Stain-Free Imaging technology (BioRad) utilizes a trihalo compound contained in the gel to make proteins fluorescent directly in the gel after a short photoactivation, allowing the immediate visualization of proteins at any point during electrophoresis and western blotting. Stain-Free total protein measurement serves as a reliable loading control comparable to Coomassie Blue staining. This has been relabeled as “Total protein” in the figure, and Stain-Free imaging technology is noted in the legend.

      ED Figure 1 - representative of how many biological replicates?

      Legends are updated.

      ED Figure 2a - I would replace this with a table, it's not necessary to show the strip images. Also, please specify the number of replicates per group.

      Additional Extended Data Table 2 was added.

      ED Figure 2b - This data was not that convincing since the sagA KO has a marked growth defect and the time points are cut off too soon to know if growth would occur later. The MIC definition is potentially misleading. Should specify a % growth cutoff (i.e. <10% of vehicle control) and the metric used (carrying capacity or AUC). Then assign MIC to the tested concentration, not a range. The empty vector also seems to impact MIC, which is concerning and complicates the interpretation. Specify the number of replicates and add statistics. Given these various concerns, I might suggest removing this figure, as it doesn't really add much to the story.

      We appreciate this comment from the Reviewer, but believe this data is helpful for the paper and have included longer time points for the growth data. The definition of MIC for ED Fig. 2b has been included in the legend.

      Figure 2 - specify the type of replicate. Number of cells? Number of slices? Number of independent cultures?

      For Cryo-ET experiments, single bacterial cultures were prepared. The numbers of cells and slices analyzed are indicated in the legend. Legends are updated.

      Figure 4e - missing the water group, was it measured?

      Water (αPD-L1) group was not included in immune profiling of tumor infiltrating lymphocytes (TILs) experiment, as we have previously demonstrated limited impact on ICI anti-tumor activity and T cell activation in this setting (Griffin M et al Science 2021).

      Figure 4d - is this media specific to your strains? If not, qPCR may be a better method using strain-specific primers.

      Yes, HiCrome™ Enterococcus faecium agar plates (HIMEDIA 1580) are selective for Enterococcus species. Moreover, the agar is chromogenic, allowing E. faecium to be identified as yellow colonies among other Enterococcus species.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      RC-2023-02105R: Brunetta et al.,

      IF1 is a cold-regulated switch of ATP synthase to support thermogenesis in brown fat

      We are happy to submit our revised manuscript after considering the suggestions made by reviewers. The comments were overall positive, and the changes requested were mostly editorial. We have, nevertheless, added new experiments as quality controls. These experiments did not affect the main conclusions of our work. In addition, we also included two in vivo experimental models of gain and loss-of-function, to further address the physiological relevance of IF1 in BAT thermogenesis. We believe with these additional experiments, quality controls as well as in vivo models, our study has improved considerably. We hope our efforts will be appreciated by the reviewers and we make ourselves available to answer any further questions.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: In the present manuscript, the authors present data in support of their primary discovery that "IF1 controls UCP1-dependent mitochondrial bioenergetics in brown adipocytes". The opening figure convincingly demonstrates that IF1 expression is cold-exposure dependent. They then go on to show that loss of IF1 has functional consequences that would be predicted based on IF1's known role as a regulator of ATP hydrolysis by CV. They go on to make a few additional claims, succinctly detailed in the Discussion section. Specific claims include the following: 1) IF1 is downregulated in cold-adapted BAT, allowing greater hydrolytic activity of ATP synthase by operating in the reverse mode; 2) when IF1 is upregulated in brown adipocytes in vitro, mitochondria are unable to sustain the MMP upon adrenergic stimulation; 3) IF1 ablation in brown adipocytes phenocopies the metabolic adaptation of BAT to cold; and 4) IF1 overexpression blunts mitochondrial respiration without any apparent compensatory response in glycolytic activity. The claims described above are well supported by the evidence. The manuscript is very well written, figures are clear and succinct. Overall, the quality of the work is very high. Given that IF1 is implicated across many fields of study, the novel discovery of IF1 as a regulator of brown adipose mitochondrial bioenergetics will be of significance across several fields. That said, a few areas of concern were apparent. Concerns are detailed in the "Major" and "Minor" comments section below. Additional experiments do not appear to be required, assuming the authors adequately acknowledge the limitations of the study and either remove or qualify speculative claims.

      Major Comments:

      1. The authors convincingly demonstrate that IF1 expression is specifically down-regulated in BAT upon cold-exposure. These data strongly implicate a role for IF1 in BAT bioenergetics, a major claim of the authors and a novel finding herein. Additional major strengths of the paper, which provide excellent scientific rigor include the use of both loss of function and gain of function approaches for IF1. In addition, the mutant IF1 experiments are excellent, as they convincingly show that the effects of IF1 are dependent on its ability to bind CV.

      RESPONSE: We thank the reviewer for the positive feedback on our work.

      Regarding Figure 1 - Did the content of ATP synthase change? In figure 1A-B, the authors show that ATPase activity of CV is higher in cold-adapted mice. While this result could be due to a loss of IF1, it could also be due to a higher expression of CV. To control for this, the authors should consider blotting for CV, which would allow for ATPase activity to be normalized to expression.

      RESPONSE: Thank you for this suggestion. We have now determined complex V subunit A in our experimental protocol. We found that cold exposure does not impact complex V protein levels. Given the importance of this control, we have now included it in Figure 1 (Please, see the revised version) alongside the IF1/complex V ratio. In addition, we have now performed WBs in the BAT from mice exposed for 3 and 7 days to thermoneutrality (~28°C). We found that IF1 is not reduced following whitening of BAT by this approach whilst UCP1 and other mitochondrial proteins are reduced. This set of data is now included in Figure 1I,K,L.

      Regarding MMP generated specifically by ATP hydrolysis at CV, the reversal potential for ANT occurs at a more negative MMP than that of CV (PMID: 21486564). Because reverse transport of ATP (cytosol to matrix) via ANT will also generate a MMP, it is speculative to state that the MMP in the assay is driven by ATP hydrolysis at CV. It is possible and maybe even likely that the majority of the MMP is driven by ANT flux, which in turn limits the amount of ATP hydrolyzed by CV. Admittedly, it is very challenging to differentiate MMP from ANT vs that from CV, thus the authors simply need to acknowledge that the specific contribution of ATP hydrolysis to MMP remains to be fully determined. That said, the fact that ATP-dependent MMP tracks with IF1 expression does certainly implicate a role for ATP hydrolysis in the process. The authors should consider including a discussion of the ambiguity of the assay to avoid confusion. A role for ANT likely should be incorporated in the Fig. 1J cartoon.

      RESPONSE: Thank you for bringing the ANT contribution to MMP to our attention. The effects of ATP in the real-time MMP measurements were totally abolished by the addition of oligomycin in BAT-derived isolated mitochondria, thus suggesting dependency on complex V in this process. However, the assessment of MMP in intact cells is much more challenging given the cytosolic vs. mitochondrial contribution to the ATP pool, and ATP synthase vs. ANT reversal capacity depending on MMP. Nevertheless, we have addressed these points in the discussion section as well as added to our schematic cartoon in Figure 1m.

      Regarding the lack of effect of IF1 silencing on MMP, it is possible that IF1 total protein levels are simply lower in cultured brown fat cells relative to tissue? The authors could consider testing this by blotting for IF1 and CV in BAT and brown fat cells. The ratio of IF1/ATP5A1 in tissue versus cells may provide some amount of mechanistic evidence as to their findings.

      RESPONSE: We have now blotted for complex V and IF1 in both differentiated primary brown adipocytes and BAT homogenates derived from mice kept at room temperature (~22°C). We found that the levels of complex V in primary brown adipocytes are higher than in BAT homogenates. Therefore, the IF1/complex V ratio is different between these two systems. This indeed has the potential to influence our gain and loss-of-function experiments. We have added these results alongside their interpretation in the revised manuscript.

      The calculation of ATP synthesis from respiration sensitive to oligomycin has many conceptual flaws. Unlike glycolysis, where ATP is produced via substrate level phosphorylation, during OXPHOS, the stoichiometry of ATP produced per 2e transfer is not known in intact brown adipose cells. This is a major limitation of this "calculated ATP synthesis" approach that is beginning to become common. Such claims are speculative and thus likely do more harm than good. In addition to ANT and CV, there are many proton-consuming reactions driven by the proton motive force (e.g., metabolite transport, Ca2+ cycling, NADPH synthesis). Although it remains unclear how much proton conductance is diverted to non-ATP synthesis dependent processes, it seems highly likely that these processes contribute to respiratory demand inside living cells. Moreover, just as occurs with UCP1 in response to adrenergic stimuli, proton conductance across the various proton-dependent processes likely changes depending on the cellular context, which is another reason why using a fixed stoichiometry to calculate how much ATP is produced from oxygen consumption is so highly flawed. Maximal P/O values that are often used for NAD/FAD linked flux are generated using experimental conditions that favor near complete flux through the ATP synthesis system (supraphysiological substrate and ADP levels). The true P/O value inside living cells is likely to be lower.

      RESPONSE: We agree with the reviewer regarding the limitations on calculating ATP production in intact cells based on respiration and proton flux. However, this was only one experiment on which we based our conclusions, as these were also supported by, e.g., ATP/ADP ratio measurements and oxygen consumption using different substrates. Therefore, we do not rely exclusively on the ATP production estimate; rather, we use this experiment to support complementary methodologies. Nevertheless, we have now better detailed our experimental protocol as well as acknowledged the limitations of the method, so the reader is aware of our procedure and its limitations. We hope the reviewer understands our motivation to perform these experiments and the contribution to our study.
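      To make the limitation under discussion concrete, the following minimal sketch shows how a "calculated ATP synthesis" rate scales with the assumed P/O ratio. It assumes the commonly used form of the estimate (oligomycin-sensitive OCR × 2 oxygen atoms per O2 × P/O); the OCR values and the P/O values are hypothetical and are not taken from the manuscript.

```python
def atp_rate_from_ocr(ocr_basal, ocr_oligomycin, p_o_ratio):
    """Estimate OXPHOS ATP production (pmol ATP/min).

    ocr_basal, ocr_oligomycin: oxygen consumption rates (pmol O2/min)
    before and after oligomycin; their difference is treated as
    ATP-linked respiration.
    p_o_ratio: assumed ATP made per oxygen atom reduced (an assumption,
    not a measured quantity in intact cells).
    """
    if p_o_ratio < 0:
        raise ValueError("P/O ratio cannot be negative")
    atp_linked_ocr = ocr_basal - ocr_oligomycin
    # Each O2 molecule contains two oxygen atoms, hence the factor of 2.
    return atp_linked_ocr * 2 * p_o_ratio


# Hypothetical OCR values (pmol O2/min), for illustration only.
ocr_basal, ocr_oligomycin = 100.0, 60.0

# The same respiration data give different ATP rates under different P/O.
for assumed_p_o in (1.5, 2.0, 2.5):
    rate = atp_rate_from_ocr(ocr_basal, ocr_oligomycin, assumed_p_o)
    print(f"P/O = {assumed_p_o}: {rate} pmol ATP/min")
```

      Because the estimate is linear in the assumed P/O value, any uncertainty in the true stoichiometry inside living cells carries over proportionally to the calculated ATP production.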

      Why are the results in Figure 3K expressed as a % of basal? Could the authors please normalize the OCR data to protein and/or provide a justification for why different normalization strategies were used between 3K and 3M?

      RESPONSE: We apologize for the lack of consistency. We have now updated Figure 3 to show all the data in absolute values divided by protein content. This change does not affect the overall interpretation of the findings.

      The authors claim that IF1 overexpression lowers ATP production via OXPHOS. However, given the major limitations of this assay (as discussed above), these claims should be viewed as speculation. This needs to be addressed by the authors as a major limitation. The fact that the ATP/ADP levels did not change does not support a reduction in ATP production, as claimed in the title of Figure 4.

      RESPONSE: The reduction in ATP levels and mitochondrial respiration (independent of the substrate offered) suggests a reduction in ATP production rather than an increase in ATP consumption. Moreover, the maintenance of ATP/ADP ratio suggests the existence of a compensatory mechanism to avoid cellular energy crises, which we interpreted as reduced metabolic activity of the cells. Nevertheless, we have now reworded our statements to address the limitations of the methods and our interpretation of the data.

      In the discussion, the authors state "However, considering that IF1 inhibits F1-ATP synthase in a 1:1 stoichiometric ratio, the relatively higher expression of IF1 in BAT at room temperature could represent an additional inhibitory factor for ATP synthesis in this tissue." This does not appear to be correct. Although IF1 has been suggested to partially lower maximal rates of ATP synthesis rates, most of this evidence comes from over-expression experiments. According to the current understanding of IF1-CV interaction, the protein is expelled from the complex during rotation in favor of ATP synthesis (PMID: 37002198). It is far more likely that ATP synthesis is low in BAT mitochondria due to the low CV expression. Relative to heart and when normalized to mitochondrial content, CV expression in BAT mitochondria is about 10% that of heart (PMID: 33077793).

      RESPONSE: We agree with the reviewer and removed this sentence.

      The last sentence of the manuscript states, "Given the importance of IF1 to control brown adipocyte energy metabolism, lowering IF1 levels therapeutically might enhance approaches to enhance NST for improving cardiometabolic health in humans." This sentence seems at odds with the evidence that IF1 levels go up, not down, in human BAT upon cold exposure.

      RESPONSE: In light of our new experiments, we have now updated our conclusions.

      Minor Comments:

      The term "anaerobic glycolysis" is used throughout. All experiments were performed under normoxic conditions, thus the correct term is "aerobic glycolysis".

      RESPONSE: Thank you for this comment and we have replaced this term as suggested.

      Only male mice were used in the study, could the authors please provide a justification for this?

      RESPONSE: Given we devoted most of our efforts to the manipulation of IF1 in vitro, we have used the mouse model as a proof-of-principle on the impact of IF1 in adrenergic-induced thermogenesis. We have now included IF1 KO male and female mice to address the role of IF1 in adrenergic-induced thermogenesis. However, due to the limitation of material, we could only perform AAV in vivo gain-of-function in male mice, therefore, our results cannot be immediately transferred to both sexes, unfortunately.

      Reviewer #1 (Significance (Required)):

      Overall, the quality of the work is very high. Given that IF1 is implicated across many fields of study, the novel discovery of IF1 as a regulator of brown adipose mitochondrial bioenergetics will be of significance across several fields.

      My expertise is in mitochondrial thermodynamics; thus, I do not feel there are any parts of the paper that I do not have sufficient expertise to evaluate.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      The manuscript by Brunetta and colleagues conveys the message that the ATPase inhibitory factor 1 (IF1) protein, a physiological inhibitor of mitochondrial ATP synthase, is expressed in BAT of C57BL/6J mice. Moreover, upon cold adaptation of mice they report that the content of IF1 in BAT is downregulated to sustain the mitochondrial membrane potential (MMP) as a result of reverse functioning of the enzyme. In experiments of loss and gain of function of IF1 in cultured brown adipocytes and WT cells they further stress that IF1 silencing promotes metabolic reprogramming to an enhanced glycolysis and lipid oxidation, whereas IF1 overexpression blunts ATP production rendering a quiescent cellular state of the adipocytes.

      RESPONSE: We appreciate the time the reviewer invested in our work. Please, see our responses below in a point-by-point manner.

      Reviewer #2 (Significance (Required)):

      Claims and conclusions:

      I have been surprised by the claim that IF1 protein is expressed in BAT under basal conditions and that its expression is downregulated in the cold-adapted tissue. In a previously published work by Forner et al., (2009) Cell Metab 10, 324-335 (reference 43), using a quantitative proteomic approach, it is reported that the mitochondrial proteome of mouse BAT under basal conditions contains a low content of IF1 (at level comparable to the background of the analysis). Remarkably, in the same study they show that there is roughly a 2-fold increase in the content of IF1 protein in mitochondria of BAT at 4d and 24d of cold-adaptation of mice. In other words, just the opposite of what is being reported in the Brunetta study.

      RESPONSE: We are aware of the inconsistencies between our findings and Forner et al. (2009). We would like to point out that we have determined IF1 levels in BAT in two separate cohorts with the same findings, and in a third cohort, we observed IF1 mRNA levels to be downregulated in a much shorter timeframe. Our functional analysis is in line with this pattern of regulation. A closer look at the supplementary table provided by Forner et al. (2009) shows that the increase in IF1 content following cold exposure is not supported, and since we do not have further insight into the methods and analysis employed by the Forner et al. group, we believe a direct comparison should be avoided at the moment. Regarding the baseline levels of IF1 in BAT, the relatively high abundance of IF1 in BAT was also found by another independent group (https://doi.org/10.1101/2020.09.24.311076).

      Importantly, the last paragraph of the discussion needs to be amended when mentioning the work of Forner et al. (ref.43). The mentioned reference studied changes in the mouse mitochondrial proteome not in human mitochondria, as it is stated in the alluded paragraph.

      RESPONSE: We apologize for this oversight; we have now reworded our statement.

      More puzzling are the western blots in Figures 1E, 1H, Supp. Fig. 1C, D where IF1 (ATP5IF1) is identified by a 17kDa band. However, in other Figures (Fig. 2, Fig. 3, Fig. 4, Supp Fig. 2) IF1 is identified by its well-known 12kDa band. What is the reason for this change in labeling of the IF1 band? The reactivity of the anti-IF1 antibody used? It has been previously documented that liver of C57BL/6J and FVB mouse strains do not express IF1 to a significant level when compared to heart IF1 levels (Esparza-Molto (2019) FASEB J. 33, 1836-1851). However, in Fig. 1E they show opposite findings, much higher levels of IF1 in liver than in heart as revealed by the 17kDa band. Moreover, in Fig. 1H they show the vanishing of the 17 kDa band under cold adaptation, which is not the migration of IF1 in gels as shown in their own figures (see Fig. 2, Fig. 3, Fig. 4, Supp Fig. 2). I am certainly reluctant to accept that the 17kDa band shown in Figures 1E, 1H, Supp. Fig. 1C, D is indeed IF1. Most likely it represents a non-specific protein recognized by the antibody in the tissue extracts analyzed. Cellular overexpression experiments of IF1 in WT1 cells (Fig. 2E) and primary brown adipocytes (Fig. 4B) also support this argument. Overall, I do not support publication of this study for the reasons stated above.

      RESPONSE: We understand the concerns raised by the reviewer and apologize for the lack of details in our experimental procedures. While we used the same antibody in the study (Cell Sig. cat. Num. 8528, 1:500), we used two different types of gels. The difference in the molecular weight appearance of IF1 is likely due to the migration of the protein in the agarose gel. By using custom-made gels, we observe the protein at ~17kDa (Fig. 1 and 5), whereas by using commercial gels (Fig. 2, 3, and 4), we observe the protein closer to the predicted molecular weight (i.e. ~12kDa). Of note, gain and loss-of-function experiments, both in vivo and in vitro, confirm this statement and the specificity of the antibody (Fig. 2, 3, 4, 5, Fig. EV2). In addition, when we ran a custom-made gel with primary BAT cells, we again observed the ~17kDa band (see Figure for the reviewer below). These experiments, alongside the absence of other bands in the gels (see uncropped membranes in Supplementary Figure 1), lead us to conclude that the band we observe is indeed IF1. Nevertheless, we have now updated our methods section, so the reader is aware of our approaches. We hope the reviewer is satisfied with our additional experiments and edits throughout the manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this manuscript, Brunetta et al. describe the role of IF1 in brown adipose tissue activation using in vivo and in vitro experimental models. They observed that cold adaptation promotes a reduction in IF1 expression and an increase in the reverse activity of mitochondrial ATPase or Complex V. Based on these results, the authors explore the contribution of IF1 in this metabolic pathway by modeling the thermogenic process in differentiated primary brown adipocytes. They silenced and overexpressed IF1 in culture and studied their adrenergic stimulation under norepinephrine.

      Major comments:

      The experiments are well explained and the manuscript flows very well. There are several comments that should be addressed.

      RESPONSE: We thank the reviewer for the kind words regarding our work.

      1. The authors measure ATP hydrolysis in isolated mitochondria from BAT in Figure 1. They observed that IF1 is decreased upon cold exposure and that ATP hydrolysis is increased. They assess protein levels of different OXPHOS proteins, including IF1 but not other proteins of Complex V (ATP5A) as they do in Figures 3 and 4. It is important to see that cold exposure only affects IF1 levels but not other proteins from Complex V. Does IF1/Complex V ratio change?

      RESPONSE: We thank the reviewer for this suggestion which was also raised by Reviewer #1. We have now measured complex V subunit A in our experimental protocol. We found that cold exposure does not impact complex V protein levels. Given the importance of this information, we have now included it in Figure 1 (Please, see the revised version) alongside the IF1/complex V ratio. In addition, we have now performed WBs in the BAT exposed for 3 and 7 days to thermoneutrality (~28°C) where we found that IF1 is not reduced following whitening of BAT by this approach whilst UCP1 and other mitochondrial proteins are reduced.

      This set of data is now included in Figure 1I,K,L.

      In Figure 2J, the drop in MMP is lower upon adrenergic stimulation than in Figure 2E. The same observation applies to other results when the reduction in MMP after NE addition is minimal. Why do the authors remove TMRM for the measurements of membrane potential? TMRM imaging is normally done in the presence of the dye in non-quenching mode. Treatments should be done prior to the addition of the dye and then TMRM should be added and left during the imaging analysis and measured in non-quenching mode. This might explain some of the above-mentioned points regarding the MMP data. Alternatively, if the dye is removed before the measurements, they should let the cells adapt so that the dye equilibrates between mitochondria and cytosol. A more elegant method to measure membrane potential could be live-cell imaging. In addition, the authors propose that mitochondrial membrane potential upon NE stimulation is maintained by reversal of ATP synthase. If this is the case, one would expect that addition of oligomycin in NE treated adipocytes would cause depolarization. However, in FigS2A this is not the case. The authors should comment on this in addition to considering a more elegant approach to measure MMP.

      RESPONSE: We apologize for the lack of details in the methods. All treatments (i.e., transfection and norepinephrine stimulation) were performed before the addition of TMRM. Indeed, this approach does not have the resolution compared to safranine in isolated mitochondria (Fig. 1D), which limits our interpretation regarding the dynamic role of IF1 on MMP in brown adipocytes. We have taken care to state the limitations of our method throughout the entire paper to avoid overinterpretation of our data. Regarding the removal of the dye before the measurements, our internal controls indicate that this procedure does not change the ability of our method to detect fluctuations in MMP (i.e., oligomycin and FCCP as internal controls). Nevertheless, as suggested by the reviewer, to test the time effect of the probe equilibrium (i.e., mitochondria versus cytosol) in our method, we loaded cells with TMRM 20 nM for 30 min and measured the fluorescence right after the removal of the probe/washing steps for another 10 min. We were not able to detect differences in the fluorescence in a time-dependent manner (see below). Therefore, we conclude the removal of TMRM does not influence the fluorescence of the probe in differentiated brown adipocytes.

      [Figure for the reviewer: TMRM fluorescence over time after probe removal; panels +NE and -NE.]

      In addition, we performed a similar experiment using TMRM in the quenching mode (200 nM); however, after the removal of TMRM, we added FCCP (1 µM) to the cells for 10 min under constant agitation at 37°C. This approach aimed to expel all TMRM that accumulated within the mitochondria in an MMP-dependent manner, thereby excluding the dynamic Brownian movement that we could have caused by the removal of the dye before the measurement, as mentioned by the reviewer. By doing this, we found the same effect of IF1 overexpression in the reduction of MMP in the presence of norepinephrine.

      Protocol:

      • Transfection (24h) on day 4 of differentiation + 24h just normal media

      • 30 min norepinephrine 10 µM

      • 200 nM TMRM on top of NE

      • Washing step

      • Add FCCP 1 µM for 10 min, and read (The aim here was to release all TMRM accumulated inside of mitochondria in a MMP-dependent manner)

      In summary, the data suggest that the removal of the dye from the cells does not influence the fluorescence of TMRM, therefore enabling us to make conclusions regarding the biological effects of IF1 manipulation on the MMP of brown adipocytes. Regarding the reverse mode of ATP synthase and the absence of effects with oligomycin, given that oligomycin inhibits both directions of rotation of ATP synthase and that even uncoupled brown adipocytes respond to oligomycin (i.e. reduction in O2 consumption), the prediction of lowering MMP in the presence of oligomycin due to inhibition of the reverse mode of ATP synthase is more complicated than anticipated. Nevertheless, we have now addressed this topic in the discussion section. Lastly, we generally observe a reduction in MMP of around 10-25% in differentiated adipocytes upon NE treatment (30 minutes, 10 µM). However, due to the differentiation state of the cells, the MMP response to norepinephrine fluctuated from experiment to experiment. Therefore, we did not compare experiments performed on different days or batches, but only within the same differentiation batch to reduce variability.

      In Figure 2, in the model of siIF1, there is more baseline phosphorylation of AMPK (pAMPK) than in the scramble control. However, this is not the case for p-p38MAPK. Do the authors have any explanation for those differences in baseline activation of the stress kinases when IF1 is silenced? In the same experimental group, addition of NE seems to have more effect in the scrambled than in siIF1, but the plotted data does not reflect these differences. In contrast, the increase in pAMPK upon NE is higher in IF1 overexpressing cells compared to EV (Figure 2H), but again this is not reflected in the western blot quantification (Figure 2I).

      RESPONSE: Although some differences in pAMPK in the treatments were observed as gathered by the representative blots, these changes were not confirmed later in different biological replicates, therefore, the overall effect of IF1 manipulation in pAMPK does not change. Given we used this approach as quality control for our experiments to guarantee norepinephrine treatment works, we removed the pAMPK data from the study and kept p38 as a marker of adrenergic signaling activation (please see revised Fig. 2 in the main file).

      Does NE promote decrease of IF1 expression in control (siScramble and EV) adipocytes? The authors should test it and see whether it goes in the same direction as the observations derived from the experiments in cold exposed mice. This is very important point, as it could explain the lack of an additional effect of IF1 silencing on NE-induced depolarization (Figure 2E).

      RESPONSE: We thank the reviewer for this suggestion. In line with the in vivo data, acute NE treatment in differentiated brown adipocytes does not change IF1 mRNA and protein levels. We have now added this information and the corresponding interpretation to the updated manuscript.

      Does NE promote decrease of IF1 expression in the scramble and EV adipocytes? The authors should test it and see whether it goes in the same direction as the observations derived from the experiments in cold exposed mice.

      RESPONSE: As this question is the same as #4, we believe the reviewer may have erroneously pasted this here.

      For MMP data in Fig2, they should include significance between non treated and NE-treated groups. They say: "While UCP1 ablation did not cause any effect on MMP upon adrenergic stimulation...", but NE caused (probably significant) depolarization in siUCP1, which seems even stronger than depolarization in EV. This is opposite to what you would expect. They also didn't confirm UCP1 silencing with western blot.

      RESPONSE: We thank the reviewer for this suggestion. We have now included the expected statistical main effect of NE upon MMP. Although the effects of IF1 overexpression were blunted when Ucp1 was silenced, we indeed still observed the same degree of reduction in MMP in brown adipocytes. This finding has two possible explanations, one is the effectiveness of the silencing protocol, therefore, residual Ucp1 expression may still play a role in this experiment; second, other ATP-consuming processes are able to lower MMP in a UCP1-independent manner. We have added this information to the updated manuscript to make the reader aware of our findings as well as the limitations of the method. Unfortunately, we were not able to detect UCP1 protein levels due to technical issues. Given the effects of IF1 overexpression were blunted when Ucp1 was silenced, we believe this functional outcome is sufficient, alongside mRNA levels, to demonstrate the effectiveness of our silencing protocol.

      It has been established that decreased expression of IF1 promotes increase in the reverse activity of Complex V, ATP hydrolytic activity. Increase in ATP hydrolysis also affects ECAR. The authors should consider this when calculating the contribution of ATP glycolysis versus ATP OXPHOS since the ATP hydrolysis is also playing a role in the ECAR increase. The data should be reinterpreted. ATP hydrolysis should be measured in the situation where IF1 is silenced and overexpressed. These measurements can be done in cells using the seahorse.

      RESPONSE: The only differences we observed in MMP are in the presence of norepinephrine (i.e. UCP-1-dependent proton conductance), which is not present during the estimation of ATP production by Seahorse analysis. Nevertheless, we have now improved the description of our experimental protocol and limitations to estimate ATP production to make it as clear as possible to the reader. Lastly, given the addition of in vivo gain-of-function experiments, we have now determined the ATP hydrolytic activity in this model, which offers a better understanding of the in vivo modulation of IF1 levels affecting ATP synthase activity (reverse mode). We hope the reviewer understands our motivation to focus on the in vivo model of gain-of-function regarding ATP synthase activity.

      The authors use GAPDH as loading control in western blots. They should use another protein since GAPDH is part of the intermediary metabolism and plays a role in glycolysis.

      RESPONSE: We understand the concern of the reviewer regarding the use of GAPDH as a loading control for the studies of metabolism. However, as can be observed by the western blot images, GAPDH levels do not change in our experimental models, therefore, we feel confident that our loading is homogeneous throughout our gels.

      The authors show that reduction of IF1 involves more lipid utilization. They should include more experiments showing the connection of the metabolic adaptation in the absence of IF1 and some lipid imaging.

      RESPONSE: We appreciate this suggestion. We have now performed Oil Red O staining in differentiated adipocytes following ablation of IF1. However, we did not observe any effect on lipid accumulation in primary brown adipocytes following IF1 knockdown. Therefore, the effects of IF1 ablation on lipid mobilization are not due to lipid content or reflected in lipid accumulation. We have now added this new information to the manuscript (please, see the revised form Fig. EV3).

      In the text, "Despite this adjustment of experimental conditions, we did not detect any effect of IF1 ablation on mitochondrial oxygen consumption (Supplementary Fig. 3A,B)", this is true for baseline, NE-driven and ATP-linked respiration, but what about maximal respiration? There is a huge increase in IF1 knockdown... They should explain these results.

      RESPONSE: We performed this experiment to address the question of whether the lipid mobilization induced by norepinephrine would uncouple mitochondria in a UCP1-independent manner. Given the absence of effect between scrambled and IF1 ablated cells in mitochondrial respiration in the presence of norepinephrine and following the addition of oligomycin, we concluded no effect of lipolysis-induced UCP1-independent uncoupling. However, as observed by the reviewer and consistent with other data within the study, the interaction between lipid metabolism and IF1 knockdown seems to affect maximal electron transport chain activity, which although interesting, was not the focus of the present study. Nevertheless, we have now acknowledged these findings and a possible explanation for them in the revised manuscript.

      In Figure 3K they present OCR as % of baseline, but in a similar experiment in Figure 4G it is OCR/protein, they should make the Y axis consistent across experiments.

      RESPONSE: We apologize for this oversight. We have now edited all the axes and labels for consistency.

      The graphical abstract is confusing. In BAT there are two populations of mitochondria, the cytosolic and the mitochondria attached to the lipid droplet, peridroplet mitochondria (PDM). Upon adrenergic stimulation, PDM leave the lipid droplet and lipolysis takes place. The authors propose that upon adrenergic stimulation, IF1 is reduced and there is lipid mobilization. The part of the scheme where it says "fully recruited" should be removed or rewritten, since adrenergic stimulation is not compatible with mitochondria recruitment around the lipid droplet.

      RESPONSE: Thank you for this input. Given the addition of new experiments and interpretation, we have now redrawn the graphical abstract and addressed this topic in the discussion section.

      The title should be rewritten to better reflect the research presented in the manuscript.

      RESPONSE: Thank you for this input. Given the addition of new experiments, we have now rewritten the title accordingly.

      Minor comments:

      Some of the Y axis should be corrected. For example, in Figure 2J, L and M should say % of EV untreated, Similarly, in Figure 2E, it should say % of scramble untreated. In Figure 3N, the Y axis is misspelled. All the Y axis referring to percentages should have the same scale for comparison purposes.

      RESPONSE: Thank you for the proofreading. We have now edited the scales and labels to keep consistency.

      The authors should better describe the results corresponding to Figure 2. There is a lot of information and they should improve the description pertaining to the connection between the different pieces of data relating the different signaling pathways that are shown. For westerns in this Figure, they should provide some rationale (one to two sentences in the results section) as to why they are checking the expression of pAMPK and p38-MAPK.

      RESPONSE: We have now edited the description of our results to make them as clear as possible.

      Here are some comments referring to the methods section:

      For Complex V hydrolytic activity, the reaction buffer contains 10mM Na-azide. I guess this is to inhibit respiration, but wouldn't azide also inhibit complex V at this concentration?

      RESPONSE: We thank the reviewer for this question. To test that, we performed complex V activity in buffers containing or not 10 mM sodium azide. As demonstrated below, the presence of sodium azide in the buffer does not influence complex V activity in two different tissues with low and high complex V activity (BAT and heart, respectively).

      Table 1. ATP synthase hydrolytic activity in the presence or absence of Na-azide.

      Condition                  BAT             Heart
      +Na-azide                  100 ± 43.01     100 ± 39.36
      -Na-azide                  82.6 ± 4.33     111.3 ± 43.32
      +Na-azide + oligomycin     15.3 ± 4.32*    13.8 ± 14.01*
      -Na-azide + oligomycin     14.2 ± 3.53*    11.9 ± 2.88*

      Data presented as % of control (i.e. presence of Na-azide and absence of oligomycin) for both tissues independently. N = 2-3/condition. Statistical test: two-way ANOVA. * main effect of oligomycin (p …).

      In the mitochondrial isolation protocol, they say "mitochondria were centrifuged at 800g for 10min..." Will this speed pellet the mitochondria? I think this is a mistake in writing.

      RESPONSE: We apologize for the lack of clarity. What was centrifuged at 800 g was the whole-tissue homogenate to discard cellular debris, before pelleting mitochondria at 5000 g. We have now corrected this mistake in the methods section.

      For the safranin-O experiment, they don't mention mitochondrial substrate used, probably it's in the reference that they provide, but I think it should be included in the text.

      RESPONSE: We did not use any substrate because our goal was to test the contribution of ATP synthase to mitochondrial membrane potential. For that, we inhibited proton movement within the ETC with antimycin A and through UCP1 with GDP (see Methods). We have now edited our Methods description to make sure the reader is aware of our approach.

      Reviewer #3 (Significance (Required)):

      The manuscript is well written, and it flows well when reading. However, there are some additional experiments that need to be performed to reach the conclusions the authors claim.

      RESPONSE: We thank the reviewer for the positive comments regarding our work and hope to have answered the open questions with the edits and new experiments.

      The role of ATP hydrolysis in BAT thermogenesis is novel and interesting, as it can shed some light on potential approaches to promote BAT activation.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      This is an interesting investigation into the activity of IF1 in brown adipocytes. The findings are innovative and the conclusion is well-supported by the data. The conclusion is in line with previous reports on IF1 activities in other cell types, particularly in terms of its regulation of FoF1-ATPase. The authors have executed an exceptional job in designing the study, preparing the figures, and writing the manuscript. Overall, this study significantly contributes to the understanding of IF1 activity in brown adipocytes and its role in thermogenesis.

      RESPONSE: We thank the reviewer for the kind words. Please, find below our answers in a point-by-point manner.

      Reviewer #4 (Significance (Required)):

      The study demonstrates involvement of IF1 in regulating thermogenesis in brown adipocytes, which is a unique aspect not covered in existing literature. Advantage of the study is well-designed cellular studies. The major weakness is lack of proof of conclusion in vivo. There are a few minor concerns that should be addressed to further enhance quality of the manuscript.

      RESPONSE: We have now included two in vivo models, whole-body IF1 KO mice and mice with BAT-injected IF1 overexpression, to test the role of IF1 in BAT biology. The whole dataset is included in the main manuscript, where we conclude that BAT IF1 overexpression partially suppresses β3-adrenergic induction of thermogenesis alongside a reduction (overall and UCP1-dependent) in mitochondrial oxygen consumption. Also, similar to our in vitro experiments, IF1 KO mice did not present any difference in adrenergic-stimulated oxygen consumption.

      1. Current discussion does not mention the regulation of IF1 protein by the cAMP/PKA pathway. This point should be included to provide a comprehensive understanding of the regulatory mechanisms of IF1 protein.

      RESPONSE: Thank you for this suggestion. We have now added this topic to the discussion.

      It has been reported that IF1 also influences the structure of mitochondrial crista. Considering the observed changes with IF1 knockdown, it would be valuable to discuss this activity in relation to the findings of the study.

      RESPONSE: We discussed the implications of IF1 modulation in mitochondrial morphology in the revised manuscript.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2023-02218R

      Corresponding author(s): Steven McMahon

      1. General Statements [optional]

      We were pleased to receive the encouraging critiques and very much appreciate the Reviewers' specific comments and suggestions. In this revised version of our manuscript, we have made a number of substantive additions and modifications in response to these comments and suggestions. We hope you agree that the study is now improved to the point where it is suitable for publication.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This study describes efforts to characterize differences in the roles of the two related human decapping factors Dcp1a and Dcp1b by assessing mRNA decay and protein associations in knockdown and knockout cell lines. The authors conclude that these proteins are non-redundant based on the observations that loss of Dcp1a versus Dcp1b impacts the decapping complex (interactome) and the transcriptome differentially.

      Major comments

      • While the experiments appear to be well designed and executed and the data of generally high quality, the conclusions are drawn without sufficient consideration for the fact that these two proteins form a heterotrimeric complex. The authors assume that there are distinct homotrimeric complexes rather than a single complex containing both proteins. Homotrimers may have new or different functions not normally seen when both proteins are expressed. Thus, while it is acceptable to infer that the functions of these two proteins within the decapping complex are distinct, it is not clear that they act separately, or that complexes naturally exist without one or the other. A careful evaluation of the relative ratios of Dcp1a and Dcp1b overall and in decapping complexes would be informative if the authors want to make stronger statements about the roles of these two factors.

      RESPONSE: Thank you for this valuable comment. We have substantially edited the manuscript to incorporate these points. Examples include a detailed analysis of iBAQ values for the DDX6, DCP1a, and DCP1b interactomes (which now allows us to estimate the ratios of DCP1a and DCP1b in these complexes) and cellular fractionation to interrogate complex integrity (using Superose 6).
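      As an illustration of how iBAQ values can give a rough ratio of two proteins in a complex, the sketch below assumes the standard iBAQ definition (summed peptide intensity divided by the number of theoretically observable peptides). The function names and all numbers are hypothetical and are not taken from the manuscript's data or pipeline.

```python
def ibaq(summed_intensity: float, n_theoretical_peptides: int) -> float:
    """iBAQ value: summed peptide intensity normalized by the number of
    theoretically observable peptides for that protein."""
    if n_theoretical_peptides <= 0:
        raise ValueError("number of theoretical peptides must be positive")
    return summed_intensity / n_theoretical_peptides


def abundance_ratio(ibaq_a: float, ibaq_b: float) -> float:
    """Approximate molar ratio of protein A to protein B in one sample."""
    if ibaq_b <= 0:
        raise ValueError("iBAQ of the reference protein must be positive")
    return ibaq_a / ibaq_b


# Hypothetical intensities and peptide counts, for illustration only.
dcp1a = ibaq(summed_intensity=3.0e8, n_theoretical_peptides=30)
dcp1b = ibaq(summed_intensity=1.0e8, n_theoretical_peptides=25)
ratio_a_to_b = abundance_ratio(dcp1a, dcp1b)
```

      Because iBAQ normalizes for protein length and digestibility only approximately, a ratio computed this way is best read as an estimate of relative stoichiometry rather than an exact measurement.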

      • The concept of buffering is not adequately introduced, and the interpretation of the observation that RNAs with increased half-life do not show increased protein abundance (namely, that Dcp1a/b are involved in transcript buffering) is nebulous. In order to support this interpretation, the mRNA abundances (NOT protein abundances) should be assessed, and even then, there is no way to rule out indirect effects.

      RESPONSE: Thank you for this comment. In the revised version of the manuscript, we introduced the concept of transcript buffering at an earlier stage as one of the potential explanations for our findings. We were also able to use a new algorithm (grandR) to estimate half-lives and synthesis rates from our data. These new data add strength to the argument that DCP1a and DCP1b are linked to transcript buffering pathways.

      • It might be interesting to see what happens when both factors are depleted to get an idea of the overall importance of each one.

      RESPONSE: In our work we tried to emphasize the differences between the two paralogs. We believe that a double knockout or knockdown would mask the distinct impacts of the paralogs. In data not included in this study, we have shown that cells lacking both DCP1a and DCP1b are viable. We did check PARP cleavage in the CRISPR-generated cell pools of DCP1a KO, DCP1b KO, and the double KO. The western blot measuring PARP cleavage is shown in the supplemental material (Supplementary Material: Replicates).

      • The algorithms etc used for data analysis should be included at the time of publication. Version number and settings used for SMART to define protein domains, and webgestalt should be indicated

      RESPONSE: We apologize for this oversight. Version number and settings used for the web tools (SMART, WebGestalt) are now included. The analysis pipeline for half-life and synthesis rate estimation, as well as all the files and the code needed to generate the figures in the paper, are available on Zenodo (https://zenodo.org/records/10725429).

      • Statistical analysis is not provided for the IP experiments, the number of replicates performed is not indicated and quantification of KD efficiency are not provided.

      RESPONSE: The number of replicates performed in each experiment is now clearly indicated and quantifications of knockdown efficiency are provided (Supplemental Figure 3A and 3B, Figure 3A, Figure 3B).

      • The possibility that the IP Antibody interferes with protein-protein interactions is not mentioned.

      RESPONSE: Thank you for this comment. The revised manuscript includes a discussion of the antibody epitope location and the potential for impact on protein-protein interactions.

      Minor comments • P4 - "This translational repression of mRNA associated with decapping can be reversed, providing another point at which gene expression can be regulated (21)" - implies that decapping can be reversed or that decapped RNAs are translated. I don't think this is technically true.

      RESPONSE: There have been several studies that document the reversal of decapping. These findings are summarized in the following reviews.

      Schoenberg, D. R., & Maquat, L. E. (2009). Re-capping the message. Trends in biochemical sciences, 34(9), 435-442.

      Trotman, J. B., & Schoenberg, D. R. (2019). A recap of RNA recapping. Wiley Interdisciplinary Reviews: RNA, 10(1), e1504.

      • P11 - how common is it for higher eukaryotes to have 2 DCP genes?

      RESPONSE: Metazoans have 2 DCP1 genes.

      • Fig S1 - says "mammalian tissues" in the text but the data is all human. The statement that "expression analyses revealed that DCP1a and DCP1b have concordant rather than reciprocal expression patterns across different mammalian tissues (Supplemental Figure 1)" is a bit misleading as no evidence for correlation or anti-correlation is provided. Also, co-expression is not strong support for the idea that these genes have non-redundant functions. Both genes are just expressed in all tissues - there's no evidence provided that they are concordantly expressed. In bone marrow it may be worth noting that one is high and the other low - i.e. reciprocal.

      RESPONSE: We appreciate this comment. We have corrected the interpretation of the aforementioned dataset. We have also incorporated a more detailed discussion in the text of the paper. As the Reviewer pointed out, there is a subset of tissues where their expression appears to be reciprocal.

      • Fig 1A - it is not clear what the different colors mean. Does Sc DCP1 have 1 larger EVH or 2 distinct ones? Are the low complexity regions in Sc DCP2 the SLiMs?

      RESPONSE: Thank you for this comment. We have corrected this ambiguity to reflect that Sc DCP1 has one EVH1 domain that is interconnected by a flexible hinge. The low-complexity regions typically contain short linear motifs (SLiMs); however, not all low-complexity regions have been verified to contain them. In the figure, only low-complexity regions are shown. The text of the paper refers only to verified SLiMs.

      • P11 - why were HCT116 cells selected?

      RESPONSE: HCT116 cells are an easily transfectable human cell line and have been widely used in biochemical and molecular studies, including studies of mRNA decapping (see references below). Since decapping is impacted by viral proteins, we avoided the use of other commonly used cell models such as HEK293T or HeLa.

      https://pubmed.ncbi.nlm.nih.gov/?term=decapping+hct116&sort=date&size=200

      • Fig 1B - what are the asterisks by the RNA names? Might be worth noting that over-expression of DCP1b reduced IP of DCP1a. There's no quantification and no indication of the number of times this experiment was repeated. Data from replicates and quantification of the knockdown efficiency in each replicate would be nice to see.

      RESPONSE: Thank you for this comment. Asterisks indicate that those bands were from a second gel, as DCP1a and DCP1b run at approximately the same molecular weight. We have now included a note in our figure legend to indicate this. The knockdown efficiency is provided (Figure 3 and Supplemental Figure 3). We also noted the number of replicates for each IP in Figure 1. The replicates are provided as supplementary material (Supplementary Materials: Replicates).

      • Fig 1C/1D - why are there 3 bands in the DCP1a blot? Quantification of the IP bands is necessary to say whether there is an effect or not of over-expression/KO.

      RESPONSE: The additional bands in DCP1a blots are background. When we stained the whole blot for DCP1a in cells with complete DCP1a KO (clone A3), these bands still appear (Supplementary Material: Validation of the KO clones). Quantification of the bands in the overexpression experiments is now provided.

      • Fig 3 - is it possible that differences are due to epitope positions for the antibodies used for IP?

      RESPONSE: We do not believe so. The DCP1a antibody binds around residues 300-400 of DCP1a, and the DCP1b antibody binds around Val202. The antibodies therefore do not bind the DCP1a or DCP1b low-complexity regions (which are largely responsible for interacting with the decapping complex interactome). Nor do they bind the EVH1 domains or the trimerization domain, which are needed for the interaction of DCP1a and DCP1b with DCP2 and with each other.

      • Fig 5A - the legend doesn't match the colors in the figure. It is not clear how the p[...]

      RESPONSE: Thank you for this comment. We have corrected this issue in the revised version of the paper. High-confidence proteins are those with p[...]

      RESPONSE: Thank you for this comment. We have corrected this issue in the revised version of the paper.

      • There are a few more recent studies on buffering that should be cited and more discussion of this in the introduction is necessary if conclusions are going to be drawn about buffering.

      RESPONSE: We have included a discussion of transcript buffering in the introduction.

      • The heatmaps in figure 2 are hard to interpret.

      RESPONSE: To clarify the heatmaps, we included a more detailed description in the figure legends, have enlarged the heatmaps themselves, and have added more extensive labeling.

      Reviewer #1 (Significance (Required)):

      • Strengths: The experiments appear to be done well and the datasets should be useful for the field.

      • Limitations: The results are overinterpreted - different genes are affected by knocking down one or other of these two similar proteins but this does not really tell us all that much about how the two proteins are functioning in a cell where both are expressed.

      • Audience: This study will appeal most to a specialized audience consisting of those interested in the basic mechanisms of mRNA decay. Others may find the dataset useful.

      • This study might complement and/or be informed by another recent study in bioRxiv - https://doi.org/10.1101/2023.09.04.556219

      • My field of expertise is mRNA decay - I am qualified to evaluate the findings within the context of this field. I do not have much experience of LC-MS-MS and therefore cannot evaluate the methods/analysis of this part of the study.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors provide evidence that Dcp1a and Dcp1b - two paralogous proteins of the mRNA decapping complex - may have divergent functions in a cancer cell line. In the first part, the authors show that interaction of Dcp2 with EDC4 is diminished upon depletion of Dcp1a but not affected by depletion of Dcp1b. The results have been controlled by overexpression of Dcp1b, as it may be a limiting factor (i.e. its expression level may be too low to compensate for depletion of Dcp1a; depletion of Dcp1a reduced interaction with EDC3/4, while depletion of Dcp1b led to the opposite, increased interactions). They then defined the protein interactome of DDX6 in parental and Dcp1a- or Dcp1b-depleted cells. Here, the authors again show some differential association with EDC4, which is in line with the results shown in the first part. The authors further performed SLAM-seq and identified subsets of mRNAs whose decay rates are commonly, but also differentially, affected upon depletion of Dcp1a and Dcp1b. Interestingly, it seems that Dcp1a preferentially targets mRNAs for proteins regulating lymphocyte differentiation. To further test whether changes in RNA decay rates are also reflected at the protein level, they finally performed an MS analysis with Dcp1a/b-depleted cells. However, no significant overlap with mRNAs showing altered stability could be observed, and the authors suggested that the lack of congruence reflects translational repression.

      Major comments: 1. While functional differences between Dcp1a and Dcp1b are interesting and likely true, there are overinterpretations that need correction or further evidence for support. Sentences like "DCP1a regulates RNA cap binding proteins association with the decapping complex and DCP1b controls translational initiation factors interactions (Figure 2E)" sound misleading. While differential association with proteins has been recognised in the MS data, it does not necessarily imply an active process of control/regulation. To make the claim of 'control/regulation', an inducible system or the introduction of mutants would be required.

      RESPONSE: This set of comments was particularly useful in helping us refine the presentation of our findings. We have edited our manuscript to be more specific about the limits of our data.

      2. The MS analysis is not clearly described in the text and it is unclear how the authors selected high-confidence proteins. The reader needs to consult the supplemental tables to find out what controls were used. Furthermore, the authors should show correlation plots of MS data between replicates. For instance, there seems to be limited correlation among some of the replicates (e.g. Dcp1b_ko3 sample, Fig. 2c). Is there any explanation for this variance?

      RESPONSE: We have now included a clear description of how all high-confidence proteins were selected in the Methods and Results sections. The revised manuscript also includes a more thorough description of the controls used and the number of replicates for individual experiments. The PCA plots have now been included where appropriate. The variance in this sample is likely technical.

      3. GO analysis for the proteome analysis should consider the proteome and not the genome as the background. The authors should also indicate the multiple-testing-corrected p-values (FDRs).

      RESPONSE: WebGestalt uses a reference set of IDs to recognize the input IDs, and it does not use it for the background analysis in the classical sense. We repeated a subset of our proteome analyses using the 'genome-protein coding' as background and obtained the same result as in our original analysis. All ontology analyses now include raw p-values and/or FDRs when appropriate.

      4. Fig 2E. The figure displaying GO enrichments needs a better explanation, and additional data can be added. The enrichment ratio is not explained (is this normalised?), and p-values and FDRs, as well as the number of proteins in the respective GO category, should be added.

      RESPONSE: More thorough explanations of the GO enrichments are now included. The supplemental data contains all p-values (raw and adjusted), as well as the number of proteins in each GO category. The enrichment ratio is normalized and contains information about the number of proteins that are redundant in multiple groups. GO analyses are now displayed with p-values and/or FDR values, and in this case the enrichment ratio contains information regarding the number of proteins found in our input set and the number of expected proteins in the GO group. The network analysis shows the FDR values and the number of proteins found in the groups compared.
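      To make the enrichment ratio concrete, here is a minimal sketch of a generic over-representation calculation. It assumes the common definition in which the ratio is the observed number of input proteins in a GO category divided by the number expected by chance given the reference set, with a one-sided hypergeometric p-value. The function names and counts are hypothetical, and this is not a reproduction of the authors' WebGestalt settings.

```python
from math import comb


def enrichment_ratio(n_reference: int, n_category: int,
                     n_input: int, n_overlap: int) -> float:
    """Observed overlap divided by the overlap expected by chance."""
    expected = n_input * n_category / n_reference
    return n_overlap / expected


def hypergeom_pvalue(n_reference: int, n_category: int,
                     n_input: int, n_overlap: int) -> float:
    """One-sided P(X >= n_overlap) when drawing n_input proteins at random
    from a reference set containing n_category annotated proteins."""
    total = comb(n_reference, n_input)
    upper = min(n_category, n_input)
    tail = sum(
        comb(n_category, i) * comb(n_reference - n_category, n_input - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 reference proteins, 50 in the GO category,
# 100 proteins in the input list, 10 of which fall in the category.
ratio = enrichment_ratio(1000, 50, 100, 10)   # expected overlap is 5
p_raw = hypergeom_pvalue(1000, 50, 100, 10)
```

      The choice of reference set changes the expected overlap and therefore both the ratio and the p-value, which is why the background question raised by the reviewer matters for interpretation.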

      Minor: 5. These studies were performed in a colorectal carcinoma cell line (HCT116). The authors should justify the choice of this specialised cell line. Furthermore, one wonders whether similar conclusions can be drawn with other cell lines or whether findings are specific to this cancer line.

      RESPONSE: A study currently available as a preprint on bioRxiv (https://doi.org/10.1101/2023.09.04.556219) used HEK293T cells and found results similar to ours when examining the relationships between the core decapping complex members.

      6. Fig. 1B. It is unclear what DCP1b* refers to. There are bands of different sizes that are not mentioned by the authors - are those protein isoforms, or what do they refer to? A molecular weight marker should be added to each blot. Uncropped western images and markers should be provided in the Supplement.

      RESPONSE: The asterisk indicates that these images came from a second western blot gel (DCP1a and DCP1b have a similar molecular weight and cannot be probed on the same membrane). Uncropped western blot images and markers (as available) are provided in the supplement.

      7. MS data should be submitted to a public repository, with the accession no. indicated in the manuscript.

      RESPONSE: MS data is submitted as supplementary datasets to the paper. It contains the analyzed data as well as the LC-MS/MS output. We are in the process of submitting the raw LC-MS/MS data to a public repository.

      8. Fig 3. A Venn diagram displaying the overlap of identified proteins should be added. GO analysis should be done considering the proteome as background (as mentioned above).

      RESPONSE: A Venn diagram showing the overlap among the proteins identified is now included in the revised version.

      Reviewer #2 (Significance (Required)):

      Overall, this is a large-scale integrative -omics study that suggests functional differences between Dcp1 paralogues. While it seems clear that the two paralogues have some different functions and impact, there are overinterpretations in place, and further evidence would need to be provided to substantiate the conclusions made in the paper. For instance, while the interactions of Dcp2/Ddx6 with EDC4/3 may be altered in the absence of Dcp1a/b (Fig. 1, 2), the functional implications of these changed associations remain unresolved and are not further discussed. As such, this part remains somewhat disconnected from the following experiments and compromises the flow of the study. The observed differences in decay rates for distinct, functionally related sets of mRNAs are interesting; however, it remains unclear whether those are direct or rather indirect effects. This is further obscured by the absence of any correlation with changes in protein levels, which the authors interpreted as 'transcriptional buffering'. In this regard, it is puzzling how the authors can make a statement about transcriptional buffering. While this may be an interesting aspect and concept for the discussion, there is no primary data showing such a functional impact.

      As such, the study is interesting as it claims functional differences between the DCP1a/b paralogues in a cancer cell line. Nevertheless, I am not sure how trustworthy the MS analysis and decay measurements are, as there is no further validation. It would be interesting if the authors could go a bit further and draw up some hypotheses on how the selectivity could be achieved, e.g. interaction with RNA-binding proteins that may add some specificity towards the target RNAs for differential decay. As such, the study unfortunately remains rather descriptive, without further functional insight.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Review on "Non-redundant roles for the human mRNA decapping cofactor paralogs DCP1a and DCP1b" by Steven McMahon and co-workers

      mRNA decay is a critical step in the regulation of gene expression. In eukaryotes, mRNA turnover typically begins with the removal of the poly(A) tail, followed by either removal of the 5' cap structure or exonucleolytic 3'-5' decay catalyzed by the exosome. The decapping enzyme DCP2 forms a complex with its co-activator DCP1, which enhances decapping activity. Mammals are equipped with two DCP1 paralogs, namely DCP1a and DCP1b. Metazoans' decapping complexes feature additional components, such as enhancer of decapping 4 (EDC4), which supports the interaction between DCP1 and DCP2, thereby amplifying the efficiency of decapping. This work focuses on DCP1a and DCP1b and investigates their distinct functions. Using DCP1a- and DCP1b-specific knockdowns as well as K.O. cell lines, the authors find surprising differences between the DCP1 paralogs. While DCP1a is essential for the assembly of EDC4-containing decapping complexes and interactions with mRNA cap binding proteins, DCP1b mediates interactions with the translational machinery. Furthermore, DCP1a and DCP1b target different mRNAs for degradation, indicating that they execute non-overlapping functions. The findings reported here expand our understanding of mRNA decapping in human cells, shedding light on the unique contributions of DCP1a and DCP1b to mRNA metabolism.

      The manuscript tackles an interesting subject. Historically, the emphasis has been on studying DCP1a, while DCP1b has been deemed a functionally redundant homolog of DCP1a. Therefore, it is commendable that the authors have taken on this topic and, with the help of knockout cell lines, aimed to dissect the function of DCP1a and DCP1b. Despite recognizing the significance of the subject and approach, the manuscript falls short of persuading me. Following a promising start in Figure 1 (which still has room for improvement), there is a distinct decline in overall quality, with only relatively standard analyses being conducted. However, I do not want to give the authors detailed advice on maximizing the potential of their data and presenting it convincingly. So, here are just a few key points for improvement:

      Figure 1C: Upon closer examination, a faint band is still visible at the size of DCP1a in the DCP1a knockout cells. Could this be leaky expression of DCP1a? The authors should provide an in-depth characterization of their cells (possibly as supplementary material), including identification of genomic changes (e.g. by sequencing of the locus) and Western blots with longer exposure, etc.

      RESPONSE: Thank you for this comment. The in-depth characterization of our cells is now included in the Supplementary Material. DCP1a KO cells and DCP1b KO cells indicated as single cell clones have been confirmed to have no DCP1a or DCP1b expression. In Figure 1D and Figure 3, polyclonal pool cells were used as indicated (only for DCP1a KO).

      Figure 2: It is great to see that the effects of the KOs are also visible in the DDX6 immunoprecipitation. However, I wonder if the IP clearly confirms that the KO cells indeed do not express DCP1a or DCP1b. In the heatmap in Figure 2B, it appears as if the proteins are only reduced by a log2-fold change of approximately 1.5? Additionally, Figure 2 shows a problem that persists in the subsequent figures. The visual presentation is not particularly appealing, and essential details, such as the scale of the heatmap in 2B (is it log2 fold?), are lacking.

      RESPONSE: The in-depth characterization of our cells is included in the Supplementary Materials and confirms the presence of single-cell clones where indicated. As noted above, only Figure 1D and Figure 3 used DCP1a KO pooled cells. The heatmap in Figure 2B is scaled by row using the pheatmap function in RStudio. The actual data for the heatmap comes from protein intensities from the LC-MS/MS analysis. We have improved the visual presentation in the revised manuscript.
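      For readers wondering what "scaled by row" means for the heatmap values, the following is a minimal sketch. It assumes the function referred to is R's pheatmap with row scaling, which centers each row on its mean and divides by the row's sample standard deviation. The Python code and the intensity values are hypothetical illustrations, not the authors' analysis.

```python
from statistics import mean, stdev


def scale_rows(matrix: list[list[float]]) -> list[list[float]]:
    """Row-wise z-scores: subtract each row's mean and divide by its sample
    standard deviation. Each row needs at least two values."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = stdev(row)
        if sd == 0:
            # Choice made for this sketch only: a constant row becomes zeros.
            scaled.append([0.0 for _ in row])
        else:
            scaled.append([(value - mu) / sd for value in row])
    return scaled


# Hypothetical protein intensities (rows = proteins, columns = samples).
intensities = [
    [1.0, 2.0, 3.0],
    [5.0, 5.0, 5.0],
]
z_scores = scale_rows(intensities)
```

      After this transformation the color scale reflects standard deviations from each protein's own mean across samples, not log2 fold changes, so a value near 1.5 does not imply that a protein is reduced only about threefold.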

      Figure 3: I wonder why there are no primary data shown here, only processed GO analyses. Wouldn't one expect that DCP2 interacts mainly with DCP1a, but less with DCP1b? Is this visible in the data? Moreover, such analyses are rather uninformative (as reflected in the GO terms themselves, for instance, "oxoglutarate dehydrogenase complex" doesn't provide much meaningful insight). The authors should rather try to derive functional and mechanistic insights from their data.

      RESPONSE: We have now revised this Figure to include primary data as well as the IP of DCP1a in DCP1b KO cells (single cell clones) and the IP of DCP1b in DCP1a KO cells (pooled cells). We identified EDC3 in the high-confidence protein pool. The EDC3:DCP1a interaction is enhanced in DCP1b KO cells. We also found that the EDC3:DCP1b interaction is less abundant in DCP1a KO cells. This is consistent with our data in Figures 1 and 2. DCP2 was not identified in the interactomes of either DCP1a or DCP1b. This is not unusual as DCP2 is highly flexible and the association between DCP1s with DCP2 is transient and facilitated by other proteins.

      In Fig. 4 the potential of the approach is not fully exploited. Firstly, I would advocate for omitting the GO analyses, as, in my opinion, they offer little insight. Again, crucial information is missing to assess the results. While 75 nt reads are mentioned in the methods, the sequencing depth remains unspecified. Figure 4b should be included in the supplements. Furthermore, I strongly recommend concentrating on insights into the mechanisms of DCP1a and DCP1b-containing complexes. E.g. what characteristics distinguish DCP1a and DCP1b-dependent mRNAs? Are these targets inherently unstable? Why are they degraded? Are they known decapping substrates?

      RESPONSE: Thank you for this comment. We have now revised this figure and have included information about sequencing depth and other pertinent information. We have been able to use a newly available algorithm (grandR) and were able to estimate half-lives and synthesis rates. This is a significant addition to the paper. We were also able to compare significantly impacted mRNAs (by DCP1a or DCP1b loss) to the established DCP2 target list.
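      As background for the half-life and synthesis rate estimates, the sketch below shows the textbook steady-state relationship between the fraction of newly made RNA after a labeling period and the half-life. It assumes first-order decay and constant total abundance, and it is not grandR's actual implementation; the function names and numbers are hypothetical.

```python
from math import log


def decay_rate(new_to_total_ratio: float, labeling_hours: float) -> float:
    """Degradation rate constant (per hour) from the fraction of labeled
    RNA, assuming steady state: NTR = 1 - exp(-k * t)."""
    if not 0.0 < new_to_total_ratio < 1.0:
        raise ValueError("new-to-total ratio must lie strictly between 0 and 1")
    if labeling_hours <= 0:
        raise ValueError("labeling time must be positive")
    return -log(1.0 - new_to_total_ratio) / labeling_hours


def half_life(new_to_total_ratio: float, labeling_hours: float) -> float:
    """Half-life in hours: ln(2) divided by the decay rate constant."""
    return log(2.0) / decay_rate(new_to_total_ratio, labeling_hours)


def synthesis_rate(total_abundance: float, rate_constant: float) -> float:
    """At steady state, synthesis balances decay: s = k * abundance."""
    return rate_constant * total_abundance


# Hypothetical transcript: half of its RNA is newly made after 2 h labeling.
k = decay_rate(0.5, 2.0)
t_half = half_life(0.5, 2.0)
s = synthesis_rate(total_abundance=100.0, rate_constant=k)
```

      Under these assumptions, a transcript whose half-life increases while its abundance stays constant must have a correspondingly lower synthesis rate, which is the pattern a buffering interpretation predicts.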

      In general, I suggest the authors revise the manuscript with a focus on the potential readers. Reduce Gene Ontology (GO) analyses and heatmaps, and instead, incorporate more analyses regarding the molecular processes associated with the different decapping complexes.

      RESPONSE: We removed selected GO analyses and heatmaps from the main body of the manuscript (included as Supplementary Figures instead). For our LC-MS/MS datasets, we added iBAQ analyses of the DDX6 IP, DCP1a IP, and DCP1b IP in the control conditions. Cellular fractionation studies (using Superose 6 chromatography) were also added to the paper and allow us to interrogate decapping complex composition in more detail. The revised version of the manuscript includes a new 4SU labeling experiment (pulse-chase) as well as estimation of half-lives and synthesis rates in our conditions. Also included is relevant information about DCP1b transcriptional regulation.

      Reviewer #3 (Significance (Required)):

      The manuscript in its current form could benefit from substantial revisions for it to be considered impactful for researchers in the field.