  1. Feb 2025
    1. Reviewer #1 (Public review):

      Summary:

      The authors address the role of the centromere histone core in force transduction by the kinetochore.

      Strengths:

      They use a hybrid DNA sequence that combines CDEII and CDEIII as well as Widom 601 so they can make stable histones for biophysical studies (provided by the Widom sequence) and maintain features of the centromere (CDE II and III).

      Weaknesses:

      The main results are shown in one figure (Figure 2). Indeed the Centromere core of Widom and CDE II and III contribute to strengthening the binding force for the OA-beads. The data are very nicely done and convincingly demonstrate the point. The weakness is that this is the entire paper. It is certainly of interest to investigators in kinetochore biology, but beyond that, the impact is fairly limited in scope.

    2. Reviewer #2 (Public review):

      Summary:

      This paper provides a valuable addendum to the findings described in Hamilton et al. 2020 (https://doi.org/10.7554/eLife.56582). In the earlier paper, the authors reconstituted the budding yeast centromeric nucleosome together with parts of the budding yeast kinetochore and tested which elements are required and sufficient for force transmission from microtubules to the nucleosome. Although budding yeast centromeres are defined by specific DNA sequences, this earlier paper did not use centromeric DNA but instead the generic Widom 601 DNA. The reason is that it has so far been impossible to stably reconstitute a budding yeast centromeric nucleosome using centromeric DNA.

      In this new study, the authors now report that they were able to replace part of the Widom 601 DNA with centromeric DNA from chromosome 3. This makes the assay more closely resemble the in vivo situation. Interestingly, the presence of the centromeric DNA fragment makes one type of minimal kinetochore assembly, but not the other, withstand stronger forces.

      Which kinetochore assembly turned out to be affected was somewhat unexpected, and can currently not be reconciled with structural knowledge of the budding yeast centromere/kinetochore. This highlights that, despite recent advances (e.g. Guan et al., 2021; Dendooven et al., 2023), aspects of budding yeast kinetochore architecture and function remain to be understood and that it will be important to dissect the contributions of the centromeric DNA sequence.

      Given the unexpected result, the study would become yet more informative if the authors were able to pinpoint which interactions contribute to the enhanced force resistance in the presence of centromeric DNA.

      Strength:

      The paper demonstrates that centromeric DNA can increase the attachment strength between budding yeast microtubules and centromeric nucleosomes.

      Weakness:

      How centromeric DNA exerts this effect remains unclear.

    3. Author response:

      Reviewer #1:

      Summary:

      The authors address the role of the centromere histone core in force transduction by the kinetochore.

      Strengths:

      They use a hybrid DNA sequence that combines CDEII and CDEIII as well as Widom 601 so they can make stable histones for biophysical studies (provided by the Widom sequence) and maintain features of the centromere (CDE II and III).

      Weaknesses:

      The main results are shown in one figure (Figure 2). Indeed the Centromere core of Widom and CDE II and III contribute to strengthening the binding force for the OA-beads. The data are very nicely done and convincingly demonstrate the point. The weakness is that this is the entire paper. It is certainly of interest to investigators in kinetochore biology, but beyond that, the impact is fairly limited in scope.

      This reviewer might have missed that this is a Research Advance, not an article. Research Advances are limited in scope by definition and provide a new development that builds on research reported in a prior paper. They can be of any length. Our Research Advance builds on our prior work, Hamilton et al., 2020, and provides the new result that native centromere sequences strengthen the attachment of the kinetochore to the nucleosome.

      Reviewer #2:

      Summary:

      This paper provides a valuable addendum to the findings described in Hamilton et al. 2020 (https://doi.org/10.7554/eLife.56582). In the earlier paper, the authors reconstituted the budding yeast centromeric nucleosome together with parts of the budding yeast kinetochore and tested which elements are required and sufficient for force transmission from microtubules to the nucleosome. Although budding yeast centromeres are defined by specific DNA sequences, this earlier paper did not use centromeric DNA but instead the generic Widom 601 DNA. The reason is that it has so far been impossible to stably reconstitute a budding yeast centromeric nucleosome using centromeric DNA.

      In this new study, the authors now report that they were able to replace part of the Widom 601 DNA with centromeric DNA from chromosome 3. This makes the assay more closely resemble the in vivo situation. Interestingly, the presence of the centromeric DNA fragment makes one type of minimal kinetochore assembly, but not the other, withstand stronger forces.

      We thank the reviewer for their careful and positive assessment of our work.

      Which kinetochore assembly turned out to be affected was somewhat unexpected, and can currently not be reconciled with structural knowledge of the budding yeast centromere/kinetochore. This highlights that, despite recent advances (e.g. Guan et al., 2021; Dendooven et al., 2023), aspects of budding yeast kinetochore architecture and function remain to be understood and that it will be important to dissect the contributions of the centromeric DNA sequence.

      We couldn’t agree more.

      Given the unexpected result, the study would become yet more informative if the authors were able to pinpoint which interactions contribute to the enhanced force resistance in the presence of centromeric DNA.

      Strength:

      The paper demonstrates that centromeric DNA can increase the attachment strength between budding yeast microtubules and centromeric nucleosomes.

      Weakness:

      How centromeric DNA exerts this effect remains unclear.

    1. eLife Assessment

      In this work, the authors use a Drosophila melanogaster adult ventral nerve cord injury model, extending and confirming previous observations. This important study reveals key aspects of adult neural plasticity. Taking advantage of several genetic reporter and fate tracing tools, the authors provide solid evidence for different forms of glial plasticity that are increased upon injury. The significance of the generated cell types under homeostatic conditions and in response to injury remains to be further explored and opens up new avenues of research.

    2. Reviewer #2 (Public review):

      Summary:

      Casas-Tinto et al. provide new insight into glial plasticity using a crush injury paradigm in the ventral nerve cord (VNC) of adult Drosophila. The authors find that both astrocyte-like glia (ALG) and ensheathing glia (EG) divide under homeostatic conditions in the adult VNC and identify ALG as the glial population that specifically ramps up proliferation in response to injury, whereas the number of EGs decreases following the insult. Using lineage-tracing tools, the authors interestingly observe interconversion of glial subtypes, especially of EGs into ALGs, which occurs independent of injury and is dependent on the availability of the transcription factor Prospero in EGs, adding to the plasticity observed in the system. Finally, when tracing the progeny of glia, Casas-Tinto and colleagues detect cells of neuronal identity and provide evidence that such glia-derived neurogenesis is favored following ventral nerve cord injury, which puts forward a remarkable way in which glia can respond to neuronal damage.

      Strengths:

      This study highlights a new facet of adult nervous system plasticity at the level of the ventral nerve cord, supporting the view that proliferative capacity is maintained in the mature CNS and stimulated upon injury.

      The injury paradigm is well chosen, as the organization of the neuromeres allows specific targeting of one segment, which can then be compared to the remaining intact segments, with the potential to later link observed plasticity to behavior such as locomotion.

      Numerous experiments have been carried out in 7-day old flies, showing that the observed plasticity is not due to residual developmental remodeling or a still immature VNC.

      Different techniques are used to observe proliferation in the VNC.

      By elegantly combining different methods, the authors show glial divisions including with mitotic-dependent tracing and find that the number of generated glia is refined by apoptosis later on.

      The work identifies prospero in glia as an important coordinator of glial cell fate, from development to the adult context, which draws further attention to the upstream regulatory mechanisms.

      Weaknesses:

      The authors do not relate their results on gliogenesis or neurogenesis in the adult VNC to previous findings made in the context of the injured adult brain.

      The authors speculate about the role of glial inter-conversion for tissue homeostasis or regeneration, but no supportive evidence is cited or provided. Further experiments will be required to test the function of the described glial plasticity.

      Elav+ cells originating from glia do not express markers for mature neurons at the analysed time-point. Whether they will eventually differentiate, or what type of structure they form, will have to be followed up in future studies.

      Context/Discussion

      Highlighting some differences in the reactiveness of glia in the VNC compared to the brain could reveal important differences in repair strategies in different areas of the CNS.

    3. Reviewer #3 (Public review):

      In this manuscript, Casas-Tintó et al. explore the role of glial cells in the response to a neurodegenerative injury in the adult brain. They used Drosophila melanogaster as a model organism, and found that glial cells are able to generate new neurons through the mechanism of transdifferentiation in response to injury. This paper provides a new mechanism in regeneration, and gives an understanding of the role of glial cells in the process.

      The authors have now addressed all my concerns.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      eLife Assessment

      In this work, the authors use a Drosophila adult ventral nerve cord injury model extending and confirming previous observations; this important study reveals key aspects of adult neural plasticity. Taking advantage of several genetic reporter and fate tracing tools, the authors provide solid evidence for different forms of glial plasticity, that are increased upon injury. The data on detected plasticity under physiologic conditions and especially the extent of cell divisions and cell fate changes upon injury would benefit from validation by additional markers. The experimental part would improve if strengthened and accompanied by a more comprehensive integration of results regarding glial reactivity in the adult CNS.

      Thank you very much for your thoughtful comments and constructive feedback regarding our manuscript. We appreciate all the positive remarks on the significance of our findings on neural plasticity in this Drosophila adult ventral nerve cord injury model.

      In response to your suggestion, we fully agree that the continuation of this project should address cell fate changes in detail with additional markers if available, or with an “omic” approach such as scRNAseq. Unfortunately, these further experiments are beyond the scope of this paper, which aims to describe the in vivo phenomenon of cell reprogramming and the cellular events that lead glial cells to convert into neurons or neuronal precursors.

      Additionally, we agree that the experimental part can be further improved by providing a more comprehensive integration of our results with current knowledge on glial reactivity in the adult CNS. We will revise the manuscript accordingly to include a deeper discussion of the broader implications of our findings and their alignment with existing literature.

      Thank you again for your valuable input, which will undoubtedly enhance the quality of our work. We look forward to submitting the revised manuscript for your consideration.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Casas-Tinto et al. present convincing data that injury of the adult Drosophila CNS triggers transdifferentiation of glial cells and even the generation of neurons from glial cells. This observation opens up the possibility of getting a handle on the molecular basis of neuronal and glial generation in the vertebrate CNS after traumatic injury caused by stroke or crush injury. The authors use an array of sophisticated tools to follow the development of glial cells at the injury site in very young and mature adults. The results in mature adults reveal a remarkable plasticity in the fly CNS and dispel the notion that repair after injury may be possible only in nerve cords that are still developing. The observation of so-called VC cells, which do not express the glial marker repo, could point to the generation of neurons by former glial cells.

      Conclusion:

      The authors present an interesting story which is technically sound and could form the basis for an in depth analysis of the molecular mechanism driving repair after brain injury in Drosophila and vertebrates.

      Strengths:

      The evidence for transdifferentiation of glial cells is convincing. In addition, the injury to the adult CNS shows an inherent plasticity of the mature ventral nerve cord which is unexpected.

      Weaknesses:

      Traumatic brain injury in Drosophila has been previously reported to trigger mitosis of glial cells and generation of neural stem cells in the larval CNS and the adult brain hemispheres. Therefore this report adds to but does not significantly change our current understanding. The origin and identity of VC cells are still unclear. The authors show that VC cells are not GABAergic or glutamatergic. Yet there are many other neurotransmitters and neuropeptides. It would have been nice to see a staining with another general neuronal marker such as anti-Syt1 to confirm the neuronal identity of the VC cells.

      We thank the reviewer for the constructive comments and positive feedback. We concur that previous studies have demonstrated glial cell proliferation in response to CNS injury. In contrast, our study focuses on glial transdifferentiation, which emerges as a novel phenomenon, particularly in response to injury. We found that neuropile glia lose their glial identity and express the pan-neuronal marker Elav. To investigate the functional identity of these newly observed Elav-positive cells, we employed anti-ChAT, anti-GABA and anti-GluRIIA antibodies. We also stained them with other neuronal markers such as Enabled, Gigas or Dac (not shown); however, our attempts yielded limited success. To address this, we have now included a discussion section exploring the potential identity of these cells, considering the possibility that they may represent immature neurons.

      Reviewer #2 (Public review):

      Summary:

      Casas-Tinto et al. provide new insight into glial plasticity using a crush injury paradigm in the ventral nerve cord (VNC) of adult Drosophila. The authors find that both astrocyte-like glia (ALG) and ensheathing glia (EG) divide under homeostatic conditions in the adult VNC and identify ALG as the glial population that specifically ramps up proliferation in response to injury, whereas the number of EGs decreases following the insult. Using lineage-tracing tools, the authors interestingly observe interconversion of glial subtypes, especially of EGs into ALGs, which occurs independent of injury and is dependent on the availability of the transcription factor Prospero in EGs, adding to the plasticity observed in the system. Finally, when tracing the progeny of glia, Casas-Tinto and colleagues detect cells of neuronal identity and provide evidence that such glia-derived neurogenesis is specifically favoured following ventral nerve cord injury, which puts forward a remarkable way in which glia can respond to neuronal damage.

      Strengths:

      This study highlights a new facet of adult nervous system plasticity at the level of the ventral nerve cord, supporting the view that proliferative capacity is maintained in the mature CNS and stimulated upon injury.

      The injury paradigm is well chosen, as the organization of the neuromeres allows specific targeting of one segment, which can then be compared to the remaining intact segments, with the potential to later link observed plasticity to behaviour such as locomotion.

      Numerous experiments have been carried out in 7-day old flies, showing that the observed plasticity is not due to residual developmental remodelling or a still immature VNC.

      By elegantly combining different methods, the authors show glial divisions including with mitotic-dependent tracing and find that the number of generated glia is refined by apoptosis later on.

      The work identifies prospero in glia as an important coordinator of glial cell fate, from development to the adult context, which draws further attention to the upstream regulatory mechanisms.

      We would like to thank the reviewer for his/her comments and the positive analysis of this work.

      Weaknesses:

      The authors observe consistent inter-conversion of EG to ALG glial subtypes that is further stimulated upon injury. The authors conclude that these findings have important consequences for CNS regeneration and potentially for memory and learning. However, it remains somewhat unclear how glial transformation could contribute to regeneration and functional recovery.

      This is an ongoing question in the laboratory and in the field. We know that glial cells contribute to the regenerative program in the nervous system, and that molecular signalling in glial cells is a determinant of functional recovery (Losada-Perez et al 2021). Therefore, we include this concept in the discussion, as the evidence indicates that glial cells participate in these programs. However, further investigation is required to clarify and determine the mechanisms underlying this glial contribution. To determine whether glia-to-neuron transformation contributes to functional recovery, we would need to compare the recovery of animals with new VC cells to that of animals without them. However, the molecular mechanism that produces this change of identity is still unknown, and therefore we are not able to generate injured flies with no new VC cells.

      The signal of the Fucci cell cycle reporter seems more complex to interpret based on the panels provided compared to the other methods employed by the authors to assess cell divisions.

      We agree that Fly Fucci is a genetic reporter that might be more complex to interpret than EdU staining or other markers. However, glial cell proliferation is a cornerstone of this manuscript, and we used the different tools available to confirm our results. We have revised this specific section to ensure that the text is clear and straightforward.

      Elav+ cells originating from glia do not express markers for mature neurons at the analysed time-point. If they will eventually differentiate or what type of structure is formed by them will have to be followed up in future studies.

      We fully agree with the reviewer, and we will analyze later time points to study neuronal fate and contribution to VNC function.

      Context/Discussion

      The observed forms of glial plasticity in the VNC are not sufficiently connected to, or compared with, the plasticity described in the fly brain.

      Highlighting some differences in the reactiveness of glia in the VNC compared to the brain could point to relevant differences in repair capacity in different areas of the CNS.

      Based on the assays employed, the study points to a significant amount of glial "identity" changes or interconversions under homeostatic conditions. The potential significance of this rather unexpected "baseline" plasticity in adult tissues is not explicitly pointed out and could improve the understanding of the findings.

      Some speculation on whether "interconversion" of glia is driven by the needs of the tissue could enrich the discussion.

      We would like to thank the reviewer for these suggestions. We have changed the discussion to introduce these concepts.

      Reviewer #3 (Public review):

      In this manuscript, Casas-Tintó et al. explore the role of glial cells in the response to a neurodegenerative injury in the adult brain. They used Drosophila melanogaster as a model organism, and found that glial cells are able to generate new neurons through the mechanism of transdifferentiation in response to injury. This paper provides a new mechanism in regeneration, and gives an understanding of the role of glial cells in the process.

      Comments on revisions:

      In the previous version of the manuscript, I had suggested several recommendations for the authors. Unfortunately, none of these were addressed in the author's revision.

      We are sorry for this error. We apologize but we never received these comments. We have now found them, and we have incorporated these comments in the new version of the manuscript.

      (1) Have you tried screening for other markers for the EdU+ Repo+ Pros- cells?

      We have identified these cells as glial cells (Repo+) that are not astrocyte-like glia (Pros-), but we have not further characterized their identity. Our aim was to identify these proliferating glial cells as NPG (neuropile glia), which are either astrocyte-like glia (ALG), as previous works suggest in larvae (Kato et al., 2020; Losada-Perez et al., 2016), or ensheathing glia (EG). To rule out the ALG identity, we used prospero as the best marker. The results indicate that there are ALG among the proliferating population, but in addition, we also found Pros- glial cells that were EdU positive. These cells are located at the interface between cortex and neuropile, where the neuropile glia position is described. The anti-Pros staining indicated that they were not ALG, which suggests that they are EG.

      There is no specific nuclear marker for EG cells; therefore, we used FLY_FUCCI under the control of an EG-specific promoter (R56F03-Gal4) to determine whether the other dividing cells were EG. These results indicate that EG divide, although their proliferation does not increase upon injury.

      The R56F03 Gal4 construct is described as ensheathing glia specific by previous publications, including:

      (1) Kremer M. C., Jung C., Batelli S., Rubin G. M. and Gaul U. (2017). The glia of the adult Drosophila nervous system. Glia 65, 606-638. 10.1002/glia.23115

      (2) Ren Q., Awasaki T., Wang Y.-C., Huang Y.-F. and Lee T. (2018). Lineage-guided Notch-dependent gliogenesis by Drosophila multi-potent progenitors. Development 145, dev160127. 10.1242/dev.160127

      To summarize, our results suggest that these proliferating glial cells include both ALG and EG. Our results cannot rule out that a residual fraction of these proliferating cells is neither ALG nor EG.

      (2) You mentioned that ALG are heterogenous in size and shape, does that mean that you may have different subpopulations of ALG? Would that also mean that only a portion of them responds to injury?

      Yes, as with astrocytes in vertebrates, this population is highly heterogeneous. Currently there are no molecular tools to specifically identify these subpopulations and characterize their distinct roles. However, emerging research suggests that differences in size, shape, and potentially molecular markers could correlate with functional diversity. This implies that certain subpopulations of ALG may be more specialized or primed to respond to injury, while others may play roles in homeostasis or other processes. Understanding this heterogeneity will require advanced techniques such as single-cell RNA sequencing, spatial transcriptomics, or live imaging to unravel how these subpopulations contribute to injury responses and overall tissue dynamics.

      (3) You mentioned that NP-like cells have similar nuclear shape and size to ALG and EG, while Ventral cortex cells have larger nuclei. Can you please show a quantification of the NP-like cells and Ventral cortex cells size, and show a direct comparison with ALG and EG cells to support those claims (images, quantification and analysis)?

      We added a new supplementary figure with a graph showing nuclei size differences between VC and NP-like cells, and a diagram showing VC cell localization. Images in Figure 2A-A’ and 2B-B’ show both types of cells at the same scale; additionally, NPG cells are shown in red (current expression of the specific Gal4 line). A direct comparison between EG and NP-like glia can be observed in Figure 3 as well.

      Besides size and localization, we conclude that VC and NP-like cells present different molecular markers, as VC cells are elav-positive and repo-negative whereas NP-like cells are repo-positive and elav-negative.

      (4) In Figure 2B, the repo expression is not very clear. I suggest using a different example to support the claim that NP cells are Repo+.

      We have changed the color of the anti-elav staining to facilitate visualisation.

      (5) Again, in Figure 2C, you need quantification and analysis to support the claim that you used nuclear shape and size to identify VC vs. NP like cells.

      Quantification in point 3, criteria in Figure S1

      (6) What is the identity of the newly formed neurons? Other than Elav, have you tried using other markers of neurons that are typically found in this area?

      This question is of great interest and relevance. We have made great efforts to answer this open question and, so far, our data suggest that these neurons might be in an immature state. In this latest version of the manuscript, we included the results (Figure S1) with several different markers.

      The molecular identity of these cell populations, glia and neurons, is currently under investigation.

      Minor comments:

      (1) In the abstract, EG and ALG abbreviations are not introduced properly.

      Thank you very much for noticing this missing information, we have now included it in the abstract.

      (2) Please include a representation of the NPG somata location in Figure 1A.

      We have included this information in the figure

      (3) A schematic showing the differences between ALG and EG cells would be helpful as well.

      We have included in the introduction references and reviews where other authors describe in detail the differences.

      (4) In Figure 1 E, G, H, please indicate the genotype of the fly used in the panel as well as the cell type studied.

      The complete genotype is included in the corresponding figure legend. We have added a simplified genotype in the figure for clarity.

      (5) Please show the genotype used for images in Figure 2: ALG or EG specific drivers.

      This information is included in the corresponding figure legend. We believe that it is better to keep the figure clean so we decided to keep the complete genotype, which is considerably long, only in the figure legend.

    1. eLife Assessment

      This study presents valuable findings by using Fmr1 knockout mice as a model to investigate the role of Fmr1 in sleep regulation. These mice exhibited clear evidence of sleep and circadian disturbances, including abnormal retinal innervation of the SCN, which may provide a potential mechanistic explanation for the observed behavioral deficits. Interestingly, the results suggest that a scheduled feeding approach could improve sleep and circadian rhythms while enhancing social interactions and reducing repetitive behaviors in a mouse model of Fragile X syndrome. The topic is both intriguing and highly significant; however, while the evidence supporting the authors' claims is solid, several issues hinder the manuscript's clarity and impact.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigated sleep and circadian rhythm disturbances in Fmr1 KO mice. Initially, they monitored daily home cage behaviors to assess sleep and circadian disruptions. Next, they examined the adaptability of circadian rhythms in response to photic suppression and skeleton photic periods. To explore the underlying mechanisms, they traced retino-suprachiasmatic connectivity. The authors further analyzed the social behaviors of Fmr1 KO mice and tested whether a scheduled feeding strategy could mitigate sleep, circadian, and social behavior deficits. Finally, they demonstrated that scheduled feeding corrected cytokine levels in the plasma of mutant mice.

      Strengths:

      (1) The manuscript addresses an important topic: investigating sleep deficits in an FXS mouse model and proposing a potential therapeutic strategy.

      (2) The study includes a comprehensive experimental design with multiple methodologies, which adds depth to the investigation.

      Weaknesses:

      (1) The first serious issue in the manuscript is the lack of a clear description of how they performed the experiments and the missing definitions of various parameters in the results. Given that monitoring and analyzing sleep behaviors are the key experiments of this manuscript, I use the "Immobility-Based Sleep Behavior" section of Methods as an example to elaborate:

      Incomplete or Incorrect Description of Tracking Threshold: The phrase "tracked the (40 sec or greater as previously described" is incomplete and does not clarify what is being tracked. This appears to be an error in writing or editing.

      Unclear Relationship Between Threshold and EEG Validation: The threshold "40 sec or greater" is mentioned without context or explanation of what it represents (e.g., sleep bout duration, inactivity, or another parameter). The reference to Fisher et al. (2016) and "99% correlation with EEG-defined sleep" seems misaligned with the paragraph's content.

      Confusing Definition of Sleep Bout: The definition of a sleep bout is unclear. Sleep bouts should logically be based on periods of inactivity, not activity. The sentence suggesting sleep is measured by "activity staying above the threshold" is confusing. The phrase "3 counts of sleep per minute for longer than one minute" requires clarification.

      Unclear Data Selection for Analysis: The phrase "2 days with the best recording quality" is vague and does not specify how "best" was determined or why only two days out of five were analyzed.

      Awkward Grammar and Structure: Phrases like "Acquiring data were exported in 1-min bins" are grammatically awkward. "Acquiring" should be "Acquired." Some sentences are overly long and lack clarity, making the text harder to follow.

      In addition to this section, the authors should review all paragraphs in the Methods section to improve readability.

      (2) Although the manuscript has a relatively long Methods section, some essential information is missing. For instance, the definition of sleep bout, as described above, is unclear. Additional missing information includes:

      Figure 2: "Rhythmic strength (%)" and "Cycle-to-cycle variability (min)."

      Figure 3: "Activity suppression."

      Figure 4: "Rhythmic power (V%)" (is this different from rhythmic strength (%)?) and "Subjective day activity (%)."

      Figure 5: Clear labeling of the SCN's anatomical features and an explanation for quantifying only the ventral part instead of the entire SCN. Alternatively, the authors should consider quantifying the whole SCN.

      Figure 6: Inconsistencies in terms like "Sleep frag. (bout #)" and "Sleep bouts (#)." Consistent terminology throughout the manuscript is essential.

      (3) Figure 1A shows higher mouse activity during ZT13-16. It is unclear why the authors scheduled feeding during ZT15-21, as this seems to disturb the rhythm. Consistent with this, the body weights of WT and Fmr1 KO mice decreased after scheduled feeding. The authors should explain the rationale for this design clearly.

      (4) The interpretation of social behavior results in Figure 6 is questionable. The authors claim that Fmr1 KO mice cannot remember the first stranger in a three-chamber test, writing, "The reduced time in exploring and staying in the novel-mouse chamber suggested that the Fmr1 KO mutants were not able to distinguish the second novel mouse from the first now-familiar mouse." However, an alternative explanation is that Fmr1 KO mice do remember the first stranger but prefer to interact with it due to autistic-like tendencies. Data in Table 5 show that Fmr1 KO mice spent more time interacting with the first stranger in the 3-chamber social recognition test, which supports this possibility. Similarly, in the five-trial social test, Fmr1 KO mice's preference for familiar mice might explain the reduced interaction with the second stranger.

      In Figure 6C (five-trial social test results), only the fifth trial results are shown. Data for trials 1-4 should be provided and compared with the fifth trial, so that the behavior of the mice across the 5-trial test is shown completely. In addition, the total interaction times for trials 1-4 (154 ± 15.3 for WT and 150 ± 20.9 for Fmr1 KO) suggest normal sociability in Fmr1 KO mice, which differs from the three-chamber results. Thus, individual data for trials 1-4 are required to draw reliable conclusions.

      In Table 6 and Figure 6G-6J, the authors claim that "Sleep duration (Figures 6G, H) and fragmentation (Figures 6I, J) exhibited a moderate-strong correlation with both social recognition and grooming." However, Figure 6I shows a p-value of 0.077, which is not significant. Moreover, Table 6 shows no significant correlation between SNPI of the three-chamber social test and any sleep parameters. These data do not support the authors' conclusions.

      (5) Figure 7 demonstrates the effect of scheduled feeding on circadian activity and sleep behaviors, representing another critical set of results in the manuscript. Notably, the WT+ALF and Fmr1 KO+ALF groups in Figure 7 underwent the same handling as the WT and Fmr1 KO groups in Figures 1 and 2, as no special treatments were applied to these mice. However, the daily patterns observed in Figures 7A, 7B, 7F, and 7G differ substantially from those shown in Figures 2B and 1A, respectively. Additionally, it is unclear why the WT+ALF and Fmr1 KO+ALF groups did not exhibit differences in Figures 7I and 7J, especially considering that Fmr1 KO mice displayed more sleep bouts but shorter bout lengths in Figures 1C and 1D.

      Furthermore, it is not specified whether the results in Figure 7 were collected after two weeks of scheduled feeding (for how many days?) or if they represent the average data from the two-week treatment period.

      The rationale behind analyzing "ZT 0-3 activity" in Figure 7D instead of the parameters shown in Figures 2C and 2D is also unclear.

      In Figure 7F, some data points appear to be incorrectly plotted. For instance, the dark blue circle at ZT13 connects to the light blue circle at ZT14 and the dark blue circle at ZT17. This is inconsistent, as the dark blue circle at ZT13 should link to the dark blue circle at ZT14. Similarly, it is perplexing that the dark blue circle at ZT16 connects to both the light blue and dark blue circles at ZT17. Such errors undermine confidence in the data. The authors need to provide a clear explanation of how these data were processed.

      Lastly, in the Figure 7 legend, Table 6 is cited; however, this appears to be incorrect. It seems the authors intended to refer to Table 7.

      (6) Similar to the issue in Figure 7F, the data for day 12 in Supplemental Figure 2 includes two yellow triangles but lacks a green triangle. It is unclear how the authors constructed this chart, and clarification is needed.

      (7) In Figure 8, a 5-trial test was used to assess the effect of scheduled feeding on social behaviors. It is essential to present the results for all trials (1 to 4). Additionally, it is unclear whether the results for familiar mice in Figure 8A correspond to trials 1, 2, 3, or 4.<br /> The legend for Figure 8 also appears to be incorrect: "The left panels show the time spent in social interactions when the second novel stranger mouse was introduced to the testing mouse in the 5-trial social interaction test. The significant differences were analyzed by two-way ANOVA followed by Holm-Sidak's multiple comparisons test with feeding treatment and genotype as factors." This description does not align with the content of the left panels. Moreover, two-way ANOVA is not the appropriate statistical analysis for Figure 8A. The authors need to provide accurate details about the analysis and revise the figure legend accordingly.

      (8) The circadian activity and sleep behaviors of Fmr1 KO mice have been reported previously, with some findings consistent with the current manuscript, while others contradict it. Although the authors acknowledge this discrepancy, it seems insufficiently thorough to simply state that the reasons for the conflicts are unknown. Did the studies use the same equipment for behavior recording? Were the same parameters used to define locomotor activity and sleep behaviors? The authors are encouraged to investigate these details further, as doing so may uncover something interesting or significant.

      (9) Some subtitles in the Results section and the figure legends do not align well with the presented data. For example, in the section titled "Reduced rhythmic strength and nocturnality in the Fmr1 KOs," it is unclear how the authors justify the claim of altered nocturnality in Fmr1 KO mice. How do the authors define changes in nocturnality? Additionally, the tense used in the subtitles and figure legends is incorrect. The authors are encouraged to carefully review all subtitles and figure legends to correct these errors and enhance readability.

    3. Reviewer #2 (Public review):

      Summary:

      In the present study, the authors, using a mouse model of Fragile X syndrome, explore the very interesting hypothesis that restricting food access over a daily schedule will improve sleep patterns and, subsequently, behavioral capacities. By restricting food access from 12h to 6h over the nocturnal period (active period for mice), they show, in these KO mice, an improvement of the sleep pattern accompanied by reduced systemic levels of inflammatory markers and improved behavior. Using a classical mouse model of neurodevelopmental disorder (NDD), these data suggest that eating patterns might improve sleep quality, reduce inflammation and improve cognitive/behavioral capacities in children with NDD.

      Strengths:

      Overall, the paper is very well-written and easy to follow. The rationale of the study is generally well-introduced. The data are globally sound. The provided data support the interpretation overall.

      Weaknesses:

      (1) The introduction part is quite long in the Abstract, leaving limited space for the data provided by the present study.

      (2) A couple of points are not totally clear for a non-expert reader:<br /> - The Fmr1/Fxr2 double KO mice are not well described.<br /> - What is the rationale for performing both LD and DD measures?

      (3) The data on cytokines and chemokines are interesting. However, the rationale for the selection of these molecules is not given. In addition, these measures have been performed in the systemic blood. Measures in the brain could be very informative.

      (4) An important question is the potential impact of fasting vs the impact of the food availability restriction. Indeed fasting has several effects on brain functioning including cognitive functions.

      (5) How do the authors envision the potential translation of the present study to human patients? How to translate the 12 to 6 hours of food access in mice to children with Fragile X syndrome?

    1. eLife Assessment

      This study presents an important discovery regarding the diversity and evolution of gall-forming microbial effectors. Supported by convincing computational structural predictions and analyses, the research provides insights into the unique mechanisms by which gall-forming microbes exert their pathogenicity in plants. This study also offers guidance that is of value for future studies on pathogen effector function and co-evolution with host plants.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript presents a comprehensive structure-guided secretome analysis of gall-forming microbes, providing valuable insights into effector diversity and evolution. The authors have employed AlphaFold2 to predict the 3D structures of the secretome from selected pathogens and conducted a thorough comparative analysis to elucidate commonalities and unique features of effectors among these phytopathogens.

      Strengths:

      The discovery of conserved motifs such as 'CCG' and 'RAYH' and their central role in maintaining the overall fold is an insightful finding. Additionally, the discovery of a nucleoside hydrolase-like fold conserved among various gall-forming microbes is interesting.

      Weaknesses:

      Important conclusions are not verified by experiments.

    3. Reviewer #2 (Public review):

      Summary:

      Soham Mukhopadhyay et al. investigated the protein folding of the secretome from gall-forming microbes using the AI-based structure modeling tool AlphaFold2. Their study analyzed six gall-forming species, including two Plasmodiophorid species and four others spanning different kingdoms, along with one non-gall-forming Plasmodiophorid species, Polymyxa betae. The authors found no effector fold specifically conserved among gall-forming pathogens, leading to the conclusion that their virulence strategies are likely achieved through diverse mechanisms. However, they identified an expansion of the Ankyrin repeat family in two gall-forming Plasmodiophorid species, with a less pronounced presence in the non-gall-forming Polymyxa betae. Additionally, the study revealed that known effectors such as CCG and AvrSen1 belong to sequence-unrelated but structurally similar (SUSS) effector clusters.

      Strengths:

      (1) The bioinformatics analyses presented in this study are robust, and the AlphaFold2-derived resources deposited in Zenodo provide valuable resources for researchers studying plant-microbe interactions. The manuscript is also logically organized and easy to follow.

      (2) The inclusion of the non-gall-forming Polymyxa betae strengthens the conclusion that no effector fold is specifically conserved in gall-forming pathogens and highlights the specific expansion of the Ankyrin repeat family in gall-forming Plasmodiophorids.

      (3) Figure 4a and 4b effectively illustrate the SUSS effector clusters, providing a clear visual representation of this finding.

      (4) Figure 1 is a well-designed, comprehensive summary of the number and functional annotations of putative secretomes in gall-forming pathogens. Notably, it reveals that more than half of the analyzed effectors lack known protein domains in some pathogens, yet some were annotated based on their predicted structures, despite the absence of domain annotations.

      Weaknesses:

      (1) The effector families discussed in this paper remain hypothetical in terms of their functional roles, which is understandable given the challenges of demonstrating their functions experimentally. However, this highlights the need for experimental validation as a next step.

      (2) Some analyses, such as those in Figure 4e, emphasize motifs derived from sequence alignments of SUSS effector clusters. Since these effectors are sequence-unrelated, sequence alignments might be unreliable. It would be more rigorous to perform structure-based alignments in addition to sequence-based ones for motif confirmation. For instance, methods described in Figure 3E of de Guillen et al. (2015, https://doi.org/10.1371/journal.ppat.1005228) or tools like Foldseek (https://search.foldseek.com/foldmason) could be useful for aligning structures of multiple sequences.

      (3) When presenting AlphaFold-generated structures, it is essential to include confidence scores such as pLDDT and PAE. For example, in Figure 1D of Derbyshire and Raffaele (2023, https://doi.org/10.1038/s41467-023-40949-9), the structural representations were colored red due to their high pLDDT scores, emphasizing their reliability.
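As a practical note on point (3), AlphaFold2 writes the per-residue pLDDT into the B-factor field of its PDB output, so a summary can be reported alongside each displayed model with very little effort. The following is a minimal sketch, assuming standard fixed-column PDB records; the function names are hypothetical, and the cutoff of 70 is the commonly used boundary for a "confident" prediction rather than a value taken from the manuscript.

```python
from statistics import mean
from typing import Dict, Iterable, Tuple


def plddt_per_residue(pdb_lines: Iterable[str]) -> Dict[int, float]:
    """Read per-residue pLDDT from the B-factor field of C-alpha atoms."""
    scores = {}
    for line in pdb_lines:
        # PDB columns 13-16 hold the atom name, 23-26 the residue number,
        # and 61-66 the B-factor (pLDDT in AlphaFold2 models).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def summarize(scores: Dict[int, float], cutoff: float = 70.0) -> Tuple[float, float]:
    """Return (mean pLDDT, fraction of residues at or above the cutoff)."""
    values = list(scores.values())
    return mean(values), sum(v >= cutoff for v in values) / len(values)
```

For a model file on disk, the lines could be supplied with `open(path)`, where `path` is a placeholder for the model's location. PAE is stored separately from the coordinates and is not covered by this sketch.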

    4. Author response:

      We appreciate the constructive feedback provided by the reviewers and the editorial board. We are delighted by the positive reception of our work and the thoughtful insights shared.

      Regarding the validation of our predicted interactions, we are currently conducting yeast two-hybrid (Y2H) assays using a commercially available Arabidopsis thaliana cDNA library to screen for interacting partners of the ANK putative effector PBTT_00818 from Plasmodiophora brassicae. Following this initial screening, we will validate positive interactions through targeted 1-to-1 Y2H assays. In particular, we aim to confirm the AlphaFold Multimer-predicted interaction between PBTT_00818 and MPK3, a key immunity-related kinase in Arabidopsis.

      We are grateful for the reviewers’ thoughtful suggestions regarding clustering visualization, sequence vs. structure-based motif alignments, and structural confidence assessments. We will carefully incorporate these improvements in our planned revisions.

      Once again, we thank the editors and reviewers for their rigorous and constructive assessment. We look forward to implementing these refinements and submitting an updated version that further enhances the impact of our study.

    1. eLife Assessment

      This important study reports a detailed computational analysis of the CFTR ion channel's permeation mechanism, advancing our understanding of its structure-function relationship. The conclusions are based on extensive molecular dynamics simulations and thorough analysis, but the use of an approximate chloride ion model, known to underestimate key ion-protein interactions, leaves them incomplete without experimental or alternative computational validation. The work will be of interest to biophysicists working on CFTR and cystic fibrosis.

    2. Reviewer #1 (Public review):

      Summary:

      The goal of this study was to overcome the apparent difficulty in constructing structural models of the open state of the CFTR chloride channel. While several CFTR structural models at near-atomic resolution have been published under a variety of conditions, none of them have demonstrated a pore open across the full dimension of the plasma membrane. Instead, these have routinely been referred to as "near-open" models. In the present study, the authors extended the findings of a prior paper from their group, which investigated a series of brief MD simulations, a small number of which exhibited permeation events in which chloride ions passed through the pore. This study included massively repeated simulations initiated from these Cl-permeable conformations. Extensive analysis of the data identified a novel penta-helical structure that comprises the channel pore. This comprehensive study attempted to explain several features of conducting CFTR channels, including single-channel conductance, selectivity, and the mechanisms linking the ATP-induced dimerization of the cytosolic nucleotide-binding domains (NBDs) to the opening of the channel pore (a.k.a. "pore-gating").

      Strengths:

      The major strength of this study is its comprehensive nature. The approaches applied are cutting-edge and beyond, and are used to explain many different aspects of channel function in CFTR. The strength of evidence is very strong. The paper is extremely well-written, and the arguments are well-supported.

      Weaknesses:

      The major weakness is that none of the novel conclusions (i.e., those arising solely from this study and not previously published) have been supported by experimental confirmation. That is typical of computational studies such as this.

    3. Reviewer #2 (Public review):

      Although recent cryo-EM structures of the CFTR ion channel were reported in a putative open state (ATP-bound, NBD-dimerized), it remains unclear whether these structures explain the conductive properties of the open channel observed in functional experiments. To investigate this, the authors conducted extensive molecular dynamics simulations at different voltages. The simulations are started from snapshots of their prior work, based on the experimental putative open state and including conditions with high negative voltage. Their analysis reveals that the cryo-EM structure represents a near-open metastable state, with most trajectories transitioning to either more closed or more open conformations, leading to the identification of a potential new open state. Permeation rate analysis shows that, unlike the other states, the proposed open state exhibits functional conductive properties of the open channel, although a strong inward rectification, inconsistent with experimental data, is also noted. Further structural analysis and simulations of ATP-unbound closed states offer additional mechanistic insights.

      Overall, this work tackles key questions about CFTR: What is the true open conductive state? Does the ATP-bound cryo-EM structure reflect an actual open state? What is the ion permeation mechanism, and what structural changes occur during the closed-to-open transition? Which residues are critical, particularly those linked to diseases like CF? The study, based on a comprehensive set of all-atom molecular dynamics simulations, including a range of physiologically relevant voltages, provides important insights in this regard. It identifies key structural states, permeation pathways, critical residues, and conductance properties that can be directly compared to functional data. Notably, the analysis identifies a new open state of the channel, which systematic analysis convincingly demonstrates to be a conductive conformation, in line with experimental data at negative voltages. The authors carefully address some of the limitations of their results, exploring and discussing discrepancies with functional experiments, such as inward rectification. The work is also very well written, with a clear and logical presentation of key findings.

      The main weakness of this study is that the simulation data rely on the conventional CHARMM36 force field for Cl− ions, which has been shown to significantly underestimate the interaction between Cl− and proteins (J. Chem. Theory Comput. 2021, 17, 6240-6261). For example, the conventional CHARMM36 force field destabilizes the Cl-binding site in CLC-ec1. The latter ion unbinds irreversibly during microseconds-long simulations which is at odds with the experimental binding affinity.

      This imbalance in Cl−/protein/water interactions could significantly impact the CFTR simulations, potentially altering state populations and Cl− permeability. Notably, recent work by Levring and Chen (Proc Natl Acad Sci U S A. 2024) identifies a likely Cl− binding site in the bottleneck region of the channel, which contradicts the simulation results showing low occupancy Cl− ions in this region (Fig. 1B and Fig. 6A). This discrepancy may be due to the underestimation of Cl−/protein interactions. Indeed, Orabi et al. have proposed corrections that specifically tune these interactions, including those with aromatic residues, in line with the binding site geometry suggested by Levring and Chen. This imbalance in interactions may also lead to an underestimation of the conductance in the experimental near-open state.<br /> Balanced Cl−/protein interactions could also influence voltage/current relationships, potentially affecting the degree of inward rectification. For example, higher Cl− occupancy in the bottleneck region may stabilize the down state of R334, along with other measured interactions, thereby increasing conductance as the authors have shown.

      The experimental evidence reported and discussed by the authors in support of the proposed open state is largely qualitative. For instance, in Figure 4 Supplement 2 there is a significant overlap in the distances and SASA distributions of open and near-open states for the reported residues (are those residues water accessible in the simulations?).

      Given the known limitations of the standard CHARMM36 Cl− force field and in the absence of robust experimental validation of the proposed open state, I recommend validating at least part of the results using an independent set of simulations (not started from the previous ones) with an updated Cl− force field. It would be especially important to reassess whether the experimental near-open state is truly metastable and less probable than the new open state, and confirm that the near-open state exhibits negligible conductance.

      A minor point worth discussing is whether the observed inward rectification may be influenced by hysteresis or incomplete equilibration, as many simulations were started from prior trajectories at large negative voltages and may not have fully relaxed. For instance, it is not uncommon for small structural changes in the backbone and sidechains to take several microseconds (Shaw et al., Science, 2010). That said, discrepancies in current-voltage relationships are not unexpected due to challenges in simulation sampling and force field accuracy (J Gen Physiol 2013 May;141(5):619-32), as the authors stated.

      Another minor point to address is the preparation of the simulation setup for the ATP-free structure of the protein. It would be helpful to specify whether any particular controls or steps were taken, given that the structure is based on a relatively low resolution (3.87 Å) model.

    4. Reviewer #3 (Public review):

      Background:

      Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) is a chloride channel whose dysfunction underlies cystic fibrosis, a life-limiting condition caused by thick, sticky mucus buildup in the lungs and other organs. Despite multiple high-resolution structures of CFTR, these snapshots have all captured the channel in a non-conducting or "closed" conformation - even when the protein was prepared under conditions that should favor channel opening. This discrepancy has posed a key challenge: how can a channel be experimentally observed as closed while physiological tests demonstrate it conducts chloride ions?

      Key Findings:

      (1) Stable Open Conformation

      Through repeated molecular dynamics (MD) simulations of human CFTR in lipid bilayers, researchers observed a reproducible, stable open state. Unlike previous transient openings seen in single-run or short simulations, this conformation remains consistently permeable over extended timescales.

      (2) Penta-Helical Arrangement

      The authors highlight a "penta-helical" pore-lining arrangement in which five transmembrane helices symmetrically organize to create a clear ion-conduction pathway. This novel configuration resolves the previously puzzling hydrophobic bottleneck found in cryo-EM structures.

      (3) Conductance Close to Experimental Values

      By analyzing chloride ion flow under near-physiological voltages, they calculate a channel conductance aligning well with electrophysiological measurements. This alignment provides strong support that the observed structure is functionally relevant.

      (4) Roles of Key Residues

      Several positively charged (cationic) residues in the pore appear crucial for guiding and stabilizing chloride ions. Simultaneously, small kinks in certain helices may act as structural "hinges," allowing or blocking chloride passage.
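The conductance comparison in finding (3) rests on simple arithmetic: the net number of chloride crossings in a trajectory gives a current, and dividing by the applied voltage gives a conductance. The sketch below shows that conversion under stated assumptions. The function name and the example numbers are hypothetical and are not taken from the manuscript; only the elementary charge is a fixed constant.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge carried by one Cl- ion (magnitude)


def conductance_pS(net_permeations: int,
                   sim_time_ns: float,
                   voltage_mV: float) -> float:
    """Estimate conductance (pS) from net ion crossings in one trajectory.

    current = N * e / t, conductance = current / voltage.
    Only magnitudes are used; the sign convention is left to the caller.
    """
    current_A = net_permeations * ELEMENTARY_CHARGE_C / (sim_time_ns * 1e-9)
    conductance_S = current_A / (voltage_mV * 1e-3)
    return conductance_S * 1e12


# Hypothetical example: 10 net crossings in 1 microsecond at 500 mV.
print(round(conductance_pS(10, 1000.0, 500.0), 2))
```

Because the estimate scales inversely with the applied voltage, a value obtained at a very high voltage cannot be assumed to hold at physiological voltages unless the current-voltage relationship is linear.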

      How to Interpret These Results:

      (1) Bridging a Major Gap: The study tackles the mismatch between static "closed" CFTR structures and their known open-channel function. Successfully capturing a stable open state in MD simulations is a significant step toward reconciling what cryo-EM data shows versus what physiological experiments have long told us.

      (2) Strength in Multiple Replicas: Running many simulation repeats (rather than relying on a single trajectory) lends credibility. Only if a phenomenon is reproducible across multiple runs can it be considered robust.

      (3) Consistency with Mutational Data: Observing that known functional hotspots (e.g., specific charged residues) play a key role in the new pore model further validates these findings.

      Important Caveats and Limitations:

      (1) Simulation Timescales vs. Biology<br /> Even extended MD (on the microsecond scale) is still much faster, simpler, and more controlled than real cellular processes.

      (2) Physiological existence of the penta-helical pore<br /> Although the simulations and results are highly compelling, several factors leave open the possibility of a physiological open conformation differing from the observed penta-helical pore. These factors include ATP hydrolysis, interactions with physiological binding partners, the native membrane environment, and regions not modeled in the CFTR structures, such as the R domain. Most importantly, the transmembrane voltage is very high (500 mV).

      Bottom Line:

      This work delivers a long-awaited, near-physiological view of CFTR's open conformation. It provides a foundational structure against which future experimental and computational studies can be compared. By demonstrating reliable chloride conduction and matching established biophysical data, these simulations bring us closer to understanding - and potentially targeting - CFTR's gating mechanism in health and disease. Readers should applaud the breakthroughs while recognizing that further exploration (including more complex in vitro and in vivo experiments) will still be necessary to capture the full dynamism of CFTR in the living cell environment.

    5. Reviewer #4 (Public review):

      Summary:

      The structural mechanism of anion permeation through the open CFTR pore has remained unresolved and is subject to ongoing debate. That is because even in CFTR structures obtained under conditions that normally maximally activate the channel (phosphorylation + ATP + non-hydrolytic mutations + potentiator drugs) a bottleneck region in the pore, too narrow to allow passage of hydrated chloride ions, is observed.

      The present study uses molecular dynamics (MD) simulations initiated from such "quasi-open" states to address local conformational dynamics of the pore. The authors conclude that the quasi-open structure stably relaxes to a fully open conformation on the sub-microsecond time scale. They provide a detailed analysis of this fully open structure and of the mechanism of chloride permeation. They conclude that two major exit pathways (a central and a peripheral) exist for chloride ions, and that the ions remain near-fully hydrated throughout the pore: chloride-protein interactions displace only 1-2 waters from the first solvation shell. Furthermore, the simulations provide some hints for conformational changes involved in gating.

      Strengths:

      The findings are interpreted in the context of the large body of published functional studies on CFTR permeation properties, and caveats are adequately discussed.

      Weaknesses:

      The conclusions on gating would benefit from further discussions. In particular, a fair comparison of the timescale at which channel gating happens, and that of the MD simulations would strengthen the manuscript.

    1. eLife Assessment

      Rennert et al. developed a valuable thermodynamic framework to study the force response of branched actin networks from the crucial and unexplored perspective of energetic cost. They used the fact that the entropy production rate must be positive to derive inequalities that set limits on the maximum force produced by branched actin networks, and speculate that the dissipative cost beyond that required to move the load may be necessary to maintain an adaptive steady state. This work is highly innovative, but remains incomplete until the hypotheses of the model are better justified and the conclusions about the dissipative cost of the system are better established.

    2. Reviewer #1 (Public review):

      Summary:

      This paper investigated the dynamic self-assembly of branched actin networks and the relation between the nonequilibrium features of the dynamics and the thermodynamic cost. The authors constructed a chain model to describe the self-assembly process of a branched actin network, including events like nucleation, polymerization, and capping. The forward and backward transition rates associated with these events allowed them to investigate the entropy production rate of the dynamics. They then used the fact that the entropy production rate has to be greater than zero to derive inequalities that set bounds for the maximum force produced by the branched actin network. The idea is similar to estimating the polymerization force of an actin filament via the equation F_{max} = dG/delta, where dG is the thermodynamic potential (the chemical energy associated with ATP hydrolysis) and delta is the length increment upon monomer insertion. Furthermore, they speculated that the dissipative cost beyond what is necessary to move the load may be necessary to maintain an adaptive steady state.
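The bound F_{max} = dG/delta mentioned above can be made concrete with a short calculation. This is a minimal sketch under stated assumptions, not the authors' model: the free energy per step is passed in as a free parameter in units of k_B T, the example value of 10 k_B T is hypothetical, and 2.7 nm is the commonly quoted length increment per actin monomer. The second function returns the energy left over per step at a given load, which reaches zero exactly at the bound.

```python
K_B_J_PER_K = 1.380649e-23  # Boltzmann constant


def max_force_pN(delta_g_kT: float, step_nm: float,
                 temperature_K: float = 298.0) -> float:
    """Upper bound on force: F_max = dG / delta, returned in pN."""
    delta_g_J = delta_g_kT * K_B_J_PER_K * temperature_K
    return delta_g_J / (step_nm * 1e-9) * 1e12


def dissipation_per_step_kT(force_pN: float, delta_g_kT: float, step_nm: float,
                            temperature_K: float = 298.0) -> float:
    """Free energy not converted to work per step: dG - F * delta, in k_B T."""
    work_J = force_pN * 1e-12 * step_nm * 1e-9
    return delta_g_kT - work_J / (K_B_J_PER_K * temperature_K)


f_max = max_force_pN(10.0, 2.7)  # hypothetical dG of 10 k_B T per step
print(round(f_max, 1))
print(round(dissipation_per_step_kT(0.5 * f_max, 10.0, 2.7), 3))
```

At any load below the bound, the remaining dissipation is positive, which is why a gap to the zero-dissipation curve is expected on general grounds.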

      Strengths:

      The authors developed a simple model that is capable of qualitatively reproducing some mechanical phenomena for a branched actin network. The model has captured the essential dynamic elements in the branched actin network and built connections between the maximum load and the adaptation behavior with the energetic cost. It is an interesting study that provides a new perspective to look at the mechanical response of the branched actin network.

      Weaknesses:

      The text needs to be improved, particularly in the model introduction part. It is unclear to me what happens to the state when the reverse reaction in Figure 2 occurs.

      Furthermore, what the authors have done is similar to estimating the polymerization force of actin filaments, but in a more complicated scenario. Their conclusion that "dissipative cost in the system beyond what is necessary to move the load may be necessary to maintain an adaptive steady state" is questionable. The branched actin network is a nonequilibrium system driven by active processes like ATP hydrolysis that convert chemical energy into mechanical work. There has to be a gap between the actual E-C_f curve and the curve obtained when the dissipation rate dot{S} = 0. If the authors want to make the claim, they have to decompose the dissipation into different parts and show that a particular part is associated with adaptation. Otherwise, the conclusion about the gap is baseless.

    3. Reviewer #2 (Public review):

      Summary:

      Rennert et al. developed a thermodynamic framework for the assembly of branched networks to calculate the entropy dissipation associated with this process. They base their model on the simplest possible experimental system consisting of four proteins: actin, Arp2/3, capping protein, and NPF. They decompose the network assembly into a linear model where the order of events (polymerization, capping, and nucleation) is recorded sequentially. Polymerization and capping are sensitive to load and affected by Brownian ratchet effects, while nucleation is not. This simplified model provides an analytical solution that describes the load sensitivity of actin networks and agrees well with experimental data for a given set of transition rates.

      Strengths:

      (1) These thermodynamic approaches are original and fundamental to our understanding of these non-equilibrium systems.

      (2) The fact that the model fits experimental data is encouraging.

      Weaknesses:

      (1) The possibility of describing branched actin assembly as a Markov process is not well justified.

      (2) The choice of parameters controlling the system is open to question. Some parameters are probably completely negligible, while other ignored effects are potentially significant.

      (3) The main conclusion of the manuscript, linked to the existence of a dissipation gap, is quite expected. The manuscript would have been more valuable if the authors had been able to decompose dissipation into different components in order to prove that a particular fraction is associated with adaptation.

    1. eLife Assessment

      This study presents a potentially fundamental analysis of the original color of a fossil feather from the crest of a 125-million-year-old enantiornithine bird, using sophisticated 3D microscopic and numerical methods to conclude that the feather was iridescent and brightly colored, possibly indicating that this was a male bird that used its crest in sexual displays. At present, the strength of evidence supporting the authors' conclusions is considered incomplete based on methodological incompleteness and questions about taphonomy.

    2. Reviewer #1 (Public review):

      Summary:

      Li et al. describe a novel form of melanosome-based iridescence in the crest of an Early Cretaceous enantiornithine avialan bird from the Jehol Group.

      Strengths:

      Novel set of methods applied to the study of fossil melanosomes.

      Weaknesses:

      (1) Firstly, several studies have argued that these structures are in fact not a crest, but rather the result of compression. Otherwise, it would seem that a large number of Jehol birds have crests that extend not only along the head but also along the neck and hindlimb. It is more parsimonious to interpret this as compression, as has been demonstrated using actuopaleontology (Foth 2011).

      (2) The primitive morphology of the feathers, with their long and possibly non-interlocking barbs, also calls into question the ability of such feathers to be erected without geologic compression.

      (3) The feather is not in situ, and therefore there is no way to demonstrate unequivocally that it is indeed from the head (it could just as easily be a neck feather).

      (4) Melanosome density may be taphonomic; in fact, in an important paper that is notably not cited here (Pan et al. 2019), the authors note dense melanosome packing and attribute it to taphonomy. This paper describes densely packed (taphonomic) melanosomes in non-avian avialans, specifically stating, "Notably, we propose that the very dense arrangement of melanosomes in the fossil feathers (Fig. 2 B, C, and G-I, yellow arrows) does not reflect in-life distribution, but is, rather, a taphonomic response to postmortem or postburial compression". If this paper were taken into account, it seems the conclusions would have to change drastically. If in this case the density is not taphonomic, this needs to be justified explicitly (although clearly these Jehol and Yanliao fossils are heavily compressed).

      (5) Color in modern birds is affected by the thickness of the outer keratin cortex, which is not preserved, but the authors note that the barbs are much thicker (10 µm) than those of extant birds. This surely would have affected color, so how can the authors be sure about the color of this feather?

      (6) The authors describe very strange shapes that are not present in extant birds: "...different from all other known feather melanosomes from both extant and extinct taxa in having some extra hooks and an oblique ellipse shape in cross and longitudinal sections of individual melanosome". Again, how can it be determined that this is not the result of taphonomic distortion?

      (7) The authors describe the melanosomes as hexagonally packed, but this does not in fact appear to be the case; the arrangement appears quasi-periodic at best, or random. Could the authors provide some figures to justify this hexagonal interpretation?

      (8) One way to address these concerns would be to sample some additional fossil feathers to see whether this is unique or rather due to taphonomy.

      (9) As an aside, why are the feet absent in the CT scan image?

    3. Reviewer #2 (Public review):

      Summary:

      The authors reconstructed the three-dimensional organization of melanosomes in fossilized feathers belonging to a spectacular specimen of a stem avialan from China. The authors then proceed to infer the original coloration and related ecological implications.

      Strengths:

      I believe the study is well executed and well explained. The methods are appropriate to support the main conclusions. I particularly appreciate how the authors went beyond the simple morphological inference and interrogated the structural implications of melanosome organization in three dimensions. I also appreciate how the authors were upfront with the reliability of their methods, results, and limitations of their study. I believe this will be a landmark study for the inference of coloration in extinct species and how to interrogate its significance in the future.

      Weaknesses:

      I have a few minor comments.

      Introduction: I would suggest the authors move the paragraph on coloration in modern birds (lines 75-97) before line 64, as this is part of the reasoning behind the study. I believe this change would improve the flow of the introduction for the general reader.

      Melanosome organization: I was surprised to find little information in the main text regarding this topic. As this is one of the major findings of the study, I would suggest the authors include more information regarding the general geometry/morphology of the single melanosomes and their arrangement in three dimensions.

      Keratin: the authors use this term quite often in the text, but how is this inference justified in the fossil? Can the authors expand on this? Previous studies suggested the presence of degradation products derived from keratin, rather than intact keratin per se.

      Ontogenetic assessment: the authors infer a sub-adult stage for the specimen, but no evidence or discussion is reported in the SI. Can the authors describe and discuss their interpretations?

      CT scan data: these data should be made freely available upon publication of the study.

    4. Reviewer #3 (Public review):

      Summary:

      The paper presents an in-depth analysis of the original colour of a fossil feather from the crest of a 125-million-year-old enantiornithine bird. From its shape and location, it would be predicted that such a feather might well have shown some striking colour and pattern. The authors apply sophisticated microscopic and numerical methods to determine that the feather was iridescent and brightly coloured and possibly indicates this was a male bird that used its crest in sexual displays.

      Strengths:

      The 3D micro-thin-sectioning techniques and the numerical analyses of light transmission are novel and state-of-the-art. The example chosen is a good one, as a crest feather is likely to have carried complex and vivid colours as a warning or for use in sexual display. The authors correctly warn that without such 3D study feather colours might be given simply as black from regular 2D analysis, and the alignment evidence for iridescence could be missed.

      Weaknesses: Trivial.

    1. eLife Assessment

      This fundamental manuscript comprehensively examines the roles of nine structural proteins in herpes simplex virus 1 (HSV-1) assembly and nuclear egress. By integrating cryo-light microscopy and soft X-ray tomography, the study presents an innovative approach to investigating viral assembly within cells. The research is thoroughly executed, yielding compelling data that explain previously unknown functions of these structural proteins. This work is of broad interest to virologists, cellular biologists, and structural biologists, offering a robust, contextually rich methodology for studying large protein complex assembly within the cellular environment, serving as an excellent starting point for high-resolution techniques.

    2. Reviewer #1 (Public review):

      Summary:

      Nahas et al. investigated the roles of herpes simplex virus 1 (HSV-1) structural proteins using correlative cryo-light microscopy and soft X-ray tomography. The authors generated nine viral variants with deletions or mutations in genes encoding structural proteins. They employed a chemical fixation-free approach to study native-like events during viral assembly, enabling observation of a wider field of view compared to cryo-ET. The study effectively combined virology, cell biology, and structural biology to investigate the roles of viral proteins in virus assembly and budding.

      Strengths:

      (1) The study presented a novel approach to studying viral assembly in cellulo.

      (2) The authors generated nine mutant viruses to investigate the roles of essential proteins in nuclear egress and cytoplasmic envelopment.

      (3) The use of correlative imaging with cryoSIM and cryoSXT allowed for the study of viral assembly in a near-native state and in 3D.

      (4) The study identified the roles of VP16, pUL16, pUL21, pUL34, and pUS3 in nuclear egress.

      (5) The authors demonstrated that deletion of VP16, pUL11, gE, pUL51, or gK inhibits cytoplasmic envelopment.

      (6) The manuscript is well-written, clearly describing findings, methods, and experimental design.

      (7) The figures and data presentation are of good quality.

      (8) The study effectively correlated light microscopy and X-ray tomography to follow virus assembly, providing a valuable approach for studying other viruses and cellular events.

      (9) The research is a valuable starting point for investigating viral assembly using more sophisticated methods like cryo-ET with FIB-milling.

      (10) The study proposes a detailed assembly mechanism and tracks the contributions of studied proteins to the assembly process.

      (11) The study includes all necessary controls and tests for the influence of fluorescent proteins.

      Weaknesses:

      Overall, the manuscript does not have any major weaknesses, just a few minor comments:

      (1) The gel quality in Figure 1 is inconsistent for different samples, with some bands not well resolved (e.g., for pUL11, GAPDH, or pUL20).

      (2) The manuscript would benefit from a summary figure or table to concisely present the findings for each protein. The manuscript covers a large body of work, and a summary figure showing the function discovered for each protein would be very helpful.

      (3) Figure 2 lacks clarity on the type of error bars used (range, standard error, or standard deviation). The manuscript says range, however; please confirm that this is what the authors meant.

      (4) The manuscript could be improved by including details on how the plasma membrane boundary was estimated from the saturated gM-mCherry signal. An additional supplementary figure with the data showing the saturation used for the boundary definition would be helpful.

      (5) Additional information or supplementary figures on the mask used to filter the YFP signal for Figure 4 would be helpful.

      (6) The figure legends could include information about which samples are used for comparison for significance calculations. As the color of the brackets is different from the compared values (dUL34), it would be great to have this information in the figure legend.

      (7) In Figure 5B, the association between YFP and mCherry signals is difficult to assess due to the abundance of mCherry signal; single-channel and combined images might improve visualization.

      (8) In Figure 6D, staining for tubulin could help identify the cytoskeleton structures involved in the observed virus arrays.

      (9) It is unclear in Figure 6D if the microtubule-associated capsids are with the gM envelope or not, as the signal from mCherry is quite weak. It could be made clearer with the split signals to assess the presence of both viral components.

      (10) The representation of voxel intensity in Figure 8 is somewhat confusing. Inverting the voxel intensity representation so that brighter values correspond to higher absorption would simplify interpretation.

      (11) The visualization in panel I of Figure 8 might benefit from a more divergent colormap to better show the variation in X-ray absorbance.

      (12) Figure 9 would be enhanced by images showing the different virus sizes measured for the comparative study, which would help assess the size differences between different assembly stages.

      Overall, this is an excellent manuscript and an enjoyable read. It would be interesting to see this approach applied to the study of other viruses, providing valuable insights before progressing to high-resolution methods.

    3. Reviewer #2 (Public review):

      Summary:

      For centuries, humans have been developing methods to see ever smaller objects, such as cells and their contents. This has included studies of viruses and their interactions with host cells during processes extending from virion structure to the complex interactions between viruses and their host cells: virion entry, virus replication and virion assembly, and release of newly constructed virions. Recent developments have enabled simultaneous application of fluorescence-based detection and intracellular localization of molecules of interest in the context of sub-micron resolution imaging of cellular structures by electron microscopy.

      The submission by Nahas et al. extends the state-of-the-art for visualization of important aspects of herpesvirus (HSV-1 in this instance) virion morphogenesis, a complex process that involves virus genome replication, and capsid assembly and filling in the nucleus, transport of the nascent nucleocapsid and some associated tegument proteins through the inner and outer nuclear membranes to the cytoplasm, orderly association of several thousand mostly viral proteins with the capsid to form the virion's tegument, envelopment of the tegumented capsid at a virus-tweaked secretory vesicle or at the plasma membrane, and release of mature virions at the plasma membrane.

      In this groundbreaking study, cells infected with HSV-1 mutants that express fluorescently tagged versions of capsid (eYFP-VP26) and envelope (gM-mCherry) proteins were visualized with 3D correlative structured illumination microscopy and X-ray tomography. The maturation and egress pathways thus illuminated were studied further in infections with fluorescently tagged viruses lacking one of nine viral proteins.

      Strengths:

      This outstanding paper meets the journal's definitions of Landmark, Fundamental, Important, Valuable, and Useful. The work is also Exceptional, Compelling, Convincing, and Solid. The work is a tour de force of classical and state-of-the-art molecular and cellular virology. Beautiful images accompanied by appropriate statistical analyses and excellent figures. The numerous complex issues addressed are explained in a clear and coordinated manner; the sum of what was learned is greater than the sum of the parts. Impacts go well beyond cytomegalovirus and the rest of the herpesviruses, to other viruses and cell biology in general.

      Weaknesses:

      I have a few suggestions for minor adjustments in the text.

    4. Reviewer #3 (Public review):

      Summary:

      Kamal L. Nahas et al. demonstrated that pUL16, pUL21, pUL34, VP16, and pUS3 are involved in the egress of capsids from the nucleus, since the ΔpUL16, ΔpUL21, ΔUL34, ΔVP16, and ΔUS3 HSV-1 mutants show attenuated nuclear egress, as determined by measuring the nuclear:cytoplasmic ratio of capsids for dfParental and the mutants. They then showed that gM-mCherry+ endomembrane association and capsid clustering were different in pUL11, pUL51, gE, gK, and VP16 mutants. Furthermore, the 3D view of cytoplasmic budding events suggests an envelopment mechanism in which capsid budding into spherical/ellipsoidal vesicles drives the envelopment.

      Strengths:

      The authors employed both structured illumination microscopy and cellular ultrastructure analysis to examine the same infected cells, using cryo-soft-X-ray tomography to capture images. This combination, used here for the first time, enabled the authors to obtain holistic data on a biological process such as viral assembly. Using this approach, the researchers studied various stages of HSV-1 assembly. For this, they constructed a dual-fluorescently labelled recombinant virus, consisting of eYFP-tagged capsids and mCherry-tagged envelopes, allowing for the independent identification of both unenveloped and enveloped particles. They then constructed nine mutants, each targeting a single viral protein known to be involved in nuclear egress and envelopment in the cytoplasm, using this dual-fluorescent virus as the parent. The experimental setting, both microscopic and virological, is robust and well-controlled. The manuscript is well-written, and the data generated are robust and consistent with previous observations made in the field.

      Weaknesses:

      It would be helpful to find out what role the targeted proteins play in nuclear egress or envelopment acquisition in a different orthoherpesvirus, like HSV-2. This would confirm the suitability of the technical approach and would also validate the proposed mechanism in at least one additional herpesvirus beyond HSV-1. So, using the current manuscript as a starting point for future studies, it would be advisable to examine the functions of the corresponding proteins in other viruses and compare them.

    1. eLife Assessment

      This study provides important insights into the regulation of type-I interferon signaling and anti-tumor immunity, demonstrating that ORMDL3 promotes RIG-I degradation to suppress immune responses. The evidence is convincing, with well-executed mechanistic experiments and in vivo validation in syngeneic tumor models. These findings have significant implications for cancer immunotherapy, highlighting ORMDL3 as a potential therapeutic target.

    2. Reviewer #2 (Public review):

      Summary:

      The authors identified ORMDL3 as a negative regulator of the RLR pathway and anti-tumor immunity. Mechanistically, ORMDL3 interacts with MAVS and further promotes RIG-I for proteasome degradation. In addition, the deubiquitinating enzyme USP10 stabilizes RIG-I and ORMDL3 disturbs this process. Moreover, in subcutaneous syngeneic tumor models in C57BL/6 mice, they showed that inhibition of ORMDL3 enhances anti-tumor efficacy by augmenting the proportion of cytotoxic CD8-positive T cells and IFN production in the tumor microenvironment (TME).

      Strengths:

      The paper has a clearly arranged structure and the English is easy to understand. It is well written. The results clearly support the conclusion.

      Comments on revisions:

      All questions have been answered.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations for the authors):

      Minor Points:

      • HEK293T cells are not typically Type 1 IFN-producing cells; it is recommended to use other immune cell lines to validate results obtained with ORMDL3 overexpression in 293T cells. The same applies to A549 alveolar basal epithelial cells.

      Thanks for the reviewer’s insightful comment. In Figure 1C, we overexpressed ORMDL3 in primary mouse BMDMs and stimulated them with poly(I:C) or poly(dG:dC); the results suggest that ORMDL3 inhibits IFN expression in primary BMDMs.

      • Clarify whether TLR3 is expressed in the cell lines used in Figure 1 and whether TLR3 is present in mouse BMDMs.

      Thanks for your suggestions. We examined whether TLR3 is expressed in HEK293T, A549 and BMDM. We designed primers for human TLR3 and murine Tlr3, and the results showed that Tlr3 is expressed in BMDM but that TLR3 is not expressed in HEK293T and A549, as shown in Author response image 1.

      Author response image 1.

      PCR amplification of human TLR3 was conducted on cDNA derived from HEK293T and A549 cells (lanes 1 and 2, respectively), and PCR amplification of murine Tlr3 was performed on cDNA from BMDM (lane 3). Human spleen cDNA (lane 4, TAKARA Human MTC™ Panel I, Cat# 636742) served as a positive control, and 18S rRNA was used as an internal control.

      primer sequences:

      human TLR3: forward TTGCCTTGTATCTACTTTTGGGG   reverse TCAACACTGTTATGTTTGTGGGT

      murine Tlr3: forward GTGAGATACAACGTAGCTGACTG   reverse TCCTGCATCCAAGATAGCAAGT

      18S (human/mouse): forward GTAACCCGTTGAACCCCATT   reverse CCATCCAATCGGTAGTAGCG

      • Specify the type of luciferase reporter assay used in Figure 1E.

      Thanks for the reviewer’s insightful comment. The Dual-Luciferase® Reporter (DLR™) Assay System efficiently measures two luciferase signals. In brief, the IFN-reporter luciferase is derived from firefly (Photinus pyralis), while the internal control luciferase is from Renilla (Renilla reniformis or sea pansy). These dual luciferases are measured sequentially from a single sample. In Figure 1E, we measured the luciferase activity of IFN (firefly) and internal control gene TK (Renilla), and their ratio is shown in Figure 1E.
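
      The normalization described above can be sketched as follows. This is a minimal illustration, assuming one firefly and one Renilla reading per well; the function names and the numbers in the usage note are hypothetical and are not data from the study.

```python
from statistics import mean


def normalized_activity(firefly: list[float], renilla: list[float]) -> list[float]:
    """Per-well reporter activity: firefly signal divided by the Renilla control."""
    if len(firefly) != len(renilla):
        raise ValueError("each well needs one firefly and one Renilla reading")
    if any(r <= 0 for r in renilla):
        raise ValueError("Renilla readings must be positive")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_change(sample_ratios: list[float], control_ratios: list[float]) -> float:
    """Mean normalized activity of a sample relative to the control condition."""
    return mean(sample_ratios) / mean(control_ratios)
```

      For example, hypothetical wells with firefly readings of 200 and 400 and Renilla readings of 100 each give ratios of 2.0 and 4.0; relative to control wells with a ratio of 1.0, the fold change is 3.0.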

      • Clarify what was knocked down in the A549 stable KD cell line and whether HSV-1 infects and replicates in A549 cells.

      We sincerely appreciate the reviewer’s concern and apologize for any ambiguous descriptions. In Figure 1H, we knocked down ORMDL3 and infected the cell with HSV-1, which shows that ORMDL3 does not affect the infection and replication of HSV-1 in A549.

      • In Figure 2E, provide the rationale for using the same tag (Flag) in overexpression experiments with different molecules such as Flag-ORMDL3 and Flag-RIG-I.

      We thank the reviewer for raising this concern. We co-expressed ORMDL3 and the innate immunity proteins with different tags and obtained the same results as before. ORMDL3-Myc overexpression promotes the degradation of Flag-RIG-I-N only, as shown in the current Figure 2E.

      • Address the low knockdown efficiency shown in Figure 2D and consider whether it is sufficient for drawing conclusions.

      Thanks for the reviewer’s concern. The ORMDL3 antibody (Abcam 107639) recognizes all ORMDL family members (ORMDL1, 2 and 3), which may explain why the knockdown efficiency of ORMDL3 is not apparent in Figure 2D. We also measured the knockdown efficiency of ORMDL3 at the mRNA level, which showed that ORMDL3 was silenced efficiently and specifically (Figure S2C).

      • Replace the Tubulin/β-Actin WB control with a more distinguishable band.

      Thanks for the suggestion. Owing to differences in gel concentration, the protein bands sometimes appear fused, but it can still be seen that the internal controls are consistent.

      • In Figures 3D/E, the expression level of the Lysine mutant of RIG-I-N is too low. Please provide an explanation or repeat the experiment to achieve comparable expression levels and update the figure accordingly.

      Thanks for the question. The expression of the lysine mutant of RIG-I-N is low. We increased the amount of plasmid used for transfection, but this did not increase its expression level. Although its abundance is low, we provide evidence that it is not degraded by ORMDL3. It has also been reported in the literature (for example: RNF122 suppresses antiviral type I interferon production by targeting RIG-I CARDs to mediate RIG-I degradation. Proc Natl Acad Sci U S A. 2016 Aug 23;113(34):9581-6; TRIM4 modulates type I interferon induction and cellular antiviral response by targeting RIG-I for K63-linked ubiquitination. J Mol Cell Biol. 2014 Apr;6(2):154-63.) that lysine mutations can affect RIG-I stability. In addition, we speculate that the 4KR mutant (K146R, K154R, K164R, K172R) may change the conformation of RIG-I, lowering its expression.

      • Explain why there is no difference in MAVS expression levels despite binding with MAVS.

      Thanks for the question. In our experiment, ORMDL3 has no effect on MAVS expression. Our results showed that ORMDL3 interacts with MAVS and promotes the degradation of RIG-I, so only RIG-I level has a significant difference.

      • Verify if Flag-tagged ORMDL3 is present in the IP sample in Figure 3G.

      Thanks for the comment. We reloaded the samples and blotted for Flag, and we found that ORMDL3 cannot be pulled down by RIG-I. We have added the results to Figure 3G.

      • Reload the samples in Figure 4C to clearly identify the correct band for GFP-tagged ORMDL3.

      Thanks for the question. As ORMDL3 is a low-molecular-weight protein, we fused it and its fragments to GFP to increase its molecular weight. With our GFP vector, for some unknown reason, the 26 kDa band is always present. This is a technical difficulty. Although the GFP-fused protein and the GFP band are very close, they can still be distinguished as two bands.

      • Rerun the Western blot for Actin IB in Figure 4E, as the ORMDL3-GFP (1-153) full-length appears abnormal.

      Thanks for the question. The band appears abnormal because we first blotted for GFP and then blotted for actin on the same membrane. We reloaded the previous sample and blotted for actin again.

      • Clarify in which figure RIG-I ubiquitination is shown and whether ORMDL3 has E3 ubiquitin ligase activity. Explain how ORMDL3 facilitates USP10 transfer to RIG-I despite no direct interaction.

      Thank you for your question. In Figure 3B we show the ubiquitination of RIG-I; ORMDL3 does not have E3 ubiquitin ligase activity. Our results show that although ORMDL3 does not directly interact with RIG-I, it forms a complex with USP10 (Figure 5B, 5C) and disrupts USP10-induced RIG-I stabilization by decreasing the interaction between USP10 and RIG-I (Figure 6A). The detailed mechanism needs further investigation.

      • Provide quantification for Figure 5D. Explain why the bands are not degraded by RIG-I and USP10.

      Thanks for the concern. We quantified the bands and found that overexpression of USP10 increased RIG-I protein abundance. The quantitative gray values are added into the image. USP10 functions to stabilize RIG-I rather than promoting its degradation.

      • Explain the decrease in RIG-I levels in Figure 5E when USP10 levels decrease.

      Thanks for the concern. As shown in the working model (Supplementary Figure 8), USP10 is a deubiquitinase that stabilizes RIG-I by decreasing its K48-linked ubiquitination. So, in Figure 5E, we knocked down USP10 and found a decrease in RIG-I levels, which is consistent with Figure 5D.

      • Clarify whether K48 ubiquitination on RIG-I has decreased in Figure 5F, as this is not clear from the image.

      Thanks for the question. In Figure 5F it is shown that the K48 ubiquitination level of RIG-I significantly decreased (please see the density of the bands in the IP samples).

      • Address whether ORMDL3 reduces RIG-I-N degradation in Figure 5H, as the results do not clearly support this claim.

      Thanks for the concern. We quantified the bands and the results showed that ORMDL3 promotes the degradation of RIG-I-N. The quantitative gray values are added into the image.

      • Reload Flag-ORMDL3 in Figure 6C to determine whether RIG-I-N is restored in the MG132-treated samples.

      Thank you for your question. We quantified the bands and the results showed that RIG-I-N is restored in the MG132-treated samples. The quantitative gray values are added into the image.

      • Correct numerous typos and errors, especially in the Discussion section, to improve readability

      Thanks for the suggestion. We have revised the manuscript carefully to correct these errors.

      Reviewer #2 (Recommendations for the authors):

      (1) In Figure 1G and H, the number of virus-infected cells was observed using a fluorescence microscope. In addition, can the authors use other techniques to detect the impact of ORMDL3 on virus replication?

      Thanks for the question. In addition to using a fluorescence microscope, we used RT-PCR to quantify the amount of viral mRNA, and the results have been added to Figure 1G and H.

      (2) In Figure 3C, ORMDL3 overexpression promotes the degradation of RIG-I-N. ORMDL3 is one of three ORMDL proteins with similar amino acid sequences, does ORMDL1/2 also have this function?

      Thanks for the suggestion. We compared the functions of the ORMDL proteins and found that only ORMDL3 overexpression facilitated RIG-I-N degradation. The results are shown in Figure S2D.

      (3) In Figure 5A, USP10 is not the top protein in the mass spectrometry assay. Did the authors verify the interaction between ORMDL3 and other proteins (for example, CAND1)?

      Thanks for your suggestion. We verified that ORMDL3 has no interaction with CAND1 and UFL1 but only interacts with USP10, as Figure S5 shows.

      (4) A scale bar should be added to the images in Figure 1G, H and Figure 7K.

      Thanks for the suggestion. We have added the scale bars.

      (5) The annotations in Figure 4B, C and E should be aligned.

      Thanks for the suggestion. We have aligned the annotations.

      (6) Provide Statistical methods

      Thanks for the suggestion. We have provided the statistical methods in the materials and methods part.

    1. eLife Assessment

      This study addresses an important and longstanding question regarding the molecular mechanism of protein misfolding in Ig light chain (LC) amyloidosis (AL), a life-threatening condition. By combining advanced techniques, including small-angle X-ray scattering, molecular dynamics simulations, and hydrogen-deuterium exchange mass spectrometry, the authors provide convincing evidence that the "H state" distinguishes amyloidogenic from non-amyloidogenic LCs. These findings not only offer novel insights into LC structural dynamics but also hold promise for guiding therapeutic strategies in amyloidosis and will be of particular interest to structural biologists, biophysicists, and many others working on amyloid diseases.

    2. Reviewer #1 (Public review):

      The study investigates light chains (LCs) using three distinct approaches, with a focus on identifying a conformational fingerprint to differentiate amyloidogenic light chains from multiple myeloma light chains. The study's major contribution is the identification of a low-populated "H state," which the authors propose as a unique marker for AL-LCs. While this finding is promising, the review highlights several strengths and weaknesses. Strengths include the valuable contribution of identifying the H state and the use of multiple approaches, which provide a comprehensive understanding of LC structural dynamics. Weaknesses include a lack of physical insights explaining the changes.

    3. Reviewer #2 (Public review):

      Summary:

      This well-written manuscript addresses an important but recalcitrant problem - molecular mechanism of protein misfolding in Ig light chain (LC) amyloidosis (AL), a major life-threatening form of systemic human amyloidosis. The authors use expertly recorded and analyzed small-angle X-ray scattering (SAXS) data as a restraint for molecular dynamics simulations (called M&M). Six patient-based LC proteins are explored, including four AL and two non-AL. The authors report a partially populated "H-state" determined computationally, wherein the two domains in an LC molecule acquire a straight rather than bent conformation, with an extended interdomain linker; this H-state distinguishes AL from non-AL LCs. H-D exchange mass spectrometry is used to support this conclusion. This is a novel and interesting finding with potentially important translational implications.

      Strengths:

      Expertly recorded and analyzed SAXS data combined with clever M&M simulations lead to a novel and interesting conclusion, which is supported by limited H-D exchange data. Stabilization of the CL-CL interface is a good idea that may help protect a subset of AL LCs from misfolding in amyloid.

      Computational M&M evidence is convincing and is supported by SAXS data, which are used as restraints for simulations. Although Kratky plots reported in the main MS Fig. 1 show significant differences between the data and the structural model for only one AL protein, AL-55, H-state is also inferred for other AL proteins.

      Apparent limitations:

      HDX MS results show that residues 35-50 from VL-VL and VL-CL dimerization interface are less protected in AL vs. non-AL proteins, which is consistent with the H-state. However, the small number of proteins yielding useful HDX data (three AL and one non-AL) suggests that this conclusion should be treated with caution. It is unclear whether the conformational heterogeneity depicted in M&M simulations is consistent with HDX results, and whether prior HDX studies of AL and MM LCs are consistent with the conclusions that a particular domain-domain interface is weakened in AL vs. non-AL LCs. The butterfly plots in Fig. 5 could benefit from the X-axis labeling with the peptide fragments.

    4. Reviewer #3 (Public review):

      Summary:

      This study identifies conformational fingerprints of amyloidogenic light chains that set them apart from the non-amyloidogenic ones.

      Strengths:

      The research employs a comprehensive combination of structural and dynamic analysis techniques, providing evidence that conformational dynamics at the VL-CL interface and structural expansion are distinguishing features of amyloidogenic LCs.

      Weaknesses:

      The sample size is limited, which may affect the generalizability of the findings. Additionally, the study could benefit from deeper analysis of specific mutations driving this unique conformation to further strengthen therapeutic relevance.

      Furthermore, the p-value (statistical significance) of the Rg difference should be computed. Finally, the significance of mutations (SHM?) at the interface, such as A40G, should be compared with previous observations (Garofalo et al., 2021).

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This important study identifies the "H-state" as a potential conformational marker distinguishing amyloidogenic from non-amyloidogenic light chains, addressing a critical problem in protein misfolding and amyloidosis. By combining advanced techniques such as small-angle X-ray scattering, molecular dynamics simulations, and H-D exchange mass spectrometry, the authors provide convincing evidence for their novel findings. However, incomplete experimental descriptions, limitations in SAXS data interpretation, and the way HDX MS data is presented affect the strength and generalizability of the conclusions. Strengthening these aspects would enhance the impact of this work for researchers in amyloidosis and protein misfolding.

      We thank eLife editors and reviewers for their constructive feedback. The manuscript has been improved to provide a more complete description of the experiments and to strengthen the interpretation and presentation of all data. Updated Figures (Figure 2 and Figure 5) and a new Table (Table 2) in the main text provide a more complete and clearer comparison of the SAXS data with MD simulations as well as a clearer representation of the HDX MS data. Additional figures have been added in SI. The text has been extended accordingly and complete materials and methods are now included in the main text. Abstract, introduction and discussion have been revised to improve the overall readability of the manuscript.

      Public Reviews:

      Reviewer #1 (Public review):

      The study investigates light chains (LCs) using three distinct approaches, with a focus on identifying a conformational fingerprint to differentiate amyloidogenic light chains from multiple myeloma light chains. The study's major contribution is identifying a low-populated "H state," which the authors propose as a unique marker for AL-LCs. While this finding is promising, the review highlights several strengths and weaknesses. Strengths include the valuable contribution of identifying the H state and using multiple approaches, which provide a comprehensive understanding of LC structural dynamics. However, the study suffers from weaknesses, particularly in interpreting SAXS data, lack of clarity in presentation, and methodological inconsistencies. Critical concerns include high error margins between SAXS profiles and MD fits, unclear validation of oligomeric species in SAXS measurements, and insufficient quantitative cross-validation between experimental (HDX) and computational data (MD). This reviewer calls for major revisions including clearer definitions, improved methodology, and additional validation, to strengthen the conclusions.

      We thank the reviewer for the supportive comments. In the revised version of the manuscript we have focused on improving the clarity and completeness of our work. For example, we are sorry that we did not previously make clear enough that the comparison of SAXS with MD simulations is not the one shown in the main text in Figure 1 and Table 1 (this is the comparison with single structures) but the one reported in the SI (previously Figure S1 and Table S2, showing very good fits). These data have been moved to the main text, in the reworked Figure 2 and the new Table 2. We have also improved the presentation of the HDX MS data in Figure 5 and in the text, and added further analysis in the SI. Materials and methods are now entirely in the main text. We have also revised the manuscript throughout for clarity.

      Reviewer #2 (Public review):

      Summary:

      This well-written manuscript addresses an important but recalcitrant problem - the molecular mechanism of protein misfolding in Ig light chain (LC) amyloidosis (AL), a major life-threatening form of systemic human amyloidosis. The authors use expertly recorded and analyzed small-angle X-ray scattering (SAXS) data as a restraint for molecular dynamics simulations (called M&M) and to explore six patient-based LC proteins. The authors report that a highly populated "H-state" determined computationally, wherein the two domains in an LC molecule acquire a straight rather than bent conformation, is what distinguishes AL from non-AL LCs. They then use H-D exchange mass spectrometry to verify this conclusion. If confirmed, this is a novel and interesting finding with potentially important translational implications.

      We thank the reviewer for the supportive comments.

      Strengths:

      Expertly recorded and analyzed SAXS data combined with clever M&M simulations lead to a novel and interesting conclusion. Regardless of whether or not the CL-CL domain interface is destabilized in AL LCs explored in this (Figure 6) and other studies, stabilization of this interface is an excellent idea that may help protect at least a subset of AL LCs from misfolding in amyloid. This idea increases the potential impact of this interesting study.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      The HDX analysis could be strengthened.

      We have extended the analysis and improved the presentation of the HDX data. Figure 5 has been reworked, the text has been improved accordingly, and additional analyses have been reported in the SI.

      Reviewer #3 (Public review):

      Summary:

      This study identifies conformational fingerprints of amyloidogenic light chains, that set them apart from the non-amyloidogenic ones.

      We thank the reviewer for the supportive comments.

      Strengths:

      The research employs a comprehensive combination of structural and dynamic analysis techniques, providing evidence that conformational dynamics at the VL-CL interface and structural expansion are distinguished features of amyloidogenic LCs.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      The sample size is limited, which may affect the generalizability of the findings. Additionally, the study could benefit from deeper analysis of specific mutations driving this unique conformation to further strengthen therapeutic relevance.

      We agree. We tried to maximise the size of the sample, and this was the best we could do. With respect to the analysis of the mutations, we tried to discuss some of them, also in view of previous works. However, because our set covers multiple germlines rather than focusing on a single one, our ability to discuss single point mutations systematically is limited. At the same time, single point mutations have been the focus of many recent works, while our approach provides a different point of view.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      This study provides an investigation of light chains (LCs) using three distinct approaches, focusing primarily on identifying a conformational fingerprint to distinguish amyloidogenic light chains (AL-LCs) from multiple myeloma light chains (MM-LCs). The authors propose that the presence of a low-populated "H state," characterized by an extended quaternary structure and a perturbed CL-CL interface, is unique to AL-LCs. This finding is validated through hydrogen-deuterium exchange mass spectrometry (HDX-MS). The study makes a valuable contribution to understanding the structural dynamics of light chains, particularly with the identification of the H state in AL-LCs. However, significant concerns regarding the interpretation of the SAXS data, clarity in presentation, and methodological rigor must be addressed. I recommend major revisions and resubmission of the work.

      Major concerns:

      (1) A critical concern is how the authors ensure that the SAXS profiles represent only dimeric species, given the high propensity of LCs to aggregate. If higher-order aggregates or monomers were present, this would significantly impact the SAXS data and SAXS-MD integration. Some measurements are bulk SAXS, while others are SEC-SAXS, making the study questionable. The authors need to clarify how only dimeric species were measured for the SEC-SAXS analysis, and all assessments of the dimeric state should be shown in the SI. Additionally, complementary techniques such as DLS or SEC-MALS should be used to verify the oligomeric state of the samples. Without this validation, the SAXS profiles may not be reliable.

      We added SEC-MALS and SEC-SAXS data to the SI (Figures S20 and S21), as well as the SAXS curves shown as log-log plots (Figure S1), which display a flat trend at low q that excludes aggregation. SAXS is very sensitive to oligomers and aggregates, and our data do not indicate the presence of those species. When we had an indication of possible aggregation in the sample, we used SEC-SAXS.

      (2) A major problem with the paper is that the claim of the "H state," which is the novelty of the study and serves as a marker of aggregation, is derived from samples where the error between the SAXS profiles and MD fits is extremely high. This casts doubt on whether the structure is indeed resolved by MD. The main conclusion of the paper is derived from weak consistency between experiment and simulation. In AL55, the error between experiment and simulation is greater than 5; for H7, it is higher than 2.8. The residuals show significant error at mid-q values, suggesting that long-range distance correlations (20-10 Å, CL, VL positioning) are not consistent between simulation and experiment. Furthermore, the FES plots of two independent replicas show deviation in the existence of the H state. One shows a minimum in that region, while the other does not. So, how robust is this conclusion? What is the chi-squared value if each replica is used independently? A separate experimental cross-validation is necessary to claim the existence of the H state.

      We apologise for the misunderstanding underlying this reviewer comment. The poor agreement mentioned is not between the SAXS data and the MD simulations, but between the SAXS data and the individual structures. This disagreement is what led us to perform MD simulations, which are in much better agreement with the data (previously Fig. S1 and Table S2). To avoid this misunderstanding, which would indeed weaken our work, we have now moved both the figure and the table into the main text, as the updated Figure 2 and the new Table 2.

      Regarding the robustness of the sampling, we believe that Table 3 (previously Table 2) clearly shows the statistical convergence of the data; differences in the presentation of the free energy are purely interpolation issues. The chi-squares of each replicate are reported in Table 2 (previously Table S2).
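      For readers less familiar with the chi-square values discussed in this exchange, the following is a minimal sketch of how a calculated SAXS profile is commonly compared with an experimental one. It is not the authors' code: the function name, the single fitted scale factor, and the 1/N normalisation are assumptions made for illustration, and tools such as crysol may use a different convention (for example, an additional constant background term).

```python
def scaled_chi_square(i_exp, sigma, i_calc):
    """Return (scale, chi2) for a calculated SAXS profile against experiment.

    i_exp  : experimental intensities I(q)
    sigma  : experimental errors on I(q)
    i_calc : intensities calculated from a structure or ensemble, same q grid
    The scale factor is the weighted least-squares optimum (assumed convention).
    """
    if not (len(i_exp) == len(sigma) == len(i_calc)) or not i_exp:
        raise ValueError("profiles must be non-empty and share one q grid")
    # Weighted least-squares scale factor c minimising sum(((e - c*m)/s)^2).
    num = sum(e * m / s**2 for e, m, s in zip(i_exp, i_calc, sigma))
    den = sum(m * m / s**2 for m, s in zip(i_calc, sigma))
    scale = num / den
    # Mean squared, error-normalised residual (1/N normalisation assumed).
    chi2 = sum(((e - scale * m) / s) ** 2
               for e, m, s in zip(i_exp, i_calc, sigma)) / len(i_exp)
    return scale, chi2
```

      Under this convention a value near 1 means that the model reproduces the data within experimental error, which is why values well above 1 for single structures would motivate an ensemble description.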

      (3) There is insufficient discussion about SAXS computations from MD trajectories. The accuracy of these calculations is crucial to deriving the existing conclusions, and the study's reliance on the PLUMED plugin, which is known to give inaccurate results for SAXS computations, raises concerns. How the solvent is treated in the SAXS computations needs to be explained. Alternative methods like WAXSiS or Crysol should be explored to check whether the SAXS profiles derived from the MD trajectory are consistent across other SAXS computation methods for the major conformers of the proteins.

      We have now clarified that, while the SAXS calculations used to perform the Metainference MD simulations were done with PLUMED (which, to our knowledge, is as accurate as crysol), the SAXS curves used for analysis were calculated with crysol.

      (4) The HDX and MD results do not seem to correlate well, and there is a disconnect between Figure 2 (SAXS profiles) and Figure 5 (HDX structural interpretation). The authors should quantitatively assess residue-level dynamics by comparing HDX signals with MD-derived HDX signals for each protein. This would provide a cross-validation between the experimental and computational data.

      In our opinion, our SAXS, MD and HDX MS data provide a consistent picture. Our HDX-MS data are not per-residue, making a quantitative comparison out of scope. RMSF data do not necessarily need to correlate with the deuterium uptake.

      (5) MD simulations are only used to refine the structure of AlphaFold predictions, but the trajectories could help explain why these structures differ, what stabilizes the dimer, or what leads to the conformational transition of the H state. A lack of analysis regarding the physical mechanism behind these structural changes is a weakness of the study. The authors should dedicate more effort to analyzing their data and provide physical insights into why these changes are observed.

      Our aim was to identify a property that could discriminate between AL and MM LCs. We used MD simulations, not to refine structures, but to explore the conformational dynamics of LCs (starting from either X-ray structures, homology or AlphaFold models), because SAXS data suggested that conformational dynamics could discriminate between AL- and MM-LCs. Simulations allowed us to propose a hypothesis, which we tested by HDX MS. While more insight is always welcome, we believe that we have achieved our goal for now. In the discussion, we present additional analysis of the simulations to connect with previous literature. We agree that more analysis can be done and, also for this reason, all our data are publicly available.

      Minor concerns

      (6) The abstract leans heavily on describing the problem and methods but lacks a clear presentation of key results. Providing a concise summary of the main findings (e.g., the identification of the H state) would better balance the abstract.

      We agree with the reviewer and we rewrote the abstract.

      (7) In the abstract, the term "experimental structure" is used ambiguously. Since SAXS also provides an experimental structure, it is unclear what the authors are referring to. This should be clarified.

      We agree with the reviewer and we rewrote the abstract.

      (8) Abbreviations such as VL (variable domain) and CL (constant domain) are not defined, making it harder for readers unfamiliar with the field to follow. Abbreviations should be defined when first mentioned.

      We agree with the reviewer and we rewrote the abstract.

      (9) The introduction provides a good general context but fails to explicitly define the knowledge gap. Specifically, the structural and dynamic determinants of LC amyloidogenicity are not well established, and this study could be framed as addressing that gap.

      We thank the reviewer and we agree this could be better framed, we improved the introduction accordingly.

      (10) The introduction does not present the novel discovery of the H state early enough. The unique contribution of identifying this state as a marker for AL-LCs should be mentioned upfront to guide the reader through the significance of the study.

      We thank the reviewer and we have now made more explicit what we found.

      (11) The therapeutic implications of this research should be highlighted more clearly in the discussion. Examples of how these findings could be utilized in drug design or therapeutic approaches would enhance the study's impact.

      We thank the reviewer, but while we think that the H-state could be targeted for drug design, since we do not have data yet we do not want to stress this point more than what we are already doing.

      (12) There is an overwhelming use of abbreviations such as H3, H7, H18, M7, and M10 without proper introduction. This makes it difficult for readers to follow the results, and the average reader may become lost in the details. An introductory figure summarizing the sequences under study, along with a schematic of the dimeric structure defining VL and CL domains, would significantly aid comprehension.

      We agree and we tried to better introduce the systems and simplify the language without adding a figure that we think would be redundant.

      (13) In Figure 1, add labels to each SAXS curve to indicate which protein they correspond to. Also, what does online SEC-SAXS mean?

      Done

      (14) The caption of Figure 3 is unclear, particularly with abbreviations like Lb, Ls, G, and H, which are not mentioned in the captions. The authors should define these terms for clarity.

      Done

      (15) The study claims that the dominant structure of the dimer changes between different LCs. However, Figure 5 shows identical structures for all proteins, raising questions about the consistency between the SAXS and HDX data. This inconsistency is a general problem between the MD and HDX sections, where cross-communication and comparisons are not properly addressed.

      We do not claim that the dominant structure of the dimer changes between different LCs; this would also contradict the current literature. We claim a difference in a low-populated state. From this point of view, always using the same structure is consistent and should simplify the representation of the results. We agree that the manuscript may not always be easy to follow, and we thank the reviewer for helping us improve it.

      (16) The authors show I(q) vs q and residuals for each protein. The Kratky plots are not sufficient to compare the SAXS computations with the measured profile.

      Showing Kratky and residuals is a standard and complementary way to present and compare SAXS data to structures. Chi-square values are also reported. Log-log plots have been added to SI in response to previous comments.

      (17) The authors need to explain how they estimate the Rg values (from simulation or SAXS profiles). If they are using simulations, they should compute the Rg values from the simulations for comparison.

      Rg values reported in Table 1 are derived from SAXS. Rg values from simulations have been added to Table 2.
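      As an illustration of what an Rg computed from a simulation frame means, the sketch below evaluates the mass-weighted radius of gyration directly from atomic coordinates. This is an assumed, generic definition and not necessarily how the authors obtained their values (they may, for instance, have derived Rg from calculated scattering curves); the function name is hypothetical. Note that an Rg obtained from SAXS also reflects the hydration layer, so the two quantities are not expected to coincide exactly.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one conformation.

    coords : list of (x, y, z) tuples, e.g. in Angstrom
    masses : optional list of atomic masses; unit masses are assumed if omitted
    """
    if not coords:
        raise ValueError("need at least one atom")
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Centre of mass, one Cartesian component at a time.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total
           for k in range(3)]
    # Mass-weighted mean squared distance from the centre of mass.
    msd = sum(m * sum((c[k] - com[k]) ** 2 for k in range(3))
              for m, c in zip(masses, coords)) / total
    return math.sqrt(msd)
```

      For an ensemble, the value would be computed per frame and then averaged with the appropriate statistical weights.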

      (18) The evolution of the sampling is unclear. The authors need to show the initial starting conformation in each case and the most likely conformation after M&M in the SI, to demonstrate that their approach indeed caused changes in the initial predictions.

      Our approach is not structure refinement, and as such the proposed analysis would be misleading. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that, as a whole, reproduce the data. Differences (or not) between initial and selected configurations will not be particularly informative in this context.

      (19) The authors should also provide a running average of chi-squared values over time to demonstrate that the conformational ensemble converged toward the SAXS profile.

      Our simulations are not driven to improve the agreement with SAXS over time; this is not structure refinement. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that, as a whole, reproduce the data. The suggested analysis would be a misinterpretation of our simulations. The comparison with SAXS is provided in Figure 2 and Table 2, as mentioned above.

      (20) The aggregate simulation time of 120 microseconds is misleading, as each replica was only run for 2-3 microseconds. This should be clarified.

      The number reported in the text is accurate and represents the aggregated sampling. The number of replicas for each metainference simulation and their length are reported in Table 2, now moved for clarity from the SI to the main text.

      (21) It is not clear how the replicas were weighted to compute the SAXS profiles and FES. There are two independent runs in each case, and each run has about 30 replicas. How these replicas are weighted needs to be discussed in the SI.

      Done

      (22) The methods section is unevenly distributed, with detailed explanations of LC production and purification, while other key methodologies like SAXS+MD integration and HDX are not even mentioned in the main text (they are in the Supporting Information). The authors should provide a brief overview of all methodologies in the main text or move everything to the SI for consistency.

      We agree with the reviewer, all methods are now in main text. 

      Reviewer #2 (Recommendations for the authors):

      (1) Computational M&M evidence is strong (Figure 3) and is supported by SAXS (used as restraints). However, Kratky plots reported in the main MS Figure 1 show significant differences between the data and the structural model only for one protein, AL-55. It is hard for the general reader to see how these SAXS data support a clear difference between AL and non-AL proteins. If possible, please strengthen the evidence; if not, soften the conclusions.

      We thank the reviewer for the comments. The chi-square (Table 1) and the residuals (Figure 1) are a strong indication of the difference. To strengthen the evidence, and also following the comment from Reviewer 3, we calculated the p-value (<10⁻⁵) for the significance of the radius of gyration in discriminating AL and MM LCs. We agree that SAXS alone was not enough, and this is indeed what prompted us to perform MD simulations.

      (2) HDX MS results are cursory and not very convincing as presented. The butterfly plots in Figure 5 are too small to read and are unlabeled so it is unclear which protein is which.  

      Figure 5 has been reworked for readability. More data have been added in SI. 

      (3) What labeling time was selected to construct these plots and why?

      The deuterium uptake at 30 min HDX time showed the most pronounced differences between the proteins, so this time point was chosen to illustrate the key structural features in the main figure panel (Figure 5).

      How different are the results at other labeling times? Showing uptake curves (with errors) for more than just two peptides in the supplement Figure S12 might be helpful.

      We found a continuous increase in deuterium uptake as we increased the exchange time from 0.5 to 240 min, which reached saturation at 120 min. Therefore, the exchange follows the same pattern at all time points. Butterfly plots at different HDX times from 0.5 to 240 min are shown in a gradient from light blue to dark blue, which clearly shows the pattern of deuterium uptake at increasing incubation times (Figure 5). The HDX uptake kinetics of selected peptides, with corresponding error bars, are shown in Figure S12.

      How redundant are the data, i.e. how good is the peptide coverage/resolution in key regions at the domain-domain interface that the authors deem important? Mapping the maximal deuterium uptake on the structures in Figure 5 is not very helpful. Perhaps mapping the whole range of uptake using a gradient color scheme would be more informative.

      Overall coverage and redundancy for all four proteins are > 90% and > 4.0, respectively, and the average error margin in fractional uptake among all peptides is 0.04-0.05 Da, which suggests that our data are reliable (Table S3). We modified the main-panel figures to show the gradient of deuterium uptake in blue-white-red, for 0 to 30% deuterium uptake, on chain A of the dimeric LCs.

      (3) Is the conformational heterogeneity depicted in M&M simulations consistent with HDX results? The authors may want to address this by looking at the EX1/EX2 exchange kinetics for AL vs. non-AL proteins. Do AL proteins show more EX1?

      No, we don’t see any EX1 exchange kinetics in our analysis. This is compatible with the prediction of the H-state that is a native like state and not an unfolded/partially folded state. 

      (4) Perhaps the main conclusion could be softened given the small number of proteins (six), esp. since only four (3 AL and 1 non-AL) could be explored by HDX. Are other HDX MS data of AL LCs from the same Lambda6 family (e.g. PMID: 34678302) consistent with the conclusions that a particular domain-domain interface is weakened in AL vs. non-AL LCs?

      We thank the reviewer for this suggestion. A difference in HDX MS data is indeed visible between AL and MM proteins for peptide 33-47 in the suggested paper (Figures 4, S5 and S8). The difference is reduced by the mutation identified in the paper as driving the aggregation in that specific case. We now mention this in the discussion.

      (5) Please clarify if the H* state is the same for a covalent vs. non-covalent LC dimer.

      We do not know because our data are only for covalent dimers. But, interestingly, the state is very similar to what was observed for a model kappa light-chain in Weber, et al., we have better highlighted this point in the discussion.

      (6) Please try and better explain why a smaller distance between CL domains in H7 protein and a larger distance in other AL proteins both promote protein misfolding.

      We do not have elements to discuss this point in more detail.

      (7) Please comment on the Kratky plots data vs. model agreement (see comments above).

      Done.

      (8) Please find a better way to display, describe, and interpret the HD exchange MS data.

      We have generated a new main-text figure (new Figure 5) and new SI figures that we think allow the reader to better appreciate our observations. The corresponding results sections have also been improved.

      Minor points:

      (9) Is the population of the H-state with perturbed CL-CL domain interface, which was obtained in M&M simulations, sufficient to be observable by HDX MS?

      While populations alone are not enough to determine what is observable by HDX MS, a 10% population corresponds roughly to 6 kJ/mol of ΔG and is compatible with EX2 kinetics. Previous works suggested that HDX-MS data should be sensitive to subpopulations of the order of 10% (https://doi.org/10.1016/j.bpj.2020.02.005, https://doi.org/10.1021/jacs.2c06148).
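      The link between a 10% population and a free energy of roughly 6 kJ/mol can be checked with a short calculation. The sketch below assumes a simple two-state Boltzmann model and a temperature of 298.15 K; neither assumption is stated in the response, and the function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)

def delta_g_two_state(population, temperature=298.15):
    """Free energy (kJ/mol) of a minor state relative to the major state.

    Assumes two states only, so the population ratio is p / (1 - p).
    The default temperature of 298.15 K is an assumption.
    """
    if not 0.0 < population < 1.0:
        raise ValueError("population must lie strictly between 0 and 1")
    return -R_KJ * temperature * math.log(population / (1.0 - population))

dg_h_state = delta_g_two_state(0.10)
```

      With these assumptions the result is about 5.4 kJ/mol, and using -RT ln(p) alone gives a slightly larger value, so both conventions agree with the rounded figure quoted in the response.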

      (10) Typically, an excited intermediate in protein unfolding is a monomer, while here it is an LC dimer. Is this unusual?

      This is a good point, we think that intermediates have mostly been studied on monomeric proteins because these are more commonly used as model systems, but we do not feel like discussing this point.

      (11) Low deuterium uptake is consistent with a rigid structure but may also reflect buried structure and/or structure that moves on a time scale greater than the labeling time.

      We agree.

      Reviewer #3 (Recommendations for the authors):

      (1) The p-value (statistical significance) of Rg difference should be computed.

      We thank the reviewer for the suggestion. We calculated the p-value, which turned out to be significant.
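      The response does not say which statistical test was used. Purely as an illustration of how a difference in Rg between two groups of proteins can be assessed, the sketch below implements an exact two-sided permutation test on the difference of group means. The test choice, the function name, and the Rg values are all hypothetical placeholders and are not taken from the manuscript.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    n_extreme = 0
    n_total = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        n_total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_total

# Hypothetical Rg values in Angstrom, for illustration only.
rg_al = [27.1, 26.8, 27.5, 26.9]
rg_mm = [25.2, 25.6]
p_value = permutation_p_value(rg_al, rg_mm)
```

      With only six values, the smallest p-value such an exact test can return is 1/15, so a much smaller p-value implies that the comparison was made on more data points or with a parametric test.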

      (2) The significance of mutations (SHM?) at the interface, such as A40G, should be compared with previous observations (Garofalo et al., 2021).

      We thank the reviewer for the suggestion, a sentence has been added in the discussion.

    1. eLife Assessment

      The authors present three transgenic models carrying three representative exon deletions of the dystrophin gene. The findings presented are valuable to the field of muscle diseases, particularly muscular dystrophies. The evidence provided in the manuscript is convincing, with rigorous biochemical assays and state-of-the-art microscopy methods.

    2. Reviewer #2 (Public review):

      Miyazaki et al. established three distinct BMD mouse models by deleting different exon regions of the dystrophin gene, observed in human BMD. The authors demonstrated that these models exhibit pathophysiological changes, including variations in body weight, muscle force, muscle degeneration, and levels of fibrosis, alongside underlying molecular alterations such as changes in dystrophin and nNOS levels. Notably, these molecular and pathological changes progress at different rates depending on the specific exon deletions in the dystrophin gene. Additionally, the authors conducted extensive fiber typing, revealing a site-specific decline in type IIa fibers in BMD mice, which they suggest may be due to muscle degeneration and reduced capillary formation around these fibers.

      Strengths:

      The manuscript introduces three novel BMD mouse models with different dystrophin exon deletions, each demonstrating varying rates of disease progression similar to the human BMD phenotype. The authors also conducted extensive fiber typing across different muscles and regions within the muscles, effectively highlighting a site-specific decline in type IIa muscle fibers in BMD mice.

      Comments on revisions:

      The authors did an excellent job addressing all or most of the concerns I raised in my previous review and have incorporated the necessary changes into the manuscript.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this article, the authors describe mouse models of Becker muscular dystrophy. They created three transgenic models carrying three representative exon deletions: ex45-48 del., ex45-47 del., and ex45-49 del. This article is well written but needs improvement in some points.

      Strengths:

      This article is well written. The evidence supporting the authors' claims is robust, though further implementation is necessary. The experiments conducted align with the current state-of-the-art methodologies.

      Weaknesses:

      This article does not analyze atrophy in the various mouse models. Implementing this point would improve the impact of the work.

      We thank the reviewer for their constructive suggestions and comments on this work. Muscle hypertrophy is shown with growth in dystrophin-deficient skeletal muscle in mdx mice; thus, we did not pay attention to the factors associated with muscle atrophy in BMD mice. As the reviewer suggested, the examination of the association between type IIa fiber reduction and muscle atrophy is important, and the result is considered to be helpful in resolving the cause of type IIa fiber reduction in BMD mice.

      In response, we reviewed the following.

      (1) The cross-sectional areas (CSAs) of muscles. We confirmed that the CSAs in BMD and mdx mice were rather high at 3 months, in accordance with muscle hypertrophy, compared with those of WT mice. The data is presented in Fig. 4–figure supplement 1B.

      (2) The mRNA expression levels of Murf1 and atrogin-1. We confirmed that these muscle atrophy inducing factors did not differ among WT, BMD, and mdx mice. The data is presented in Fig. 4–figure supplements 1C and 1D.

      Reviewer #2 (Public review):

      Summary:

      Miyazaki et al. established three distinct BMD mouse models by deleting different exon regions of the dystrophin gene, observed in human BMD. The authors demonstrated that these models exhibit pathophysiological changes, including variations in body weight, muscle force, muscle degeneration, and levels of fibrosis, alongside underlying molecular alterations such as changes in dystrophin and nNOS levels. Notably, these molecular and pathological changes progress at different rates depending on the specific exon deletions in the dystrophin gene. Additionally, the authors conducted extensive fiber typing, revealing a site-specific decline in type IIa fibers in BMD mice, which they suggest may be due to muscle degeneration and reduced capillary formation around these fibers.

      Strengths:

      The manuscript introduces three novel BMD mouse models with different dystrophin exon deletions, each demonstrating varying rates of disease progression similar to the human BMD phenotype. The authors also conducted extensive fiber typing across different muscles and regions within the muscles, effectively highlighting a site-specific decline in type IIa muscle fibers in BMD mice.

      Weaknesses:

      The authors have inadequate experiments to support their hypothesis that the decay of type IIa muscle fibers is likely due to muscle degeneration and reduced capillary formation. Further investigation into capillary density and histopathological changes across different muscle fibers is needed, which could clarify the mechanisms behind these observations.

      We thank the reviewer for these positive comments and the very important suggestion about type IIa fiber reduction and capillary change around muscle fibers in BMD mice. From the results of the cardiotoxin-induced muscle degeneration and regeneration model, type IIa and IIx fibers showed delayed recovery compared with that of type-IIb fibers. However, this delayed recovery of type IIa and IIx could not explain the cause of the selective muscle fiber reduction limited to type IIa fibers in BMD mice. Therefore, we considered vascular dysfunction as the reason for the selective type IIa fiber reduction, and we found morphological capillary changes from a “ring pattern” to a “dot pattern” around type IIa fibers in BMD mice. However, the association between selective type IIa fiber reduction and the capillary change around muscle fibers in BMD mice remains unclear due to the lack of information about capillaries around type IIx and IIb fibers. The reviewer pointed out this insufficient evaluation of capillaries around other muscle fibers (except for type IIa fibers), and this suggestion is very helpful for explaining the association between selective type IIa fiber reduction and vascular dysfunction in BMD mice.

      In response, we reviewed the following.

      (1) The capillary formation around type IIx, IIb, and I fibers, in addition to that around type IIa fibers. We found that capillaries in contact with type IIx, IIb, and I fibers were poor in WT mice compared with those around type IIa fibers, with ‘incomplete ring-patterns’ around type IIx fibers, and ‘dot-patterns’ around type IIb and I fibers in WT mice. Morphological capillary changes around muscle fibers from WT to d45-49 and mdx mice were ‘incomplete ring-pattern’ to ‘dot-pattern’ around type IIx fibers, and ‘dot-pattern’ to ‘dot-pattern’ around type IIb and I fibers. This was in contrast to those around type IIa fibers: remarkable ‘ring-pattern’ to ‘dot-pattern’. These data are presented in Fig. 6B.

      (2) The endothelial area in contact with type IIx, IIb, and I fibers, and additionally that in contact with type IIa fibers. The endothelial area in contact with both type IIa and IIx fibers was less in d45-49 and mdx mice than in WT mice, but the reduction was larger around type IIa fibers than around type IIx fibers, reflecting the difference between the ‘ring-pattern’ around the former and the ‘incomplete ring-pattern’ around the latter in WT mice. These data are presented in Fig. 6C.

      (3) Transversely interconnected branches and capillary loops, using longitudinal muscle sections. We confirmed that there were fewer interconnected capillaries in BMD and mdx mice than in WT mice. These data are presented in Fig. 6E.

      (4) The mRNA expression levels of neuronal nitric oxide synthase (nNOS). We confirmed that nNOS protein expression levels were decreased in BMD and mdx mice in spite of adequate levels of nNOS mRNA expression. The data on nNOS mRNA expression levels is presented in Fig. 3–figure supplement 1C.

      (5) We added a sentence in the Abstract about the potential utility of BMD mice in developing vascular targeted therapies.

      Recommendation for the authors:

      Reviewer #1 (Recommendation for the authors):

      Abstract:

      Abstract: more emphasis should be placed on the pathological implications of Becker muscular dystrophy (BMD). Furthermore, the findings made in this article and the conclusions should be emphasized. Abbreviations such as DMD and MDX should be written in full and only then with the acronym.

      We appreciate the reviewers’ comments, and we apologize for the confusion over abbreviations. DMD is the name of the gene encoding dystrophin, and mdx is the name of a mouse strain lacking dystrophin.

      In the Abstract and the Figure legends we changed:

      (1) DMD to DMD;

      (2) mdx mice to mdx mice.

      Results:

      Line 95: in this line, the authors evaluated serum creatine kinase (CK) levels at 1, 3, 6 and 12 months in WT mice and mdx mice. Why did you decide to study it? This part should be described in more detail. Serum CK is one of the main markers of muscle necrosis; therefore, I would report this data alongside the description of the muscle histology and necrotic fibers.

      We thank the reviewers for the important remarks. In this study, serum creatine kinase (CK) levels were two-fold to four-fold higher in BMD mice than in WT mice, but the rate of increase was less than that in mdx mice. We consider that the lesser changes in serum CK levels in BMD mice may be due to the smaller area of muscle degeneration because of focal and uneven muscle degeneration compared with that in mdx mice, which showed diffuse muscle degeneration.

      In response, we have moved the description of serum CK levels in the Results, from the section about the establishment of BMD mice to the section about site-specific muscle degeneration in BMD mice.

      In addition, we added a description in the Discussion about the possible association between the lesser changes in serum CK levels in BMD mice and its uneven distribution of muscle degeneration.

      Line 192-202: In these lines, the authors observed a decrease in type IIa fibers after 3 months in BMD mice. I suggest also evaluating atrophy by measuring cross-sectional areas (CSA) and the expression of Murf1 and Atrogin1.

      We thank the reviewer for the point about the association between type IIa fiber reduction and muscle atrophy. We evaluated the CSAs and the mRNA expression levels of Murf1 and atrogin-1. We confirmed that the CSAs in BMD and mdx mice were rather high at 3 months, in accordance with muscle hypertrophy, compared with those of WT mice, and that Murf1 and atrogin-1 mRNA expression levels did not differ among WT, BMD, and mdx mice. These data are presented in Fig. 4–figure supplements 1B, 1C, and 1D. We added a sentence about the changes in CSA and muscle atrophy inducing factors in the Discussion.

      Methods and material

      Line 342-348: authors have described animals, but not specified sex and number of mice in each group. This part should be improved.

      We apologize for our insufficient information about the sex and number of mice in the Materials and methods.

      We added a sentence specifying the sex, number, and evaluation period of each mouse group in the section on the generation of BMD mice.

      Line 426-433: authors described qPCR. It is necessary that the authors also describe primer sequences.

      We apologize for any lack of information about the primer sequences used in qPCR analysis. Supplemental Table 1 lists the primer sequences.

      We also added a sentence about the information in the primer list in the section on RNA isolation and RT-PCR in the Materials and methods.

      Reviewer #2 (Recommendation for the authors):

      Miyazaki et al. established three distinct BMD mouse models by removing different exon regions of the dystrophin gene. The authors demonstrated that the pathophysiological and molecular changes in these models progress at varying rates. Additionally, they observed a site-specific decline in type IIa fibers in BMD mice, while the proportions of other fiber types, such as type I and type IIx, remained consistent with those in wild-type mice. They proposed that the selective decay of type IIa fibers in BMD mice could be due to two primary factors: 1) muscle degeneration and regeneration, supported by their findings in cardiotoxin-treated mouse models, and 2) reduced capillary formation around type IIa fibers. However, the authors also presented evidence that type IIx fibers exhibited delayed recovery, similar to type IIa fibers, as demonstrated in cardiotoxin-induced regeneration models. Additionally, dot-patterned capillary formations were observed around both type IIa and type IIx fibers. Despite these findings, BMD mice did not show any changes in the proportion of type IIx fibers in inner BMD muscles. The authors should consider adding further analysis to strengthen their hypothesis and to disclose any possible mechanisms that led to these discrepancies.

      If the authors hypothesize that reduced capillary density around type IIa fibers contributes to their site-specific decay in BMD mice, they should consider measuring and statistically analyzing the endothelial area around all fiber types. By plotting and comparing these measurements across different fiber types between wild-type, BMD, and mdx mice, the authors could provide more robust evidence to support their hypothesis. This approach would help clarify whether reduced capillary density is a contributing factor to the site-specific decay of type IIa fibers in BMD mice and the more diffuse, non-specific muscle changes observed in mdx mice.

      The authors reported in the first part of the manuscript that histopathological changes, including muscle degeneration in BMD mice, are predominantly restricted to the inner part of the muscles. In the second part, they noted a decline in type IIa fibers specifically in the inner muscle region. To strengthen the hypothesis that the decay of type IIa fibers in the inner muscle is linked to muscle degeneration, the authors should consider performing histopathological measurements across different fiber types within the inner muscle. Reporting the correlations between these measurements would provide more compelling evidence to support their hypothesis.

      We thank the reviewer for these important suggestions about the association between type IIa fiber reduction and capillary change around muscle fibers in BMD mice. We prepared an additional evaluation about the capillary formation (in Fig. 6B) and endothelial area (in Fig. 6C) around type IIx, IIb, and I fibers. We found that capillaries contacting around type IIx, IIb, and I fibers were poor in WT mice compared with those around type IIa fibers, and showed an ‘incomplete ring-pattern’ around type IIx fibers and a ‘dot-pattern’ around type IIb and I fibers in WT mice, in contrast with type IIa fibers, which showed remarkable ‘ring-pattern’ capillaries. Reflecting this, the changes in endothelial area around type IIx, IIb, and I fibers between WT and BMD mice were less than those around type IIa fibers. These results suggest that type IIa fibers may require numerous capillaries and maintained blood flow compared with type IIx, IIb, and I fibers, and this high requirement for blood flow might be associated with the type IIa fiber-specific decay in BMD mice.

      We added the following.

      (1) Sentences in the Results about the capillary changes around type IIx, IIb, and I fibers in WT, d45-49, and mdx mice.

      (2) Sentences in the Results about the changes in endothelial area around type IIx, IIb, and I fibers in WT, d45-49, and mdx mice.

      (3) Sentences in the Discussion about the association between the type IIa fiber-specific decay in BMD mice and the differences in capillary changes of each muscle fiber from WT to BMD mice.

      We changed a sentence in the Discussion about the delayed recovery of type IIa and IIx fibers after CTX injection, to make it clear that the recovery of type IIx fibers was slower than that of type IIa fibers after CTX injection, and that therefore the type IIa fiber-specific decay in BMD mice might not be explained by this vulnerability and delayed recovery during muscle degeneration and regeneration.

      Minor Issues:

      Line 103: The word "mice" is duplicated and should be corrected.

      We apologize that “mice” was duplicated. We have corrected it.

      Line 120: Revise for clarity: "The proportion of opaque fibers is significantly different between d45-48 mice and WT at 3 months, with an increased tendency observed only in 1-month-old mice."

      We apologize for the confusion about the proportion of opaque fibers. We revised this sentence as follows.

      “Opaque fibers, which are thought to be precursors of necrotic fibers, increased at an earlier age of 1 month in d45–49 mice compared with WT mice; in contrast, the proportion of opaque fibers differs significantly between d45–47 and WT mice at 3 months, with an increased tendency only in 1-month-old mice (Fig. 2C).”

      Line 152: Clarify the statement regarding utrophin levels, as it currently contradicts the Western blot data. The sentence reads: "The increased levels of utrophin are 8-fold higher at 1 month and 30-fold higher at 3 months." This should be verified against the data, as the band densities in the Western blots suggest otherwise.

      We apologize for the confusion about utrophin expression levels. We revised this sentence as follows.

      “By western blot analysis, the utrophin expression levels showed only an increased tendency in all BMD mice at 3 months, whereas there was a significant increase in mdx mice (8-fold at 1 month, and 30-fold at 3 months) compared to WT mice (Figs. 3C and F).”

      Line 235: Correct the sentence to accurately reflect the findings: "BMD mice showed reduced muscle weakness."

      We apologize for our incorrect wording. We have removed the word “reduced” in this sentence.

    1. eLife Assessment

      This valuable work provides solid evidence that a neuronal metallothionein, GIF/MT-3, incorporates metal-persulfide clusters. A variety of well-designed assays support the authors' hypothesis, revealing that sulfane sulfur is released from MT-3. The biological role of the persulfidated form is not yet clearly defined. There are caveats to the findings that limit the study, but the work will nevertheless prompt major follow-up work.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors reveal that GIF/MT-3 regulates zinc homeostasis depending on the cellular redox status. The manuscript is technically sound, and their data concretely suggest that the recombinant MTs, not only GIF/MT-3 but also canonical MTs such as MT-1 and MT-2, contain sulfane sulfur atoms for the Zn-binding. The scenario proposed by the authors seems to be reasonable to explain the Zn homeostasis by the cellular redox balance.

      Strengths:

      The data presented in the manuscript solidly reveal that recombinant GIF/MT-3 contains sulfane sulfur.

      Weaknesses:

      It remains unclear whether native MTs, in particular MTs induced in vivo, contain sulfane sulfur or not.

      Comments on revisions:

      Although the authors have revealed the sulfane sulfur content in native MT-3, my question, namely whether canonical MT-1 and MT-2 contain sulfane sulfur after induction, remains unanswered.

      The authors argue that the biological significance of sulfane sulfur in MTs lies in its ability to contribute to metal binding affinity, provide a sensing mechanism against oxidative stress, and aid in the regulation of the protein. Due to their biological roles, induced MT-1 and MT-2 could contain sulfane sulfur in their molecules. Thus, I expect the authors to evaluate or explain the sulfane sulfur content in induced MT-1 and MT-2.

    3. Reviewer #3 (Public review):

      Summary:

      The authors were trying to show that a novel neuronal metallothionein of poorly defined function, GIF/MT3, is actually heavily persulfidated in both the Zn-bound and apo (metal-free) forms of the molecule as purified from a heterologous (bacterial) or native host. Evidence in support of this conclusion is strong, with both spectroscopic and mass spectrometry evidence strongly consistent with this general conclusion. The authors would appear to have achieved their aims.

      Strengths:

      The analytical data in support of the author's primary conclusions are strong. The authors also provide some modeling evidence that supports the contention that MT3 (and other MTs) can readily accommodate a sulfane sulfur on each of the 20 cysteines in the Zn-bound structure, with little perturbation of the overall structure. This is not the case with Cys trisulfides, which suggests that the persulfide-metallated state is clearly positioned at lower energy relative to the immediately adjacent thiolate- or trisulfidated metal coordination complexes.

      Weaknesses:

      The biological significance of the findings is not entirely clear. On the one hand, the analytical data are solid (albeit using a protein derived from a bacterial over-expression experiment), and yes, it's true that sulfane S can protect Cys from overoxidation, but everything shown in the summary figure (Fig. 9D) can be done with Zn release from a thiol by ROS, and subsequent reduction by the Trx/TR system. In addition, it's long been known that Zn itself can protect Cys from oxidation. I view this as a minor shortcoming that will motivate follow-up studies.

      Impact:

      The impact will be high since the finding is potentially disruptive to the MT field for sure. The sulfane sulfur counting experiment (the HPE-IAM electrophile trapping experiment) may well be widely adopted by the field. Those in the metals field always knew that this was a possibility, and it will be interesting to see the extent to which metal-binding thiolates broadly incorporate sulfane sulfur into their first coordination shells.

      Comments on revisions:

      The revised manuscript is only slightly changed from the original, with the inclusion of a supplementary figure (Fig. S2) and minor changes in the text. The authors did not choose to carry out the quantitative Zn binding experiment (which I really wanted to see), but given the complexities of the experiment, I'll let it go.

      Fig. 9: the authors imply in the mechanistic "redox-switch" figure that Trx/TR cannot reduce persulfide linkages. A number of groups have shown this to be the case. I recommend modifying the figure legend or text to make this clear to the reader.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The manuscript by Dr. Shinkai and colleagues is about the posttranslational modification of a highly important protein, MT3, also known as the growth inhibitory factor. Authors postulate that MT3, or generally all MT isoforms, are sulfane sulfur binding proteins. The presence of sulfane sulfur at each Cys residue has, according to the authors, a critical impact on redox protein properties and almost does not affect zinc binding. They show a model in which 20 Cys residues with sulfane sulfur atoms can still bind seven zinc ions in the same clusters as unmodified protein. They also show that recombinant MT3 (but also MT1 and MT2) protein can react with HPE-IAM, an efficient trapping reagent of persulfides/polysulfides. This reaction performed in a new approach (high temperature and high reagent concentration) resulted in the formation of bis-S-HPE-AM product, which was quantitatively analyzed using LC-MS/MS. This analysis indicated that all Cys residues of MT proteins are modified by sulfane sulfur atoms. The authors performed a series of experiments showing that such protein can bind zinc, which dissociates in the reaction with hydrogen peroxide or SNAP. They also show that oxidized MT3 is reduced by thioredoxin. It gives a story about a new redox-dependent switching mechanism of zinc/persulfide cluster involving the formation of cystine tetrasulfide bridge.

      The whole story is hard to follow due to the lack of many essential explanations or full discussion. What needs to be clarified is the conclusion (or its lack) about MT3 modification proven by mass spectrometry. Figure 1B shows the FT-ICR-MALDI-TOF/MS spectrum of recombinant MT3. It clearly shows the presence of unmodified MT3 protein without zinc ions. Ions dissociate in acidic conditions used for MALDI sample preparation. If the protein contained all Cys residues modified, its molecular weight would be significantly higher. Then, they show the MS spectrum (low quality) of oxidized protein (Fig. 1C), in which new signals (besides reduced apo-MT3) are observed. They conclude that new signals come from protein oxidation and modification with one or two sulfur atoms. If the conclusion on Cys residue oxidation is reasonable, how this protein contains sulfur is unclear. What is the origin of the sulfur if apo-MT does not contain it? Oxidized protein was obtained by acidification of the protein, leading to zinc dissociation and subsequent neutralization and air oxidation. Authors should perform a detailed isotope analysis of the isotopic envelope to prove that sulfur is bound to the protein. They say that the +32 mass increase is not due to the appearance of two oxygen donors. They do not provide evidence. This protein is not a sulfane sulfur binding protein, or its minority is modified. Moreover, it is unacceptable to write that during MT3 oxidation are "released nine molecules of H2". How is hydrogen molecule produced? Moreover, zinc is not "released", it dissociates from protein in a chemical process.

      Thank you for your comment. According to your suggestion, we have rewritten the corresponding sentences as described below, together with the addition of a new Fig. 1D.

      First, the sentence “which corresponded to the mass of zinc-free apo-GIF/MT3 and indicated that zinc was removed during MS analysis.” was changed to “which corresponded to the mass of zinc-free apo-GIF/MT3 and indicated that zinc dissociates from protein in acidic conditions used for MALDI sample preparation.” in the introduction section. Second, we have added the following sentence “However, FT-ICR-MALDI-TOF/MS analysis failed to detect sulfur modifications in GIF/MT-3 (Fig. 1B), suggesting that sulfur modifications in the protein were dissociated during laser desorption/ionization. Therefore, we postulate that the small amount of sulfur detected in oxidized apo-GIF/MT-3 is derived from the effect of laser desorption/ionization rather than any actual modification of the minority component.” in the discussion section. Third, we have added new Fig. 1D and the corresponding citation in the introduction. Fourth, the sentence “An increase in mass of 32 Da can also result from addition of two oxygen atoms, but we attributed it to one sulfur atom for reasons described later.” was changed to “Note that an increase in mass of 32 Da can also result from addition of two oxygen atoms.”.
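
      As a minimal sketch of why a nominal +32 Da shift is ambiguous, the snippet below compares the monoisotopic mass added by one sulfur atom with that added by two oxygen atoms, and estimates the mass accuracy needed to tell them apart. The isotope masses are standard reference values rather than values from the manuscript, and the ion mass of 7000 Da is a purely illustrative assumption, not a measured value for GIF/MT-3.

```python
# Monoisotopic masses in Da (standard reference values, not taken from the manuscript).
MASS_32S = 31.97207117
MASS_16O = 15.99491462


def sulfur_shift(n_atoms: int = 1) -> float:
    """Monoisotopic mass added by n sulfur atoms."""
    return n_atoms * MASS_32S


def oxygen_shift(n_atoms: int = 2) -> float:
    """Monoisotopic mass added by n oxygen atoms."""
    return n_atoms * MASS_16O


def required_ppm(delta_da: float, ion_mass_da: float) -> float:
    """Mass accuracy (ppm) needed to resolve a difference of delta_da at ion_mass_da."""
    return delta_da / ion_mass_da * 1e6


# Hypothetical ion mass, chosen only to show the order of magnitude.
ILLUSTRATIVE_ION_MASS_DA = 7000.0

# Difference between adding two oxygens and adding one sulfur.
delta = oxygen_shift(2) - sulfur_shift(1)

print(f"+1 S : {sulfur_shift(1):.5f} Da")
print(f"+2 O : {oxygen_shift(2):.5f} Da")
print(f"difference: {delta:.5f} Da")
print(f"accuracy needed at {ILLUSTRATIVE_ION_MASS_DA:.0f} Da: "
      f"{required_ppm(delta, ILLUSTRATIVE_ION_MASS_DA):.2f} ppm")
print(f"+20 S: {sulfur_shift(20):.2f} Da")
```

      The two candidate shifts differ by only about 0.018 Da, so distinguishing them depends on the mass accuracy actually achieved and on the isotopic envelope; by contrast, 20 added sulfur atoms would shift the mass by roughly 639 Da, which would be evident at any resolution.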

      Another important point is a new approach to the HPE-IAM application. Zinc-binding MT3 was incubated with 5 mM reagent at 60°C for 36 h. The authors claim that a high concentration was required because apoMT3 has a stable conformation. Figure 2B shows that product concentration increases with higher temperature, but it is unclear why such a high temperature was used. Figure 1D shows that at 37°C, there is almost no reaction at 5 mM reagent. Changing parameters sounds reasonable only when the reaction is monitored by mass spectrometry. In conclusion, about 20 sulfane sulfur atoms present in MT3 would be clearly visible. Such evidence was not provided. Increased temperature and reagent concentration could cause modification of cysteinyl thiol/thiolates as well, not only persulfides/polysulfides. Therefore, it is highly possible that non-modified MT3 protein could react with HPE-IAM, giving false results. Besides mass spectrometry, which would clearly prove modifications of 20 Cys, the authors should use a very important control, which could be a chemically synthesized beta- or alpha-domain of MT3 reconstituted with zinc (many protocols are present in the literature). Such models are commonly used to test any kind of chemistry of MTs. If a non-modified, chemically obtained domain underwent a reaction with HPE-IAM under such rigorous conditions, then my expectation would be right.

      Thank you for your comments. Although we have already confirmed that no false-positive results were observed using this method in Fig. 5 (previously Fig. 4), we have conducted additional experiments by preparing chemically synthesized α- and β-domains of GIF/MT-3, as well as recombinant α- and β-domains of GIF/MT-3. As shown in the new Fig. S2A, the chemically synthesized α- and β-domains of GIF/MT-3 detected almost no sulfane sulfur (less than 1 molecule per protein), whereas the recombinant α- and β-domains detected several molecules of sulfane sulfur (more than 5 molecules per protein) (Fig. S2A). Therefore, I would like to emphasize here that the cysteine residue itself cannot be the source of the bis-S-HPE-AM product (sulfane sulfur derivative).

      Accordingly, we have added the following sentences in the results section: “Because this assay was performed at relatively high temperatures (60°C), we also examined the sulfane sulfur levels of several mutant proteins using chemically synthesized α- and β-domains of GIF/MT-3 to eliminate false-positive results. As shown in Fig. S2A, sulfane sulfur (less than 1 molecule per protein) was undetectable in chemically synthesized α- and β-domains of GIF/MT-3, whereas several molecules of sulfane sulfur per protein were detected in recombinant α- and β-domains (Fig. S2B, left panel). These findings indicated that the sulfane sulfur detected in our assay was derived from biological processes executed during the production of GIF/MT-3 protein. We further analyzed mutant proteins with β-Cys-to-Ala and α-Cys-to-Ala substitutions and found that their sulfane sulfur levels were comparable with those of the α- and β-domains of GIF/MT-3, respectively (Fig. S2B, left panel). Additionally, Ser-to-Ala mutation did not affect the sulfane sulfur levels of GIF/MT-3. The zinc content of each mutant protein was also determined under these conditions (Fig. S2B, right panel).”

      - The remaining experiments provided in the manuscript can also be applied for non-modified protein (without sulfane sulfur modification) and do not provide worthwhile evidence. For instance, hydrogen peroxide or SNAP may interact with non-modified MTs. Zinc ions dissociate due to cysteine residue modification, and TCEP may reduce oxidized residue to rescue zinc binding. Again, mass spectrometry would provide nice evidence.

      Thank you for your comment. We understand that such experiments can also be applied to non-modified proteins (without sulfane sulfur modification). However, the experiments shown in Fig. 4 and Fig. 6 were conducted to investigate the role of sulfane sulfur under oxidative stress conditions, rather than to examine sulfur modification in the protein itself. As mentioned previously, it is difficult to detect sulfur modifications directly in the protein using MALDI-TOF/MS (Fig. 1), as sulfur modifications appear to dissociate during the laser desorption/ionization process.

      - The same is thioredoxin (Fig. 7) and its reaction with oxidized MT3. Nonmodified and oxidized MT3 would react as well.

      Thank you for your comment. We understand that such experiments can also be applied to non-modified MT-3 protein. However, to the best of our knowledge, this is the first report demonstrating that apo-MT-3 can serve as a good substrate for the Trx system. In fact, this experiment is not intended to prove that MT-3 is sulfane sulfur-binding protein. Rather, it demonstrates the novel finding that apo-MT3 serves as an excellent substrate for Trx and that the sulfane sulfur (persulfide structure) remains intact throughout the reduction process.

      - If HPE-IAM reacts with Cys residues with unmodified MT3, which is more likely the case under used conditions, the protein product of such reaction will not bind zinc. It could be an explanation of the cyanolysis experiment (Fig. 6).

      Thank you for your comment. As you pointed out, HPE-IAM reacts with cysteine residues in unmodified MT-3, thereby preventing zinc from binding to the protein. However, we did not use HPE-IAM prior to measuring zinc binding. Instead, HPE-IAM was used solely for determining the sulfane sulfur content in the protein, and thus it cannot explain the results of the cyanolysis experiment.

      - Figure 4 shows the reactivity of (poly)sulfides with TCEP and HPE-IAM. What are the redox potentials? Do they correlate with the obtained results?

      Thank you for your comment. However, we must apologize, as we do not fully understand the rationale behind determining redox potentials in this experiment. We believe the data themselves are very clear and present convincing results.

      - Raman spectroscopy experiments would illustrate the presence of sulfane sulfur in MT3 only if all Cys were modified.

      Yes, that is correct. Since approximately 20 sulfane sulfur atoms are detected in the protein with 20 cysteine residues, we believe that nearly all cysteine residues are modified by sulfane sulfur. Therefore, Raman spectroscopy is considered applicable to our current study.

      - The modeling presented in this study is very interesting and confirms the flexibility of metallothioneins. MT domains are known to bind various metal ions of different diameters. In this way, they adapt to ions of larger size. The same mechanism could be present from the protein site. The presence of 9 or 11 sulfur atoms in the beta or alpha domain would increase the size of the domains without changing the cluster structure.

      We truly appreciate your positive evaluation of this work.

      - Comment to authors. Apo-MT is not present in the cell. It exists as a partially metallated species. The term "apo-MT" was introduced to explain that MTs are not fully saturated by metals and function as a metal buffer system. Apo-MT comes from old ages when MT was considered to be present only in two forms: apo-form and fully saturated forms.

      Thank you for your insightful comments. We find it reasonable to understand that apo-MT exists as a partially metallated species within the cell.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors reveal that GIF/MT-3 regulates zinc homeostasis depending on the cellular redox status. The manuscript is technically sound, and the data concretely suggest that the recombinant MTs, not only GIF/MT-3 but also canonical MTs such as MT-1 and MT-2, contain sulfane sulfur atoms for Zn binding. The scenario proposed by the authors seems reasonable to explain how Zn homeostasis is regulated by the cellular redox balance.

      Strengths:

      The data presented in the manuscript solidly reveal that recombinant GIF/MT-3 contains sulfane sulfur.

      Weaknesses:

      It is still unclear whether native MTs, in particular, induced MTs in vivo contain sulfane sulfur or not.

      Thank you for pointing out the strengths and weaknesses of this manuscript. Based on your suggestions, we have determined the sulfane sulfur content in the native GIF/MT-3 protein, as explained in our response to "Recommendations for the Authors #2."

      Reviewer #3 (Public Review):

      Summary:

      The authors were trying to show that a novel neuronal metallothionein of poorly defined function, GIF/MT3, is actually heavily persulfidated in both the Zn-bound and apo (metal-free) forms of the molecule as purified from a heterologous or native host. Evidence in support of this conclusion is compelling, with both spectroscopic and mass spectrometry evidence strongly consistent with this general conclusion. The authors would appear to have achieved their aims.

      Strengths:

      The analytical data in support of the authors' primary conclusions are compelling. The authors also provide some modeling evidence that strongly supports the contention that MT3 (and other MTs) can readily accommodate sulfane sulfur on each of the 20 cysteines in the Zn-bound structure, with little perturbation of the structure. This is not the case with Cys trisulfides, which suggests that the persulfide-metallated state is clearly positioned at lower energy relative to the immediately adjacent thiolate- or trisulfidated metal coordination complexes.

      Weaknesses:

      The biological significance of the findings is not entirely clear. On the one hand, the analytical data are clearly solid (albeit using a protein derived from a bacterial over-expression experiment), and yes, it's true that sulfane S can protect Cys from overoxidation, but everything shown in the summary figure (Fig. 8D) can be done with Zn release from a thiol by ROS, and subsequent reduction by the Trx/TR system. In addition, it's long been known that Zn itself can protect Cys from oxidation. I view this as a minor weakness that will motivate follow-up studies. Fig. 1 was incomplete in its discussion and only suggests that a few S atoms may be covalently bound to MT3 as isolated. This is in contrast to the sulfane S "release" experiment, which I find quite compelling.

      Impact:

      The impact will be high since the finding is potentially disruptive to the metals-in-biology field in general and the MT field for sure. The sulfane sulfur counting experiment (the HPE-IAM electrophile trapping experiment) may well be widely adopted by the field. Those of us in the metals field always knew that this was a possibility, and it will be interesting to see the extent to which metal-binding thiolates broadly incorporate sulfane sulfur into their first coordination shells.

      Thank you for pointing out the strengths and weaknesses of this manuscript. As you noted, the explanations and discussions regarding Fig. 1 were missing. To address this, we have added the following sentences to the discussion section: “However, FT-ICR-MALDI-TOF/MS analysis failed to detect sulfur modifications in GIF/MT-3 (Fig. 1B), suggesting that sulfur modifications in the protein were dissociated during laser desorption/ionization. Therefore, we postulate that the small amount of sulfur detected in oxidized apo-GIF/MT-3 is derived from the effect of laser desorption/ionization rather than any actual modification of the minority component.”

      Reviewer #1 (Recommendations For The Authors):

      Overall, the topic of the study is interesting, but the provided evidence is insufficient to claim that MT3 is a sulfane sulfur-binding protein. Indeed, some recent studies showed that natural and recombinant MT proteins can be modified, but only one or a few cysteine residues were modified. Authors should follow my suggestion and apply mass spectrometry to all performed reactions and, first of all, to freshly obtained protein. I strongly suggest using chemically synthesized and reconstituted domains to test whether the home-developed approach is appropriate. Moreover, native MS and ICP-MS analysis of MT3 would support their claims.

      Thank you for your insightful comments. Following your suggestions, we have prepared chemically synthesized proteins of the α- and β-domains of GIF/MT-3 and conducted additional experiments, as explained in response comments to “Public Review #1”. Regarding the MS analysis, we have also added a discussion on the difficulty of detecting sulfur modifications in the protein.

      Reviewer #2 (Recommendations For The Authors):

      I have some minor points which should be considered by the authors.

      (1) Table 1: In the simulation by MOE, the authors speculated 7 atoms of metal bound to GIF/MT-3. Although a total of 7 atoms of Zn or Cd are actually bound to MTs as a divalent ion, the number of Cu and Hg bound to MTs as a monovalent ion is scientifically controversial. Several ideas have been proposed in the literature, however, "7 atoms of Cu or Hg" could be inappropriate as far as I know. The authors should simulate again using a more appropriate number of Cu or Hg in MTs.

      Thank you for providing this valuable information. We reviewed several papers by the Stillman group and found that the relative binding constants of Cu4-MT, Cu6-MT, and Cu10-MT were determined after the addition of Cu(I) to apo MT-1A, MT-2, and MT-3 (Melenbacher and Stillman, Metallomics, 2024). However, incorporating these copper numbers into our GIF/MT-3 simulation model proved challenging. Therefore, we decided to omit the score value for copper in Table 1.

      On the other hand, some researchers have reported that mercury binds to MT as a divalent ion, and the formation of Hg<sub>7</sub>MT is possible (not just other forms). Therefore, we decided to continue using the score value for mercury shown in Table 1.

      (2) If possible, native MT samples isolated from an experimental animal should be evaluated for the sulfane sulfur content. Canonical MTs, MT-1 and MT-2, are highly inducible by not only heavy metals but also oxidative stress. Under the oxidative stress condition such as the exposure of hydrogen peroxide, it is questionable whether the induced Zn-MTs contain sulfane sulfur or not.

      Following your suggestion, we evaluated the sulfane sulfur content in native GIF/MT-3 samples isolated from mouse brain cytosol (Fig. 10). The measured amount was 3.3 per protein. This suggests that sulfane sulfur in GIF/MT-3 could be consumed under oxidative conditions, as you anticipated. Another possible explanation for the discrepancy between the native form and the recombinant protein relates to metal binding in the protein. It is generally understood that both zinc and copper bind to GIF/MT-3 in approximately equal proportions in vivo. When we prepared recombinant copper-binding GIF/MT-3 protein, the sulfane sulfur content in the protein was significantly different (approximately 4.0 per protein) compared to the Zn<sub>7</sub>GIF/MT-3 form. Further studies are needed to clarify the relationship between sulfane sulfur binding and the types of metals.

      (3) The biological significance of sulfane sulfur in MTs is still unclear to me.

      Thank you for your comments. To address this question, we have added the following sentence to the discussion section: “The biological significance of sulfane sulfur in MTs lies in its ability to 1) contribute to metal binding affinity, 2) provide a sensing mechanism against oxidative stress, and 3) aid in the regeneration of the protein.”

      (4) According to the widely accepted nomenclature of MT, "MT3" should be amended to "MT-3".

      According to your suggestion, we have amended from MT3 to MT-3 throughout the manuscript.

      Reviewer #3 (Recommendations For The Authors):

      Most of my comments are editorial in nature, largely focused on what I perceive as overinterpretation or unnecessary speculation.

      The authors state in the abstract that the intersection of sulfane sulfur and Zn enzymes "has been overlooked." This is not actually true - please tone down to "under investigated" or something like this.

      Based on your suggestion, we have replaced the term “has been overlooked” with “has been under investigated” in the abstract.

      Line 228: The discussion of Fig. 6C involved too much speculation. I cannot see a quantitative experiment that supports this.

      Based on your suggestion, we have removed Fig. 6C (currently referred to as Fig. 7C). Additionally, we have revised the sentence from “implying that the sulfane sulfur is an essential zinc ligand in apo-GIF/MT3 and that an asymmetric SSH or SH ligand is insufficient for native zinc binding (Fig. 6C)” to “implying the contribution of sulfane sulfur to zinc binding in GIF/MT-3”.

      Line 247 "persulfide in apo-GIF/MT3 seems.." I think the authors mean that the Zn form of the protein is resistant to Trx or TCEP.

      Thank you for pointing this out. We realized that the term “persulfide in apo-GIF/MT3” might be confusing. Therefore, we have replaced it with “persulfide formation derived from apo-GIF/MT3” in the corresponding sentence.

      Molecular modeling: We need more details- were these structures energy-minimized in any way? Can the authors comment on the plethora of S-S dihedral angles in these structures, and whether they are consistent with expectations of covalent geometry? Please add text to explain or even a table that compiles these data.

      Thank you for your comment. Yes, energy minimization calculations for structural optimization were conducted during homology modeling in MOE. In fact, we have already stated in the Methods section that “Refinement of the model with the lowest generalized Born/volume integral (GBVI) score was achieved through energy minimization of outlier residues in Ramachandran plots generated within MOE.” In this model, covalent geometry, including the S-S dihedral angles, is also taken into consideration.

      What is a thermostability score? Perhaps a bit more discussion here and what relationship this has to an apparent (or macroscopic) metal affinity constant.

      The thermostability score is used to compare the thermal stability between the wild-type and mutant proteins. As shown in Equation (1) in the Methods section, it is calculated by subtracting the energy of the hypothetical unfolded state from the energy of the folded state. Since obtaining the structure of the unfolded state requires extensive computational effort, MOE employs an empirical formula based on two-dimensional structural features to estimate it. The ΔΔG values represent the difference between ΔGf(WT) and ΔGf(Mut). However, because it is difficult to directly determine ΔGf(Mut) and ΔGf(WT), MOE calculates ΔΔG using the thermodynamic cycle equivalence: ΔΔGs = ΔGsf(WT→Mut) − ΔGsu(WT→Mut), as expressed in Equation (1).
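
      For illustration only, the thermodynamic cycle described above can be written out as follows. The sign convention (mutant minus wild type), the symbols G with "folded" and "unfolded" subscripts, and the reading of "ΔGsf" and "ΔGsu" as the folded-state and unfolded-state mutation terms are assumptions made for this sketch; they are not taken from the manuscript or from MOE's documentation.

```latex
\begin{align}
\Delta G_f(\mathrm{WT}) &= G_{\mathrm{folded}}(\mathrm{WT}) - G_{\mathrm{unfolded}}(\mathrm{WT}) \\
\Delta G_f(\mathrm{Mut}) &= G_{\mathrm{folded}}(\mathrm{Mut}) - G_{\mathrm{unfolded}}(\mathrm{Mut}) \\
\Delta\Delta G_s &= \Delta G_f(\mathrm{Mut}) - \Delta G_f(\mathrm{WT}) \\
  &= \bigl[G_{\mathrm{folded}}(\mathrm{Mut}) - G_{\mathrm{folded}}(\mathrm{WT})\bigr]
   - \bigl[G_{\mathrm{unfolded}}(\mathrm{Mut}) - G_{\mathrm{unfolded}}(\mathrm{WT})\bigr] \\
  &= \Delta G^{f}_{s}(\mathrm{WT}\to\mathrm{Mut}) - \Delta G^{u}_{s}(\mathrm{WT}\to\mathrm{Mut})
\end{align}
```

      Regrouping the four state energies in this way is what allows the two folding free energies, which are hard to obtain directly, to be replaced by two mutation terms.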

      On the other hand, the affinity score represents the interaction energy between the target ligand and the protein. In this study, we calculated the affinity score by selecting metal atoms as the ligands. The interaction energy (E<sub>int</sub>) is defined as:

      E<sub>int</sub> = E<sub>complex</sub> − E<sub>receptor</sub> − E<sub>ligand</sub>

      where each term is as follows:

      E<sub>complex</sub>: Potential energy of the complex.

      E<sub>receptor</sub>: Potential energy of the receptor alone.

      E<sub>ligand</sub>: Potential energy of the ligand alone.

      Each potential energy term includes contributions from bonded interactions such as bond lengths and bond angles. However, since there is no structural difference among E<sub>complex</sub>, E<sub>receptor</sub>, and E<sub>ligand</sub>, the bonded energy components cancel out. Consequently, E<sub>int</sub> is determined as:

      E<sub>int</sub> = ΔE<sub>ele</sub> + ΔE<sub>vdW</sub> + ΔE<sub>sol</sub>

      Here, a negative E<sub>int</sub> indicates that the complex is more stable, while a positive E<sub>int</sub> implies that the receptor and ligand are more stable in their dissociated states.
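
      The bookkeeping behind the interaction energy can be sketched in a few lines of Python. This is a minimal illustration, not MOE's implementation: the class and function names are hypothetical, and the numerical values in the usage note are placeholders rather than results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PotentialEnergies:
    """Potential energies in kcal/mol (hypothetical container, not a MOE type)."""
    complex_: float   # energy of the metal-protein complex
    receptor: float   # energy of the protein alone
    ligand: float     # energy of the metal ligand alone


def interaction_energy(e: PotentialEnergies) -> float:
    """E_int = E_complex - E_receptor - E_ligand."""
    return e.complex_ - e.receptor - e.ligand


def interaction_energy_from_components(d_ele: float, d_vdw: float, d_sol: float) -> float:
    """E_int as the sum of electrostatic, van der Waals, and solvation differences,
    valid once the bonded terms have cancelled."""
    return d_ele + d_vdw + d_sol


def complex_is_favored(e_int: float) -> bool:
    """A negative interaction energy means the bound complex is more stable."""
    return e_int < 0.0
```

      With placeholder energies of −150, −100, and −20 kcal/mol for the complex, receptor, and ligand, `interaction_energy` gives a negative value, so `complex_is_favored` reports that the bound state is the more stable one.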

      We have revised the sentence "The affinity score was also calculated using MOE software as the difference between the ΔΔGs values of the protein, free zinc, and metal–protein complex” to "The affinity score was also calculated using MOE software as the difference between the potential energy values of the protein, free zinc, and metal–protein complex” to correct the misdescription.

      Lines 278-280: The authors state that they observe a "marked enhancement of metal binding affinity, and rearrangement of zinc ions." I don't see support for this rather provocative conclusion. This is the expectation of course. I would love to see actual experimental data on this point, direct binding titrations with metals performed before and after the release of the sulfate sulfur atoms.

      Thank you for your comments. Although this statement is based on the 3D modeling simulation, we have also experimentally observed that the diminishment of sulfane sulfur in GIF/MT-3 resulted in a decrease in zinc binding levels, as shown in Fig. 7. However, conducting direct binding titration experiments was difficult for us due to the difficulty in preparing pure GIF/MT-3 protein with or without sulfane sulfur. Therefore, we have revised the sentence "marked enhancement of metal binding affinity, and rearrangement of zinc ions" to simply "enhancement of metal binding affinity" to avoid over-speculation.

      Table I- quantitatively lower stability for the Cu complex- the stoichiometry is clearly wrong in this simulation- please redo this simulation with the right stoichiometry or Cu to MT3- consult a Stillman paper.

      Thank you for providing this valuable information. We reviewed several papers by the Stillman group and found that the relative binding constants of Cu4-MT, Cu6-MT, and Cu10-MT were determined after the addition of Cu(I) to apo MT-1A, MT-2, and MT-3 (Melenbacher and Stillman, Metallomics, 2024). However, incorporating these copper numbers into our GIF/MT-3 simulation model proved challenging. Therefore, we decided to omit the score value for copper in Table 1.

      I like the model for reversible metal release mediated by the thioredoxin system (Fig. 8D)- but you can also do this with thiols- nothing really novel here. Has it been generally established that tetrasulfides are better substrates for the Trx/TR system? The data shown in Fig. 7B seems to suggest this, but is this broadly true, from the literature?

      There are reports describing that persulfides and polysulfides are reduced by the thioredoxin system. However, it is not well-established that tetrasulfides are better substrates for the Trx/TR system. To the best of our knowledge, this is the first report demonstrating that apo-MT-3 can serve as a good substrate for the Trx/TR system. Further research is required to compare the catalytic efficiency between proteins containing disulfide and those with tetrasulfide moieties.

      Line 380: Many groups have reported that many proteins are per- or polysulfidated in a whole host of cells using mass spectrometry workflows, and that terminal persulfides can be readily reduced by general or specific Trx/TR systems. This work could be better acknowledged in the context of the authors' demonstration of the reduction of the tetrasulfides, which itself would appear to be novel (and exciting!).

      We truly appreciate your positive evaluation of this work.

    1. eLife Assessment

      This fundamental article significantly advances our understanding of FGF signalling, and in particular, highlights the complex modifications affecting this pathway. The evidence for the authors' claims is convincing, combining state-of-the-art conditional gene deletion in the mouse lens with histological and molecular approaches. This work should be of great interest to molecular and developmental biologists beyond the lens community.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript uses the eye lens as a model to investigate basic mechanisms in the Fgf signaling pathway. Understanding Fgf signaling is of broad importance to biologists as it is involved in the regulation of various developmental processes in different tissues/organs and is often misregulated in disease states. The Fgf pathway has been studied in embryonic lens development, namely with regards to its involvement in controlling events such as tissue invagination, vesicle formation, epithelium proliferation and cellular differentiation, thus making the lens a good system to uncover the mechanistic basis of how the modulation of this pathway drives specific outcomes. Previous work has suggested that proteins, other than the ones currently known (e.g., the adaptor protein Frs2), are likely involved in Fgfr signaling. The present study focuses on the role of Shp2 and Shc1 proteins in the recruitment of Grb2 in the events downstream of Fgfr activation.

      Strengths:

      The findings reveal that the juxtamembrane region of the Fgf receptor is necessary for proper control of downstream events such as facilitating key changes in transcription and cytoskeleton during tissue morphogenesis. The authors conditionally deleted all four Fgfrs in the mouse lens that resulted in molecular and morphological lens defects, most importantly, preventing the upregulation of the lens induction markers Sox2 and Foxe3 and the apical localization of F-actin, thus demonstrating the importance of Fgfrs in early lens development, i.e. during lens induction. They also examined the impact of deleting Fgfr1 and 2, on the following stage, i.e. lens vesicle development, which could be rescued by expressing constitutively active KrasG12D. By using specific mutations (e.g. Fgfr1ΔFrs lacking the Frs2 binding domain and Fgfr2LR harboring mutations that prevent binding of Frs2), it is demonstrated that the Frs2 binding site on Fgfr is necessary for specific events such as morphogenesis of lens vesicle. Further, by studying Shp2 mutations and deletions, the authors present a case for Shp2 protein to function in a context-specific manner in the role of an adaptor protein and a phosphatase enzyme. Finally, the key surprising finding from this study is that downstream of Fgfr signaling, Shc1 is an important alternative pathway - in addition to Shp2 - involved in the recruitment of Grb2 and in the subsequent activation of Ras. The methodologies, namely, mouse genetics and state-of-the-art cell/molecular/biochemical assays are appropriately used to collect the data, which are soundly interpreted to reach these important conclusions. Overall, these findings reveal the flexibility of the Fgf signaling pathway and its downstream mediators in regulating cellular events. This work is expected to be of broad interest to molecular and developmental biologists.

      Weaknesses:

      A weakness that needs to be discussed is that Le-Cre depends on Pax6 activation, and hence its use in specific gene deletion will not allow evaluation of the requirement of Fgfrs in the expression of Pax6 itself. But since this is the earliest Cre available for deletion in the lens, mentioning this in the discussion would make the readers aware of this issue.

    3. Reviewer #2 (Public review):

      Summary

      I have reviewed the revised manuscript submitted by Wang et al., which is entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development". In this paper, the authors first examined lens phenotypes in mice with Le-Cre-mediated knockdown (KD) of all four FGFR (FGFR1-4), and found that pERK signals, Jag1 and foxe3 expression are absent or drastically reduced, indicating that FGF signaling is essential for lens induction. Next, the authors examined lens phenotypes of FGFR1/2-KD mice and found that lens fiber differentiation is compromised and that proliferative activity and cell survival are also compromised in lens epithelium. Interestingly, Kras activation rescues defects in lens growth and lens fiber differentiation in FGFR1/2-KD mice, indicating that Ras activation is a key step for lens development, downstream of FGF signaling. Next, the authors examined the role of Frs2, Shp2 and Grb2 in FGF signaling for lens development. They confirmed that lens fiber differentiation is compromised in FGFR1/3-KD mice combined with Frs2-dysfunctional FGFR2 mutants, which is similar to lens phenotypes of Grb2-KD mice. However, lens defects are milder in mice with Shp2YF/YF and Shp2CS mutant alleles, indicating that involvement of Shp2 is limited for the Grb2 recruitment for lens fiber differentiation. Lastly, the authors showed new evidence on the possibility that another adapter protein, Shc1, promotes Grb2 recruitment independent of Frs2/Shp2-mediated Grb2 recruitment.

      Strength

      Overall, the manuscript provides valuable data on how FGFR activation leads to Ras activation through the adapter platform of Frs2/Shp2/Grb2, which advances our understanding of complex modification of the FGF signaling pathway. The authors applied a genetic approach using mice, whose methods and results are valid to support the conclusion. The discussion also well summarizes the significance of their findings.

      Weakness

      The authors found that the new adaptor protein Shc1 is involved in Grb2 recruitment in response to FGF receptor activation. However, the main data on Shc1 are only histological sections and statistical evaluation of lens size. In the revised manuscript, the authors did not address my major concern that cellular-level data are missing; the current data are not sufficient to fully support their main conclusion on the involvement of Shc1 in Grb2 recruitment in FGF signaling for lens development. Since the title of this manuscript is that Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development, it is important to provide cellular-level evidence on Shc1.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development" by Wang et al., investigates the molecular mechanism used by FGFR signaling to support lens development. The lens has long been known to depend on FGFR-signaling for proper development. Previous investigations have demonstrated the FGFR signaling is required for embryonic lens cell survival and for lens fiber cell differentiation. The requirement of FGFR signaling for lens induction has remained more controversial as deletion of both Fgfr1 and Fgfr2 during lens placode formation does not prevent the induction of definitive lens markers such as FOXE3 or αA-crystallin. Here the authors have used the Le-Cre driver to delete all four FGFR genes from the developing lens placode demonstrating a definitive failure of lens induction in the absence of FGFR-signaling. The authors focused on FGFR1 and FGFR2, the two primary FGFRs present during early lens development and demonstrated that lens development could be significantly rescued in lenses lacking both FGFR1 and FGFR2 by expressing a constitutively active allele of KRAS. They also showed that the removal of pro-apoptotic genes Bax and Bak could also lead to a substantial rescue of lens development in lenses lacking both FGFR1 and FGFR2. In both cases, the lens rescue included both increased lens size and the expression of genes characteristic of lens cells.

      Significantly the authors concentrated on the juxtamembrane domain, a portion of the FGFRs associated with FRS2. Previous investigations have demonstrated the importance of FRS2 activation for mediating a sustained level of ERK activation. FRS2 is known to associate both with GRB2 and SHP2 to activate RAS. The authors utilized a mutant allele of Fgfr1, lacking the entire juxtamembrane domain (Fgfr1ΔFrs) and an allele of Fgfr2 containing two point mutations essential for Frs2 binding (Fgfr2LR). When combining three floxed alleles and leaving only one functional allele (Fgfr1ΔFrs or Fgfr2LR) the authors got strikingly different phenotypes. When only the Fgfr1ΔFrs allele was retained, the lens phenotype matched that of deleting both Fgfr1 and Fgfr2. However, when only the Fgfr2LR allele was retained the phenotype was significantly milder, primarily affecting lens fiber cell differentiation, suggesting that something other than FRS2 might be interacting with the juxtamembrane domain to support FGFR signaling in the lens. The authors also deleted Grb2 in the lens and showed that the phenotype was similar to that of the lenses only retaining the Fgfr2LR allele, resulting in a failure of lens fiber cell differentiation and decreased lens cell survival. However, mutating the major tyrosine phosphorylation site of GRB2 did not affect lens development. The authors additionally investigated the role of SHP2 in lens development by either deleting SHP2 or by making mutations in the SHP2 catalytic domain. The deletion of the SHP2 phosphatase activity did not affect lens development as severely as total loss of SHP2 protein, suggesting a function for SHP2 outside of its catalytic activity. Although the loss of Shc1 alone has only a slight effect on lens size and pERK activation in the lens, the authors showed that the loss of Shc1 exacerbated the lens phenotype in lenses lacking both Frs2 and Shp2. The authors suggest that SHC1 binds to the FGFR juxtamembrane domain allowing for the recruitment of GRB2 independently of FRS2.

      Strengths:

      (1) The authors used a variety of genetic tools to carefully dissect the essential signals downstream of FGFR signaling during lens development.

      (2) The authors made a convincing case that something other than FRS2 binding mediates FGFR signaling in the juxtamembrane domain.

      (3) The authors demonstrated that although both the adaptor function and the phosphatase activity of SHP2 are required for embryonic survival, neither of these activities is absolutely required for lens development.

      (4) The authors provide more information as to why FGFR loss has a phenotype much more severe than the loss of FRS2 alone during lens development.

      (5) The authors followed up their work analyzing various signaling molecules in the context of lens development with biochemical analyses of FGF-induced phosphorylation in murine embryonic fibroblasts (MEFs).

      (6) In general, this manuscript represents a Herculean effort to dissect FGFR signaling in vivo, with biochemical backing from cell culture experiments in vitro.

      Weaknesses:

      (1) The authors demonstrate that the loss of FGFR1 and FGFR2 can be compensated by a constitutive active KRAS allele in the lens and suggest that FGFRs largely support lens development only by driving ERK activation. However, the authors also saw that lens development was substantially rescued by preventing apoptosis through the deletion of BAK and BAX. To my knowledge, the deletion of BAK and BAX should not independently activate ERK. The authors do not show whether ERK activation is restored in the BAK/BAX deficient lenses. Do the authors suggest the FGFR3 and/or FGFR4 provide sufficient RAS and ERK activation for lens development when apoptosis is suppressed? Alternatively, is it the survival function of FGFR-signaling as much as a direct effect on lens differentiation?

      (2) Do the authors suggest that GRB2 is required for RAS activation and ultimately ERK activation? If so, do the authors suggest that ERK activation is not required for FGFR-signaling to mediate lens induction? This would follow considering that the GRB2 deficient lenses lack a problem with lens induction.

      (3) The increase in p-Shc is only slightly higher in the Cre FGFR1f/f FGFR2r/LR than in the FGFR1f/Δfrs FGFR2f/f. Can the authors provide quantification?

      (4) The authors have not shown directly that Shc1 binds to the juxtamembrane region of either Fgfr1 or Fgfr2.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript uses the eye lens as a model to investigate basic mechanisms in the Fgf signaling pathway. Understanding Fgf signaling is of broad importance to biologists as it is involved in the regulation of various developmental processes in different tissues/organs and is often misregulated in disease states. The Fgf pathway has been studied in embryonic lens development, namely with regards to its involvement in controlling events such as tissue invagination, vesicle formation, epithelium proliferation, and cellular differentiation, thus making the lens a good system to uncover the mechanistic basis of how the modulation of this pathway drives specific outcomes. Previous work has suggested that proteins, other than the ones currently known (e.g., the adaptor protein Frs2), are likely involved in Fgfr signaling. The present study focuses on the role of Shp2 and Shc1 proteins in the recruitment of Grb2 in the events downstream of Fgfr activation.

      Strengths:

      The findings reveal that the juxtamembrane region of the Fgf receptor is necessary for proper control of downstream events such as facilitating key changes in transcription and cytoskeleton during tissue morphogenesis. The authors conditionally deleted all four Fgfrs in the mouse lens that resulted in molecular and morphological lens defects, most importantly, preventing the upregulation of the lens induction markers Sox2 and Foxe3 and the apical localization of F-actin, thus demonstrating the importance of Fgfrs in early lens development, i.e. during lens induction. They also examined the impact of deleting Fgfr1 and 2, on the following stage, i.e. lens vesicle development, which could be rescued by expressing constitutively active KrasG12D. By using specific mutations (e.g. Fgfr1ΔFrs lacking the Frs2 binding domain and Fgfr2LR harboring mutations that prevent binding of Frs2), it is demonstrated that the Frs2 binding site on Fgfr is necessary for specific events such as morphogenesis of lens vesicle. Further, by studying Shp2 mutations and deletions, the authors present a case for Shp2 protein to function in a context-specific manner in the role of an adaptor protein and a phosphatase enzyme. Finally, the key surprising finding from this study is that downstream of Fgfr signaling, Shc1 is an important alternative pathway - in addition to Shp2 - involved in the recruitment of Grb2 and in the subsequent activation of Ras. The methodologies, namely, mouse genetics and state-of-the-art cell/molecular/biochemical assays are appropriately used to collect the data, which are soundly interpreted to reach these important conclusions. Overall, these findings reveal the flexibility of the Fgf signaling pathway and its downstream mediators in regulating cellular events. This work is expected to be of broad interest to molecular and developmental biologists.

      Weaknesses:

      A weakness that needs to be discussed is that Le-Cre depends on Pax6 activation, and hence its use in specific gene deletion will not allow evaluation of the requirement of Fgfrs in the expression of Pax6 itself. But since this is the earliest Cre available for deletion in the lens, mentioning this in the discussion would make the readers aware of this issue. Referring to Jag1 among "lens-specific markers" (page 5) is debatable, suggesting changing to the lines of "the expected upregulation of Jag1 in lens vesicle". The Abstract could be modified to clearly convey the existing knowledge gap and the key findings of the present study. As it stands now, it is a bit all over the place. Some typos in the manuscript need to be fixed, e.g. "...yet its molecular mechanism remains largely resolved" - unresolved? "...in the development lens" - in the developing lens? In Figure 4 legend, "(B) Grb2 mutants Grb2 mutants displayed...", etc.

      We thank the reviewer for the thoughtful and constructive feedback. We have added the caveat regarding the Le-Cre dependency on Pax6 expression to the discussion, removed the reference to Jag1 as a “lens-specific marker” and corrected the typographical errors noted by the reviewer.

      Reviewer #2 (Public review):

      Summary:

      I have reviewed a manuscript submitted by Wang et al., which is entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development". In this paper, the authors first examined lens phenotypes in mice with Le-Cre-mediated knockdown (KD) of all four FGFR (FGFR1-4), and found that pERK signals, Jag1, and foxe3 expression are absent or drastically reduced, indicating that FGF signaling is essential for lens induction. Next, the authors examined lens phenotypes of FGFR1/2-KD mice and found that lens fiber differentiation is compromised and that proliferative activity and cell survival are also compromised in lens epithelium. Interestingly, Kras activation rescues defects in lens growth and lens fiber differentiation in FGFR1/2-KD mice, indicating that Ras activation is a key step for lens development. Next, the authors examined the role of Frs2, Shp2, and Grb2 in FGF signaling for lens development. They confirmed that lens fiber differentiation is compromised in FGFR1/3-KD mice combined with Frs2-dysfunctional FGFR2 mutants, which is similar to lens phenotypes of Grb2-KD mice. However, lens defects are milder in mice with Shp2YF/YF and Shp2CS mutant alleles, indicating that the involvement of Shp2 is limited for the Grb2 recruitment for lens fiber differentiation. Lastly, the authors showed new evidence on the possibility that another adapter protein, Shc1, promotes Grb2 recruitment independent of Frs2/Shp2-mediated Grb2 recruitment.

      Strengths:

      Overall, the manuscript provides valuable data on how FGFR activation leads to Ras activation through the adapter platform of Frs2/Shp2/Grb2, which advances our understanding of complex modification of the FGF signaling pathway. The authors applied a genetic approach using mice, whose methods and results are valid to support the conclusion. The discussion also well summarizes the significance of their findings.

      Weaknesses:

      The authors eventually found that the new adaptor protein Shc1 is involved in Grb2 recruitment in response to FGF receptor activation. However, the main data for Shc1 are histological sections and statistical evaluation of lens size. So, my major concern is that the authors need to provide more detailed data to support the involvement of Shc1 in Grb2 recruitment of FGF signaling for lens development.

      We thank the reviewer for the positive comments and valuable suggestions. We have addressed the concerns in detail in the response to the recommendation outlined below.

      Reviewer #3 (Public review):

      Summary:

      The manuscript entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development" by Wang et al., investigates the molecular mechanism used by FGFR signaling to support lens development. The lens has long been known to depend on FGFR signaling for proper development. Previous investigations have demonstrated that FGFR signaling is required for embryonic lens cell survival and for lens fiber cell differentiation. The requirement of FGFR signaling for lens induction has remained more controversial as deletion of both Fgfr1 and Fgfr2 during lens placode formation does not prevent the induction of definitive lens markers such as FOXE3 or αA-crystallin. Here the authors have used the Le-Cre driver to delete all four FGFR genes from the developing lens placode demonstrating a definitive failure of lens induction in the absence of FGFR signaling. The authors focused on FGFR1 and FGFR2, the two primary FGFRs present during early lens development, and demonstrated that lens development could be significantly rescued in lenses lacking both FGFR1 and FGFR2 by expressing a constitutively active allele of KRAS. They also showed that the removal of pro-apoptotic genes Bax and Bak could also lead to a substantial rescue of lens development in lenses lacking both FGFR1 and FGFR2. In both cases, the lens rescue included both increased lens size and the expression of genes characteristic of lens cells.

      Significantly, the authors concentrated on the juxtamembrane domain, a portion of the FGFRs associated with FRS2. Previous investigations have demonstrated the importance of FRS2 activation for mediating a sustained level of ERK activation. FRS2 is known to associate with both GRB2 and SHP2 to activate RAS. The authors utilized a mutant allele of Fgfr1 lacking the entire juxtamembrane domain (Fgfr1ΔFrs) and an allele of Fgfr2 containing two point mutations that prevent Frs2 binding (Fgfr2LR). When combining three floxed alleles and leaving only one functional allele (Fgfr1ΔFrs or Fgfr2LR), the authors got strikingly different phenotypes. When only the Fgfr1ΔFrs allele was retained, the lens phenotype matched that of deleting both Fgfr1 and Fgfr2. However, when only the Fgfr2LR allele was retained, the phenotype was significantly milder, primarily affecting lens fiber cell differentiation, suggesting that something other than FRS2 might be interacting with the juxtamembrane domain to support FGFR signaling in the lens. The authors also deleted Grb2 in the lens and showed that the phenotype was similar to that of the lenses only retaining the Fgfr2LR allele, resulting in a failure of lens fiber cell differentiation and decreased lens cell survival. However, mutating the major tyrosine phosphorylation site of GRB2 did not affect lens development. The authors additionally investigated the role of SHP2 in lens development by either deleting SHP2 or making mutations in the SHP2 catalytic domain. The deletion of the SHP2 phosphatase activity did not affect lens development as severely as the total loss of SHP2 protein, suggesting a function for SHP2 outside of its catalytic activity. Although the loss of Shc1 alone has only a slight effect on lens size and pERK activation in the lens, the authors showed that the loss of Shc1 exacerbated the lens phenotype in lenses lacking both Frs2 and Shp2. The authors suggest that SHC1 binds to the FGFR juxtamembrane domain, allowing for the recruitment of GRB2 independently of FRS2.

      Strengths:

      (1) The authors used a variety of genetic tools to carefully dissect the essential signals downstream of FGFR signaling during lens development.

      (2) The authors made a convincing case that something other than FRS2 binding mediates FGFR signaling in the juxtamembrane domain.

      (3) The authors demonstrated that although both the adaptor function and phosphatase activity of SHP2 are required for embryonic survival, neither of these activities is absolutely required for lens development.

      (4) The authors provide more information as to why FGFR loss has a phenotype much more severe than the loss of FRS2 alone during lens development.

      (5) The authors followed up their work analyzing various signaling molecules in the context of lens development with biochemical analyses of FGF-induced phosphorylation in murine embryonic fibroblasts (MEFs).

      (6) In general, this manuscript represents a Herculean effort to dissect FGFR signaling in vivo with biochemical backing with cell culture experiments in vitro.

      We thank the reviewer for the thorough review of our paper and positive comments.

      Weaknesses:

      (1) The authors demonstrate that the loss of FGFR1 and FGFR2 can be compensated by a constitutive active KRAS allele in the lens and suggest that FGFRs largely support lens development only by driving ERK activation. However, the authors also saw that lens development was substantially rescued by preventing apoptosis through the deletion of BAK and BAX. To my knowledge, the deletion of BAK and BAX should not independently activate ERK. The authors do not show whether ERK activation is restored in the BAK/BAX deficient lenses. Do the authors suggest the FGFR3 and/or FGFR4 provide sufficient RAS and ERK activation for lens development when apoptosis is suppressed? Alternatively, is it the survival function of FGFR-signaling as much as a direct effect on lens differentiation?

      Our interpretation is that at the lens induction stage, where FGFR1 and FGFR2 are crucial, their primary function operates through Ras signaling to promote cell survival. Thus, either constitutively active KRAS or the direct suppression of apoptosis by deleting Bak and Bax is sufficient to rescue lens induction. This rescue enables the subsequent differentiation of lens progenitor cells, a process that FGFR3 and FGFR4 are sufficient to support.

      (2) The authors make the argument that deleting all four FGFRs prevented lens induction but that the deletion of only FGFR1 and FGFR2 did not. Part of this argument is the retention of FOXE3 expression, αA-crystallin expression, and PROX1 expression in the FGFR1/2 double mutants. However, in Figure 1E, and Figure 1F, the staining of the double mutant lens tissue with FOXE3, αA-crystallin, and PROX1 is unconvincing. However, the retention of FOXE3 expression in the FGFR1/FGFR2 double mutants was previously demonstrated in Garcia et al 2011. Also, there needs to be an enlargement or inset to demonstrate the retention of pSMAD in the quadruple FGFR mutants in Figure 1D.

      We have updated Figure 1E with a clearer image of FOXE3 staining to better illustrate FOXE3 expression in the FGFR1/2 double mutants. It seems there may have been a misunderstanding regarding our claims about αA-crystallin and PROX1. To clarify, our observation is that both αA-crystallin and PROX1 are lost in the FGFR1/2 double mutants, which we believe is clearly demonstrated in Figure 1F. Additionally, we have added insets to Figure 1D to highlight the retention of pSMAD.

      (3) Do the authors suggest that GRB2 is required for RAS activation and ultimately ERK activation? If so, do the authors suggest that ERK activation is not required for FGFR-signaling to mediate lens induction? This would follow considering that the GRB2 deficient lenses lack a problem with lens induction.

      We do believe that GRB2 is required for RAS-ERK signaling activation; however, ERK activation is not absolutely required for lens induction. This conclusion is consistent with our previous study, which showed that deletion of ERK1/2 did not prevent lens induction (Garg et al. eLife 2020;9:e51915), as well as with our current findings demonstrating that the GRB2-deficient mutant is still capable of supporting lens induction.

      (4) The increase in p-Shc is only slightly higher in the Cre FGFR1f/f FGFR2r/LR than in the FGFR1f/Δfrs FGFR2f/f. Can the authors provide quantification?

      pShc quantification is now provided in Fig. 7B.

      (5) The authors have not shown directly that Shc1 binds to the juxtamembrane region of either Fgfr1 or Fgfr2.

      It is not yet clear whether Shc1 directly binds to the juxtamembrane region of FGFR1 or FGFR2, as it may also be recruited indirectly. We acknowledge this as an important question that warrants further investigation in future studies.

      (6) The authors have used the Le-Cre strain for all of their lens deletion experiments. Previous work has documented that the Le-Cre transgene can cause lens defects independent of any floxed alleles in both homozygous and hemizygous states on some genetic backgrounds (Dora et al., 2014 PLoS One 9:e109193 and Lam et al., Human Genomics 2019 13(1):10). Are the controls used in these experiments Le-Cre hemizygotes?

      As stated in the Methods section, Le-Cre only or Le-Cre and heterozygous flox mice were used as controls.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Weaknesses

      There are only a few minor weaknesses that need to be addressed.

      (1) The point could be made in the Discussion that since Le-Cre depends on Pax6 placodal expression, it is challenging to evaluate the impact of deletion of the four Fgfrs on the expression of Pax6 (since Pax6 needs to be activated prior to achieving Fgfr deletion). A different Cre line (e.g. a Cre which is expressed in the surface ectoderm prior to lens placode formation) could help partially address this question, although it may not be able to comment on the requirement of the Fgfrs specifically in the lens ectoderm. Thus, it will be prudent to mention this in the discussion.

      We have added the caveat regarding the Le-Cre dependency on Pax6 expression to the discussion.

      (2) Referring to Jag1 among "lens-specific markers" (page 5) is debatable, I suggest changing it along the lines of "the expected upregulation of Jag1 in lens vesicle".

      The wording has been changed as suggested.  

      (3) The Abstract could be modified to clearly convey the existing knowledge gap and the key findings of the present study. As it stands now, it is a bit all over the place.

      The abstract has been revised.  

      (4) Some typos in the manuscript need to be fixed.

      e.g. "...yet its molecular mechanism remains largely resolved" - unresolved?, "...in the development lens" - in the developing lens?, In Fig. 4 legend, "(B) Grb2 mutants Grb2 mutants displayed...", etc.

      These typos have been corrected.

      Reviewer #2 (Recommendations for the authors):

      My specific suggestions are shown below.

      (1) The authors need to describe the role of Shc1 in FGF signaling and vertebrate lens development, by citing previous publications in the introduction.

      We have detailed previous studies on the role of Shc in FGF signaling in the Introduction and discussed its function in the vertebrate lens in the Discussion section.

      (2) Figure 1B bottom panels: Inset images seem to be missing, although frames and arrowheads are there. Please check them.

      The inset images were correctly placed.

      (3) Results (page 5, line 13): The authors mentioned "Sox2 expression remained at basal levels". Since Figure 1B indicates that Sox2 expression fails to be upregulated in FGFR1/2 mutant lens placode in contrast to Pax6, it is better to clearly mention the failure in upregulation of Sox2 expression in the FGFR1/2 mutants.

      This sentence has been rewritten as suggested.  

      (4) Results (page 6, line 8): The authors mentioned "we observed .... expression of Foxe3 in ...mutant lens cells (Figure 1E, arrows)". However, Foxe3-expressing lens cells are a very small population in Figure 1E. It is important to state the decreased number of Foxe3-expressing lens cells in FGFR1/2 mutants. In addition, I would like to request the authors to show histograms indicating sample size and statistical analysis for marker expression: Foxe3 (Figure 1E), Prox1 and aA-crystallin (Fig. 1F), cyclin D1 and TUNEL (Fig. 1G) and pmTOR and pS6 (Supplementary figure 1B).

      We added a statement indicating that the number of Foxe3-expressing cells is reduced in FGFR1/2 mutants, which is now quantified in Fig. 1H. Quantifications for Cyclin D1 and TUNEL are now shown in Fig. 1I and J, respectively. However, we chose not to quantify Prox1, αA-crystallin, pmTOR, and pS6, as the FGFR1/2 mutants showed no staining for these markers.

      (5) Results (page 6, line 19- page 7, line 6): The authors showed that inducible expression of constitutive active Kras, KrasG12D, using Le-Cre, recovered lens size to the half level of wild-type control. However, in the lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D, pERK was detected in the most posterior edge of the lens fiber core, whereas pERK was detected in the broader area of the lens in control. Furthermore, pMEK was detected in the whole lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D, whereas pMEK was detected only in the lens epithelial cells at the equator. So, the spatial profile of pERK and pMEK expression was different from those of wild-type, although the authors observed that Prox1 and Crystallin expression are normally induced in the lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D. I wonder whether the lens normally develops in mice with Le-Cre; LSL-KrasG12D? Is the lens growth enhanced in mice with Le-Cre; LSL-KrasG12D? Please add the panels of mice with Le-Cre; LSL-KrasG12D in Figure 2B and 2C. In addition, I wonder whether apoptosis is suppressed in the lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D?

      As we previously reported (Developmental Biology 355, 2011, 12–20), Le-Cre; LSL-KrasG12D did not lead to enhanced lens growth. While we agree that including images of Le-Cre; LSL-KrasG12D as controls in Fig. 2B and C and evaluating apoptosis in Le-Cre; FGFR1/2f/f; LSL-KrasG12D mutants would be appropriate, we regretfully no longer have these animals available to conduct these experiments.

      (6) Results (page 11, line 15): the PCR genotyping image of Fig. 6C seems to be missing.

      The PCR genotyping image was correctly placed below Fig. 6B. 

      (7) Results (page 11, lines 15-20): there is no citation of Figure 6D in the results section.

      The citation for Fig. 6D is added in the results section.

      (8) Figures 5H, 6H, and 7A: Western blotting of some of the pERK, ERK lanes is missing.

      These western blots all have pERK/ERK overlay images.

      (9) Figure 7A, western blotting data on pShc levels are important to suggest the involvement of Shc1 in Frs2-independent Grb2 activation by FGF stimulation. Please provide the histogram for statistical analysis.

      pShc quantification is now provided in Fig. 7B.

      (10) There is no citation of Figure 7D, E, and F in the results section. Please add them.

      These citations have been added.

      (11) Figures 7E, and 7F: The authors showed that lens morphology and lens size evaluation in genetic combinations: control, Frs2/Shc1 KD, Frs2/Shp2 KD, and Frs2/Shp2/Shc1 KD. However, I would like to request the authors to show more detailed data in these genetic combinations, for example, pERK, foxe3, Maf, Prox1, Jag1, p57, cyclin D3, g-crystallin, and TUNEL.

      Unfortunately, we no longer have these mutant mice to perform this detailed staining.

      Reviewer #3 (Recommendations for the authors):

      (1) The figure legend for Figure 2 lists (G) twice. The second (G) should be (H). Also, in Figures 2G and H there is no indication as to what stage lenses were used for the TUNEL and size analyses. I assume that it was E13.5, but it should be explicitly stated.

      The figure labeling has been corrected and the stage added to the figure legend.

      (2) In Figure 4 A the label should be gamma-crystallin rather than r-crystallin.

      The figure labeling has been corrected.

      (3) In Figure 6 D, I believe that the immunolabeling for Maf and Foxe3 are reversed. The Maf should be red as it is in the fibers and the Foxe3 should be green as it is epithelial.

      The figure labeling has been corrected.

      (4) In Figure 6C I believe that the labels for the WT and YF alleles on the western blot are reversed.

      The YF PCR band was designed to be larger than WT, so the labeling was correct as is.

      (5) In Figure 6F I believe that the labels for WT and CS on the western blot are reversed.

      The figure labeling has been corrected.

      (6) In Supplemental figure 2 there are no genotype labels for the TUNEL bar graph.

      The figure labeling has been added.

    1. eLife Assessment

      In this valuable report, the authors investigated the effect of mitochondrial transplantation on post-cardiac arrest myocardial dysfunction (PAMD), which is associated with mitochondrial dysfunction. They convincingly demonstrated that mitochondrial transplantation enhanced cardiac function and increased survival rates after the return of spontaneous circulation (ROSC). They have also shown that myocardial tissues with transplanted mitochondria exhibited increased mitochondrial complex activity, higher ATP levels, reduced cardiomyocyte apoptosis, and lower myocardial oxidative stress post-ROSC.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors investigate the effect of mitochondrial transplantation on post-cardiac arrest myocardial dysfunction (PAMD), which is associated with mitochondrial dysfunction. The authors demonstrate that mitochondrial transplantation enhances cardiac function and increases survival rates after the return of spontaneous circulation (ROSC). Mechanistically, they found that myocardial tissues with transplanted mitochondria exhibit increased mitochondrial complex activity, higher ATP levels, reduced cardiomyocyte apoptosis, and lower myocardial oxidative stress post-ROSC.

      Strengths:

      Previous studies have reported that mitochondrial transplantation can improve myocardial recovery after regional ischemia, but its potential for treating myocardial injury following cardiac arrest has not been tested yet. Therefore, the findings are somewhat novel. Remarkably, the increased survival in the mitochondria-treated group post-ROSC is very promising and highlights its translational potential.

      Comments on revisions:

      My concerns are adequately addressed.

    3. Reviewer #3 (Public review):

      In this manuscript titled "Transplantation of exogenous mitochondria mitigates myocardial dysfunction after cardiac arrest", Zhen Wang et al. report that exogenous mitochondrial transplantation can enhance myocardial function and survival rates. It limits mitochondrial morphology impairment, boosts complexes II and IV activity, and increases ATP levels. Additionally, mitochondrial therapy reduces oxidative stress, lessens myocardial injury, and improves PAMD after cardiopulmonary resuscitation. The results of this manuscript clearly demonstrate that mitochondrial transplantation can effectively improve PAMD after cardiopulmonary resuscitation, highlighting its significant scientific and clinical value. The findings shown in this manuscript are interesting to the readers. However, further experiments are needed to confirm this conclusion. In addition, the results should be rewritten to describe and discuss the relevant data in detail.

      Major comments from the original round of review:

      (1) Can isolated mitochondria be transported to cultured cardiomyocytes, such as H9C2 cells, in vitro?

      (2) The description of results in the manuscript is too simple. It lacks detail on the rationale behind the experiments and the significance of the data.

      (3) The authors demonstrate that mitochondrial transplantation reduces cardiomyocyte apoptosis. Therefore, Western blot analysis of apoptosis-related caspases could be provided for further confirmation.

      (4) Do donor mitochondria fuse with recipient mitochondria? Relevant experiments and data should be provided to address this question.

      (5) In Figure 5A, the histograms are not labeled with the specific experimental groups.

      Comments on revisions:

      The revised manuscript quality has been improved, and most of my concerns were addressed and resolved.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 3 (Public review):

      Major comments:

      (1) Can isolated mitochondria be transported to cultured cardiomyocytes, such as H9C2 cells, in vitro?

      Thank you for this insightful question. Mitochondria are highly dynamic organelles that play a crucial role in cellular energy metabolism. When cells encounter various stressors and increased energy demands, they can benefit from the incorporation of exogenous mitochondria. In 2013, Masuzawa et al. (Masuzawa et al., 2013) were the first to demonstrate that transplanted mitochondria are internalized by cardiomyocytes 2 to 8 hours after transplantation, significantly contributing to the preservation of myocardial energetics. Ali et al. (Ali et al., 2020) discovered that exogenous mitochondria could be internalized by H9C2 cardiomyocytes as quickly as 5 minutes after co-incubation, resulting in an acute enhancement of normal cellular bioenergetics following mitochondrial transplantation. Pacak et al. (Pacak et al., 2015) established that the internalization of mitochondria into cardiomyocytes is time-dependent and occurs through actin-dependent endocytosis.

      Collectively, this evidence illustrates that exogenous mitochondria can be effectively internalized by H9C2 cells and other cardiomyocytes, and our experiments further confirmed that transplanted mitochondria can be incorporated by the myocardium in vivo.

      (2) The description of results in the manuscript is too simple. It lacks detail on the rationale behind the experiments and the significance of the data.

      Thank you for this suggestion. We have realized that the results in the submitted manuscript have not been adequately interpreted. We have added necessary details on the rationale behind the experiments and the significance of the data to the results section (Lines 57~59, 69~73, 81~88, 91~98, 100~102, 103~104, 109~115, 124~129, 135~146, 149~157, 159~161, 168~169, 178~179). We would like to express our gratitude to the reviewers once again and hope that our modifications will meet their requirements.

      (3) The authors demonstrate that mitochondrial transplantation reduces cardiomyocyte apoptosis. Therefore, Western blot analysis of apoptosis-related caspases could be provided for further confirmation.

      Thank you for this constructive comment. We fully agree with the reviewer's perspective on the detection of apoptosis-related caspases and have conducted a Western blot assay to investigate the impact of mitochondria on myocardial tissue. Our new evidence indicates that rats receiving mitochondrial transplantation exhibited reduced expression of cleaved caspase-3 compared with those in the NS and Vehicle groups (Fig. 6G, 6H, Lines 168~169), suggesting that mitochondrial transplantation decreased the level of apoptosis in the myocardium.

      (4) Do donor mitochondria fuse with recipient mitochondria? Relevant experiments and data should be provided to address this question.

      This is a very helpful comment. Investigating the fate of transplanted mitochondria in myocardial cells after CA is of great significance. The internalization of exogenous mitochondria has been observed across various cell types (Liu et al., 2021; Shanmughapriya et al., 2020). Notably, a recent study indicated that after being incorporated into host cells, isolated mitochondria are transported to endosomes and lysosomes. Subsequently, most of these mitochondria escape from these compartments and fuse with the endogenous mitochondrial network (Cowan et al., 2017). We have discussed this in the manuscript (Lines 217~220).

      Oxidative stress, a pathophysiological phenomenon common to cells suffering from ischemia/reperfusion insults after CA/CPR, has been implicated in promoting the internalization and survival of exogenous mitochondria (Aharoni-Simon et al., 2022). In our study, we confirmed that mitochondrial transplantation can enhance the metabolism of cardiomyocytes, increase ATP levels, and reduce reactive oxygen species (ROS). Our results indirectly confirm that isolated mitochondria can successfully fuse with myocardial mitochondria.

      (5) In Figure 5A, the histograms are not labeled with the specific experimental groups.

      We apologize for this oversight. We have labeled the specific experimental groups in the histograms presented in Figure 6B and 6C (originally Figure 5A).

      Reviewer #1 (Recommendations For The Authors):

      (1) The age, gender, and strain of the donor rats should be specified in the Methods section. Additionally, it is not obvious what doses of mitochondria were injected into the rats and how the dosage was initially determined.

      Thanks for your suggestion. We have included relevant information about the donor rats in the Methods section (Lines 361~362).

      In the Mito group, each animal received 0.5 mL of a 1 × 10<sup>9</sup>/mL mitochondrial suspension (Lines 342~345). Considerable amounts of data have demonstrated the efficacy of mitochondrial transplantation in cellular, animal, and human research (Alemany et al., 2024; Kaza et al., 2017; Liu et al., 2023). However, there is currently no evidence to determine the optimal dosage for transplantation. In previous research, isolated mitochondria (1 × 10<sup>9</sup>) were delivered to the left coronary ostium in pigs and were shown to be a viable treatment modality in cardiac ischemia-reperfusion injury (Blitzer et al., 2020; Guariento et al., 2020). Additionally, a dose of 1 × 10<sup>9</sup> mitochondria achieves the maximal hyperemic effect when administered via intracoronary injection (Shin et al., 2019). Considering that Sprague-Dawley (SD) rats are smaller than pigs and that there is a loss of mitochondria during pulmonary circulation, we adopted a mitochondrial transplantation dose of 5 × 10<sup>8</sup>. We will explore the optimal dosage in our future research.

      (2) In Figure 4a, the number of transplanted mitochondria appears to be very low. Considering the high number of mitochondria present in cardiomyocytes, it is unclear whether this small amount of transplanted mitochondria can significantly impact complex II activity and ATP levels in myocardial tissues, as shown in Figures 4b-d, or improve survival post-ROSC, as shown in Figure 2d. Could the observed benefits of mitochondrial transplantation be due to the indirect effects of the injected mitochondria, such as the release of mitochondrial contents, rather than the mitochondria themselves, as discussed by Bertero et al. (2021, Circ. Research)? This issue should be addressed in the manuscript.

      Thanks for this wonderful comment. As presented in Fig. 4 (originally Figure 4A), our results indicated the internalization of mitochondria by myocardium, shown by colocalization of Mito-tracker and myocardium marker. We would like to make the following points regarding Fig. 4:

      (1) Significant left ventricular systolic and diastolic dysfunction that occurs in the myocardium shortly after ROSC is referred to as post-cardiac arrest myocardial dysfunction (PAMD) (Laurent et al., 2002). The efficacy of mitochondrial transplantation for the heart following ischemia-reperfusion injury has been demonstrated in cellular, animal, and human studies, despite inadequate mitochondrial internalization (Liu et al., 2023). Thus, even a low number of transplanted mitochondria may improve cardiac function.

      (2) Only biologically active mitochondria can be specifically labeled with Mito-tracker. Therefore, cardiomyocytes take up mitochondria that possess complete functionality. Previous results have demonstrated that mitochondrial contents, such as nonviable mitochondria, mitochondrial fractions, mitochondrial deoxyribonucleic acid, ribonucleic acid, exogenous adenosine diphosphate and ATP, do not provide protection to the ischemic heart (McCully et al., 2017; McCully et al., 2009).

      (3) The specific mechanism for mitochondrial internalization has yet to be fully elucidated. We fully agree with the reviewer that other mechanisms of mitochondrial transplantation may play a role in cardiac protection. Multiple mechanisms may be involved in the cardioprotective effect of mitochondrial transplantation, and we are actively seeking a reasonable approach to verify these hypotheses in an ongoing study (Lines 236~246).

      (3) In Figure 4g, the claims regarding sarcomere length, mitochondrial structure, the number of cristae, accumulated calcium etc. seem to rely on the visual interpretation of representative images. To ensure a reliable interpretation of the data, a blinded quantification of each image in each group should be conducted. The same applies to the claims made in Figure 5E.

      Thanks for this suggestion. We have quantitatively evaluated the electron microscope images and HE images of the myocardium to ensure reliable interpretation. Corresponding supplements have been added to the methods (Lines 433~441, 494~496), results sections (Lines 109~115, 178~179), and Figures 5C, 5D, 6K and 6H (originally Figures 4G and 5E).

      (4) In line 69, it is unclear why the authors claim that MAP and HR decrease at 1, 2, 3, and 4 hours after ROSC in all groups compared to the Sham group, despite stating in line 72 that "MAP and HR did not differ at any observational time points (P>0.05, Figure 2C)."

      We apologize for our inaccurate phrasing. In the present study, MAP and HR did not differ significantly at any observational time point (P>0.05, Figure 2C). In the NS, Vehicle and Mito groups, the MAP and HR decreased at 1, 2, 3, and 4 hours after ROSC, reaching their nadir at 1 hour. Subsequently, MAP and HR increased gradually but did not show any statistically significant differences compared with the Sham group. (Lines 69~73)

      (5) The absence of increased mitochondrial content in the mito-groups should be discussed further in the manuscript.

      Thank you for your suggestion. We discussed the reasons why the mass of isolated mitochondria did not increase in Lines 224~235.

      (6) The N in Figure 5d should be provided.

      Thanks for your suggestion. We have revised the figure legend to include the N for Figure 6F (originally Figure 5D).

      (7) Figure 6 demonstrates content beyond the findings in this manuscript. This reviewer recommends limiting the graphical abstract to the findings specifically in this paper.

      Thanks for your great advice. We have revised Figure 7 (originally Figure 6) and restricted the graphical abstract to the findings presented in this paper.

      Minor issues:

      (8) The order of data in Figure 4 should be consistent with the text in the manuscript. Figures 4E-F-G are described before Figures 4B-C-D in the text. Similarly, Figure 5F was described before Figure 5E in the text.

      Thanks for your great advice. We have rearranged the order of the pictures to align with the text.

      (9) In Figure 4A, the locations of the epicardium, muscle, and endocardium should be indicated for clarity. Also, it is not obvious where the close-up box refers to in the actual image.

      Thank you for your suggestion. We primarily sought evidence of mitochondrial internalization within the endocardium, as injury occurs there first during myocardial ischemia (Kuwada and Takenaka,2000). The close-up box in Fig. 4 refers to the endocardium.

      (10) In Figure 5A, the group annotations are missing from the MDA and SOD graphs. The standard deviation bars for the SOD vehicle and SOD mito groups (3rd and 4th columns) appear to overlap. Can the authors provide the actual p-values?

      We apologize for the omission of group annotations in the MDA and SOD graphs. The p-value between the Vehicle group and the Mito group was 0.004. The SOD activity levels of myocardial samples in each group are presented in Author response table 1.

      Author response table 1.

      The SOD activity levels of myocardial samples in groups (U/mgprot)

      (11) In line 58, NS abbreviation is used without defining what NS is.

      We apologize for not including the full name of NS. NS is the abbreviation of normal. It has now been marked in the manuscript. (Line 58)

      (12) In line 118, what MDA stands for is not described until line 348. MDA should be defined in the text for the general audience.

      We apologize for this. We have defined it in the manuscript. (Lines 156~157)

      (13) In line 192, the authors state that "mitochondrial transplantation... increased the expression of antioxidant enzymes after four hours of ROSC," while only SOD activity levels were assessed in the manuscript. Increased activity levels do not necessarily imply an increase in expression levels. This discrepancy should be addressed in the Discussion section.

      We apologize for confusing ‘activity’ with ‘expression’. Although mitochondrial transplantation has been shown to be involved in the restoration of manganese superoxide dismutase levels after ischemic insults (Tashiro, et al.,2022), changes in antioxidant enzyme expression were not evaluated at the protein level in this paper. To avoid misunderstandings, we have replaced the term ‘expression’ with ‘activity’ as appropriate. (Lines 268~271)

      (14) Mitochondria from non-ischemic gastrocnemius muscle of health donor animals were isolated and a manner that maximized their healing potential. This sentence is not clear.

      We apologize for the confusing sentence in the original manuscript. To improve clarity, we have revised that sentence. We isolated mitochondria from allogeneic gastrocnemius muscle tissue of healthy rats and maintained optimal mitochondrial activity and therapeutic effects. (Lines 199~201)

      Minor grammar issues:

      In line 153, mitochondrial should be mitochondria.

      Figure 2D: Percent servival should be percent survival.

      There should be a blank in complex IIactivity Figure 4B, and complex IV activity in Figure 4C.

      In line 134, Four hours of ROSC, Tissue samples from. Tissue is capital.

      In line 190, Similaerly should be similarly.

      Thank you for your valuable comments. We apologize for the grammatical issues caused by our oversight. We have made the necessary corrections in the manuscript (Lines 198, 179, and 268) and in the figures: Figure 2D, Figure 5E (originally Figure 4B), and Figure 5F (originally Figure 4C).

      Reviewer #2 (Recommendations For The Authors):

      Some details are lacking clarity, such as the rationale behind choosing certain doses or time points for interventions.

      Thank you for this valuable suggestion. We have explained the rationale behind the selection of the dosage and the timing of the intervention. (Lines 201~212)

      I would suggest verifying mitochondrial function using the seahorse experiment oxygen consumption, and to check mitochondrial oxidative stress. I would also suggest checking the mitochondrial permeability transition pore opening, using for example calcein cobalt quenching or simply a kit to examine this further.

      Thank you for your valuable advice. In our manuscript, we added results regarding mitochondrial reactive oxygen species (ROS) and the mitochondrial permeability transition pore (mPTP) opening. As anticipated, mitochondrial transplantation reduced the increase in mitochondrial ROS and the mPTP opening in ischemic myocardium. (Lines 135~146, 149~157, 442~455, 460~476, Figure 5H, 5I, 6A)

      We agree that Seahorse oxygen consumption measurements would be beneficial for understanding the intricacies of their interactions and enhancements. Additionally, Ali et al. (Ali, et al.,2020) have demonstrated that introducing non-autologous mitochondria from healthy skeletal muscle cells into normal cardiomyocytes results in a short-term improvement in bioenergetics, as measured using a Seahorse Extracellular Flux Analyzer. We have not yet conducted cellular experiments. The process of isolating cells from the myocardial tissue of adult SD rats for Seahorse analysis can lead to secondary damage to the myocardial cells (Jacobson, et al.,1985). In this experiment, we measured ATP content and the activity of mitochondrial complexes to evaluate energy changes after mitochondrial transplantation. We will conduct cell experiments and utilize Seahorse measurements to further clarify the alterations in myocardial energy in the future.

      For Figure 3B, it would be beneficial to include the relative quantification of the mitochondrial marker COX-IV. Additionally, if feasible, I suggest verifying the representation of the mitochondria outer membrane TOM20 or VDAC.

      Thank you for your great suggestion. As suggested, we added TOM20 to assess the purity of the isolated mitochondria and reached the same conclusion: the isolated mitochondria exhibited high purity (Figure 3B). TOM20 was expressed in both muscle lysates and isolated mitochondria, whereas GAPDH was exclusively found in the muscle lysate. (We re-validated the purity of the mitochondria by using relative quantification of TOM20 and COX IV.)

      In Figure 2C, the clarity of the graphs depicting both arterial pressure (MAP) and heart rate (HR) is lacking and could potentially confuse the reader. I recommend incorporating color coding instead of relying solely on symbols, or by presenting the data in a more comprehensible format and that aligns with graph B as well.

      Thank you for your constructive comments. We have color-coded the diagrams in Figure 2B and 2C.

      In Figure 4A, please include high-magnification of the mitochondria to provide a more detailed examination.

      Thank you for this insightful comment. We have provided a high-magnification image of the mitochondria in Figure 4.

      Regarding lines 81-82, I recommend specifying the sentence more precisely for better clarity and understanding.

      Thank you for your comments. We have revised the sentences in lines 83~86 to enhance their clarity for readers.

      In the Materials and Methods section, it is crucial to provide precise details. For instance, when staining the exogenous mitochondria with MitoTracker Red, it is important to specify the duration of staining, such as the standard 20 minutes for example. Additionally, it is advisable to mention the number of times these mitochondria were washed with the respiratory solution to ensure thorough removal of excess MitoTracker, thus preventing unintended staining of endogenous mitochondria with MitoTracker red upon injection of pre-labeled mitochondria.

      Thank you for your suggestion. We have added the necessary details regarding Mito-Tracker Red staining. (Lines 373~376) In addition, we have added other details where necessary (Lines 373~376, 379~382, 395~396, 397~400, 487~488). We appreciate your suggestion once again.

      The sensitivity of JC-1 dye to temperature and pH fluctuations underscores the necessity for meticulous experimental conditions. It is crucial for the authors to elucidate why they chose to maintain the samples at 4°C for 60 minutes, especially considering the dye's optimal operating temperature of 25°C. Providing a rationale behind this deviation from standard protocol would enhance the scientific rigor and reproducibility of the study. Please add more information on the objectives used in the fluorescence microscope (BX53, OLYMPUS, Tokyo, Japan) and the software used.

      We sincerely apologize for the mistake in this sentence. The purified mitochondria, which are stained with JC-1, should be stored at 4°C and examined using a fluorescence microscope within 60 minutes. Purified mitochondria were incubated with JC-1 staining solution at 37°C for 20 minutes. The fluorescence microscope used in our experiment is equipped with a WHN 10/22 eyepiece, and the software version is OLYMPUS cellSens Standard 3.2. (Lines 379~382)

      Moreover, in the context of immunoblotting, it is imperative for the authors to furnish detailed information regarding the preparation of muscle tissue homogenates. Specifically, clarification is needed regarding the solution utilized for tissue grinding. Did the authors employ ice-cold RIPA lysis buffer or an alternative lysis buffer, supplemented with a protease inhibitor cocktail? Such details are pivotal for methodological transparency.

      Thanks for this wonderful comment. In the methods section, we added detailed information about protein extraction. (Lines 383~385)

      Furthermore, it would be beneficial for the authors to specify the instrument employed for scanning the immunoblots, as well as the software utilized for subsequent analysis of the immunoblot images. Providing this information would not only enhance the reproducibility of the findings but also facilitate the evaluation of the experimental results.

      Thank you for your suggestion. We have included the instrument used for scanning the Western blot, as well as the software used for image analysis in the manuscript. (Lines 397~400)

      Authors must exercise caution against copy-pasting. In line 282, there's a query regarding how the mitochondria were isolated. It is recommended to cite a specific reference and offer more comprehensive details. Despite the authors referencing a number within the text, the absence of numbered references makes it challenging to cross-reference.

      Thank you for pointing this out; we have updated the citation accordingly (Line 361).

      Figure 5C please double check some misspelling label errors (e.g: Vehicle and not Vehucle).

      We apologize for the misspelling in Figure 6E (originally Figure 5C) and have corrected it. Additionally, we have thoroughly reviewed the text for spelling errors and sincerely apologize once again for the previous mistakes. (Lines 249~252, 322)

      References:

      Aharoni-Simon M, Ben-Yaakov K, Sharvit-Bader M, Raz D, Haim Y, Ghannam W, Porat N, Leiba H, Marcovich A, Eisenberg-Lerner A, Rotfogel Z. 2022. Oxidative stress facilitates exogenous mitochondria internalization and survival in retinal ganglion precursor-like cells. SCI REP-UK 12:5122. doi:10.1038/s41598-022-08747-3

      Alemany VS, Nomoto R, Saeed MY, Celik A, Regan WL, Matte GS, Recco DP, Emani SM, Del NP, McCully JD. 2024. Mitochondrial transplantation preserves myocardial function and viability in pediatric and neonatal pig hearts donated after circulatory death. J THORAC CARDIOV SUR 167: e6-e21. doi: 10.1016/j.jtcvs.2023.05.010

      Ali PP, Kenney MC, Kheradvar A. 2020. Bioenergetics Consequences of Mitochondrial Transplantation in Cardiomyocytes. J AM HEART ASSOC 9: e14501. doi:10.1161/JAHA.119.014501

      Blitzer D, Guariento A, Doulamis IP, Shin B, Moskowitzova K, Barbieri GR, Orfany A, Del NP, McCully JD. 2020. Delayed Transplantation of Autologous Mitochondria for Cardioprotection in a Porcine Model. ANN THORAC SURG 109:711-719. doi:10.1016/j.athoracsur.2019.06.075

      Cowan DB, Yao R, Thedsanamoorthy JK, Zurakowski D, Del NP, McCully JD. 2017. Transit and integration of extracellular mitochondria in human heart cells. SCI REP-UK 7:17450. doi:10.1038/s41598-017-17813-0

      Guariento A, Blitzer D, Doulamis I, Shin B, Moskowitzova K, Orfany A, Ramirez-Barbieri G, Staffa SJ, Zurakowski D, Del NP, McCully JD. 2020. Preischemic autologous mitochondrial transplantation by intracoronary injection for myocardial protection. J THORAC CARDIOV SUR 160: e15-e29. doi: 10.1016/j.jtcvs.2019.06.111

      Jacobson SL, Banfalvi M, Schwarzfeld TA. 1985. Long-term primary cultures of adult human and rat cardiomyocytes. BASIC RES CARDIOL 80 Suppl 1:79-82. doi:10.1007/978-3-662-11041-6_15

      Kaza AK, Wamala I, Friehs I, Kuebler JD, Rathod RH, Berra I, Ericsson M, Yao R, Thedsanamoorthy JK, Zurakowski D, Levitsky S, Del NP, Cowan DB, McCully JD. 2017. Myocardial rescue with autologous mitochondrial transplantation in a porcine model of ischemia/reperfusion. J THORAC CARDIOV SUR 153:934-943. doi: 10.1016/j.jtcvs.2016.10.077

      Kuwada Y, Takenaka K. 2000. [Transmural heterogeneity of the left ventricular wall: subendocardial layer and subepicardial layer]. J CARDIOL 35:205-218.

      Laurent I, Monchi M, Chiche JD, Joly LM, Spaulding C, Bourgeois B, Cariou A, Rozenberg A, Carli P, Weber S, Dhainaut JF. 2002. Reversible myocardial dysfunction in survivors of out-of-hospital cardiac arrest. J AM COLL CARDIOL 40:2110-2116. doi:10.1016/s0735-1097(02)02594-9

      Liu D, Gao Y, Liu J, Huang Y, Yin J, Feng Y, Shi L, Meloni BP, Zhang C, Zheng M, Gao J. 2021. Intercellular mitochondrial transfer as a means of tissue revitalization. SIGNAL TRANSDUCT TAR 6:65. doi:10.1038/s41392-020-00440-z

      Liu Q, Liu M, Yang T, Wang X, Cheng P, Zhou H. 2023. What can we do to optimize mitochondrial transplantation therapy for myocardial ischemia-reperfusion injury? MITOCHONDRION 72:72-83. doi: 10.1016/j.mito.2023.08.001

      Masuzawa A, Black KM, Pacak CA, Ericsson M, Barnett RJ, Drumm C, Seth P, Bloch DB, Levitsky S, Cowan DB, McCully JD. 2013. Transplantation of autologously derived mitochondria protects the heart from ischemia-reperfusion injury. AM J PHYSIOL-HEART C 304:H966-H982. doi:10.1152/ajpheart.00883.2012

      McCully JD, Cowan DB, Emani SM, Del NP. 2017. Mitochondrial transplantation: From animal models to clinical use in humans. MITOCHONDRION 34:127-134. doi: 10.1016/j.mito.2017.03.004

      McCully JD, Cowan DB, Pacak CA, Toumpoulis IK, Dayalan H, Levitsky S. 2009. Injection of isolated mitochondria during early reperfusion for cardioprotection. AM J PHYSIOL-HEART C 296:H94-H105. doi:10.1152/ajpheart.00567.2008

      Pacak CA, Preble JM, Kondo H, Seibel P, Levitsky S, Del NP, Cowan DB, McCully JD. 2015. Actin-dependent mitochondrial internalization in cardiomyocytes: evidence for rescue of mitochondrial function. BIOL OPEN 4:622-626. doi:10.1242/bio.201511478

      Shanmughapriya S, Langford D, Natarajaseenivasan K. 2020. Inter and Intracellular mitochondrial trafficking in health and disease. AGEING RES REV 62:101128. doi: 10.1016/j.arr.2020.101128

      Shin B, Saeed MY, Esch JJ, Guariento A, Blitzer D, Moskowitzova K, Ramirez-Barbieri G, Orfany A, Thedsanamoorthy JK, Cowan DB, Inkster JA, Snay ER, Staffa SJ, Packard AB, Zurakowski D, Del NP, McCully JD. 2019. A Novel Biological Strategy for Myocardial Protection by Intracoronary Delivery of Mitochondria: Safety and Efficacy. JACC-BASIC TRANSL SC 4:871-888. doi: 10.1016/j.jacbts.2019.08.007

      Tashiro R, Bautista-Garrido J, Ozaki D, Sun G, Obertas L, Mobley AS, Kim GS, Aronowski J, Jung JE. 2022. Transplantation of Astrocytic Mitochondria Modulates Neuronal Antioxidant Defense and Neuroplasticity and Promotes Functional Recovery after Intracerebral Hemorrhage. J NEUROSCI 42:7001-7014. doi:10.1523/JNEUROSCI.2222-21.2022

    1. eLife Assessment

      These useful findings assigned a novel functional implication of histone acylation, crotonylation. Although the mechanistic insights have been provided in great detail regarding the role of the YEATS2-GCDH axis in modulating EMT in HNC, the strength of evidence for the manuscript is incomplete. The patient cohort is very small, with just 10 patients; to establish a significant result the cohort size should be increased. Furthermore, the functional implication of p300 is also to be looked into.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript investigates a mechanism between the histone reader protein YEATS2 and the metabolic enzyme GCDH, particularly in regulating epithelial-to-mesenchymal transition (EMT) in head and neck cancer (HNC).

      Strengths:

      Great detailing of the mechanistic aspect of the above axis is the primary strength of the manuscript.

      Weaknesses:

      Several critical points require clarification, including the rationale behind EMT marker selection, the inclusion of metastasis data, the role of key metabolic enzymes like ECHS1, and the molecular mechanisms governing p300 and YEATS2 interactions.

      Major Comments:

      (1) The title, "Interplay of YEATS2 and GCDH mediates histone crotonylation and drives EMT in head and neck cancer," appears somewhat misleading, as it implies that YEATS2 directly drives histone crotonylation. However, YEATS2 functions as a reader of histone crotonylation rather than a writer or mediator of this modification. It cannot itself mediate the addition of crotonyl groups onto histones. Instead, the enzyme GCDH is the one responsible for generating crotonyl-CoA, which enables histone crotonylation. Therefore, while YEATS2 plays a role in recognizing crotonylation marks and may regulate gene expression through this mechanism, it does not directly catalyse or promote the crotonylation process.

      (2) The study suggests a link between YEATS2 and metastasis due to its role in EMT, but the lack of clinical or pre-clinical evidence of metastasis is concerning. Only primary tumor (PT) data is shown, but if the hypothesis is that YEATS2 promotes metastasis via EMT, then evidence from metastatic samples or in vivo models should be included to solidify this claim.

      (3) There seems to be some discrepancy in the invasion data with BICR10 control cells (Figure 2C). BICR10 control cells with mock plasmids, specifically shControl and pEGFP-C3 show an unclear distinction between invasion capacities. Normally, we would expect the control cells to invade somewhat similarly, in terms of area covered, within the same time interval (24 hours here). But we clearly see more control cells invading when the invasion is done with KD and fewer control cells invading when the invasion is done with OE. Are these just plasmid-specific significant effects on normal cell invasion? This needs to be addressed.

      (4) In Figure 3G, the Western blot shows an unclear band for YEATS2 in shSP1 cells with YEATS2 overexpression condition. The authors need to clearly identify which band corresponds to YEATS2 in this case.

      (5) In ChIP assays with SP1, YEATS2 and p300 which promoter regions were selected for the respective genes? Please provide data for all the different promoter regions that must have been analysed, highlighting the region where enrichment/depletion was observed. Including data from negative control regions would improve the validity of the results.

      (6) The authors establish a link between H3K27Cr marks and GCDH expression, and this is an already well-known pathway. A critical missing piece is the level of ECHS1 in patient samples. This will clearly delineate if the balance shifted towards crotonylation.

      (7) The p300 ChIP data on the SPARC promoter is confusing. The authors report reduced p300 occupancy in YEATS2-silenced cells, on SPARC promoter. However, this is paradoxical, as p300 is a writer, a histone acetyltransferase (HAT). The absence of a reader (YEATS2) shouldn't affect the writer (p300) unless a complex relationship between p300 and YEATS2 is present. The role of p300 should be further clarified in this case. Additionally, transcriptional regulation of SPARC expression in YEATS2 silenced cells could be analysed via downstream events, like Pol-II recruitment. Assays such as Pol-II ChIP-qPCR could help explain this.

      (8) The role of GCDH in producing crotonyl-CoA is already well-established in the literature. The authors' hypothesis that GCDH is essential for crotonyl-CoA production has been proven, and it's unclear why this is presented as a novel finding. It has been shown that YEATS2 KD leads to reduced H3K27cr, however, it remains unclear how the reader is affecting crotonylation levels. Are GCDH levels also reduced in the YEATS2 KD condition? Are YEATS2 levels regulating GCDH expression? One possible mechanism is YEATS2 occupancy on GCDH promoter and therefore reduced GCDH levels upon YEATS2 KD. This aspect is crucial to the study's proposed mechanism but is not addressed thoroughly.

      (9) The authors should provide IHC analysis of YEATS2, SPARC alongside H3K27cr and GCDH staining in normal vs. tumor tissues from HNC patients.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript emphasises the increased invasive potential of histone reader YEATS2 in an SP1-dependent manner. They report that YEATS2 maintains high H3K27cr levels at the promoter of EMT-promoting gene SPARC. These findings assigned a novel functional implication of histone acylation, crotonylation.

      Concerns:

      (1) The patient cohort is very small with just 10 patients. To establish a significant result the cohort size should be increased.

      (2) Figure 4D compares H3K27Cr levels in tumor and normal tissue samples. Figure 1G shows overexpression of YEATS2 in a tumor as compared to normal samples. The loading control is missing in both. Loading control is essential to eliminate any disparity in protein concentration that is loaded.

      (3) Figure 4D only mentions 5 patient samples checked for the increased levels of crotonylation and hence forms the basis of their hypothesis (increased crotonylation in a tumor as compared to normal). The sample size should be more and patient details should be mentioned.

      (4) YEATS2 maintains H3K27Cr levels at the SPARC promoter. The p300 is reported to be hyper-activated (hyperautoacetylated) in oral cancer. Probably, the activated p300 causes hyper-crotonylation, and other protein factors cause the functional translation of this modification. The authors need to clarify this with a suitable experiment.

      (5) I do not entirely agree with using GAPDH as a control in the western blot experiment since GAPDH has been reported to be overexpressed in oral cancer.

      (6) The expression of EMT markers has been checked in shControl and shYEATS2 transfected cell lines (Figure 2A). However, their expression should first be checked directly in the patients' normal vs. tumor samples.

      (7) In Figure 3G, knockdown of SP1 led to the reduced expression of YEATS2 controlled gene Twist1. Ectopic expression of YEATS2 was able to rescue Twist1 partially. In order to establish that SP1 directly regulates YEATS2, SP1 should also be re-introduced upon the knockdown background along with YEATS2 for complete rescue of Twist1 expression.

      (8) In Figure 7G, the expression of EMT genes should also be checked upon rescue of SPARC expression.

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript investigates a mechanism between the histone reader protein YEATS2 and the metabolic enzyme GCDH, particularly in regulating epithelial-to-mesenchymal transition (EMT) in head and neck cancer (HNC).

      Strengths:

      Great detailing of the mechanistic aspect of the above axis is the primary strength of the manuscript.

      Weaknesses:

      Several critical points require clarification, including the rationale behind EMT marker selection, the inclusion of metastasis data, the role of key metabolic enzymes like ECHS1, and the molecular mechanisms governing p300 and YEATS2 interactions.

      We would like to sincerely thank the reviewer for the detailed, in-depth, and positive response. We are committed to implementing constructive revisions to the manuscript to address the reviewer’s concerns effectively.

      Major Comments:

      (1) The title, "Interplay of YEATS2 and GCDH mediates histone crotonylation and drives EMT in head and neck cancer," appears somewhat misleading, as it implies that YEATS2 directly drives histone crotonylation. However, YEATS2 functions as a reader of histone crotonylation rather than a writer or mediator of this modification. It cannot itself mediate the addition of crotonyl groups onto histones. Instead, the enzyme GCDH is the one responsible for generating crotonyl-CoA, which enables histone crotonylation. Therefore, while YEATS2 plays a role in recognizing crotonylation marks and may regulate gene expression through this mechanism, it does not directly catalyse or promote the crotonylation process.

      We thank the reviewer for raising this concern. As stated by the reviewer, YEATS2 functions as a reader protein, capable of recognizing histone crotonylation marks and assisting in the addition of this mark to nearby histone residues, possibly by assisting the recruitment of the writer protein for crotonylation. Our data indicate the involvement of YEATS2 in the recruitment of the writer protein p300 to the promoter of the SPARC gene, making YEATS2 a regulatory factor responsible for the addition of crotonyl marks in an indirect manner. Thus, we have decided to change the title by replacing the word “mediates” with “regulates”. The updated title reads: “Interplay of YEATS2 and GCDH regulates histone crotonylation and drives EMT in head and neck cancer”.

      (2) The study suggests a link between YEATS2 and metastasis due to its role in EMT, but the lack of clinical or pre-clinical evidence of metastasis is concerning. Only primary tumor (PT) data is shown, but if the hypothesis is that YEATS2 promotes metastasis via EMT, then evidence from metastatic samples or in vivo models should be included to solidify this claim.

      We appreciate the reviewer’s suggestion. Here, we would like to state that the primary aim of this study was to delineate the molecular mechanisms behind the role of YEATS2 in maintaining histone crotonylation at the promoter of genes that favour EMT in head and neck cancer. We have dissected the importance of histone crotonylation in the regulation of gene expression in head and neck cancer in great detail, having investigated the upstream and downstream molecular players involved in this process that promote EMT. Moreover, with the help of multiple phenotypic assays, such as Matrigel invasion, wound healing, and 3D invasion assays, we have shown the functional importance of YEATS2 in promoting EMT in head and neck cancer cells. Since EMT is known to be a prerequisite process for cancer cells undergoing metastasis(1), the evidence of YEATS2 being associated with EMT demonstrates a potential correlation of YEATS2 with metastasis. However, as part of the revision, we will use publicly available patient data to investigate the direct association of YEATS2 with metastasis by checking the expression of YEATS2 between different grades of head and neck cancer, as an increase in tumor grade is often correlated with the incidence of metastasis(2).

      (3) There seems to be some discrepancy in the invasion data with BICR10 control cells (Figure 2C). BICR10 control cells with mock plasmids, specifically shControl and pEGFP-C3 show an unclear distinction between invasion capacities. Normally, we would expect the control cells to invade somewhat similarly, in terms of area covered, within the same time interval (24 hours here). But we clearly see more control cells invading when the invasion is done with KD and fewer control cells invading when the invasion is done with OE. Are these just plasmid-specific significant effects on normal cell invasion? This needs to be addressed.

      We thank the reviewer for the thorough evaluation of the manuscript. The figure panels in question, Figure 2B and 2C, represent two different experiments performed independently: the invasion assay performed after knockdown and after overexpression of YEATS2, respectively. We would like to clarify that the two panels represent results that are distinct and independent of each other, and that the methods used to knock down and overexpress YEATS2 also differ. As stated in the Materials and Methods section, the knockdown is performed using lentivirus-mediated transfection (transduction) of cells, whereas the overexpression is done using the standard method of transfection, in which the transfection reagent and the respective plasmids are mixed directly before the mix is added to the cells. The difference in experimental conditions between these two experiments might have contributed to the differences seen in the controls, as observed previously(3). Hence, we would like to state that the results in Figure 2B and Figure 2C should be evaluated independently of each other.

      (4) In Figure 3G, the Western blot shows an unclear band for YEATS2 in shSP1 cells with YEATS2 overexpression condition. The authors need to clearly identify which band corresponds to YEATS2 in this case.

      The two bands seen in the shSP1+pEGFP-C3-YEATS2 condition correspond to the endogenous YEATS2 band (lower band, indicated by * in the shControl lane) and YEATS2-GFP band (upper band, corresponding to overexpressed YEATS2-GFP fusion protein, which has a higher molecular weight). To avoid confusion, the endogenous band will be highlighted (marked by *) in the lane representing the shSP1+pEGFP-C3-YEATS2 condition in the revised version of the manuscript.

      (5) In ChIP assays with SP1, YEATS2 and p300 which promoter regions were selected for the respective genes? Please provide data for all the different promoter regions that must have been analysed, highlighting the region where enrichment/depletion was observed. Including data from negative control regions would improve the validity of the results.

      Throughout our study, we have performed ChIP-qPCR assays to check the binding of SP1 on the YEATS2 and GCDH promoters, and to check YEATS2 and p300 binding on the SPARC promoter. Using transcription factor binding prediction tools and luciferase assays, we selected multiple sites on the YEATS2 and GCDH promoters to check for SP1 binding. The results corresponding to the site that showed significant enrichment were provided in the manuscript. The region of the SPARC promoter used in the YEATS2 and p300 ChIP assays was selected on the basis of YEATS2 enrichment found in the YEATS2 ChIP-seq data. We will provide data for all the promoter regions investigated (including negative controls) in the revised version of the manuscript.

      (6) The authors establish a link between H3K27Cr marks and GCDH expression, and this is an already well-known pathway. A critical missing piece is the level of ECHS1 in patient samples. This will clearly delineate if the balance shifted towards crotonylation.

      We thank the reviewer for their valuable suggestion. To support our claim, we had checked the expression of GCDH and ECHS1 in TCGA HNC RNA-seq data (provided in Figure 4—figure supplement 1A and B) and found that GCDH expression was increased while ECHS1 expression was decreased in tumor samples compared to normal samples. We hypothesized that higher GCDH expression and decreased ECHS1 expression might lead to an increase in the levels of crotonylation in HNC. To further substantiate our claim, we will check the abundance of ECHS1 in HNC patient samples as part of the revision.

      (7) The p300 ChIP data on the SPARC promoter is confusing. The authors report reduced p300 occupancy in YEATS2-silenced cells, on SPARC promoter. However, this is paradoxical, as p300 is a writer, a histone acetyltransferase (HAT). The absence of a reader (YEATS2) shouldn't affect the writer (p300) unless a complex relationship between p300 and YEATS2 is present. The role of p300 should be further clarified in this case. Additionally, transcriptional regulation of SPARC expression in YEATS2 silenced cells could be analysed via downstream events, like Pol-II recruitment. Assays such as Pol-II ChIP-qPCR could help explain this.

      Using RNA-seq and ChIP-seq analyses, we have shown that YEATS2 affects the expression of several genes by regulating the level of histone crotonylation at gene promoters globally. The histone writer p300 is a promiscuous acyltransferase protein that has been shown to be involved in the addition of several non-acetyl marks on histone residues, including crotonylation(4). Our data provides evidence for the dependency of the writer p300 on YEATS2 in mediating histone crotonylation, as YEATS2 downregulation led to decreased occupancy of p300 on the SPARC promoter (Figure 5F). However, the exact mechanism of cooperativity between YEATS2 and p300 in maintaining histone crotonylation remains to be investigated. To address the reviewer’s concern, we will perform various experiments to delineate the molecular mechanism pertaining to the association of YEATS2 with p300 in regulating histone crotonylation. Following are the experiments that will be performed:

      (a) Co-immunoprecipitation experiments to check the physical interaction between YEATS2 and p300.

      (b) We will check H3K27cr levels on the SPARC promoter and SPARC expression in p300-depleted HNC cells.

      (c) Rescue experiments to check if the decrease in p300 occupancy on the SPARC promoter can be compensated by overexpressing YEATS2.

      (d) As suggested by the reviewer, Pol-II ChIP-qPCR at the promoter of SPARC will be performed in YEATS2-silenced cells to explain the mode of transcriptional regulation of SPARC expression by YEATS2.

      (8) The role of GCDH in producing crotonyl-CoA is already well-established in the literature. The authors' hypothesis that GCDH is essential for crotonyl-CoA production has been proven, and it's unclear why this is presented as a novel finding. It has been shown that YEATS2 KD leads to reduced H3K27cr, however, it remains unclear how the reader is affecting crotonylation levels. Are GCDH levels also reduced in the YEATS2 KD condition? Are YEATS2 levels regulating GCDH expression? One possible mechanism is YEATS2 occupancy on GCDH promoter and therefore reduced GCDH levels upon YEATS2 KD. This aspect is crucial to the study's proposed mechanism but is not addressed thoroughly.

      The source for histone crotonylation, crotonyl-CoA, can be produced by several enzymes in the cell, such as ACSS2, GCDH, ACOX3, etc(5). Since metabolic intermediates produced during several cellular pathways can act as substrates for epigenetic factors, we wanted to investigate if such an epigenetic-metabolism crosstalk existed in the context of YEATS2. As described in the manuscript, we performed GSEA using publicly available TCGA RNA-seq data and found that patients with higher YEATS2 expression also showed a high correlation with expression levels of genes involved in the lysine degradation pathway, including GCDH. Since the preferential binding of YEATS2 to H3K27cr and the role of GCDH in producing crotonyl-CoA were known(6,7), we hypothesized that higher H3K27cr in HNC could be a result of both YEATS2 and GCDH. We found that the presence of GCDH in the nucleus of HNC cells is correlated with higher H3K27cr abundance, which could be a result of excess levels of crotonyl-CoA produced via GCDH. We also found a correlation between H3K27cr levels and YEATS2 expression, which could arise due to YEATS2-mediated preferential maintenance of crotonylation. This suggests that, despite being a reader protein, YEATS2 affects promoter H3K27cr levels, possibly by helping in the recruitment of p300 (as shown in Figure 5F). Thus, YEATS2 and GCDH are both responsible for the regulation of histone crotonylation-mediated gene expression in HNC.

      We did not find any evidence of YEATS2 regulating the expression of GCDH in HNC cells. However, we found that YEATS2 downregulation reduced the nuclear pool of GCDH in head and neck cancer cells (Figure 7F). This suggests that YEATS2 not only regulates histone crotonylation by affecting promoter H3K27cr levels (with p300), but also by affecting the nuclear localization of crotonyl-CoA producing GCDH. Also, we observed that the expression of YEATS2 and GCDH are regulated by the same transcription factor SP1 in HNC. We found that the transcription factor SP1 binds to the promoter of both genes, and its downregulation led to a decrease in their expression (Figure 3 and Figure 7).

      We would like to state that the relationship between YEATS2 and the nuclear localization of GCDH, as well as the underlying molecular mechanism, remains unexplored and presents an open question for future investigation.

      (9) The authors should provide IHC analysis of YEATS2, SPARC alongside H3K27cr and GCDH staining in normal vs. tumor tissues from HNC patients.

      We thank the reviewer for their suggestion. We are consulting our clinical collaborators to assess the feasibility of including this IHC analysis in our revision and will make every effort to incorporate it.

      Reviewer #2 (Public review):

      Summary:

      The manuscript emphasises the increased invasive potential of histone reader YEATS2 in an SP1-dependent manner. They report that YEATS2 maintains high H3K27cr levels at the promoter of EMT-promoting gene SPARC. These findings assigned a novel functional implication of histone acylation, crotonylation.

      We thank the reviewer for the constructive comments. We are committed to making beneficial changes to the manuscript in order to alleviate the reviewer’s concerns.

      Concerns:

      (1) The patient cohort is very small with just 10 patients. To establish a significant result the cohort size should be increased.

      We thank the reviewer for this suggestion. We will increase the number of patient samples to assess the levels of YEATS2 and H3K27cr in normal vs. tumor samples.

      (2) Figure 4D compares H3K27Cr levels in tumor and normal tissue samples. Figure 1G shows overexpression of YEATS2 in a tumor as compared to normal samples. The loading control is missing in both. Loading control is essential to eliminate any disparity in protein concentration that is loaded.

      In Figures 1G and 4D, we have used Ponceau S staining as a control for equal loading. Ponceau S staining is frequently used as an alternative to housekeeping proteins like GAPDH as a control for protein loading(8). It avoids the potential for variability in housekeeping gene expression. However, it may be less quantitative than using housekeeping proteins. To address the reviewer’s concern, we will probe with an antibody against a housekeeping protein as a loading control in the revised figures, provided its expression remains stable across the conditions tested.

      (3) Figure 4D only mentions 5 patient samples checked for the increased levels of crotonylation and hence forms the basis of their hypothesis (increased crotonylation in a tumor as compared to normal). The sample size should be more and patient details should be mentioned.

      A total of 9 samples were checked for H3K27cr levels (5 of them are included in Figure 4D and rest included in Figure 4—figure supplement 1D). However, as a part of the revision, we will check the H3K27cr levels in more patient samples.

      (4) YEATS2 maintains H3K27Cr levels at the SPARC promoter. The p300 is reported to be hyper-activated (hyperautoacetylated) in oral cancer. Probably, the activated p300 causes hyper-crotonylation, and other protein factors cause the functional translation of this modification. The authors need to clarify this with a suitable experiment.

      In our study, we have shown that p300 is dependent on YEATS2 for its recruitment on the SPARC promoter. As a part of the revision, we propose the following experiments to further substantiate the role of p300 in YEATS2-mediated gene regulation:

      (a) Co-immunoprecipitation experiments to check the physical interaction between YEATS2 and p300.

      (b) We will check H3K27cr levels on the SPARC promoter and SPARC expression in p300-depleted HNC cells.

      (c) Rescue experiments to check if the decrease in p300 occupancy on the SPARC promoter can be compensated by overexpressing YEATS2.

      (d) Pol-II ChIP-qPCR at the promoter of SPARC will be performed in YEATS2-silenced cells to explain the mode of transcriptional regulation of SPARC expression by YEATS2.

      (5) I do not entirely agree with using GAPDH as a control in the western blot experiment since GAPDH has been reported to be overexpressed in oral cancer.

      We would like to clarify that GAPDH was not used as a loading control for protein expression comparisons between normal and tumor samples. GAPDH was used as a loading control only in experiments using head and neck cancer cell lines where shRNA-mediated knockdown or overexpression was employed. These manipulations specifically target the genes of interest and are not expected to alter GAPDH expression, making it a suitable loading control in these instances.

      (6) The expression of EMT markers has been checked in shControl and shYEATS2 transfected cell lines (Figure 2A). However, their expression should first be checked directly in the patients' normal vs. tumor samples.

      We thank the reviewer for the suggestion. To address this, we will check the expression of EMT markers alongside YEATS2 expression in normal vs. tumor samples.

      (7) In Figure 3G, knockdown of SP1 led to the reduced expression of YEATS2 controlled gene Twist1. Ectopic expression of YEATS2 was able to rescue Twist1 partially. In order to establish that SP1 directly regulates YEATS2, SP1 should also be re-introduced upon the knockdown background along with YEATS2 for complete rescue of Twist1 expression.

      To address the reviewer’s concern regarding the partial rescue of Twist1 in SP1-depleted, YEATS2-overexpressing cells, we will perform the experiment as suggested by the reviewer. In brief, we will overexpress both SP1 and YEATS2 in SP1-depleted cells and then assess the expression of Twist1.

      (8) In Figure 7G, the expression of EMT genes should also be checked upon rescue of SPARC expression.

      We thank the reviewer for the suggestion. We will check the expression of EMT markers upon YEATS2/GCDH rescue and update Figure 7G in the revised version of the manuscript.

      References

      (1) T. Brabletz, R. Kalluri, M. A. Nieto and R. A. Weinberg, Nat Rev Cancer, 2018, 18, 128–134.

      (2) P. Pisani, M. Airoldi, A. Allais, P. Aluffi Valletti, M. Battista, M. Benazzo, R. Briatore, S. Cacciola, S. Cocuzza, A. Colombo, B. Conti, A. Costanzo, L. Della Vecchia, N. Denaro, C. Fantozzi, D. Galizia, M. Garzaro, I. Genta, G. A. Iasi, M. Krengli, V. Landolfo, G. V. Lanza, M. Magnano, M. Mancuso, R. Maroldi, L. Masini, M. C. Merlano, M. Piemonte, S. Pisani, A. Prina-Mello, L. Prioglio, M. G. Rugiu, F. Scasso, A. Serra, G. Valente, M. Zannetti and A. Zigliani, Acta Otorhinolaryngol Ital, 2020, 40, S1–S86.

      (3) J. Lin, P. Zhang, W. Liu, G. Liu, J. Zhang, M. Yan, Y. Duan and N. Yang, Elife, 2023, 12, RP87510.

      (4) X. Liu, W. Wei, Y. Liu, X. Yang, J. Wu, Y. Zhang, Q. Zhang, T. Shi, J. X. Du, Y. Zhao, M. Lei, J.-Q. Zhou, J. Li and J. Wong, Cell Discov, 2017, 3, 17016.

      (5) G. Jiang, C. Li, M. Lu, K. Lu and H. Li, Cell Death Dis, 2021, 12, 703.

      (6) D. Zhao, H. Guan, S. Zhao, W. Mi, H. Wen, Y. Li, Y. Zhao, C. D. Allis, X. Shi and H. Li, Cell Res, 2016, 26, 629–632.

      (7) H. Yuan, X. Wu, Q. Wu, A. Chatoff, E. Megill, J. Gao, T. Huang, T. Duan, K. Yang, C. Jin, F. Yuan, S. Wang, L. Zhao, P. O. Zinn, K. G. Abdullah, Y. Zhao, N. W. Snyder and J. N. Rich, Nature, 2023, 617, 818–826.

      (8) I. Romero-Calvo, B. Ocón, P. Martínez-Moya, M. D. Suárez, A. Zarzuelo, O. Martínez-Augustin and F. S. de Medina, Anal Biochem, 2010, 401, 318–320.

    1. eLife Assessment

      In this important study, the authors advance our understanding of copper uptake by chalkophores and their targeted metalloproteins in Mycobacterium tuberculosis. These convincing data demonstrate that chalkophore-acquired copper is solely incorporated into the Mtb bcc:aa3 copper-iron respiratory oxidase under low copper conditions, and that chalkophore-mediated protection of the respiratory chain is critical to Mtb virulence. These findings may be leveraged for drug discovery and will be of broad interest to those studying bacterial pathogenesis.

    2. Reviewer #1 (Public review):

      Summary:

      It is essential for Mycobacterium tuberculosis (Mtb) to scavenge trace metals from its host to survive. In this study, the authors explore the effects of copper limitation on Mtb. Mtb synthesizes small molecular diisonitrile lipopeptides termed chalkophores, that chelate host copper for import, whereby the copper is incorporated into Mtb metalloproteins. However, the role of chalkophores in Mtb biology and their targeted metalloproteins are unknown. This study investigates Mtb proteins that require chalkophores for copper incorporation and their effect on Mtb virulence. It is known that the nrp operon is induced by copper deprivation and encodes the synthesis of chalkophores. A genetic analysis revealed transcriptional differences for WT and Mtb∆nrp when exposed to the copper chelator tetrathiomolybdate (TTM). The authors found that copper chelation results in upregulation of genes in the chalkophore cluster as well as genes involved in the respiratory chain: specifically, components of the heme-dependent oxidase CytBD and subunits of the bcc:aa3 heme-copper oxidase. Interestingly, treatment of Mtb∆nrp with an inhibitor of the QcrB subunit of the bcc:aa3 oxidase (Q203) resulted in similar transcriptional changes. The bcc:aa3 oxidase and CytBD are functionally redundant, and while both utilize heme as a cofactor, only the first utilizes heme and copper. Utilizing Mtb∆nrp, Mtb∆cydAB and MtbΔnrpΔcydAB along with single gene complementation, the authors showed that copper starvation survival requires diisonitrile chalkophore synthesis and that copper starvation results in dysfunctional bcc:aa3 oxidase. Further genetic analysis combined with inhibitor studies indicate that bcc:aa3 oxidase is the only target impacted by copper starvation. By monitoring oxygen consumption for mutants in combination with inhibitors, the authors show that copper deprivation inhibits respiration through the bcc:aa3 oxidase. 
Similarly, they show that TTM or Q203 treatment inhibits ATP production in MtbΔnrpΔcydAB, but not in WT, showing that chalkophores maintain oxidative phosphorylation. Lastly, the authors compare the virulence of WT Mtb, Mtb∆nrp and MtbΔnrpΔcydAB strains in mouse spleen and lung. The Mtb∆nrp strain showed mild attenuation, but virulence in MtbΔnrpΔcydAB was severely attenuated, and complementation with the chalkophore biosynthetic pathway restored Mtb virulence. These results suggest that chalkophore-mediated protection of the respiratory chain is critical to Mtb virulence, and that the redundant respiratory oxidases within Mtb provide respiratory chain flexibility that may promote host adaptation.

      Strengths:

      Overall, the paper is very clear and well-written, with thorough and well-thought-out experimentation.

      The methods are all quite standard, so there are no weaknesses identified with regard to methodology.

    3. Reviewer #2 (Public review):

      Summary:

      This is a well-written manuscript that clearly demonstrates that the nrp-encoded diisonitrile chalkophore is necessary for the function of the bcc-aa3 oxidase supercomplex under low copper conditions. In addition, the study demonstrates that the chalkophore is important early during infection when copper sequestration is employed by the host as a method of nutritional immunity.

      Strengths:

      The authors use genetic approaches including single and double mutants of chalkophore biosynthesis, and both the Mtb oxidases. They use copper chelators to restrict copper in vitro. A strength of the work was the use of a synthesized Mtb chalkophore analogue to show chemical complementation of the mutant nrp locus. Oxphos metabolic activity was measured by oxygen consumption and ATP levels. Importantly, the study demonstrated that the chalkophore, especially in a strain lacking the secondary oxidase, was necessary for early infection and ruled out a role for adaptive immunity in the chalkophore-lacking Mtb by use of SCID mice. It is interesting that after two weeks of infection and onset of adaptive immunity, the chalkophore is not required, which is consistent with the host environment switching from a copper-restricted to copper overload in phagosomes.

      Weaknesses:

      Most claims in the manuscript are soundly justified. The one exception is the claim that "maintenance of respiration is the only cellular target of chalkophore mediated copper acquisition." While under the in vitro conditions tested this does appear to be the case; however, it can't be ruled out that the chalkophore is important in other situations. In particular, for maintenance of the periplasmic superoxide dismutase, SodC, which is the other M. tuberculosis enzyme known to require copper.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the group of Glickman expands on their previous studies on the function of chalkophores during the growth of and infection by Mycobacterium tuberculosis. Previously, the group had shown that chalkophores, which are metallophores specific for the scavenging of copper, are induced by M. tuberculosis under copper deprivation conditions. Here, they show that chalkophores, under copper limiting conditions, are essential for the uptake of copper and maturation of a terminal oxidase, the heme-copper oxidase, cytochrome bcc:aa3. As M. tuberculosis has two redundant terminal oxidases, growth of and infection by M. tuberculosis is only moderated if both the chalkophores and the second terminal oxidase, cytochrome bd, are inhibited.

      Strengths:

      A strength of this work is that the lab-culture experiments are expanded upon with mice infection models, providing strong indications that host-inflicted copper deprivation is a condition that M. tuberculosis has adapted to for virulence.

      Weaknesses:

      Because the phenotype of M. tuberculosis lacking chalkophores is similar, if not identical, to using Q203, an inhibitor of cytochrome bcc:aa3, the authors propose that the copper-containing cytochrome bcc:aa3 is the only recipient of copper-uptake by chalkophores. A minor weakness of the work is that this latter conclusion is not verified under infection conditions and other copper-enzymes might still be functionally required during one or more stages of infection.

    5. Author response:

      We thank the reviewers for their careful evaluation of our manuscript and appreciate the suggestions for improvement. We will outline our planned revisions in response to these reviews.

      Reviewer 2:

      “The one exception is the claim that "maintenance of respiration is the only cellular target of chalkophore mediated copper acquisition." While under the in vitro conditions tested this does appear to be the case; however, it can't be ruled out that the chalkophore is important in other situations. In particular, for maintenance of the periplasmic superoxide dismutase, SodC, which is the other M. tuberculosis enzyme known to require copper.”

      And

      Reviewer 3:

      “Because the phenotype of M. tuberculosis lacking chalkophores is similar, if not identical, to using Q203, an inhibitor of cytochrome bcc:aa3, the authors propose that the copper-containing cytochrome bcc:aa3 is the only recipient of copper-uptake by chalkophores. A minor weakness of the work is that this latter conclusion is not verified under infection conditions and other copper-enzymes might still be functionally required during one or more stages of infection.”

      Both comments concern the question of whether the bcc:aa3 respiratory oxidase supercomplex is the only target of chalkophore-delivered copper. In culture, our experiments suggest that bcc:aa3 is the only target. The evidence for this claim is in Figure 2E and F. In 2E, we show that M. tuberculosis ΔctaD (lacking a subunit of bcc:aa3) is growth impaired, that copper chelation with TTM does not exacerbate that growth defect, and that a ΔctaDΔnrp double mutant is no more sensitive to TTM than ΔctaD. These data indicate that the role of the chalkophore in protecting against copper deprivation is absent when the bcc:aa3 oxidase is missing. Similar results were obtained with Q203 (Figure 2F). Q203 or TTM arrests growth of M. tuberculosis Δnrp, but the combination has no additional effect, indicating that when Q203 is inhibiting the bcc:aa3 oxidase, the chalkophore has no additional role. However, we agree with the reviewers that we cannot exclude the possibility that during infection, there is an additional target of chalkophore-mediated Cu acquisition. We will add this caveat to the revised version of this manuscript.

    1. eLife Assessment

      This manuscript reports fundamental discoveries on how necrotic cells contribute to organ regeneration through apoptotic signalling to produce cells with non-lethal apoptotic caspase activity that contribute to the regenerated tissue. These findings will be of broad interest to those who study wound repair and tissue regeneration. The strength of the evidence is solid and has been improved in the revised version.

    2. Reviewer #2 (Public review):

      In this revised manuscript, Klemm et al. build on previously published findings (Klemm et al., 2021) to characterize caspase activation in distal cells following necrotic tissue damage within the Drosophila wing imaginal disc. Previously, in Klemm et al., 2021, the authors described necrosis-induced apoptosis (NiA) following the development of a genetic system to study necrosis caused by the expression of a constitutively active GluR1 (Glutamate/Ca2+ channel), and they discovered that the appearance of NiA cells was important for promoting regeneration.

      In this manuscript, the authors investigate how tissues regenerate following necrotic cell death. They find that:

      (1) the cells of the wing pouch are more likely to have non-autonomous caspase activation than other regions within the wing imaginal disc (hinge and notum),

      (2) two signaling pathways that are known to be upregulated during regeneration, Wnt (wingless) and JAK/Stat signaling, act to prevent additional NiA in pouch cells, and may partially explain the region specificity,

      (3) the presence of NiA (and/or NiCP) cells promotes regenerative proliferation in the late stages of regeneration,

      (4) not all caspase-positive cells are cleared from the epithelium (these cells are then referred to as Necrosis-induced Caspase Positive (NiCP) cells), these NiCP cells continue to live and promote proliferation in adjacent cells,

      (5) the initiator caspase Dronc is important for creating NiA/NiCP cells and for these cells to promote proliferation. Animals heterozygous for a Dronc null allele show a decrease in regeneration following necrotic tissue damage. In the revised manuscript, the authors provide improvements through additional data quantifications and text changes to better explain NiA/NiCP lineage tracing methods.

      The study has the potential to be broadly interesting due to the insights into how tissues differentially respond to necrosis as compared to apoptosis to promote regeneration. The paper raises many interesting questions for future investigation, including what is the nature of the signaling between the damaged tissue and the NiA/NiCP responsive areas (such as the identity of the DAMPs)? What determines if these cells at a distance undergo apoptosis or remain viable in the tissue as caspase-positive cells? And since the authors have data that indicates that the phenomenon is distinct from 'undead cells', what are the mechanisms by which these cells promote local proliferation?

    3. Reviewer #3 (Public review):

      The manuscript "Regeneration following tissue necrosis is mediated by non-apoptotic caspase activity" by Klemm et al. is an exploration of what happens to a group of cells that experience caspase activation after necrosis occurs some distance away from the cells of interest. These experiments have been conducted in the Drosophila wing imaginal disc, which has been used extensively to study the response of a developing epithelium to damage and stress. The authors revise and refine their earlier discovery of apoptosis initiated by necrosis, here showing that many of those presumed apoptotic cells do not complete apoptosis. Thus, the most interesting aspect of the paper is the characterization of a group of cells that experience mild caspase activation in response to an unknown signal, followed by some effector caspase activation and DNA damage, but that then recover from the DNA damage, avoid apoptosis, and proliferate instead.

      The authors have addressed the concerns raised, including those about drawing conclusions from RNAi knockdown without evaluating the efficacy of the knockdown, and in doing so they revised their conclusions after ascertaining that the Zfh2 RNAi was not effective.

      The authors have added quantification of the imaging data throughout, which strengthens their conclusions.

      In addition, the authors have revised some of the text describing the changes in EdU signal and added explanations of reagents such as the caspase sensors to clarify the experimental approaches, results, and interpretation of those results.

      The authors have also addressed the minor concerns and questions about the figures and text.

      A few questions remain, which the authors may choose to address.

      (1) The hh>Stat92ERNAi was assessed by the 10xSTAT-GFP reporter, as shown in Fig 2 Supp1 F. The authors point out the marked reduction in GFP in the ventral part of the hinge but do not comment on the lack of change in GFP in the dorsal part of the hinge. However, the open arrowhead in Figure 2H indicating the lack of cDcp-1 signal in the hinge in the same experiment points to the dorsal hinge, where the reporter suggests no difference in JAK-STAT signaling.

      (2) The data used to conclude that DRONC-DN and UAS-DIAP1 do not affect regenerative proliferation were normalized EdU intensities. As discussed in the prior review round, normalized EdU may not be a good comparison across experimental conditions given that the remainder of the disc may also have altered EdU incorporation, so this measurement may not be enough by itself to draw conclusions about regenerative proliferation. To strengthen the conclusion that regenerative proliferation is unaffected under these conditions, the authors may want to consider using a second measure such as adult wing size, PCNA, or quantification of mitoses via anti-phospho-histone H3 staining.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In previous work, the authors described necrosis-induced apoptosis (NiA) as a consequence of induced necrosis. Specifically, experimentally induced necrosis in the distal pouch of larval wing imaginal discs triggers NiA in the lateral pouch. In this manuscript, the authors confirmed this observation and found that while necrosis can kill all areas of the disc, NiA is limited to the pouch and to some extent to the notum, but is excluded from the hinge region. Interestingly and unexpectedly, signaling by the Jak/Stat and Wg pathways inhibits NiA. Further characterization of NiA by the authors reveals that NiA also triggers regenerative proliferation which can last up to 64 hours following necrosis induction. This regenerative response to necrosis is significantly stronger compared to discs ablated by apoptosis. Furthermore, the regenerative proliferation induced by necrosis is dependent on the apoptotic pathway because RNAi targeting the RHG genes is sufficient to block proliferation. However, NiA does not promote proliferation through the previously described apoptosis-induced proliferation (AiP) pathway, although cells at the wound edge undergo AiP. Further examination of the caspase levels in NiA cells allowed the authors to group these cells into two clusters: some cells (NiA) undergo apoptosis and are removed, while others referred to as Necrosis-induced Caspase Positive (NiCP) cells survive despite caspase activity. It is the NiCP cells that repair cellular damage including DNA damage and that promote regenerative proliferation. Caspase sensors demonstrate that both groups of cells have initiator caspase activity, while only the NiA cells contain effector caspase activity. Under certain conditions, the authors were also able to visualize effector caspase activity in NiCP cells, but the level was low, likely below the threshold for apoptosis. 
Finally, the authors found that loss of the initiator caspase Dronc blocks regenerative proliferation, while inhibiting effector caspases by expression of p35 does not, suggesting that Dronc can induce regenerative proliferation following necrosis in a non-apoptotic manner. This last finding is very interesting as it implies that Dronc can induce proliferation in at least two ways in addition to its requirement in AiP.

      Strengths:

      This is a very interesting manuscript. The authors demonstrate that epithelial tissue that contains a significant number of necrotic cells is able to regenerate. This regenerative response is dependent on the apoptotic pathway which is induced at a distance from the necrotic cells. Although regenerative proliferation following necrosis requires the initiator caspase Dronc, Dronc does not induce a classical AiP response for this type of regenerative response. In future work, it will be very interesting to dissect this regenerative response pathway genetically.

      Weaknesses:

      No weaknesses were identified.

      We thank the reviewer for their positive evaluation and kind words.

      Reviewer #2 (Public Review):

      Summary / Strengths:

      In this manuscript, Klemm et al. build on past published findings (Klemm et al., 2021) to characterize caspase activation in distal cells following necrotic tissue damage within the Drosophila wing imaginal disc. Previously, in Klemm et al., 2021, the authors described necrosis-induced apoptosis (NiA) following the development of a genetic system to study necrosis that is caused by the expression of a constitutively active GluR1 (Glutamate/Ca2+ channel), and they discovered that the appearance of NiA cells was important for promoting regeneration.

      In this manuscript, the authors aim to investigate how tissues regenerate following necrotic cell death. They find that (1) the cells of the wing pouch are more likely to have non-autonomous caspase activation than other regions within the wing imaginal disc (hinge and notum); (2) two signaling pathways that are known to be upregulated during regeneration, Wnt (wingless) and JAK/STAT signaling, act to prevent additional NiA in pouch cells and may explain the region specificity; (3) the presence of NiA cells promotes regenerative proliferation in late stages of regeneration; (4) not all caspase-positive cells are cleared from the epithelium (these cells are then referred to as Necrosis-induced Caspase Positive (NiCP) cells); (5) these NiCP cells continue to live and promote proliferation in adjacent cells; and (6) the caspase Dronc is important for creating NiA/NiCP cells and for these cells to promote proliferation. Animals heterozygous for a Dronc null allele show a decrease in regeneration following necrotic tissue damage.

      The study has the potential to be broadly interesting due to the insights into how tissues differentially respond to necrosis as compared to apoptosis to promote regeneration.

      Weaknesses:

      However, here are some of my current concerns for the manuscript in its current version:

      The presence of cells with activated caspase that don't die (NiCP cells) is an interesting biological phenomenon but is not described until Figure 5. How does the existence of NiCP cells impact the earlier findings presented? Is late proliferation due to NiA, NiCP, or both? Do Wg and JAK/STAT signaling act to prevent the formation of both NiA and NiCP cells or only NiA cells? Moreover, the authors are able to specifically manipulate the wound edge (WE) and lateral pouch cells (LP), but don't show how these manipulations within these distinct populations impact regeneration. The authors provide evidence that driving UAS-mir(RHG) throughout the pouch, in the LP, or in the WE all decrease the amount of NiA/NiCP in Figure 3G-O, but no data on final regenerative outcomes for these manipulations are presented (such as those presented for Dronc-/+ in Fig 7M). The manuscript would be greatly enhanced by quantification of more of the findings, especially in describing whether the specific manipulations that impacted NiA/NiCP cells disrupt end-point regeneration phenotypes.

      We have added a line to the results to clarify that we believe the finding that some NiA likely persist as NiCP does not affect our conclusions up to this point.

      We have added a statement emphasizing the results from our first paper, which demonstrate that LP>miRHG expression reduces the overall capacity to regenerate.

      Quantification of the change in posterior NiA number has been added to Figure 2L to strengthen the evidence. Likewise, we have included quantification of the E2F time course presented in Figure 3A (Figure 3 – Figure supplement 1C), and quantification of the change in GC3Ai signal over time has been added (Figure 5 – Figure supplement 1D) to emphasize the perdurance of GC3Ai-positive NiA/NiCP.

      How long does apoptosis take within the wing disc epithelium? How many of the caspase(+) cells are present for the whole 48 hours of regeneration? Are new cells also induced to activate caspase during this time window? The authors presented a number of interesting experiments characterizing the NiCP cells. For the caspase sensor GC3Ai experiments in Figure 5, is there a way to differentiate between cells that have maintained fluorescent GC3Ai from cells that have newly activated caspase? What is the timeline for when NiA and NiCP are specified? In addition, what fraction of NiCP cells contribute to the regenerated epithelium? Additional information about the temporal dynamics of NiA and NiCP specification/commitment would be greatly appreciated.

      We have included more information concerning the kinetics of apoptotic cell removal, and how this compares to the observations we have made with NiA/NiCP in our GC3Ai experiments. Additionally, we have included a quantification of the percent of the whole wing pouch with GC3Ai signal over time (Figure 5F) as well as the distal wing pouch with GC3Ai signal over time (Figure 5 – Figure supplement 1D) to further support the idea that NiCP persist over time.

      We acknowledge that our GC3Ai time course unfortunately cannot confirm whether the increase in GC3Ai signal over time is due to cells with new caspase activity or proliferating NiCP and have included this point in the discussion.

      We attempted to track the lineage of NiA/NiCP into the pupal and adult wings with CasExpress and DBS; however, the results of these experiments were inconsistent, and therefore we did not feel confident enough to include these data or draw conclusions in either direction. We are currently designing variations of these lineage-tracing tools to better track the lineage of these cells, which we hope to include in a future paper.

      The notum also does not express developmental JAK/STAT, yet little NiA was observed within the notum. Do the authors have any additional insights into the differential response between the pouch and notum? What makes the pouch unique? Are NiA/NiCP cells created within other imaginal discs and other tissues? Are they similarly important for regenerative responses in other contexts?

      We have added a brief mention of these points to the appropriate results section to avoid further increasing the length of the discussion.

      Data on the necrosis of other imaginal discs through FLP/FRT clone formation in haltere and leg discs have been added to Figure 1 – Figure supplement 1J and described in the text.

      Reviewer #3 (Public Review):

      The manuscript "Regeneration following tissue necrosis is mediated by non-apoptotic caspase activity" by Klemm et al. is an exploration of what happens to a group of cells that experience caspase activation after necrosis occurs some distance away from the cells of interest. These experiments have been conducted in the Drosophila wing imaginal disc, which has been used extensively to study the response of a developing epithelium to damage and stress. The authors revise and refine their earlier discovery of apoptosis initiated by necrosis, here showing that many of those presumed apoptotic cells do not complete apoptosis. Thus, the most interesting aspect of the paper is the characterization of a group of cells that experience mild caspase activation in response to an unknown signal, followed by some effector caspase activation and DNA damage, but that then recover from the DNA damage, avoid apoptosis, and proliferate instead. Many questions remain unanswered, including the signal that stimulates the mild caspase activation, and the mechanism through which this activation stimulates enhanced proliferation.

      The authors should consider answering additional questions, clarifying some points, and making some minor corrections:

      Major concerns affecting the interpretation of experimental results:

      Expression of STAT92E RNAi had no apparent effect on the ability of hinge cells to undergo NiA, leading the authors to conclude that other protective signals must exist. However, the authors have not shown that this STAT92E RNAi is capable of eliminating JAK/STAT signaling in the hinge under these experimental conditions. Using a reporter for JAK/STAT signaling, such as the STAT-GFP, as a readout would confirm the reduction or elimination of signaling. This confirmation would be necessary to support the negative result as presented.

      We have included data demonstrating our ability to knock down JAK/STAT activity in the hinge with UAS-Stat92E<sup>RNAi</sup> (Figure 2 – Figure supplement 1E and F). Additionally, we have included a quantification of posterior NiA/NiCP with the Stat92E<sup>RNAi</sup> (as well as wg<sup>RNAi</sup> and Zfh-2<sup>RNAi</sup>, Figure 2L) to strengthen our conclusion that JAK/STAT and WNT signaling act to regulate NiA formation within the pouch.

      Similarly, the authors should confirm that the Zfh2 RNAi is reducing or eliminating Zfh2 levels in the hinge under these experimental conditions, before concluding that Zfh2 does not play a role in stopping hinge cells from undergoing NiA.

      We have repeated this experiment with a longer knockdown using a GAL4 driver that expresses from early larval stages until our evaluation at L3, but were unable to demonstrate a loss of Zfh-2 with IF labeling. Additionally, we have quantified posterior NiA/NiCP with Zfh-2<sup>RNAi</sup> (Figure 2L) and do find a slight increase in NiA/NiCP number; however, this change is not significant. We have altered our conclusions to reflect these new data.

      EdU incorporation was quantified by measuring the fluorescence intensity of the pouch and normalizing it to the fluorescence intensity of the whole disc. However, the images show that EdU fluorescence intensity of other regions of the disc, especially the notum, varied substantially when comparing the different genetic backgrounds (for example, note the substantially reduced EdU in the notum of Figure 3 B' and B'). Indeed, it has been shown that tissue damage can lead to suppression of proliferation in the notum and elsewhere in the disc, unless the signaling that induces the suppression is altered. Therefore, the normalization may be skewing the results because the notum EdU is not consistent across samples, possibly because the damage-induced suppression of proliferation in the notum is different across the different genetic backgrounds.

      To more accurately reflect the observations that we have made with the EdU assay, we have changed our terminology to indicate that the EdU signal is more localized to the damaged tissue in ablated discs, thus taking into account the relative changes across the disc, rather than referring to it as an increase in the pouch. To further strengthen our observation that damage results in a localized proliferation, we have included a quantification of the E2F time course presented in Figure 3A (Figure 3 – Figure supplement 1C), which underscores the trend observed in our EdU experiments.
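
      The normalization concern discussed above can be illustrated with a small arithmetic sketch. The intensity values below are hypothetical (arbitrary units, not taken from the manuscript), the function name is an assumption made for illustration, and whole-disc intensity is modeled as a simple sum of pouch and non-pouch signal. Under these assumptions, an unchanged pouch signal yields a higher normalized value whenever signal elsewhere in the disc drops.

```python
def normalized_pouch_signal(pouch_intensity, whole_disc_intensity):
    """Return pouch EdU intensity as a fraction of whole-disc EdU intensity."""
    if whole_disc_intensity <= 0:
        raise ValueError("whole-disc intensity must be positive")
    return pouch_intensity / whole_disc_intensity


# Hypothetical values: the pouch signal is identical in both discs,
# but the non-pouch signal (e.g. notum) is lower in the ablated disc.
pouch = 100.0
non_pouch_control = 300.0
non_pouch_ablated = 100.0

control_ratio = normalized_pouch_signal(pouch, pouch + non_pouch_control)
ablated_ratio = normalized_pouch_signal(pouch, pouch + non_pouch_ablated)

# The ratio doubles even though the pouch signal itself did not change.
print(control_ratio, ablated_ratio)
```

      In this toy case the ratio reports a relative redistribution of signal across the disc rather than an absolute increase in the pouch, which is consistent with describing the EdU signal as more localized to the damaged tissue.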

      The authors expressed p35 to attempt to generate "undead cells". They take an absence of mitogen secretion or increased proliferation as evidence that undead cells were not generated. However, there could be undead cells that do not stimulate proliferation non-autonomously, which could be detected by the persistence of caspase activity in cells that do not complete apoptosis. Indeed, expressing p35 and observing sustained effector caspase activation could help answer the later question of what percentage of this cell population would otherwise complete apoptosis (NiA, rescued by p35) vs reverse course and proliferate (NiCP, unaffected by p35).

      In our previous work, we showed that P35 expression impairs our ability to detect effector caspases with IF-based tools. This can also be seen in Figure 4 of this work (Figure 4C and F). Given that P35 expression precludes our ability to label and assay effector caspase activity visually, and thus address the concerns outlined above, we relied on other tools such as reporters of AiP mitogens (wg-lacZ & dpp-lacZ) to assay whether NiA participate in AiP. As a functional readout, we also paired P35 expression with the EdU assay to test whether proliferation was altered by the presence of undead cells. The results discussed in Figure 4 lead us to conclude that NiA likely do not participate in the canonical AiP feedforward loop, although it is possible that these experiments generate another type of undead cell – one that utilizes a different mechanism to promote proliferation.

      It is unclear if the authors' model is that the NiCP cells lead to autonomous or non-autonomous cell proliferation, or both. Could the lineage-tracing experiments and/or the experiments marking mitosis relative to caspase activity answer this question?

      We have added further details to the discussion on the potential for NiA/NiCP to induce cell autonomous/non-autonomous proliferation.

      Many of the conclusions rely on single images. Quantification of many samples should be included wherever possible.

      We have added quantification to strengthen the results of Figures 2, 3 and 5.

      Why does the reduction of Dronc appear to affect regenerative growth in females but not males?

      We have repeated these regeneration scoring experiments and have increased the N for control versus droncI29 mutant males; however, the results of the analysis for male wing size remain non-significant, although the general trend is that droncI29 wings are slightly smaller. While there could be sex-specific differences in the capacity to regenerate that contribute to this observation, it is unclear what the underlying mechanism could be.

      Reviewer #1 (Recommendations for the authors):

      The work in this paper is already very complete and very well worked out. The conclusions are well supported by the data in this manuscript. I do not have any experimental requests, only a few minor and formal requests/questions.

      (1) Why does Diap1 overexpression not affect regenerative proliferation, whereas mir(RHG) and dronc[I29] do, given that Diap1 acts between RHG and Dronc?

      We speculate on this point in the discussion section but have adjusted some of the phrasing for clarity.

      (2) I assume that the authors used the cleaved Dcp-1 antibody from Cell Signaling Technologies. I recommend that the authors refer to this antibody as cDcp-1 in text and figures as this antibody specifically detects the cleaved, and thus activated form of Dcp-1, and not the uncleaved, inactive form of Dcp-1 which has a uniform expression in the discs.

      Changed to cDcp-1.

      (3) Line 299: Hay et al. 1994 did not show that p35 inhibits Drice and Dcp-1 (in fact, both genes were not even cloned yet). This was shown by Meier et al. 2000 and Hawkins et al. 2000. Please correct references.

      Corrected.

      (4) Line 574/575. Meier et al. 2000 did not show that Dronc is mono-ubiquitylated. This was shown by Kamber-Kaya et al., 2017. Please correct.

      Corrected.

      Reviewer #2 (Recommendations for the authors):

      (1) Does domeless knockdown cause apoptosis without tissue ablation (Figures 2C-E)? Currently, the non-ablation control is not shown.

      Domeless knockdown does not cause apoptosis in the absence of ablation (Added Figure 2 – Figure supplement 1A).

      (2) The supplemental experiment with zfh2-RNAi is hard to interpret because there is no evidence of RNAi knockdown based on the staining with the anti-Zfh2 antibody.

      As noted above, a longer zfh-2 knockdown does not appear to alter Zfh-2 protein levels. A quantification of posterior NiA/NiCP following knockdown shows a slight (non-significant) increase in posterior NiA/NiCP. Considering these new results, we have altered our interpretation within the appropriate results and discussion sections.

      (3) The authors should consider adding a diagram showing where mir(RHG) and DIAP1 are in the apoptotic/caspase activation pathway (Figure 7N).

      Completed, Figure 7N and 7O.

      Reviewer #3 (Recommendations for the authors):

      (1) Figure 2 I -The purported increase in NiA should be quantitated relative to the NiA in G across many discs.

      Completed (Figure 2L)

      (2) Figure 2 M - contrary to the conclusion drawn, the posterior Dcp1 does not appear different from that in the control (K). This conclusion that the NiA does not occur in the margin could be better supported with more images/quantification.

      We have exchanged the image for a representative one that more clearly shows the lack of margin NiA and highlighted with an arrowhead (Figure 2K)

      (3) Figure 2 supp 1 E - the "slight increase" in NiA in the pouch is relative to which control? Can this conclusion be supported by quantification?

      Figure 2L now quantifies this change.

      (4) Figure 2 Supp 1 D, E - these discs supposedly have Zfh2 RNAi expressed, but there appears to be no reduction in Zfh2.

      We were unable to demonstrate a reduction of Zfh2, even with a longer knockdown. Considering these new data, we have altered our conclusions from the Zfh2 experiments.

      (5) Figure 2 Supp 1 I - please quantitate the Dcp-1 across many discs to support the conclusion.

      This is the UAS-wg experiment, which we decided to remove from the quantification given the non-specific increase in cDcp-1 throughout the disc (likely as a result of ectopic Wg expression).

      (6) Figure 4 legend M - The authors conclude that the experiment indicates that "NiA promote proliferation independent of AiP". It would be more precise to say that NiA cells do not secrete AiP mitogens and do not increase the proliferation of surrounding cells when prevented from completing apoptosis. To say that the NiA-induced proliferation does not require AiP would require eliminating AiP, perhaps through reaper hid grim knockdown or mitogen knockdown.

      Corrected.

      Minor concerns and clarification needed:

      (7) Line 61 - consider the distinction between a feed-forward loop and a positive feedback loop.

      Corrected.

      (8) Line 338 - it would be helpful to have a brief explanation of what the GC3Ai consists of and how it reports caspase activity.

      Corrected.

      (9) Line 343 - the authors should clarify what they mean when they state GC3Ai-positive cells are "associated with" mitotic cells. Are the GC3Ai cells undergoing mitosis? Or is the increase in mitosis non-autonomous?

      Adjusted. “associated with adjacent proliferative cells”.

      (10) Lines 392-394 - the authors should add brief descriptions of how the Drice-Based sensor and the CasExpress function, so the readers can better understand the distinctions between these sensors and the previously mentioned sensors (anti-Dcp1 and GC3Ai). In addition, please clarify how the Gal80ts modulates the sensitivity of the CasExpress.

      Descriptions of DBS and CasExpress and additional clarification provided.

      (11) Line 413: How does Gal80ts suppress the background developmental caspase signal, and how does this suppression lead to NiCP cells expressing GFP?

      This section has been reworded to clarify.

      (12) Line 417 - which GFP label is referred to here?

      This section has been reworded to clarify.

      (13) Line 445 is the first mention of the CARD domain - it could be introduced more fully and explained why the DroncDN's lack of effect on proliferation excludes the CARD domain as being important.

      Clarified. See also the discussion for the significance of the CARD domain as dispensable for regenerative proliferation following necrosis.

      (14) Line 452 - "As mentioned" - the manuscript has not previously mentioned DIAP1 modification of the CARD domain and what that modification does. Perhaps the previous explanatory text was inadvertently removed?

      Corrected.

      (15) The Discussion is a lengthy list of experiments that the authors did not do or observations they were unable to make. This section could benefit from a more in-depth discussion of necrosis and the possibility that NiCP cells contribute to repair after injury across contexts and species.

      We have made several changes to the discussion that elaborate on some of the points listed in the public reviews.

      (16) All figures: Consider making single-channel panels grayscale to aid visualization. Also consider using color combinations that can be distinguished by color-blind readers.

      We appreciate these suggestions and will consider them for future manuscripts.

      (17) All figure legends - are error bars SD or SEM?

      Standard deviation. Added to appropriate legends.

      (18) Figure 1A,C - it would be helpful in the diagrams to note when the necrosis occurs/completes.

      The endpoint of necrosis is not well defined, given the simultaneous changes that occur with regeneration. Thus, we opted to not include an indicator of when necrotic ablation ends.

      (19) Figure 1B - it would be helpful to name the GAL4 drivers whose expression domain is depicted to correlate with the terms used in the text.

      Completed.

      (20) Figure 1 legend- what do the different colors of the arrowheads denote? The dotted lines are in R' and S', not N' and O'.

      Completed.

      (21) Figure 2G - the yellow dashed line is not in the same place in the two images.

      Corrected.

      (22) Figure 2I - what is the open arrowhead?

      Completed (Figure 2I legend).

      (23) Figure 3 legend - please describe what the time course is observing (EdU).

      Completed.

      (24) Figure 4 - please include the yellow boxes in the Dcp-1 channels.

      Completed.

      (25) Figure 5 F' - add the arrowheads to all the panels. The yellow arrowhead appears to be pointing to nothing.

      Completed.

      (27) Figure 5 legend - what is a "cytoplasmic undisturbed cell"? What is the arrowhead in G? J and J' should show the same view at different time points or different views at the same time point.

      Figure legend has been corrected.

      (28) Figure 5 Supp 1 would be especially helped by having more single-channel panels in grayscale.

      For clarity and consistency, we chose to maintain the different color channels.

      (29) Figure 5 Supp 1 D and E - It would be helpful to have higher magnification and arrows pointing to the cells of interest. Why are there TUNEL+ cells that do not have caspase activation (green)?

      We have added arrowheads as suggested. We believe the disparity between the TUNEL and GC3Ai signals is a result of the different sensitivities of the IF staining and the TUNEL assay.

      (30) Figure 5 Supp 1 F - perhaps the arrowheads should be in all panels - they point to empty spaces with no H2Av staining in the final panel. Perhaps a higher magnification image would make the "strong overlap" of the two signals more apparent?

      We have added arrowheads where appropriate.

      (31) Figure 6 D-E - does the widespread GFP lineage tracing signal suggest that most cells in the repaired tissue originated from cells that once had caspase activity?

      Possibly; however, given that CasExpress leads to significant developmental labeling, we were unable to determine to what extent the signal in this experiment comes from NiA/NiCP activity versus developmental labeling. Note that tubGAL80ts is not present in this experiment.

      (32) Writing corrections:

      Line 343 "positive" is misspelled.

      Completed

      Line 429 - a word may be missing.

      Completed

      Line 639 - the word "day" may be missing.

      Completed

      Line 658 - what temperature was the recovery?

      Completed

      Lines 706-708 - were the discs incubated in 55 mL and 65 mL of liquid, or a smaller volume?

      Completed

    1. eLife Assessment

      This manuscript establishes a mathematical model to estimate the key parameters that control the repopulation of planarian stem cells after sublethal irradiation as they undergo fate-switching as part of their differentiation and self-renewal process. The findings are valuable for future investigation of stem cell division in planarians. The methods are solid, integrating modeling with perturbations of key transcription factors known to be critical for cell fate decisions, but the authors have only shown that this is the case for a small number of stem cell types.

    2. Reviewer #1 (Public review):

      Summary:

      This is a very creative study using modeling and measurement of neoblast dynamics to gain insight into the mechanism that allows these highly potent cells to undergo fate-switching as part of their differentiation and self-renewal process. The authors estimate growth equation parameters for expanding neoblast clones based on new and prior experimental observations. These results indicate that neoblasts likely undergo much more symmetric self-amplifying division than loss of the population through symmetric differentiation, in the case of clone expansion assays after sublethal irradiation. Neoblasts take on multiple distinct transcriptional fates related to their terminally differentiated cell types, and prior work indicated neoblasts have a high plasticity to switch fates in a way linked to cell cycle progression and possibly through a random process. Here, the authors explore the impact of inhibition of key transcription factors defining such states (i.e., "fate specifying transcription factors", FSTFs) plus measurement and modeling in the clone expansion assay, to find that inhibition of factors like zfp1 likely causes otherwise zfp1-fated neoblasts to fail to proliferate and differentiate without causing compensatory gains in other lineages. A mathematical model of this process, assuming that neoblasts do not retain a memory of prior states while they proliferate and transition across specified states, can mimic the experimentally determined decreased sizes of clones following inhibition of zfp1. Complementary approaches to inhibit more than one lineage (muscle plus intestine) support the idea that this is a more general process in planarian stem cells. These results provide an important advance for understanding the fate-switching process and its relationship to neoblast growth.
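
      The balance between symmetric self-renewal and symmetric differentiation described above can be sketched with a generic textbook branching process. This is not the authors' model: the function names, the parameters `p_renew` and `p_diff`, and the assumption of synchronous divisions are all hypothetical choices made for illustration. Each neoblast, per division round, yields two neoblasts with probability `p_renew`, none with probability `p_diff` (both daughters differentiate), and one otherwise (asymmetric division).

```python
import random


def expected_clone_size(p_renew, p_diff, generations):
    """Mean neoblast count after `generations` synchronous division rounds."""
    if p_renew < 0 or p_diff < 0 or p_renew + p_diff > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    # Mean neoblast offspring per division: 2 * p_renew + 1 * p_asym + 0 * p_diff
    mean_offspring = 2 * p_renew + (1 - p_renew - p_diff)
    return mean_offspring ** generations


def simulate_clone(p_renew, p_diff, generations, rng):
    """Simulate one clone founded by a single neoblast; return its neoblast count."""
    count = 1
    for _ in range(generations):
        next_count = 0
        for _ in range(count):
            u = rng.random()
            if u < p_renew:
                next_count += 2  # symmetric self-renewal
            elif u < p_renew + p_diff:
                next_count += 0  # symmetric differentiation
            else:
                next_count += 1  # asymmetric division
        count = next_count
    return count


rng = random.Random(0)
sizes = [simulate_clone(0.5, 0.1, 5, rng) for _ in range(1000)]
print(sum(sizes) / len(sizes), expected_clone_size(0.5, 0.1, 5))
```

      In this sketch a clone expands on average only when `p_renew` exceeds `p_diff`, which is the qualitative point made in the summary above; the specific probabilities used are placeholders, not estimates from the study.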

      Overall I find the evidence very well presented and the study compelling. It offers an important new perspective on the key properties of neoblasts. I do have some comments to clarify the presentation and significance of the work.

    3. Reviewer #2 (Public review):

      Summary:

      Cell cycle duration and cell fate choice are critical to understanding the cellular plasticity of neoblasts in planarians. In this study, Tamar et al. integrated experimental and computational approaches to simulate a model for neoblast behaviors during colony expansion.

      Strengths:

      The finding that "arresting differentiation into specific lineages disrupts neoblast proliferative capacities without inducing compensatory expression of other lineages" is particularly intriguing. This concept could inspire further studies on pluripotent stem cells and their application for regenerative biology.

      Weaknesses:

      However, the absence of a cell-cell feedback mechanism during colony growth and the likelihood of the difference needs to be clarified. Is there any difference in interpreting the results if this mechanism is considered? More explanation and discussion should be included to distinguish the stages controlled by the one-step model from those discussed in this study. Although hnf-4 and foxF have been silenced together to validate the model, a deeper understanding of the tgs-1+ cell type and the non-significant reduction of tgs-1+ neoblasts in zfp-1 RNAi colonies is necessary, considering a high neural lineage frequency.

    4. Author response:

      Reviewer #1:

      Overall I find the evidence very well presented and the study compelling. It offers an important new perspective on the key properties of neoblasts. I do have some comments to clarify the presentation and significance of the work.

      We thank the reviewer for the positive feedback and plan to improve the presentation of the work.

      Reviewer #2:

      However, the absence of a cell-cell feedback mechanism during colony growth and the likelihood of the difference needs to be clarified. Is there any difference in interpreting the results if this mechanism is considered?

      We will improve the description of the model assumptions and the interpretation of the data on the basis of these assumptions.

      Although hnf-4 and foxF have been silenced together to validate the model, a deeper understanding of the tgs-1+ cell type and the non-significant reduction of tgs-1+ neoblasts in zfp-1 RNAi colonies is necessary, considering a high neural lineage frequency.

      We will improve the analysis of this result in light of the experimentally determined frequency of the tgs-1+ neoblast population.

    1. eLife Assessment

      This study provides evidence that single-cell multi-omics profiling can reveal key regulators of HIV-1 persistence and early immune dysregulation, particularly implicating KLF2 and Th17 cells as major players in viral reservoir dynamics. The findings are solid, supported by rigorous integration of scRNA-seq and scATAC-seq data, but are limited by sample size and lack of validation with external datasets. Overall, this work makes a valuable contribution to understanding HIV-1 immune evasion and highlights potential therapeutic targets for reservoir eradication.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aimed to elucidate the molecular mechanisms underlying HIV-1 persistence and host immune dysfunction in CD4+ T cells during early infection (<6 months). Using single-cell multi-omics technologies-including scRNA-seq, scATAC-seq, and single-cell multiome analyses-they characterized the transcriptional and epigenomic landscapes of HIV-1-infected CD4+ T cells. They identified key transcription factors (TFs), signaling pathways, and T cell subtypes involved in HIV-1 persistence, particularly highlighting KLF2 and Th17 cells as critical regulators of immune suppression. The study provides new insights into immune dysregulation during early HIV-1 infection and reveals potential epigenetic regulatory mechanisms in HIV-1-infected T cells.

      Strengths:

      The study excels through its innovative integration of single-cell multi-omics technologies, enabling detailed analysis of gene regulatory networks in HIV-1-infected cells. Focusing on early infection stages, it fills a crucial knowledge gap in understanding initial immune responses and viral reservoir establishment. The identification of KLF2 as a key transcription factor and Th17 cells as major viral reservoirs, supported by comprehensive bioinformatics analyses, provides robust evidence for the study's conclusions. These findings have immediate clinical relevance by identifying potential therapeutic targets for HIV-1 reservoir eradication.

      Weaknesses:

      Despite its strengths, the study has several limitations. By focusing exclusively on CD4+ T cells, the study overlooks other relevant immune cells such as CD14+ monocytes, NK cells, and B cells. Additionally, while the authors generated their own single-cell datasets, they need to validate their findings using other publicly available single-cell data from HIV-1-infected PBMCs.

    3. Reviewer #2 (Public review):

      Summary:

      The authors observed gene ontologies associated with upregulated KLF2 target genes in HIV-1 RNA+ CD4 T Cells using scRNA-seq and scATAC-seq datasets from the PBMCs of early HIV-1-infected patients, showing immune responses contributing to HIV pathogenesis and novel targets for viral elimination.

      Strengths:

      The authors carried out detailed transcriptomic profiling with scRNA-seq and scATAC-seq datasets and concluded that KLF2 target genes are upregulated in HIV-1 RNA+ CD4 T cells.

      Weaknesses:

      This key observation of upregulation of the KLF2-associated gene family might be important in the HIV field for early diagnosis and viral clearance. However, given the limited sample size and the in vivo study model, it is hard to draw firm conclusions. I highly recommend increasing the sample size of early HIV-1-infected patients.

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript studies intracellular changes and immune processes during early HIV-1 infection with an additional focus on the small CD4+ T cell subsets. The authors used single-cell omics to achieve high resolution of transcriptomic and epigenomic data on the infected cells which were verified by viral RNA expression. The results add to understanding of transcriptional regulation which may allow progression or HIV latency later in infected cells. The biosamples were derived from early HIV infection cases, providing particularly valuable data for the HIV research field.

      Strengths:

      The authors examined the heterogeneity of infected cells within CD4 T cell populations, identified a significant and unexpected difference between naive and effector CD4 T cells, and highlighted the differences in Th2 and Th17 cells. Multiple methods were used to show the role of the increased KLF2 factor in infected cells. This is a valuable finding of a new role for the major transcription factor in further disease progression and/or persistence.

      The methods employed by the authors are robust. Single-cell RNA-Seq from PBMC samples was followed by a comprehensive annotation of immune cell subsets, 16 in total. This manuscript presents to the scientific community a valuable multi-omics dataset of good quality, which could be further analyzed in the context of larger studies.

      Weaknesses:

      Methods and Supplementary materials

      Some technical aspects could be described in more detail. For example, it is unclear how the authors filtered out cells that did not pass quality control, such as doublets and cells with low transcript/UMI content. Next, in cell annotation, what is the variability in cell types between donors? This information is important to include in the supplementary materials, especially with such a small sample size. Without it, it is difficult to determine whether the differences between subsets at the transcriptomic level, in viral RNA expression, and in chromatin assessment are due to cell type variation or to patient-specific variation. For the DEG analysis, did the authors exclude the most variable genes?

      The annotation of 16 cell types from PBMC samples is impressive and of good quality; however, not all cell types receive attention in further analysis. It is natural to focus primarily on the CD4 T cells given the research objectives. The authors also study potential interactions between CD4 and CD8 T cells by cell communication inference. It would be interesting to ask additional questions about other underexplored immune cell subsets, such as: 1) Could viral RNA be detected in monocytes or macrophages during early infection? 2) What are the inferred interactions between NK cells and infected CD4 T cells, and are they similar to the CD4-CD8 results? 3) What are the inferred interactions between monocytes or macrophages and infected CD4 T cells?

      Discussion

      It would be interesting to see more discussion of the observation that naïve T cells produce more viral RNA than effector T cells. This seems counterintuitive given the general levels of transcriptional and translational activity in these subsets.

      Another discussion block could be added comparing the results and conclusions with the paper by Ashokkumar et al. published earlier in 2024 (10.1093/gpbjnl/qzae003). This earlier publication used both a cell line-based HIV infection model and primary infected CD4 T cells and identified certain transcription factors correlated with viral RNA expression.

    1. eLife Assessment

      This valuable study describes a software package in R for visualizing metabolite ratio pairs. The evidence supporting the claims of the authors is solid and broadly supports the authors' conclusions. This work would be of interest to the mass spectrometry community.

    2. Reviewer #2 (Public review):

      Summary:

      In the article, the authors describe their software package in R for visualizing metabolite ratio pairs. I think the work would be of interest to the mass spectrometry community.

      Strengths:

      The authors describe software that would be of use to those performing MALDI MSI. This software would certainly add to the understanding of metabolomics data and enhance the identification of critical metabolites.

      Weaknesses:

      The figures are difficult to interpret/analyze in their current state but are significantly better in the revision.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Cheng et al explore the utility of analyte ratios instead of relative abundance alone for biological interpretation of tissue in a MALDI MSI workflow. Utilizing the ratio of metabolites and lipids that have complementary value in metabolic pathways, they show the ratio as a heat map, which enhances the understanding of how multiple analytes relate to each other spatially. Normally, this is done by projecting each analyte as a unique color, but using a ratio can help clarify visualization and add to biological interpretability. However, existing tools to perform this task are available in open-source repositories, and fundamental limitations inherent to MALDI MSI need to be made clear to the reader. The study lacks rigor and controls, i.e. without quantitative data from a variety of standards (internal isotopic or tissue mimetic models for example), the potential delta in ionization efficiencies of different species subtracts from the utility of pathway analysis using metabolite ratios.

      We thank the reviewer for comments on the availability of four other commercial and open-source tools for performing ratio imaging: ENVI® Geospatial Analysis Software, MATLAB image processing toolbox, Spectral Python (SPy) and QGIS. We now highlight these in the introduction (page 3 line 80-86). However, in contrast to these targeted ratio imaging methods, our approach uniquely enables the untargeted discovery of correlated (or anti-correlated) ratios of molecular features, whether the species are structurally known or unknown.

      ENVI® Geospatial Analysis Software and MATLAB image processing toolbox for hyperspectral imaging are both paid programs, limiting free access and software evaluation for the potential application of untargeted ratiometric imaging. We were able to evaluate MATLAB RatioImage because Weill Cornell Medicine has an institutional MathWorks MATLAB subscription. Notably, MATLAB RatioImage computes and displays an individual intensity-modulated ratiometric image from a chosen numerator and denominator image. This software tool only images the ratios of selected metabolites from an input list of multiple species and does not allow untargeted ratiometric imaging of all metabolite pairs.

      While Spectral Python (SPy) and QGIS are both freely-available software packages, and both can perform individual metabolite ratio images, neither allows for untargeted ratiometric imaging of all pairs from a multiple metabolite input list. Table S1 (below) provides a comparison of the ratio imaging tool that we offer in comparison with other previously available tools.

      We appreciate the reviewer’s insightful comments on differential ionization efficiency among metabolites and the importance of using stable isotope internal standards to obtain absolute quantification.

      A fundamental advantage of our ratiometric imaging tool is to provide better image contrast for tissue regions with differential ionization efficiency, with the potential to discover new “metabolic” regions that can be revealed by metabolite ratio. Note that comparison for ratio image abundance is limited to tissue groups in the equivalent region which is expected to have similar ionization efficiency for given metabolites. Furthermore, the power of our strategy is to provide untargeted (and targeted) ratio imaging as a hypothesis generation tool and this use does not require absolute quantification. If cost was not an issue, an extensive group of stable isotope standards could theoretically be used for absolute metabolite quantification of target metabolites with known identity.

      Using the tissue mimetic model, we generated calibration curves for stable isotope standards spiked into carboxymethylcellulose (CMC)-embedded brain homogenate cryosections and quantified brain glucose, lactate and ascorbate concentrations. Ratio images obtained from abundance data are similar to those obtained from quantified concentration data for these metabolites (Fig S3). While stable isotope standards are often used to obtain quantitative concentrations of a metabolite or lipid of interest, they are not applicable to untargeted metabolite ratios that include structurally undefined species. Nevertheless, our data indicate that absolute quantification is not necessary for the targeted and untargeted ratio imaging described here (Page 6, line 196-205).

      Reviewer #2 (Public Review):

      Summary:

      In the article, "Untargeted Pixel-by-Pixel Imaging of Metabolite Ratio Pairs as a Novel Tool for Biomedical Discovery in Mass Spectrometry Imaging" the authors describe their software package in R for visualizing metabolite ratio pairs. I think the novelty of this manuscript is overstated and there are several notable issues with the figures that prevent detailed assessment but the work would be of interest to the mass spectrometry community.

      Strengths:

      The authors describe a software that would be of use to those performing MALDI MSI. This software would certainly add to the understanding of metabolomics data and enhance the identification of critical metabolites.

      Weaknesses:

      The authors are missing several references and discussion points, particularly about SIMS MSI, where ratio imaging has been previously performed.

      There are several misleading sentences about the novelty of the approach and the limitations of metabolite imaging.

      Several sentences lack rigor and are not quantitative enough.

      The figures are difficult to interpret/analyze in their current state and lack some critical components, including labels and scale bars.

      We thank the reviewer for the very helpful comments. The tone of the manuscript has been adjusted to highlight the real novelty of this method, which lies in the ease of computing and the application to MS-specific projects (abstract, line 26-30). All figures have been updated to include labels and scale bars with improved resolution. References on the use of ratio imaging in SIMS MSI have been added to the introduction (Page 3, line 80-89).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major Comments:

      In the Abstract it is stated that: "the research community lacks a discovery tool that images all metabolite abundance ratio pairs." However, the following tools exist that perform this fundamental task.

      A "pixel by pixel" data frame in .csv form has a very similar data structure to that of many instruments, such as satellite imaging or other hyperspectral tools. It is true this does not exist in the MALDI-specific context, but it would not be difficult to perform this task with the following programs. Highlight that the novelty here is not ratios but the ease of computing them and the application in the specific project. Also, describe the available tools and which of their shortcomings this package addresses. A supplemental table of MSI data analysis tools and the function of each would be a good addition.

      List of tools to perform band ratio computation with minimal modification:

      (1) ENVI IDL: geospatial imaging tool that allows ratio computation between spectral bands.

      (2) MATLAB image processing toolbox for hyperspectral imaging.

      (3) Spectral Python package (SPy).

      (4) QGIS with plugins can be used for hyperspectral image analysis with a ratio between bands.

      We revised the abstract and introduction to include novelty and comparison to other existing methods listed in Table S1.

      "untargeted R package workflow" - If there are functions used outside the SCiLS Lab API client then write it up and include a GitHub link for open access to fit the mission of eLife.

      As shown in Scheme I, we developed two types of code for untargeted ratio imaging. The first uses the SCiLS Lab API client to extend the functions of targeted and untargeted ratio imaging and all related spatial image analysis; it is suitable for SCiLS Lab users. The second does not require the SCiLS Lab API: it extracts pixel data from an imzML file and then proceeds with targeted and untargeted imaging and analysis. Both are now deposited on GitHub with public access (https://github.com/qic2005/Untargeted-massspectrometry-ratio-imaging.git).

      "across cells and tissue subregions" The value in reporting cell type and tissue type-specific differences in any metric is powerful, but not done in this paper. Only whole samples are compared such as "KO vs WT" and the annotations in Figure 3 are not leveraged for increased biological relevance. This paper treats each image as a homogenization experiment in a practical sense beyond just visually inspecting each image. Remove this claim or do the calculations on region/tissue/cell-type specific differences with the appropriate tools to show the data beyond simple heat map images.

      We have deleted the sentence containing across cells and tissue subregions from the abstract.

      "enhances spatial image resolution" Clarify. The resolution in MALDI is set by the raster size of the pixels which is an instrument parameter and cannot be changed post-acquisition. Image-specific methods to increase resolution exist, but dividing the value in one peak column by another does not change functional resolution in the context of the instruments here.

      We thank the reviewer for pointing out this typo. We have changed it to "enhances spatial image contrast" in the abstract (line 34).

      "pixel-by-pixel imaging of the ratio of an enzyme's substrate to its derived product offers an opportunity to view the distribution of functional activity for a given metabolic pathway across tissue" - Appropriately calibrate the impact of this work and correct this statement to better reflect the capabilities of this approach. Do not oversell the exploration of pathway activity since the raw quantity reported as relative abundance does not provide biologically interpretable pathway information. This is due to unaccounted differences in ionization efficiencies between analytes in a pathway and lack of determination of rate. Without a calibration curve and more techniques on the analytical chemistry side of the project, it is possible a relative abundance of one analyte (like the product of a pathway) could be higher than the relative abundance of another analyte (a precursor), but due to structural differences, the actual quantity of the higher relative abundance species could be significantly different or even lower than its counterpart. Secondly, "functional activity" cannot be assessed in this manner without isotopic labeling or additional techniques. This does not subtract from the overall validity and impact of the work, but highlighting these shortcomings and slight alterations to the claim are important for a multidisciplinary audience.

      Although we show that the abundance ratio yields images similar to those from the concentration ratio for brain metabolites such as lactate, glucose and ascorbate, we agree with the reviewer that the abundance ratio differs numerically from the absolute concentration ratio because of differences in ionization efficiency. We have deleted the sentence “pixel-by-pixel imaging of the ratio of an enzyme's substrate to its derived product offers an opportunity to view the distribution of functional activity for a given metabolic pathway across tissue” from the abstract. We apologize for not describing this application more clearly. We meant to compare pathway activity among equivalent and similar pixels/regions of tissues from different biological groups, under the assumption that ionization efficiency is identical for equivalent pixels from different tissue sections (i.e. same cell type and microenvironment), especially for metabolites with similar functional structure in the same pathway. For example, fatty acids with different chain lengths and phospholipids with the same head group are expected to have similar ionization efficiency in the same tissue pixel/region. We have therefore rewritten this section (Page 7, line 239-247).

      "We further show that ratio imaging minimizes systematic variations in MSI data by sample handling and instrument drift, improves image resolution, enables anatomical mapping of metabotype heterogeneity, facilitates biomarker discovery, and reveals new spatially resolved tissue regions of interest (ROIs) that are metabolically distinct but otherwise unrecognized."

      Instrument drift is not accounted for by ratios as it impacts the process before ratio computation. "metabotype" - spelling?

      Instrument drift here refers to changes in individual ion abundance during long data acquisition. A ratio may offer a better read-out than individual metabolite abundance alone. However, for data that have been total ion normalized after acquisition, ratio data would not differ from non-ratio data. Therefore, we have deleted instrument drift from the sentence (Page 2, line 33, and Page 3, line 99).

      Metabotype is a term widely used in the metabolomics field. It denotes a category defined by similar metabolic profiles, which are based on combinations of specific metabolites. https://nutritionandmetabolism.biomedcentral.com/articles/10.1186/s12986-020-00499-z

      Results 3: Justify the claim that the ratio reduces artifacts. A ratio is the value from one m/z area over another and would seem that the quality of the ratio would be always lower than the individually higher quality pixel signal of the two analytes that compose a ratio.

      Ratio images are indeed heatmaps of pixel-by-pixel ratio data, set by the scale of all ratio values. For very abundant ion pairs, the individual images may not be better than the ratio image, depending on how abundance changes among pixels within tissue sections. Similarly, the quality of a ratio image may not be higher than that of the individual images if the distribution of ratios does not change much among pixels in tissue sections. For example, the metabolites and lipids in Figures 2 and 5 are abundant, but the non-ratio images are not of better quality than the ratio images. Furthermore, a ratio image provides additional information on how the ratio of a metabolite pair changes pixel by pixel across all tissue sections. Such additional information could be useful for data interpretation.

      Results 4: The metabolite pairs are biologically sensible but should be clearly stated that they do not account for differences in ionization efficiency between metabolites and cannot provide quantitative pathway analysis with a high degree of biological confidence.

      We apologize for not describing this application more clearly. We meant to compare pathway activity among equivalent and similar pixels/regions of tissues from different biological groups, under the assumption that ionization efficiency is identical for equivalent pixels from different tissue sections (i.e. same cell type and microenvironment), especially for metabolites with similar functional structure in the same pathway. For example, fatty acids with different chain lengths and phospholipids with the same head group are expected to have similar ionization efficiency in the same tissue pixel/region. We have therefore rewritten this section (Page 7, line 239-247, 254-255).

      Results 4: "cell-type specific metabolic activity at cellular (10 µm) spatial resolution" Prove the cell type differences with IHC coregistration or MALDI IHC if you want to make claims about them. Just visually determining a tissue type of a scan of a slide is inadequate to support this claim.

      We agree with the reviewer’s comments. We meant to provide additional information on cellular-level metabolic activity, such as adenosine nucleotide phosphorylation status (ATP/AMP ratio), at 10 µm resolution. Hippocampal neurons provide a good example of this utility. We have rewritten the claim to highlight the role of ratio imaging in providing additional metabolic information (Page 8, line 288-290).

      Minor Comments:

      Table 2 "Aspartiate" spelling

      We have corrected it.

      Describe the process and mathematical background for ratio computation in the Methods section. As this paper introduces a package, describing its underlying functions has value.

      We have added R-script comments to illustrate the untargeted ratio calculation, which uses the R combination function and division between any two metabolites in a data matrix (Page 4, line 139-141).
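
      The calculation described in this response (enumerate every pair of features, then divide pixel by pixel) can be sketched as follows. This is a minimal illustration in Python, not the authors' R script; the function name `pairwise_ratios`, the feature names, and the abundance values are hypothetical, and all abundances are assumed to be positive so that no division by zero occurs.

```python
from itertools import combinations


def pairwise_ratios(pixels, features):
    """Return {(numerator, denominator): [ratio per pixel]} for every feature pair.

    pixels   -- list of dicts mapping feature name -> abundance (assumed > 0)
    features -- list of feature names; each unordered pair is used once,
                with the earlier feature as numerator
    """
    ratios = {}
    for num, den in combinations(features, 2):
        ratios[(num, den)] = [px[num] / px[den] for px in pixels]
    return ratios


# Hypothetical two-pixel data matrix with three features.
pixels = [
    {"glucose": 2.0, "lactate": 4.0, "ascorbate": 1.0},
    {"glucose": 3.0, "lactate": 1.5, "ascorbate": 6.0},
]
features = ["glucose", "lactate", "ascorbate"]

ratios = pairwise_ratios(pixels, features)
for pair, values in ratios.items():
    print(pair, values)
```

      With n features this produces n(n-1)/2 ratio images, which is why the untargeted mode scales quadratically with the size of the feature list.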

      "we annotate missing values with 1/5 the minimum value quantified in all pixels in which it was detected" This is explicit (ie only values with exactly 1/5 the value are annotated" - make it clear this is a threshold.

      We apologize for the misunderstanding. Missing values either have no value or have an abundance of exactly zero. We first calculate the minimum abundance of a particular m/z among all pixels with detectable abundance (i.e. excluding missing values), then use 1/5 of this minimum value to annotate the missing values (Page 4, line 133-139).
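
      The imputation rule stated in this response can be sketched as below. This is an illustrative Python version under stated assumptions, not the authors' code: the function name `impute_missing` is hypothetical, missing values are represented as `None` or `0`, and the rule is applied to one m/z feature at a time.

```python
def impute_missing(values, divisor=5):
    """Replace missing abundances of one m/z feature with min(detected) / divisor.

    values  -- per-pixel abundances; None or a value <= 0 counts as missing
    divisor -- 5 reproduces the "1/5 of the minimum" rule described in the text
    """
    detected = [v for v in values if v is not None and v > 0]
    if not detected:
        # Feature never detected: nothing to base the fill value on.
        return list(values)
    fill = min(detected) / divisor
    return [fill if (v is None or v <= 0) else v for v in values]


# Hypothetical abundances for a single feature across four pixels.
print(impute_missing([10.0, None, 0.0, 5.0]))
```

      Filling with a small positive value rather than zero keeps every pixel usable as a denominator when ratios are computed afterwards.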

      Figure 1: legend scils is branded SCiLS and EXCEL does not need caps lock (Excel).

      Figure 1 legend has been corrected.

      Conflicts of interest "None" - there are Bruker employees on a paper about MALDI method development in a field they dominate.

      We added Joshua Fischer as a Bruker employee.

      Figure 3: The legend does not describe the purple arrow in J.

      Purple arrow description is added to figure legend.

      Figure 5: Fix orientation inconsistencies in G, H, I, and J. Especially in J - they are opposite directions. This is arbitrary and determined in SCiLS lab with simple rotation.

      Orientation has been made consistent in G,H, I and J.

      Figure S8: Provide exact number of biological and technical replicates used to generate this figure.

      Figure S8, now Figure S9, was generated from 4 biological replicates of KO and 4 biological replicates of WT brain section in the ROI7 region. This information has been added to the figure legend.

      Figure S9: Make consistent orientation of all brains

      We have made brain orientations consistent.

      In addition to ionization efficiencies impacting the value of the numeric relative abundance where ratio computation originates from, it should be mentioned how different classes of metabolites are differentially impacted by the euthanasia and collection methods used for various tissue types. For example, it is well established the ATP/AMP ratio can change drastically from tissue collection.

      We have added this to page 8, line 315-319.

      Perform standards to adjust for ionization efficiency between different m/z features.

      Untargeted ratio imaging serves as an add-on MSI data analysis tool whose primary use is comparing ratios among equivalent regions/pixels with similar ionization efficiencies. It is a hypothesis generation tool. Using standards to adjust for ionization efficiency would be a great way to obtain a more accurate assessment of ratio values. Because of the cost and limited availability of stable isotope standards for different m/z features, we chose glucose, lactate and ascorbate to show that abundance ratios and concentration ratios result in similar images for these example brain metabolites (page 6, line 196-205).

      Add more controls to support the claims.

      We have 4 biological replicates for each genotype of brain. We have added the number of controls in all figure legends.

      Significantly tone down the claims, it is unclear how knowledgeable the authors are about the current literature of SW regarding MALDI.

      The tone has been significantly toned down throughout the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Abstract:

      "relative abundance of structurally identified and yet-undefined metabolites across tissue cryosections" is misleading, since tandem MS can be performed in an imaging context and is often also compatible with the same instrument.

      We have deleted this sentence in the abstract.

      Intro:

      Paragraph 1: The authors mention MALDI and DESI, but I would argue that SIMS is more abundantly used than DESI within single-cell applications.

      We have added SIMS to the introduction Page 3, line 67.

      Paragraph 2: While it may not be all detected pairs, there are many examples of ratio imaging in the MALDI MSI and SIMS communities, particularly for bacterial signaling. These would be important examples to reference.

      We have added the application of SIMS ratio imaging to the introduction, page 3, line 74-75.

      Materials :

      Paragraph 1: More specificity on sample size is required. 3 or 4 per group is not specific. Which has four and which has three? Why are they different?

      We have corrected the sample numbers for each genotype in the text and figure legends. The number of sections per group differs because of the availability of fresh-frozen tissues (Page 4, line 115-117).

      Results:

      Paragraph 1: Am I correct in reading that an .imzml can't be used directly? Why not?

      Imaging Mass Spectrometry Markup Language (imzML) is a common data format for mass spectrometry imaging. It was developed to allow the flexible and efficient exchange of large MS imaging data between different instruments and data analysis software (Schramm et al, 2012). It contains two sets of data: the mass spectral data, which are stored in a binary file (.ibd file) to ensure efficient storage, and the XML metadata (.imzML file), which stores instrumental parameters and sample details. Therefore, it can’t be used directly. We have added this to Result 1 (Page 5, line 160-169).
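
      Because an imzML dataset is a pair of files, a practical first step before any pixel extraction is to confirm that both are present. The sketch below assumes the common convention that the binary file shares the base name of the metadata file; the helper names and the file name `brain_section.imzML` are hypothetical, and no spectra are parsed here.

```python
from pathlib import Path


def imzml_pair(imzml_path):
    """Return (metadata file, binary file) paths for one imzML dataset."""
    meta = Path(imzml_path)
    return meta, meta.with_suffix(".ibd")


def is_complete(imzml_path):
    """True only if both the XML metadata and the binary spectra file exist."""
    meta, binary = imzml_pair(imzml_path)
    return meta.is_file() and binary.is_file()


meta, binary = imzml_pair("brain_section.imzML")
print(meta.name, binary.name)
```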

      Paragraph 4: "Additionally, nonlipid small molecule metabolites suffer from smearing and/or diffusion during cryosection processing, including over the course of matrix deposition for MALDI-MSI." This is misleading. There are several examples of MALDI MSI of small metabolites that are nonlipids, where smearing or diffusion have not occurred. It would be beneficial to have a more accurate discussion of this instead. The authors should also provide some evidence of this, since they continue to focus on it for the full paragraph and don't provide references.

      We initially meant that the poor image quality of small molecule metabolites is due to their interaction with the aqueous phase of the spraying solution, their rapid degradation, and matrix interference. We have deleted this sentence in the revised version.

      Section 5 Paragraph 2: "However, ratio imaging revealed a much greater aspartate to glutamate ratio in an unusual "moon arc" region across the amygdala and hypothalamus relative to the rest of the coronal brain." "Much greater" isn't scientifically accurate or descriptive. Use real numbers and be quantitative.

      We used pixel data from all 8 sections to obtain quantitative changes in the ratio-generated “moon arc” region compared to the rest of the coronal brain (page 8, line 331-337). Ratio imaging revealed an average 1.59-fold increase in the aspartate to glutamate ratio in an unusual “moon arc” region across the amygdala and hypothalamus (mean abundance 0.563 in 6345 pixels) relative to the rest of the coronal brain (mean abundance 0.353 in 45742 pixels, Figure 5D). Similar but distinct arc-like structures are encompassed within the ventral thalamus and hypothalamus, wherein the glutamate to glutamine ratio shows a 1.63-fold increase in intensity compared to the rest of the brain (mean abundance of 0.695 in 7108 pixels vs 0.428 in 44979 pixels, Figure 5E).
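
      The fold changes quoted here are the quotient of the two regional means, which can be checked directly from the rounded values given in the response. The sketch below is only that arithmetic; the function name `fold_change` is hypothetical. Recomputing from means rounded to three decimals can differ in the last digit from a value computed on unrounded pixel data, which may explain why the second quotient comes out slightly below the reported 1.63.

```python
def fold_change(region_mean, rest_mean):
    """Fold change of a region's mean ratio relative to the rest of the tissue."""
    return region_mean / rest_mean


# Means as reported in the response (rounded to three decimals).
asp_glu = fold_change(0.563, 0.353)   # "moon arc" vs rest, aspartate/glutamate
glu_gln = fold_change(0.695, 0.428)   # arc-like region vs rest, glutamate/glutamine

print(f"aspartate/glutamate fold change: {asp_glu:.3f}")
print(f"glutamate/glutamine fold change: {glu_gln:.3f}")
```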

      Section 8 Paragraph 2: "UMAPing" is not scientifically written.

      We have replaced UMAPing with UMAP.

      Figure 2 is difficult to interpret, given the small sizes of the images. Align the images, reduce the white space, clearly label the different tissues, add scale bars, increase size, etc. This applies to all figures, except for 3. This will make it possible to review.

      All figures have been resized by removing extra space between sections.

      Figure 3. There seems to be a change in tissue after section I, so a different diagram would be helpful. SCD has a high abundance in an area that seems to be off of the tissue. Can the authors explain this? Some of the images also appear to be low signal-to-noise. Example spectra in the SI would be helpful, so I can more accurately judge the quality of the data.

      We apologize for the discrepancy. All images are from the same sample. We initially cropped the individual image from multiple page PDF plot, then inserted it in Figure 3. Resizing and cropping inconsistency may lead to the small difference in image size. In the revised version, we plot all images in one page, which eliminates the inconsistency.

      Figure 3 example pixel data, ratio pixel data, mass spectra and ratio images can be downloaded below:

      https://wcm.box.com/s/2d5jch45ar8upjzytljnylt6doewcsqc

    1. eLife Assessment

      This study provides valuable information on the single nucleus RNA sequencing transcriptome, pathways, and cell types in pig skeletal muscle in response to conjugated linoleic acid (CLA) supplementation. Based on the comprehensive data analyses, the data are considered compelling and provide new insight into the mechanisms underlying intramuscular fat deposition and muscle fiber remodeling. The study contributes significantly to the understanding of nutritional strategies for fat infiltration in pig muscle.

    2. Joint Public Review:

      This study comprehensively presents data from single nuclei sequencing of Heigai pig skeletal muscle in response to conjugated linoleic acid supplementation. The authors identify changes in myofiber type and adipocyte subpopulations induced by linoleic acid at depth previously unobserved. The authors show that linoleic acid supplementation decreased the total myofiber count, specifically reducing type II muscle fiber types (IIB), myotendinous junctions, and neuromuscular junctions, whereas type I muscle fibers are increased. Moreover, the authors identify changes in adipocyte pools, specifically in a population marked by SCD1/DGAT2. To validate the skeletal muscle remodeling in response to linoleic acid supplementation, the authors compare transcriptomics data from Laiwu pigs, a model of high intramuscular fat, to Heigai pigs. The results verify changes in adipocyte subpopulations when pigs have higher intramuscular fat, either genetically or diet-induced. Targeted examination using cell-cell communication network analysis revealed associations with high intramuscular fat with fibro-adipogenic progenitors (FAPs). The authors then conclude that conjugated linoleic acid induces FAPs towards adipogenic commitment. Specifically, they show that linoleic acid stimulates FAPs to become SCD1/DGAT2+ adipocytes via JNK signaling. The authors conclude that their findings demonstrate the effects of conjugated linoleic acid on skeletal muscle fat formation in pigs, which could serve as a model for studying human skeletal muscle diseases.

      [Editors' note: the authors have responded to the previous rounds of review: https://doi.org/10.7554/eLife.99790.1.sa1 and https://doi.org/10.7554/eLife.99790.2.sa1]

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public review):

      In this revised manuscript, the authors aim to elucidate the cytological mechanisms by which conjugated linoleic acids (CLAs) influence intramuscular fat deposition and muscle fiber transformation in pig models. They have utilized single-nucleus RNA sequencing (snRNA-seq) to explore the effects of CLA supplementation on cell populations, muscle fiber types, and adipocyte differentiation pathways in pig skeletal muscles. Notably, the authors have made significant efforts in addressing the previous concerns raised by the reviewers, clarifying key aspects of their methodology and data analysis.

      Strengths:

      (1) Thorough validation of key findings: The authors have addressed the need for further validation by including qPCR, immunofluorescence staining, and western blotting to verify changes in muscle fiber types and adipocyte populations, which strengthens their conclusions.

      (2) Improved figure presentation: The authors have enhanced figure quality, particularly for the Oil Red O and Nile Red staining images, which now better depict the organization of lipid droplets (Figure 7A). Statistical significance markers have also been clarified (Figure 7I and 7K).

      Thanks!

      Weaknesses:

      (1) Cross-species analysis and generalizability of the results: Although the authors could not perform a comparative analysis across species due to data limitations, they acknowledged this gap and focused on analyzing regulatory mechanisms specific to pigs. Their explanation is reasonable given the current availability of snRNA-seq datasets on muscle fat deposition in other species such as human and mouse.

      Thanks for your suggestion!

      (2) Mechanistic depth in JNK signaling pathway: While the inclusion of additional experiments is a positive step, the exploration of the JNK signaling pathway could still benefit from deeper analysis of downstream transcriptional regulators. The current discussion acknowledges this limitation, but future studies should aim to address this gap fully.

      Thanks! As we noted in the Discussion, further studies should focus on the downstream transcriptional regulators of the JNK signaling pathway in IMF deposition.

      (3) Limited exploration of other muscle groups: The authors did not expand their analysis to additional muscle groups, leaving some uncertainty regarding whether other muscle groups might respond differently to CLA supplementation. Further studies in this direction could enhance the understanding of muscle fiber dynamics across the organism.

      Thanks for your suggestion! In this study, we mainly focused on the adipocyte, muscle, and FAP subpopulations, which play important roles in lipid deposition. As you suggested, our future studies will focus on other subpopulations such as endothelial cells and immune cells.

      Reviewer #2 (Public review):

      Summary:

      This study comprehensively presents data from single nuclei sequencing of Heigai pig skeletal muscle in response to conjugated linoleic acid supplementation. The authors identify changes in myofiber type and adipocyte subpopulations induced by linoleic acid at a depth previously unobserved. The authors show that linoleic acid supplementation decreased the total myofiber count, specifically reducing type II muscle fiber types (IIB), myotendinous junctions, and neuromuscular junctions, whereas type I muscle fibers are increased. Moreover, the authors identify changes in adipocyte pools, specifically in a population marked by SCD1/DGAT2. To validate the skeletal muscle remodeling in response to linoleic acid supplementation, the authors compare transcriptomics data from Laiwu pigs, a model of high intramuscular fat, to Heigai pigs. The results verify changes in adipocyte subpopulations when pigs have higher intramuscular fat, either genetically or diet-induced. Targeted examination using cell-cell communication network analysis revealed associations of high intramuscular fat with fibro-adipogenic progenitors (FAPs). The authors then conclude that conjugated linoleic acid induces FAPs towards adipogenic commitment. Specifically, they show that linoleic acid stimulates FAPs to become SCD1/DGAT2+ adipocytes via JNK signaling. The authors conclude that their findings demonstrate the effects of conjugated linoleic acid on skeletal muscle fat formation in pigs, which could serve as a model for studying human skeletal muscle diseases.

      Strengths:

      The comprehensive data analysis provides information on conjugated linoleic acid effects on pig skeletal muscle and organ function. The notion that linoleic acid induces skeletal muscle composition and fat accumulation is considered a strength and demonstrates the effect of dietary interactions on organ remodeling. This could have implications for the pig farming industry to promote muscle marbling. Additionally, these data may inform the remodeling of human skeletal muscle under dietary behaviors, such as elimination and supplementation diets and chronic overnutrition of nutrient-poor diets. However, the biggest strength resides in thorough data collection at the single nuclei level, which was extrapolated to other types of Chinese pigs.

      Weaknesses:

      Although the authors compiled a substantial and comprehensive dataset, the scope of cellular and molecular-level validation still needs to be expanded. For instance, the single nuclei data suggest changes in myofiber type after linoleic acid supplementation, but these findings need more thorough validation. Further histological and physiological assessments are necessary to address fiber types and oxidative potential. Similarly, the authors propose that linoleic acid alters adipocyte populations, FAPs, and preadipocytes; however, there are limited cellular and molecular analyses to confirm these findings. The identified JNK signaling pathways require additional follow-ups on the molecular mechanism or transcriptional regulation. However, these issues are discussed as potential areas for future exploration. While various individual studies have been conducted on mouse/human skeletal muscle and adipose tissues, these have only been briefly discussed, and further investigation is warranted. Additionally, the authors incorporate two pig models into their results, but they only examine one muscle group. Exploring whether other muscle groups respond similarly or differently to linoleic acid supplementation would be valuable. Furthermore, the authors should discuss how their results translate to human and pig nutrition, such as the desirability and cost-effectiveness for pig farmers and human diets high in linoleic acid. Notably, while the single nuclei data is comprehensive, there needs to be a statement on data deposition and code availability, allowing others access to these datasets.

      Thanks for your suggestion!

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The authors have discussed and provided some experimental evidence to address the related issues to help justify their conclusions. The reviewer believes that authors should deposit their single-cell sequencing data and code for the broader research community.

      Thank you! We have uploaded our raw dataset to the Genome Sequence Archive (Genomics, Proteomics & Bioinformatics 2021) in the National Genomics Data Center (Nucleic Acids Res 2022), China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences, and the data availability section has been updated (lines 575-579).

    1. eLife Assessment

      This important study reveals that disrupting fatty acid metabolism in macrophages significantly restricts the growth of Mycobacterium tuberculosis, showing that impaired lipid processing triggers various antimicrobial responses. Overall, the approach is robust, utilizing CRISPR-Cas9 knockout of multiple genes involved in lipid metabolism which yielded convincing data. This work highlights how host lipid metabolism affects the ability of tubercle bacilli to thrive intracellularly, pointing to potential new therapeutic targets.

    2. Reviewer #1 (Public review):

      Summary:

      This study investigates the role of macrophage lipid metabolism in the intracellular growth of Mycobacterium tuberculosis. By using a CRISPR-Cas9 gene-editing approach, the authors knocked out key genes involved in fatty acid import, lipid droplet formation, and fatty acid oxidation in macrophages. Their results show that disrupting various stages of fatty acid metabolism significantly impairs the ability of Mtb to replicate inside macrophages. The mechanisms of growth restriction included increased glycolysis, oxidative stress, pro-inflammatory cytokine production, enhanced autophagy, and nutrient limitation. The study demonstrates that targeting fatty acid homeostasis at different stages of the lipid metabolic process could offer new strategies for host-directed therapies against tuberculosis.

      The work is convincing and methodologically strong, combining genetic, metabolic, and transcriptomic analyses to provide deep insights into how host lipid metabolism affects bacterial survival.

      Strengths:

      The study uses a multifaceted approach, including CRISPR-Cas9 gene knockouts, metabolic assays, and dual RNA sequencing, to assess how various stages of macrophage lipid metabolism affect Mtb growth. The use of CRISPR-Cas9 to selectively knock out key genes involved in fatty acid metabolism enables precise investigation of how each step (lipid import, lipid droplet formation, and fatty acid oxidation) affects Mtb survival. The study offers mechanistic insights into how different impairments in lipid metabolism lead to diverse antimicrobial responses, including glycolysis, oxidative stress, and autophagy. This deepens the understanding of macrophage function in immune defense.

      The use of functional assays to validate findings (e.g., metabolic flux analyses, lipid droplet formation assays, and rescue experiments with fatty acid supplementation) strengthens the reliability and applicability of the results.

      By highlighting potential targets for HDT that exploit macrophage lipid metabolism to restrict Mtb growth, the work has significant implications for developing new tuberculosis treatments.

      Weaknesses:

      The experiments were primarily conducted in vitro using CRISPR-modified macrophages. While these provide valuable insights, they may not fully replicate the complexity of the in vivo environment where multiple cell types and factors influence Mtb infection and immune responses. Yet, I agree that the Hoxb8 in vitro model provides a powerful genetic tool to interrogate host-Mtb interactions using primary macrophages that represent the bone marrow-derived macrophage lineage, instead of using cell lines.

      Comments on revisions: The authors have addressed my comment satisfactorily.

    3. Reviewer #2 (Public review):

      Summary:

      Host-derived lipids are an important factor during Mtb infection. In this study, using CRISPR knockouts of genes involved in fatty acid uptake and metabolism, the authors claim that a compromised uptake, storage or metabolism of fatty acid in the hosts restricts Mtb growth upon infection. The mechanism involves increased glycolysis, autophagy, oxidative stress, pro-inflammatory cytokines and nutrient limitation. The study may be useful for developing novel host-directed approaches against TB.

      Strengths:

      The study's strength is the use of clean HOXB8-derived primary mouse macrophage lines for generating CRISPR knockouts.

      Weaknesses:

      The strength of evidence on autophagy and redox stress remains incomplete.

      Comments on revisions:

      The authors have revised the manuscript and addressed some of the earlier concerns. However, some of the interpretations and responses are incorrect.

      Overall, the level of evidence supporting the following statement in the abstract is incomplete: "Our analyzes demonstrate that macrophages which cannot either import, store or catabolize fatty acids restrict Mtb growth by both common and divergent anti-microbial mechanisms, including increased glycolysis, increased oxidative stress, production of pro-inflammatory cytokines, enhanced autophagy and nutrient limitation".

      There is an increase in glycolysis and pro-inflammatory cytokines and, to some extent, oxidative stress. The same cannot be said about autophagy. Unfortunately, the authors did not try to establish a direct role of any of these pathways in restricting bacterial growth in the absence of any of the three genes studied.

      Major concern:

      Autophagy: The LC3 WB does not, by any stretch of the imagination, convince that there is an increase in autophagy flux, as inferred by the authors. Authors correctly cite the "Guidelines to autophagy" paper. Unfortunately, they cite it only selectively to justify their assessment. The LC3II/LC3I ratio indicates the number of autophagosomes present. This ratio can also increase if there is an active block of autophagosome maturation. That's why having BafA1 or CQ controls is important to assess the active autophagosome maturation. However, the authors sidestep this serious consideration by claiming some "pleiotropic impact on Mtb". With BafA1 and CQ, the only assay one needs is to measure the impact on LC3II levels. In the absence of this assay, the evidence supporting the role of autophagy is incomplete.

      The main concern regarding autophagy results is that autophagy induction can typically bring down oxidative stress and classically has anti-inflammatory outlay. Thus, increased glycolysis, inflammatory cytokine production and redox stress indicate more towards a potential block in autophagy at the maturation step. This necessitates validation using autophagy flux assays.

      Oxidative stress: Showing a representative image for the corresponding representative groups would be more convincing. For example, there is no clarity on whether, in the infected group, there was any staining for Mtb to analyse only the infected cells.

    4. Reviewer #3 (Public review):

      Summary:

      This study provides significant insights into how host metabolism, specifically of lipids, influences the pathogenesis of Mycobacterium tuberculosis (Mtb). It builds on existing knowledge about Mtb's reliance on host lipids and emphasizes the potential of targeting fatty acid metabolism for therapeutic intervention.

      Strengths:

      To generate the data, the authors use CRISPR technology to precisely disrupt the genes involved in lipid import (CD36, FATP1), lipid droplet formation (PLIN2) and fatty acid oxidation (CPT1A, CPT2) in mouse primary macrophages. The Mtb Erdman strain is used to infect the macrophage mutants. The study reveals specific roles of different lipid-related genes. Importantly, results challenge previous assumptions about lipid droplet formation and show that macrophage responses to lipid metabolism impairments are complex and multifaceted. The experiments are well-controlled and the data is convincing.

      Overall, this well-written paper makes a meaningful contribution to the field of tuberculosis research, particularly in the context of host-directed therapies (HDTs). It suggests that manipulating macrophage metabolism could be an effective strategy to limit Mtb growth.

      Weaknesses:

      None noted. The manuscript provides important new knowledge that will lead to novel host-directed therapies to control Mtb infections.

      Comments on revisions: The authors have addressed the concerns of the reviewers.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study investigates the role of macrophage lipid metabolism in the intracellular growth of Mycobacterium tuberculosis. By using a CRISPR-Cas9 gene-editing approach, the authors knocked out key genes involved in fatty acid import, lipid droplet formation, and fatty acid oxidation in macrophages. Their results show that disrupting various stages of fatty acid metabolism significantly impairs the ability of Mtb to replicate inside macrophages. The mechanisms of growth restriction included increased glycolysis, oxidative stress, pro-inflammatory cytokine production, enhanced autophagy, and nutrient limitation. The study demonstrates that targeting fatty acid homeostasis at different stages of the lipid metabolic process could offer new strategies for host-directed therapies against tuberculosis.

      The work is convincing and methodologically strong, combining genetic, metabolic, and transcriptomic analyses to provide deep insights into how host lipid metabolism affects bacterial survival.

      Strengths:

      The study uses a multifaceted approach, including CRISPR-Cas9 gene knockouts, metabolic assays, and dual RNA sequencing, to assess how various stages of macrophage lipid metabolism affect Mtb growth. The use of CRISPR-Cas9 to selectively knock out key genes involved in fatty acid metabolism enables precise investigation of how each step (lipid import, lipid droplet formation, and fatty acid oxidation) affects Mtb survival. The study offers mechanistic insights into how different impairments in lipid metabolism lead to diverse antimicrobial responses, including glycolysis, oxidative stress, and autophagy. This deepens the understanding of macrophage function in immune defense.

      The use of functional assays to validate findings (e.g., metabolic flux analyses, lipid droplet formation assays, and rescue experiments with fatty acid supplementation) strengthens the reliability and applicability of the results.

      By highlighting potential targets for HDT that exploit macrophage lipid metabolism to restrict Mtb growth, the work has significant implications for developing new tuberculosis treatments.

      Weaknesses:

      The experiments were primarily conducted in vitro using CRISPR-modified macrophages. While these provide valuable insights, they may not fully replicate the complexity of the in vivo environment where multiple cell types and factors influence Mtb infection and immune responses.

      We thank the reviewer for pointing this out. We acknowledge that our in vitro system may indeed not fully replicate the complex in vivo environment, given what is coming to light about heterogeneous macrophage responses to Mtb infection in whole-animal models. We do believe, however, that the Hoxb8 in vitro model provides a powerful genetic tool to interrogate host-Mtb interactions using primary macrophages that represent the bone marrow-derived macrophage lineage.

      Reviewer #2 (Public review):

      Summary:

      Host-derived lipids are an important factor during Mtb infection. In this study, using CRISPR knockouts of genes involved in fatty acid uptake and metabolism, the authors claim that a compromised uptake, storage, or metabolism of fatty acid restricts Mtb growth upon infection. Further, the authors claim that the mechanism involves increased glycolysis, autophagy, oxidative stress, pro-inflammatory cytokines, and nutrient limitation. The authors also claim that impaired lipid droplet formation restricts Mtb growth. However, promoting lipid droplet biogenesis does not reverse/promote Mtb growth.

      Strengths:

      The strength of the study is the use of clean HOXB8-derived primary mouse macrophage lines for generating CRISPR knockouts.

      Weaknesses:

      There are many weaknesses in this study; they are clubbed into four categories below.

      (1) Evidence and interpretations: The results shown in this study at several places do not support the interpretations made or are internally contradictory or inconsistent. There are several important observations, but none were taken forward for in-depth analysis.

      a) The phenotypes of PLIN2<sup>-/-</sup>, FATP1<sup>-/-</sup>, and CPT<sup>-/-</sup> are comparable in terms of bacterial growth restriction; however, their phenotypes in terms of lipid body formation, IL1B expression, etc., are not consistent. These are interesting observations and suggest additional mechanisms specific to specific target genes; however, clubbing them all as altered fatty acid uptake or catabolism-dependent phenotypes takes away this important point.

      We thank the reviewer for highlighting this. Our focus was on assessing the impact of manipulating lipid homeostasis in macrophages at several stages and the consequences this has on the intracellular growth of Mtb. Throughout the manuscript (abstract, results and discussion), we have continuously emphasized that interfering with lipid handling at several stages in macrophages results in both conserved and divergent antimicrobial responses against intracellular Mtb.

      b) Finding the FATP1 transcript in the HOXB8-derived FATP1<sup>-/-</sup> CRISPR KO line is a bit confusing. There is less than a two-fold decrease in relative transcript abundance in the KO line compared to the WT line, leaving concerns regarding the robustness of other experiments as well using FATP1<sup>-/-</sup> cells.

      CRISPR-Cas9 targeting of genes with single sgRNAs, as is the case with our mutants, generates insertions and deletions (INDELs) at the CRISPR cut site. These INDELs do not block mRNA transcription totally, and this is widely reported in the field. Because of this, quantitative RT-PCR or RNA-seq methods are not routinely used to verify CRISPR knockouts as they are not sensitive enough to identify INDELs. We provide INDEL quantification and knockout efficiencies by ICE analysis in supplemental file 1 for all the mutants used in the study. We also demonstrate protein depletion by western blot and flow cytometry for all the mutants (Figure 1 - figure supplement 1). Only mutants with greater than 90% protein depletion were used for subsequent characterization.

      c) No gene showed differential regulation in FATP<sup>-/-</sup> macrophages, which is very surprising.

      We assume the reviewer is referring to the Mtb transcriptome response in FATP1<sup>-/-</sup> macrophages, which we agree was unexpected. However, we saw a significant compensatory response in the host cell (at the transcriptional level) in FATP1<sup>-/-</sup> macrophages, as evidenced by an upregulation of other fatty acid transporters (Figure 5 - figure supplement 1, now Figure 6 - figure supplement 1). We believe that these compensatory responses could, in part, alleviate the stresses the bacteria experience within the cell. We discuss this point in the manuscript.

      d) ROS measurements should be done using flow cytometry and not by microscopy to nail the actual pattern.

      We thank the reviewer for the suggestion. However, confocal imaging is also widely used to measure ROS with similar quantitative power and individual cell resolution (PMID: 32636249, 35737799).

      (2) Experimental design: For a few assays, the experimental design is inappropriate.

      a) For autophagy flux assay, immunoblot of LC3II alone is not sufficient to make any interpretation regarding the state of autophagy. This assay must be done with BafA1 or CQ controls to assess the true state of autophagy.

      We would like to point out that monitoring LC3I to LC3II conversion by western blot, confocal imaging of LC3 puncta and qPCR analysis of autophagy-related genes are all validated assays for monitoring autophagic flux in a wide variety of cells. We refer the reviewer to the latest extensive guidelines on the subject (PMID: 33634751). Furthermore, Bafilomycin A and chloroquine are not specific inhibitors of autophagy and therefore are of limited value as controls. BafA is an inhibitor of the proton-ATPase apparatus and can indirectly impact autophagy through activity on the Ca-P60A/SERCA pathway. Chloroquine impacts vacuole acidification, autophagosome/lysosome fusion and slows phagosome maturation. So, while BafA and chloroquine will reduce autophagy, their effects are pleiotropic and their impact on Mtb is unknown.

      b) Similarly, qPCR analyses of autophagy-related gene expression do not reflect anything on the state of autophagy flux.

      See our response above.

      (3) Using correlative observations as evidence:

      a) Observations based on RNAseq analyses are presented as functional readouts, which is incorrect.

      We are not entirely sure where we used our RNA-seq data sets as functional readouts. We used our transcriptome data to provide a preliminary identification of anti-microbial responses in the mutant macrophages infected with Mtb, and we mention this at the beginning of the RNA-seq results sections. Where applicable, we followed up and confirmed the more compelling RNA-seq data by metabolic flux analyses, qPCR, ROS measurements, or quantitative imaging.

      b) Claiming that the inability to generate lipid droplets in PLIN2<sup>-/-</sup> cells led to the upregulation of several pathways in the cells is purely correlative, and the causal relationship does not exist in the data presented.

      It was not our intention to infer causality. We have re-written the beginning of the sentence, and it now starts with “Meanwhile, Mtb infection of PLIN2<sup>-/-</sup> macrophages led to upregulation”, which hopefully eliminates any implication of causality.

      (4) Novelty: A few main observations described in this study were previously reported. That includes Mtb growth restriction in PLIN2 and FATP1 deficient cells. Similarly, the impact of Metformin and TMZ on intracellular Mtb growth is well-reported. While that validates these observations in this study, it takes away any novelty from the study.

      To the best of our knowledge, Mtb growth restrictions in PLIN2 and FATP1 deficient macrophages have not been reported elsewhere. To the contrary, PLIN2 knockout macrophages obtained from PLIN2 deficient mice have been reported to robustly support Mtb replication (PMID: 29370315). We extensively discuss these discrepancies in the manuscript. We also discuss and cite appropriate references where Mtb growth restriction for similar macrophage mutants has been reported (CD36<sup>-/-</sup> and CPT2<sup>-/-</sup>). Our aim was to carry out a systematic myeloid-specific genetic interference of fatty acid import, storage and catabolism to assess the effect on Mtb growth at all stages of lipid handling instead of focusing on one target. In the chemical approach, we used TMZ and Metformin deliberately because they had already been reported as being active against intracellular Mtb and we wished to place our data in the context of existing literature. These studies have been referenced extensively in the text.

      (5) Manuscript organisation: It will be very helpful to rearrange figures and supplementary figures.

      New figures have been added, and existing ones have been re-arranged where necessary. See our responses to recommendations for authors.

      Reviewer #3 (Public review):

      Summary:

      This study provides significant insights into how host metabolism, specifically lipids, influences the pathogenesis of Mycobacterium tuberculosis (Mtb). It builds on existing knowledge about Mtb's reliance on host lipids and emphasizes the potential of targeting fatty acid metabolism for therapeutic intervention.

      Strengths:

      To generate the data, the authors use CRISPR technology to precisely disrupt the genes involved in lipid import (CD36, FATP1), lipid droplet formation (PLIN2), and fatty acid oxidation (CPT1A, CPT2) in mouse primary macrophages. The Mtb Erdman strain is used to infect the macrophage mutants. The study reveals specific roles of different lipid-related genes. Importantly, results challenge previous assumptions about lipid droplet formation and show that macrophage responses to lipid metabolism impairments are complex and multifaceted. The experiments are well-controlled and the data is convincing.

      Overall, this well-written paper makes a meaningful contribution to the field of tuberculosis research, particularly in the context of host-directed therapies (HDTs). It suggests that manipulating macrophage metabolism could be an effective strategy to limit Mtb growth.

      Weaknesses:

      None noted. The manuscript provides important new knowledge that will lead to novel host-directed therapies to control Mtb infections.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The study presents compelling and well-supported conclusions based on a solid body of evidence. However, the clarity of several figures could be improved for better understanding.

      (1) In Figure 1, panels B and C are referenced incorrectly in the text.

      We thank the reviewer for identifying the error. This has now been corrected.

      (2) Figures 2 and S2 would benefit from being combined or reorganized to display the data related to infected and uninfected cells together, making it easier for the reader to interpret.

      We thank the reviewer for the suggestion. However, we believe that combining the two figures would further complicate the merged figure, making it even more difficult to interpret. We decided to highlight the mutant macrophages’ responses upon Mtb infection in Figure 2 and put the uninfected data sets in supplementary information, given that the OCR and ECAR trends were similar and as expected in both infected and uninfected states.

      (3) Figure 3 is mislabeled, with four panels shown in the figure, but only panels A and B are mentioned in both the text and the figure legend.

      We thank the reviewer for the observation. Figure 3 has been extensively revised. We have included new blots, statistical comparisons and a corresponding new supplementary figure (Figure 3 - figure supplement 1). We have verified that the figure panels are labelled correctly and appropriately referenced in the manuscript text.

      (4) Figure 5 is overly complex and difficult to interpret. Simplifying the figure, possibly by reducing the amount of data or breaking it into more digestible parts, would enhance its readability.

      We thank the reviewer for the suggestion. We have separated the figure into two parts which are now Figure 5 for the PCA and Venn diagrams and Figure 6 for the pathway enrichment figure panels. We have increased the resolution of both figures in the revised manuscript to improve readability.

      (5) Panel 6A is not particularly informative and could either be omitted with a more detailed explanation provided in the text, or replaced with a clearer visual representation, such as Venn diagrams, to improve data visualization.

      We thank the reviewer for the suggestion. We have removed Figure 6A given that detailed explanation of the panel is already available in the manuscript text.

      (6) Additionally, on line 309, the word "to" is missing before "generate".

      We thank the reviewer for identifying this. This sentence has now been re-written to address some unintended inferences of causation in line with recommendations from reviewer 2.

      Reviewer #2 (Recommendations for the authors):

      (1) Manuscript Organisations: The manuscript is very poorly organised. Supplemental figures are labelled very unconventionally, and that creates much confusion in following the manuscript. Some of the results in the supplementary figures could be easily kept in the main figures, as it is difficult to compare plots between the main figures and the supple figures. The results of RNAseq experiments are impossible to follow with very small fonts. Overall, the figures are very casually organised and can certainly be improved.

      We would like to clarify that supplemental figures are labelled and organized in line with the eLife formatting of supplemental figures. We deliberately placed some redundant figures, such as Figure 2 - figure supplement 1, in the supplementary information (see our response to reviewer 1's recommendations on the same point). We have split the RNA-seq Figure 5 into two separate figures (now Figures 5 and 6) and increased their resolution to improve readability.

      (2) Figure 3: Among the KO lines, only PLIN2<sup>-/-</sup> had a higher HIF1a level before infection. Infection surely leads to higher levels across the three cases.

      We have generated replicate western blots and provide statistical quantitation for HIF1a, AMPK and pAMPK. Figure 3 has now been revised extensively; replicate blots are in Figure 3 - figure supplement 1. We have updated the text to reflect the reviewer's observation, which was also consistent with our statistical quantification.

      (3) pAMPK blots are of very poor quality. Without quantification, the trend mentioned in the text is not clearly visible.

      We have provided two more replicate blots for AMPK/pAMPK and provide statistical quantification as described above.

      (4) Line 230: Regarding autophagy flux, neither the data suggest what is interpreted nor is this experiment correctly done. LC3 WB and autophagy gene qPCR: Unfortunately, LC3 WB, the way it was done, does not tell anything about the state of autophagy in these cells. A very mild LC3II increase is noted in CPT2<sup>-/-</sup> cells upon infection; the rest of the others do not show any change. This assay is not done correctly. To interpret LC3II WB, one needs to include the Bafilomycin A1 control, usually +Baf and -Baf run in the adjacent wells in the gel. Similarly, qPCR results are not indicative of any increase in autophagy. Regulation of ATG7, MAP1LC3B, and ULK1 is more at the post-translational level than the transcriptional level.

      We have provided an additional replicate blot together with statistical quantification of LC3II/LC3I ratios in the revised Figure 3 - figure supplement 2. Our quantifications remain consistent with our prior assertions in the manuscript text. See our response in the public review section concerning autophagy assays and the use of Baf or chloroquine as controls.

      (5) Exogenous oleate fails to rescue the Mtb icl1-deficient mutant in FATP1<sup>-/-</sup>, PLIN2<sup>-/-</sup> and CPT2<sup>-/-</sup> macrophages: this result is confusing. Lipid uptake and metabolism have been the central players so far; however, here, the phenotypes of FATP1 and CPT2 in terms of lipid body accumulation are very distinct. Therefore, the assessment that Mtb growth inhibition is due to factors other than limited access to fatty acid is not consistent with the theme of the study.

      Nutrient limitation is a distinct transcriptional signature of Mtb, at least in PLIN2<sup>-/-</sup> macrophages (Figure 7). We used the oleate supplementation assay with the Mtb Δicl1 mutant to assess whether nutrient restriction was the sole anti-microbial pathway against Mtb in the knockout macrophages. This would have been the case (to a certain extent) if the growth of the Mtb Δicl1 mutant was rescuable upon addition of exogenous oleate in the knockout macrophages. Our data clearly shows that this is not the case and that in addition to nutrient limitation, interference with lipid processing results in several other macrophage anti-microbial responses against the bacteria. We extensively discuss these points in the abstract, results and discussion sections of the manuscript.

      (6) Line 309: "Meanwhile, inability generate lipid droplets in Mtb infected PLIN2<sup>-/-</sup> macrophages led to upregulation in pathways involved in ribosomal biology, MHC class 1 antigen presentation, canonical glycolysis, ATP metabolic processes and type 1 interferon responses (Figure 5C, Supplementary file 3)." This is just a correlative observation. However, it is mentioned here as a causal mechanism.

      We have revised this sentence to remove any unintended inference of causation.

      (7) IL-1b is upregulated in FATP-/- macrophages, no effect in CPT2<sup>-/-</sup> macrophages, but downregulated in PLIN2<sup>-/-</sup> macrophages. Moreover, this effect is very transient, and by 24 hours, all these differences are lost. This suggests the mechanism of action, as their pro-bacterial function shown in Figure 1, is very distinct for different proteins, and FA metabolism is probably not the common denominator across these phenotypes.

      We agree with the reviewer, and we extensively discuss this in the manuscript text (results and discussion). Clearly, there are shared anti-microbial responses across the mutants, but there are also points of divergence. We would like to further clarify that pro-inflammatory responses (IL-1b or IFN-B) in Mtb infected macrophages show a biphasic early upregulation (up to 8 hours of infection) followed by a rapid resolution phase (24-48 hours post infection). This is well reported in the literature (PMID: 30914513). It is common for pro-inflammatory gene expression differences to be temporarily lost during the resolution phase (PMID: 30914513, 39472457). IL-1b expression profiles return to the 4-hour equivalent profile in Mtb infected FATP1<sup>-/-</sup> and PLIN2<sup>-/-</sup> macrophages 4 days post infection (Figure 6A, Figure 6 - figure supplement 2B, Supplementary file 2).

      (8) It is very surprising that FATP-/- macrophages do not show any change in Mtb gene expression. The robustness of this experiment and analysis appears doubtful, given that the phenotype in terms of bacterial growth was clean.

      See our response to this comment in the public reviews section.

      (9) Figure 5, Supplementary Figure 1: Among the FA transporters, authors also show data for FATP1. I am surprised to see FATP1 expression levels in the FATP1<sup>-/-</sup> cells. This puts into doubt every dataset using FATP-/- cells in this study.

      See our response to this comment in the public reviews section.

      (10) Unfortunately, with the kind of evidence presented, it is far-fetched to claim that PLIN2<sup>-/-</sup> macrophages restrict Mtb growth by increasing ROS production. There is no evidence for this statement. The MFI units in Figure 6, Supplementary 1 are too small to extract meaningful interpretations. Moreover, the data appears to be arrived at by combining multiple technical replicates. Usually, flow cytometry data are more reliable for CellROX assays. Microscopy is not the technique of choice for this assay.

      We would like to point out that MFIs are arbitrary units set to predetermined reference points. In our case, the reference was background fluorescence in CellROX unstained cells and cells stained with CellROX equivalent fluorophore conjugated isotype antibodies. We are not entirely sure what the reviewer means by “small” in these contexts. In addition, the data is not entirely from technical replicates. Reported MFIs are from three independent repeats with MFI reads of at least 30 cells per replicate. We have added this clarification in Figure 6 - figure supplement 1 legend, now Figure 7 - figure supplement 1. See our response in the public reviews section on the use of confocal microscopy to image and quantify ROS. Furthermore, the Mtb transcriptional response in PLIN2<sup>-/-</sup> and CPT2<sup>-/-</sup> macrophages is clearly indicative of increased oxidative stress (Figure 7).

      (11) The CFU results with Metformin and TMZ are on the expected lines, as published earlier by others. FATP1 In data is good and aligned with the knockout phenotype.

      We thank the reviewer for the note.

      (12) Western blots, when interpreted for quantitative differences, must be quantified, and data should be represented as plots with statistical analysis.

      Replicate blots have been provided and statistical quantifications performed.

    1. eLife Assessment

      This manuscript establishes a mathematical model to estimate the key parameters that control the repopulation of planarian stem cells after sublethal irradiation as they undergo fate-switching as part of their differentiation and self-renewal process. The findings are important for future investigation of stem cell division in planarians and have implications for analyzing stem cell biology in other systems. The methods are convincing, integrating modeling with perturbations of key transcription factors known to be critical for cell fate decisions, but the authors have only shown that this is the case for a small number of stem cell types.

    2. Reviewer #1 (Public review):

      Summary:

      This is a very creative study using modeling and measurement of neoblast dynamics to gain insight into the mechanism that allows these highly potent cells to undergo fate-switching as part of their differentiation and self-renewal process. The authors estimate growth equation parameters for expanding neoblast clones based on new and prior experimental observations. These results indicate that neoblasts likely undergo much more symmetric self-amplifying division than loss of the population through symmetric differentiation, in the case of clone expansion assays after sublethal irradiation. Neoblasts take on multiple distinct transcriptional fates related to their terminally differentiated cell types, and prior work indicated neoblasts have a high plasticity to switch fates in a way linked to cell cycle progression and possibly through a random process. Here, the authors explore the impact of inhibition of key transcription factors defining such states (i.e. "fate specifying transcription factors", FSTFs) plus measurement and modeling in the clone expansion assay, to find that inhibition of factors like zfp1 likely causes otherwise zfp1-fated neoblasts to fail to proliferate and differentiate, without causing compensatory gains in other lineages. A mathematical model of this process, assuming that neoblasts do not retain a memory of prior states while they proliferate and transition across specified states, can mimic the experimentally determined decreased sizes of clones following inhibition of zfp1. Complementary approaches to inhibit more than one lineage (muscle plus intestine) support the idea that this is a more general process in planarian stem cells. These results provide an important advance for understanding the fate-switching process and its relationship to neoblast growth.

      Overall I find the evidence very well presented and the study compelling; it offers an important new perspective on the key properties of neoblasts. I have some comments to clarify the presentation and significance of the work.

      Comments on revisions:

      In this revised version, the authors nicely address all of my comments and I find the work makes a strong case for its main conclusions.

    3. Reviewer #2 (Public review):

      Summary:

      Cell cycle duration and cell fate choice are critical to understanding the cellular plasticity of neoblasts in planarians. In this study, Tamar et al. integrated experimental and computational approaches to simulate a model for neoblast behaviors during colony expansion.

      Strengths:

      The finding that "arresting differentiation into specific lineages disrupts neoblast proliferative capacities without inducing compensatory expression of other lineages" is particularly intriguing. This concept could inspire further studies on pluripotent stem cells and their application for regenerative biology.

      Comments on revisions:

      The authors have addressed all of my comments and concerns.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews

      Reviewer #1 (Public review):

      Overall I find the evidence very well presented and the study compelling. It offers an important new perspective on the key properties of neoblasts. I do have some comments to clarify the presentation and significance of the work.

      We thank the reviewer for the positive feedback and plan to improve the presentation of the work.

      Reviewer #2 (Public review):

      However, the absence of a cell-cell feedback mechanism during colony growth and the likelihood of the difference needs to be clarified. Is there any difference in interpreting the results if this mechanism is considered?

      We will improve the description of the model assumptions and the interpretation of the data on the basis of these assumptions.

      Although hnf-4 and foxF have been silenced together to validate the model, a deeper understanding of the tgs-1+ cell type and the non-significant reduction of tgs-1+ neoblasts in zfp-1 RNAi colonies is necessary, considering a high neural lineage frequency.

      We will improve the analysis of this result in light of the experimentally determined frequency of the tgs-1+ neoblast population.

      Recommendations for the authors

      Reviewing Editor Comments:

      After consultation, we have compiled a list of the key changes to be made to the manuscript, along with reviewer-specific recommendations to follow.

      (1) Include a section that explicitly describes the assumptions and limitations of the study, particularly with respect to the following assumptions:

      We thank the reviewers for the comment. We added a description of the model assumptions in the methods section “Assumptions underlying neoblast colony growth model”.

      a) All known types of specialized neoblasts cycle at the same rate (see points from Reviewer 1).

      We thank the reviewers for the comment. The current data used to estimate τ (Lei et al., Dev Cell, 2016) does not allow the direct estimation of individual cycling behaviors. Consequently, we assume that all specialized neoblasts cycle at the same average rate, a simplification supported by the model's accurate prediction of colony growth.

      b) The assumption that any FSTF-like gene would behave like zfp1 or foxF and hnfA genes. The manuscript does not mention that there may be fundamental differences among these different FSTFs that could be uncovered by future work. A strong addition to the paper would be to test other epithelial genes (e.g. p53, chd4, egr5) to show reproducible behavior within a single lineage.

      We thank the reviewers for the comment. Colony size reduction following inhibition of Smed-p53 and failure to produce epidermal progenitors is strongly supported by previous analysis (Wagner et al., Cell Stem Cell, 2012). We refer to this observation in the paper in the section titled: “Inhibition of zfp-1 does not induce overexpression of other lineages in homeostasis”. We added the following sentence to the discussion (Line 460-462): Interestingly, suppression of Smed-p53, a TF expressed in neoblasts and required for epidermal cell production, has resulted in a similar reduction in colony size (Wagner et al., Cell Stem Cell, 2012).

      Of note, Chd4 expression is not limited to specialized neoblasts or to a specific lineage (Scimone et al., Development, 2010), and therefore its inhibition likely has a more complex outcome than an effect on a single lineage. Furthermore, egr-5 is not expressed in neoblasts (Tu et al., eLife, 2015), making this experimental condition more challenging to examine in the context of neoblast colonies at the time points assessed in this study.

      c) The fact that the data used to feed the model relies on radiated animals which are likely to have altered cell cycle rates compared to unirradiated animals (see comment by Reviewer 1). Of note, the model predicts a steady increase in colony size, but colony size does not change between 9dpi and 12dpi.

      We thank the reviewers for the comment. The colony size in control animals increased between 9 and 12 dpi (Fig 3B), as predicted by the model. In zfp-1 (RNAi) animals, the median colony size has also increased over this period, at a slower rate, which we attribute to the increase in q. We attribute the unchanged average colony size to an increase in the frequency of cells failing to proliferate, because of selection of a fate they cannot fully differentiate into.

      d) In light of both reviewers' comments about colony expansion vs. feedback, the authors should discuss how predicted changes to division frequencies might change as homeostasis is reached, or explain how their model accounts for the predicted rate differences under homeostatic conditions in which overall neoblast numbers do not change. Can the model estimate when this transition might occur?

      We thank the reviewers for the comment. Our colony assays are constrained by the animals' survival following sub-total irradiation (16 to 20 days). In this timeframe, the neoblast population is overwhelmingly smaller than in non-irradiated animals. Therefore, the animals do not reach homeostasis during the experiment, and the model does not allow us to estimate the time the system would need to return to homeostasis.

      (2) In Figure 2D, the assumption is that these adjacent smedwi-1+ cells are sisters. Previous data analyzing this relied on EdU or H3P staining to show a shared division history. When these images were collected is therefore extremely critical to include (the methods suggest 7, 9, or 12 days). The authors should justify why they believe that these adjacent cells are derived from a single neoblast that has divided only once.

      We thank the reviewers for the comment. The images were collected at 7 dpi. We modified the figure legend and the associated methods to include this information. At this early time point, smedwi-1+ cell dyads are spatially separated from other neighboring cells, suggesting that they are the product of a single cell division. Importantly, our data is in complete agreement with previous estimates of symmetric renewal division rate (Raz et al., Cell Stem Cell, 2021; Lei et al, Developmental Cell, 2016).

      (3) Clarify the wording 'pre-selected' in the abstract as described by Reviewer 1.

      We thank the reviewers for the comment, and for clarity we replaced the wording “pre-select” with “select”. 

      (4) Experimental details that are important to the interpretation should be added. For example, how is belonging to a colony defined? This is important because some of the data (e.g. Figure S1A: similar numbers of smedwi-1+ cells are observed at 2dpi and 4dpi, but 4dpi is considered a colony whereas 2dpi is not). The timing of quantification should be included in each figure (it is missing in Figure S2, and Figure 3C and 3D). How the authors distinguish biological vs technical replicates is not mentioned.

      We thank the reviewers for the comment. Subtotal irradiation may result in formation of a spatially-isolated cluster of neoblasts that is not distributed throughout the animal (Wagner et al., Science, 2011). This localized cluster of neoblasts is defined as a neoblast colony (Wagner et al., Science, 2011; Wagner et al., Cell Stem Cell, 2012). The small number of high smedwi-1+ cells observed at 4 dpi in our experiments aligns with this definition (Fig S1A). By contrast, the low smedwi-1 expression detected across the animal 2 dpi does not fit this definition and likely reflects remnants of dying neoblasts resulting from irradiation. The following text was added to the figure legend: “isolated cells expressing low levels of smedwi-1+ were scattered in the planarian parenchyma, likely reflecting remnants of dying neoblasts”.

      (5) Figure 5F appears to use SMEDWI-1 antibody (based on capital letters and increased signal in the brain). Is this the case? The methods do not mention the use of a SMEDWI-1 antibody, and the text indicates that these are progenitors, but SMEDWI-1 protein is well known to not mark neoblasts. If the antibody was used, the authors should not claim that these are neoblasts.

      We thank the reviewers for the comment. The SMEDWI-1 antibody used in the experiments described in Figure 5F indeed labels neoblasts and their progeny (Guo et al., Developmental Cell, 2006). The methods section “Immunofluorescence combined with FISH” details the labeling procedure, which combines FISH and IF using this antibody.

      All microscopy images are difficult to see. Perhaps this is because they are formatted as CMYK images. They should be converted to RGB format to make them appear less dull.

      We thank the reviewer for the comment. Improved versions of the figures have now been uploaded.

      The terminology used in Figure 5 to describe upregulation should not be "overexpression".

      We thank the reviewers for the comment. We changed the terminology to “upregulated”.

      Reviewer #1 (Recommendations for the authors):

      I think the authors should include a section that explicitly lays out the assumptions and limitations of the study. For example, I believe that determining tau requires assuming that all different types of specialized neoblasts cycle at the same rates. Also there is the assumption that any FSTF-like gene would behave like zfp1 or foxF and hnfA genes. It seems to remain possible that a future study could find that a subset of FSTFs might indeed exert "either/or" decisions in fating, just not the particular genes under investigation here.

      We thank the reviewer for the comment. We added a description of the model assumptions in the methods section.

      In the abstract, the wording "pre-selected" is somewhat puzzling to me. I would interpret a preselection as a process that defines the next specified state prior to its manifestation. Instead, and as I understand the authors argue this as well, the study provides good evidence that the determination mechanism is random in that subsequent neoblast choices do not likely depend on prior states. So I would suggest changing that wording.

      We thank the reviewer for the comment. We replaced “pre-select” with “select”

      Is it possible to determine the uncertainty in measuring tau the cell cycle time and would this have an impact on subsequent modeling?

      We thank the reviewers for the comment. The current data that was used to estimate tau (Lei et al., Dev Cell, 2016) does not allow us to directly estimate the uncertainty in measuring τ.

      For lines 154-164 I would suggest doing a little more to explicitly write out the logic of determining the growth constants within the main text and not just in methods, for ease of reading.

      We thank the reviewer for the comment, and added explanations for how we determined the growth constant in the text. The text now reads (lines 160-166): “Considering an average cell cycle length of 29.7 hours, we calculated the value of q using the following approach: the probabilities of all cell division outcomes must sum to 1. Our experimental data showed that symmetric renewal (p) and asymmetric division (a) occur at equal rates (i.e., p = a). By fitting these parameters to the experimental data, we determined that the difference between the probabilities of symmetric renewal and symmetric differentiation (i.e., p - q) was = 0.345 (Fig 2E, S1D-E). Therefore, with these criteria, we estimated the probabilities of cell division outcomes in the colony as p = 0.45, a = 0.45, and q = 0.1 (Fig 2G; Methods).”
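
      The arithmetic in the quoted passage can be checked with a short sketch. It assumes only the three constraints stated there (p + a + q = 1, p = a, and p - q = 0.345); the function name `division_probabilities` is illustrative and does not come from the manuscript.

```python
def division_probabilities(p_minus_q: float) -> tuple[float, float, float]:
    """Solve p + a + q = 1 under the constraints a = p and p - q = p_minus_q."""
    # Substituting a = p and q = p - p_minus_q into p + a + q = 1
    # gives 3p - p_minus_q = 1, so p = (1 + p_minus_q) / 3.
    p = (1.0 + p_minus_q) / 3.0
    a = p                 # symmetric renewal and asymmetric division are equally likely
    q = p - p_minus_q     # symmetric differentiation takes the remainder
    return p, a, q


# 0.345 is the fitted value of p - q quoted in the response.
p, a, q = division_probabilities(0.345)
print(round(p, 2), round(a, 2), round(q, 2))
```

      To two decimal places this reproduces the reported values of p = 0.45, a = 0.45 and q = 0.1.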

      Line 192 why does post-mitotic progeny number linearly relate to neoblast number? In clones, a change in q has an exponential effect. I feel like I am missing something.

      We thank the reviewer for the comment. In colonies, 50% of cell divisions result in the production of post-mitotic progeny (asymmetric division). Therefore, the number of produced progenitors in a given cell cycle is linearly correlated with the number of neoblasts. This statement is in line with previous analysis of planarian colony size (Wagner et al., Cell Stem Cell, 2012).
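
      One way to reconcile the linear and exponential views is to write out the expected counts for a single cell cycle. The sketch below is not taken from the manuscript. It assumes that each of the N_k neoblasts present at cycle k divides once, that an asymmetric division yields one post-mitotic cell, and that a symmetric differentiation yields two.

```latex
% Expected post-mitotic progeny produced during cycle k (linear in N_k):
\mathbb{E}[\text{progeny}_k] = (a + 2q)\, N_k
% Expected neoblast number at the next cycle (geometric in k):
\mathbb{E}[N_{k+1}] = (2p + a)\, N_k = (1 + p - q)\, N_k
```

      Under these assumptions the progeny produced within one cycle scale linearly with the neoblast number at that cycle, while a change in q compounds over successive cycles through the factor (1 + p - q).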

      Line 103: it also seems possible, although less likely, that the specified state is not fixed within a given cell cycle, and that cells that try to switch into zeta-neoblasts mid-cell cycle arrest in proliferation, etc., just for that time.

      We thank the reviewer for the comment and agree that this is a possibility. However, our observations suggest that incorporating this factor into the model is unnecessary for accurately predicting colony size.

      In terms of the feedback mechanism proposed to operate in homeostasis, I think in the case of zfp-1 it is quite likely that loss of epidermal differentiation results in wound responses (this phenomenon has been documented in egr-5 RNAi in Tu et al 2015 I believe). This could play out differently in the clone assay because the effects of sublethal irradiation on this process would predominate in both control versus zfp1(RNAi) conditions.

      We thank the reviewer for the comment. Our RNA-seq analysis following zfp-1 inhibition did not show overexpression of injury-induced genes at an early time point (6 days; Fig. 5B-C). However, an increase in cycling cells was detected much earlier via EdU labeling (3 days; Fig. 5D). In the case of egr-5 suppression, Tu et al. analyzed injury-induced gene expression at a later stage (21 days of RNAi), where they found significant epidermal defects (see Fig. 5C in Tu et al.). We agree that sublethal irradiation effects likely predominate in colony analysis for both control and zfp-1 (RNAi) animals. In homeostasis, additional factors likely influence cell proliferation and differentiation.

      It seems likely that some of the differences noted between homeostasis and clone growth could ultimately arise from the different growth parameters under each setting. Could the rate parameters be estimated from prior data in homeostasis as well? It seems to me that with the framework the authors use, homeostasis must involve a net zero change to neoblast abundance (also shown by Wagner 2011 by the sigmoidal curve of neoblast abundance at the endpoint of clone expansion). Therefore, in these conditions p=q by definition. Experimental evidence from Lei 2016 (Figure S7M) suggests asymmetric divisions and symmetric renewing divisions are about equally abundant (5/12, 42% symmetric renewing vs 7/12, 58% asymmetric renewing). Therefore, under homeostasis, there would be an estimated p=q=0.3 and a=0.4. Compared to clone growth conditions then, in homeostasis, it seems that roughly the rate of symmetric renewal decreases and the rate of symmetric differentiation increases. I wonder, could this kind of difference potentially account for the differences between homeostasis and clone expansion settings? It is also worth noting that the clone expansion context has been used as a sensitized genetic background for identifying effects of gene inhibition on neoblast self-renewal, so perhaps the reason this works is that the rates of self-renewal are relatively lower in homeostasis, so that clone expansion represents a case where there is greater demand for self-renewal.

      We thank the reviewer for the comment. We agree that under homeostatic conditions, where the population size remains stable, the average probability of symmetric renewal matches the average probability of symmetric differentiation or elimination. By contrast, during colony expansion, the probability of symmetric renewal exceeds that of symmetric differentiation or elimination. The differences in response to a lineage block between homeostasis and colony expansion can have multiple interpretations. However, data from homeostatic animals does not permit the analysis of individual neoblasts or their specific responses to a lineage block. Consequently, we cannot determine whether the proliferative response following the lineage block during homeostasis is a direct response to the lineage block or an indirect effect resulting from changes in other neoblasts. We discuss these possibilities further in lines 472 - 484.
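
      The contrast between the two regimes can be illustrated with a minimal sketch. It assumes a simple expectation-only growth rule in which the neoblast number is multiplied by (1 + p - q) once per cell cycle, with no feedback and no cell-to-cell variability; this rule and the function name `expected_neoblasts` are illustrative and are not taken from the manuscript. The colony values are the estimates quoted in the response, and the homeostasis values are the reviewer's rough estimate.

```python
TAU_HOURS = 29.7  # average cell cycle length quoted in the author response


def expected_neoblasts(n0: float, p: float, q: float, days: float,
                       tau_hours: float = TAU_HOURS) -> float:
    """Expected neoblast number after `days`, assuming one division per cycle.

    Each division adds one neoblast with probability p (symmetric renewal),
    removes one with probability q (symmetric differentiation), and leaves
    the count unchanged otherwise, so the expected factor per cycle is 1 + p - q.
    """
    cycles = days * 24.0 / tau_hours
    return n0 * (1.0 + p - q) ** cycles


# Colony expansion estimates from the response: p = 0.45, q = 0.10.
colony = expected_neoblasts(1, p=0.45, q=0.10, days=12)
# Reviewer's homeostasis estimate: p = q = 0.3, so the expected size stays flat.
homeostasis = expected_neoblasts(1, p=0.30, q=0.30, days=12)
print(colony, homeostasis)
```

      In this simplified picture any p greater than q yields geometric expansion, whereas p = q keeps the expected population constant regardless of the value of a.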

      In terms of the memory effect, I recall some arguments presented in the Raz 2021 study that were consistent with a slight memory for neoblast specification being retained. I believe this was a minor point from detecting a slightly higher likelihood of identifying 2-cell clones that both took on prog1+ identity compared to the population average. If this is the case, it may be worth the authors commenting on reconciling those observations with their model.

      We thank the reviewer for their comment. Raz et al. (Cell Stem Cell, 2021) reported that in the asymmetric division of a zeta-neoblast, which generates a prog-2+ cell and a neoblast, there was a slightly higher observed frequency of zfp-1 expression in the neoblast compared to the expected rate (Expected: 32%, Observed: 44%). This small increase may reflect a mild memory effect, experimental variability, or both. However, statistical analysis using Fisher's exact test yielded a non-significant p-value (p = 0.1), suggesting that this difference could be attributed to experimental variability. Other data from Raz et al., such as lineage representation in early colonies, also did not show significant memory effects, indicating that any such effects, if present, are minimal and difficult to detect. Therefore, while we do not, and cannot, rule out the presence of minor memory effects, we expect that effects of this magnitude will have minimal impact on our model.
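
      For readers unfamiliar with the test mentioned above, the sketch below shows how a two-sided Fisher's exact p-value is computed for a 2x2 table. The counts in the example are toy values chosen only to make the calculation easy to follow; they are not the counts from Raz et al., which are not given here. The function name `fisher_exact_two_sided` is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are no more probable than the observed one
    # (a small relative tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Toy table, NOT data from the study: 3 of 4 versus 1 of 4.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(p_value)
```

      With small counts such as these, even a visibly skewed table gives a large p-value, which is the general reason a modest excess over expectation can remain statistically non-significant.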

      Reviewer #2 (Recommendations for the authors):

      Figure 2C and 2D:

      Please provide the specific time points for the data presented.

      We thank the reviewer for the comment. The information was added to the figure legend.

      Colony growth and homeostasis:

      It would be beneficial to estimate a time point at which colony growth transitions to a model with a cell-cell feedback mechanism, similar to that observed in homeostasis. This would help in understanding the dynamics and timing of these processes.

      We thank the reviewers for the comment. Our colony assays were constrained by the animals' survival following sub-total irradiation (16 to 20 days). Neoblast numbers are substantially reduced compared to unirradiated animals, preventing us from determining the time point at which homeostasis is achieved.

      Methods:

      μl should be μL  

      The text was changed accordingly.

      Line 526: H2O should be H₂O

      The text was changed accordingly.

    1. eLife Assessment

      This important and well-written study uses functional neuroimaging in human observers to provide compelling evidence that activity in the early visual cortex is suppressed at locations that are frequently occupied by a task-irrelevant but salient item. This suppression appears to be general to any kind of stimulus and also occurs in advance of any item actually appearing. The work will be of great interest to psychologists and neuroscientists examining attention, perception, learning and prediction.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigated if/how distractor suppression derived from statistical learning may be implemented in early visual cortex. While in a scanner, participants conducted a standard additional singleton task in which one location more frequently contained a salient distractor. The results showed that activity in EVC was suppressed for the location of the salient distractor as well as for neighbouring neutral locations. This suppression was not stimulus specific - meaning it occurred equally for distractors, targets and neutral items - and it was even present in trials in which the search display was omitted. Generally, the paper was clear, the experiment was well-designed, and the data are interesting.

      The authors addressed all of my concerns and the revised manuscript will make a beautiful addition to the literature.

    3. Reviewer #2 (Public review):

      The authors of this work set out to test ideas about how observers learn to ignore irrelevant visual information. Specifically, they used fMRI to scan participants who performed a visual search task. The task was designed in such a way that highly salient but irrelevant search items were more likely to appear at a given spatial location. With a region-of-interest approach, the authors found that activity in visual cortex that selectively responds to that location was generally suppressed, in response to all stimuli (search targets, salient distractors, or neutral items), as well as in the absence of an anticipated stimulus.

      Strengths of the study include: A well-written and well-argued manuscript; clever application of a region of interest approach to fMRI design, which allows articulating clear tests of different hypotheses; careful application of follow-up analyses to rule out alternative, strategy-based accounts of the findings; tests of the robustness of the findings to detailed analysis parameters such as ROI size; and exclusion of the role of regional baseline differences in BOLD responses. The main findings are enhanced by supplementary analyses that distinguish between the responses of early visual areas.

      The study provides an advance over previous studies, which identified enhancement or suppression in visual cortex as a function of search target/distractor predictability, but in a less spatially-specific way. It also speaks to open questions about whether such suppression/enhancement is observed only in response to the arrival of visual information, or instead is preparatory, favouring the latter view. These questions have been at the heart of theoretical debates in this literature on how distractor suppression unfolds in the context of visual search.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This well-written report uses functional neuroimaging in human observers to provide convincing evidence that activity in the early visual cortex is suppressed at locations that are frequently occupied by a task-irrelevant but salient item. This suppression appears to be general to any kind of stimulus, and also occurs in advance of any item actually appearing. The work in its present form will be valuable to those examining attention, perception, learning and prediction, but with a few additional analyses could more informatively rule out potential alternative hypotheses. Further discussion of the mechanistic implications could clarify further the broad extent of its significance. 

      We thank the editor and the reviewers for the positive evaluation of our manuscript and the thoughtful comments. Below we provide a detailed point-by-point reply to the reviewers’ comments.

      In addition to addressing the reviewers' comments, we have improved the figure legends by explicitly describing the type of error bars depicted in the figures, information which was previously only listed in the Materials and Methods section. Specifically, the statement: “Error bars denote within-subject SEM” was added to several figures, as applicable. We believe that briefly reiterating this information in the figure legends enhances clarity and enables readers to interpret the results more accurately and efficiently. We also updated our code and data sharing statement, as well as opened the repository for the public: “Analysis and experiment code, as well as data required to replicate the results reported in this manuscript are available here: https://doi.org/10.17605/OSF.IO/G4RXV. Raw MRI data is available upon request.”
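      One common way to obtain within-subject error bars is to remove between-subject offsets before computing the SEM. The sketch below assumes that normalization approach; the response does not state which variant was used, and the function name and data layout (one row per participant, one column per condition) are illustrative only.

```python
import math

def within_subject_sem(data):
    """Per-condition SEM after removing each participant's overall offset.

    data: list of rows, one row per participant, one value per condition.
    Assumed normalization: subtract the participant mean, add the grand mean.
    """
    n_subj = len(data)
    n_cond = len(data[0])
    grand_mean = sum(sum(row) for row in data) / (n_subj * n_cond)

    # Remove between-subject differences while keeping condition differences.
    normed = [[v - sum(row) / n_cond + grand_mean for v in row] for row in data]

    sems = []
    for c in range(n_cond):
        col = [row[c] for row in normed]
        mean_c = sum(col) / n_subj
        var_c = sum((v - mean_c) ** 2 for v in col) / (n_subj - 1)
        sems.append(math.sqrt(var_c / n_subj))
    return sems

# Hypothetical data: participants differ only by a constant offset.
offset_only = [[1.0, 2.0], [11.0, 12.0], [21.0, 22.0]]
sems_offset_only = within_subject_sem(offset_only)
```

      In this hypothetical case the participants differ only in their overall level, so the normalized values coincide and the within-subject SEM vanishes, whereas a conventional between-subject SEM would be large.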

      Public Reviews

      Reviewer #1 (Public review): 

      Summary: 

      The authors investigated if/how distractor suppression derived from statistical learning may be implemented in early visual cortex. While in a scanner, participants conducted a standard additional singleton task in which one location more frequently contained a salient distractor. The results showed that activity in EVC was suppressed for the location of the salient distractor as well as for neighbouring neutral locations. This suppression was not stimulus specific - meaning it occurred equally for distractors, targets and neutral items - and it was even present in trials in which the search display was omitted. Generally, the paper was clear, the experiment was well-designed, and the data are interesting. Nevertheless, I do have several concerns mostly regarding the interpretation of the results. 

      (1) My biggest concern with the study is regarding the interpretation of some of the results. Specifically, regarding the dynamics of the suppression. I appreciate that there are some limitations with what you might be able to say here given the method but I do feel as if you have committed to a single interpretation where others might still be at play. Below I've listed a few alternatives to consider. 

      We agree with the reviewer that there are important alternatives to consider. Adequately addressing these alternatives will substantially increase the inferences we can draw from our data. Therefore, we address each alternative interpretation in detail below.

      (a) Sustained Suppression. I was wondering if there is anything in your results that would speak for or against the suppression being task specific. That is, is it possible that people are just suppressing the HPDL throughout the entire experiment (i.e., also through ITI, breaks, etc., rather than just before and during the search). Since the suppression does not seem volitional, I wonder if participants might apply a blanket suppression to HPDL until they learn otherwise. Since your localiser comes after the task you might be able to see hints of sustained suppression in the HPDL during these trials.

      It is indeed possible that participants suppressed the HPDL throughout the entire experiment, instead of proactively instantiating suppression on each trial. While possible, we believe that this account is less likely to explain the present results, given the utilized analysis approach, a voxel-wise GLM fit to the BOLD data per run (see Materials and Methods for details). Specifically, we derived parameter estimates from this GLM per location to estimate the relative suppression. Sustained suppression would modulate BOLD responses throughout the run, i.e. presumably also during the implicit baseline period used to estimate the contrast parameter estimates per location. Hence, sustained suppression should not result in a differential modulation between locations, as the BOLD response at the HPDL during the baseline period would be equally suppressed as during the trial. Inspired by the reviewer’s comment, we now clarify this critical point in the manuscript’s Discussion section:

      “Third, participants might have suppressed the HPDL consistently throughout the experiment. This sustained suppression account differs from the proactive suppression proposed here. While this alternative is plausible, we believe that it is less likely to account for the present results, given the analysis conducted. Specifically, we computed voxel-wise parameter estimates and contrasted the obtained betas between locations. Under a sustained suppression account, the HPDL would show suppression even during the implicit baseline period, which would obscure the observed BOLD suppression at and near the HPDL.” 
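      The cancellation argument above can be made concrete with a toy calculation. This is only a sketch under an assumed additive model in arbitrary units; the function names and the values (an evoked response of 1.0 and a suppression of 0.25) are hypothetical and are not taken from the manuscript or its analysis code.

```python
def beta_vs_baseline(evoked, baseline):
    """Parameter estimate expressed relative to the implicit baseline."""
    return evoked - baseline

def location_contrast(suppress_trial, suppress_baseline, evoked=1.0, rest=0.0):
    """Return beta(NL-far) minus beta(HPDL).

    suppress_trial:    assumed signal reduction at the HPDL during search trials
    suppress_baseline: assumed signal reduction at the HPDL during the baseline
    """
    beta_far = beta_vs_baseline(evoked, rest)
    beta_hpdl = beta_vs_baseline(evoked - suppress_trial, rest - suppress_baseline)
    return beta_far - beta_hpdl

# Sustained account: the HPDL is suppressed equally during baseline and trials.
sustained = location_contrast(suppress_trial=0.25, suppress_baseline=0.25)

# Trial-locked account: suppression is present on trials but not at baseline.
trial_locked = location_contrast(suppress_trial=0.25, suppress_baseline=0.0)
```

      Under these assumptions, suppression that is equally present during baseline and trial periods cancels out of the location contrast, whereas suppression confined to the trial period leaves a non-zero difference between NL-far and the HPDL.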

      (b) Enhancement followed by suppression. Another alternative that wasn't discussed would be an initial transient enhancement of the HPDL which might be brought on by the placeholders followed by more sustained suppression through the search task. Of course, on the whole this would look like suppression, but this still seems like it would hold different implications compared to simply "proactive suppression". This would be something like search and destroy however could be on the location level before the actual onset of the search display.  

      R1 correctly points out that BOLD data, given the poor temporal resolution, do not allow for the detection of potential transient enhancements at the HPDL followed by a later and more pronounced suppression (akin to “search and destroy”). We fully agree with this assessment. However, we also argue that a transient enhancement followed by sustained suppression before search display onset constitutes proactive suppression in line with our interpretation, because suppression would still arise proactively (i.e., before search, and hence distractor, onset). Whether transient enhancement precedes suppression cannot be elucidated by our data, but we believe that it constitutes an interesting avenue for future studies using time-resolved and spatially specific recording methods. We now clarify this important implementational variation in the updated manuscript.

      “Finally, due to the limited temporal resolution of BOLD data, the present data do not elucidate whether the present suppression is preceded by a brief attentional enhancement of the HPDL, as implied by some prior work (Huang et al., 2024). On this account the HPDL would see transient enhancement, followed by sustained suppression, akin to a ‘search and destroy’ mechanism. Critically, we believe that this variation would nonetheless constitute proactive distractor suppression as the suppression would still arise before search onset. Using temporally and spatially resolved methods to explore potential transient enhancements preceding suppression is a promising avenue for future research charting the neural mechanisms underlying distractor suppression.”

      (2) I was also considering whether your effects might be at least partially attributable to priming type effects. This would be on the spatial (not feature) level as it is clear that the distractors are switching colours. Basically, is it possible that on trial n participants see the HPDL with the distractor in it and then on trial n+1 they suppress that location. This would be something distinct from the statistical learning framework and from the repetition suppression discussion you have already included. To test for this, you could look at the trials that follow omission or trials. If there is no suppression or less suppression on these trials it would seem fair to conclude that the suppression is at least in part due to the previous trial. 

      We agree with the reviewer that it is plausible that participants particularly suppress locations which on previous trials contained a distractor. To address this possibility, we conducted a new analysis and adjusted the manuscript accordingly:

      “Second, participants may have suppressed locations that contained the distractor on the previous trial, reflecting a spatial priming effect. This account constitutes a complementary but different perspective than statistical learning, which integrates implicit prior knowledge across many trials. We ruled out that spatial priming explains the present results by contrasting BOLD suppression magnitudes on trials with the distractor at the HPDL and trials where the distractor was not at the HPDL on the previous trial. Results, depicted in Supplementary Figure 4, showed that distractor suppression was statistically significant across both trial types, including trials without a distractor at the HPDL on the preceding trial. This indicates that the observed BOLD suppression is unlikely to be driven by priming and is instead more consistent with statistical learning. Moreover, results did not yield a statistically significant difference between trial types based on the distractor location in the preceding trial. However, these results should not be taken to suggest that spatial priming cannot contribute to distractor suppression; for details see: Supplementary Figure 4.” (p. 13).

      We note that this analysis approach slightly differs from the reviewer’s suggestion, which considered omission trials. However, we decided to exclude trials immediately following an omission to ensure that both conditions were matched as closely as possible. In particular, omission trials represent extended rest periods, which could alter participants’ state and especially modulate the visually evoked BOLD responses (e.g., potentially increasing the dynamic range) compared to trials that did not follow omissions. Our analysis approach avoids this difference while still addressing the hypothesis put forward by the reviewer. We now provide the full explanation and results figure of this priming analysis in the figure text of Supplementary Figure 4: 

      Reviewer #2 (Public review): 

      The authors of this work set out to test ideas about how observers learn to ignore irrelevant visual information. Specifically, they used fMRI to scan participants who performed a visual search task. The task was designed in such a way that highly salient but irrelevant search items were more likely to appear at a given spatial location. With a region-of-interest approach, the authors found that activity in visual cortex that selectively responds to that location was generally suppressed, in response to all stimuli (search targets, salient distractors, or neutral items), as well as in the absence of an anticipated stimulus. 

      Strengths of the study include: A well-written and well-argued manuscript; clever application of a region of interest approach to fMRI design, which allows articulating clear tests of different hypotheses; careful application of follow-up analyses to rule out alternative, strategy-based accounts of the findings; tests of the robustness of the findings to detailed analysis parameters such as ROI size; and exclusion of the role of regional baseline differences in BOLD responses. 

      We thank the reviewer for the positive evaluation of our manuscript.

      The report might be enhanced by analyses (perhaps in a surface space) that distinguish amongst the multiple "early" retinotopic visual areas that are analysed in the aggregate here. 

      We agree with the reviewer that an exploratory analysis separating early visual cortex (EVC) into its retinotopic areas could be an interesting addition. Our reasoning to combine early visual areas into one mask in the original analyses was two-fold: First, we did not have an a priori reason to expect distinct neural suppression between these early ROIs. Therefore, we did not acquire retinotopy data to reliably separate early visual areas (e.g. V1, V2 and V3), instead opting to increase the number of search task trials. The lack of retinotopy data inherently limits the reliability of the resulting cortical segmentation. However, we now performed an analysis separating early visual cortex into V1 and V2 and report the details as Supplementary Text 1:

      “In an exploratory analysis we investigated whether subdivisions of EVC exhibit different representations of priority signals. In brief, we used FreeSurfer to reconstruct brain surfaces (recon-all) from each subject’s anatomical scan. From these reconstructions we derived V1_exvivo and V2_exvivo labels, which were transformed into volume space using ‘mri_label2vol’ and merged into a bilateral mask for each ROI. We then selected the voxels within each ROI that were most responsive to the four stimulus locations, based on independent localizer data. This voxel selection followed the procedure outlined in the Materials and Methods: Region of Interest (ROI) Definition. To accommodate the subdivision into two ROIs (V1 and V2) compared to the single EVC ROI in the main analysis, we halved the number of voxels selected per location. Finally, we applied the same ROI analysis to investigate distractor suppression during search and omission trials, following the procedure described in Materials and Methods: Statistical Analysis. 

      Results of this more fine-grained ROI analysis are depicted in Supplementary Figure 1. First, the results from V2 qualitatively mirrored our primary ROI analysis. BOLD responses in V2 differed significantly between stimulus types (main effect of stimulus type: F<sub>(2,54)</sub> = 31.11, p < 0.001, 𝜂 = 0.54). Targets elicited larger BOLD responses compared to distractors (t<sub>(27)</sub> = 3.05, p<sub>holm</sub> = 0.004, d = 0.06) and neutral stimuli (t<sub>(27)</sub> = 7.82, p<sub>holm</sub> < 0.001, d = 0.14). Distractors also evoked larger responses than neutral stimuli (t<sub>(27)</sub> = 4.78, p<sub>holm</sub> < 0.001, d = 0.09). These results likely reflect top-down modulation due to target relevance and bottom-up effects of distractor salience. Consistent with the primary ROI analysis, the manipulation of distractor predictability showed a distinct pattern of location-specific BOLD suppression in V2 (main effect of location: F<sub>(1.1,52.8)</sub> = 5.01, p = 0.030, 𝜂 = 0.16). Neural populations with receptive fields at the HPDL showed significantly reduced BOLD responses compared to the diagonally opposite neutral location (NL-far; post hoc test HPDL vs NL-far: t<sub>(27)</sub> = 2.69, p<sub>holm</sub> = 0.022, d = 0.62). Again, this suppression was not confined to the HPDL but also extended to close by neutral locations (NL-near vs NL-far: t<sub>(27)</sub> = 2.79, p<sub>holm</sub> = 0.022, d = 0.65). BOLD responses did not differ between HPDL and NL-near locations (HPDL vs NL-near: t<sub>(27)</sub> = 0.11, p<sub>holm</sub> = 0.915, d = 0.03; BF<sub>10</sub> = 0.13). As in the EVC ROI analysis, this suppression pattern was consistent across distractor, target, and neutral stimuli presented at the HPDL and NL-near locations compared to NL-far. 
In sum, neural responses in V2 were significantly modulated by the distractor contingencies, evident as reduced BOLD responses in neural populations with receptive fields at the HPDL and neutral locations near the location of the frequent distractor (NL-near), relative to the neutral location diagonally across the HPDL (NL-far). 

      In V1, BOLD responses also differed significantly between stimulus types (main effect of stimulus type: F<sub>(1.3,35.6)</sub> = 6.69, p = 0.009, 𝜂 = 0.20). Targets elicited larger BOLD responses compared to neutral stimuli (t<sub>(27)</sub> = 3.52, p<sub>holm</sub> = 0.003, d = 0.12) and distractors evoked larger responses than neutral stimuli (t<sub>(27)</sub> = 2.62, p<sub>holm</sub> = 0.023, d = 0.09). However, no difference between targets and distractors was observed (t<sub>(27)</sub> = 0.90, p<sub>holm</sub> = 0.375, d = 0.03; BF<sub>10</sub> = 0.17), suggesting reduced sensitivity to task-related effects in V1. Indeed, analyzing the effect of distractor predictability for BOLD responses in V1 showed a different result than in V2 and the combined EVC ROI. There was no significant main effect of location (F<sub>(2,54)</sub> = 2.20, p = 0.120, 𝜂 = 0.08; BF<sub>10</sub> = 0.77). BOLD responses at NL-near and NL-far were similar (BF<sub>10</sub> = 0.171), with the only reliable difference found between target stimuli at the HPDL and NL-far locations (W = 94, p<sub>holm</sub> = 0.012, r = 0.54).”  

      We include the new result figure as Supplementary Figure 5.

      We now include reference to these results in the manuscript’s Discussion section:

      “Are representations of priority signals uniform across EVC? A priori we did not have any hypotheses regarding distinct neural suppression profiles across different early visual areas, hence our primary analyses focused on stimulus responses of neural populations in EVC, irrespective of subdivision. However, an exploratory analysis suggests that distractor suppression may show different patterns in V1 compared to V2 (Supplementary Figure 5 and Supplementary Text 1). In brief, results in V2 mirrored those reported for the combined EVC ROI (Figure 4). In contrast, results in V1 appeared to be only partially modulated by distractor contingencies, and if so, the modulation was less robust and not as spatially broad as in V2. This suggests the possibility of different effects of distractor predictability across subdivisions of early visual areas. However, these results should be interpreted with caution. First, our design did not optimize the delineation of early visual areas (e.g., no functional retinotopy), limiting the accuracy of V1 and V2 segmentation. Additionally, analyses were conducted in volumetric space, which further reduces spatial precision. Future studies could improve this by including retinotopy runs to accurately delineate V1, V2, and V3, and by performing analyses in surface space. Higher-resolution functional and anatomical MRI sequences would also help elucidate how distractor suppression is implemented across EVC with greater precision.”

      Furthermore, the study could benefit from an analysis that tests the correlation over observers between the magnitude of their behavioural effects and their neural responses. 

      R2 highlights that behavioral facilitation and neural suppression could be correlated across participants. The rationale is that if neural suppression in EVC is related to the facilitation of behavioral responses, we should expect a positive relationship between neural suppression at the HPDL and RTs across participants. In this analysis we focused on the contrast between HPDL and NL-far, as this contrast was statistically significant in both the RT (Figure 2) and the neural suppression analysis (Figure 4). First, we computed for each participant the behavioural benefit of distractor suppression as: RT<sub>facilitation</sub> = RT<sub>NL-far</sub> – RT<sub>HPDL</sub>. Thereby RT facilitation reflects the response speeding due to a distractor appearing at the high probability distractor location compared to the far neutral location. Next, we computed neural suppression as: BOLD<sub>suppression</sub> = BOLD<sub>NL-far</sub> – BOLD<sub>HPDL</sub>. Thus, positive values reflect the suppression of BOLD responses at the HPDL compared to the NL-far location. The BOLD suppression index was computed for each stimulus type separately, as in the main ROI analysis (i.e. for Targets, Neutrals and Distractors). Finally, we correlated RT<sub>facilitation</sub> with BOLD<sub>suppression</sub> across participants using Pearson correlation. Results showed a small, but not statistically significant correlation between RT facilitation and BOLD suppression for distractor (r<sub>(26)</sub> = 0.22, p = 0.257), target (r<sub>(26)</sub> = 0.10, p = 0.598) and neutral (r<sub>(26)</sub> = 0.13, p = 0.519) stimuli. Thus, while the direction of the correlation was in line with the speculation by the reviewer in the “Recommendations for the authors”, results were not statistically reliable and therefore inconclusive. As also noted in our preliminary reply to the reviewer comments, it was a priori unlikely that this analysis would yield a statistically significant correlation. 
An a priori power analysis suggested that, to reach a power of 0.8 at a standard alpha of 0.05, given the present sample size of n=28, the effect size would need to exceed r = 0.75, which seemed unlikely for the correlation of behavioural and neural difference scores. Given the inconclusive nature of the results, we prefer not to include this additional analysis in the manuscript, as we believe that it does not add to the main message of the paper, but it remains accessible to the interested reader in the public “peer review process”.
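      The two difference scores and their across-participant correlation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the helper names and the four-participant values are hypothetical, and only the index definitions (NL-far minus HPDL for both RT and BOLD) and the n − 2 degrees of freedom come from the response.

```python
import math

def difference_scores(nl_far, hpdl):
    """Per-participant index: value at NL-far minus value at the HPDL."""
    return [far - near for far, near in zip(nl_far, hpdl)]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for four participants (RT in ms, BOLD in arbitrary units).
rt_nl_far = [620.0, 655.0, 640.0, 700.0]
rt_hpdl = [600.0, 640.0, 610.0, 660.0]
bold_nl_far = [1.10, 0.95, 1.30, 1.20]
bold_hpdl = [1.00, 0.90, 1.10, 0.95]

rt_facilitation = difference_scores(rt_nl_far, rt_hpdl)
bold_suppression = difference_scores(bold_nl_far, bold_hpdl)

r = pearson_r(rt_facilitation, bold_suppression)
df = len(rt_facilitation) - 2  # with n = 28 participants this gives df = 26
```

      Positive values of both indices indicate faster responses and lower BOLD signal at the HPDL, so a positive r corresponds to the relationship the reviewer asked about.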

      The study provides an advance over previous studies, which identified enhancement or suppression in visual cortex as a function of search target/distractor predictability, but in a less spatially-specific way. It also speaks to open questions about whether such suppression/enhancement is observed only in response to the arrival of visual information, or instead is preparatory, favouring the latter view. The theoretical advance is moderate, in that it is largely congruent with previous frameworks, rather than strongly excluding an opposing view or providing a major step change in our understanding of how distractor suppression unfolds. 

      We agree with the reviewer that our results are an advancement of prior work, particularly with respect to narrowing down the role of sensory areas and the proactive nature of distractor suppression. However, we argue that this represents a significant step forward for several reasons. First, to our knowledge, the literature on distractor suppression, and visual search in general, is by no means unanimous with respect to the conclusion that distractor suppression is instantiated proactively (Huang et al., 2021, 2022). Indeed, there are several studies suggesting the opposite account, reactive suppression (Chang et al., 2023), or contributions by both proactive and reactive mechanisms (Sauter et al., 2021; Wang et al., 2019). Moreover, studies in support of proactive distractor suppression did not investigate the involvement of (early) sensory areas during suppression. Conversely, to our knowledge most studies investigating the involvement of sensory cortex during distractor suppression did not address the question whether suppression arises proactively or reactively.

      Recommendations for the authors: 

      Reviewer #1 ( Recommendations for the authors): 

      Minor Points: 

      (1) There are several disconnects between the behaviour and the MR results - i.e. not stimulus specific yet there are no deficits for targets appearing at the HPDL, also no behavioural suppression for the NL-Near but neural suppression found. Nevertheless, the behaviour is used as a way to rule out potential attentional strategies when considering whether there is enhancement in the NL-Far condition. I realise you have a few other points here, but I think it's worth addressing what could be seen as a double standard.

      The reviewer points out an important concern, which we feel could have better been addressed in the manuscript. From our point of view a partial dissociation between neural modulations in EVC and eventual behavioural facilitation is not surprising, given the extensive neural processing beyond EVC required for behaviour. However, this assessment may differ, if one stresses an explicit volitional attentional strategy over an implicit statistical learning account. That said, we clearly do not want to create the impression of using a double standard. The lack of behavioural facilitation for targets at NL-far is not a critical part of our argument against explicit attentional strategies. Therefore, we rephrased the relevant paragraph in the Discussion section to now emphasize the importance of the control analysis excluding participants who reported the correct HPDL in the questionnaire (Figure 5), but nonetheless yielded qualitatively identical results to the main ROI analysis (Figure 4). In our opinion, this control analysis provides more compelling evidence against a volitional attentional strategy account without the risk of creating the impression of applying a double standard in the interpretation of behavioural data. Additionally, we now acknowledge the limitation of relying on behavioral data in ruling out volitional attentional strategies in the updated manuscript:

      “It is well established that attention enhances BOLD responses in visual cortex (Maunsell, 2015; Reynolds & Chelazzi, 2004; Williford & Maunsell, 2006). If participants learned the underlying distractor contingencies, they could deploy an explicit strategy by directing their attention away from the HPDL, for example by focusing attention on the diagonally opposite neutral location. This account provides an alternative explanation for the observed EVC modulations. However, while credible, the current findings are not consistent with such an interpretation. First, there was no behavioral facilitation for target stimuli presented at the far neutral location, contrary to what one might expect if participants employed an explicit strategy. However, given the partial dissociation between neural suppression in EVC and behavioral facilitation, additional neural data analyses are required to rule out volitional attention strategies. Thus, we performed a control analysis that excluded all participants that indicated the correct HPDL location in the questionnaire, thereby possibly expressing explicit awareness of the contingencies. This control analysis yielded qualitatively identical results to the full sample, showing significant distractor suppression in EVC. Therefore, it is unlikely that explicit attentional strategies, and the enhancement of locations far from the HPDL, drive the results observed here. Instead, the current findings are consistent with an account emphasizing the automatic deployment of spatial priors (He et al., 2022) based on implicitly learned statistical regularities.”

      (2) Does the level of suppression change in any way through the experiment? I.e., does it get stronger in the second vs. first half of the experiment? 

      The reviewer asks an interesting question, whether BOLD suppression may change across the experiment. To address this question, we performed an additional analysis testing BOLD suppression in EVC during the first compared to second half of the MRI experiment. Here we defined BOLD suppression as: BOLD<sub>suppression</sub> = ((BOLD<sub>NL-far</sub> – BOLD<sub>HPDL</sub>) + (BOLD<sub>NL-far</sub> – BOLD<sub>NL-near</sub>)) / 2. Thus, in this formulation of BOLD suppression we summarize the two primary BOLD suppression effects observed in our main results (Figure 4). Additionally, as we previously did not observe any significant differences in BOLD suppression magnitudes between different stimulus types (i.e. suppression was similar for target, distractor and neutral stimuli), we collapsed across stimulus types in this analysis.
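      As a small worked illustration of the combined index defined above, the following sketch computes it for one participant in an early and a late phase. The function name and all numeric values are hypothetical; only the formula itself is taken from the response.

```python
def combined_bold_suppression(nl_far, hpdl, nl_near):
    """Average of the two suppression effects relative to NL-far.

    Equivalent to nl_far minus the mean of hpdl and nl_near.
    """
    return ((nl_far - hpdl) + (nl_far - nl_near)) / 2

# Hypothetical parameter estimates for one participant (arbitrary units).
early = combined_bold_suppression(nl_far=1.5, hpdl=1.0, nl_near=1.25)
late = combined_bold_suppression(nl_far=1.5, hpdl=1.125, nl_near=1.125)

# Per-participant change across the experiment (late minus early phase).
change = late - early
```

      Because the index averages the HPDL and NL-near effects, the two hypothetical phases above yield the same value even though the individual locations differ, which is worth keeping in mind when interpreting a null difference between phases.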

      Results, depicted below, showed that during both the initial (Run 1+2) and later part (Run 4+5) of the MRI experiment BOLD suppression was statistically significant (BOLD suppression Run 1+2: W = 331, p = 0.003, r = 0.63; BOLD suppression Run 4+5: W = 320, p = 0.007, r = 0.58), confirming our main results of reliable distractor suppression even in this subset of trials. However, we did not observe any statistically significant differences between early and late runs of the experiment (t<sub>(27)</sub> = -0.21, p = 0.835, d = -0.04). In fact, a Bayesian paired t-test provided evidence for the absence of a difference in BOLD suppression between early compared to later runs (BF<sub>10</sub> = 0.205), suggesting that distractor suppression in EVC was stable throughout the experiment. A qualitatively similar pattern was evident during omission trials, with significant distractor suppression during early runs (t<sub>(27)</sub> = 2.70, p = 0.012, d = 0.51), but not quite a statistically significant modulation for later runs (t<sub>(27)</sub> = 1.97, p = 0.059, d = 0.37). Again, there was no evidence for a difference in suppression magnitudes across the experiment (W = 198, p = 0.920, d = -0.025) and support for the absence of a difference in BOLD suppression between early and late runs (BF<sub>10</sub> = 0.278).

      Author response image 1.

      Analysis of BOLD suppression magnitudes in EVC across the MRI experiment phases. BOLD suppression was comparable between early (Run 1+2) and late (Run 4+5) phases of the MRI experiment, suggesting consistent suppression in EVC following statistical learning. Error-bars denote within-subject SEM. * p < 0.05, ** p < 0.01, = BF<sub>10</sub> < 1/3.

      In sum, results suggest that distractor suppression in EVC was stable across runs and did not change significantly throughout the experiment. This result was a priori likely, given that participants already underwent behavioral training before entering the MRI. This enabled them to establish modified spatial priority maps, containing the high probability distractor location contingencies, already before the first MRI run. While speculative, it is possible that participants may still have consolidated the spatial priority maps during the initial runs, but that this additional consolidation is not evident in the data, as later runs may see less engagement by participants due to increasing fatigue towards the end of the MRI experiment. Indeed, rapid learning and stable suppression throughout the remainder of the experiment is also reported by prior work (Lin et al., 2021). We believe that it is highly interesting for future studies to investigate the development of distractor suppression across learning, with initial exposure to the contingencies inside the MRI. However, as the present results are inconclusive, we prefer to not include this analysis in the main manuscript, as it may not provide significant additional insight into the neural mechanisms underlying distractor suppression. 

      (3) In the methods vs. results you have reported the probabilities slightly differently. In the methods you say the HPDL was 6x more likely to contain a distractor whereas in the results you say 4x. Based on the reported trial numbers I think it should be 4, but probably you want to double check that this is consistent and correct throughout.

      We thank the reviewer for bringing this inconsistency to our attention. We have corrected this oversight in the adjusted manuscript: 

      “One of the four locations of interest was designated the high probability distractor location (HPDL), which contained distractor stimuli (unique color) four times more often than any of the remaining three locations of interest. In other words, if a distractor was present on a given trial (42 trials per run), the distractor appeared 57% (24 trials per run) at the HPDL and at one of the other three locations with equal probability (i.e., 14% or 6 trials per run per location).”
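
      The trial counts quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the authors' analysis code; the variable names are illustrative, and only the counts (42, 24, 6, and three remaining locations) are taken from the corrected text.

```python
from fractions import Fraction

# Counts taken from the corrected Methods text (per run).
DISTRACTOR_TRIALS_PER_RUN = 42
HPDL_TRIALS = 24
OTHER_LOCATION_TRIALS = 6      # per remaining location
N_OTHER_LOCATIONS = 3

# The per-location counts must add up to the number of distractor trials.
total = HPDL_TRIALS + N_OTHER_LOCATIONS * OTHER_LOCATION_TRIALS

# Exact probabilities of the distractor location, given a distractor is present.
p_hpdl = Fraction(HPDL_TRIALS, DISTRACTOR_TRIALS_PER_RUN)
p_other = Fraction(OTHER_LOCATION_TRIALS, DISTRACTOR_TRIALS_PER_RUN)

# How many times more often the HPDL holds the distractor than any other location.
ratio = p_hpdl / p_other

print(f"total distractor trials: {total}")
print(f"P(HPDL) = {float(p_hpdl):.0%}, P(other location) = {float(p_other):.0%}")
print(f"HPDL vs. single other location: {ratio}x")
```

      Under these counts the ratio is 4, which agrees with the reviewer's reading and with the corrected "four times" wording rather than the earlier "6x".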

      Reviewer #2 (Recommendations for the authors):

      The authors have performed their analyses in the volume rather than the surface, and have grouped together V1, V2, and V3 as "early visual cortex". As the authors' claims lean heavily on the idea that they are measuring "early" visual responses, the study would be improved by delineating the ROIs within these different retinotopic regions. Such an approach might be facilitated by analysing data on the reconstructed surface.

      Please refer to our reply to this analysis suggested in the Public review.

      The authors rightly tread carefully on the causal link between their neural findings and the behavioural outcomes. The picture might be clarified somewhat further by testing for a positive relationship between behavioural effect sizes and neural effect sizes across participants, e.g. to what extent is the search advantage when distractors are presented at the "HPDL" linked to greater suppression of BOLD at the HPDL region of early visual cortex?

      Please refer to our reply to this analysis suggested in the Public review.

      Some of the claims based on null hypotheses would be better supported by Bayesian tests e.g. page 6 "This pattern of results was the same regardless whether the distractor, target, or a neutral stimulus presented at the HPDL and NL-near locations compared to NL-far ..." and "BOLD responses between HPDL and NL-near locations did not reliably differ ..." This is similar to the approach that the authors adopted later in the section "Ruling out attentional modulation".

      We agree with the reviewer that our ROI analyses would benefit from providing evidence for the absence of a modulation. Accordingly, we updated our results by adding equivalent Bayesian tests. Bayes Factors were computed using JASP 0.18.2 (JASP Team, 2024; RRID:SCR_015823) with default settings; i.e. for Bayesian paired t-tests with a Cauchy prior width of 0.707. Qualitative interpretations of BFs were based on Lee and Wagenmakers (2014). We now report the obtained BF in the Results section. 

      “BOLD responses between HPDL and NL-near locations did not reliably differ (HPDL vs NL-near: t<sub>(27)</sub> = 0.47, p<sub>holm</sub> = 0.643, d = 0.08; BF<sub>10</sub> = 0.19).”

      And:

      “Neural responses at HPDL and NL-near did not reliably differ (t<sub>(27)</sub> = 0.21, p<sub>holm</sub> = 0.835, d = 0.04; BF<sub>10</sub> = 0.21).”

      Moreover, we now denote any equivalent results (defined as BF<sub>10</sub> < 1/3) in Fig. 4 and Fig. 5, and included the description of the associated symbol in the figure text (“ = BF<sub>10</sub> < 1/3”).
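
      The equivalence criterion described above (BF<sub>10</sub> < 1/3) can be written as a small decision rule. This is only an illustrative sketch: the function name and labels are hypothetical, and the upper cut-off of 3 for evidence in favour of a difference is an assumed symmetric counterpart that the response itself does not state. The Bayes factors are the four values reported in this response.

```python
def classify_bf10(bf10: float) -> str:
    """Label a Bayes factor BF10 using a 1/3 and 3 cut-off.

    The 1/3 threshold is the one named in the response; the threshold
    of 3 is an assumed symmetric counterpart, used here for illustration.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf10 < 1 / 3:
        return "evidence for absence of a difference"
    if bf10 > 3:
        return "evidence for a difference"
    return "inconclusive"


# Bayes factors reported in this author response.
reported_bf10 = {
    "HPDL vs NL-near (first quoted test)": 0.19,
    "HPDL vs NL-near (second quoted test)": 0.21,
    "early vs late runs, search trials": 0.205,
    "early vs late runs, omission trials": 0.278,
}

for label, bf10 in reported_bf10.items():
    print(f"{label}: BF10 = {bf10} -> {classify_bf10(bf10)}")
```

      All four reported values fall below 1/3, so each would receive the equivalence marker under this rule.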

      Additionally, we now also report the BF for all paired t-tests reported in Supplementary Table 1.

      Finally, we addressed the statement: “This pattern of results was the same regardless whether the distractor, target, or a neutral stimulus presented at the HPDL and NL-near locations compared to NL-far”. Our intention was to emphasize that the pattern of results reported in the preceding sentence was evident for distractor, target, and neutral stimuli, and not to suggest that the magnitude of the effect is the same. Hence, to more accurately reflect the results, we changed this sentence to: “This pattern of results was present regardless whether the distractor, target, or a neutral stimulus presented at the HPDL and NL-near locations compared to NL-far”

    1. eLife Assessment

      This valuable work presents how PRDM16 plays a critical role during choroid plexus development through regulating BMP signaling. Solid evidence supports the context-dependent gene regulatory mechanisms both in vivo and in vitro. The work will be of broad interest to researchers working on growth factor signaling mechanisms and vertebrate development.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript describes the role of PRDM16 in modulating BMP response during choroid plexus (ChP) development. The authors combine PRDM16 knockout mice and cultured PRDM16 KO primary neural stem cells (NSCs) to determine the interactions between BMP signaling and PRDM16 in ChP differentiation.

      They show PRDM16 KO affects ChP development in vivo and BMP4 response in vitro. They determine genes regulated by BMP and PRDM16 by ChIP-seq or CUT&TAG for PRDM16, pSMAD1/5/8, and SMAD4. They then measure gene activity in primary NSCs through H3K4me3 and find more genes are co-repressed than co-activated by BMP signaling and PRDM16. They focus on the 31 genes found to be co-repressed by BMP and PRDM16. Wnt7b is in this set and the authors then provide evidence that PRDM16 and BMP signaling together repress Wnt activity in the developing choroid plexus.

      Strengths:

      Understanding context-dependent responses to cell signals during development is an important problem. The authors use a powerful combination of in vivo and in vitro systems to dissect how PRDM16 may modulate BMP response in early brain development.

      Main weaknesses of the experimental setup:

      (1) Because the authors state that primary NSCs cultured in vitro lose endogenous Prdm16 expression, they drive expression by a constitutive promoter. However, this means the expression levels are very different from endogenous levels (as explicitly shown in Supplementary Figure 2B) and the effect of many transcription factors is strongly dose-dependent, likely creating differences between the PRDM16-dependent transcriptional response in the in vitro system and in vivo.

      (2) It seems that the authors compare Prdm16_KO cells to Prdm16 WT cells overexpressing flag_Prdm16. Aside from the possible expression of endogenous Prdm16, other cell differences may have arisen between these cell lines. A properly controlled experiment would compare Prdm16_KO ctrl (possibly infected with a control vector without Prdm16) to Prdm16_KO_E (i.e. the Prdm16_KO cells with and without Prdm16 overexpression.)

      Other experimental weaknesses that make the evidence less convincing:

      (1) The authors show in Figure 2E that Ttr is not upregulated by BMP4 in PRDM16_KO NSCs. Does this appear inconsistent with the presence of Ttr expression in the PRDM16_KO brain in Figure 1C?

      (2) Figure 3: The authors use H3K4me3 to measure gene activity. This is, however, very indirect, with bulk RNA-seq providing the most direct readout and polymerase binding (ChIP-seq) another more direct readout. Transcription can be regulated without expected changes in histone methylation, see e.g. papers from Josh Brickman. They verify their H3K4me3 predictions with qPCR for a select number of genes, all related to the kinetochore, but it is not clear why these genes were picked, and one could worry whether these are representative.

      (3) Line 256: The overlap of 31 genes between 184 BMP-repressed genes and 240 PRDM16-repressed genes seems quite small.

      (4) The Wnt7b H3K4me3 track in Fig. 3G is not discussed in the text but it shows H3K4me3 high in _KO and low in _E regardless of BMP4. This seems to contradict the heatmap of H3K4me3 in Figure 3E which shows H3K4me3 high in _E no BMP4 and low in _E BMP4 while omitting _KO no BMP4. Meanwhile CDKN1A, the other gene shown in 3G, is missing from 3E.

      (5) The authors use PRDM16 CUT&TAG on dissected dorsal midline tissues to determine if their 31 identified PRDM16-BMP4 co-repressed genes are regulated directly by PRDM16 in vivo. By manual inspection, they find that "most" of these show a PRDM16 peak. How many is most? If using the same parameters for determining peaks, how many genes in an appropriately chosen negative control set of genes would show peaks? Can the authors rigorously establish the statistical significance of this observation? And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.

      (6) In comparing RNA in situ between WT and PRDM16 KO in Figure 7, the authors state they use the Wnt2b signal to identify the border between CH and neocortex. However, the Wnt2b signal is shown in grey and it is impossible for this reviewer to see clear Wnt2b expression or where the boundaries are in Figure 7A. The authors also do not show where they placed the boundaries in their analysis. Furthermore, Figure 7B only shows insets for one of the regions being compared making it difficult to see differences from the other region. Finally, the authors do not show an example of their spot segmentation to judge whether their spot counting is reliable. Overall, this makes it difficult to judge whether the quantification in Figure 7C can be trusted.

      (7) The correlation between mKi67 and Axin2 in Figure 7 is interesting but does not convincingly show that Wnt downstream of PRDM16 and BMP is responsible for the increased proliferation in PRDM16 mutants.

      Weaknesses of the presentation:

      Overall, the manuscript is not easy to read. This can cause confusion.

    3. Reviewer #2 (Public review):

      Summary:

      This article investigates the role of PRDM16 in regulating cell proliferation and differentiation during choroid plexus (ChP) development in mice. The study finds that PRDM16 acts as a corepressor in the BMP signaling pathway, which is crucial for ChP formation.

      The key findings of the study are:

      (1) PRDM16 promotes cell cycle exit in neural epithelial cells at the ChP primordium.

      (2) PRDM16 and BMP signaling work together to induce neural stem cell (NSC) quiescence in vitro.

      (3) BMP signaling and PRDM16 cooperatively repress proliferation genes.

      (4) PRDM16 assists genomic binding of SMAD4 and pSMAD1/5/8.

      (5) Genes co-regulated by SMADs and PRDM16 in NSCs are repressed in the developing ChP.

      (6) PRDM16 represses Wnt7b and Wnt activity in the developing ChP.

      (7) Levels of Wnt activity correlate with cell proliferation in the developing ChP and CH.

      In summary, this study identifies PRDM16 as a key regulator of the balance between BMP and Wnt signaling during ChP development. PRDM16 facilitates the repressive function of BMP signaling on cell proliferation while simultaneously suppressing Wnt signaling. This interplay between signaling pathways and PRDM16 is essential for the proper specification and differentiation of ChP epithelial cells. This study provides new insights into the molecular mechanisms governing ChP development and may have implications for understanding the pathogenesis of ChP tumors and other related diseases.

      Strengths:

      (1) Combining in vitro and in vivo experiments to provide a comprehensive understanding of PRDM16 function in ChP development.

      (2) Use of a variety of techniques, including immunostaining, RNA in situ hybridization, RT-qPCR, CUT&Tag, ChIP-seq, and SCRINSHOT.

      (3) Identifying a novel role for PRDM16 in regulating the balance between BMP and Wnt signaling.

      (4) Providing a mechanistic explanation for how PRDM16 enhances the repressive function of BMP signaling. The identification of SMAD palindromic motifs as preferred binding sites for the SMAD/PRDM16 complex suggests a specific mechanism for PRDM16-mediated gene repression.

      (5) Highlighting the potential clinical relevance of PRDM16 in the context of ChP tumors and other related diseases. By demonstrating the crucial role of PRDM16 in controlling ChP development, the study suggests that dysregulation of PRDM16 may contribute to the pathogenesis of these conditions.

      Weaknesses:

      (1) Limited investigation of the mechanism controlling PRDM16 protein stability and nuclear localization in vivo. The study observed that PRDM16 protein became nearly undetectable in NSCs cultured in vitro, despite high mRNA levels. While the authors speculate that post-translational modifications might regulate PRDM16 in NSCs similar to brown adipocytes, further investigation is needed to confirm this and understand the precise mechanism controlling PRDM16 protein levels in vivo.

      (2) Reliance on overexpression of PRDM16 in NSC cultures. To study PRDM16 function in vitro, the authors used a lentiviral construct to constitutively express PRDM16 in NSCs. While this approach allowed them to overcome the issue of low PRDM16 protein levels in vitro, it is important to consider that overexpressing PRDM16 may not fully recapitulate its physiological role in regulating gene expression and cell behavior.

      (3) Lack of direct evidence for AP1 as the co-factor responsible for SMAD relocation in the absence of PRDM16. While the study identified the AP1 motif as enriched in SMAD binding sites in Prdm16 knockout cells, they only provided ChIP-qPCR validation for c-FOS binding at two specific loci (Wnt7b and Id3). Further investigation is needed to confirm the direct interaction between AP1 and SMAD proteins in the absence of PRDM16 and to rule out other potential co-factors.

    4. Reviewer #3 (Public review):

      Summary:

      Bone morphogenetic protein (BMP) signaling instructs multiple processes during development including cell proliferation and differentiation. The authors set out to understand the role of PRDM16 in these various functions of BMP signaling. They find that PRDM16 and BMP co-operate to repress stem cell proliferation by regulating the genomic distribution of BMP pathway transcription factors. They additionally show that PRDM16 impacts choroid plexus epithelial cell specification. The authors provide evidence for a regulatory circuit (constituting of BMP, PRDM16, and Wnt) that influences stem cell proliferation/differentiation.

      Strengths:

      I find the topics studied by the authors in this study of general interest to the field, the experiments well-controlled and the analysis in the paper sound.

      Weaknesses:

      I have no major scientific concerns. I have some minor recommendations that will help improve the paper (regarding the discussion).

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript describes the role of PRDM16 in modulating BMP response during choroid plexus (ChP) development. The authors combine PRDM16 knockout mice and cultured PRDM16 KO primary neural stem cells (NSCs) to determine the interactions between BMP signaling and PRDM16 in ChP differentiation.

      They show PRDM16 KO affects ChP development in vivo and BMP4 response in vitro. They determine genes regulated by BMP and PRDM16 by ChIP-seq or CUT&TAG for PRDM16, pSMAD1/5/8, and SMAD4. They then measure gene activity in primary NSCs through H3K4me3 and find more genes are co-repressed than co-activated by BMP signaling and PRDM16. They focus on the 31 genes found to be co-repressed by BMP and PRDM16. Wnt7b is in this set and the authors then provide evidence that PRDM16 and BMP signaling together repress Wnt activity in the developing choroid plexus.

      Strengths:

      Understanding context-dependent responses to cell signals during development is an important problem. The authors use a powerful combination of in vivo and in vitro systems to dissect how PRDM16 may modulate BMP response in early brain development.

      Main weaknesses of the experimental setup:

      (1) Because the authors state that primary NSCs cultured in vitro lose endogenous Prdm16 expression, they drive expression by a constitutive promoter. However, this means the expression levels are very different from endogenous levels (as explicitly shown in Supplementary Figure 2B) and the effect of many transcription factors is strongly dose-dependent, likely creating differences between the PRDM16-dependent transcriptional response in the in vitro system and in vivo.

      We acknowledge that our in vitro experiments may not ideally replicate the in vivo situation, a common limitation of such experiments. Our primary aim was to explore the molecular relationship between PRDM16 and BMP signaling in gene regulation. Such molecular investigations are challenging to conduct using in vivo tissues. In vitro NSCs treated with BMP4 have been used as a model to investigate NSC proliferation and quiescence in previous studies (e.g., Helena Mira, 2010; Marlen Knobloch, 2017). Crucially, to ensure the relevance of our in vitro findings to the in vivo context, we confirmed that cultured cells could indeed be induced into quiescence by BMP4, and that this induction required the presence of PRDM16. Furthermore, upon identifying target genes co-regulated by PRDM16 and SMADs, we validated PRDM16's regulatory role on a subset of these genes in the developing Choroid Plexus (ChP) (Fig. 7 and Suppl. Fig. 7-8). Only by combining evidence from both in vitro and in vivo experiments could we confidently conclude that PRDM16 serves as an essential co-factor for BMP signaling in restricting NSC proliferation.

      (2) It seems that the authors compare Prdm16_KO cells to Prdm16 WT cells overexpressing flag_Prdm16. Aside from the possible expression of endogenous Prdm16, other cell differences may have arisen between these cell lines. A properly controlled experiment would compare Prdm16_KO ctrl (possibly infected with a control vector without Prdm16) to Prdm16_KO_E (i.e. the Prdm16_KO cells with and without Prdm16 overexpression.)

      We agree that Prdm16 KO cells carrying the Prdm16-expressing vector would be a good comparison with those with KO_vector. However, despite more than 10 attempts with various optimization conditions, we were unable to establish a viable cell line after infecting Prdm16 KO cells with the Prdm16-expressing vector. The overall survival rate for primary NSCs after viral infection is low, and we observed that KO cells were particularly sensitive to infection treatment when the viral vector was large (the Prdm16 ORF is more than 3kb).

      As an alternative to assess vector effects, we instead included two other control cell lines, wt and KO cells infected with the 3xNLS_Flag-tag viral vector, and presented the results in Supplementary Fig. 2. When we compared the responses of the four lines (wt, KO, wt infected with the Flag vector, and KO infected with the Flag vector) to the addition and removal of BMP4, we confirmed that the viral infection itself had no significant impact on the responses of these cells to these treatments in terms of cell proliferation and Ttr induction.

      Given that wt cells and KO cells, with or without viral backbone infection, behave quite similarly in terms of cell proliferation, we speculate that even if we had succeeded in obtaining a KO cell line carrying the Prdm16-expressing vector, it might not exhibit substantial differences compared to wt cells infected with the Prdm16-expressing vector.

      Other experimental weaknesses that make the evidence less convincing:

      (1) The authors show in Figure 2E that Ttr is not upregulated by BMP4 in PRDM16_KO NSCs. Does this appear inconsistent with the presence of Ttr expression in the PRDM16_KO brain in Figure 1C?

      The reviewer’s point is that there was no significant increase in Ttr expression in Prdm16_KO cells after BMP4 treatment (Fig. 2E), whereas residual Ttr mRNA signal remained in the Prdm16 mutant ChP (Fig. 1C). We think the difference lies in the measurable level of Ttr expression induced by BMP4 in NSC culture versus that in the ChP. This is based on an immunostaining experiment in which we tried to detect Ttr using a Ttr antibody. This antibody could not detect Ttr protein in BMP4-treated Prdm16_expressing NSCs but clearly showed Ttr signal in the wt ChP. This means that although Ttr expression can be significantly increased by BMP4 in vitro to a level measurable by RT-qPCR, its absolute quantity, even in the Prdm16_expressing condition, is much lower than that in vivo. Our results in Fig. 1C and Fig. 2E, as well as Fig. 7B, all consistently showed that Prdm16 depletion significantly reduced Ttr expression both in vitro and in vivo.

      (2) Figure 3: The authors use H3K4me3 to measure gene activity. This is, however, very indirect, with bulk RNA-seq providing the most direct readout and polymerase binding (ChIP-seq) another more direct readout. Transcription can be regulated without expected changes in histone methylation, see e.g. papers from Josh Brickman. They verify their H3K4me3 predictions with qPCR for a select number of genes, all related to the kinetochore, but it is not clear why these genes were picked, and one could worry whether these are representative.

      H3K4me3 has been widely used as an indicator of active transcription and is a mark for cell identity genes. It has also been demonstrated that H3K4me3 has a direct function in regulating transcription at the step of RNA Pol II pause release. As stated in the text, there are advantages and disadvantages of using H3K4me3 compared to RNA-seq. RNA-seq profiles all gene products, which are affected by transcription as well as RNA stability and turnover. In contrast, H3K4me3 levels at gene promoters reflect transcriptional activity. In our case, we aimed to identify differential gene expression between proliferation and quiescence states. The transition between these two states is fast and dynamic. RNA-seq may not be able to identify functionally relevant genes and is more likely to produce false positive and false negative results. Therefore, we chose H3K4me3 profiling.

      We agree that transcription may change without histone methylation changes. This may cause an under-estimation of the number of changed genes between the conditions. 

      We validated 7 out of 31 genes (Wnt7b, Id3, Mybl2, Spc24, Spc25, Ndc80 and Nuf2). We chose these genes based on two criteria: 1) their function is implicated in cell proliferation and cell-cycle regulation based on gene ontology analysis; 2) their gene products are detectable in the developing ChP based on the scRNA-seq data. Three of these genes (Wnt7b, Id3, Mybl2) are not related to the kinetochore. We now clarify this description in the revised text.

      (3) Line 256: The overlap of 31 genes between 184 BMP-repressed genes and 240 PRDM16-repressed genes seems quite small.

      This indicates that in addition to co-repressing cell-cycle genes, BMP and PRDM16 have independent functions. For example, it was reported that BMP regulates neuronal and astrocyte differentiation (Katada, S. 2021), while our previous work demonstrated that Prdm16 controls temporal identity of NSCs (He, L. 2021).

      (4) The Wnt7b H3K4me3 track in Fig. 3G is not discussed in the text but it shows H3K4me3 high in _KO and low in _E regardless of BMP4. This seems to contradict the heatmap of H3K4me3 in Figure 3E which shows H3K4me3 high in _E no BMP4 and low in _E BMP4 while omitting _KO no BMP4. Meanwhile CDKN1A, the other gene shown in 3G, is missing from 3E.

      The track in Fig. 3G shows the absolute signal of H3K4me3 after mapping the sequencing reads to the genome and normalizing them to library size. Comparing the signal in Prdm16_E with BMP4 to that in Prdm16_E without BMP4, the one with BMP4 has a lower peak. The same trend can be seen for the pair of Prdm16_KO cells with or without BMP4. The heatmap in Fig. 3E shows the relative level of H3K4me3 in three conditions. The Prdm16_E cells with BMP4 have the lowest level, while the other two conditions (Prdm16_KO with BMP4 and Prdm16_E without BMP4) display a higher level. These two graphs show a consistent trend of H3K4me3 changes at the Wnt7b promoter across these conditions.

      (5) The authors use PRDM16 CUT&TAG on dissected dorsal midline tissues to determine if their 31 identified PRDM16-BMP4 co-repressed genes are regulated directly by PRDM16 in vivo. By manual inspection, they find that "most" of these show a PRDM16 peak. How many is most? If using the same parameters for determining peaks, how many genes in an appropriately chosen negative control set of genes would show peaks? Can the authors rigorously establish the statistical significance of this observation? And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.

      In our text, we indicated the genes containing PRDM16 binding peaks in the figures and described them as “Text in black in Fig. 6A and Supplementary Fig. 5A”. We will add the precise number “25 of these genes” in the main text to clarify it. To define a negative control set of genes, we will use the 184 - 31 = 153 BMP-only repressed genes (excluding the PRDM16-BMP4 co-repressed genes) and determine how many of these 153 genes, say X, have PRDM16 peaks in the E12.5 ChP data. We will then use a binomial test to calculate the p-value: binom_test(25, 31, X/153, alternative=“greater”).
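
      The proposed one-sided binomial test can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis: the response names a `binom_test` call, whereas the sketch below computes the same upper-tail probability directly with the Python standard library so that it has no dependencies. The value of X is not given in the response, so `x_control_with_peak` is a purely hypothetical placeholder, and the helper name `binomial_upper_tail` is likewise illustrative.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), i.e. a one-sided
    ("greater") binomial test p-value for k successes in n trials."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Numbers stated in the response.
N_CO_REPRESSED = 31            # PRDM16-BMP4 co-repressed genes
N_WITH_PEAK = 25               # of those, genes with a PRDM16 peak
N_CONTROL = 184 - 31           # BMP-only repressed genes used as the control set

# HYPOTHETICAL placeholder: the response leaves X undetermined.
x_control_with_peak = 60

# Background rate of PRDM16 peaks, estimated from the control set.
background_rate = x_control_with_peak / N_CONTROL

p_value = binomial_upper_tail(N_WITH_PEAK, N_CO_REPRESSED, background_rate)
print(f"background rate = {background_rate:.3f}, one-sided p = {p_value:.3g}")
```

      Note that this form of the test treats the background rate X/153 as a fixed, known quantity. An alternative that also accounts for sampling variability in the control set would be a Fisher's exact test on the 2x2 table of co-repressed versus control genes with and without peaks.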

      We are confused by the second part of the comment “And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.” If the reviewer meant why we didn’t sequence the material from sequential ChIP or validate more target genes, the reason is the limited amount of material. Sequential ChIP requires a large quantity of antibody and yields little material, barely sufficient for a few qPCR reactions after the second round of IP. This yield was far below the minimum required for library construction. The PRDM16 antibody was a gift, and the quantity we had was very limited. We made considerable efforts to optimize all available commercial antibodies for ChIP and Cut&Tag, but none of them worked.

      (6) In comparing RNA in situ between WT and PRDM16 KO in Figure 7, the authors state they use the Wnt2b signal to identify the border between CH and neocortex. However, the Wnt2b signal is shown in grey and it is impossible for this reviewer to see clear Wnt2b expression or where the boundaries are in Figure 7A. The authors also do not show where they placed the boundaries in their analysis. Furthermore, Figure 7B only shows insets for one of the regions being compared making it difficult to see differences from the other region. Finally, the authors do not show an example of their spot segmentation to judge whether their spot counting is reliable. Overall, this makes it difficult to judge whether the quantification in Figure 7C can be trusted.

      To address these questions, in the revised manuscript we will include an individual channel for Wnt2b and mark the boundaries. We will also provide full-view images and examples of spot segmentation in supplementary figures because of space limitations in the main figures.

      (7) The correlation between mKi67 and Axin2 in Figure 7 is interesting but does not convincingly show that Wnt downstream of PRDM16 and BMP is responsible for the increased proliferation in PRDM16 mutants.

      We agree that this result (the correlation between mKi67 and Axin2) alone only suggests that Wnt signaling is related to the proliferation defect in the Prdm16 mutant, and does not necessarily mean that Wnt is downstream of PRDM16 and BMP. Our conclusion is backed up by two additional lines of evidence: the Cut&Tag data, in which PRDM16 binds to regulatory regions of Wnt7b and Wnt3a, and the in vitro finding that BMP and PRDM16 co-repress Wnt7b.

      An ideal result would be that down-regulating Wnt signaling in the Prdm16 mutant rescues the mutant phenotype. Such an experiment is technically challenging. Wnt plays diverse and essential roles in NSC regulation, and one would need to use a cell-type- and stage-specific tool to down-regulate Wnt in the background of the Prdm16 mutation. Moreover, Wnt genes are not the only targets regulated by PRDM16 in these cells, and down-regulating Wnt may not be sufficient to rescue the phenotype.

      Weaknesses of the presentation:

      Overall, the manuscript is not easy to read. This can cause confusion.

      We will revise the text to improve the clarity.

      Reviewer #2 (Public review):

      Summary:

      This article investigates the role of PRDM16 in regulating cell proliferation and differentiation during choroid plexus (ChP) development in mice. The study finds that PRDM16 acts as a corepressor in the BMP signaling pathway, which is crucial for ChP formation.

      The key findings of the study are:

      (1) PRDM16 promotes cell cycle exit in neural epithelial cells at the ChP primordium.

      (2) PRDM16 and BMP signaling work together to induce neural stem cell (NSC) quiescence in vitro.

      (3) BMP signaling and PRDM16 cooperatively repress proliferation genes.

      (4) PRDM16 assists genomic binding of SMAD4 and pSMAD1/5/8.

      (5) Genes co-regulated by SMADs and PRDM16 in NSCs are repressed in the developing ChP.

      (6) PRDM16 represses Wnt7b and Wnt activity in the developing ChP.

      (7) Levels of Wnt activity correlate with cell proliferation in the developing ChP and CH.

      In summary, this study identifies PRDM16 as a key regulator of the balance between BMP and Wnt signaling during ChP development. PRDM16 facilitates the repressive function of BMP signaling on cell proliferation while simultaneously suppressing Wnt signaling. This interplay between signaling pathways and PRDM16 is essential for the proper specification and differentiation of ChP epithelial cells. This study provides new insights into the molecular mechanisms governing ChP development and may have implications for understanding the pathogenesis of ChP tumors and other related diseases.

      Strengths:

      (1) Combining in vitro and in vivo experiments to provide a comprehensive understanding of PRDM16 function in ChP development.

      (2) Use of a variety of techniques, including immunostaining, RNA in situ hybridization, RT-qPCR, CUT&Tag, ChIP-seq, and SCRINSHOT.

      (3) Identifying a novel role for PRDM16 in regulating the balance between BMP and Wnt signaling.

      (4) Providing a mechanistic explanation for how PRDM16 enhances the repressive function of BMP signaling. The identification of SMAD palindromic motifs as preferred binding sites for the SMAD/PRDM16 complex suggests a specific mechanism for PRDM16-mediated gene repression.

      (5) Highlighting the potential clinical relevance of PRDM16 in the context of ChP tumors and other related diseases. By demonstrating the crucial role of PRDM16 in controlling ChP development, the study suggests that dysregulation of PRDM16 may contribute to the pathogenesis of these conditions.

      Weaknesses:

      (1) Limited investigation of the mechanism controlling PRDM16 protein stability and nuclear localization in vivo. The study observed that PRDM16 protein became nearly undetectable in NSCs cultured in vitro, despite high mRNA levels. While the authors speculate that post-translational modifications might regulate PRDM16 in NSCs similar to brown adipocytes, further investigation is needed to confirm this and understand the precise mechanism controlling PRDM16 protein levels in vivo.

      While the mechanisms controlling PRDM16 protein stability and nuclear localization in the developing brain are interesting, the scope of this paper is to reveal the function of PRDM16 in the choroid plexus and its interaction with BMP signaling. We will be happy to pursue this direction in our next study.

      (2) Reliance on overexpression of PRDM16 in NSC cultures. To study PRDM16 function in vitro, the authors used a lentiviral construct to constitutively express PRDM16 in NSCs. While this approach allowed them to overcome the issue of low PRDM16 protein levels in vitro, it is important to consider that overexpressing PRDM16 may not fully recapitulate its physiological role in regulating gene expression and cell behavior.

      As stated above, we acknowledge that findings from cultured NSCs may not directly apply to ChP cells in vivo. We are cautious with our statements. The cell culture work aimed to identify potential mechanisms by which PRDM16 and SMADs interact to regulate gene expression, as well as target genes co-regulated by these factors. We expect that not all targets from cell culture are regulated by PRDM16 and SMADs in the ChP, so we validated expression changes of several target genes in the developing ChP and have now included the new data in Fig. 7 and Supplementary Fig. 7. Out of the 31 genes identified from cultured cells, four cell cycle regulators, including Wnt7b, Id3, Spc24/25/nuf2 and Mybl2, showed de-repression in Prdm16 mutant ChP. These genes may be relevant downstream genes in the ChP, and other target genes may be cortical NSC-specific or less dependent on Prdm16 in vivo.

      (3) Lack of direct evidence for AP1 as the co-factor responsible for SMAD relocation in the absence of PRDM16. While the study identified the AP1 motif as enriched in SMAD binding sites in Prdm16 knockout cells, they only provided ChIP-qPCR validation for c-FOS binding at two specific loci (Wnt7b and Id3). Further investigation is needed to confirm the direct interaction between AP1 and SMAD proteins in the absence of PRDM16 and to rule out other potential co-factors.

      We agree that the finding of the AP1 motif enriched at the PRDM16 and SMAD co-binding regions in Prdm16 KO cells can only indirectly suggest AP1 as a co-factor for SMAD relocation. That’s why we used ChIP-qPCR to examine the presence of c-FOS at these sites. Although we only validated two targets, the result confirms that c-FOS binds to the sites only in the Prdm16 KO cells but not in Prdm16-expressing cells, suggesting AP1 is a co-factor. Our results cannot rule out the presence of other co-factors.

      Reviewer #3 (Public review):

      Summary:

      Bone morphogenetic protein (BMP) signaling instructs multiple processes during development including cell proliferation and differentiation. The authors set out to understand the role of PRDM16 in these various functions of BMP signaling. They find that PRDM16 and BMP co-operate to repress stem cell proliferation by regulating the genomic distribution of BMP pathway transcription factors. They additionally show that PRDM16 impacts choroid plexus epithelial cell specification. The authors provide evidence for a regulatory circuit (consisting of BMP, PRDM16, and Wnt) that influences stem cell proliferation/differentiation.

      Strengths:

      I find the topics studied by the authors in this study of general interest to the field, the experiments well-controlled and the analysis in the paper sound.

      Weaknesses:

      I have no major scientific concerns. I have some minor recommendations that will help improve the paper (regarding the discussion).

      We will revise the discussion according to the suggestions.

    1. eLife Assessment

      The authors utilize a valuable computational approach to exploring the mechanisms of memory-dependent klinotaxis, with a hypothesis that is both plausible and testable. Although they provide a solid hypothesis of circuit function based on an established model, the model's lack of integration of newer experimental findings, its reliance on predefined synaptic states, and oversimplified sensory dynamics, make the investigation incomplete for both memory and internal-state modulation of taxis.

    2. Reviewer #1 (Public review):

      Summary:

      This research focuses on C. elegans klinotaxis, a chemotactic behavior characterized by gradual turning, aiming to uncover the neural circuit mechanism responsible for the context-dependent reversal of salt concentration preference. The phenomenon observed is that the preferred salt concentration depends on the difference between the pre-assay cultivation conditions and the current environmental salt levels.

      The authors propose that a synaptic-reversal plasticity mechanism at the primary sensory neuron, ASER, is critical for this memory- and context-dependent switching of preference. They build on prior findings regarding synaptic reversal between ASER and AIB, as well as the receptor composition of AIY neurons, to hypothesize that similar "plasticity" between ASER and AIY underpins salt preference behavior in klinotaxis. This plasticity differs conceptually from the classical one as it does not rely on any structural changes but rather synaptic transmission is modulated by the basal level of glutamate, and can switch from inhibitory to excitatory.

      To test this hypothesis, the study employs a previously established neuroanatomically grounded model [4] and demonstrates that reversing the ASER-AIY synapse sign in the model agent reproduces the observed reversal in salt preference. The model is parameterized using a computational search technique (evolutionary algorithm) to optimize unknown electrophysiological parameters for chemotaxis performance. Experimental validity is ensured by incorporating constraints derived from published findings, confirming the plausibility of the proposed mechanism.

      Finally, the circuit mechanism allowing C. elegans to switch behaviour to an exploration run when starved is also investigated. This extension highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.

      Strengths and weaknesses:

      The authors' approach of integrating prior knowledge of receptor composition and synaptic reversal with the repurposing of a published neuroanatomical model [4] is a significant strength. This methodology not only ensures biological plausibility but also leverages a solid, reproducible modeling foundation to explore and test novel hypotheses effectively.

      The evidence produced that the original model has been successfully reproduced is convincing.

      The writing of the manuscript needs revision as it makes comprehension difficult.

      One major weakness is that the model does not incorporate key findings that have emerged since the original model's publication in 2013, limiting the support for the proposed mechanism. In particular, ablation studies indicate that AIY is not critical for chemotaxis, and other interneurons may play partially overlapping roles in positive versus negative chemotaxis. These findings challenge the centrality of AIY and suggest the model oversimplifies the circuit involved in klinotaxis.

      Reference [1] also shows that ASER neurons exhibit complex, memory- and context-dependent responses, which are not accounted for in the model and may have a significant impact on chemotactic model behaviour.

      The hypothesis of synaptic reversal between ASER and AIY is not explicitly modeled in terms of receptor-specific dynamics or glutamate basal levels. Instead, the ASER-to-AIY connection is predefined as inhibitory or excitatory in separate models. This approach limits the model's ability to test the full range of mechanisms hypothesized to drive behavioral switching.

      While the main results - such as response dependence on step inputs at different phases of the oscillator - are consistent with those observed in chemotaxis models with explicit neural dynamics (e.g., Reference [2]), the lack of richer neural dynamics could overlook critical effects. For example, the authors highlight the influence of gap junctions on turning sensitivity but do not sufficiently analyze the underlying mechanisms driving these effects. The role of gap junctions in the model may be oversimplified because, as in the original model [4], the oscillator dynamics are not intrinsically generated by an oscillator circuit but are instead externally imposed via $z_\text{osc}$. This simplification should be carefully considered when interpreting the contributions of specific connections to network dynamics. Lastly, the complex and context-dependent responses of ASER [1] might interact with circuit dynamics in ways that are not captured by the current simplified implementation. These simplifications could limit the model's ability to account for the interplay between sensory encoding and motor responses in C. elegans chemotaxis.

      Appraisal:

      The authors show that their model can reproduce memory-dependent reversal of preference in klinotaxis, demonstrating that the ASER-to-AIY synapse plays a key role in switching chemotactic preferences. By switching the ASER-AIY connection from excitatory to inhibitory they indeed show that salt preference reverses. They also show that the curving/turn rate underlying the preference change is gradual and depends on the weight between ASER-AIY. They further support their claim by showing that curving rates also depend on the cultivation concentration (set point).
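
      As a rough illustration of the principle discussed here (a steering bias that depends on the phase of an imposed oscillation, with the direction of preference set by the sign of one connection), the following minimal sketch may help. It is not the authors' neuroanatomical model and has no neurons, evolved parameters, or glutamate dynamics. The function name `simulate`, the linear gradient C(x, y) = x, and all parameter values are assumptions chosen only for illustration; `sign` is a hypothetical stand-in for an excitatory (+1) or inhibitory (-1) ASER-AIY connection.

```python
import math


def simulate(sign, k=20.0, v=0.02, amp=0.5, freq_hz=0.5, dt=0.01, t_end=300.0):
    """Return the final x position of a toy undulating agent on the gradient C(x, y) = x.

    sign: +1 or -1, a hypothetical stand-in for the sign of a single
          sensory-to-interneuron connection (not a fitted model parameter).
    """
    omega = 2.0 * math.pi * freq_hz
    x, y = 0.0, 0.0
    mu = math.pi / 2.0  # mean heading, initially perpendicular to the gradient
    steps = int(t_end / dt)
    for i in range(steps):
        t = i * dt
        osc = math.sin(omega * t)          # externally imposed oscillation
        theta = mu + amp * osc             # instantaneous heading
        dc_dt = v * math.cos(theta)        # sensed rate of change of concentration
        mu += sign * k * dc_dt * osc * dt  # steering bias gated by oscillation phase
        x += v * math.cos(theta) * dt      # forward Euler position update
        y += v * math.sin(theta) * dt
    return x


x_up = simulate(+1)    # mean heading drifts up the gradient (x increases)
x_down = simulate(-1)  # flipping the sign reverses the preference (x decreases)
```

      In this toy setting, flipping `sign` alone is enough to turn gradient climbing into gradient descent, which is the qualitative behaviour the appraisal attributes to the model; it says nothing about whether the biological circuit works this way.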

      Thus within the constraints of the hypothesis and the framework, the model operates as expected and aligns with some experimental findings. However, significant omissions of key experimental evidence raise questions on whether the proposed neural mechanisms are sufficient for reversal in salt-preference chemotaxis.

      Previous work [1] has shown that individually ablating the AIZ or AIY interneurons has essentially no effect on the Chemotactic Index (CI) toward the set point ([1] Figure 6). Furthermore, in [1] the authors report that different postsynaptic neurons are required for movement above or below the set point. The manuscript should address how this evidence fits with their model by attempting similar ablations. It is possible that the CI is rescued by klinokinesis but this needs to be tested on an extension of this model to provide a more compelling argument.

      The investigation of dispersal behaviour in starved individuals is rather limited to testing by imposing inhibition of the SMB neurons. Although a circuit is proposed for how hunger states modulate taxis in the absence of food, this circuit hypothesis is not explicitly modelled to test the theory or provide novel insights.

      Impact:

      This research underscores the value of an embodied approach to understanding chemotaxis, addressing an important memory mechanism that enables adaptive behavior in the sensorimotor circuits supporting C. elegans chemotaxis. The principle of operation - the dependence of motor responses to sensory inputs on the phase of oscillation - appears to be a convergent solution to taxis. Similar mechanisms have been proposed in Drosophila larvae chemotaxis [2], zebrafish phototaxis [3], and other systems. Consequently, the proposed mechanism has broader implications for understanding how adaptive behaviors are embedded within sensorimotor systems and how experience shapes these circuits across species.

      Although the reported reversal of synaptic connection from excitatory to inhibitory is an exciting phenomenon of broad interest, it is not entirely new, as the authors acknowledge similar reversals have been reported in ASER-to-AIB signaling for klinokinesis (Hiroki et al., 2022). The proposed reversal of the ASER-to-AIY synaptic connection from inhibitory to excitatory is a novel contribution in the specific context of klinotaxis. While the ASER's role in gradient sensing and memory encoding has been previously identified, the current paper mechanistically models these processes, introducing a hypothesis for synaptic plasticity as the basis for bidirectional salt preference in klinotaxis.

      The research also highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.

      The methodology of parameter search on a neural model of a connectome used here yielded the valuable insight that connectome information alone does not provide enough constraints to reproduce the neural circuits for behaviour. It demonstrates that additional neurophysiological constraints are required.

      Additional Context

      Oscillators with stimulus-driven perturbations appear to be a convergent solution for taxis and navigation across species. Similar mechanisms have been studied in zebrafish phototaxis [3], Drosophila larvae chemotaxis [2], and have even been proposed to underlie search runs in ants. The modulation of taxis by context and memory is a ubiquitous requirement, with parallels across species. For example, Drosophila larvae modulate taxis based on current food availability and predicted rewards associated with odors, though the underlying mechanism remains elusive. The synaptic reversal mechanism highlighted in this study offers a compelling framework for understanding how taxis circuits integrate context-related memory retrieval more broadly.

      As a side note, an interesting difference emerges when comparing C. elegans and Drosophila larvae chemotaxis. In Drosophila larvae, oscillatory mechanisms are hypothesized to underlie all chemotactic reorientations, ranging from large turns to smaller directional biases (weathervaning). By contrast, in C. elegans, weathervaning and pirouettes are treated as distinct strategies, often attributed to separate neural mechanisms. This raises the possibility that their motor execution could share a common oscillator-based framework. Re-examining their overlap might reveal deeper insights into the neural principles underlying these maneuvers.

      (1) Luo, L., Wen, Q., Ren, J., Hendricks, M., Gershow, M., Qin, Y., Greenwood, J., Soucy, E.R., Klein, M., Smith-Parker, H.K., & Calvo, A.C. (2014). Dynamic encoding of perception, memory, and movement in a C. elegans chemotaxis circuit. Neuron, 82(5), 1115-1128.

      (2) Antoine Wystrach, Konstantinos Lagogiannis, Barbara Webb (2016) Continuous lateral oscillations as a core mechanism for taxis in Drosophila larvae eLife 5:e15504.

      (3) Wolf, S., Dubreuil, A.M., Bertoni, T. et al. Sensorimotor computation underlying phototaxis in zebrafish. Nat Commun 8, 651 (2017).

      (4) Izquierdo, E.J. and Beer, R.D., 2013. Connecting a connectome to behavior: an ensemble of neuroanatomical models of C. elegans klinotaxis. PLoS computational biology, 9(2), p.e1002890.

    3. Reviewer #2 (Public review):

      Summary:

      This study explores how a simple sensorimotor circuit in the nematode C. elegans enables it to navigate salt gradients based on past experiences. Using computational simulations and previously described neural connections, the study demonstrates how a single neuron, ASER, can change its signaling behavior in response to different salt conditions, with which the worm is able to "remember" prior environments and adjust its navigation toward "preferred" salinity accordingly.

      Strengths:

      The key novelty and strength of this paper is the explicit demonstration of computational neurobehavioral modeling and evolutionary algorithms to elucidate the synaptic plasticity in a minimal neural circuit that is sufficient to replicate memory-based chemotaxis. In particular, with changes in ASER's glutamate release and sensitivity of downstream neurons, the ASER neuron adjusts its output to be either excitatory or inhibitory depending on ambient salt concentration, enabling the worm to navigate toward or away from salt gradients based on prior exposure to salt concentration.

      Weaknesses:

      While the model successfully replicates some behaviors observed in previous experiments, many key assumptions lack direct biological validation. As to the model output readouts, the model considers only endpoint behaviors (chemotaxis index) rather than the full dynamics of navigation, which limits its predictive power. Moreover, some results presented in the paper lack interpretation, and many descriptions in the main text are overly technical and require clearer definitions.

    4. Author response:

      eLife Assessment 

      The authors utilize a valuable computational approach to exploring the mechanisms of memory-dependent klinotaxis, with a hypothesis that is both plausible and testable. Although they provide a solid hypothesis of circuit function based on an established model, the model's lack of integration of newer experimental findings, its reliance on predefined synaptic states, and oversimplified sensory dynamics, make the investigation incomplete for both memory and internal-state modulation of taxis.

      We would like to express our gratitude to the editor for the assessment of our work. However, we respectfully disagree with the assessment that our investigation is incomplete, if the negative assessment is primarily due to the impact of AIY interneuron ablation on the chemotaxis index (CI) which was reported in Reference [1]. It is crucial to acknowledge that the CI determined through experimental means incorporates contributions from both klinokinesis and klinotaxis [1]. It is plausible that the impact of AIY ablation was not adequately reflected in the CI value. Consequently, the experimental observation does not necessarily diminish the role of AIY in klinotaxis. Anatomical evidence provided by the database (http://ims.dse.ibaraki.ac.jp/ccep-tool/) substantiates that ASE sensory neurons and AIZ interneurons, which have been demonstrated to play a crucial role in klinotaxis [Matsumoto et al., PNAS 121 (5) e2310735121], have the highest number of synaptic connections with AIY interneurons. These findings provide substantial evidence supporting the validity of the presented minimal neural network responsible for salt klinotaxis.

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      This research focuses on C. elegans klinotaxis, a chemotactic behavior characterized by gradual turning, aiming to uncover the neural circuit mechanism responsible for the context-dependent reversal of salt concentration preference. The phenomenon observed is that the preferred salt concentration depends on the difference between the pre-assay cultivation conditions and the current environmental salt levels. 

      We would like to express our gratitude for the time and consideration you have dedicated to reviewing our manuscript.

      The authors propose that a synaptic-reversal plasticity mechanism at the primary sensory neuron, ASER, is critical for this memory- and context-dependent switching of preference. They build on prior findings regarding synaptic reversal between ASER and AIB, as well as the receptor composition of AIY neurons, to hypothesize that similar "plasticity" between ASER and AIY underpins salt preference behavior in klinotaxis. This plasticity differs conceptually from the classical one as it does not rely on any structural changes but rather synaptic transmission is modulated by the basal level of glutamate, and can switch from inhibitory to excitatory. 

      To test this hypothesis, the study employs a previously established neuroanatomically grounded model [4] and demonstrates that reversing the ASER-AIY synapse sign in the model agent reproduces the observed reversal in salt preference. The model is parameterized using a computational search technique (evolutionary algorithm) to optimize unknown electrophysiological parameters for chemotaxis performance. Experimental validity is ensured by incorporating constraints derived from published findings, confirming the plausibility of the proposed mechanism. 

      Finally, the circuit mechanism allowing C. elegans to switch behaviour to an exploration run when starved is also investigated. This extension highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.

      We would like to thank the reviewer for the appropriate summary of our work. 

      Strengths and weaknesses: 

      The authors' approach of integrating prior knowledge of receptor composition and synaptic reversal with the repurposing of a published neuroanatomical model [4] is a significant strength. This methodology not only ensures biological plausibility but also leverages a solid, reproducible modeling foundation to explore and test novel hypotheses effectively.

      The evidence produced that the original model has been successfully reproduced is convincing.

      The writing of the manuscript needs revision as it makes comprehension difficult.  

      We would like to thank the reviewer for recognizing the usefulness of our approach. In the revised version, we will improve the explanation.  

      One major weakness is that the model does not incorporate key findings that have emerged since the original model's publication in 2013, limiting the support for the proposed mechanism. In particular, ablation studies indicate that AIY is not critical for chemotaxis, and other interneurons may play partially overlapping roles in positive versus negative chemotaxis. These findings challenge the centrality of AIY and suggest the model oversimplifies the circuit involved in klinotaxis.

      We would like to express our gratitude for the constructive feedback we have received. We concur with some of your assertions. Our model is the minimal network for salt klinotaxis, which includes solely the interneurons that are connected to each other via the highest number of synaptic connections. It is important to note that our model does not consider redundant interneurons that exhibit overlapping roles. Consequently, the model is not applicable to the study of the impact of interneuron ablation. In Reference [1], the influence of interneuron ablations on the chemotaxis index (CI) was investigated. The experimentally determined CI value incorporates the contributions from both klinokinesis and klinotaxis. Consequently, it is plausible that the impact of AIY ablation was not significantly reflected in the CI value. The experimental observation does not necessarily diminish the role of AIY in klinotaxis.

      Reference [1] also shows that ASER neurons exhibit complex, memory- and context-dependent responses, which are not accounted for in the model and may have a significant impact on chemotactic model behaviour. 

      As pointed out by the reviewer, our model does not incorporate the context-dependent response of the ASER. Instead, the salt concentration-dependent glutamate release from the ASER [S. Hiroki et al. Nat Commun 13, 2928 (2022)], which results from the ASER responses, is considered in the present study.

      The hypothesis of synaptic reversal between ASER and AIY is not explicitly modeled in terms of receptor-specific dynamics or glutamate basal levels. Instead, the ASER-to-AIY connection is predefined as inhibitory or excitatory in separate models. This approach limits the model's ability to test the full range of mechanisms hypothesized to drive behavioral switching.  

      We would like to thank the reviewer for the helpful comments. In the revised version, we will mention the limitation.

      While the main results - such as response dependence on step inputs at different phases of the oscillator - are consistent with those observed in chemotaxis models with explicit neural dynamics (e.g., Reference [2]), the lack of richer neural dynamics could overlook critical effects. For example, the authors highlight the influence of gap junctions on turning sensitivity but do not sufficiently analyze the underlying mechanisms driving these effects. The role of gap junctions in the model may be oversimplified because, as in the original model [4], the oscillator dynamics are not intrinsically generated by an oscillator circuit but are instead externally imposed via $z_\text{osc}$. This simplification should be carefully considered when interpreting the contributions of specific connections to network dynamics. Lastly, the complex and context-dependent responses of ASER [1] might interact with circuit dynamics in ways that are not captured by the current simplified implementation. These simplifications could limit the model's ability to account for the interplay between sensory encoding and motor responses in C. elegans chemotaxis.

      We may not have fully understood the substance of your assertions. We do understand that the oscillator dynamics were not generated by an oscillator neural circuit in our modeling. However, the present study focuses on how the sensory input and resulting interneuron dynamics regulate the oscillatory activity of SMB motor neurons to generate klinotaxis.

      Appraisal: 

      The authors show that their model can reproduce memory-dependent reversal of preference in klinotaxis, demonstrating that the ASER-to-AIY synapse plays a key role in switching chemotactic preferences. By switching the ASER-AIY connection from excitatory to inhibitory they indeed show that salt preference reverses. They also show that the curving/turn rate underlying the preference change is gradual and depends on the weight between ASER-AIY. They further support their claim by showing that curving rates also depend on the cultivation concentration (set point).

      We would like to thank the reviewer for assessing our work.

      Thus within the constraints of the hypothesis and the framework, the model operates as expected and aligns with some experimental findings. However, significant omissions of key experimental evidence raise questions on whether the proposed neural mechanisms are sufficient for reversal in salt-preference chemotaxis.  

      We agree with your opinion. The present hypothesis should be verified by experiments.

      Previous work [1] has shown that individually ablating the AIZ or AIY interneurons has essentially no effect on the Chemotactic Index (CI) toward the set point ([1] Figure 6). Furthermore, in [1] the authors report that different postsynaptic neurons are required for movement above or below the set point. The manuscript should address how this evidence fits with their model by attempting similar ablations. It is possible that the CI is rescued by klinokinesis but this needs to be tested on an extension of this model to provide a more compelling argument.  

      We would like to express our gratitude for the constructive feedback we have received. In Reference [1], the influence of interneuron ablations on the chemotaxis index (CI) was investigated. It is important to acknowledge that the experimentally determined CI value encompasses the contributions of both klinokinesis and klinotaxis. It is plausible that the impact of AIY ablation was not reflected in the CI value. Consequently, these experimental observations do not necessarily diminish the role of AIY in klinotaxis. The neural circuit model employed in the present study constitutes a minimal network for salt klinotaxis, encompassing solely interneurons that are connected to each other via the highest number of synaptic connections. Anatomical evidence provided by the database (http://ims.dse.ibaraki.ac.jp/ccep-tool/) substantiates that ASE sensory neurons and AIZ interneurons, which have been demonstrated to play a crucial role in klinotaxis [Matsumoto et al., PNAS 121 (5) e2310735121], have the highest number of synaptic connections with AIY interneurons. Our model does not take into account redundant interneurons with overlapping roles, which renders it inapplicable to the study of the effects of interneuron ablation.

      The investigation of dispersal behaviour in starved individuals is rather limited to testing by imposing inhibition of the SMB neurons. Although a circuit is proposed for how hunger states modulate taxis in the absence of food, this circuit hypothesis is not explicitly modelled to test the theory or provide novel insights.  

      As pointed out by the reviewer, the neural circuit that inhibits the SMB motor neurons was not explicitly incorporated in our model. We then examined whether our minimal network model could reproduce dispersal behavior under starvation conditions solely due to the experimentally identified inhibitory effect of SMB motor neurons.

      Impact:

      This research underscores the value of an embodied approach to understanding chemotaxis, addressing an important memory mechanism that enables adaptive behavior in the sensorimotor circuits supporting C. elegans chemotaxis. The principle of operation - the dependence of motor responses to sensory inputs on the phase of oscillation - appears to be a convergent solution to taxis. Similar mechanisms have been proposed in Drosophila larvae chemotaxis [2], zebrafish phototaxis [3], and other systems. Consequently, the proposed mechanism has broader implications for understanding how adaptive behaviors are embedded within sensorimotor systems and how experience shapes these circuits across species.

      We would like to express our gratitude for the useful suggestion. We will add the argument that the reviewer mentioned in the revised version.

      Although the reported reversal of synaptic connection from excitatory to inhibitory is an exciting phenomenon of broad interest, it is not entirely new, as the authors acknowledge similar reversals have been reported in ASER-to-AIB signaling for klinokinesis (Hiroki et al., 2022). The proposed reversal of the ASER-to-AIY synaptic connection from inhibitory to excitatory is a novel contribution in the specific context of klinotaxis. While the ASER's role in gradient sensing and memory encoding has been previously identified, the current paper mechanistically models these processes, introducing a hypothesis for synaptic plasticity as the basis for bidirectional salt preference in klinotaxis.

      The research also highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.  

      The methodology of parameter search on a neural model of a connectome used here yielded the valuable insight that connectome information alone does not provide enough constraints to reproduce the neural circuits for behaviour. It demonstrates that additional neurophysiological constraints are required.  

      We would like to acknowledge the appropriate recognition of our work.

      Additional Context 

      Oscillators with stimulus-driven perturbations appear to be a convergent solution for taxis and navigation across species. Similar mechanisms have been studied in zebrafish phototaxis [3], Drosophila larvae chemotaxis [2], and have even been proposed to underlie search runs in ants. The modulation of taxis by context and memory is a ubiquitous requirement, with parallels across species. For example, Drosophila larvae modulate taxis based on current food availability and predicted rewards associated with odors, though the underlying mechanism remains elusive. The synaptic reversal mechanism highlighted in this study offers a compelling framework for understanding how taxis circuits integrate context-related memory retrieval more broadly.

      We would like to express our gratitude for the insightful commentary. In the revised version, we will add a discussion noting that a similar oscillator mechanism with stimulus-driven perturbations has been observed in zebrafish phototaxis [3] and Drosophila larvae chemotaxis [2].

      As a side note, an interesting difference emerges when comparing C. elegans and Drosophila larvae chemotaxis. In Drosophila larvae, oscillatory mechanisms are hypothesized to underlie all chemotactic reorientations, ranging from large turns to smaller directional biases (weathervaning). By contrast, in C. elegans, weathervaning and pirouettes are treated as distinct strategies, often attributed to separate neural mechanisms. This raises the possibility that their motor execution could share a common oscillator-based framework. Re-examining their overlap might reveal deeper insights into the neural principles underlying these maneuvers. 

      We would like to acknowledge your thoughtfully articulated comment. As pointed out by the reviewer, and based on the anatomical database (http://ims.dse.ibaraki.ac.jp/ccep-tool/), we found that the neural circuits underlying weathervaning and pirouettes in C. elegans are predominantly distinct but exhibit partial overlap. When we restrict our search to the neurons that are connected to each other with the highest number of synaptic connections, we identify projections from the weathervaning circuit to the pirouette circuit; however, we observed no projections in the reverse direction. This finding suggests that the neural circuit of weathervaning, namely our minimal neural network, is unlikely to be affected by that of pirouettes, which consists of AIB interneurons and their downstream interneurons and motor neurons.

      (1) Luo, L., Wen, Q., Ren, J., Hendricks, M., Gershow, M., Qin, Y., Greenwood, J., Soucy, E.R., Klein, M., Smith-Parker, H.K., & Calvo, A.C. (2014). Dynamic encoding of perception, memory, and movement in a C. elegans chemotaxis circuit. Neuron, 82(5), 1115-1128. 

      (2) Antoine Wystrach, Konstantinos Lagogiannis, Barbara Webb (2016) Continuous lateral oscillations as a core mechanism for taxis in Drosophila larvae eLife 5:e15504. 

      (3) Wolf, S., Dubreuil, A.M., Bertoni, T. et al. Sensorimotor computation underlying phototaxis in zebrafish. Nat Commun 8, 651 (2017). 

      (4) Izquierdo, E.J. and Beer, R.D., 2013. Connecting a connectome to behavior: an ensemble of neuroanatomical models of C. elegans klinotaxis. PLoS computational biology, 9(2), p.e1002890. 

      Reviewer #2 (Public review): 

      Summary: 

      This study explores how a simple sensorimotor circuit in the nematode C. elegans enables it to navigate salt gradients based on past experiences. Using computational simulations and previously described neural connections, the study demonstrates how a single neuron, ASER, can change its signaling behavior in response to different salt conditions, with which the worm is able to "remember" prior environments and adjust its navigation toward "preferred" salinity accordingly.  

      We would like to express our gratitude for the time and consideration the reviewer has dedicated to reviewing our manuscript.

      Strengths: 

      The key novelty and strength of this paper is the explicit demonstration of computational neurobehavioral modeling and evolutionary algorithms to elucidate the synaptic plasticity in a minimal neural circuit that is sufficient to replicate memory-based chemotaxis. In particular, with changes in ASER's glutamate release and sensitivity of downstream neurons, the ASER neuron adjusts its output to be either excitatory or inhibitory depending on ambient salt concentration, enabling the worm to navigate toward or away from salt gradients based on prior exposure to salt concentration.

      We would like to thank the reviewer for appreciating our research. 

      Weaknesses: 

      While the model successfully replicates some behaviors observed in previous experiments, many key assumptions lack direct biological validation. As to the model output readouts, the model considers only endpoint behaviors (chemotaxis index) rather than the full dynamics of navigation, which limits its predictive power. Moreover, some results presented in the paper lack interpretation, and many descriptions in the main text are overly technical and require clearer definitions.  

      We would like to thank the reviewer for the constructive feedback. As the reviewer noted, the fundamental assumptions posited in the study have yet to be substantiated by biological validation. Consequently, these assumptions must be directly assessed by biological experimentation. The model performance for salt klinotaxis is evaluated by multiple factors, including not only a chemotaxis index but also the curving rate vs. bearing (Fig. 4a, the bearing is defined in Fig. A3) and the curving rate vs. normal gradient (Fig. 4c). The latter two measures characterize the trajectory during salt klinotaxis. In the revised version, we will revise the manuscript carefully according to the reviewer's suggestions. We would like to express our sincere gratitude for your insightful review of our work.

    1. eLife Assessment

      This important study examines the role of pericytes in patterning the zebrafish blood-brain barrier (BBB) and controlling its permeability. Using pdgfrb mutant zebrafish models lacking brain pericytes, the authors report that pericyte-deficient cerebrovasculatures are ill-patterned, yet display unaltered restrictive BBB permeability properties at larval and juvenile stages. More severe phenotypes are detected in adults, with focal leakage sites associated with hemorrhages and aneurysms. Using solid and beautifully documented imaging, the authors suggest that, contrary to the situation described in rodent models, pdgfrb-dependent pericytes are not required to maintain the BBB in the zebrafish brain; these unexpected and intriguing findings reshape our understanding of BBB permeability regulation in vertebrates.

    2. Reviewer #1 (Public review):

      Summary:

      The study investigates the role of vascular mural cells, specifically pericytes and vascular smooth muscle cells (vSMCs), in maintaining blood-brain barrier (BBB) integrity and regulating vascular patterning. Analyzing zebrafish pdgfrb mutants that lack brain pericytes and vSMCs, they show that mural cell deficiency does not impair BBB establishment or maintenance during larval and early juvenile stages. However, mural cells seem to be crucial for preventing vascular aneurysms and hemorrhage in adulthood as focal leakage, basement membrane disruption, and increased caveolae formation are observed in adult zebrafish at aneurysm hotspots. The authors challenge the paradigm that mural cells are essential for BBB regulation in early development while highlighting their importance for long-term vascular stability.

      Strengths:

      Previous studies have established that the zebrafish BBB shares molecular and morphological homology with e.g. the mammalian BBB and therefore represents a suitable model. By examining mural cell roles across different life stages - from larval to adult zebrafish - the study provides an unprecedented comprehensive developmental analysis of brain vascular development and of how mural cells influence BBB integrity and vascular stability over time. The use of live imaging, whole-brain clearing, and electron microscopy offers high-resolution insights into cerebrovascular patterning, aneurysm development, and structural changes in endothelial cells and basement membranes. By analyzing "leakage hotspots" and their association with structural endothelial defects in adults the presented findings add novel insights into how mural cell loss may lead to vascular instability.

      Weaknesses:

      The study uses quantitative tracer assays with multiple molecular weight dyes to evaluate blood-brain barrier (BBB) permeability. The study normalizes the intensity of tracer signals (e.g., 10 kDa, 70 kDa dextrans) in the brain parenchyma to the vascular signal of a 2000 kDa dextran tracer (assumed to remain within vessels). Intensity normalization is used to control for variations in tracer injection efficiency or vascular density. This method doesn't directly assess the absolute amount of tracer present in the parenchyma, potentially underestimating leakage severity. As the lack of BBB impairment is a "negative" finding, more rigorous controls or other methods might be needed to corroborate it.
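
      To make the normalization concern concrete, here is a minimal sketch. It is not the authors' analysis pipeline; the function name and all intensity values are hypothetical and chosen only for illustration. It assumes the reported measure is the ratio of mean parenchymal tracer intensity to mean intensity of the 2000 kDa vascular reference. Under that assumption, the ratio stays the same when both signals scale together, so it cannot by itself report the absolute amount of tracer in the parenchyma.

```python
from statistics import mean


def normalized_leakage(parenchyma, vascular_reference):
    """Ratio of mean parenchymal tracer intensity to mean vascular reference intensity.

    Both arguments are sequences of intensities in arbitrary units.
    """
    ref = mean(vascular_reference)
    if ref <= 0:
        raise ValueError("vascular reference intensity must be positive")
    return mean(parenchyma) / ref


# Hypothetical intensities (arbitrary units), for illustration only.
fish_a = normalized_leakage([10.0, 12.0, 14.0], [100.0, 120.0, 140.0])

# Same hypothetical animal with both signals doubled, e.g. a larger injected dose.
fish_b = normalized_leakage([20.0, 24.0, 28.0], [200.0, 240.0, 280.0])

print(fish_a, fish_b)
```

      In this sketch the two ratios are equal although the second case has twice the parenchymal signal, which is why an independent reference, such as non-injected controls, could help corroborate a negative finding.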

    3. Reviewer #2 (Public review):

      Summary:

      The authors generated a zebrafish mutant of the pdgfrb gene. The presented analyses and data confirm previous studies demonstrating that Pdgfrb signaling is necessary for mural cell development in zebrafish. In addition, the data support previously published studies in zebrafish showing that mural cell deficiency leads to hemorrhages later in life. The authors presented quantified data on vessel density and branching, assessed tracer extravasation, and investigated the vasculature of adult zebrafish using electron microscopy.

      Strengths:

      The strength of this article is that it provides independent confirmation of the important role of Pdgfrb signaling for the development of mural cells in the zebrafish brain. In addition, it confirms previous literature on zebrafish that provides evidence that, in the absence of pericytes/VSMC, hemorrhages appear (Wang et al, 2014, PMID: 24306108 and Ando et al 2021, PMID: 3431092). The study by Ando et al, 2021 did not report experiments assessing BBB leakage in pdgfrb mutants but in the review article by Ando et al (PMID: 34685412) it is stated that "indicating that endothelial cells can produce basic barrier integrity without pericytes in zebrafish".

      Weaknesses:

      (1) The authors should avoid using violin plots, which show distribution. Instead, they should replace all violin plots in the figures with graphs showing individual data points and standard deviation. For Figure 2f specifically, the standard deviation in the analyzed cohort should be shown.

      (2) The authors have not shown the reduced PDGFRB protein or the effect of mutation on mRNA level in their zebrafish mutant.

      (3) Statistical data analysis: Did the authors perform analyses to investigate whether the data has a normal distribution (e.g., Figures 1d, e)?

      (4) Analysis of tracer extravasation. The use of 2000 kDa dextran intensity as an internal reference is problematic because the authors have not provided data demonstrating that the 2000 kDa dextran signal in vessels remains consistent, with acceptable variance, across the entire vasculature, as a reliable internal reference would require. The variability of this signal within a single animal remains unknown. The presented data do not address this aspect.

      Additionally, it's intriguing that the signal intensity in the parenchyma of the tested tracers presents a substantial range, varying by 20-30% in the analysed cohort (Figure 1g, Extended Figure 1e). Such large variability raises the question of its origin. Could it be a consequence of the normalization to 2000 kDa dextran intensity which differs between different fish? Or is it due to the differences in the parenchymal signal intensity while the baseline 2000 kDa intensity is stable? Or is the situation mixed?

      An alternative and potentially more effective approach would be to cross the pdgfrb mutant line with a line where endothelial cells are genetically labeled to define vessels (e.g. the line kdrl used in acquiring data presented in Figure 2a). Non-injected controls could then be used as a baseline to assess tracer extravasation into the parenchyma.

      How is the data presented in Figure 3e generated? How was the dextran intensity calculated? It looks like the authors have used the kdrl line to define vessels. Was the 2000 kDa still used as in previous figures? If not, please describe this in the Materials and Methods section.

      (5) The authors state that both controls and mutants show extravasation of 1 kDa NHS-ester into the parenchyma. However, the presented images do not illustrate this; it is not obvious from these images (Extended Data Figure 1c). Additionally, the presented quantification data (Extended Data Figure 1e) do not show that, at 7 dpf, the vasculature is permeable to this tracer. Note that the range of signal intensity of the 1 kDa NHS-ester is similar to the 70 kDa dextran (Figure 1g and Extended Figure 1e). Would one expect an increase in the ratio in case of extravasation, considering that the 2000 kDa dextran has the same intensity in all experiments? Please explain.

      (6) The study would be strengthened by a more detailed temporal analysis of the phenotype. When do the aneurysms appear? Is there an additional loss of VSMC?

      (7) The authors intended to analyze the BBB at later stages (line 128), but there is not a significant time difference between 2 months (Figure 2) and 3 months (Figure 3) considering that zebrafish live on average 3 years. Therefore, the selection of only two time-points, 2 and 3 months, to analyze BBB changes does not provide a comprehensive overview of temporal changes throughout the zebrafish's lifespan. How long do the pdgfrb mutants live?

      (8) Why is there a difference in tracer permeability between 2 and 3 months (Figures 2 and 3)? Are hemorrhages not detected in 2-month-old zebrafish?

      (9) Figure 3: The capillary bed should be presented in magnified images as it is not clearly visible. Figure 3e shows that in the pdgfrb mutant the dextran intensity is also higher in regions 6-10. How do the authors explain this?

      (10) In general, the manuscript would benefit from a more detailed description of the performed experiments. How long did the tracer circulate in the experiments presented in Figures 2, 3, and 4?

      (11) How do the authors explain the poor signal of the 70 kDa dextran from the vasculature of 5-month-old zebrafish presented in Extended Data Figure 3?

      (12) The study would benefit from a clear separation of the phenotypes caused by the loss of VSMC. The title implies that capillaries also present hemorrhages, which is not the case. How do vascular mural cells differ from mural cells? Are there any other mural cells?

      (13) I have a few comments about how the authors have interpreted the literature and why, in my opinion, they should revise their strong statements (e.g., the last sentence in the abstract).

      Scientists have their own insights and interpretations of data. However, when citing published data, it should be clearly indicated whether the statement is a direct quote from the original publication or an interpretation. In the current manuscript, the authors have not correctly cited the data presented in the two published papers (references 5 and 6). These papers do not propose a model where pericytes suppress "adsorptive transcytosis" (lines 73-76). While increased transcytosis is observed in pericyte-deficient mice, the specific type of vesicular transport that is increased or induced remains unknown.

      Similarly, lines 151-152 refer to references 5 and 6 and use the term "adsorptive transcytosis," but the authors of both papers did not use this term. Attributing this term to the original authors is inaccurate. Additionally, lines 152-153 do not accurately represent the findings of references 5 and 6. These papers do not state that there is an induction of "caveolae" in endothelial cells in pericyte-deficient mice. In the absence of pericytes, many vesicles can be observed in endothelial cells, but these vesicles are relatively large. It is more likely that there is some form of uncontrolled transcytosis, perhaps macropinocytosis. Please refer to the original papers accurately.

      Also, the authors have missed the fact that in mice, the extent of pericyte loss correlates with the extent of BBB leakage. To a certain extent, the remaining pericytes can compensate for the loss by making longer processes and so ensure the full longitudinal coverage of the endothelium. This was shown in the initial work of Armulik et al (reference 5) and later in other studies.

      The bold assertion on lines 183-187 that a lack of specific BBB phenotype in the pdgfrb zebrafish mutant invalidates mouse model findings is unfounded. Despite the notion that zebrafish endothelium possesses a BBB, I present a few examples highlighting the differences in brain vascular development and why the authors' expectation of a straightforward extrapolation of mouse BBB phenotypes to zebrafish is untenable.

      In mice Pdgfrb knockout is lethal, but in zebrafish, this is not the case. In marked contrast to mice, however, zebrafish pdgfrb null mutants reach adulthood despite extensive cerebral vascular anomalies and hemorrhage. Following the authors' argumentation about the unlikely divergence of zebrafish and mice evolution, does it mean that the described mouse phenotype warrants a revisit and that the Pdgfrb knockout in mice perhaps is not lethal? Another example where the role of a gene product is not one-to-one, which relates to pericyte development, is Notch3. Notch3-null mice do not show significant changes in pericyte numbers or distribution, suggesting a less prominent role in pericyte development compared to zebrafish.

      Although many aspects of development are conserved between species, there are significant differences during brain vascular development between zebrafish and mice. These differences could reveal why the BBB is not impaired in zebrafish pdgfrb mutants. There is a difference in the temporal aspect when various cellular players emerge. The timing of microglia colonization in the brain differs. In mice, microglia colonization starts before the first vessel sprouts enter the brain, while in zebrafish, microglia enter after. Additionally, microglia in zebrafish and mice have a different ontogeny. In mice, astrocytes specialize postnatally and form astrocyte endfeet postnatally. In zebrafish, radial glia/astrocytes form at 48 hpf, and as early as 3 dpf, gfap+ cells have a close relationship with blood vessels. Thus, these radial glia/astrocyte-like cells could play an important role in BBB induction in zebrafish. It's worth noting that in Drosophila, the blood-brain barrier is located in glial cells. While speculative, these cells might still play a role in zebrafish, while the role of pericytes does not seem to be crucial. Pericytes enter the brain and contact with developing vasculature (endothelium) relatively late in zebrafish (60 hpf). In mice, the situation is different, as there is no such lag between endothelium and pericyte entry into the brain. I suggest that the authors approach the observed data with curiosity and ask: Why are these differences present? Are all aspects of the BBB induced by neural tissue in zebrafish? What is the contribution of microglia and astrocytes?

      Another interesting aspect to consider is the endothelial-pericyte ratio and longitudinal coverage of pericytes in the zebrafish brain, and how this relates to what is observed in mice. How similar is the zebrafish vasculature to the mouse vasculature when it comes to the average length of pericytes in the zebrafish brain? Does the longitudinal coverage of pericytes in the zebrafish brain reach nearly 100%, as it does in mice?

      Based on the preceding arguments, it is recommended that the authors present a balanced and insightful discussion that situates their work within a broader framework.

    4. Reviewer #3 (Public review):

      This manuscript examines the role of pdgfrb-positive pericytes in the establishment and maintenance of the blood-brain barrier (BBB) in the zebrafish. Previous studies in PDGFB- or PDGFRB-deficient mice have suggested that loss of pericytes results in disruption of the BBB. The authors show that zebrafish pdgfrb mutant larvae have an intact BBB and that pdgfrb mutant adult fish show large vessel defects and hemorrhage but do not exhibit substantial leakage from brain capillaries, suggesting loss of pericytes is not sufficient to "open" the BBB. The authors use beautiful and compelling images and rigorous quantification to back up most of their conclusions. The imaging of the adult brain is particularly nice. The authors rigorously document the lack of BBB leakage in pdgfrbuq30bh mutant larvae and large vessel phenotypes (eg, enlargement and rupture) in pdgfrbuq30bh mutant adults. A few points would help the authors to further strengthen their findings contradicting the current dogma from rodent models.

      Major point:

      The authors document pericyte loss using a single TgBAC(pdgfrb:egfp)ncv22 transgenic line driven by the promoter of the same gene mutated in their pdgfrbuq30bh mutants. Given their findings on the consequences of pericyte loss directly contradict current dogma from rodent studies, it would be useful to further validate the absence of brain pericytes in these mutants using one of several other transgenic lines marking pericytes currently available in the zebrafish. This could be done using pdgfrb crispants, which the authors show nicely phenocopy the germline mutants, at least in larvae. This would help nail down the absence of any currently identifiable pericyte population or sub-population in the loss of pdgfrb animals and substantially strengthen the authors' conclusions.

      Other issues:

      The authors should provide more information about the pdgfrbuq30bh mutant and how it was generated (including a diagram in a supplemental figure would be useful).

      It would be helpful to show some data on whether mutants show morphological phenotypes or developmental delay at 7 and 14 dpf, to provide some context to better assess the reduced branching and vessel length vascular phenotypes (see Figures 1c-e).

      If available, it would be helpful to have a positive control for the tracer leakage experiments - a genetic manipulation that does cause disruption of the BBB and leakage at 2 hours post-tracer injection (see Figures 1f and g).

      Quantification of the findings in Figure 4c,d would be useful, as would the use of germline fish for these experiments if these are now available. If this is not possible, it would be helpful to document that the crispants used in these experiments lack pdgfrb:egfp pericytes at adult stages (this is only shown for 5 dpf larvae, in Extended Data Figure 4b).

      Adult mutants clearly show less dye leakage in the more superficial capillary regions than WT siblings, but dextran intensity is a bit higher, although this could well be diffusion from more central brain regions where overt hemorrhage is occurring. Along similar lines though, the authors' TEM data in Extended Data Figure 4d hints that there may be more caveolae in mutant brain capillaries, although the N number was lower here than for the measurements from TEM of larger central vessels (Figure 4g). It would be useful to carry out additional measurements to increase the N number in Figure 4d to see whether the difference between wild-type sibling and mutant capillary caveolae numbers remains non-significant.

      It might be helpful to include some orienting labels and/or additional descriptions in the figure legends to help readers who are not used to looking at zebrafish brain vessels have an easier time figuring out what they are looking at and where it is in the brain.

    5. Author response:

      We thank all the reviewers for their detailed comments. In response, we will address the comments with further analysis, experiments and an expanded discussion.

      In terms of each specific reviewer's comments:

      Reviewer 1 was positive overall but had several suggestions and requested more rigorous controls. These are highly constructive technical concerns and will be addressed through additional experimentation and methods for quantification.

      Reviewer 2 summarised the strengths of the study as being largely confirmatory. They have perhaps not fully appreciated that this is the first published functional assessment of cerebral vascular permeability in a pericyte-deficient zebrafish model.

      The reviewer has made a number of very helpful suggestions to improve technical aspects of the analysis. Many align with the suggestions of Reviewer 1. Additional experiments that include more rigorous controls and further methods to quantify vessel permeability will address these concerns in revision.

      We also note that the reviewer calls for a more nuanced and careful discussion section. We take the reviewer's point and do appreciate their concerns. We were limited by word count in the initial submission in short report format, but in response will expand and provide a more thorough discussion.

      Reviewer 3 was positive overall but has suggested additional controls and experiments to further strengthen the findings and support our conclusions. Some align with the suggestions of Reviewers 1 and 2. We agree and aim to address them through additional work in revision.

    1. eLife Assessment

      This important manuscript proposes a new strategy for the identification of new mechanisms of drug resistance based on SAturated Transposon Analysis in Yeast (SATAY), a powerful transposon sequencing method in Saccharomyces cerevisiae. This method uncovers loss- and gain-of-function mutations conferring resistance to 20 different antifungal compounds. The method is convincing, allowing the authors to identify a novel interaction of chitosan with the cell wall mannosylphosphate, and show that the transporter Hol1 concentrates the novel antifungal ATI-2307 within yeast.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors employed Saturated Transposon Analysis in Yeast (SATAY) in the model yeast Saccharomyces cerevisiae to uncover mutations conferring resistance to 20 different antifungal compounds. These screens revealed novel resistance mechanisms and the modes of action for the antifungal compounds Chitosan and ATI-2307. The authors discovered that Chitosan electrostatically interacts with cell wall mannosylphosphate and identified Hol1 as the transporter of ATI-2307.

      Strengths:

      The study highlights the power of SATAY in uncovering drug-resistance mechanisms, modes of action, and cellular processes influencing fungal responses to drugs. Identifying novel resistance mechanisms and modes of action for various compounds in this model yeast provides valuable insights for further investigating these compounds in fungal pathogens and developing antifungal strategies. This study thus represents a significant resource for exploring cellular responses to chemical stresses.

      The manuscript is well-written and highly clear.

      Weaknesses:

      As the study was conducted using highly modified non-pathogenic laboratory yeast strains, verification of the findings in fungal pathogens would greatly enhance its relevance and applicability.

    3. Reviewer #2 (Public review):

      The study begins by exposing wild-type yeast libraries to some well-understood antifungals (amphotericin B, caspofungin, myriocin) to illustrate the complexity and power of the analytical method. These toxins positively selected for loss-of-function transposon insertions in the coding sequences (CDS) of many of the genes identified in earlier studies. The outlier genes were visually evident in scatter plots (Figure 1A, 1B, 1C) but the magnitude and statistical significance of the effects were not presented in tables. There were some unexplained and unexpected findings as well. For example, caspofungin targets the product of the GSC2 gene, and yet transposon insertions in this gene were positively selected rather than negatively selected (seemingly discordant from other studies).

      Interestingly, transposon insertions immediately upstream of toxin targets (Figure 1D) and toxin efflux transporters or their regulators (Figure 1E) were visibly selected by exposure to the toxins, suggesting gain-of-expression. Most of these findings are convincing, even without statistical tests. However, some were not (for example, Soraphen A on YOR1). A relevant question emerges here: Do both ends of the transposon confer the same degree of cryptic enhancer/promoter activity? If one end contains strong activity on downstream gene expression while the other does not, the effects of one may be obscured by the other. The directionality of transposon insertions (not provided) would then be important to consider when interpreting the raw data.

      A masterful rationalization of transposon insertion selection in the YAP1 and FLR1 genes was presented wherein loss of C-terminal auto-inhibitory domain of the Yap1 transcription factor resulted in FLR1 overexpression and resistance to Cerulenin. Transposon insertions in the CDS of YAP1 and FLR1 were negatively selected in Chlorothalonil while the gain-of-function and -expression insertions (enriched in Cerulenin) were not. The rationalization of these findings - that Chlorothalonil activates Yap1 while Cerulenin does not - was much less convincing and should be tested directly with a simple experiment such as Q-PCR.

      Moving to specially engineered yeast strains (Figure 2) where multiple efflux transporters were eliminated (for Prochloraz testing) or new drug targets were inserted (for Fludioxonil and Iprodione), numerous interesting observations were obtained. For instance, transposon insertions in totally different sets of genes were enriched by prochloraz depending on the strain background. Conversely, almost the exact same genes were selected in Fludioxonil and Iprodione, including genes in the well-known HOG pathway. Because several candidate receptors of these compounds were not significant in the Tn-seq dataset, the authors add new evidence to the field suggesting that the introduced gene (BdDRK1) represents the direct, or near-direct, target of these compounds.

      Chitosan effectiveness was studied by Tn-seq in yet another specialized strain of yeast that is uniquely susceptible to the toxin. Once again, the authors masterfully rationalize the complex effects, leading to a simple model where chitosan interacts with mannosyl-phosphate in the cell wall and membrane, which is deposited by Mnn4 and Mnn6 and masked by Mnn1 enzymes in the Golgi complex (themselves regulated or dependent on a number of additional gene products such as YND1). This research compellingly adds to our understanding of an industrial antifungal.

      Finally, the effects of a preclinical antifungal ATI-2307 were studied for the first time. Remarkably, ATI-2307 efficacy greatly depended on HOL1 coding sequences and an upstream enhancer (Figure 4). After engineering hol1∆ strains, uptake of the compound and sensitivity to the compound were lost and then restored by heterologous expression of CaHOL1 from a pathogenic yeast. HOL1 also conferred susceptibility to polyamines with related structures (Pentamidine, Iminoctadine). Remarkably, separation-of-function mutations were obtained in HOL1 that abolished the uptake of the toxins while preserving the uptake of nutrient polyamines in low nitrogen conditions, which strongly suggests that HOL1 encodes a direct transporter of the toxins. The implications are important for ATI-2307 efficacy in patients, where resistance mutations could arise spontaneously and produce poor clinical outcomes.

      Additional comments:

      The experiments presented here are often convincing and serve to illustrate the power of Tn-seq approaches in elucidating drug resistance mechanisms in eukaryotic microbes. The gain-of-expression effects (upstream of CDS), gain-of-function effects (elimination of auto-inhibitory domains), and loss-of-function effects were all carefully exposed and discussed, leading to numerous new insights on the action of diverse toxins.

      On the other hand, several deficiencies and weaknesses (in addition to the minor ones described above) limit the utility of the data that has been generated.

      (1) There was no summary table of Tn-seq data for different genes in the different conditions, so readers could not easily access data for genes and pathways not mentioned in the text. This is especially important because transposon insertions that were negatively selected (of great interest to the community) were barely mentioned. Additionally, the statistical significance of outlier genes was not reported. The same is true for insertions within the DNA segments upstream of CDSs. Users of these data are therefore restricted to visually inspecting insertion sites on a genome browser.

      (2) Only one dose of each toxin was studied, which therefore produces a limited perspective on the genetic mechanisms of resistance in each case.

      (3) No Tn-seq experiments were performed in diploid yeast strains. The gain-of-expression and gain-of-function insertions under positive selection in haploid strains in the different conditions are expected to be dominant in diploid strains as well, while loss-of-function insertions in CDS are expected to be recessive. Do these expectations hold? Could such experiments potentially confirm the models for Cerulenin and Chlorothalonil effects on YAP1 and FLR1? Pathogenic Candida species are usually diploid, where gain-of-function/expression mutants most frequently lead to poor clinical outcomes. Resistance to ATI-2307 through loss of HOL1 may not be as significant for diploid C. albicans with two functional copies of all genes. On a related note, is it possible that transposon insertions in the 3' untranslated region produce anti-sense transcripts that lower the expression of the upstream gene from both alleles in diploids, thereby producing a strong selective advantage in ATI-2307? This study already touches on exciting new applications of the Tn-seq method but could easily go a bit further.

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript describes an extensive application of the Yeast (SATAY) transposon mutagenesis and sequencing method to explore loss- and gain-of-function mutations conferring resistance to 20 different antifungal compounds. Impressively, the authors demonstrate that SATAY can be used to identify mutations that lead to antifungal resistance, including promoter mutations that include the direct targets of antifungal compounds and drug efflux pumps. Because SATAY is not tied to a specific genetic background, the sensitivity of an S. cerevisiae strain, AD1-8, that specifically displays Chitosan susceptibility was examined in detail, and the results suggest that Chitosan acts through interactions with the fungal cell wall. Through a series of experiments that expand upon the SATAY analysis of the novel antifungal ATI-2307, the authors clearly show that the transporter Hol1 concentrates this compound within yeast.

      General Comments:

      This is a very impressive application of SATAY, highlighting many different strategies for exploring the mechanism of action of various antifungal compounds. It's clear from the findings presented that SATAY is a powerful and potentially highly productive approach for chemical-genetic analysis.

    1. eLife Assessment

      This important study seeks to examine the relationship between pupil size and information gain, showing opposite effects dependent upon whether the average uncertainty increases or decreases across trials. Given the broad implications for learning and perception, the findings will be of broad interest to researchers in cognitive neuroscience, decision-making, and computational modelling. Nevertheless, the evidence in support of the particular conclusion is at present incomplete - the conclusions would be strengthened if the authors could both clarify the differences between model-updating and prediction error in their account and clarify the patterns in the data.

    2. Reviewer #1 (Public review):

      Summary:

      This study investigates whether pupil dilation reflects prediction error signals during associative learning, defined formally by Kullback-Leibler (KL) divergence, an information-theoretic measure of information gain. Two independent tasks with different entropy dynamics (decreasing and increasing uncertainty) were analyzed: the cue-target 2AFC task and the letter-color 2AFC task. Results revealed that pupil responses scaled with KL divergence shortly after feedback onset, but the direction of this relationship depended on whether uncertainty (entropy) increased or decreased across trials. Furthermore, signed prediction errors (interaction between frequency and accuracy) emerged at different time windows across tasks, suggesting task-specific temporal components of model updating. Overall, the findings highlight that pupil dilation reflects information-theoretic processes in a complex, context-dependent manner.

      Strengths:

      This study provides a novel and convincing contribution by linking pupil dilation to information-theoretic measures, such as KL divergence, supporting Zénon's hypothesis that pupil responses reflect information gained during learning. The robust methodology, including two independent datasets with distinct entropy dynamics, enhances the reliability and generalisability of the findings. By carefully analysing early and late time windows, the authors capture the temporal dynamics of prediction error signals, offering new insights into the timing of model updates. The use of an ideal learner model to quantify prediction errors, surprise, and entropy provides a principled framework for understanding the computational processes underlying pupil responses. Furthermore, the study highlights the critical role of task context - specifically increasing versus decreasing entropy - in shaping the directionality and magnitude of these effects, revealing the adaptability of predictive processing mechanisms.

      Weaknesses:

      While this study offers important insights, several limitations remain. The two tasks differ significantly in design (e.g., sensory modality and learning type), complicating direct comparisons and limiting the interpretation of differences in pupil dynamics. Importantly, the apparent context-dependent reversal between pupil constriction and dilation in response to feedback raises concerns about how these opposing effects might confound the observed correlations with KL divergence. Finally, subjective factors such as participants' confidence and internal belief states were not measured, despite their potential influence on prediction errors and pupil responses.

    3. Reviewer #2 (Public review):

      Summary:

      The authors proposed that variability in post-feedback pupillary responses during the associative learning tasks can be explained by information gain, which is measured as KL divergence. They analysed pupil responses in a later time window (2.5s-3s after feedback onset) and correlated them with information-theory-based estimates from an ideal learner model (i.e., information gain-KL divergence, surprise-subjective probability, and entropy-average uncertainty) in two different associative decision-making tasks.
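
      The three model terms named above can be illustrated with a minimal Python sketch. It assumes a simple count-based (Dirichlet-categorical) learner with a flat prior; the function names, the prior counts, and the outcome sequence are illustrative and are not taken from the manuscript. Information gain is computed here as the KL divergence of the updated prediction from the previous prediction, which is one common simplification.

```python
import math

def predictive(counts):
    """Predictive probabilities of a count-based (Dirichlet-categorical) learner."""
    total = sum(counts)
    return [c / total for c in counts]

def entropy(p):
    """Shannon entropy in bits: average uncertainty of the prediction."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def surprise(p, outcome):
    """Shannon surprise in bits of the outcome that was actually observed."""
    return -math.log2(p[outcome])

def kl_divergence(post, prior):
    """KL(post || prior) in bits: information gained by the belief update."""
    return sum(q * math.log2(q / p) for q, p in zip(post, prior) if q > 0)

def observe(counts, outcome):
    """Process one trial; return (new_counts, surprise, entropy_before, information_gain)."""
    prior = predictive(counts)
    new_counts = list(counts)
    new_counts[outcome] += 1
    post = predictive(new_counts)
    return new_counts, surprise(prior, outcome), entropy(prior), kl_divergence(post, prior)

counts = [1, 1]  # hypothetical flat prior over two outcomes
for outcome in [0, 0, 1, 0]:  # hypothetical outcome sequence
    counts, s, h, ig = observe(counts, outcome)
    print(f"outcome={outcome} surprise={s:.3f} entropy={h:.3f} info_gain={ig:.3f}")
```

      In this sketch surprise depends only on the probability of the observed outcome, whereas information gain depends on how far the whole prediction moved, so the two can dissociate across trials.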

      Strength:

      The exploration of task-evoked pupil dynamics beyond the immediate response/feedback period and then associating them with model estimates was interesting and inspiring. This offered a new perspective on the relationship between pupil dilation and information processing.

      Weakness:

      However, disentangling these later effects from noise needs caution. Noise in pupillometry can arise from variations in stimuli and task engagement, as well as artefacts from earlier pupil dynamics. The increasing variance in the time series of pupillary responses (e.g., as shown in Figure 2D) highlights this concern.

      It's also unclear what this complicated association between information gain and pupil dynamics actually means. The complexity of the two different tasks reported made the interpretation more difficult in the present manuscript.

    4. Reviewer #3 (Public review):

      Summary:

      This study examines prediction errors, information gain (Kullback-Leibler [KL] divergence), and uncertainty (entropy) from an information-theory perspective using two experimental tasks and pupillometry. The authors aim to test a theoretical proposal by Zénon (2019) that the pupil response reflects information gain (KL divergence). In particular, the study defines the prediction error in terms of KL divergence and speculates that changes in pupil size associated with KL divergence depend on entropy. Moreover, the authors examine the temporal characteristics of pupil correlates of prediction errors, which differed considerably across previous studies that employed different experimental paradigms. In my opinion, the study does not achieve these aims due to several methodological and theoretical issues.

      Strengths:

      (1) Use of an established Bayesian model to compute KL divergence and entropy.

      (2) Pupillometry data preprocessing, including deconvolution.

      Weaknesses:

      (1) Definition of the prediction error in terms of KL divergence:

      I'm concerned about the authors' theoretical assumption that the prediction error is defined in terms of KL divergence. The authors primarily refer to a review article by Zénon (2019): "Eye pupil signals information gain". It is my understanding that Zénon argues that KL divergence quantifies the update of a belief, not the prediction error: "In short, updates of the brain's internal model, quantified formally as the Kullback-Leibler (KL) divergence between prior and posterior beliefs, would be the common denominator to all these instances of pupillary dilation to cognition." (Zénon, 2019).

      From my perspective, the update differs from the prediction error. Prediction error refers to the difference between outcome and expectation, while update refers to the difference between the prior and the posterior. The prediction error can drive the update, but the update is typically smaller, for example, because the prediction error is weighted by the learning rate to compute the update. My interpretation of Zénon (2019) is that they explicitly argue that KL divergence defines the update in terms of the described difference between prior and posterior, not the prediction error.
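
      The distinction can be made concrete with a minimal delta-rule sketch in Python. The learning rate, starting expectation, and outcomes are hypothetical values chosen for illustration; the point is only that the update is the prediction error scaled by the learning rate, so the two quantities differ whenever the learning rate is below 1.

```python
def delta_rule_step(expectation, outcome, learning_rate):
    """One delta-rule step: return (prediction_error, update, new_expectation)."""
    prediction_error = outcome - expectation   # outcome minus expectation
    update = learning_rate * prediction_error  # change actually applied to the belief
    return prediction_error, update, expectation + update

expectation = 0.5  # hypothetical starting expectation
for outcome in [1, 1, 0, 1]:  # hypothetical binary outcomes
    pe, upd, expectation = delta_rule_step(expectation, outcome, learning_rate=0.2)
    print(f"PE={pe:+.3f} update={upd:+.3f} new expectation={expectation:.3f}")
```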

      The authors also cite a few other papers, including Friston (2010), where I also could not find a definition of the prediction error in terms of KL divergence. For example [KL divergence:] "A non-commutative measure of the non-negative difference between two probability distributions." Similarly, Friston (2010) states: Bayesian Surprise - "A measure of salience based on the Kullback-Leibler divergence between the recognition density (which encodes posterior beliefs) and the prior density. It measures the information that can be recognized in the data." Finally, also in O'Reilly (2013), KL divergence is used to define the update of the internal model, not the prediction error.

      The authors seem to mix up this common definition of the model update in terms of KL divergence and their definition of prediction error along the same lines. For example, on page 4: "KL divergence is a measure of the difference between two probability distributions. In the context of predictive processing, KL divergence can be used to quantify the mismatch between the probability distributions corresponding to the brain's expectations about incoming sensory input and the actual sensory input received, in other words, the prediction error (Friston, 2010; Spratling, 2017)."

      Similarly (page 23): "In the current study, we investigated whether the pupil's response to decision outcome (i.e., feedback) in the context of associative learning reflects a prediction error as defined by KL divergence."

      This is problematic because the results might actually have limited implications for the authors' main perspective (i.e., that the pupil encodes prediction errors) and could be better interpreted in terms of model updating. In my opinion, there are two potential ways to deal with this issue:

      a) Cite work that unambiguously supports the perspective that it is reasonable to define the prediction error in terms of KL divergence and that this has a link to pupillometry. In this case, it would be necessary to clearly explain the definition of the prediction error in terms of KL divergence and dissociate it from the definition in terms of model updating.

      b) If there is no prior work supporting the authors' current perspective on the prediction error, it might be necessary to revise the entire paper substantially and focus on the definition in terms of model updating.

      (2) Operationalization of prediction errors based on frequency, accuracy, and their interaction:

      The authors also rely on a more model-agnostic definition of the prediction error in terms of stimulus frequency ("unsigned prediction error"), accuracy, and their interaction ("signed prediction error"). While I see the point here, I would argue that this approach offers a simple approximation to the prediction error, but it is possible that factors like difficulty and effort can influence the pupil signal at the same time, which the current approach does not take into account. I recommend computing prediction errors (defined in terms of the difference between outcome and expectation) based on a simple reinforcement-learning model and analyzing the data using a pupillometry regression model in which nuisance regressors are controlled, and results are corrected for multiple comparisons.

      (3) The link between model-based (KL divergence) and model-agnostic (frequency- and accuracy-based) prediction errors:

      I was expecting a validation analysis showing that KL divergence and model-agnostic prediction errors are correlated (in the behavioral data). This would be useful to validate the theoretical assumptions empirically.

      (4) Model-based analyses of pupil data:

      I'm concerned about the authors' model-based analyses of the pupil data. The current approach is to simply compute a correlation for each model term separately (i.e., KL divergence, surprise, entropy). While the authors do show low correlations between these terms, single correlational analyses do not allow them to control for additional variables like outcome valence, prediction error (defined in terms of the difference between outcome and expectation), and additional nuisance variables like reaction time, as well as x and y coordinates of gaze.

      Moreover, including entropy and KL divergence in the same regression model could, at least within each task, provide some insights into whether the pupil response to KL divergence depends on entropy. This could be achieved by including an interaction term between KL divergence and entropy in the model.
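
      A minimal sketch of such a model, assuming trial-wise KL divergence, entropy, and a scalar pupil measure are available as arrays. The data below are synthetic and for illustration only, the variable names are hypothetical, and nuisance regressors (reaction time, gaze position, outcome valence) would be appended as further columns of the design matrix in a real analysis.

```python
import numpy as np

def fit_interaction_model(kl, entropy, pupil):
    """OLS fit of pupil ~ 1 + KL + entropy + KL:entropy; return coefficients by name."""
    kl = np.asarray(kl, dtype=float)
    entropy = np.asarray(entropy, dtype=float)
    # Mean-center predictors so main effects are interpretable at average levels.
    kl_c = kl - kl.mean()
    ent_c = entropy - entropy.mean()
    design = np.column_stack([np.ones_like(kl_c), kl_c, ent_c, kl_c * ent_c])
    coef, *_ = np.linalg.lstsq(design, np.asarray(pupil, dtype=float), rcond=None)
    return dict(zip(["intercept", "kl", "entropy", "kl_x_entropy"], coef))

# Synthetic data for illustration only (not from the study).
rng = np.random.default_rng(0)
kl = rng.uniform(0.0, 1.0, 500)
entropy = rng.uniform(0.0, 1.0, 500)
pupil = (0.3 * kl - 0.2 * entropy
         + 0.8 * (kl - kl.mean()) * (entropy - entropy.mean())
         + rng.normal(0.0, 0.05, 500))
print(fit_interaction_model(kl, entropy, pupil))
```

      A reliably non-zero `kl_x_entropy` coefficient would indicate that the pupil response to KL divergence changes with entropy within a task, which is the within-task test suggested above.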

      (5) Major differences between experimental tasks:

      More generally, I'm not convinced that the authors' conclusion that the pupil response to KL divergence depends on entropy is sufficiently supported by the current design. The two tasks differ on different levels (stimuli, contingencies, when learning takes place), not just in terms of entropy. In my opinion, it would be necessary to rely on a common task with two conditions that differ primarily in terms of entropy while controlling for other potentially confounding factors. I'm afraid that seemingly minor task details can dramatically change pupil responses. The positive/negative difference in the correlation with KL divergence that the authors interpret to be driven by entropy may depend on another potentially confounding factor currently not controlled.

      (6) Model validation:

      My impression is that the ideal learner model should work well in this case. However, the authors don't directly compare model behavior to participant behavior ("posterior predictive checks") to validate the model. Therefore, it is currently unclear if the model-derived terms like KL divergence and entropy provide reasonable estimates for the participant data.

      (7) Discussion:

      The authors interpret the directional effect of the pupil response w.r.t. KL divergence in terms of differences in entropy. However, I did not find a normative/computational explanation supporting this interpretation. Why should the pupil (or the central arousal system) respond differently to KL divergence depending on differences in entropy?

      The current suggestion (page 24) that might go in this direction is that pupil responses are driven by uncertainty (entropy) rather than learning (quoting O'Reilly et al. (2013)). However, this might be inconsistent with the authors' overarching perspective based on Zénon (2019) stating that pupil responses reflect updating, which seems to imply learning, in my opinion. To go beyond the suggestion that the relationship between KL divergence and pupil size "needs more context" than previously assumed, I would recommend a deeper discussion of the computational underpinnings of the result.

    1. eLife Assessment

      This important study analyzes a large dataset of Salmonella gallinarum whole-genome sequences and provides findings regarding the population structure of this avian-specific pathogen. The convincing results indicate regional adaptation of the mobilome-driven resistome and a role in the evolutionary trajectory of this pathogen that will interest microbiologists and researchers working on genomics, evolution, and antimicrobial resistance.

    2. Reviewer #1 (Public review):

      Summary:

      The investigators in this study analyzed the dataset assembly from 540 Salmonella isolates, and those from 45 recent isolates from Zhejiang University of China. The analysis and comparison of the resistome and mobilome of these isolates identified a significantly higher rate of cross-region dissemination compared to localized propagation. This study highlights the key role of the resistome in driving the transition and evolutionary history of S. Gallinarum.

      Strengths:

      The isolates included in this study were from 16 countries in the past century (1920 to 2023). While the study uses S. Gallinarum as the prototype, the conclusion from this work will likely apply to other Salmonella serotypes and other pathogens.

    3. Reviewer #2 (Public review):

      Summary:

      The authors sequence 45 new samples of S. Gallinarum, a commensal Salmonella found in chickens, which can sometimes cause disease. They combine these sequences with around 500 from public databases, determine the population structure of the pathogen, and coarse relationships of lineages with geography. The authors further investigate known anti-microbial genes found in these genomes, how they associate with each other, whether they have been horizontally transferred, and date the emergence of clades.

      Strengths:

      - It doesn't seem that much is known about this serovar, so publicly available new sequences from a high burden region are a valuable addition to the literature.

      - Combining these sequences with publicly available sequences is a good way to better contextualise any findings.

      - The genomic analyses have been greatly improved since the first version of the manuscript, and appropriately analyse the population and date emergence of clades.

      - The SNP thresholds are contextualised in terms of evolutionary time.

      - The importance and context of the findings are fairly well described.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public review):

      Summary:

      The investigators in this study analyzed the dataset assembly from 540 Salmonella isolates, and those from 45 recent isolates from Zhejiang University of China. The analysis and comparison of the resistome and mobilome of these isolates identified a significantly higher rate of cross-region dissemination compared to localized propagation. This study highlights the key role of the resistome in driving the transition and evolutionary history of S. Gallinarum.

      Strengths:

      The isolates included in this study were from 16 countries in the past century (1920 to 2023). While the study uses S. Gallinarum as the prototype, the conclusion from this work will likely apply to other Salmonella serotypes and other pathogens.

      Thank you very much for your positive feedback. We recognize, as you noted, that emphasizing Salmonella enterica Serovar Gallinarum in the title may lead readers to perceive our methods and conclusions as overly restrictive. In light of your evaluation of our work, we have revised the title to: “Avian-specific Salmonella transition to endemicity is accompanied by localized resistome and mobilome interaction”. We believe this final version not only reflects the applicability of our conclusions, as you appreciated, but also addresses your previous suggestion to highlight the resistome and mobilome.

      Revisions in the manuscript Lines: 1-3

      Weaknesses:

      While the isolates came from 16 countries, most strains in this study were originally from China.

      We believe that this issue was discussed in detail in our previous response. Although potential bias exists, we have minimized its impact by constructing the largest global S. Gallinarum genome dataset to date. In addition, we have further emphasized these limitations in the manuscript.

      Comments on revisions:

      This reviewer is happy with the detailed responses from the authors regarding revising this manuscript. I do not have further comments.

      We greatly appreciate your positive feedback and are pleased that our responses have addressed your concerns.

      Reviewer #2 (Public review):

      Summary:

      The authors sequence 45 new samples of S. Gallinarum, a commensal Salmonella found in chickens, which can sometimes cause disease. They combine these sequences with around 500 from public databases, determine the population structure of the pathogen, and coarse relationships of lineages with geography. The authors further investigate known anti-microbial genes found in these genomes, how they associate with each other, whether they have been horizontally transferred, and date the emergence of clades.

      Strengths:

      - It doesn't seem that much is known about this serovar, so publicly available new sequences from a high burden region are a valuable addition to the literature.

      - Combining these sequences with publicly available sequences is a good way to better contextualise any findings.

      - The genomic analyses have been greatly improved since the first version of the manuscript, and appropriately analyse the population and date emergence of clades.

      - The SNP thresholds are contextualised in terms of evolutionary time.

      - The importance and context of the findings are fairly well described.

      Thank you so much for your thorough review and constructive comments on the manuscript.

      Weaknesses:

      -  There are still a few issues with the genomic analyses, although they no longer undermine the main conclusions:

      We are grateful for the valuable time and effort you have dedicated to improving our manuscript. In this revision, we have provided a point-by-point response to each of your concerns. Moreover, with the addition of new supplementary materials and modifications to the figures, we have re-examined and adjusted the numbering of figures and supplementary materials in the text to ensure they appear correctly in the manuscript.

      (1) Although the SNP distance is now considered in terms of time, the 5 SNP distance presented still represents ~7yrs evolution, so it is unlikely to be a transmission event, as described. It would be better to use a much lower threshold or describe the interpretation of these clusters more clearly. Bringing in epidemiological evidence or external references on the likely time interval between transmissions would be helpful.

      We sincerely thank you for highlighting this issue. We appreciate your concern regarding the use of a 5-SNP threshold to define a transmission event, especially given the approximate 7-year evolutionary timeframe. Considering our updated estimate for the evolutionary rate of S. Gallinarum (approximately 0.74 SNPs per year, with a 95% HPD range of 0.42 to 1.06), we have revised the manuscript to use a 2-SNP threshold (approximately representing less than two years of evolution) to better control the temporal span of transmission events. In addition, we have updated the manuscript to reflect this new threshold and demonstrated that the use of a more stringent SNP threshold does not affect the overall conclusions of the study.
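
      As a rough aid for reading these thresholds, the following Python sketch converts a SNP distance into elapsed time under a strict clock, using the rate and 95% HPD bounds quoted above. How a threshold maps to time depends on the convention: whether SNPs are counted along a single lineage or between two isolates diverging from a common ancestor. Both conventions are shown, and the function name and the choice between them are assumptions for illustration, not the authors' stated method.

```python
def years_for_snps(snp_distance, rate_per_year, pairwise=False):
    """Approximate time needed to accumulate `snp_distance` SNPs under a strict clock.

    pairwise=False: SNPs accumulate along a single lineage (distance / rate).
    pairwise=True: two lineages diverge from a common ancestor, so the distance
    between them grows at twice the rate (distance / (2 * rate)).
    """
    divisor = 2 * rate_per_year if pairwise else rate_per_year
    return snp_distance / divisor

# SNPs per year: mean and 95% HPD bounds as quoted in the response.
RATE_MEAN, RATE_LOW, RATE_HIGH = 0.74, 0.42, 1.06

for threshold in (2, 5):
    for pairwise in (False, True):
        mean = years_for_snps(threshold, RATE_MEAN, pairwise)
        # A faster clock gives a shorter time, so the HPD bounds swap.
        shortest = years_for_snps(threshold, RATE_HIGH, pairwise)
        longest = years_for_snps(threshold, RATE_LOW, pairwise)
        label = "pairwise" if pairwise else "single lineage"
        print(f"{threshold} SNPs ({label}): {mean:.1f} years ({shortest:.1f} to {longest:.1f})")
```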

      Specifically, we adopted the newly established 2-SNP threshold to update Figure 3a and corresponding Supplementary Figure 8. The heatmap on the far right of New Figure 3a illustrates the SNP distances among 45 newly isolated S. Gallinarum strains from two locations in Zhejiang Province (Taishun and Yueqing). New Supplementary Figure 8 simulates potential transmission events between the bvSP strains isolated from Zhejiang Province (n=95) and those from other regions of China with available provincial information (n=435). These analyses collectively demonstrate the localized transmission patterns of bvSP within China.

      For New Figure 3a, we found that even with the 2-SNP threshold, the number of potential transmission events among the 45 newly isolated S. Gallinarum strains from the two Zhejiang locations (Taishun and Yueqing) remains unchanged. In fact, we observed that the results from SNP tracing using an SNP threshold of less than 5 are consistent (see Author response image 1). 

      Author response image 1.

      Clustering results of 45 newly isolated S. Gallinarum strains using different SNP thresholds of 1, 2, 3, 4, and 5 SNPs. The five subplots represent the clustering results under each threshold. Each point corresponds to an individual strain, and lines connect strains with potential transmission relationships.
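
      The kind of threshold-based clustering described in this caption can be sketched in Python as single-linkage grouping over a pairwise SNP distance table. This is an illustrative sketch, not the authors' pipeline: the strain names and distances are hypothetical, and it assumes two strains are linked when their distance is less than or equal to the threshold.

```python
def cluster_by_snp_threshold(names, distances, threshold):
    """Single-linkage clusters: strains are joined when pairwise SNP distance <= threshold.

    `distances` maps (strain_a, strain_b) tuples to SNP distances (hypothetical format).
    """
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical strains and SNP distances, for illustration only.
names = ["S1", "S2", "S3", "S4"]
distances = {
    ("S1", "S2"): 1, ("S1", "S3"): 5, ("S1", "S4"): 15,
    ("S2", "S3"): 4, ("S2", "S4"): 14, ("S3", "S4"): 12,
}
for threshold in (1, 2, 3, 4, 5):
    print(threshold, cluster_by_snp_threshold(names, distances, threshold))
```

      Because linkage is transitive, two strains can fall in the same cluster even when their direct distance exceeds the threshold, which is worth keeping in mind when clusters are read as transmission chains.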

      For New Supplementary Figure 8, we employed the 2-SNP threshold and found that the number of transmission events between the bvSP strains isolated from Zhejiang Province (n=95) and those from other Chinese provinces (n=435) decreased from 91 to 53. The names of the strains involved in these potential transmission events are listed in Supplementary Table 5.

      Revisions in the manuscript

      Lines: 352-357

      Figures: Figure 3; Supplementary Figure 8

      Table: Supplementary Table 5

      (2) The HGT definition has not fundamentally been changed and therefore still has some issues, mainly that vertical evolution is still not systematically controlled for. 

      We sincerely thank you for highlighting this issue. We hope the following explanation will help clarify and improve our manuscript, as well as address your concerns.

      In bacteria, mobile genetic elements (MGEs) such as plasmids, transposons, integrons, and prophages, as mentioned in our manuscript, are segments of DNA that encode enzymes and proteins responsible for mediating the movement of genetic material between bacterial genomes (commonly referred to as “jumping genes”). These MGEs contribute to the mechanisms of horizontal gene transfer (HGT) in Salmonella, including transduction (via prophages), conjugation (via plasmids), and transposition (via integrons and transposons) (Nat Rev Microbiol. 2005 Sep;3(9):722-32). These “jumping genes” can enable Salmonella to acquire additional antimicrobial resistance genes (ARGs), which may not only originate from other Salmonella strains but also from distantly related species.

      To further address your concern regarding the systematic control of vertical evolution, we employed the HGTphyloDetect pipeline developed by Le Yuan et al. (Brief Bioinform. 2023 Mar 19;24(2):bbad035) to control for vertical evolution in the ARG sequences mentioned in our manuscript. We chose HGTphyloDetect because, as noted, "jumping genes" often occur among evolutionarily distant species, rendering the use of Gubbins potentially unsuitable for these distant HGT events.

      Using the HGTphyloDetect pipeline, we extracted base sequences for the eight ARGs shown in Figure 6b with an HGT frequency greater than zero (bla<sup>TEM-1B</sup>, sul1, dfrA17, aadA5, sul2, aph(3’’)-Ib, tet(A), aph(6)-Id). For bla<sup>TEM-1B</sup>, sul1, dfrA17, aadA5, and sul2, the HGT frequency reached 100% across different isolates, indicating that these ARG sequences have a unique sequence type. In contrast, due to the ResFinder settings requiring both similarity and coverage to meet a minimum value of 90%, the base sequences for aph(3’’)-Ib, tet(A), and aph(6)-Id are not unique. Consequently, we applied the HGTphyloDetect pipeline individually to each sequence type of ARGs to verify their association with HGT events. Specifically, among 436 bvSP isolates collected in China, we identified two sequence types of aph(3’’)-Ib, four sequence types of tet(A), and three sequence types of aph(6)-Id.

      Subsequently, to identify potential ARGs horizontally acquired from evolutionarily distant organisms, we queried the translated amino acid sequences of each ARG against the National Center for Biotechnology Information (NCBI) non-redundant protein database. We then evaluated whether these sequences were products of HGT by calculating Alien Index (AI) scores and out_perc values.

      The calculation of AI score is as follows:

      In this study, bbhG and bbhO represent the E-values of the best blast hit in ingroup and outgroup lineages, respectively. The outgroup lineage is defined as all species outside of the kingdom, while the ingroup lineage encompasses species within the kingdom but outside of the subphylum. An AI score ≥ 45 is considered a strong indicator that the gene in question is likely derived from an HGT event.

      Regarding the calculation method for out_perc:

      Finally, according to the definition provided by the HGTphyloDetect pipeline, ARGs with AI score ≥ 45 and out_perc ≥ 90% are presumed to be potential candidates for HGT from evolutionarily distant species. We have compiled the calculation results for the aforementioned genes in New Supplementary Table 9. The results indicate that all ARGs presented in Figure 6b, which exhibited an HGT frequency greater than zero, were acquired horizontally by S. Gallinarum. Based on these findings, we have revised the manuscript accordingly.
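
      The decision rule above can be sketched in Python. The Alien Index is commonly defined as AI = ln(bbhG + 1e-200) − ln(bbhO + 1e-200), and the sketch assumes that form. The definition of out_perc used here (the percentage of BLAST hits that come from the outgroup lineage) is an assumption made for illustration, as are the function names and the example values; the HGTphyloDetect documentation should be consulted for the exact definition.

```python
import math

def alien_index(bbh_ingroup_evalue, bbh_outgroup_evalue):
    """Alien Index from best-hit E-values; 1e-200 keeps the log finite when an E-value is 0."""
    return math.log(bbh_ingroup_evalue + 1e-200) - math.log(bbh_outgroup_evalue + 1e-200)

def out_perc(n_outgroup_hits, n_total_hits):
    """Assumed definition: percentage of BLAST hits that come from the outgroup lineage."""
    return 100.0 * n_outgroup_hits / n_total_hits

def is_hgt_candidate(ai, out_percentage, ai_min=45.0, out_min=90.0):
    """Apply the thresholds quoted in the response (AI >= 45 and out_perc >= 90%)."""
    return ai >= ai_min and out_percentage >= out_min

# Hypothetical values for one gene, for illustration only.
ai = alien_index(bbh_ingroup_evalue=1e-5, bbh_outgroup_evalue=1e-150)
pct = out_perc(n_outgroup_hits=95, n_total_hits=100)
print(f"AI={ai:.1f}, out_perc={pct:.1f}%, HGT candidate: {is_hgt_candidate(ai, pct)}")
```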

      Revisions in the manuscript

      Lines: 302-307; 616-650; 955-957

      Table: Supplementary Table 9

      Using a 5kb window is not sufficient, as LD may extend across the entire genome.

      We agree with your point that linkage disequilibrium (LD) could influence the transmission of genes within chromosomal regions. LD can lead to the non-random co-occurrence of alleles at different loci within a population. Considering that horizontal gene transfer (HGT) events involving more distantly related ARGs may be accompanied by vertical propagation on chromosomes, and to simultaneously assess the impact of LD, we conducted two evaluations.

      It is important to note that the following assessments are based on the assumption that plasmid replicons detected by PlasmidsFinder are part of self-replicating, extrachromosomal DNA.

      (1) In the revised pipeline used to calculate ARG HGT frequencies, we categorized a total of 621 ARGs carried by 436 bvSP isolates collected in China and found that 415 of these ARGs were located on MGEs. We further investigated the distribution of these 415 ARGs across different MGEs, taking into account the complex nesting relationships among them. We observed that 90% of the ARGs (372/415) were located on plasmid contigs. It is important to clarify that this finding does not contradict our statement in the manuscript regarding plasmids and transposons as the primary reservoirs for resistome geo-temporal dissemination. This is because transposons, integrons, and prophages carrying ARGs can also be found on plasmids. Additionally, only 25 bvSG isolates from China contained ARGs, which were likely acquired via transposons or integrons located on the chromosome.  

      (2) In our manuscript, we searched for ARGs within a 5kb upstream and downstream region (a total of 10kb) of transposons and integrons (the BLASTn parameters used in the Bacant pipeline to identify transposons and integrons were set to a coverage threshold of 60%, rather than 100%). However, in light of the potential impact of LD on vertical transmission, we expanded our search to include a 10kb upstream and downstream range (a total of 20kb) for these 25 isolates. The decision to expand the search range to 10kb upstream and downstream is based on the following two considerations: 1) Based on the literature, we determined the overall lengths of the integrons and transposons carried by the 25 isolates (Tn801, Tn6205, Tn1721, In498, In1440, In473, and In282), and found that the maximum length of these elements is ~13.5 kb. Using a 10kb upstream and downstream threshold effectively covers these integrons/transposons. 2) Genomic fragmentation due to next-generation sequencing limits the search range. We present the results of this expanded search for colocalization of ARGs with transposons and integrons at Figshare: https://doi.org/10.6084/m9.figshare.28129130.v1
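
      The flanking-window search described in point (2) can be illustrated with a short sketch. The function and parameter names are hypothetical and are not taken from the authors' notebook. It assumes that the ARG and the transposon or integron lie on the same contig, that coordinates are integer base positions, and that any overlap with the window counts as co-location.

```python
def flank_window(mge_start, mge_end, flank_bp):
    """Extend an element's interval by flank_bp on both sides, clipped at 0."""
    return max(0, mge_start - flank_bp), mge_end + flank_bp


def arg_in_window(arg_start, arg_end, mge_start, mge_end, flank_bp=10_000):
    """True if the ARG interval overlaps the flanked element interval."""
    win_start, win_end = flank_window(mge_start, mge_end, flank_bp)
    return arg_start <= win_end and arg_end >= win_start


# Compare the original 5 kb flank with the expanded 10 kb flank for one ARG.
for flank in (5_000, 10_000):
    print(flank, arg_in_window(12_000, 13_000, 1_000, 3_000, flank_bp=flank))
```

      Running the same ARG and element through both flank sizes shows directly whether widening the window changes the co-location call.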

      We found that these results were consistent with those obtained using the previous search range.

      Taken together, these results suggest that although linkage disequilibrium may influence genetic processes within chromosomal regions, particularly for the few chromosome-associated antibiotic resistance genes linked to integrons and transposons, the overall impact in our study is likely minimal. This conclusion is supported by the observation that 90% of the ARGs in our dataset are located on plasmids, and even an expanded search range does not alter this outcome. Additionally, by incorporating Alien Index scores and calculating out_perc, we can further confirm the occurrence of horizontal gene transfer events.

      However, it is undeniable that other studies using our current pipeline may be affected. As a temporary remedial measure, we have included the following note in the "README" file (https://github.com/tjiaa/Cal_HGT_Frequency):

      “Note: Considering that ARGs located on the chromosome and carried by mobile genetic elements—such as integrons and transposons—may introduce potential computational errors, we recommend evaluating the number of ARGs associated with these elements on the chromosome during your analysis. If a majority of ARGs in your dataset fall into this category, we suggest using additional methods to evaluate the potential impact of linkage disequilibrium. Additionally, by modifying the “MGE_start” and “MGE_end” parameters in the “eLife_MGE_ARG_Co_location.ipynb” script, you can assess the distance between different ARGs and integrons or transposons on the chromosome. This approach will further aid in evaluating the impact of linkage disequilibrium on the genetic process.”

      We believe this approach will assist researchers in further assessing the potential impact of vertical evolution and help other users determine whether additional methods are necessary to account for such effects.

      As the authors have now run gubbins correctly, they could use the results from this existing analysis to find recent HGT.

      We sincerely thank you for your valuable suggestion. Utilizing additional methods to predict potential horizontal gene transfer (HGT) events could indeed enhance the robustness of the results. However, "jumping genes" often occur among evolutionarily distant species, rendering the use of Gubbins potentially unsuitable for these distant HGT events.

      Furthermore, the primary focus of our study is to identify HGT of antimicrobial resistance genes (ARGs) in the Salmonella genome driven by mobile genetic elements. Therefore, we employed the HGTphyloDetect pipeline developed by Le Yuan et al. (Brief Bioinform. 2023 Mar 19;24(2):bbad035) to control for vertical evolution in the ARG sequences. The specific computational methods and conclusions have been detailed above.

      To define mobilisation, perhaps a standard pipeline (e.g. https://github.com/EBI-Metagenomics/mobilome-annotation-pipeline) would be more convincing.

      Thank you for your valuable suggestion. We agree that defining mobilization using a standardized pipeline can add rigor and clarity to our analysis. The pipeline you referenced (https://github.com/EBI-Metagenomics/mobilome-annotation-pipeline) is an excellent resource and provides a robust approach to the identification and annotation of mobile genetic elements.

      We have examined and run this pipeline, which uses “IntegronFinder” and “ICEfinder” to detect integrons, “geNomad” to identify plasmids, and “geNomad” and “VIRify” to detect prophages. Our initial checks revealed that the numbers of integrons, plasmids, and prophages identified using this pipeline were consistent with those detected in our study. However, due to the significantly different output formats, the results from this pipeline could not be integrated with the pipeline we used for calculating HGT frequency.

      We will incorporate the standardized pipeline you suggested in future studies to further improve the reliability of our findings.

      (3) The invasiveness index is better described, but the authors still did not provide convincing evidence that the small difference is actually biologically meaningful (there was no statistical difference between the two strains provided in response Figure 6). What do other Salmonella papers using this approach find, and can their links be brought in? If there is still no good evidence, a better description of this difference would help make the conclusions better supported.

      We sincerely appreciate your thoughtful feedback. The initial introduction of the invasiveness index in our manuscript aimed to quantitatively assess the differences in invasiveness between two geographically distinct strains of S. Gallinarum (isolated from Taishun and Yueqing) by comparing the degradation of 196 top predicted genes associated with invasiveness in their genomes. We found a highly significant statistical difference (P < 0.0001) in the invasiveness index between them.

      Several studies have also employed the invasiveness index to predict biological relevance in Salmonella strains, and we believe these examples provide further context for our approach:

      (1) Caisey V. Pulford et al, Nat Microbiol, 2021, used the same method to calculate the invasiveness index for Salmonella Typhimurium and employed it to characterize the invasiveness of different lineage strains. They found that Salmonella in Lineage-3 exhibited the highest invasiveness index, suggesting an adaptation from an intestinal to a systemic lifestyle. The authors noted, "Although the invasiveness index cannot yet be experimentally validated, Salmonella isolates with different invasiveness indices produce distinct clinical symptoms in a human population (BMC Med. 2020 Jul 17; 18(1):212)". They emphasized the necessity of developing more robust methods to measure Salmonella invasiveness.

      (2) Sandra Van Puyvelde et al, Nat Commun, 2019, reported that Salmonella Typhimurium sequence type 313 (ST313) lineage II.1 exhibited a higher invasiveness index compared to lineage II, suggesting that the two lineages might have distinct adaptations to an invasive lifestyle. Further experiments demonstrated significant differences between these lineages in terms of biofilm formation (a red, dry and rough (RDAR) assay) and metabolic capacity for carbon compounds.

      (3) Wim L. Cuypers et al, Nat Commun, 2023, calculated the invasiveness index for 284 global Salmonella Concord strains across different lineages and found that Lineage-4 potentially exhibited the highest invasiveness.

      We acknowledge that no significant difference in mortality was observed between the L2b and L3b S. Gallinarum strains in 16-day-old SPF chicken embryos. However, the literature cited above suggests that strains with higher invasiveness indices may still exhibit differences in biofilm formation and metabolic capacities, reflecting their adaptation to different host environments. As such, we maintain that the invasiveness index remains a valuable metric for evaluating the genomic differences between S. Gallinarum strains from Taishun and Yueqing. We plan to further investigate these differences through phenotypic experiments in future work.

      In the revised manuscript, we have added the following discussion along with additional references:

      Lines 358-365: “Moreover, the invasiveness index of bvSP from Taishun and Yueqing suggests that different lineages of S. Gallinarum recovered from distinct regions may exhibit biological differences. Previous studies have shown that strains with higher invasiveness indexes tend to be more virulent in hosts (30, 31), potentially causing neurological or arthritic symptoms in S. Gallinarum infections. Furthermore, strains with varying invasiveness indexes have been confirmed to differ in their biofilm formation abilities and metabolic capacities for carbon compounds (32).”

      Revisions in the manuscript:

      Lines: 358-365, 806-827.

      In summary, the analysis is broadly well described and feels appropriate. Some of the conclusions are still not fully supported, although the main points and context of the paper now appear sound.

      Thank you so much for your positive evaluation of our work. We hope that the revised manuscript meets your expectations and offers a more accurate interpretation of our findings.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      This is a great improvement over the first version and I thank the authors for a thorough response, as well as changing their conclusions in response to their improvements.

      Other small remaining issues:

      Figure 3: Heatmap of SNPs is hard to read in grayscale. It also just represents the between clade distances already shown by the tree. It would be more useful to present intraclade distances only to see the SNP resolution _within_ each lineage. Using a better colour scheme would also help.

      Thank you for your insightful comments and suggestions regarding Figure 3. We agree that the grayscale heatmap may present challenges in terms of visual clarity. To address this, we have updated the heatmap with a more distinct color gradient, ensuring better contrast and easier interpretation (New Figure 3). 

      Regarding your second suggestion: "It would be more useful to present intraclade distances only to see the SNP resolution within each lineage," we believe it is already addressed in the current version of New Figure 3. Specifically, the heatmap on the right side of New Figure 3 illustrates the SNP distances between S. Gallinarum isolates from Taishun and Yueqing, with the goal of demonstrating that genomic variation within isolates from a single region is generally smaller compared to those from different regions. In this figure, 45 newly isolated S. Gallinarum strains are categorized into two lineages: L2b and L3b. The heatmap on the right side of Figure 3 displays the SNP distances between all pairwise combinations of these 45 strains, where the intraclade distances are represented by the red regions (highlighting the pairwise distances within each lineage, specifically L3b and L2b, which are indicated by two triangles). The between-clade distances are shown by the blue regions.
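
      The distinction between intraclade and between-clade distances drawn above can be made concrete with a small sketch. It assumes aligned sequences of equal length, counts only positions where both strains carry an unambiguous base, and uses hypothetical strain names; it is an illustration and not the procedure used to produce Figure 3.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count aligned positions where two sequences carry different A/C/G/T bases."""
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a in "ACGT" and b in "ACGT"
    )


def split_distances(seqs, lineage):
    """Separate pairwise SNP distances into within-lineage and between-lineage lists."""
    intra, inter = [], []
    for s1, s2 in combinations(sorted(seqs), 2):
        d = snp_distance(seqs[s1], seqs[s2])
        (intra if lineage[s1] == lineage[s2] else inter).append(d)
    return intra, inter


# Toy alignment with hypothetical strain names.
seqs = {"strainA": "ACGT", "strainB": "ACGA", "strainC": "TTGA"}
lineage = {"strainA": "L2b", "strainB": "L2b", "strainC": "L3b"}
intra, inter = split_distances(seqs, lineage)
```

      Plotting or summarising the two lists separately is one way to show the SNP resolution within each lineage without the between-clade signal dominating the colour scale.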

      We also agree that further exploring the intraclade distances across the entire dataset of 580 S. Gallinarum strains could provide additional insights. However, this analysis would extend beyond the scope of the current section.

      Revisions in the manuscript Line: 998

      Figure: Figure 3

      Please remove Figure 6c, it does not add anything to the paper and raises questions about performing this regression.

      Thank you for pointing out this issue. We have removed Figure 6c and the corresponding description in the "Results" section from the manuscript (New Figure 6).

      Revisions in the manuscript Lines: 316, 319, 1035-1041.

      Figure: Figure 6

      Again, thank you all for your time and efforts in reviewing our work. We believe the improved manuscript meets the high standards of the journal.

    1. eLife Assessment

      In this valuable study, Tutak and colleagues set out to identify factors that mediate Repeat Associated Non-AUG (RAN) translation of CGG repeats in the FMR1 mRNA which are implicated in toxic protein accumulation that underpins ensuing neurological pathologies. The authors provide solid evidence that RPS26 may be implicated in mediating the RAN translation of FMR1 mRNA. This article should be of broad interest to researchers in a variety of disciplines including post-transcriptional regulation of gene expression and neurobiology.

    2. Reviewer #2 (Public review):

      Summary:

      Translation of CGG repeats leads to accumulation of poly G, which is associated with neurological disorders. This is an important paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in HeLa cells were enriched on CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. Experiments were performed in several cell lines and with several reporters with differences in repeats and transfection methods to increase confidence that changes were occurring. New data and details of the methods increase confidence that reporter translation but not global translation is diminished by RPS26 knockdown as concluded. The manuscript has been improved by data showing that new proteins are being synthesized in cells following RPS26 knockdown, and that near-cognate start codon usage is diminished in lines when RPS26 is knocked down, but the mechanism by which RPS26 depletion affects translation is still unclear.

      Strengths:

      - The authors have proteomics data that show enrichment of a set of proteins on FMR1-polyG RNA but not a related RNA.
      - Knockdown of RPS26, which was enriched on the FMR1 RNA, led to decreases in cell growth, but surprisingly did not strongly affect global translation, as assessed by puromycin incorporation.
      - There is some new evidence that near-cognate start codon selection is affected by RPS26 knockdown.

      Weaknesses:

      - The mechanism by which RPS26 knockdown affects translation of the polyG sequences is unclear; whether knockdown affects ribosome levels, extra-ribosomal RPS26 or ribosome composition is not known.

    3. Reviewer #3 (Public review):

      Tutak et al provide intriguing findings demonstrating that insufficiency of RPS26 and related proteins, such as TSR2 and RPS25, downregulates RAN translation from CGG repeat RNA in fragile X-associated conditions. Using an RNA-tagging system and mass spectrometry-based screening, the authors identified RPS26 as a potential regulator of RAN translation. They further confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models. Quantitative mass spectrometry analysis revealed that the expression of some ribosomal proteins is sensitive to RPS26 depletion, while approximately 80% of proteins, including FMRP, were not influenced. Given the limited understanding of the roles of ribosomal proteins in RAN translation regulation, this study provides novel insights into this research field. However, certain data do not fully support the authors' critical conclusions.

      (1) While the authors substituted the ACG near-cognate initiation codon with other near-cognate codons, such as GTG and CTG, in the luciferase assay (Figure 4F), substitution of the ACG codon with an ATG codon should also be performed. Although they evaluated RPS26 knockdown effect on AUG-dependent FMRP translation in Figure 3C, investigating its effect on AUG-dependent repeat-associated translation (e.g., AUG-CGG-repeat) is necessary to substantiate their claim that ACG codon selection is important for RAN translation downregulation by RPS26 knockdown.

      (2) The results of the ASO-based ACG codon-blocking experiment in Figure 4G are difficult to interpret. While RPS26 knockdown reduces FMRpolyG expression, the effect appears attenuated by the ASO-ACG treatment compared to the control. However, for several reasons, this does not conclusively demonstrate that the regulatory effect is directly due to ACG codon selection during translation initiation. For example, ASO-ACG treatment possibly interferes with ribosomal scanning rather than ACG-codon selection, or alters the expression of template CGG repeat RNA. To validate the effect of RPS26 knockdown on ACG codon selection, experiments using the ACG-to-ATG substituted CGG repeat reporter are recommended, as suggested in comment 1.

      (3) The regulatory effects of RPS26 and other molecules on RAN translation have been investigated as effects on the expression levels of FMRpolyG proteins upon knockdown of these molecules in disease model cells expressing CGG repeat sequences (Figures 1C, 1D, 3B, 3C, 3E, 4F, 4G, 5A, 5C, 6A, 6D). However, FMRpolyG expression levels can be influenced by factors other than RAN translation in these cellular experiments, such as template RNA level, template RNA localization, and FMRpolyG protein degradation. Although the authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the direct effect of these regulators on RAN translation by other experiments. An in vitro translation assay that can directly evaluate RAN translation is preferable, but experiments using the ACG-to-ATG substituted CGG repeat reporter, as suggested in comment 1, would also provide valuable insights.

    4. Author response:

      The following is the authors’ response to the current reviews.

      We thank the Reviewers for highlighting the strengths of our work along with suggestions for future directions.

      We agree with the Reviewers that RPS26 depletion may impact not only RAN translation initiation and codon selection (as shown in the experiments in Figure 4G), but also other mechanisms, such as the speed of PIC scanning, as we stated in the discussion. However, we did provide data showing that the mRNA level of exogenous FMR1-GFP does not change upon RPS26 depletion (Figure 3B&C); hence, the observed effect most likely stems from translation regulation. In addition, an experiment with ASO-ACG treatment (Figure 4G) suggests that near-cognate start codon selection or the speed of PIC scanning may be part of the regulation of RAN translation sensitive to RPS26 depletion. Furthermore, our latest unpublished results (Niewiadomska D. et al., in revision) indicate that FMRpolyG fused with GFP is fairly stable, particularly when derived from long repeats (>90xCGG), suggesting that protein stability is not at play in RPS26-dependent regulation.

      We would like to stress that, in order to avoid bias in result interpretation and to mimic the natural situation, the majority of experiments concerning levels of FMRpolyG were performed in cell models with stable expression of ACG-initiated FMRpolyG. Currently, we do not possess a cell model with stable expression of AUG-initiated FMRpolyG, and experiments based on a transient transfection system would not necessarily be comparable to the results obtained in a stable expression system. However, we believe that the experiment presented in Figure 2B serves as a good control for the overall translation level upon RPS26 depletion, indicating that RPS26 insufficiency does not affect global translation and that the observed regulation is specific to some mRNAs, including the one encoding the FMRpolyG frame. We also show that the level of ca. 80% of identified canonical proteins, including FMRP, did not change upon RPS26 silencing (SILAC-MS, Figure 4A). Indeed, we did not explore the ribosome composition upon RPS26 and TSR2 depletion, although most likely the pool of functional ribosomes in the cell is sufficient to support the basal translation level (SUnSET assays, Figure 2B & 5C). However, we cannot exclude the possibility that for some mRNAs, including the one encoding FMRpolyG, the observed effect can be partially caused by a lower number of fully active ribosomes, especially in transient transfection experiments, where transgene expression is hundreds of times higher than for an average native mRNA.

      Finally, we agree with the Reviewer that an in vitro translation assay would provide evidence of a direct effect of RPS26 on FMRpolyG level; however, we did not manage to overcome the technical difficulties in obtaining cellular lysate devoid of RPS26 from vendor companies.


      The following is the authors’ response to the original reviews.

      General Comments

      We thank the Reviewers for the critical comments and experimental suggestions. We considered most of the advice in the revised version of the manuscript, which allowed for a more balanced interpretation of the results presented and further supported the major statement of the manuscript that insufficiency of RPS26 and RPS25 plays a role in modulating the efficiency of noncanonical RAN translation from FMR1 mRNA, which results in the production of toxic polyglycine protein (FMRpolyG). Firstly, by performing new experiments, we showed that silencing of RPS26 and its chaperone protein TSR2, which regulates loading/exchange of RPS26 in the maturing small ribosomal subunit, did not elicit global translation inhibition. Secondly, we demonstrated that, in contrast to RPS26 and RPS25 depletion, silencing the RPS6 protein, a core component of the 40S subunit, did not affect FMRpolyG production, further supporting the specific effect of RPS26 and RPS25 on RAN translation regulation of mutant FMR1 mRNA. We also observed that depletion of RPS26, RPS25 and RPS6 had a significant negative effect on cell proliferation, which is in line with previously published results indicating that insufficiencies of ribosomal proteins negatively affect cell growth. Moreover, we showed that FMRpolyG production is significantly affected by RPS26 depletion when initiated at ACG, but not at other near-cognate start codons. Importantly, translation of FMRP initiated at the canonical AUG codon of the same mRNA upstream of the CGGexp was not affected by RPS26 silencing, similarly to the vast majority of the human proteome. This implies that RAN translation of FMR1 mRNA mediated by RPS26 insufficiency is likely to be dependent on start codon selection/fidelity. In essence, we provide several lines of evidence indicating that the cellular amount of the 40S ribosomal proteins RPS26 and RPS25 is an important factor in CGG-related RAN translation regulation. Finally, we also decided to tone down our claims. Now, we state that RPS26/25/TSR2 insufficiency or depletion affects RAN translation, rather than that the composition of the 40S ribosomal subunit per se influences RAN translation. We have addressed all specific concerns below and made changes to the new version of the manuscript.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tutak et al use a combination of pulldowns, analyzed by mass spectrometry, reporter assays, and fluorescence experiments to decipher the mechanism of protein translation in fragile X-related diseases. The topic is interesting and important.

      Although a role for Rps26-deficient ribosomes in toxic protein translation is plausible based on already available data, the authors' data are not carefully controlled and thus do not support the conclusions of the paper.

      We sincerely appreciate your rigorous, insightful, and constructive feedback throughout the revision process. We believe your guidance has been instrumental in significantly enhancing the quality of our research. Below, we have addressed your comments point-by-point.

      Strengths:

      The topic is interesting and important.

      Weaknesses:

      In particular, there is very little data to support the notion that Rps26-deficient ribosomes are even produced under the circumstances. And no data that indicate that they are involved in the RAN translation. Essential controls (for ribosome numbers) are lacking, no information is presented on the viability of the cells (Rps26 is an essential protein), and the differences in protein levels could well arise from block in protein synthesis, and cell division coupled to differential stability of the proteins.

      We agree that the data presented in the first version of the manuscript did not directly address the following processes: ribosome content, global translation rate and cell viability upon RPS26 depletion. Therefore, we addressed some of these issues in the revised version of the manuscript. In particular, we showed that RPS26 and TSR2 knockdown did not inhibit global translation (new Figure 2B & 4C); hence, we concluded that the changes in FMRpolyG level did not arise from a general translational shutdown. On the other hand, RPS26, RPS25 and RPS6 depletion negatively affected cell proliferation (new Figure 2A, 5D, 6C), which is in line with a number of previously published studies (e.g. Cheng et al, 2019; Havkin-Solomon et al, 2023). However, the extent of the proliferation abnormalities is limited. We agree that the observed effects on RAN translation from mutant FMR1 mRNA may stem from a combination of altered protein synthesis and the condition of the cells, but also from cis-acting factors of mRNA sequence/structure. In new experiments, we showed that single-nucleotide substitution of ACG by other near-cognate start codons changes the sensitivity of RAN translation to RPS26 insufficiency (new Figure 4F). Also, the inhibitory effect of an antisense oligonucleotide binding to the region of the 5’UTR containing the ACG initiation codon (ASO_ACG) differs between cells with different amounts of RPS26 (new Figure 4G).

      We also agree that our data only partially support the role of RPS26-deficient ribosomes in RAN translation. Therefore, we have toned down our claims. Now, we state that RPS26/25/TSR2 insufficiency or depletion affects RAN translation. We also changed the title of the manuscript to: “Insufficiency of 40S ribosomal proteins, RPS26 and RPS25, negatively affects biosynthesis of polyglycine-containing proteins in fragile-X associated conditions” (previously: “Ribosomal composition affects the noncanonical translation and toxicity of polyglycine-containing proteins in fragile X-associated conditions”).

      Specific points:

      (1) Analysis of the mass spec data in Supplemental Table S3 indicates that for many of the proteins that are differentially enriched in one sample, a single peptide is identified. So the difference is between 1 peptide and 0. I don't understand how one can do a statistical analysis on that, or how it would give out anything of significance. I certainly do not think it is significant. This is exacerbated by the fact that the contaminants in the assay (keratins) are many, many-fold more abundant, and so are proteins that are known to be mitochondrial or nuclear, and therefore likely not actual targets (e.g. MCCC1, PC, NPM1; this includes many proteins "of significance" in Table S1, including Rrp1B, NAF1, Top1, TCEPB, DHX16, etc...).

      The data in Table S6/Figure 3A suffer from the same problem.

      I am not convinced that the mass spec data is reliable.

      We thank the Reviewer for the comment concerning the MS data; however, we believe that it may stem from a misunderstanding of the data presented in Tables S3 and S6. Both tables represent the output from MaxQuant analysis (the so-called ProteinGroup) of MS .raw files, without any filtering. As stated in the Materials and Methods, we applied the default parameters suggested by the MaxQuant developers to analyze the MS data; these include identification of proteins based on at least 1 unique peptide, and thus some of the proteins with only 1 unique peptide are shown in Tables S1 and S3. The Reviewer is also right that common contaminants, such as keratins, are included in this output table. However, these identifications are denoted as “CON_” and are filtered out during statistical analysis in the Perseus software. During the statistical analysis, we first filtered out irrelevant protein group identifications, such as contaminants or those only identified by site modifications.
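
      The filtering step described above can be sketched as follows. This is a hedged illustration rather than the Perseus workflow the authors used: the function name is hypothetical, and the column names ("Potential contaminant", "Reverse", "Only identified by site", "Unique peptides", "Protein IDs") are assumed to follow a typical MaxQuant protein groups table, where flagged rows carry a "+" marker. The protein identifiers in the example are invented.

```python
def keep_protein_group(row, min_unique_peptides=1):
    """Return True if a protein-group row passes basic quality filters."""
    # Assumed flag columns; a "+" marks rows to discard.
    flag_columns = ("Potential contaminant", "Reverse", "Only identified by site")
    if any(row.get(col, "") == "+" for col in flag_columns):
        return False
    # Contaminant entries are described in the text as denoted by a "CON_" prefix.
    if row.get("Protein IDs", "").startswith("CON_"):
        return False
    # Optional stricter threshold on the number of unique peptides.
    return int(row.get("Unique peptides", 0) or 0) >= min_unique_peptides


# Invented rows, parsed as strings as they would be when read from a text table.
rows = [
    {"Protein IDs": "PROT_A", "Unique peptides": "3"},
    {"Protein IDs": "CON__KERATIN", "Unique peptides": "12", "Potential contaminant": "+"},
    {"Protein IDs": "PROT_B", "Unique peptides": "1"},
]
default_kept = [r["Protein IDs"] for r in rows if keep_protein_group(r)]
strict_kept = [r["Protein IDs"] for r in rows if keep_protein_group(r, min_unique_peptides=2)]
```

      Raising min_unique_peptides to 2 is one simple way to check how many candidate proteins depend on a single-peptide identification, which is the concern raised by the Reviewer.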

      We have changed the names of the Supplementary Table files, giving a more detailed description. We hope this will help to avoid misunderstanding by a broader audience. Secondly, when comparing the data presented in Table S3 with the volcano plot presented in Figure 1B, one can notice that indeed the majority of identified proteins are not statistically significant (grey points) and thus were not selected for further stratification. The lack of significance of these proteins may be partially due to poor MS identification; however, they are not included in the following parts of the manuscript. Further, we selected only eight proteins (out of over 150) for stratification by orthogonal techniques; thus, we argue that this step validates the biological relevance of the chosen candidate RAN translation modifiers. One should also keep in mind that pull-down samples analyzed by MS often yield lower intensity and identification rates compared to whole-cell analysis, as a result of lower protein input or stringent washes used during sample preparation.

      Regarding the data presented in Table S6 (SILAC data), we argue that these data are of very good quality. More than 2,000 proteins were identified in a 125 min gradient, with over 80% of proteins identified with at least 2 unique peptides. Each of three biological replicates was analyzed three times (technical replicates), giving a total of 9 high-resolution MS runs. Together, we strongly believe that these data are of high confidence.

      (2) The mass-spec data however claims to identify Rps26 as a factor binding the toxic RNA specifically. The rest of the paper seeks to develop a story of how Rps26-deficient ribosomes play a role in the translation of this RNA. I do not consider that this makes sense.

      Indeed, we identified RPS26 as a protein that co-precipitated with FMR1 containing expanded CGG repeats (Supplementary Figure 1G) and found that depletion of RPS26 hindered RAN translation of FMRpolyG, suggesting that RPS26 positively affects RAN translation. However, we did not state that RPS26 directly interacts with the toxic RNA. In order to confirm the specificity of RAN translation regulation by RPS26 insufficiency, we tested whether depletion of another 40S ribosomal protein, RPS6, affects FMRpolyG synthesis. Our experiments showed that there was no significant effect on RAN translation efficiency after RPS6 silencing (new Figure 5C). Importantly, we showed that RPS26 depletion did not inhibit global translation (new Figure 2B). In addition, mutagenesis of the near-cognate start codon (new Figure 4F) and ASO_ACG treatment (new Figure 4G) provided evidence that modulation of FMRpolyG biosynthesis by RPS26 level may depend on start codon selection. In essence, our data suggest that RPS26 depletion specifically affects synthesis of FMRpolyG, but not FMRP derived from the same FMR1 mRNA with CGGexp. However, we do not claim that the observed effect is the consequence of a direct interaction between RPS26 and the 5’UTR of FMR1 mRNA. Downregulation of FMRpolyG biosynthesis could be an outcome of altered ribosomal assembly, decreased efficiency and fidelity of PIC scanning/initiation, impeded elongation, or a combination of all these processes. In the manuscript, we presented the results of experiments that tested many of these possibilities.

      (3) Rps26 is an essential gene, I am sure the same is true for DHX15. What happens to cell viability? Protein synthesis? The yeast experiments were carefully carried out under experiments where Rps26 was reduced, not fully depleted to give small growth defects.

      We agree with the Reviewer that RPS26 and DHX15 are essential proteins, similarly to all RNA binding proteins, and caution should be taken during experimental design. To address this, we titrated different concentrations of siRPS26, and found that administration of 5 nM siRPS26, which just partially silenced RPS26, decreased FMRpolyG by around 50% (new Figure 1D). This impact was even greater with 15 nM siRPS26, as we observed around 80% decrease of FMRpolyG.

      Havkin-Solomon et al. (2023) showed that the proliferation rate is decreased in cells with a mutated C-terminus of RPS26, which is required for contacting mRNA. In accordance with this study, we showed that cells with knocked-down RPS26 proliferate less efficiently (new Figure 2A), but depletion of RPS26 did not impact global translation (new Figure 2B). In addition, our SILAC-MS data indicate that ~80% of proteins with a determined expression level were not affected by RPS26 insufficiency, and ~20% of the proteins turned out to be sensitive to the RPS26 decrease. However, these data do not take protein stability into account.

      (4) Knockdown efficiency for all tested genes must be shown to evaluate knockdown efficiency.

      The current version of the manuscript contains representative western blots with validation of knock-down efficiency (for example in Figure 3B, C, E, Figure 6A) and we included knock-down validations where applicable (Figures 1D, 2B, 4G and 5C).

      (5) The data in Figure 1E have just one mock control, but two cell types (control si and Rps26 depletion).

      Mock control corresponds to the cells treated with lipofectamine reagent and was included in the study to determine the “background” signal from cells treated with delivery agent and reagents used to measure the apoptosis process. These cells were neither expressing FMRpolyG, nor siRNAs. Luminescence signals were normalized to the values obtained from mock control. We added more details describing this assay in the Figure 1 legend.

      (6) The authors' data indicate that the effects are not specific to Rps26 but indeed also observed upon Rps25 knockdown. This suggests strongly that the effects are from reduced ribosome content or blocked protein synthesis. Additional controls should deplete a core RP to ascertain this conclusion.

      We agree that the observed effects may stem from reduced ribosome content; however, we argue that this is not the only possible explanation. Previously, it was shown that RPS25 regulates G4C2-related RAN translation, but knockout of RPS25 does not affect global translation (Yamada S, 2019, Nat. Neuroscience). Similarly, we showed that KD of RPS26 or TSR2 did not significantly reduce the global translation rate (SUnSET assay; new Figure 2B and 5C, respectively).

      Moreover, in a new version of manuscript we included a control experiment, where we silenced core ribosomal protein (RPS6) and found that RPS6 depletion did not affect RAN translation from mutant FMR1 mRNA (new Figure 5C), thus strengthening our conclusion about specific RAN translation regulation by the level of RPS26 and RPS25.

      Finally, our observation aligns well with current knowledge about how deficiency of different ribosomal proteins alters translation of some classes of mRNAs (Luan Y, 2022, Nucleic Acids Res; Cheng Z, 2019, Mol Cell). It was shown that depletion of RPS26 affects translation rate of different mRNAs compared to depletion of other proteins of small ribosomal subunit.

      (7) Supplemental Figure S3 demonstrates that the depletion of S26 does not affect the selection of the start codon context. Any other claim must be deleted. All the 5'-UTR logos are essentially identical, indicating that "picking" happens by abundance (background).

      Supplementary Figure 3D presents results indicating that the mutation at the -4 position (from G to A) did not affect RAN translation regardless of RPS26 presence or depletion. However, this result does not imply that RPS26 has no effect on start codon selection in a sequence- or RNA structure-dependent context. We verified this particular -4 position because it was previously suggested to be an important RPS26-sensitive site in yeast (Ferretti M, 2017, Nat Struct Mol Biol). We agree with the Reviewer that the 5’UTR logos presented in our paper did not show statistical significance for any tested position in human mRNAs. On the other hand, we observed that regulation sensitive to RPS26 level depends on the start codon selected for RAN translation, in particular ACG initiation (new Figure 4F&G). RPS26 depletion affected ACG-initiated but not GTG- or CTG-initiated RAN translation.

      In the previous version of the manuscript, we wrote that we did not identify any specific motifs or enrichment within analyzed transcripts in comparison to the background. On the other hand, we found that the GC-content among analyzed transcripts is higher within 5’UTRs and in close proximity to ATG in coding sequences (Figure 4D), which suggests the importance of stable RNA structures in this region. In addition, we showed that mRNAs encoding proteins responding to RPS26 depletion have shorter than average 5’UTRs (new Figure 4E).

      (8) Mechanism is lacking entirely. There are many ways in which ribosomes could have mRNA-specific effects. The authors tried to find an effect from the Kozak sequence, unsuccessfully (however, they also did not do the experiment correctly, as they failed to recognize that the Kozak sequence differs between yeast, where it is A-rich, and mammalian cells, where it is GGCGCC). Collisions could be another mechanism.

      Indeed, collisions as well as other mechanisms, such as skewed start codon fidelity, may have an effect on the efficiency of FMRpolyG biosynthesis. In the current version of the manuscript, we show that RPS26 amount-sensitive regulation seems to be start codon selection-dependent (new Figure 4F&G).

      Reviewer #2 (Public Review):

      Summary:

      Translation of CGG repeats leads to the accumulation of poly G, which is associated with neurological disorders. This is a valuable paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in Hela cells bound to CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. There are data supporting the claim that RPS26 depletion modulates RAN translation in this RNA, although for some results, the Western results are not strong. The data to support increased aggregation by polyG expression upon S26 KD are incomplete.

      We thank the Reviewer for critical comments and suggestions. We sincerely appreciate your rigorous, insightful, and constructive feedback throughout the revision process.

      Below each specific point, we addressed the mentioned issues.

      Strengths:

      The authors have proteomics data that show the enrichment of a set of proteins on FMR1 RNA but not a related RNA.

      We thank Reviewer for appreciation of provided MS-screening results, which identified proteins enriched on FMR1 RNA with expanded CGG repeats.

      Weaknesses:

      - It is insinuated that RPS26 binds the RNA to enhance CGG-containing protein expression. However, RPS26 reduction was also shown previously to affect ribosome levels, and reduced ribosome levels can result in ribosomes translating very different RNA pools.

      In the previous version of the manuscript we did not state that RPS26 binds directly to RNA with expanded CGG repeats, and we did not show an experiment indicating a direct interaction between the studied RNA and RPS26. What we showed is that RPS26 was enriched in the FMR1 RNA MS samples; however, we did not verify whether the interaction is direct or indirect. We also tested the hypothesis that lack of RPS26 in the PIC may affect the efficiency of RAN translation initiation via a specific Kozak context previously described in yeast (Ferretti M, 2017, Nat Struct Mol Biol). As we described, this hypothesis was not supported. However, we showed that other features of 5’UTR sequences (e.g. higher GC-content or shorter leader sequence) are potentially important for translation efficiency in cells with depleted RPS26.

      Indeed, RPS26 is involved in 40S maturation steps (Plassart L, 2021, eLife) and its insufficiency, mutation, or blocked incorporation into the 40S ribosome may result in incomplete 40S maturation, which subsequently might negatively affect translation per se. However, we did not observe global translation inhibition after depletion of RPS26 or of TSR2, the chaperone involved in incorporation/exchange of RPS26 into the small ribosomal subunit (new Figure 2B and 5C). In addition, our SILAC-MS data indicate that the majority of studied proteins (including FMRP, the main product of the FMR1 gene) were not affected by RPS26 depletion, which can be cautiously extrapolated to global translation. In the revised manuscript we also showed that relatively weak silencing of RPS26 decreased FMRpolyG production in model cells (new Figure 1D).

      We agree that reduced ribosome levels can result in different translation efficiency for different RNA pools. We emphasize this point in the revised manuscript. However, we also showed that changing the near-cognate start codon specific to RAN translation in the same mRNA (single- or two-nucleotide substitution), or targeting this codon with antisense oligonucleotides, altered the sensitivity of FMR1 mRNA translation to RPS26 depletion (new Figure 4F).

      - A significant claim is that RPS26 KD alleviates the effects of FMRpolyG expression, but those data aren't presented well.

      We thank the Reviewer for this comment. In the new version of the manuscript, we have added new microscopic images and improved the explanation of Figure 1E. We have also completed the interpretation of Figure 1F in the main text, the figure image, and the figure legend, and we hope that these changes will improve understanding of our data.

      Recommendations For The Authors:

      - A significant claim is that RPS26 KD alleviates the effects of FMR polyG expression, but those data aren't presented well:

      Figure 1D (supporting data in S2) and 2D - the authors need to show representative images of a control that has aggregation and indicate aggregates being counted on an image. The legend states that there are no aggregates, but the quantification of aggregates/nucleus is ~1, suggesting there are at least 1 per cell. It is preferred to show at least a representative of what is quantified in the main figure instead of a bar graph.

      The representative images of control and siRPS26-treated cells are now shown in revised version of Figure 1E. Additionally, we completed the Figure legend concerning this part, as well as extended description of the experiment in Materials&Methods section.

      Figure 1E - it is unclear what luminescence signal is being measured. Is this a dye for an apoptotic marker? More information is needed in the legend.

      This information was added to the legend of modified Figure 1F (previously 1E) as suggested.

      - Some of the Western blots are not very convincing. Better evidence for the changes in bar graphs would improve how convincing the data are:

      Fig 2B. The western for FMR95G in the first model is not very convincing. The difference by eye for the second siRNA seems to give a larger effect than the first for 95G construct but they appear almost the same on the graph. More supporting information for the quantification is needed.

      We provided a better explanation of WB quantification in the M&M section of the manuscript. Also, we provided an additional blot demonstrating an independent biological replicate of the mentioned experiment in the supplementary materials (Supplementary Figure S2E).

      Figure 4A, the blots for RPS26 and FMR95G are not convincing. They are quite smeary compared to all of the others shown for these proteins in other figures. Could a different replicate be shown?

      We provided additional blot demonstrating the effect on transiently expressed FMRpolyG affected by depletion of TSR2 in COS7 cell line (Supplementary Figure S4A).

      Figure 5A and 5B blots are not ideal. Could a different replicate be shown? Or show multiple replicates in the supplemental figure?

      We provided additional blots from the same experiment, although data is not statistically significant, most likely due to low quality of normalization factor, which is Vinculin (Supplementary Figure S5A). Nevertheless, the level of FMRpolyG is decreased by ~70% after RPS25 silencing in SH-SY5Y cells.

      Figure 2C. Please use the same y axes for all four Westerns in B and C. One would like to compare 95 and 15 repeats, but it is difficult when the y axes are different.

      Thank you for this comment. The y axis was adjusted as suggested by the Reviewer.

      Figure 3D-The text suggests a significant difference between positive and negative responders that is not clear in the figure.

      In the main body of the manuscript we state that: “We did not observe any significant differences in the frequency of individual nucleotide positions in the 20-nucleotide vicinity of the start codon relative to the expected distribution in the BG”, which is in line with the graph shown in Figure 4D (previously 3D).

      Reviewer #3 (Public Review):

      Tutak et al. provide interesting data showing that RPS26 and relevant proteins such as TSR2 and RPS25 affect RAN translation from CGG repeat RNA in fragile X-associated conditions. They identified RPS26 as a potential regulator of RAN translation by an RNA-tagging system and mass spectrometry-based screening for proteins binding to CGG repeat RNA and confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models and patient-derived fibroblasts. Quantitative mass spectrometry analysis found that the expressions of some ribosomal proteins are sensitive to RPS26 depletion while approximately 80% of proteins including FMRP were not influenced. Since the roles of ribosomal proteins in RAN translation regulation have not been fully examined, this study provides novel insights into this research field. However, some data presented in this manuscript are limited and preliminary, and their conclusions are not fully supported.

      (1) While the authors emphasized the importance of ribosomal composition for RAN translation regulation in the title and the article body, the association between RAN translation and ribosomal composition is apparently not evaluated in this work. They found that specific ribosomal proteins (RPS26 and RPS25) can have regulatory effects on RAN translation (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B), and that the expression levels of some ribosomal proteins can be changed by RPS26 knockdown (Figure 3B, however, the change of the ribosome compositions involved in the actual translation has not been elucidated). Therefore, their conclusive statement, that is, "ribosome composition affects RAN translation" is not fully supported by the presented data and is misleading.

      We thank the Reviewer for critical comments and suggestions. We agree that the initial title and some statements in the text were misleading and the presented data did not fully support the aforementioned statement regarding ribosomal composition affecting FMRpolyG synthesis. Therefore, in the revised version of the manuscript we included a control experiment indicating that depletion of another core 40S ribosomal protein (RPS6) did not impact the FMRpolyG synthesis (new Figure 5C), which supports our hypothesis that RPS26 and RPS25 are specific CGG-related RAN translation modifiers. To precisely deliver a main message of our work, we changed the title that will indicate the specific effect of RPS26 and RPS25 insufficiency on RAN translation of FMRpolyG. Proposed title: “Insufficiency of 40S ribosomal proteins, RPS26 and RPS25 negatively affects biosynthesis of polyglycine-containing proteins in fragile-X associated conditions”. We also changed all statements regarding “ribosomal composition” in main text of the new version of manuscript.

      (2) The study provides insufficient data on the mechanisms of how RPS26 regulates RAN translation. Although authors speculate that RPS26 may affect initiation codon fidelity and regulate RAN translation in a CGG repeat sequence-independent manner (Page 9 and Page 11), what they really have shown is just identification of this protein by the screening for proteins binding to CGG repeat RNA (Figure 1A, 1B), and effects of this protein on CGG repeat-RAN translation. It is essential to clarify whether the regulatory effect of RPS26 on RAN translation is dependent on CGG repeat sequence or near-cognate initiation codons like ACG and GUG in the 5' upstream sequence of the repeat. It would be better to validate the effects of RPS26 on translation from control constructs, such as one composed of the 5' upstream sequence of FMR1 with no CGG repeat, and one with an ATG substitution in the 5' upstream sequence of FMR1 instead of near-cognate initiation codons.

      We agree that the data presented in the manuscript imply that insufficiency of RPS26 plays a pivotal role in the regulation of CGG-related RAN translation, and in the revised version of the manuscript we included a series of experiments indicating that ACG codon selection seems to be an important part of RPS26 level-dependent regulation of polyglycine production (new Figure 4F&G; see point 3 below for more details). Importantly, in the luciferase assay shown in Figure 4F we used the AUG-initiated firefly luciferase reporter as a normalization control.

      Moreover, to verify if FMRpolyG response to RPS26 deficiency depends on the type of reporter used, we repeated many experiments using FMRpolyG fused with different tags. The luciferase-based assays were in line with experiments conducted on constructs with GFP tag (new Figure 1D), thus strengthening our previous data. Moreover, in the series of experiments, we show that FMRP synthesis which is initiated from ATG codon located in FMR1 exon 1, was not affected by RPS26 depletion (Figure 3E & 4C), even though its translation occurs on the same mRNA as FMRpolyG. This indicates a specific RPS26 regulation of polyglycine frame initiated from ACG near cognate codon.

      (3) The regulatory effects of RPS26 and other molecules on RAN translation have all been investigated as effects on the expression levels of FMRpolyG-GFP proteins in cellular models expressing CGG repeat sequences (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B). In these cellular experiments, there are multiple confounding factors affecting the expression levels of FMRpolyG-GFP proteins other than RAN translation, including template RNA expression, template RNA distribution, and FMRpolyG-GFP protein degradation. Although authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the effect of these regulators on RAN translation by other experiments such as in vitro translation assay that can directly evaluate RAN translation.

      We agree that there are multiple factors affecting final levels of FMRpolyG-GFP proteins, including the aforementioned processes. We evaluated the level of FMR1 mRNA, which turned out not to be decreased upon RPS26 depletion (Figure 3B&C); therefore, we assumed that what we observed was regulation at the translation level, especially since RPS26 is a ribosomal protein contacting mRNA in the E-site. We believe that direct assays such as in vitro translation may be beneficial; however, depletion of RPS26 from cellular lysate provided by the vendor seems technically challenging, if not completely impossible. Instead, we focused on sequence/structure-specific regulation of RAN translation with the emphasis on start codon selection. This generated valuable results pointing to a role of RPS26 in start codon fidelity (Figure 4F&G). These new results showed that translation from mRNAs differing by just a single- or two-nucleotide substitution in the near-cognate start codon (ACG to GUG or ACG to CUG), although it yields exactly the same protein, is differently sensitive to RPS26 silencing (new Figure 4F). Similar differences were observed for translation efficiency from the same mRNA targeted or not with an antisense oligonucleotide complementary to the region of the RAN translation initiation codon (new Figure 4G). These results also suggest that stability of FMRpolyG is not affected in cells with decreased level of RPS26.

      (4) While the authors state that RPS26 modulated the FMRpolyG-mediated toxicity, they presented limited data on apoptotic markers, not cellular viability (Figure 1E), not fully supporting this conclusion. Since previous work showed that FMRpolyG protein reduces cellular viability (Hoem G, 2019, Front Genet), additional evaluations for cellular viability would strengthen this conclusion.

      We thank the Reviewer for this suggestion. We addressed the apoptotic process in order to determine the effect of RPS26 depletion on RAN translation-related toxicity (Figure 1F). In the revised version of the manuscript, we also added an evaluation of how cell proliferation was affected by RPS26, RPS25, RPS6 and TSR2 depletion. Our data indicate that TSR2 silencing slightly impacted cellular fitness (new Figure 5D), whereas insufficiencies of RPS26, RPS25 and RPS6 had a much stronger negative effect on proliferation (new Figure 2A, 5D, 6C), which is in line with previous data (Cheng Z 2019, Mol Cell; Luan Y, 2022, Nucleic Acids Res). The difference in proliferation rate after treatment with siRPS26 makes proper interpretation of a cellular viability assessment very difficult.

      Recommendations For The Authors:

      (1) It would be nice to validate the effects of overexpression of RPS26 and other regulators on RAN translation, not limited to knockdown experiments, to support the conclusion.

      We did not perform such experiments because we believed that RPS26 overexpression may have no or only a marginal effect on translation or RAN translation. It is likely impossible to efficiently incorporate overexpressed RPS26 into 40S subunits, because the concentration of all ribosomal proteins in the cells is very high.

      (2) It would be better to explain how authors selected 8 proteins for siRNA-based validation (Figure 1C, 1D, S1D) from 32 proteins enriched in CGG repeat RNA in the first screening.

      We selected those candidates based on their functions connected to translation, structured RNA unwinding or mRNA processing. For example, we tested a few RNA helicases because of their known function in RAN translation regulation described by other researchers. This explanation was added to the revised version of the manuscript.

      (3) Original image data showing nuclear FMRpolyG-GFP aggregates should be presented in Figure 1D.

      The representative images of control and siRPS26-treated cells are now shown in modified version of Figure 1E and described with more details in the legend.

      (4) Image data in Figure 2A and 2D have poor signal/noise ratio and the resolution should be improved. In addition, aggregates should be clearly indicated in Figure 2D in an appropriate manner.

      The stable S-FMR95xG cellular model is characterized by very low expression of RAN-translated FMR95xG; therefore, it is challenging to obtain microscopic images of better quality with higher GFP signal. In the L-99xCGG model, expression of the transgene is higher. Therefore, we provided a new image in the new version of Figure 3D (former 2D). Moreover, we showed aggregates on the image obtained using confocal microscopy (new Supplementary Figure 2D).

      (5) The detailed information on patient-derived fibroblast (age and sex of the patient, the number of CGG repeats, etc.) in Figure 2F needed to be presented.

      This information was added to the figure legend (Figure 3F; previously 2F) and in the Material and Methods section as suggested.

      (6) It would be better to normalize RNA expression levels of FMR1 and FMR1-GFP by the housekeeping gene in Figure S2C, like other RT-qPCR experimental data such as Figure 2B.

      Normalization of FMR1-GFP to GAPDH is now shown in modified version of Figure S2C (right graph) as requested by the Reviewer.

      (7) It would be better to add information on molecular weight on all Western blotting data.

      Marks corresponding to the molecular weight ladder were added to all images.

      Full blots, including protein ladders were deposited in Zenodo repository, under doi: 10.5281/zenodo.13860370

      References

      Cheng Z, Mugler CF, Keskin A, Hodapp S, Chan LYL, Weis K, Mertins P, Regev A, Jovanovic M & Brar GA (2019) Small and Large Ribosomal Subunit Deficiencies Lead to Distinct Gene Expression Signatures that Reflect Cellular Growth Rate. Mol Cell 73: 36-47.e10

      Havkin-Solomon T, Fraticelli D, Bahat A, Hayat D, Reuven N, Shaul Y & Dikstein R (2023) Translation regulation of specific mRNAs by RPS26 C-terminal RNA-binding tail integrates energy metabolism and AMPK-mTOR signaling. Nucleic Acids Res 51: 4415–4428

      Hoem G, Larsen KB, Øvervatn A, Brech A, Lamark T, Sjøttem E & Johansen T (2019) The FMRpolyGlycine protein mediates aggregate formation and toxicity independent of the CGG mRNA hairpin in a cellular model for FXTAS. Front Genet 10: 1–18

      Luan Y, Tang N, Yang J, Liu S, Cheng C, Wang Y, Chen C, Guo YN, Wang H, Zhao W, et al (2022) Deficiency of ribosomal proteins reshapes the transcriptional and translational landscape in human cells. Nucleic Acids Res 50: 6601–6617

      Plassart L, Shayan R, Montellese C, Rinaldi D, Larburu N, Pichereaux C, Froment C, Lebaron S, O’Donohue MF, Kutay U, et al (2021) The final step of 40S ribosomal subunit maturation is controlled by a dual key lock. eLife 10

    1. eLife Assessment

      This valuable study uses a massive and long-term experimental data set to provide solid evidence on how tree diversity affects host-parasitoid communities of insects in forests. The work will be of interest to ecologists working on biodiversity conservation, community ecology, and food webs.

    2. Reviewer #2 (Public review):

      Summary

      The authors use a tree biodiversity experiment to evaluate the effects of tree community and canopy cover on communities of cavity-nesting Hymenoptera and their parasitoids and the interactions between these two guilds. They find that multiple measures of tree diversity influence the hosts, parasitoids, and their interactions. In addition, host-parasitoid interactions show a phylogenetic signal.

      Strength

      The authors use a massive, long-term data set, meaningful community descriptors, and a solid set of analyses to explore the impacts of tree communities on host-parasitoid networks. It is rare to have such detailed data from multiple different trophic levels.

      Weakness

      Even though the data span several seasons, this is not considered in the analyses; communities sampled in different years are pooled at the plot level. A more detailed analysis of the variation between years could reveal underlying patterns, as currently the differences in the communities and their structure between years are ignored (e.g., when estimating the phylogenetic compositions, not all the species pooled together actually coexist in time). Also, the precision of the writing should be improved, as it was not always easy to follow the text and the thoughts.

    3. Author response:

      The following is the authors’ response to the original reviews.

      It would be great if the authors could add clarification about the NMDS analyses and the associated results (Fig. 1, Table 1 and Tables S2-4). The overall aim of these analyses was to see how plot characteristics (e.g. canopy cover) and composition of one taxonomic group were related to the composition of another taxonomic group. The authors quantified species composition by two axes from NMDS. (1) This analysis may yield an interpretation problem: if we only find that one of the axes, but not the other, was significantly related to one variable, it would be difficult to determine whether that specific variable is important to the species composition because the composition is co-determined by two axes. (2) It is unclear how the authors did the correlation analyses for Tables S2-4. If correlation coefficients were presented in these tables, then these coefficients should be the same or very similar if we switch the positions of y vs. x. That is, the correlation between host vs. parasite phylogenetic composition would be very close to the correlation between parasite vs. host phylogenetic composition, whereas the authors found that these two relationships were quite different, leading to the interpretation of bottom-up or top-down processes. It is also unclear which correlation coefficient was significant or not because only one P value was provided per row in these tables. (3) In addition to the issues of multiple axes (point 1), NMDS axes simply define the relative positions of the objects in multi-dimensional space, but not the actual dissimilarities. Other methods, such as generalized dissimilarity modeling, redundancy analysis and MANOVA, can be better alternatives.

      Thank you for the thorough and constructive review. We have taken the concerns and questions raised by the editors and reviewers into account and provided clarification about the NMDS analyses as well as additional analyses to confirm our results. First, we have now added a brief explanation in the manuscript regarding the interpretation of the two NMDS axes and how they relate to species composition. Specifically, we clarified that while NMDS defines the relative positions of objects in multi-dimensional space, the two axes together provide a more comprehensive representation of the community composition, which is not solely determined by either axis independently. Second, we acknowledge that alternative approaches could help further strengthen our conclusions. To address this, we incorporated Mantel tests and PERMANOVA (with ‘adonis2’) as additional validation methods. These analyses allowed us to summarize compositional patterns while testing our hypotheses within the framework of the plot characteristics and taxonomic relationships. We have added these analyses and their results in the manuscript to reinforce our findings.

      In methods: L478-481 “To strengthen the robustness of our findings based on NMDS, we further validated the results using Mantel test and PERMANOVA (with ‘adonis2’) for correlation between communities and relationships between communities and environmental variables.”

      L469-475 “NMDS was used to summarize the variation in species composition across plots. The two axes extracted from the NMDS represent gradients in community composition, where each axis reflects a subset of the compositional variation. These axes should not be interpreted in isolation, as the overall species composition is co-determined by their combined variation. For clarity, results were interpreted based on the relationships of variables with the compositional gradients captured by both axes together.”

      In results: L172-177 “The PERMANOVA analysis also highlighted the important role of canopy cover for host and parasitoid community (Table S6-9). The Mantel test revealed a consistent pattern with the NMDS analysis, highlighting a pronounced relationship between the species composition of hosts and parasitoids (Table S10). However, the correlation between the phylogenetic composition of hosts and parasitoids was not significant.”

      In discussion: L257-261 “However, this significant pattern was observed only in the NMDS analysis and not in the Mantel test, suggesting that the non-random interactions between hosts and parasitoids could not be simply predicted by their community similarity and associations between the phylogenetic composition of hosts and parasitoids are more complex and require further investigation in the future.”
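
      For readers unfamiliar with the Mantel test referred to above, the following is a minimal sketch of its permutation logic in Python. It is an illustration only, not the implementation used in the manuscript; the function names `pearson` and `mantel_test`, the choice of Pearson correlation, and the one-sided p-value are assumptions made for this example.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Hypothetical helper: correlate two square distance matrices and
    assess significance by permuting the rows/columns of the first one."""
    n = len(dist_a)
    pairs = list(combinations(range(n), 2))  # upper triangle only
    flat_b = [dist_b[i][j] for i, j in pairs]
    observed = pearson([dist_a[i][j] for i, j in pairs], flat_b)

    rng = random.Random(seed)
    order = list(range(n))
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Rows and columns are permuted together so the matrix stays symmetric.
        permuted = [dist_a[order[i]][order[j]] for i, j in pairs]
        if pearson(permuted, flat_b) >= observed:
            at_least_as_large += 1

    # The observed statistic counts as one of the permutations.
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value
```

      In this sketch, `dist_a` and `dist_b` would stand for pairwise dissimilarities among plots for hosts and for parasitoids, respectively. A non-significant result means only that the two dissimilarity matrices are not linearly related.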

      -- One additional minor point: "site" would be better set as a fixed rather than random term in the linear mixed-effects models, because the site number (2) is too small to make a proper estimate of random component.

      We now treat “site” as a fixed factor in our models, interacting with tree species richness/tree MPD and tree functional diversity to reflect the variation in spatial and tree composition between the two sites. The main results did not change: both sites showed consistent patterns for the effects of tree richness/MPD on network metrics, although these effects were more pronounced at one site.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors analyzed how biotic and abiotic factors impact antagonistic host-parasitoid interaction systems in a large BEF experiment. They found the linkage between the tree community and host-parasitoid community from the perspective of the multi-dimensionality of biodiversity. Their results revealed that the structure of the tree community (habitat) and canopy cover influence host-parasitoid compositions and their interaction pattern. This interaction pattern is also determined by phylogenetic associations among species. This paper provides a nice framework for detecting the determinants of network topological structures.

      Strengths:

      This study was conducted using a five-year sampling in a well-designed BEF experiment. The effects of the multi-dimensional diversity of tree communities have been well explained in a forest ecosystem with an antagonistic host-parasitoid interaction.

      The network analysis has been well conducted. The combination of phylogenetic analysis and network analysis is uncommon among similar studies, especially for studies of trophic cascades. Still, this study has discussed the effect of phylogenetic features on interacting networks in depth.

      Weaknesses:

      (1) The authors should examine species and interaction completeness in this study to confirm that their sampling efforts are sufficient.

      (2) The authors only used Rao's Q to assess the functional diversity of tree communities. However, multiple metrics of functional diversity exist (e.g., functional evenness, functional dispersion, and functional divergence). It is better to check the results from other metrics and confirm whether these results further support the authors' results.

      (3) The authors did not elaborate on which extinction sequence was used in robustness analysis. The authors should consider interaction abundance in calculating robustness. In this case, the author may use another null model for binary networks to get random distributions.

      (4) The causal relationship between host and parasitoid communities is unclear. Normally, it is easy to understand that host community composition (low trophic level) could influence parasitoid community composition (high trophic level). I suggest using the 'correlation' between host and parasitoid communities unless there is strong evidence of causation.

      Thank you very much for your thoughtful and constructive review of our manuscript. We have carefully addressed your comments and made several revisions to improve the clarity and robustness of our work. 1) We appreciate your suggestion regarding species and interaction completeness. To confirm that our sampling efforts were sufficient, we have now included a figure (Fig. S1) showing the species accumulation curve and the coverage of interactions in our study. This ensures that the data collected provide a comprehensive representation of the system. 2) Regarding the use of only Rao’s Q to assess functional diversity, we acknowledge that multiple metrics of functional diversity exist. However, due to the large number of predictors in our analysis, we decided to streamline our approach and focus on Rao’s Q as it provides a robust measure for our research objectives. We have discussed this decision in the revised manuscript and clarified that, while additional metrics could be informative, we believe Rao’s Q sufficiently captures the key aspects of functional diversity in our study. 3) We have elaborated on the robustness analysis and the null model used in our study. Specifically, we have now clarified which extinction sequence (random extinction) was used in our manuscript, and explained that interaction abundance was incorporated into the robustness calculations (networklevel function, weighted=TRUE; see L506). 4) We have revised the text to clarify the relationship between host and parasitoid communities. As you correctly pointed out, while it is intuitive that host community composition influences parasitoid community composition, we have reframed our analysis to emphasize the correlation between the two communities rather than implying causation without strong evidence. We have revised the manuscript to reflect this distinction.
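
      To make the idea of robustness under a random extinction sequence concrete, a simplified sketch in Python follows. It is not the calculation used in the manuscript: it treats the network as binary (ignoring interaction abundance), uses a trapezoidal area under the extinction curve, and the function name `robustness_random_extinction` and its arguments are hypothetical.

```python
import random


def robustness_random_extinction(web, replicates=100, seed=0):
    """Hypothetical sketch. web[h][p] > 0 means host h is attacked by
    parasitoid p. Hosts are removed in random order; a parasitoid is
    counted as secondarily extinct once all of its hosts are gone.
    Returns the mean area under the curve of surviving parasitoids."""
    n_hosts = len(web)
    n_paras = len(web[0])
    rng = random.Random(seed)
    total_area = 0.0

    for _ in range(replicates):
        order = list(range(n_hosts))
        rng.shuffle(order)
        # Number of hosts still available to each parasitoid.
        remaining = [sum(1 for h in range(n_hosts) if web[h][p] > 0)
                     for p in range(n_paras)]
        initial = sum(1 for count in remaining if count > 0)
        if initial == 0:
            raise ValueError("the network contains no interactions")

        alive = initial
        surviving = [1.0]  # fraction of parasitoids alive before any removal
        for h in order:
            for p in range(n_paras):
                if web[h][p] > 0:
                    remaining[p] -= 1
                    if remaining[p] == 0:
                        alive -= 1
            surviving.append(alive / initial)

        # Trapezoidal area; each removal advances the x-axis by 1 / n_hosts.
        area = sum((surviving[i] + surviving[i + 1]) / 2
                   for i in range(n_hosts)) / n_hosts
        total_area += area

    return total_area / replicates
```

      In this sketch a fully specialized network (each parasitoid with a single host) yields 0.5, and values approach 1 as parasitoids use more host species.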

      Reviewer #2 (Public Review):

      Summary:

      In their manuscript, Multi-dimensionality of tree communities structure host-parasitoid networks and their phylogenetic composition, Wang et al. examine the effects of tree diversity and environmental variables on communities of reed-nesting insects and their parasitoids. Additionally, they look for the correlations in community composition and network properties of the two interacting insect guilds. They use a data set collected in a subtropical tree biodiversity experiment over five years of sampling. The authors find that the tree species, functional, and phylogenetic diversity as well as some of the environmental factors have varying impacts on both host and parasitoid communities. Additionally, the communities of the host and parasitoid showed correlations in their structures. Also, the network metrics of the host-parasitoid network showed patterns against environmental variables.

      Strengths:

      The main strength of the manuscript lies in the massive long-term data set collected on host-parasitoid interactions. The data provides interesting opportunities to advance our knowledge on the effects of environmental diversity (tree diversity) on the network and community structure of insect hosts and their parasitoids in a relatively poorly known system.

      Weaknesses:

      To me, there are no major issues regarding the manuscript, though sometimes I disagree with the interpretation of the results and some of the conclusions might be too far-fetched given the analyses and the results (namely the top-down control in the system). Additionally, the methods section (especially statistics) was lacking some details, but I would not consider it too concerning. Sometimes, the logic of the text could be improved to better support the studied hypotheses throughout the text. Also, the results section cannot be understood as a stand-alone without reading the methods first. The study design and the rationale of the analyses should be described somewhere in the intro or presented with the results.

      Thank you very much for your valuable comments and suggestions on our manuscript! We appreciate your feedback and have made revisions accordingly. Specifically, we have rephrased the interpretation of the results and conclusions to better align with the analyses and avoid overstatements, particularly concerning the top-down control in the system. In addition, we have expanded the methods section by providing more details, especially regarding the statistical approaches, to address the points you raised. To enhance the clarity of the manuscript, we have also ensured that the logic of the text better supports the hypotheses throughout. Please see our point-by-point responses below for additional clarifications.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Line 120: "... and large ecosystems susceptible to global change (add citation here)": Citation(s)?

      We have now added the missing citations.

      Line 141: Add sampling completeness information.

      We now provide a new figure on sampling completeness (Fig. S1) in the supplementary materials, which shows that the sampling effort was adequate for our study.

      Line 151: use more metrics in the evaluation of functional diversity

      We used Rao’s Q for tree functional diversity, which is an integrated and widely used metric of functional dissimilarity among trees. As our study focuses on multiple diversity indices of trees, we prefer not to give disproportionate attention to one type of diversity. Thank you for your suggestion!

      Line 164: host vulnerability. Although generality and vulnerability are commonly used in network analysis, it is better to link these metrics with the trophic level, like the 'host vulnerability' you used. Thus, you can use 'parasitoid generality' instead of 'generality'.

      Thanks for your suggestion. The metrics are now labeled with their trophic levels throughout the text.

      Line 169: two'.'

      Corrected.

      Line 173: 'parasitoid robustness' Or 'robustness of parasitoids'?

      Changed to ‘robustness of parasitoid’.

      Lines 173, 468: For the robustness estimations, maybe use null model for binary networks to get random distributions?

      Thanks for the suggestion. We used Patefield null models to compare the observed robustness with randomized robustness, which helps to assess whether the robustness of the observed network differs significantly from that expected by chance. All robustness indices across plots were significantly different from a random distribution; see Results, L197-201.

      Line 184: modulating interacting communities of hosts and parasitoids.

      Changed accordingly.

      Line 186: determined host-parasitoid interaction patterns

      Changed accordingly.

      Line 191: Biodiversity loss in this study refers to low trophic levels.

      We have now clarified this point.

      Line 190: understand

      Changed accordingly.

      Lines 215-216: Reorganize these sentences

      Line 227: indirectly influenced by...

      Changed accordingly.

      Line 238: Be more specific. Which type of further study?

      Rephrased to be more specific.

      Lines 297-299: rewrite this sentence to make it more transparent.

      We have now rewritten the sentence accordingly.

      Line 302: Certain

      Changed accordingly.

      Line 453: effective

      Changed accordingly.

      Finally, the authors should check the text carefully to avoid grammatical errors.

      Thanks, we have now checked the full text for grammatical errors.

      Reviewer #2 (Recommendations For The Authors):

      I feel that the authors have very interesting data and have a solid set of analyses. I do not have major issues regarding the manuscript, though sometimes I disagree with the interpretation of the results and some of the conclusions might be too far-fetched given the analyses and the results. Additionally, the methods section (especially statistics) was lacking some details, but I would not consider it too concerning at this point.

      I feel that the largest caveat of the manuscript remains in the representation of the rationale of the study. I felt the introduction could be more concise and be better focused to back up the study questions and hypotheses. Many times, the sentences were too vague and unspecific, and thus, it was difficult to understand what was meant to be said. The authors could mention something more about how community composition of hosts and parasitoids are expected to change with the studied experimental design regarding the metrics you mention in the introduction (stronger hypotheses). The results section cannot be understood as a stand-alone without reading the methods first. The study design and the rationale of the analyses must be described somewhere in the intro or results, if the journal/authors want to keep the methods-last structure. Also, the results and discussion could be more focused around the hypotheses. Naturally, these things can be easily fixed.

      I also disagree with the interpretation of results finding top-down control in the system (it might well be there, but I do not think that the current methods and tests are suitable in finding it). First, the used methodology cannot distinguish parasitoids if the hosts are not there and the probability of detecting a parasitoid likely depends on the abundance of the host. Thus, the top-down regulation is difficult to prove (is it the parasitoids that have driven the host population down). Secondly, I would be hesitant to say anything about the top-down and bottom-up control in the systems as the data in the manuscript is pooled across five years while the top-down/bottom-up regulation in insect systems usually spans only one season/generation in time (much shorter than five years). Consequently, the analyses are comparing communities of species, some of which most likely do not co-exist (they were found in the same space but not during the same time). Luckily, the top-down/bottom-up effects could potentially be explored by using separately the time steps of the now pooled community data: e.g., does the population of the host decrease in t if the parasitoids are abundant in t-1? There are also other statistical tests to explore these patterns.

      In the manuscript "Phylogenetic composition" refers to Mean Pairwise Distance. I would use "phylogenetic diversity" instead throughout the text. Also, to my understanding, in trees both "phylogenetic composition" and "phylogenetic diversity" are used even though based on their descriptions, they are the same.

      Detailed comments:

      Punctuation needs to be checked and edited at some point (I think copy-pasting had left things in the wrong places). Please check that "-" instead of "-" is used in host-parasitoid.

      1-2 The title is not very matching with the content. "Multi-dimensionality" is not mentioned in the text. "phylogenetic composition" -> "phylogenetic diversity"

      We did not find a role for tree functional diversity in host-parasitoid interactions, but tree richness and phylogenetic diversity remain. We also disagree with replacing “phylogenetic composition” with “phylogenetic diversity”, because diversity highlights higher or lower phylogenetic distance among communities, while the latter highlights the phylogenetic dissimilarity across communities.

      53-57 This sentence is quite vague and because of it, difficult to follow. Consider rephrasing and avoiding unspecified terms such as "tree identity", "genetic diversity", and "overall community composition of higher trophic levels" (at least, I was not sure what taxa/level you meant with them).

      Rephrased.

      L58-61 “Especially, we lack a comprehensive understanding of the ways that biotic factors, including plant richness, overall community phylogenetic and functional composition of consumers, and abiotic factors such as microclimate, determining host–parasitoid network structure and host–parasitoid community dynamics.”

      56 I would remove "interact" as no interactions were tested.

      Removed accordingly.

      59-60 This needs rephrasing. I feel "taxonomic and phylogenetic composition should be just "species composition". To better match, what was done: "taxonomic, phylogenetic, and network composition of both host and parasitoid communities" -> "species and phylogenetic diversity of both host and parasitoid communities and the composition their interaction networks"

      Changed accordingly.

      62 Remove "tree composition".

      Done.

      62 Replace "taxonomic" with "species". Throughout the text.

      Done.

      63-64 "Generally, top-down control was stronger than bottom-up control via phylogenetic association between hosts and parasitoids" I disagree, see my comments elsewhere.

      We have now rephrased the sentence.

      L68-70 “Generally, phylogenetic associations between hosts and parasitoids reflect non-randomly structured interactions between phylogenetic trees of hosts and parasitoids.”

      68 "habitat structure and heterogeneity" This is too strong and general of a statement based on the results. You did not really measure habitat structure or heterogeneity.

      We have now rephrased the statement to avoid an overly strong and general description.

      L71-73 “Our study indicates that the composition of higher trophic levels and corresponding interaction networks are determined by plant diversity and canopy cover especially via trophic phylogenetic links in species-rich ecosystems.”

      69 Specify "phylogenetic links". Trophic links?

      Specified to “trophic phylogenetic links”.

      75-77 The sentence is a bit difficult to follow. Consider rephrasing.

      We have now rephrased it.

      L79-82 “Changes in network structure of higher trophic levels usually coincide with variations in their diversity and community, which could be in turn affected by the changes in producers via trophic cascades”

      76 Be more specific about what you mean by "community of trophic levels".

      Specified to “community composition”.

      79 Remove "basal changes of", it only makes the sentence heavier.

      Done.

      81 What is "species codependence"?

      We aimed to describe species co-occurrence that depends on their close relationships. For clarity, we have now changed this to “species coexistence”.

      82 What do you mean by "complex dynamics"?

      Rephrased to “mechanisms on dynamics of networks”.

      83 onward: I would not focus so much on top-down/bottom-up as I feel that your current analyses cannot really say anything too strong about these causalities but are rather correlative.

      Thanks, we have now removed the relevant content from the discussion. However, we kept one sentence in the Introduction, because it should be highlighted to make reviewers aware of this (the other text about this was removed).

      89 Remove "environmental".

      Done.

      90 Specify what you mean by "these forces".

      Done.

      98-99 I have difficulties following the logic here "potential specialization of their hosts may cascade up to impact the parasitoids' presence or absence". Consider rephrasing.

      We have now rephrased it.

      L101-102 “…and their host fluctuations may cascade up to impact the parasitoids’ presence or absence.”

      100 Be more specific with "habitat-level changes".

      Specified to “community-level changes”

      100 I do not see why host-parasitoid systems would be ideal to study "species interactions". There are much simpler and easier systems available.

      Changed to “… one of ideal…”

      101-103 "influence of" on what?

      We have now rephrased the sentence.

      L104-105 “Previous studies mainly focused on the influence of abiotic factors on host-parasitoid interactions”

      104 Be more specific in "the role of multiple components of plant diversity".

      Now we specified "the role of multiple components of plant diversity".

      L107-108 “…the role of multiple components of plant diversity (i.e. taxonomic, functional and phylogenetic diversity)…”

      106 "diversity associations" of what?

      “diversity associations between host and parasitoids”.

      108 Specify the "direct and indirect effects".

      Now we specified it to “…direct and indirect effects (i.e. one pathway and more pathways via other variables)…”

      110-113 A bit heavy sentence to follow. Consider rephrasing.

      We have now rephrased the sentence to make it more readable.

      114 Give an example of "phylogenetic dependences".

      Done. Phylogenetic dependences (e.g. phylogenetic diversity)

      117 Move the "e.g. taxonomic, phylogenetic, functional" within brackets in 117 after "dimensions of biodiversity".

      Done.

      120 "(add citation here)" Yes please!

      Done.

      120-121 Specify "such relationships".

      Done. Specified to “multiple dimensions of biodiversity”

      128-130 This is difficult to follow. Please rephrase.

      We have now rephrased the sentence.

      L135-137 “We aimed to discern the primary components of the diversity and composition of tree communities that affect higher trophic level interactions via quantifying the strength and complexity of associations between hosts and parasitoid.”

      131-132 Remove "phylogenetic and". It is redundant to phylogenetic diversity.

      Done.

      128 Tested robustness does not really capture "stability of associations".

      Yes, we agree. We have now rephrased the sentence and removed the “stability” description.

      133 Specify "phylogenetic processes".

      Now we specified “phylogenetic processes”.

      L140-141 “…especially via phylogenetic processes (e.g. lineages of trophic levels diverge and evolve over time)…”

      141 I would like to have more details on the data set somewhere in the results. How many individuals and species were found in each plot (on average)? Was there a lot of temporal variation (e.g. between the seasons)? On how many sites were the insect species found?

      Thanks for your suggestion. We now provide more details on the data set in the results (L153-156), including the mean number of individuals and species per plot. However, temporal variation is a relatively independent topic that should be studied separately, as our study focuses on the general pattern of interactions between hosts and parasitoids. Therefore, we have not added more information on temporal changes, so that readers do not get lost in the text.

      L153-156 “Among them, we found 56 host species (12 bees and 44 wasps, mean abundance and richness are 400.05 and 45.14, respectively, for each plot) and 50 parasitoid species (38 Hymenoptera and 12 Diptera, mean abundance and richness are 14.07 and 9.05, respectively, for each plot).”

      149 tree -> trees

      Done.

      149 Should there read also some else than "NMDS scores"?

      Thanks! Now we provided more details about “NMDS scores”.

      L161-162 “(NMDS axis scores; i.e. preserving the rank order of pairwise dissimilarities between samples)”

      149 You could mention the amount of variation explained by the first two axes of the NMDSs. Now it is difficult to estimate how much the models actually explain.

      Thanks for your comments! However, we could not directly provide the explanatory power of the two axes, because NMDS is based on rank-order distances rather than linear relationships like in PCA. However, the goodness of fit for the NMDS solution is typically evaluated using the stress value. We provide the stress value in the figure caption.

      150 "tree MPD" is mentioned for the first time. Spell it out.

      Done.

      150 Explain "eastness".

      Done.

      L163-164 “…eastness (sine-transformed radian values of aspect) )”
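
      The "eastness" transformation quoted above can be illustrated with a short Python sketch. The function name `eastness` and the convention that aspect is given in degrees clockwise from north are assumptions for this example; the manuscript states only that eastness is the sine of the aspect expressed in radians.

```python
import math


def eastness(aspect_degrees):
    """Sine of the slope aspect after converting degrees to radians.

    Assumes aspect is measured in degrees clockwise from north, so that
    east-facing slopes (90) give 1 and west-facing slopes (270) give -1."""
    return math.sin(math.radians(aspect_degrees))
```

      The transformation turns a circular variable into a linear gradient: aspects of 1 and 359 degrees, which are nearly identical directions, receive similar values near zero instead of opposite extremes.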

      151 How was "tree functional diversity" quantified?

      Please see methods. L437-L438.

      160 Specify that you talk about phylogenetic compositions of the host and parasitoid communities here.

      We prefer to keep the wording concise here, consistent with our use of “species composition”. Phylogenetic composition simply represents the dissimilarities of phylogenetic lineages within a community.

      161 Describe "parafit" test here when first mentioned.

      Done, see methods L485-487.

      182 Keep on referring to tables and figures in the discussion! Also, more clearly discuss your hypotheses. There are lots of discussions on top-down/bottom-up control. It could be good to form a hypothesis on them and predict what kind of patterns would suggest either one and what would you expect to find regarding them.

      We now refer to figures and tables in the discussion. As the content on top-down and bottom-up control did not fit our study very well (as also suggested by the reviewers), we rephrased the discussion and now discuss our hypotheses more clearly. See L218, L226, and L237 etc.

      186 "partly determined host-parasitoid networks" Be more specific.

      Done.

      L206-207 “…partly determined host-parasitoid network indices, including vulnerability, linkage density, and interaction evenness.”

      195 Tell what you mean by "other biotic factors".

      Specified it: “…other biotic factors such as elevation and slope…”

      197-198 "It seems likely that these results are based on bee linkages to pollen resources" I would be hesitant to conclude this as the bees most likely forage way beyond the borders of the 30m by 30m study plots.

      Thanks for your concern about this problem. While it is true that bees can forage beyond 30 x 30m, the study focuses on their nesting behavior and activity within this defined area, rather than their entire foraging range. Existing literature shows bees often forage locally when resources are available (e.g. Ebeling et al., 2012 Oecologia; Guo et al., year, Basic and Applied Ecology). Therefore, we are confident that this pattern could be associated with the resources around the trap nests.

      223 "This could be further tested by collecting the food directly used by the wasps (caterpillars)" A bit unnecessary addition.

      Thanks for your suggestion. This is definitely a good point. We currently do not have enough data on caterpillars, but we will follow up on this in the future.

      232-238 I disagree with the authors on the interpretation of the causality of the results here. I think that the community of parasitoids simply indicates which host species are available, while the host community does not have an as strong effect on parasitoid community as parasitoids do not utilise the whole species pool of the hosts. (Presence of parasitoid tells that the host is around while the presence of the host does not necessarily tell about the presence of the parasitoid.) To me, this would rather indicate a bottom-up than top-down regulation. Similar patterns are also visible in species communities of hosts and parasites.

      Thank you for your suggestion. We agree that parasitoids are more dependent on hosts, as hosts are not always attacked by parasitoids. We have now rephrased our explanation to follow this argument.

      L254-256 “Such pattern could be further confirmed by the significant association between host phylogenetic composition and parasitoid phylogenetic composition (Fig. 1c), which suggested that their interactions are phylogenetically structured to some extent.”

      247-266 The logic in this section is difficult to follow. Try rephrasing.

      We have now rephrased the section to make the logic clearer.

      L270-287 “Tree community species richness did not significantly influence the diversity of hosts targeted by parasitoids (parasitoid generality), but caused a significant increase in the diversity of parasitoids per host species (host vulnerability) (Fig. 3a; Table 2). This is likely because niche differentiation often influences network specialization via potential higher resource diversity in plots with higher tree diversity (Lopez-Carretero et al. 2014). Such positive relationship between host vulnerability and tree species richness suggested that host-parasitoid interactions could be driven through bottom-up effects via benefit from tree diversity. For example, parasitoid species increases more than host diversity with increasing tree species richness (Guo et al. 2021), resulting increasing of host vulnerability at community level. According to the enemies hypothesis (Root 1973), which posits a positive effects of plant richness on natural enemies, the higher trophic levels in our study (e.g. predators and parasitoids) would benefit from tree diversity and regulate herbivores thereby (Staab and Schuldt 2020). Indeed, previous studies at the same site found that bee parasitoid richness and abundance were positively related to tree species richness, but not their bee hosts (Fornoff et al. 2021, Guo et al. 2021). Because our dataset considered all hosts and reflects an overall pattern of host-parasitoid interactions, the effects of tree species richness on parasitoid generality might be more complex and difficult to predict, as we found that neither tree species richness nor tree MPD were related to parasitoid generality.”

      249 "This is likely because niche differentiation often influences network specialization via potential higher resource diversity in plots with higher tree diversity" This is a bit contradicting your vulnerability results as niche differentiation should increase specialization and diversity and specialization should decrease vulnerability (less host per parasitoid).

      Thanks! We understand that the concepts of “generality” and “vulnerability” can be a bit confusing. To clarify, “fewer hosts per parasitoid” actually corresponds to lower generality at the community level.

      332-337 How did you select the species growing on your plots? Or was only species number considered? What was the pool of tree species growing on the selected plots? Was the selection similar at both sites?

      Now we provided more information on the experiment design.

      L354-356 “The species pools of the two plots are nonoverlapping (16 species for each site). The composition of tree species within the study plots is based on a “broken-stick” design (see Bruelheide et al. 2014).”

      342 Remove "centrally per plot"?

      Done.

      346-347 Was the selection of different reed diameters similar in all the plots?

      Diameters and the relative distribution of diameters were similar in all trap nests.

      399 & 432 Are "phylogenetic diversity of the tree communities" and "phylogenetic composition of trees" the same? They are both described as mean pairwise distance.

      These two are actually different: we use them to distinguish phylogenetic diversity within communities from the rank order of dissimilarities between tree communities. Here, the phylogenetic diversity of the tree communities is the mean pairwise phylogenetic distance among species within each tree community. Tree phylogenetic composition is the rank order of pairwise dissimilarities between tree communities based on NMDS.

      400 Do you think that MPD makes any sense with the monocultures (value is always 0)? Does this have a potential to bias your analyses and result?

      We agree with your point. However, we do not think that this is a major problem in the analyses. We followed the experimental design and considered low phylogenetic relatedness of tree species in a plot (likewise, in monocultures the tree species richness is always 1).
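
      The mean pairwise distance discussed in the two responses above can be sketched in Python as follows. This is an illustration, not the picante routine used in the manuscript; the function name `mean_pairwise_distance` is hypothetical, and returning 0 for a single-species plot is an assumption that follows the reviewer's description of monocultures.

```python
from itertools import combinations


def mean_pairwise_distance(phylo_dist, present):
    """Mean phylogenetic distance over all pairs of species in one plot.

    phylo_dist: square matrix of pairwise phylogenetic distances.
    present: indices of the species that occur in the plot.
    Returns 0.0 for fewer than two species (assumed monoculture
    convention, matching the reviewer's remark)."""
    pairs = list(combinations(present, 2))
    if not pairs:
        return 0.0
    return sum(phylo_dist[i][j] for i, j in pairs) / len(pairs)
```

      Because every monoculture receives the same value regardless of species identity, MPD in this form cannot distinguish among monoculture plots, which is the concern raised by the reviewer.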

      402-405 MNTD is not mentioned before or after this. Consider removing this section.

      We tested the potential effects of MNTD in our models. Now we mentioned it in our results.

      L194-195 “Tree mean nearest taxon distance (MNTD) was unrelated to any network indices.”

      405 "Phylogenetic metrics of trees" Which ones?

      Both tree MPD and MNTD. Now we have noted it in the manuscript. (L432)

      410 Further details on "Rao's Q" and how the functional diversity of the communities was calculated are needed.

      Now more details were provided.

      L435-438 “Specifically, seven leaf traits were used for calculation of tree functional diversity, which was calculated as the mean pairwise distance in trait values among tree species, weighted by tree wood volume, and expressed as Rao's Q”
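
      As an illustration of the quantity described in the quoted passage, a minimal Python sketch of Rao's Q follows. The function name `rao_q`, the input layout, and the convention of summing over all ordered species pairs are assumptions for this example; how the seven leaf traits were combined into a distance matrix is not specified here and is left to the caller.

```python
def rao_q(trait_dist, weights):
    """Rao's quadratic entropy: sum over i, j of d_ij * p_i * p_j.

    trait_dist: square matrix of pairwise trait distances among species.
    weights: non-negative abundance weights per species (for example
    wood volume); they are rescaled to proportions p that sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    p = [w / total for w in weights]
    n = len(p)
    return sum(trait_dist[i][j] * p[i] * p[j]
               for i in range(n) for j in range(n))
```

      With two equally weighted species at trait distance 1 this gives 0.5, and a monoculture always gives 0, so the index increases both with trait dissimilarity and with evenness of the weights.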

      413 Specify "higher trophic levels".

      Now we specified the trophic levels.

      L440-441 “…higher trophic levels in our study area, such as herbivores and predators”

      417-424 What about the position of the plots within study sites? Is there potential for edge effects (e.g. bees finding easier the trap nest close to the edge of the experimental forest)? Were there any differences between the two sites? What is the elevation range of the plots?

      Thank you for your attention to the details of our study. First, all the plots were randomly distributed within the study sites (see Fig. S2). Admittedly, several plots are located at the edges of the sites. However, we did not consider potential edge effects in our analysis; this will be a good point to address in future studies. Moreover, the biggest difference between the two sites is their non-overlapping tree species pools, and the two study sites are 5 km apart in the same town. Finally, the elevation gradient across the plots is not pronounced (112 m - 260 m).

      432-434 "The species and phylogenetic composition of trees, hosts, and parasitoids were quantified at each plot with nonmetric multidimensional scaling (NMDS) analysis based on Morisita-Horn distances" This section needs to be more specific and detailed. Did you do the NMDS separately for each plot as suggested in the text?

      We have provided more details in this section.

      L462-465 “The minimum number of required dimensions in the NMDS based on the reduction in stress value was determined in the analysis (k = 2 in our case). We centred the results to acquire maximum variance on the first dimension, and used the principal components rotation in the analysis.”

      435 Specify how picante was used (function and arguments)!

      We have now specified the function.

      L465-467 “The phylogenetic composition was calculated by mean pairwise distance among the host or parasitoid communities per plot with the R package “picante” with ‘mpd’ function.”

      436 "standardized values" Of what? How was the standardisation done?

      We now cite a supplementary table (Table S2) to specify this (see L469). For the standardization, we used the ‘scale’ function in R, which standardizes data by centering and scaling: it subtracts the mean and divides by the standard deviation for each variable.
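
      For readers less familiar with this operation, the arithmetic can be sketched in a few lines. This is an illustrative Python equivalent, not the R code used in the study; the function name `standardize` is hypothetical, and the sketch uses the sample standard deviation (n - 1 denominator), which is the default behaviour of R's `scale`.

```python
from statistics import mean, stdev

def standardize(values):
    """Center on the mean and divide by the sample standard deviation (n - 1)."""
    m = mean(values)
    s = stdev(values)  # sample SD, matching the default of R's scale()
    return [(v - m) / s for v in values]
```

      After this transformation each variable has mean 0 and standard deviation 1, so regression coefficients of different predictors are on a comparable scale.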

      443 Provide more details on parafit.

      Actually, we have provided the reason why we use the parafit test and the usage.

      L483-486 “We used a parafit test (9,999 permutations) with the R package “ape” to test whether the associations were non-random between hosts and parasitoids. This is widely used to assess host-parasite co-phylogeny by analyzing the congruence between host and parasite phylogenies using a distance-based matrix approach.”

      449-451 Rephrase the sentence.

      Rephrased.

      L490-491 “We constructed quantitative host-parasitoid networks at community level with the R package “bipartite” for each plot of the two sites.”

      451 "six" Should this be five?

      Yes, it should be five, thanks.

      470-481 What package and function were used for the LMMs?

      As we now use linear models, we no longer use an R package for LMMs.

      470 "mix" -> mixed

      Changed to linear models.

      472 "six" Should this be five?

      Again, we changed it to five.

      479-481 How did you treat the variables from the two different sites when testing for the correlations to avoid two geographic clusters of data points?

      We now include study site as a fixed factor in our linear models. Moreover, tree-based variables were additionally included as interaction terms with study site.
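
      Written out for a single tree-based predictor, a model of this kind could take the form below. The notation is illustrative and not taken from the manuscript: $y_i$ stands for a network index in plot $i$, $x_i$ for one tree-based variable, and $s_i$ for an indicator that is 0 for one study site and 1 for the other.

```latex
y_i = \beta_0 + \beta_1 x_i + \beta_2 s_i + \beta_3 \,(x_i \times s_i) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
```

      In this sketch $\beta_2$ absorbs a difference in mean level between the two sites, and $\beta_3$ allows the slope of the tree-based variable to differ between them, so that two geographic clusters of points are not fitted with a single common line.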

      501 "mix" -> mixed

      Changed to linear models.

      The panel selection for figures 3 and 4 seems random. Justify it!

      Thank you. To avoid including too many figures in the main text, which could potentially confuse readers, we have selected the key results that are of primary interest. The remaining figures are provided in the appendix for reference.

      533 "Note that axes are on a log scale for tree species richness." Why the log-scale if the analyses were performed with linear fit? Also, the drawn regression lines do not match the model description (non-linear, while a linear model is described in the text). The models should probably be described in more detail.

      We used a log transformation to improve the normality of the data. The regression lines drawn are straight lines, consistent with our models.

      539 "Values were adjusted for covariates of the final regression model." How?

      We used residual plots to directly visualize the relationship between the predictor and the response variable with the fitted regression line, making it easier to assess the model's fit.

      Fig. S4 text does not match the figure.

      Thanks! We have now deleted the unmatched text in the figure.

    1. eLife Assessment

      This important study provides new insights into the mechanisms that underlie perceptual and attentional impairments of conscious access. The paper presents convincing evidence of a dissociation between the early stages of low-level perception, which are impermeable to perceptual or attentional impairments, and subsequent stages of visual integration which are susceptible to perceptual impairment but resilient to attentional manipulations. This study will be of interest to scientists working on visual perception and consciousness.

    2. Reviewer #1 (Public review):

      Summary:

      In this work, Noorman and colleagues test the predictions of the "four-stage model" of consciousness by combining psychophysics and scalp EEG in humans. The study relies on an elegant experimental design to investigate the respective impact of attentional and perceptual blindness on visual processing.

      The study is very well summarised, the text is clear and the methods seem sound. Overall, a very solid piece of work. I haven't identified any major weaknesses. Below I raise a few questions of interpretation that may possibly be the subject of a revision of the text.

      (1) The perceptual performance on Fig1D appears to show huge variation across participants, with some participants at chance levels and others with performance > 90% in the attentional blink and/or masked conditions. This seems to reveal that the procedure to match performance across participants was not very successful. Could this impact the results? The authors highlight the fact that they did not resort to post-selection or exclusion of participants, but at the same time do not discuss this equally important point.

      (2) In the analysis on collinearity and illusion-specific processing, the authors conclude that the absence of a significant effect of training set demonstrates collinearity-only processing. I don't think that this conclusion is warranted: as the illusory and non-illusory share the same shape, more elaborate object processing could also be occurring. Please discuss.

      (3) Discussion, lines 426-429: It is stated that the results align with the notion that processes of perceptual segmentation and organization represent the mechanism of conscious experience. My interpretation of the results is that they show the contrary: for the same visibility level in the attentional blind or masking conditions, these processes can be implicated or not, which suggests a role during unconscious processing instead.

      (4) The two paradigms developed here could be used jointly to highlight non-idiosyncratic NCCs, i.e. EEG markers of visibility or confidence that generalise regardless of the method used. Have the authors attempted to train the classifier on one method and apply it to another (e.g. AB to masking and vice versa)? What perceptual level is assumed to transfer?

      (5) How can the results be integrated with the attentional literature showing that attentional filters can be applied early in the processing hierarchy?

      Comments on revisions:

      I'm very pleased with the responses to my previous comments, and congratulate the authors on this excellent piece of work.

    3. Reviewer #2 (Public review):

      Summary:

      This is a very elegant and important EEG study that unifies within a single set of behaviorally equated experimental conditions conscious access (and therefore also conscious access failures) during visual masking and attentional blink (AB) paradigms in humans. By a systematic and clever use of multivariate pattern classifiers across conditions, they could dissect, confirm, and extend a key distinction (initially framed within the GNWT framework) between 'subliminal' and 'pre-conscious' unconscious levels of processing. In particular, the authors could provide strong evidence to distinguish here within the same paradigm these two levels of unconscious processing that precede conscious access : (i) an early (< 80ms) bottom-up and local (in brain) stage of perceptual processing ('local contrast processing') that was preserved in both unconscious conditions, (ii) a later stage and more integrated processing (200-250ms) that was impaired by masking but preserved during AB. On the basis of preexisting studies and theoretical arguments, they suggest that this later stage could correspond to lateral and local recurrent feedback processes. Then, the late conscious access stage appeared as a P3b-like event.

      Strengths:

      The methodology and analyses are strong and valid. This work adds an important piece in the current scientific debate about levels of unconscious processing and specificities of conscious access in relation to feed-forward, lateral, and late brain-scale top-down recurrent processing.

      Comments on revisions:

      I congratulate the authors for the quality of their revised ms. They convincingly addressed each of the issues raised in my previous review.

    4. Reviewer #3 (Public review):

      Summary:

      This work aims to investigate how perceptual and attentional processes affect conscious access in humans. By using multivariate decoding analysis of electroencephalography (EEG) data, the authors explored the neural temporal dynamics of visual processing across different levels of complexity (local contrast, collinearity, and illusory perception). This is achieved by comparing the decodability of an illusory percept in matched conditions of perceptual (i.e., degrading the strength of sensory input using visual masking) and attentional impairment (i.e., impairing top-down attention using attentional blink, AB). The decoding results reveal three distinct temporal responses associated with the three levels of visual processing. Interestingly, the early stage of local contrast processing remains unaffected by both masking and AB. However, the later stages of collinearity and illusory percept processing are impaired by the perceptual manipulation but remain unaffected by the attentional manipulation. These findings contribute to the understanding of the unique neural dynamics of perceptual and attentional functions and how they interact with the different stages of conscious access.

      Strengths:

      The study investigates perceptual and attentional impairments across multiple levels of visual processing in a single experiment. Local contrast, collinearity, and illusory perception were manipulated using different configurations of the same visual stimuli. This clever design allows for the investigation of different levels of visual processing under similar low-level conditions.

      Moreover, behavioural performance was matched between perceptual and attentional manipulations. One of the main problems when comparing perceptual and attentional manipulations on conscious access is that they tend to impact performance at different levels, with perceptual manipulations like masking producing larger effects. The study utilizes a staircasing procedure to find the optimal contrast of the mask stimuli to produce a performance impairment to the illusory perception comparable to the attentional condition, both in terms of perceptual performance (i.e., indicating whether the target contained the Kanizsa illusion) and metacognition (i.e., confidence in the response).

      The results show a clear dissociation between the three levels of visual processing in terms of temporal dynamics. Local contrast was represented at an early stage (~80 ms), while collinearity and illusory perception were associated with later stages (~200-250 ms). Furthermore, the results provide clear evidence in support of a dissociation between the effects of perceptual and attentional processes on conscious access: while the former affected both neuronal correlates of collinearity and illusory perception, the latter did not have any effect on the processing of the more complex visual features involved in the illusion perception.

      Weaknesses:

      The design of the study and the results presented are very similar to those in Fahrenfort et al. (2017), reducing its novelty. Similar to the current study, Fahrenfort et al. (2017) tested the idea that if both masking and AB impact perceptual integration, they should affect the neural markers of perceptual integration in a similar way. They found that behavioural performance (hit/false alarm rate) was affected by both masking and AB, even though only the latter was significant in the unmasked condition. In contrast, an early classification peak was exclusively affected by masking. A later classification peak mirrored the behavioural findings, with classification performance impacted by both masking and AB.

      The interpretation of the results primarily relies on the recurrent processing theory of consciousness (Lamme, 2020), which leads to the assumption that local contrast and illusory perception reflect feedforward and (lateral and feedback) recurrent connections, respectively. It should be mentioned, however, that this theoretical prediction is not directly tested in the study. Moreover, the evidence for the dissociation between illusion and collinearity in terms of lateral and feedback connections seems at least limited. For instance, Kok et al. (2016) found that, whereas bottom-up stimulation activated all cortical layers, feedback activity induced by illusory figures led to a selective activation of the deep layers. Lee & Nguyen (2001), instead, found that V1 neurons respond to illusory contours of the Kanizsa figures, particularly in the superficial layers. Although both studies reference feedback connections, neither provides clear evidence for the involvement of lateral connections.

      The evidence in favour of primarily lateral connections driving collinearity seems mixed as well. On one hand, Liang et al. (2017) showed that feedback and lateral connections closely interact to mediate image grouping and segmentation. On the other hand, Stettler et al. (2002) showed that, whereas the intrinsic connections link similarly oriented domains in V1, V2 to V1 feedback displays no such specificity. Additionally, the other studies cited in the manuscript focused solely on lateral connections without examining feedback pathways, making it challenging to draw definitive conclusions.

      Comments on revisions:

      The authors have thoroughly addressed all my comments and provided comprehensive responses to each point raised.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      In this work, Noorman and colleagues test the predictions of the "four-stage model" of consciousness by combining psychophysics and scalp EEG in humans. The study relies on an elegant experimental design to investigate the respective impact of attentional and perceptual blindness on visual processing. 

      The study is very well summarised, the text is clear and the methods seem sound. Overall, a very solid piece of work. I haven't identified any major weaknesses. Below I raise a few questions of interpretation that may possibly be the subject of a revision of the text. 

      We thank the reviewer for their positive assessment of our work and for their extremely helpful and constructive comments that helped to significantly improve the quality of our manuscript.

      (1) The perceptual performance on Fig1D appears to show huge variation across participants, with some participants at chance levels and others with performance > 90% in the attentional blink and/or masked conditions. This seems to reveal that the procedure to match performance across participants was not very successful. Could this impact the results? The authors highlight the fact that they did not resort to post-selection or exclusion of participants, but at the same time do not discuss this equally important point. 

      Performance was indeed highly variable between observers, as is commonly found in attentional-blink (AB) and masking studies. For some observers, the AB pushes performance almost to chance level, whereas for others it has almost no effect. A similar effect can be seen in masking. We did our best to match accuracy over participants, while also matching accuracy within participants as well as possible, adjusting mask contrast manually during the experimental session. Naturally, those that are strongly affected by masking need not be the same participants as those that are strongly affected by the AB, given the fact that they rely on different mechanisms (which is also one of the main points of the manuscript). To answer the research question, what mattered most was that at the group level, performance was well matched between the two key conditions. As all our statistical inferences, both for behavior and EEG decoding, rest on this group level, we do not think that variability at the individual-subject level detracts from this general approach.

      In the Results, we added that our goal was to match performance across participants:

      “Importantly, mask contrast in the masked condition was adjusted using a staircasing procedure to match performance in the AB condition, ensuring comparable perceptual performance in the masked and the AB condition across participants (see Methods for more details).”

      In the Methods, we added:

      “Second, during the experimental session, after every 32 masked trials, mask contrast could be manually updated in accordance with our goal to match accuracy over participants, while also matching accuracy within participants as well as possible.”

      (2) In the analysis on collinearity and illusion-specific processing, the authors conclude that the absence of a significant effect of training set demonstrates collinearity-only processing. I don't think that this conclusion is warranted: as the illusory and non-illusory share the same shape, more elaborate object processing could also be occurring. Please discuss. 

      We agree with this qualification of our interpretation, and included the reviewer’s account as an alternative explanation in the Discussion section:  

      “It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processes representing the triangular shapes as well.”

      (3) Discussion, lines 426-429: It is stated that the results align with the notion that processes of perceptual segmentation and organization represent the mechanism of conscious experience. My interpretation of the results is that they show the contrary: for the same visibility level in the attentional blind or masking conditions, these processes can be implicated or not, which suggests a role during unconscious processing instead. 

      We agree with the reviewer that the interpretation of this result depends on the definition of consciousness that one adheres to. If one takes report as the leading metric for consciousness (=conscious access), one can indeed conclude that perceptual segmentation/organization can also occur unconsciously. However, if the processing that results in the qualitative nature of an image (rather than whether it is reported) is taken as leading – such as the processing that results in the formation of an illusory percept – (=phenomenal) the conclusion can be quite different. This speaks to the still ongoing debate regarding the existence of phenomenal vs access consciousness, and the literature on no-report paradigms amongst others (see last paragraph of the discussion). Because the current data do not speak directly to this debate, we decided to remove the sentence about “conscious experience”, and edited this part of the manuscript (also addressing a comment about preserved unconscious processing during masking by Reviewer 2) by limiting the interpretation of unconscious processing to those aspects that are uncontroversial:

      “Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling deep unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”

      (4) The two paradigms developed here could be used jointly to highlight non-idiosyncratic NCCs, i.e. EEG markers of visibility or confidence that generalise regardless of the method used. Have the authors attempted to train the classifier on one method and apply it to another (e.g. AB to masking and vice versa)? What perceptual level is assumed to transfer? 

      To avoid issues with post-hoc selection of (visible vs. invisible) trials (discussed in the Introduction), we did not divide our trials into conscious and unconscious trials, and thus did not attempt to reveal NCCs, or NCCs generalizing across the two paradigms. Note also that this approach alone would not resolve the debate regarding the ‘true’ NCC as it hinges on the operational definition of consciousness one adheres to; also see our response to the previous point the reviewer raised. Our main analysis revealed that the illusory triangle could be decoded with above-chance accuracy during both masking and the AB over extended periods of time with similar topographies (Fig. 2B), so that significant cross-decoding would be expected over roughly the same extended period of time (except for the heightened 200-250 ms peak). However, as our focus was on differences between the two manipulations and because we did not use post-hoc sorting of trials, we did not add these analyses.

      (5) How can the results be integrated with the attentional literature showing that attentional filters can be applied early in the processing hierarchy? 

      Compared to certain manipulations of spatial attention, the AB phenomenon is generally considered to represent an instance of “late” attentional filtering. In the Discussion section we included a paragraph on classic load theory, where early and late filtering depend on perceptual and attentional load. Just preceding this paragraph, we added this:

      “Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”

      Reviewer #2 (Public Review): 

      Summary: 

      This is a very elegant and important EEG study that unifies within a single set of behaviorally equated experimental conditions conscious access (and therefore also conscious access failures) during visual masking and attentional blink (AB) paradigms in humans. By a systematic and clever use of multivariate pattern classifiers across conditions, they could dissect, confirm, and extend a key distinction (initially framed within the GNWT framework) between 'subliminal' and 'pre-conscious' unconscious levels of processing. In particular, the authors could provide strong evidence to distinguish here within the same paradigm these two levels of unconscious processing that precede conscious access : (i) an early (< 80ms) bottom-up and local (in brain) stage of perceptual processing ('local contrast processing') that was preserved in both unconscious conditions, (ii) a later stage and more integrated processing (200-250ms) that was impaired by masking but preserved during AB. On the basis of preexisting studies and theoretical arguments, they suggest that this later stage could correspond to lateral and local recurrent feedback processes. Then, the late conscious access stage appeared as a P3b-like event. 

      Strengths: 

      The methodology and analyses are strong and valid. This work adds an important piece in the current scientific debate about levels of unconscious processing and specificities of conscious access in relation to feed-forward, lateral, and late brain-scale top-down recurrent processing. 

      Weaknesses: 

      - The authors could improve clarity of the rich set of decoding analyses across conditions. 

      - They could also enrich their Introduction and Discussion sections by taking into account the importance of conscious influences on some unconscious cognitive processes (revision of traditional concept of 'automaticity'), that may introduce some complexity in Results interpretation 

      - They should discuss the rich literature reporting high-level unconscious processing in masking paradigms (culminating in semantic processing of digits, words or even small group of words, and pictures) in the light of their proposal (deeper unconscious processing during AB than during masking). 

      We thank the reviewer for their positive assessment of our study and for their insightful comments and helpful suggestions that helped to significantly strengthen our paper. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we followed the reviewer’s suggestions and revised the Results/Discussion to include references to influences on unconscious processes and expanded our discussion of unconscious effects during masking vs. AB.  

      Reviewer #3 (Public Review): 

      Summary: 

      This work aims to investigate how perceptual and attentional processes affect conscious access in humans. By using multivariate decoding analysis of electroencephalography (EEG) data, the authors explored the neural temporal dynamics of visual processing across different levels of complexity (local contrast, collinearity, and illusory perception). This is achieved by comparing the decodability of an illusory percept in matched conditions of perceptual (i.e., degrading the strength of sensory input using visual masking) and attentional impairment (i.e., impairing top-down attention using attentional blink, AB). The decoding results reveal three distinct temporal responses associated with the three levels of visual processing. Interestingly, the early stage of local contrast processing remains unaffected by both masking and AB. However, the later stages of collinearity and illusory percept processing are impaired by the perceptual manipulation but remain unaffected by the attentional manipulation. These findings contribute to the understanding of the unique neural dynamics of perceptual and attentional functions and how they interact with the different stages of conscious access. 

      Strengths: 

      The study investigates perceptual and attentional impairments across multiple levels of visual processing in a single experiment. Local contrast, collinearity, and illusory perception were manipulated using different configurations of the same visual stimuli. This clever design allows for the investigation of different levels of visual processing under similar low-level conditions. 

      Moreover, behavioural performance was matched between perceptual and attentional manipulations. One of the main problems when comparing perceptual and attentional manipulations on conscious access is that they tend to impact performance at different levels, with perceptual manipulations like masking producing larger effects. The study utilizes a staircasing procedure to find the optimal contrast of the mask stimuli to produce a performance impairment to the illusory perception comparable to the attentional condition, both in terms of perceptual performance (i.e., indicating whether the target contained the Kanizsa illusion) and metacognition (i.e., confidence in the response). 

      The results show a clear dissociation between the three levels of visual processing in terms of temporal dynamics. Local contrast was represented at an early stage (~80 ms), while collinearity and illusory perception were associated with later stages (~200-250 ms). Furthermore, the results provide clear evidence in support of a dissociation between the effects of perceptual and attentional processes on conscious access: while the former affected both neuronal correlates of collinearity and illusory perception, the latter did not have any effect on the processing of the more complex visual features involved in the illusion perception. 

      Weaknesses: 

      The design of the study and the results presented are very similar to those in Fahrenfort et al. (2017), reducing its novelty. Similar to the current study, Fahrenfort et al. (2017) tested the idea that if both masking and AB impact perceptual integration, they should affect the neural markers of perceptual integration in a similar way. They found that behavioural performance (hit/false alarm rate) was affected by both masking and AB, even though only the latter was significant in the unmasked condition. An early classification peak was instead only affected by masking. However, a late classification peak showed a pattern similar to the behavioural results, with classification affected by both masking and AB. 

      The interpretation of the results mainly centres on the theoretical framework of the recurrent processing theory of consciousness (Lamme, 2020), which leads to the assumption that local contrast, collinearity, and the illusory perception reflect feedforward, local recurrent, and global recurrent connections, respectively. It should be mentioned, however, that this theoretical prediction is not directly tested in the study. Moreover, the evidence for the dissociation between illusion and collinearity in terms of lateral and feedback connections seems at least limited. For instance, Kok et al. (2016) found that, whereas bottom-up stimulation activated all cortical layers, feedback activity induced by illusory figures led to a selective activation of the deep layers. Lee & Nguyen (2001), instead, found that V1 neurons respond to illusory contours of the Kanizsa figures, particularly in the superficial layers. They all mention feedback connections, but none seem to point to lateral connections. 

      Moreover, the evidence in favour of primarily lateral connections driving collinearity seems mixed as well. On one hand, Liang et al. (2017) showed that feedback and lateral connections closely interact to mediate image grouping and segmentation. On the other hand, Stettler et al. (2002) showed that, whereas the intrinsic connections link similarly oriented domains in V1, V2 to V1 feedback displays no such specificity. Furthermore, the other studies mentioned in the manuscript did not investigate feedback connections but only lateral ones, making it difficult to draw any clear conclusions. 

      We thank the reviewer for their careful review and positive assessment of our study, as well as for their constructive criticism and helpful suggestions. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we addressed the reviewer’s comments and suggestions by better relating our study to Fahrenfort et al.’s (2017) paper and by highlighting the limitations inherent in linking our findings to distinct neural mechanisms (in particular, to lateral vs. feedback connections).

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors): 

      -  Methods: it states that "The distance between the three Pac-Man stimuli as well as between the three aligned two-legged white circles was 2.8 degrees of visual angle". It is unclear what this distance refers to. Is it the shortest distance between the edges of the objects? 

      It is indeed the shortest distance between the edges of the objects. This is now included in the Methods.

      -  Methods: It's unclear to me if the mask updating procedure during the experimental session was based on detection rate or on the perceptual performance index reported on Fig1D. Please clarify. 

      It was based on accuracy calculated over 32 trials. We have included this information in the Methods.

      -  Methods and Results: I did not understand why the described procedure used to ensure that confidence ratings are not contaminated by differences in perceptual performance was necessary. To me, it just seems to make the "no manipulations" and "both manipulations" less comparable to the other 2 conditions. 

      To calculate accurate estimates of metacognitive sensitivity for the two matched conditions, we wanted participants to make use of the full confidence scale (asking them to distribute their responses evenly over all ratings within a block). By mixing all conditions in the same block, we would have run the risk of participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition). We made this point explicit in the Results section and in the Methods section:

      “To ensure that the distribution of confidence ratings in the performance-matched masked and AB condition was not influenced by participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition, respectively), the masked and AB condition were presented in the same experimental block, while the other block type included the no and both manipulations condition.”

      “To ensure that confidence ratings for these matched conditions (masked, long lag and unmasked, short lag) were not influenced by participants anchoring their confidence ratings to the very easy and very difficult unmatched conditions (no and both manipulations, respectively), one type of block only contained the matched conditions, while the other block type contained the two remaining, unmatched conditions (masked, short lag and unmasked, long lag).”

      - Methods: what priors were used for Bayesian analyses? 

      Bayesian statistics were calculated in JASP (JASP Team, 2024) with default prior scales (Cauchy distribution, scale 0.707). This is now added to the Methods.
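
      To make the “scale 0.707” concrete, the following minimal Python sketch evaluates a zero-centred Cauchy prior. It assumes the prior is placed on a standardized effect size (called delta here), as in default Bayesian t-tests; the function names are illustrative and are not part of JASP.

```python
import math


def cauchy_pdf(delta: float, scale: float = 0.707) -> float:
    """Density of a zero-centred Cauchy prior at effect size `delta`."""
    return 1.0 / (math.pi * scale * (1.0 + (delta / scale) ** 2))


def cauchy_cdf(delta: float, scale: float = 0.707) -> float:
    """Cumulative probability of a zero-centred Cauchy prior up to `delta`."""
    return 0.5 + math.atan(delta / scale) / math.pi


def prior_mass_within(bound: float, scale: float = 0.707) -> float:
    """Prior probability that the effect size lies in [-bound, +bound]."""
    return cauchy_cdf(bound, scale) - cauchy_cdf(-bound, scale)


# Prior mass assigned to effect sizes no larger in magnitude than the scale.
mass_within_scale = prior_mass_within(0.707)
```

      By construction, a Cauchy prior places half of its mass within plus or minus one scale unit, so this default treats effect sizes below and above 0.707 in magnitude as equally plausible a priori.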

      - Results, line 162: It states that classifiers were applied on "raw EEG activity" but the Methods specify preprocessing steps. "Preprocessed EEG activity" seems more appropriate. 

      We changed the term to “preprocessed EEG activity” in the Methods and to “(minimally) preprocessed EEG activity (see Methods)” in the  Results, respectively.

      - Results, line 173: The effect of masking on local contrast decoding is reported as "marginal". If the alpha is set at 0.05, it seems that this effect is significant and should not be reported as marginal. 

      We changed the wording from “marginal” to “small but significant.”  

      - Fig1: The fixation cross is not displayed. 

      Because adding the fixation cross would have made the figure of the trial design look crowded and less clear, we decided to exclude it from this schematic trial representation. We are now stating this also in the legend of figure 1.  

      - Fig 3A: In the upper left panel, isn't there a missing significant effect of the "local contrast training and testing" condition in the first window? If not, this condition seems oddly underpowered compared to the other two conditions. 

      Thanks for the catch! The highlighting in bold and the significance bar were indeed lacking for this condition in the upper left panel (blue line). We corrected the figure in our revision.

      - Supplementary text and Fig S6: It is unclear to me why the two control analyses (the black lines vs. the green and purple lines) are pooled together in the same figure. They seem to test for different, non-comparable contrasts (they share neither training nor testing sets), and I find it confusing to find them on the same figure. 

      We agree that this may be confusing, and deleted the results from one control analysis from the figure (black line, i.e., training on contrast, testing on illusion), as the reviewer correctly pointed out that it displayed a non-comparable analysis. Given that this control analysis did not reveal any significant decoding, we now report its results only in the Supplementary text.  

      - Fig S6: I think the title of the legend should say testing on the non-illusory triangle instead of testing on the illusory triangle to match the supplementary text. 

      This was a typo – thank you! Corrected.  

      Reviewer #2 (Recommendations For The Authors): 

      Issue #1: One key asymmetry between the three levels of T2 attributes (i.e., local contrast; non-illusory triangle; illusory Kanizsa triangle) is related to the top-down conscious posture driven by the task, which focused exclusively on the last attribute (illusory Kanizsa triangle). Therefore, any difference in EEG decoding performance across these three levels could also depend on this asymmetry. For instance, if participants had been asked to report local contrast or the non-illusory triangle, one could wonder whether decoding performance would differ from that observed here. The authors addressed this potential confound by using decoders trained on different datasets in which the main task was to report one of the two other attributes. They could then test how classifiers trained on the task-related attribute behave on the main dataset. However, this part of the study is crucial but not 100% clear, and the links with the results of these control experiments are not fully explicit. Could the authors better clarify this important point (see also Issue #1 and #3)? 

      The reviewer raises an important point, alluding to potential differences between decoded features regarding task relevance. There are two separate sets of analyses where task relevance may have been a factor, our main analyses comparing illusion to contrast decoding, and our comparison of collinearity vs. illusion-specific processing.  

      In our main analysis, we are indeed reporting decoding of a task-relevant feature (illusion) and of a task-irrelevant feature (local contrast, i.e., rotation of the Pac-Man inducers). Note, however, that the Pac-Man inducers were always task-relevant, as they needed to be processed to perceive illusory triangles, so that local contrast decoding was based on task-relevant stimulus elements, even though participants did not respond to local contrast differences in the main experiment. However, we also ran control analyses testing the effect of task-relevance on local contrast decoding in our independent training data set and in another (independent) study, where local contrast was, in separate experimental blocks, task-relevant or task-irrelevant. The results are reported in the Supplementary Text and in Figure S5. In brief, task-relevance did not improve early (70–95 ms) decoding of local contrast. We are thus confident that the comparison of local contrast to illusion decoding in our main analysis was not substantially affected by differences in task relevance. In our previous manuscript version, we referred to these control analyses only in the collinearity-vs-illusion section of the Results. In our revision, we added the following in the Results section comparing illusion to contrast decoding:

      “In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”

      In addition to our main analysis, there is the concern that our comparison of collinearity vs. illusion-specific processing may have been affected by differences in task-relevance between the stimuli inducing the non-illusory triangle (the “two-legged white circles”, collinearity-only) and the stimuli inducing the Kanizsa illusion (the Pac-Man inducers, collinearity-plus-illusion). We would like to emphasize that in our main analysis classifiers were always used to decode T2 illusion presence vs. absence (collinearity-plus-illusion), and never to decode T2 collinearity-only. To distinguish collinearity-only from collinearity-plus-illusion processing, we only varied the training data (training classifiers on collinearity-only or collinearity-plus-illusion), using the independent training data set, where collinearity-only and collinearity-plus-illusion (and rotation) were task-relevant (in separate blocks). As discussed in the Supplementary Information, for this analysis approach to be valid, collinearity-only processing should be similar for the illusory and the non-illusory triangle, and this is what control analyses demonstrated (Fig. S7). In any case, general task-relevance was equated for the collinearity-only and the collinearity-plus-illusion classifiers.  

      Finally, in supplementary Figure 6 we also show that our main results reported in Figure 2 (discussed at the top of this response) were very similar when the classifiers were trained on the independent localizer dataset in which each stimulus feature could be task-relevant.  

      Together, for the reasons described above, we believe that differences in EEG decoding performance across these three stimulus levels are unlikely to also depend on a “task-relevance” asymmetry.

      Issue #2: Following on from my previous point, the authors should better mention the concept of conscious influences on unconscious processing, which led to a full revision of the notion of automaticity in cognitive science [1, 2, 3, 4]. For instance, the discovery that conscious endogenous temporal and spatial attention modulate unconscious subliminal processing paved the way to this revision. This concept raises the importance of Issue #1: equating performance on the main task across AB and masking is not enough to guarantee that differences in neural processing of the unattended attributes of T2 (i.e., task-unrelated attributes) are not, in part, due to this asymmetry rather than to a systematic difference in unconscious processing strengths [5, 6-8]. Obviously, the reported differences for real-triangle decoding between AB and masking cannot be totally explained by such a factor (because this is a task-unrelated attribute for both AB and masking conditions), but still this issue should be better introduced, addressed, clarified (Issue #1 and #3) and discussed. 

      We would like to refer to our response to the previous point: Control analyses for local contrast decoding showed that task relevance had no influence on our marker for feedforward processing. Most importantly, as outlined above, we did not perform real-triangle decoding – all our decoding analyses focused on comparing collinearity-only vs. collinearity-plus-illusion were run on the task-relevant T2 illusion (decoding its presence vs. absence). The key difference was solely the training set, where the collinearity-only classifier was trained on the (task-relevant) real triangle and the collinearity-plus-illusion classifier was trained on the (task-relevant) Kanizsa triangle. Thus, overall task relevance was controlled in these analyses.  

      In our revision, we are now also citing the studies proposed by the reviewer, when discussing the control analyses testing for an effect of task-relevance on local contrast decoding:

      “In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”
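
      The logic of the collinearity-only vs. collinearity-plus-illusion comparison described in the responses to Issues #1 and #2 (two classifiers that differ only in their training set, both tested on the same illusion-present vs. illusion-absent trials) can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: the data are synthetic, the four "channels" and the shared and illusion-specific patterns are hypothetical, and a nearest-centroid classifier stands in for whatever classifier the study actually used.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of equally long feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train_nearest_centroid(features, labels):
    """Return a {label: centroid} model estimated from the training trials."""
    return {
        c: centroid([x for x, y in zip(features, labels) if y == c])
        for c in sorted(set(labels))
    }


def predict(model, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda c: sqdist(model[c], x))


def accuracy(model, features, labels):
    hits = sum(predict(model, x) == y for x, y in zip(features, labels))
    return hits / len(labels)


def simulate_trials(rng, n_trials, pattern, noise_sd=1.0):
    """Synthetic trials: label 1 adds `pattern` to Gaussian noise, label 0 is noise only."""
    features, labels = [], []
    for i in range(n_trials):
        label = i % 2
        features.append([rng.gauss(0.0, noise_sd) + label * p for p in pattern])
        labels.append(label)
    return features, labels


rng = random.Random(0)
shared = [1.0, 1.0, 0.0, 0.0]    # hypothetical collinearity component
specific = [0.0, 0.0, 1.0, 1.0]  # hypothetical illusion-specific component
illusion = [s + t for s, t in zip(shared, specific)]

train_collinear = simulate_trials(rng, 2000, shared)    # "non-illusory triangle" set
train_illusion = simulate_trials(rng, 2000, illusion)   # "Kanizsa triangle" set
test_illusion = simulate_trials(rng, 2000, illusion)    # common test set

clf_collinear = train_nearest_centroid(*train_collinear)
clf_illusion = train_nearest_centroid(*train_illusion)

# Both classifiers decode the same thing (illusion present vs. absent);
# only the training set differs.
acc_collinear = accuracy(clf_collinear, *test_illusion)
acc_illusion = accuracy(clf_illusion, *test_illusion)
```

      In this toy setting, the classifier trained on collinearity-only data can exploit only the shared component, so any advantage of the illusion-trained classifier on the same test trials reflects the illusion-specific component.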

      Issue #3: In terms of clarity, I would suggest that the authors add a synthetic figure providing an overall view of all pairs of intra- and cross-condition decoding analyses and mentioning the main task for the training and testing sets of each analysis (see my previous and related points). Indeed, at one point, the reader can get lost, and this would not only strengthen accessibility to the detailed picture of results, but also pinpoint the limits of the work (see previous point). 

      We understand the point the reviewer is raising and acknowledge that some of our analyses, in particular those using different training and testing sets, may be difficult to grasp. But given the variety of different analyses using different training and testing sets, different temporal windows, as well as different stimulus features, it was not possible to design an intuitive synthetic figure summarizing the key results. We hope that the added text in the Results and Discussion section will be sufficient to guide the reader through our set of analyses.  

      In our revision, we are now more clearly highlighting that, in addition to presenting the key results in our main text that were based on training classifiers on the T1 data, “we replicated all key findings when training the classifiers on an independent training set where individual stimuli were presented in isolation (Fig. 3A, results in the Supplementary Information and Fig. S6).” For this, we added a schematic showing the procedure of the independent training set to Figure 3, more clearly pointing the reader to the use of a separate training data set.  

      Issue #4: In the light of these findings, the authors should discuss more thoroughly the question of unconscious high-level representations in masking versus AB: in particular, a longstanding issue relates to unconscious semantic processing of words, numbers or pictures. Their findings tend to suggest that semantic processing should be more enabled in AB than in masking. However, a rich literature provides a substantial number of results (including results from the last author, Simon van Gaal) that tend to support the notion of unconscious semantic processing in subliminal processing (see in particular: [9, 10, 11, 12, 13]). So, as mentioned by the authors, while there is evidence for semantic processing during AB, they should better discuss how they would explain unconscious semantic subliminal processing. While a possibility could be to question the unconscious attribute of several subliminal results, the same argument also holds for AB studies. Another possible track of discussion would be to differentiate AB and subliminal perception in terms of strength and durability of the corresponding unconscious representations, but not necessarily in terms of cognitive richness. Indeed, one may argue that semantic processing of stimuli that do not need complex spatial integration (e.g., words or digits, as compared to the Kanizsa illusion tested here) can still be observed under subliminal conditions. 

      We thank the reviewer for pointing us to this shortcoming of our previous Discussion. Note that our data does not directly speak to the question of high-level unconscious representations in masking vs AB, because such conclusions would hinge on the operational definition of consciousness one adheres to (also see response to Reviewer 1). Nevertheless, we do follow the reviewer’s suggestions and added the following in the Discussion (also addressing a point about other forms of attention raised by Reviewer 1):

      “Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”

      And, in a following paragraph in the Discussion:

      “Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling high-level unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”

      Reviewer #3 (Recommendations For The Authors): 

      (1) The objective of Fahrenfort et al., 2017 seems very similar to that of the current study. What are the main differences between the two studies? Moreover, Fahrenfort et al., 2017 conducted similar decoding analyses to those performed in the current study.

      Which results were replicated in the current study, and which ones are novel? Highlighting these differences in the manuscript would be beneficial. 

      We now provide a more comprehensive coverage of the study by Fahrenfort et al., 2017. In the Introduction, we added a brief summary of the key findings, highlighting that this study’s findings could have reflected differences in task performance rather than differences between masking and AB:

      “For example, Fahrenfort and colleagues (2017) found that illusory surfaces could be decoded from electroencephalogram (EEG) data during the AB but not during masking. This was taken as evidence that local recurrent interactions, supporting perceptual integration, were preserved during inattention but fully abolished by masking. However, masking had a much stronger behavioral effect than the AB, effectively reducing task performance to chance level. Indeed, a control experiment using weaker masking, which resulted in behavioral performance well above chance similar to the main experiment’s AB condition, revealed some evidence for preserved local recurrent interactions also during masking. However, these conditions were tested in separate experiments with small samples, precluding a direct comparison of perceptual vs. attentional blindness at matched levels of behavioral performance. To test …”

      In the Results, we are now also highlighting this key advancement by directly referencing the previous study:

      “Thus, whereas in previous studies task performance was considerably higher during the AB than during masking (e.g., Fahrenfort et al., 2017), in the present study the masked and the AB condition were matched in both measures of conscious access.” When reporting the EEG decoding results in the Results section, we continuously cite the Fahrenfort et al. (2017) study to highlight similarities in the study’s findings. We also added a few sentences explicitly relating the key findings of the two studies:

      “This suggests that the AB allowed for greater local recurrent processing than masking, replicating the key finding by Fahrenfort and colleagues (2017). Importantly, the present result demonstrates that this effect reflects the difference between the perceptual vs. attentional manipulation rather than differences in behavior, as the masked and the AB condition were matched for perceptual performance and metacognition.”

      “This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues  (2017) who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”  

      We also more clearly highlighted where our study goes beyond Fahrenfort et al.’s (2017), e.g., in the Results:

      “The addition of this element of collinearity to our stimuli was a key difference to the study by Fahrenfort and colleagues (2017), allowing us to compare non-illusory triangle decoding to illusory triangle decoding in order to distinguish between collinearity and illusion-specific processing.”

      And in the Discussion:

      “Furthermore, the addition of line segments forming a non-illusory triangle to the stimulus employed in the present study allowed us to distinguish between collinearity and illusion-specific processing.”

      Also, in the Discussion, we added a paragraph “summarizing which results were replicated in the current study, and which ones are novel”, as suggested by the reviewer:

      “This pattern of results is consistent with a previous study that used EEG to decode Kanizsa-like illusory surfaces during masking and the AB (Fahrenfort et al., 2017). However, the present study also revealed some effects where Fahrenfort and colleagues (2017) failed to obtain statistical significance, likely reflecting the present study’s considerably larger sample size and greater statistical power. For example, in the present study the marker for feedforward processing was weakly but significantly impaired by masking, and the marker for local recurrency was significantly impaired not only by masking but also by the AB, although to a lesser extent. Most importantly, however, we replicated the key findings that local recurrent processing was more strongly impaired by masking than by the AB, and that global recurrent processing was similarly impaired by masking and the AB and closely linked to task performance, reflecting conscious access. Crucially, having matched the key conditions behaviorally, the present finding of greater local recurrency during the AB can now unequivocally be attributed to the attentional vs. perceptual manipulation of consciousness.”

      Finally, we changed the title to “Distinct neural mechanisms underlying perceptual and attentional impairments of conscious access despite equal task performance” to highlight one of the crucial differences between the Fahrenfort et al. (2017) study and this study, namely the fact that we equalized task performance between the two critical conditions (AB and masking).

      (2) It is not clear from the text the link between the current study and the literature on the role of lateral and feedback connections in consciousness (Lamme, 2020). A better explanation is needed. 

      To our knowledge, consciousness theories such as recurrent processing theory by Lamme currently make no distinction between the roles of lateral and feedback connections for consciousness. The principled distinction lies between unconscious feedforward processing and phenomenally conscious or “preconscious” local recurrent processing, where local recurrency refers to both lateral (or horizontal) and feedback connections. We added a sentence in the Discussion:

      “As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness …”

      (3) When training on T1 and testing on T2, EEG data showed an early peak in local contrast classification at 75-95 ms over posterior electrodes. The authors stated that this modulation was only marginally affected by masking (and not at all by AB); however, the main effect of masking is significant. Why was this effect interpreted as nonrelevant? 

      Following this and Reviewer 1’s comment, we changed the wording from “marginal” to “weak but significant.” We considered this effect “weak” and of lesser relevance, because its Bayes factor indicated that the alternative hypothesis was only 1.31 times more likely than the null hypothesis of no effect, representing only “anecdotal” evidence, which is in sharp contrast to the robust effects of the consciousness manipulations on illusion decoding reported later. Furthermore, later ANOVAs comparing the effect of masking on contrast vs. illusion decoding revealed much stronger effects on illusion decoding than on contrast decoding (BFs > 3.59×10⁴).
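
      The verbal label “anecdotal” follows a widely used classification of Bayes factors (1 to 3 anecdotal, 3 to 10 moderate, 10 to 30 strong, 30 to 100 very strong, above 100 extreme). The response does not state which scheme was applied, so the thresholds in this minimal Python sketch are an assumption, and the function name is illustrative.

```python
def bf_evidence_label(bf10: float) -> str:
    """Return a conventional verbal label for a Bayes factor BF10 (H1 over H0)."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf10 == 1:
        return "no evidence"
    # Evidence for H0 is read off the reciprocal, BF01 = 1 / BF10.
    favoured = "H1" if bf10 > 1 else "H0"
    strength = bf10 if bf10 > 1 else 1.0 / bf10
    # Assumed thresholds: upper bound of each category and its label.
    categories = [(3, "anecdotal"), (10, "moderate"), (30, "strong"), (100, "very strong")]
    for upper, label in categories:
        if strength < upper:
            return f"{label} evidence for {favoured}"
    return f"extreme evidence for {favoured}"


label_masking_contrast = bf_evidence_label(1.31)      # value quoted in the response
label_contrast_vs_illusion = bf_evidence_label(3.59e4)  # lower bound quoted in the response
```

      Under these assumed thresholds, a Bayes factor of 1.31 falls in the lowest category, whereas values above 3.59×10⁴ fall in the highest one, which is the contrast the response draws.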

      (4) The decoding analysis on the illusory percept yielded two separate peaks of decoding, one from 200 to 250 ms and another from 275 to 475 ms. The early component was localized occipitally and interpreted as local sensory processing, while the late peak was described as a marker for global recurrent processing. This latter peak was localized in the parietal cortex and associated with the P300. Can the authors show the topography of the P300 evoked response obtained from the current study as a comparison? Moreover, source reconstruction analysis would probably provide a better understanding of the cortical localization of the two peaks. 

      Figure S4 now shows the P300 from electrode Pz, demonstrating a stronger positivity between 375 and 475 ms when the illusory triangle was present than when it was absent. We did not run a source reconstruction analysis.  

      (5) The authors mention that the behavioural results closely resembled the pattern of the second decoding peak results. However, they did not show any evidence for this relationship. For instance, is there a correlation between the two measures across or within participants? Does this relationship differ between the illusion report and the confidence rating? 

      This relationship became evident from simply eyeballing the results figures: both in behavior and in EEG decoding, performance dropped from the no-manipulations condition to the AB and masked conditions, while these two conditions did not differ significantly. Following a similar observation of a close similarity between behavior and the second/late illusion decoding peak in the study by Fahrenfort et al. (2017), we adopted their analysis approach and ran two additional ANOVAs, adding “measure” (behavior vs. EEG) as a factor. For this analysis, we dropped the both-manipulations condition due to scale restrictions (as noted in footnote 1: “We excluded the both-manipulations condition from this analysis due to scale restrictions: in this condition, EEG decoding at the second peak was at chance, while behavioral performance was above chance, leaving more room for behavior to drop from the masked and AB condition.”). The analysis revealed no interaction between measure and condition:

      “The pattern of behavioral results, both for perceptual performance and metacognitive sensitivity, closely resembled the second decoding peak: sensitivity in all three metrics dropped from the no-manipulations condition to the masked and AB conditions, while sensitivity did not differ significantly between these performance-matched conditions (Fig. 2C). Two additional rm ANOVAs with the factors measure (behavior, second EEG decoding peak) and condition (no-manipulations, masked, AB)¹ for perceptual performance and metacognitive sensitivity revealed no significant interaction (performance: F(2,58) = 0.27, P = 0.762, BF₀₁ = 8.47; metacognition: F(2,58) = 0.54, P = 0.586, BF₀₁ = 6.04). This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues (2017) who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”

      (6) The marker for illusion-specific processing emerged later (200-250 ms), with the no-manipulation decoding performing better after training on the illusion than the non-illusory triangle. This difference emerged only in the AB condition, and it was fully abolished by masking. The authors confirmed that the illusion-specific processing was not affected by the AB manipulations by running a rm ANOVA which did not result in a significant interaction between condition and training set. However, unlike the other non-significant results, a Bayes Factor is missing here. 

      We added Bayes factors to all (significant and non-significant) rm ANOVAs.

      (7) The same analysis yielded a second illusion decoding peak at 375-475 ms. This effect was impaired by both masking and AB, with no significant differences between the two conditions. The authors stated that this result was directly linked to behavioural performance. However, it is not clear to me what they mean (see point 5). 

      We added analyses comparing behavior and EEG decoding directly (see our response to point 5).

      (8) The introduction starts by stating that perceptual and attentional processes differently affect consciousness access. This differentiation has been studied thoroughly in the consciousness literature, with a focus on how attention differs from consciousness (e.g., Koch & Tsuchiya, TiCS, 2007; Pitts, Lutsyshyna & Hillyard, Phil. Trans. Roy. Soc. B Biol. Sci., 2018). The authors stated that "these findings confirm and enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness clearly distinguishing and specifying the neural profiles of each processing stage of the influential four-stage model of conscious experience". I found it surprising that this aspect was not discussed further. What was the state of the art before this study was conducted? What are the mentioned neural profiles? How did the current results enrich the literature on this topic? 

      We would like to point out that our study is not primarily concerned with the conceptual distinction between consciousness and attention, which has been the central focus of, e.g., Koch and Tsuchiya (2007). While this literature was concerned with ways to dissociate consciousness and attention, we tacitly assumed that attention and consciousness are now generally considered as different constructs. Our study is thus not dealing with dissociations between attention and consciousness, nor with the distinction between phenomenal consciousness and conscious access, but is concerned with different ways of impairing conscious access (defined as the ability to report about a stimulus), either via perceptual or via attentional manipulations. For the state of the art before the study was conducted, we would like to refer to the motivation of our study in the Introduction, e.g., previous studies’ difficulties in unequivocally linking greater local recurrency during attentional than perceptual blindness to the consciousness manipulation, given performance confounds (we expanded this Introduction section). We also expanded a paragraph in the Discussion to remind the reader of the neural profiles of the four-stage model and to highlight the novelty of our findings related to the distinction between lateral and feedback processes:

      “As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness (Block, 2005; Dehaene et al., 2006; Hatamimajoumerd et al., 2022; Lamme, 2010; Pitts et al., 2018; Sergent & Dehaene, 2004), clearly distinguishing the neural profiles of each processing stage of the influential four-stage model of conscious experience (Fig. 1A). Along with the distinct temporal and spatial EEG decoding patterns associated with lateral and feedback processing, our findings suggest a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report.”

      (9) When stating that this is the first study in which behavioural measures of conscious perception were matched between the attentional blink and masking, it would be beneficial to highlight the main differences between the current study and the one from Fahrenfort et al., 2017, with which the current study shares many similarities in the experimental design (see point 1). 

      We would like to refer the reviewer to our response to point 1), where we detail how we expanded the discussion of similarities and differences between our present study and Fahrenfort et al. (2017).

      (10) The discussion emphasizes how the current study "suggests a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report". For transparency, it is nonetheless important to highlight that one limitation of the current study is that it does not provide direct evidence for the specified types of connections (see point 6).

      We added a qualification in the Discussion section:

      “Although the present EEG decoding measures cannot provide direct evidence for feedback vs. lateral processes, based on neurophysiological evidence, …”

      Furthermore, we added this qualification in the Discussion section:

      “It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processing as well.”

      References

      Angelucci, A., Levitt, J. B., Walton, E. J. S., Hupe, J.-M., Bullier, J., & Lund, J. S. (2002). Circuits for local and global signal integration in primary visual cortex. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 22(19), 8633–8646.

      Bair, W., Cavanaugh, J. R., & Movshon, J. A. (2003). Time course and time-distance relationships for surround suppression in macaque V1 neurons. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 23(20), 7690–7701.

      Block, N. (2005). Two neural correlates of consciousness. Trends in Cognitive Sciences, 9(2), 46–52.

      Chen, M., Yan, Y., Gong, X., Gilbert, C. D., Liang, H., & Li, W. (2014). Incremental integration of global contours through interplay between visual cortical areas. Neuron, 82(3), 682–694.

      Dehaene, S., Changeux, J.-P., Naccache, L., Sackur, J., & Sergent, C. (2006). Conscious, preconscious, and subliminal processing: a testable taxonomy. Trends in Cognitive Sciences, 10(5), 204–211.

      Hatamimajoumerd, E., Ratan Murty, N. A., Pitts, M., & Cohen, M. A. (2022). Decoding perceptual awareness across the brain with a no-report fMRI masking paradigm. Current Biology: CB. https://doi.org/10.1016/j.cub.2022.07.068

      JASP Team. (2024). JASP (Version 0.19.0) [Computer software]. https://jasp-stats.org/

      Kentridge, R. W., Heywood, C. A., & Weiskrantz, L. (2004). Spatial attention speeds discrimination without awareness in blindsight. Neuropsychologia, 42(6), 831–835.

      Kiefer, M., & Brendel, D. (2006). Attentional Modulation of Unconscious “Automatic” Processes: Evidence from Event-related Potentials in a Masked Priming Paradigm. Journal of Cognitive Neuroscience, 18(2), 184–198.

      Kouider, S., & Dehaene, S. (2007). Levels of processing during non-conscious perception: a critical review of visual masking. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 857–875.

      Lamme, V. A. F. (2010). How neuroscience will change our view on consciousness. Cognitive Neuroscience, 1(3), 204–220.

      Luck, S. J., & Hillyard, S. A. (1994). Electrophysiological correlates of feature analysis during visual search. Psychophysiology, 31(3), 291–308.

      Naccache, L., Blandin, E., & Dehaene, S. (2002). Unconscious masked priming depends on temporal attention. Psychological Science, 13(5), 416–424.

      Pitts, M. A., Lutsyshyna, L. A., & Hillyard, S. A. (2018). The relationship between attention and consciousness: an expanded taxonomy and implications for ‘no-report’ paradigms. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 373(1755), 20170348.

      Sergent, C., & Dehaene, S. (2004). Is consciousness a gradual phenomenon? Evidence for an all-or-none bifurcation during the attentional blink. Psychological Science, 15(11), 720–728.

      Van den Bussche, E., Van den Noortgate, W., & Reynvoet, B. (2009). Mechanisms of masked priming: a meta-analysis. Psychological Bulletin, 135(3), 452–477.

      van Gaal, S., & Lamme, V. A. F. (2012). Unconscious high-level information processing: implication for neurobiological theories of consciousness. The Neuroscientist: A Review Journal Bringing Neurobiology, Neurology and Psychiatry, 18(3), 287–301.

      Vogel, E. K., Luck, S. J., & Shapiro, K. L. (1998). Electrophysiological evidence for a postperceptual locus of suppression during the attentional blink. Journal of Experimental Psychology. Human Perception and Performance, 24(6), 1656–1674.

    1. eLife Assessment

      This important manuscript sets out to identify sleep/arousal phenotypes in larval zebrafish carrying mutations in Alzheimer's disease (AD)-associated genes. The authors provide detailed phenotypic data for F0 knockouts of each of 7 AD-associated genes and then compare the resulting behavioral fingerprints to those obtained from a large-scale chemical screen to generate new hypotheses about underlying molecular mechanisms. The data presented are solid, although extensive interpretation of pharmacological screen data does not necessarily reflect the limited mechanistic data. Nonetheless, the authors address most reviewer concerns in their revised version, providing invaluable new analyses. Phenotypic characterization presented is comprehensive, and the authors develop a well-designed behavioral analysis pipeline that will provide considerable value for zebrafish neuroscientists.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Kroll et al. conduct an in-depth behavioral analysis of F0 knockouts of 4 genes associated with late-onset Alzheimer's Disease (AD), together with 3 genes associated with early-onset AD. Kroll and colleagues developed a web application (ZOLTAR) to compare sleep-associated traits between genetic mutants with those obtained from a panel of small molecules to promote identification of affected pathways and potential therapeutic interventions. The authors make a set of potentially important findings vis-à-vis the relationship between AD-associated genes and sleep. First, they find that loss-of-function in late-onset AD genes universally results in nighttime sleep loss, consistent with the well-supported hypothesis that sleep disruption contributes to Alzheimer's-related pathologies. psen-1, an early-onset associated AD gene, which the authors find is principally responsible for the generation of AB40 and AB42 in zebrafish, also shows a slight increase in activity at night and slight decreases in nighttime sleep. Conversely, psen-2 mutations increase daytime sleep, while appa/appb mutations have no impact on sleep. Finally, using ZOLTAR, the authors identify serotonin receptor activity as potentially disrupted in sorl1 mutants, while betamethasone is identified as a potential therapeutic to promote reversal of psen2 knockout-associated phenotypes.

      This is a highly innovative and thorough study, yet a handful of key questions remain. First, are the nighttime sleep loss phenotypes observed in all knockouts for late-onset AD genes in the larval zebrafish a valid proxy for AD risk? Can 5-HT reuptake inhibitors reverse other AD-related pathologies in zebrafish? Can compounds be identified which have a common behavioral fingerprint across all or multiple AD risk genes? Do these modify sleep phenotypes? Finally, the authors propose but do not test the hypothesis that sorl1 might regulate localization/surface expression of 5-HT2 receptors. This could provide exciting / more convincing mechanistic support for the assertion that serotonin signaling is disrupted upon loss of AD-associated genes. Despite these important considerations, this study provides a valuable platform for high-throughput analysis of sleep phenotypes and correlation with small-molecule induced sleep phenotypes. The platform could also be expanded to facilitate comparison of other behavioral phenotypes, including stimulus-evoked behaviors. Moreover, the new analyses looking for pathways that might be co-regulated by AD risk genes and discussion of cholinergic signaling as a potentially meaningful target downstream of 5/7 knockouts are valuable.

      Strengths:

      - Provides a useful platform for comparison of sleep phenotypes across genotypes/drug manipulations.

      - Presents convincing evidence that nighttime sleep is disrupted in mutants for multiple late-onset AD-related genes.

      - Provides potential mechanistic insights for how AD-related genes might impact sleep and identifies a few drugs that modify their identified phenotypes.

      Weaknesses:

      - Exploration of potential mechanisms for serotonin disruption in sorl1 mutants is limited.

      - The pipeline developed is only used to examine sleep-related / spontaneous movement phenotypes. Stimulus-evoked behaviors are not examined.

    3. Reviewer #2 (Public review):

      Summary:

      This work delineates the larval zebrafish behavioral phenotypes caused by F0 knockout of several important genes that increase risk for Alzheimer's disease. Using behavioral pharmacology, comparing the behavioral fingerprint of previously assayed molecules to the newly generated knockout data, compounds were discovered that impacted larval movement in ways that suggest interaction with or recovery of disrupted mechanisms.

      Strengths:

      This is a well-written manuscript that uses newly developed analysis methods to present the findings in a clear, high-quality way. The addition of an extensive behavioral analysis pipeline is of value to the field of zebrafish neuroscience and will be particularly helpful for researchers who prefer the R programming language. Even the behavioral profiling of these AD risk genes, regardless of the pharmacology aspect, is an important contribution. The recovery of most behavioral parameters in the psen2 knockout with betamethasone, predicted by comparing fingerprints, is an exciting demonstration of the approach. The hypotheses generated by this work are important stepping stones to future studies uncovering the molecular basis of the proposed gene-drug interactions and discovering novel therapeutics to treat AD or co-occurring conditions such as sleep disturbance. Most concerns are sufficiently addressed in the revised manuscript or response to reviewers.

      Weaknesses:

      - The overarching concept of the work is that comparing behavioral fingerprints can align genes and molecules with similarly disrupted molecular pathways. While the recovery of the psen2 phenotypes by one molecule with the opposite phenotype is interesting, as are previous studies that show similar behaviorally-based recoveries, the underlying assumption that normalizing the larval movement normalizes the mechanism still lacks substantial support. While I agree with the authors' detailed response that rescuing most behavioral parameters is a good indication that the underlying mechanism is normalized, I disagree that high-throughput larval behavior kinematics is a sufficient representation of most behavioral parameters to be indicative of molecular mechanism normalization. There are many instances of mutants with completely normal kinetics at baseline, but a behavioral difference that emerges during stimulation or in a new paradigm such as hunting. Without testing far more behavioral paradigms than are possible in the multi-well plate format, as well as possibly multiple life stages, I remain unconvinced that this approach will yield valuable therapeutic insights. I do agree that it can yield insight for future investigation, such as in the case of cntnap2a/cntnap2b and GABA receptor agonists, but even in that instance it is not clear that such an agonist would rescue abnormalities in a meaningful way. In the case of a disorder such as autism, the early locomotor phenotypes may be disconnected from the molecular mechanisms underlying later social deficits, and it is far more challenging to screen on juvenile behaviors that would be a more appropriate target for a behavior-first approach. The added experiment of testing fluvoxamine, a second SSRI, yielded very different behavioral responses to the SSRI citalopram, supporting my assertion that this approach and the disrupted underlying mechanisms are more complicated than suggested by the authors.
      I disagree that the connection between sorl1 and serotonin is strengthened by this experiment. The authors suggest that since the knockout larvae react differently than control siblings to both SSRIs, it indicates that serotonin is disrupted. There is no negative control included, where a pathway that is clearly not indicated to be important is pharmacologically manipulated. It is possible that the mutants would also behave differently compared to siblings when other pathways are perturbed. The authors acknowledge in the response to reviewers that they may not have identified the underlying molecular disruption in this mutant, but they did not substantially alter the Discussion section on this point. I agree with the authors that using a different wild-type strain in a different lab could lead to discrepancies, but these issues could have been experimentally mitigated or more clearly highlighted in the manuscript itself.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Kroll et al. conduct an in-depth behavioral analysis of F0 knockouts of 4 genes associated with late-onset Alzheimer's Disease (AD), together with 3 genes associated with early-onset AD. Kroll and colleagues developed a web application (ZOLTAR) to compare sleep-associated traits between genetic mutants with those obtained from a panel of small molecules to promote the identification of affected pathways and potential therapeutic interventions. The authors make a set of potentially important findings vis-à-vis the relationship between AD-associated genes and sleep. First, they find that loss-of-function in late-onset AD genes universally results in night-time sleep loss, consistent with the well supported hypothesis that sleep disruption contributes to Alzheimer's-related pathologies. psen-1, an early-onset associated AD gene, which the authors find is principally responsible for the generation of AB40 and AB42 in zebrafish, also shows a slight increase in activity at night and slight decreases in night-time sleep. Conversely, psen-2 mutations increase daytime sleep, while appa/appb mutations have no impact on sleep. Finally, using ZOLTAR, the authors identify serotonin receptor activity as potentially disrupted in sorl1 mutants, while betamethasone is identified as a potential therapeutic to promote reversal of psen2 knockout-associated phenotypes.

      This is a highly innovative and thorough study, yet a handful of key questions remain. First, are night-time sleep loss phenotypes observed in all knockouts for late-onset AD genes in the larval zebrafish a valid proxy for AD risk?

      We cannot say, but it is an interesting question. We selected the four late-onset Alzheimer’s risk genes (APOE, CD2AP, CLU, SORL1) based on human genetics data and brain expression in zebrafish larvae, not based on their likelihood to modify sleep behaviour, which we could have tried by searching for overlaps with GWAS of sleep phenotypes, for example. Consequently, we find it remarkable that all four of these genes caused a night-time sleep phenotype when mutated. We also find it reassuring that knockout of appa/appb and psen2 did not cause a night-time sleep phenotype, which largely excludes the possibility that the phenotype is a technical artefact (e.g. caused by the F0 knockout method) or a property of every gene expressed in the larval brain.

      Having said that, it could still be a coincidence, rather than a special property of genes associated with late-onset AD. In addition to testing additional late-onset Alzheimer’s risk genes, the ideal way to answer this question would be to test in parallel a random set of genes expressed in the brain at this stage of development. From this random set, one could estimate the proportion of genes that cause a night-time sleep phenotype when mutated. One could then use that information to test whether late-onset Alzheimer’s risk genes are indeed enriched for genes that cause a night-time sleep phenotype when mutated.
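
      As an illustration only (this is not the authors' analysis), the enrichment test proposed above could be sketched as a one-sided Fisher's exact test. The count of 4 out of 4 late-onset risk genes comes from the text; the random gene set (5 of 50) and the function name `enrichment_p_value` are hypothetical placeholders.

```python
from math import comb


def enrichment_p_value(risk_hits, risk_total, random_hits, random_total):
    """One-sided Fisher's exact test: are phenotype-causing genes
    over-represented among the risk genes compared with the random set?"""
    total_genes = risk_total + random_total
    total_hits = risk_hits + random_hits
    max_hits = min(risk_total, total_hits)
    denominator = comb(total_genes, risk_total)
    # Sum the hypergeometric probabilities of seeing at least `risk_hits`
    # phenotype-causing genes among the risk genes by chance alone.
    tail = sum(
        comb(total_hits, k) * comb(total_genes - total_hits, risk_total - k)
        for k in range(risk_hits, max_hits + 1)
    )
    return tail / denominator


# 4 of 4 late-onset risk genes gave a night-time sleep phenotype (from the text).
# The random set (5 of 50 genes) is HYPOTHETICAL and only there for illustration.
p = enrichment_p_value(risk_hits=4, risk_total=4, random_hits=5, random_total=50)
print(f"one-sided p = {p:.6f}")
```

      With a real random set, the same function would take the observed counts directly; the smaller the proportion of random genes causing the phenotype, the stronger the evidence that the risk genes are enriched.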

      For those mutants that cause night-time sleep disturbances, do these phenotypes share a common underlying pathway? e.g. Do 5-HT reuptake inhibitors promote sleep across all 4 late-onset genes in addition to psen1? Can 5-HT reuptake inhibitors reverse other AD-related pathologies in zebrafish? Can compounds be identified that have a common behavioral fingerprint across all or multiple AD risk genes? Do these modify sleep phenotypes?

      To attempt to answer these questions, we used ZOLTAR to generate predictions for all the knockout behavioural fingerprints presented in the study, in the same way as for sorl1 in Fig. 5 and Fig. 5–supplement 1. Here are the indications, targets, and KEGG pathways which are shared by the largest number of knockouts (Author response image 1):

      – One indication is shared by 4/7 knockouts: “opioid dependence” (significant for appa/appb, psen1, apoea/apoeb, cd2ap).

      – Four targets are shared by 4/7 knockouts: “strychnine-binding glycine receptor” (psen1, apoea/apoeb, clu, sorl1); “neuronal acetylcholine receptor beta-2” (psen1, apoea/apoeb, cd2ap, clu); “thyroid peroxidase” (psen1, apoea/apoeb, cd2ap, clu); and “carbonic anhydrase IV” (appa/appb, psen1, psen2, cd2ap).

      – Three KEGG pathways are shared by 5/7 knockouts: “cholinergic synapse” (psen1, apoea/apoeb, cd2ap, clu, sorl1); “tyrosine metabolism” (psen2, apoea/apoeb, cd2ap, clu, sorl1); and “nitrogen metabolism” (appa/appb, psen1, psen2, apoea/apoeb, cd2ap).

      As a reminder, we hypothesised that loss of Sorl1 affected serotonin signalling based on the following annotations being significant: indication “depression”, target “serotonin transporter”, and KEGG pathway “serotonergic synapse”. Indication “depression” is only significant for sorl1 knockouts; target “serotonin transporter” is also significant for appa/appb and psen2 knockouts; and KEGG pathway “serotonergic synapse” is also significant for psen2 knockouts. ZOLTAR therefore does not predict serotonin signalling to be a major theme common to all mutants with a night-time sleep loss phenotype.

      Particularly interesting is cholinergic signalling appearing in the most common targets and KEGG pathways. Acetylcholine signalling is a major theme in research on AD. For example, the first four drugs ever approved by the FDA to treat AD were acetylcholinesterase inhibitors, which increase acetylcholine signalling by preventing its breakdown by acetylcholinesterase. These drugs are generally considered only to treat symptoms and not modify disease course, but this view has been called into question (Munoz-Torrero, 2008; Relkin, 2007). If, as ZOLTAR suggests, mutations in several Alzheimer’s risk genes affect cholinergic signalling early in development, this would point to a potential causal role of cholinergic disruption in AD.

      Author response image 1.

      Common predictions from ZOLTAR for the seven Alzheimer’s risk genes tested. Predictions from ZOLTAR which are shared by multiple knockout behavioural fingerprints presented in the study. Only indications, targets, and KEGG pathways which are significant for at least three of the seven knockouts tested are shown, ranked from the annotations which are significant for the largest number of knockouts.
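
      The filtering and ranking described in this caption can be sketched as follows. This is a minimal illustration, not ZOLTAR's code: the dictionary contains only the KEGG pathways named in the response text above, and the function name `shared_annotations` is an assumption.

```python
from collections import Counter

# KEGG pathways reported as significant for each knockout, transcribed from
# the response text (only the pathways named there, not the full ZOLTAR output).
significant = {
    "appa/appb": {"nitrogen metabolism"},
    "psen1": {"cholinergic synapse", "nitrogen metabolism"},
    "psen2": {"tyrosine metabolism", "nitrogen metabolism", "serotonergic synapse"},
    "apoea/apoeb": {"cholinergic synapse", "tyrosine metabolism", "nitrogen metabolism"},
    "cd2ap": {"cholinergic synapse", "tyrosine metabolism", "nitrogen metabolism"},
    "clu": {"cholinergic synapse", "tyrosine metabolism"},
    "sorl1": {"cholinergic synapse", "tyrosine metabolism", "serotonergic synapse"},
}


def shared_annotations(significant, min_knockouts=3):
    """Keep annotations significant in at least `min_knockouts` knockouts,
    ranked from most to least widely shared (ties broken alphabetically)."""
    counts = Counter(a for annotations in significant.values() for a in annotations)
    kept = [(a, n) for a, n in counts.items() if n >= min_knockouts]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


for annotation, n in shared_annotations(significant):
    print(f"{annotation}: {n}/{len(significant)} knockouts")
```

      In this toy input, “serotonergic synapse” is significant for only two knockouts and is therefore dropped by the default threshold of three, in line with the statement that serotonin signalling is not a theme common to all mutants.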

      Finally, the web-based platform presented could be expanded to facilitate comparison of other behavioral phenotypes, including stimulus-evoked behaviors.

      Yes, absolutely. The behavioural dataset we used (Rihel et al., 2010) did not measure stimuli other than day/night light transitions, but the “SauronX” platform and dataset (Myers-Turnbull et al., 2022) seems particularly well suited for this. To provide some context, we and collaborators have occasionally used the dataset by Rihel et al. (2010) to generate hypotheses or find candidate drugs that reverse a behavioural phenotype measured in the sleep/wake assay (Ashlin et al., 2018; Hoffman et al., 2016). The present work was the occasion to enable a wider and more intuitive use of this dataset through the ZOLTAR app, which has already proven successful. Future versions of ZOLTAR may seek to incorporate larger drug datasets using more types of measurements.

      Finally, the authors propose but do not test the hypothesis that sorl1 might regulate localization/surface expression of 5-HT2 receptors. This could provide exciting / more convincing mechanistic support for the assertion that serotonin signaling is disrupted upon loss of AD-associated genes.

      While working on the Author Response, we made some changes to the analysis run by ZOLTAR to calculate enrichments (see Methods and github.com/francoiskroll/ZOLTAR, notes on v2). With the new version, 5-HT receptor type 2 is not a significantly enriched target for the sorl1 knockout fingerprint but type 4 is. 5-HT receptor type 4 was also shown to interact with sorting nexin 27, a subunit of retromer, so is a promising candidate (Joubert et al., 2004). Antibodies against human 5-HT receptor type 2 and 4a exist; whether they would work in zebrafish remains to be tested. In our experience, the availability of antibodies suitable for immunohistochemistry in the zebrafish is a serious experimental roadblock.

      Note, all the results presented in the “Version of Record” are from ZOLTAR v2.

      Despite these important considerations, this study provides a valuable platform for high-throughput analysis of sleep phenotypes and correlation with small-molecule-induced sleep phenotypes.

      Strengths:

      - Provides a useful platform for comparison of sleep phenotypes across genotypes/drug manipulations.

      - Presents convincing evidence that night-time sleep is disrupted in mutants for multiple late onset AD-related genes.

      - Provides potential mechanistic insights for how AD-related genes might impact sleep and identifies a few drugs that modify their identified phenotypes

      Weaknesses:

      - Exploration of potential mechanisms for serotonin disruption in sorl1 mutants is limited.

      - The pipeline developed can only be used to examine sleep-related / spontaneous movement phenotypes and stimulus-evoked behaviors are not examined.

      - Comparisons between mutants/exploration of commonly affected pathways are limited.

      Thank you for these excellent suggestions, please see our answers above.

      Reviewer #2 (Public Review):

      Summary:

      This work delineates the larval zebrafish behavioral phenotypes caused by the F0 knockout of several important genes that increase the risk for Alzheimer's disease. Using behavioral pharmacology, comparing the behavioral fingerprint of previously assayed molecules to the newly generated knockout data, compounds were discovered that impacted larval movement in ways that suggest interaction with or recovery of disrupted mechanisms.

      Strengths:

      This is a well-written manuscript that uses newly developed analysis methods to present the findings in a clear, high-quality way. The addition of an extensive behavioral analysis pipeline is of value to the field of zebrafish neuroscience and will be particularly helpful for researchers who prefer the R programming language. Even the behavioral profiling of these AD risk genes, regardless of the pharmacology aspect, is an important contribution. The recovery of most behavioral parameters in the psen2 knockout with betamethasone, predicted by comparing fingerprints, is an exciting demonstration of the approach. The hypotheses generated by this work are important stepping stones to future studies uncovering the molecular basis of the proposed gene-drug interactions and discovering novel therapeutics to treat AD or co-occurring conditions such as sleep disturbance.

      Weaknesses:

      - The overarching concept of the work is that comparing behavioral fingerprints can align genes and molecules with similarly disrupted molecular pathways. While the recovery of the psen2 phenotypes by one molecule with the opposite phenotype is interesting, as are previous studies that show similar behaviorally-based recoveries, the underlying assumption that normalizing the larval movement normalizes the mechanism still lacks substantial support. There are many ways that a reduction in movement bouts could be returned to baseline that are unrelated to the root cause of the genetically driven phenotype. An ideal experiment would be to thoroughly characterize a mutant, such as by identifying a missing population of neurons, and use this approach to find a small molecule that rescues both behavior and the cellular phenotype. If the connection to serotonin in the sorl1 was more complete, for example, the overarching idea would be more compelling.

      Thank you for this cogent criticism.

      On the first point, we were careful not to claim that betamethasone normalises the molecular/cellular mechanism that causes the psen2 behavioural phenotype. Having said that, yes, to a certain extent that would be the hope of the approach. As you say, not every compound that normalises the behavioural fingerprint will normalise the underlying mechanism, but the converse seems true: every compound that normalises the underlying mechanism should also normalise the behavioural fingerprint. We think this logic makes the “behaviour-first” approach innovative and interesting. The logic is to discover compounds that normalise the behavioural phenotype first, only subsequently test whether they also normalise the molecular mechanism, akin to testing first whether a drug resolves the symptoms before testing whether it actually modifies disease course. While in practice testing thousands of drugs in sufficient sample sizes and replicates on a mutant line is challenging, the dataset queried through ZOLTAR provides a potential shortcut by shortlisting in silico compounds that have the opposite effect on behaviour.

      You mention a “reduction in movement bouts” but note here that the number of behavioural parameters tested is key to our argument. To take the two extremes, say the only behavioural parameter we measured in psen2 knockout larvae was time active during the day, then, yes, any stimulant used at the right concentration could probably normalise the phenotype. In this situation, claiming that the stimulant is likely to also normalise the underlying mechanism, or even that it is a genuine “phenotypic rescue”, would not be convincing. Conversely, say we were measuring thousands of behavioural parameters under various stimuli, such as swimming speed, position in the well, bout usage, tail movements, and eye angles, it seems almost impossible for a compound to rescue most parameters without also normalising the underlying mechanism. The present approach is somewhere in between: ZOLTAR uses six behavioural parameters for prediction (e.g. Fig 6a), but all 17 parameters calculated by FramebyFrame can be used to assess rescue during a subsequent experiment (Fig. 6c). For both, splitting each parameter into day and night increases the resolution of the approach, which partly answers your criticism. For example, betamethasone rescued the day-time hypoactivity without causing night-time hyperactivity, so we are not making the “straw man argument” explained above of using any broad stimulant to rescue the hypoactivity phenotype.

      Furthermore, for diseases where the behavioural defect is the primary concern, such as autism or bipolar disorder, perhaps this behaviour-first approach is all that is needed, and whether or not the compound precisely rescues the underlying mechanism is somewhat secondary. The use of lithium to prevent manic episodes in bipolar disorder is a good example. It was initially tested because mania was thought to be caused by excess uric acid and lithium can dissolve uric acid (Mitchell and Hadzi-Pavlovic, 2000). The theory is now discredited, but lithium continues to be used without a precise understanding of its mode of action. In this example, behavioural rescue alone, assuming the secondary effects are tolerable, is sufficient to be beneficial to patients, and whether it modulates the correct causal pathway is secondary.

      On the second point, we agree that testing first ZOLTAR on a mutant for which we have a fairly good understanding of the mechanism causing the behavioural phenotype could have been a productive approach. Note, however, that examples already exist in the literature (Ashlin et al., 2018; Hoffman et al., 2016). The example from Hoffman et al. (2016) is especially convincing. Drugs generating behavioural fingerprints that positively correlate with the cntnap2a/cntnap2b double knockout fingerprint were enriched with NMDA and GABA receptor antagonists. In experiments analogous to our citalopram and fluvoxamine treatments (Fig. 5c,d and Fig. 5–supplement 1c,d), cntnap2a/cntnap2b knockout larvae were overly sensitive to the NMDA receptor antagonist MK-801 and the GABAA receptor antagonist pentylenetetrazol (PTZ). Among other drugs tested, zolpidem, a GABAA receptor agonist, caused opposite effects on wild-type and cntnap2a/cntnap2b knockout larvae. Knockout larvae were found to have fewer GABAergic neurons in the forebrain. While these studies did not use precisely the same analysis that ZOLTAR runs, they used the same rationale and behavioural dataset to make these predictions (Rihel et al., 2010), which shows that approaches like ZOLTAR can point to causal processes.
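
      The rationale of ranking compounds by how their behavioural fingerprint relates to a knockout fingerprint can be illustrated with a short sketch. It assumes Pearson correlation as the similarity measure, which may differ from what ZOLTAR or the cited studies compute; all fingerprints, compound names, and function names below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length fingerprints."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def rank_compounds(knockout, library):
    """Sort compounds from most positively to most negatively correlated."""
    scores = {name: pearson(knockout, fp) for name, fp in library.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# HYPOTHETICAL fingerprints: one value per behavioural parameter,
# expressed as deviation from controls.
knockout = [-1.0, -0.5, 0.2, 0.8]
library = {
    "compound_A": [-2.0, -1.0, 0.4, 1.6],  # same direction as the knockout
    "compound_B": [1.0, 0.5, -0.2, -0.8],  # mirror image of the knockout
    "compound_C": [0.3, -0.4, 0.5, -0.1],  # largely unrelated
}

ranking = rank_compounds(knockout, library)
for name, score in ranking:
    print(f"{name}: r = {score:+.2f}")
```

      Compounds at the top of such a ranking mimic the knockout and may point to the disrupted pathway, whereas compounds at the bottom have the opposite effect on behaviour and are candidates for a rescue experiment.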

      On your last point, we hope our experiment testing fluvoxamine, another selective serotonin reuptake inhibitor (SSRI), makes the connection between Sorl1 and serotonin signalling more convincing.

      - The behavioral difference between the sorl1 KO and scrambled at the higher dose of the citalopram is based on a small number of animals. The KO Euclidean distance measure is also more spread out than for the other datasets, and it looks like only five or so fish are driving the group difference. It also appears as though the numbers were from two injection series. While there is nothing obviously wrong with the data, I would feel more comfortable if such a strong statement of a result from a relatively subtle phenotype were backed up by a higher N or a stable line. It is not impossible that the observed difference is an experimental fluke. If something obvious had emerged through the HCR, that would have also supported the conclusions. As it stands, if no more experiments are done to bolster the claim, the confidence in the strength of the link to serotonin should be reduced (possibly putting the entire section in the supplement and modifying the discussion). The discussion section about serotonin and AD is interesting, but I think that it is excessive without additional evidence.

      We mostly agree with this criticism. One could interpret the larger spread of the data for sorl1 KO larvae treated with 10 µM citalopram as evidence that the knockout larvae do indeed react differently to the drug at this dose, regardless of being driven by a subset of the animals. The result indeed does not survive removing the top 5 (p = 0.87) or top 3 (p = 0.18) sorl1 KO + 10 µM larvae, but this amounts to excluding about 35% (5/14) or 20% (3/14) of the datapoints as potential outliers, which is unreasonable. In fact, excluding the top 5 sorl1 KO + 10 µM datapoints is equivalent to calling any datapoint with z-score > 0.2 an outlier (z-scores of the top 5 datapoints are 0.2–1.8). Applying the same criterion consistently to the scrambled + 10 µM group would remove the top 6 datapoints (z-scores = 0.5–3.9). Comparing the resulting two distributions again gives the sorl1 KO + 10 µM distribution as significantly higher (p = 0.0015). We would also mention that Euclidean distance, as a summary metric for distance between behavioural fingerprints, has limitations. For example, the measure will be more sensitive to changes in some parameters than in others, depending on how much room there is for a given parameter to change. We included this metric to lend support to the observation one can draw from the fingerprint plot (Fig. 5c) that sorl1 mutants respond in an exaggerated way to citalopram across many parameters, while being agnostic to which parameter might matter most.

      Given that the HCR did not reveal anything striking, we agree with you that too much of our argument relied on this result being robust. As you and Reviewer #3 suggested, we repeated this experiment with a different SSRI, fluvoxamine (Fig. 5–supplement 1). We cannot readily explain why the result was opposite to what we found with citalopram, but in both cases sorl1 knockout larvae reacted differently than their control siblings, which adds an argument to our claim that ZOLTAR correctly predicted serotonin signalling as a disrupted pathway from the behavioural fingerprint. Accordingly, we mostly kept the Discussion on Sorl1 the same, although we concede that we may not have identified the molecular mechanism.

      - The authors suggest two hypotheses for the behavioral difference between the sorl1 KO and scrambled at the higher dose of the citalopram. While the first is tested, and found to not be supported, the second is not tested at all ("Ruling out the first hypothesis, sorl1 knockouts may react excessively to a given spike in serotonin." and "Second, sorl1 knockouts may be overly sensitive to serotonin itself because post-synaptic neurons have higher levels of serotonin receptors."). Assuming that the finding is robust, there are probably other reasons why the mutants could have a different sensitivity to this molecule. However, if this particular one is going to be mentioned, it is surprising that it was not tested alongside the first hypothesis. This work could proceed without a complete explanation, but additional discussion of the possibilities would be helpful or why the second hypothesis was not tested.

      There are no strong scientific reasons why this hypothesis was not tested. The lead author (F Kroll) moved to a different lab and country so the project was finalised at that time. We do not plan on testing this hypothesis at this stage. However, we adapted the wording to make it clear this is one possible alternative hypothesis which could be tested in the future. The small differences found by HCR are actually more in line with the new results from the fluvoxamine experiment, so it may also be that both hypotheses (pre-synaptic neurons releasing less serotonin when reuptake is blocked; or post-synaptic neurons being less sensitive) contribute. The fluvoxamine experiment was performed in a different lab (ICM, Paris; all other experiments were done in UCL, London) in a different wild-type strain (TL in ICM, AB x Tup LF in UCL), which complicates how one interprets this discrepancy.

      - The authors claim that "all four genes produced a fairly consistent phenotype at night". While it is interesting that this result arose in the different lines, the second clutch for some genes did not replicate as well as others. I think the findings are compelling, regardless, but the sometimes missing replicability should be discussed. I wonder if the F0 strategy adds noise to the results and if clean null lines would yield stronger phenotypes. Please discuss this possibility, or others, in regard to the variability in some phenotypes.

      For the first part of this point, please see below our answer to Reviewer #3, point (2) c.

      Whether the F0 strategy adds variability is an interesting question, which we tested in a larger dataset of behavioural recordings from F0 and stable knockouts for the same genes (unpublished). In summary, the F0 knockout method does not increase clutch-to-clutch or larva-to-larva variability in the assay. F0 knockout experiments found many more significant parameters and larger effect sizes than stable knockout experiments, but this difference could largely be explained by the larger sample sizes of F0 knockout experiments. In fact, larger sample sizes within individual clutches appear to be a major advantage of the F0 knockout approach over in-cross of heterozygous knockout animals, as they increase the sensitivity of the assay without causing substantial variability. We plan to report in more detail on this analysis in a separate paper as we think it would dilute the focus of the present work.

      - In this work, the knockout of appa/appb is included. While APP is a well-known risk gene, there is no clear justification for making a knockout model. It is well known that the upregulation of app is the driver of Alzheimer's, not downregulation. The authors even indicate an expectation that it could be similar to the other knockouts ("Moreover, the behavioural phenotypes of appa/appb and psen1 knockout larvae had little overlap while they presumably both resulted in the loss of Aβ." and "Comparing with early-onset genes, psen1 knockouts had similar night-time phenotypes, but loss of psen2 or appa/appb had no effect on night-time sleep."). There is no reason to expect similarity between appa/appb and psen1/2. I understand that the app knockouts could unveil interesting early neurodevelopmental roles, but the manuscript needs to be clarified that any findings could be the opposite of expectation in AD.

      On “there is no reason to expect similarity […]”, we disagree. Knockout of appa/appb and knockout of psen1 will both result in loss of Aβ (appa/appb encode Aβ and psen1 cleaves Appa/Appb to release Aβ, cf. Fig. 3e). Consequently, a phenotype caused by the loss of Aβ, or possibly other Appa/Appb cleavage products, should logically be found in both appa/appb and psen1 knockouts.

      On “it is well known that the upregulation of APP is the driver of Alzheimer’s, not downregulation”, we of course agree. Among others, the examples of Down syndrome, APP duplication (Sleegers et al., 2006), or mouse models overexpressing human APP show definitively that overexpression of APP is sufficient to cause AD. Having said that, we would not be so quick to dismiss APP knockout as potentially relevant to the understanding of AD.

      Loss of soluble Aβ due to aggregation could contribute to pathology (Espay et al., 2023). Without getting too much into this intricate debate, links between levels of Aβ and risk of disease are often counter-intuitive too. For example, out of 138 PSEN1 mutations screened in vitro, 104 reduced total Aβ production and 11 even seemingly abolished the production of both Aβ40 and Aβ42 (Sun et al., 2017). In short, loss of soluble Aβ occurs in both AD and in our appa/appb knockout larvae.

      We added a sentence in Results (section psen2 knockouts […]) to briefly justify our appa/appb knockout approach. To be clear, we do not want to imply, for example, that the absence of a night-time sleep phenotype for appa/appb is contradictory to the body of literature showing links between Aβ and sleep, including in zebrafish (Özcan et al., 2020). As you say, our experiment tested loss of App, including Aβ, while the literature typically reports on overexpression of APP, as in APP/PSEN1-overexpressing mice (Jagirdar et al., 2021).

      Reviewer #3 (Public Review):

      In this manuscript by Kroll and colleagues, the authors describe combining behavioral pharmacology with sleep profiling to predict disease and potential treatment pathways at play in AD. AD is used here as a case study, but the approaches detailed can be used for other genetic screens related to normal or pathological states for which sleep/arousal is relevant. The data are for the most part convincing, although generally the phenotypes are relatively small and there are no major new mechanistic insights. Nonetheless, the approaches are certainly of broad interest and the data are comprehensive and detailed. A notable weakness is the introduction, which overly generalizes numerous concepts and fails to provide the necessary background to set the stage for the data.

      Major points

      (1) The authors should spend more time explaining what they see as the meaning of the large number of behavioral parameters assayed and specifically what they tell readers about the biology of the animal. Many are hard to understand--e.g. a "slope" parameter.

      We agree that some parameters do not tell something intuitive about the biology of the animal. It would be easy to speculate. For example, the “activity slope” parameter may indicate how quickly the animal becomes tired over the course of the day. On the other hand, fractal dimension describes the “roughness/smoothness” of the larva’s activity trace (Fig. 2–supplement 1a); but it is not obvious how to translate this into information about the physiology of the animal. We do not see this as an issue though. While some parameters do provide intuitive information about the animal’s behaviour (e.g. sleep duration or sunset startle as a measure of startle response), the benefit of having a large number of behavioural parameters is to compare behavioural fingerprints and assess rescue of the behavioural phenotype by small molecules (Fig. 6c). For this purpose, the more parameters the better. The “MoSeq” approach from Wiltschko et al., 2020 is a good example from literature that inspired our own Fig. 6c. While some of the “behavioural syllables” may be intuitive (e.g. running or grooming), it is probably pointless to try to explain the ‘meaning’ of the “small left turn in place with head motion” syllable (Wiltschko et al., 2020). Nonetheless, this syllable was useful to assess whether a drug specifically treats the behavioural phenotype under study without causing too many side effects. Unfortunately, ZOLTAR has to reduce the FramebyFrame fingerprint (17 parameters) to just six parameters to compare it to the behavioural dataset from Rihel et al., 2010, but here, more parameters would almost certainly translate into better predictions too, regardless of their intuitiveness.

      It is true however that we did not give much information on how some of the less intuitive parameters, such as activity slope or fractal dimension, are calculated or what they describe about the dataset (e.g. roughness/smoothness for fractal dimension). We added a few sentences in the legend of Fig. 2–supplement 1.

      (2) Because in the end the authors did not screen that many lines, it would increase confidence in the phenotypes to provide more validation of KO specificity. Some suggestions include:

      a. The authors cite a psen1 and psen2 germline mutant lines. Can these be tested in the FramebyFrame R analysis? Do they phenocopy F0 KO larvae?

      We unfortunately do not have those lines. We investigated the possibility of importing a psen2 knockout line from abroad, but shipping live animals is becoming increasingly cost- and time-prohibitive. However, we observed the same pigmentation phenotype for psen2 knockouts as reported by Jiang et al., 2018, which at least partially confirms that the F0 knockouts phenocopy a stable loss-of-function mutant.

      b. psen2_KO is one of the larger centerpieces of the paper. The authors should present more compelling evidence that animals are truly functionally null. Without this, how do we interpret their phenotypes?

      We disagree that there should be significant doubt about these mutants being truly functionally null, given the high mutation rate and presence of the expected pigmentation phenotype (Jiang et al., 2018, Fig. 3f and Fig. 3–supplement 3a). The psen2 F0 knockouts were virtually 100% mutated at three exons across the gene (mutation rates were locus 1: 100 ± 0%; locus 2: 99.99 ± 0.06%; locus 3: 99.85 ± 0.24%). Additionally, two of the three mutated exons had particularly high rates of frameshift mutations (locus 1: 97 ± 5%; locus 2: 88 ± 17% frameshift mutation rate). It is virtually impossible that a functional protein is translated given this burden of frameshift mutations. Phenotypically, in addition to the pigmentation defect, double psen1/psen2 F0 knockout larvae had curved tails, the same phenotype as caused by a high dose of the γ-secretase inhibitor DAPT (Yang et al., 2008). These double F0 knockouts were lethal, while knockout of psen1 or psen2 alone did not cause obvious morphological defects. Evidently, most larvae must have been psen2 null mutants in this experiment, otherwise functional Psen2 would have prevented early lethality.

      Translation of zebrafish psen2 can start at downstream start codons if the first exon has a frameshift mutation, generating a seemingly functional Psen2 missing the N-terminus (Jiang et al., 2020). Zebrafish homozygous for this early frameshift mutation had normal pigmentation, showing it is a reliable marker of Psen2 function even when it is mutated. This mechanism is not a concern here as the alternative start codons are still upstream of two of the three mutated exons (the alternative start codons discovered by Jiang et al., 2020 are in exon 2 and 3, but we targeted exon 3, exon 4, and exon 6).

      We understand that the zebrafish community may be cautious about F0 phenotyping compared to stably generated mutants. As mentioned to Reviewer #2, we are planning to assemble a paper that expressly compares behavioural phenotypes measured in F0 vs. stable mutants to allay some of these concerns. Our current manuscript, which combines CRISPR-Cas9 rapid F0 screening with in silico pharmacological predictions, inevitably represents a first step in characterizing the functions of these genes.

      c. Related to the above, for cd2AP and sorl1 KO, some of the effect sizes seem to be driven by one clutch and not the other. In other words, great clutch-to-clutch variability. Should the authors increase the number of clutches assayed?

      Correct, there is substantial clutch-to-clutch variability in this behavioural assay. This is not specific to our experiments. Even within the same strain, wild-type larvae from different clutches (i.e. non-siblings) behave differently (Joo et al., 2021). This is why it is essential to compare behavioural phenotypes within individual clutches (i.e. from a single pair of parents, one male and one female), as we explain in Methods (section Behavioural video-tracking) and in the documentation of the FramebyFrame package. We often see two different experimental designs in the literature: comparing non-sibling wild-type and mutant larvae, or pooling different clutches which include all genotypes (e.g. pooling multiple clutches from heterozygous in-crosses or pooling wild-type clutches before injecting them). The first experimental design causes false positive findings (Joo et al., 2021), as the clutch-to-clutch variability we and others observe gets interpreted as a behavioural phenotype. The second experimental design should not cause false positives but likely decreases the sensitivity of the assay by increasing the spread within genotypes. In both cases, the clutch-to-clutch variability is hidden, either by interpreting it as a phenotype (first case) or by adding it to animal-to-animal variability (second case). Our experimental design is technically more challenging as it requires obtaining large clutches from unique pairs of parents. However, this approach is better as it clearly separates the different sources of variability (clutch-to-clutch or animal-to-animal). As for every experiment, yes, a larger number of replicates would be better, but we do not plan to assay additional clutches at this time. Our work heavily focuses on the sorl1 and psen2 knockout behavioural phenotypes. The key aspects of these phenotypes were effectively tested in four experiments (five to six clutches) as sorl1 knockout larvae were also tracked in the citalopram and fluvoxamine experiments (Fig. 5 and Fig. 5–supplement 1), and psen2 knockout larvae were also tracked in the small molecule rescue experiment (Fig. 6 and Fig. 6–supplement 1).

      The psen2 behavioural phenotype replicated well across the six clutches tested (pairwise cosine similarities: 0.62 ± 0.15; Author response image 2a). Larvae from 5/6 clutches were less active and initiated more sleep bouts during the day, as we claimed in Fig. 3.

      In the citalopram experiment, the H₂O-treated sorl1 knockout fingerprint replicated the baseline recordings in Fig. 4 fairly well, despite the smaller sample size (cos = 0.30 and 0.78; Author response image 2b, see “KO Fig. 5”). 5/6 of the significant parameters presented in Fig. 4–supplement 4 moved in the same direction, and knockout larvae were also hypoactive during the day but hyperactive at night. Note that two clutches were tracked on the same 96-well plate in this experiment. We calculated each larva’s z-score using the average of its control siblings, then we averaged all the z-scores to generate the fingerprint. The H₂O-treated sorl1 knockout clutch from the fluvoxamine experiment did not replicate the baseline recordings well (cos = 0.08 and 0.11; Author response image 2b, see “KO Fig. 5–suppl. 1”). Knockout larvae were hypoactive during the day as expected, but behaviour at night was not as robustly affected. As mentioned above, knockouts were made in a different genetic background (TL, instead of AB x Tup LF used for all other experiments), which could explain the discrepancy.
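      The fingerprint and cosine comparison described above can be sketched as follows. This is a minimal illustration, not the FramebyFrame implementation: the function names are hypothetical, and dividing by the standard deviation of the control siblings is an assumption (the text only states that each larva's z-score uses the average of its control siblings).

```python
import numpy as np

def fingerprint(knockout, control):
    """Mean z-score per behavioural parameter for one clutch.

    knockout, control: arrays of shape (n_larvae, n_parameters), with both
    groups taken from the same clutch (siblings).
    Assumption: z-scores are scaled by the control siblings' sample SD.
    """
    ctrl_mean = control.mean(axis=0)
    ctrl_sd = control.std(axis=0, ddof=1)
    z = (knockout - ctrl_mean) / ctrl_sd  # one z-score per larva and parameter
    return z.mean(axis=0)                 # average across larvae

def cosine_similarity(a, b):
    """Cosine of the angle between two fingerprints (range -1.0 to 1.0)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

      When two clutches share a plate, each clutch would be normalised to its own control siblings before the per-larva z-scores are averaged together.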

      We also took the opportunity to check whether our SSRI treatments replicated well the data from Rihel et al., 2010. For both citalopram (n = 3 fingerprints in the database) and fluvoxamine (n = 4 fingerprints in the database), replication was excellent (cos ≥ 0.67 for all comparisons of a fingerprint from this study vs. a fingerprint from Rihel et al. 2010; Author response image 2c,d). Note that the scrambled + 10 µM citalopram and + 10 µM fluvoxamine fingerprints correlate extremely well (cos = 0.92; can be seen in Author response image 2c,d), which was predicted by the small molecule screen dataset.

      Author response image 2.

      Replication of psen2 and sorl1 F0 knockout fingerprints and SSRI treatments from Rihel et al., 2010. a, (left) Every psen2 F0 knockout behavioural fingerprint generated in this study. Each dot represents the mean deviation from the same-clutch scrambled-injected mean for that parameter (z-score, mean ± SEM). From the experiments in Fig. 6, presented are the psen2 F0 knockout + H₂O fingerprints. The fingerprints in grey (“not shown”) are from a preliminary drug treatment experiment we did not include in the final study. These fingerprints are from psen2 F0 knockout larvae treated with 0.2% DMSO, normalised to scrambled-injected siblings also treated with 0.2% DMSO. (right) Pairwise cosine similarities (−1.0–1.0) for the fingerprints presented. b, Every sorl1 F0 knockout behavioural fingerprint, as in a). c, The scrambled-injected + citalopram (10 µM) fingerprints (grey) in comparison to the citalopram (10–15 µM) fingerprints from the Rihel et al., 2010 database (green). d, The scrambled-injected + fluvoxamine (10 µM) fingerprint (grey) in comparison to the fluvoxamine fingerprints from the Rihel et al., 2010 database (pink). In c) and d), the scrambled-injected fingerprints are from the experiments in Fig. 5 and Fig. 5–suppl. 1, but were converted here into the behavioural parameters used by Rihel et al., 2010 for comparison. Parameters: 1, average activity (sec active/min); 2, average waking activity (sec active/min, excluding inactive minutes); 3, total sleep (hr); 4, number of sleep bouts; 5, sleep bout length (min); 6, sleep latency (min until first sleep bout).

      (3) The authors make the point that most of the AD risk genes are expressed in fish during development. Is there public data to comment on whether the genes of interest are expressed in mature/old fish as well? Just because the genes are expressed early does not at all mean that early-life dysfunction is related to future AD (though this could be the case, of course). Genes with exclusive developmental expression would be strong candidates for such an early-life role, however. I presume the case is made because sleep studies are mainly done in juvenile fish, but I think it is really a pretty minor point and such a strong claim does not even need to be made.

      This is a fair criticism but we do not make this claim (“early-life dysfunction is related to future AD”) from expression alone. The reviewer is probably referring to the following quote:

      “[…] most of these were expressed in the brain of 5–6-dpf zebrafish larvae, suggesting they play a role in early brain development or function,” which does not mention future risk of AD. We do suggest that these genes have a function in development. After all, every gene that plays a role in brain development must be expressed during development, so this wording seemed reasonable. Nevertheless, we adapted the wording to address this point and Reviewer #2’s complaint below. As noted, the primary goal was to check that the genes we selected were indeed expressed in zebrafish larvae before performing knockout experiments. Our discussion does raise the hypothesis that mutations in Alzheimer’s risk genes impact brain development and sleep early in life, but this argument primarily relies on our observation that knockout of late-onset Alzheimer’s risk genes causes sleep phenotypes in 7-day old zebrafish larvae and from previous work showing brain structural differences in children at high genetic risk of AD (Dean et al., 2014; Quiroz et al., 2015), not solely on gene expression early in life.

      Please also see our answer to a similar point raised by Reviewer #2 below (cf. Author response image 7).

      (4) A common quandary with defining sleep behaviorally is how to rectify sleep and activity changes that influence one another. With psen2 KOs, the authors describe reduced activity and increased sleep during the day. But how do we know if the reduced activity drives increased behavioral quiescence that is incorrectly defined as sleep? In instances where sleep is increased but activity during periods during wake are normal or elevated, this is not an issue. But here, the animals might very well be unhealthy, and less active, so naturally they stop moving more for prolonged periods, but the main conclusion is not sleep per se. This is an area where more experiments should be added if the authors do not wish to change/temper the conclusions they draw. Are psen2 KOs responsive to startling stimuli like controls when awake? Do they respond normally when quiescent? Great care must be taken in all models using inactivity as a proxy for sleep, and it can harm the field when there is no acknowledgment that overall health/activity changes could be a confound. Particularly worrisome is the betamethasone data in Figure 6, where activity and sleep are once again coordinately modified by the drug.

      This is a fair criticism. We agree it is a concern, especially in the case of psen2 as we claim that day-time sleep is increased while zebrafish are diurnal. We do not rely heavily on the day-time inactivity being sleep (the ZOLTAR predictions or the small molecule rescue do not change whether the parameter is called sleep or inactivity), but our choice of labelling can fairly be challenged.

      To address “are psen2 KO responsive to startling stimuli like controls when awake/when quiescent”, we looked at the larvae’s behaviour immediately after lights abruptly switched on in the mornings. Almost every larva, regardless of genotype, responded strongly to every lights-off transition during the experiment, so we instead chose the lights-on transition for this analysis: it is a weaker startling stimulus for the larvae than the lights-off transition (Fig. 3–supplement 3), potentially exposing differences between genotypes or behavioural states (quiescent or awake). We defined a larva as having reacted to the lights switching on if it made a swimming bout during the second (25 frames) after the lights-on transition. Across two clutches and two lights-on transitions, an average of 65% (range 52–73%) of all larvae reacted to the stimulus. psen2 knockout larvae were similarly likely, if not more likely, to respond (on average 69% responded, range 60–76%) than controls (60% average, range 44–75%). When the lights switched on, about half of the larvae (39–51%) would have been classified as asleep according to the one-minute inactivity definition (i.e. the larva did not move in the minute preceding the lights transition). This allowed us to also compare behavioural states, as suggested by the reviewer. For three of the four light transitions, larvae which were awake when lights switched on were more likely to react than asleep larvae, but this difference was not striking (overall, awake larvae were only 1.1× more likely to react; Author response image 3). Awake psen2 knockout larvae were 1.1× (range 1.04–1.11×) more likely to react than awake control larvae, so, yes, psen2 knockout larvae respond normally when awake. Asleep psen2 knockout larvae were 1.4× (range 0.63–2.19×) more likely to react than asleep control larvae, so psen2 knockouts are also at least as likely to react as control larvae when asleep. In summary, the overall health of psen2 knockouts did not seem to be a significant confound in the experiment. As the reviewer suggested, if psen2 knockout larvae were seriously unhealthy, they would not be as responsive as control larvae to a startling stimulus.
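      The state and reaction definitions used in this analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code: the function name and the per-frame boolean movement trace are hypothetical, and any movement in the 25 frames starting at the lights-on frame is taken here as a stand-in for "made a swimming bout".

```python
import numpy as np

FPS = 25  # frames per second; the text equates one second with 25 frames

def classify_at_lights_on(moving, lights_on_frame, fps=FPS, sleep_threshold_s=60):
    """Classify one larva at a lights-on transition.

    moving: 1-D boolean array, True on frames where the larva moved
            (hypothetical input format).
    Returns (state, reacted): state is "asleep" if the larva did not move
    during the minute preceding lights-on, otherwise "awake"; reacted is
    True if it moved during the second following lights-on.
    """
    start = max(0, lights_on_frame - sleep_threshold_s * fps)
    before = moving[start:lights_on_frame]
    after = moving[lights_on_frame:lights_on_frame + fps]
    state = "awake" if before.any() else "asleep"
    return state, bool(after.any())
```

      Counting reacting larvae per genotype and state then gives the proportions compared in the text, for example the share of asleep knockout larvae that reacted divided by the share of asleep control larvae that reacted.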

      Author response image 3.

      psen2 F0 knockouts react normally to lights switching on, indicating they are largely healthy. At each lights-on transition (9 AM), each larva was categorised as awake if it had moved in the preceding one minute or asleep if it had been inactive for at least one minute. Darker tiles represent larvae which performed a swimming bout during the second following lights-on; lighter tiles represent larvae which did not move during that second. The total count of each waffle plot was normalised to 25 so plots can be compared to each other. The real count is indicated in the corner of each plot. Data is from the baseline psen2 knockout trackings presented in Fig. 3 and Fig. 3–suppl. 2.

      Next, we compared inactive period durations during the day between psen2 and control larvae. If psen2 knockout larvae indeed sleep more during the day compared to controls, we may predict inactive periods longer than one minute to increase disproportionately compared to the increase in shorter inactive periods. This broadly appeared to be the case, especially for one of the two clutches (Author response image 4). In clutch 1, inactive periods lasting 1–60 sec were equally frequent in both psen2 and control larvae (fold change 1.0× during both days), while inactive periods lasting 1–2 min were 1.5× (day 1) and 2.5× (day 2) more frequent in psen2 larvae compared to control larvae. In clutch 2, 1–60 sec inactive periods were also equally frequent in both psen2 and control larvae, while inactive periods lasting 1–2 min were 3.4× (day 1) and 1.5× (day 2) more frequent in psen2 larvae compared to control larvae. Therefore, psen2 knockouts disproportionately increased the frequency of inactive periods longer than one minute, suggesting they genuinely slept more during the day.

      Author response image 4.

      psen2 F0 knockouts preferentially increased the frequency of longer inactive bouts. For each day and clutch, we calculated the mean distribution of inactive bout lengths across larvae of the same genotype (psen2 F0 knockout or scrambled-injected), then compared the frequency of inactive bouts of different lengths between the two genotypes. For example, in clutch 1 during day 2, 0.01% of the average scrambled-injected larva’s inactive bouts lasted 111–120 seconds (X axis 120 sec) while 0.05% of the average psen2 F0 knockout larva’s inactive bouts lasted this long, so the fold change was 5×. Inactive bouts lasting < 1 sec were excluded from the analysis. In clutch 2, day 1 plot, two datapoints fall outside the Y axis limit: 140 sec, Y = 32×; 170 sec, Y = 16×. Data is from the baseline psen2 knockout trackings presented in Fig. 3 and Fig. 3–suppl. 2.
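      The fold-change calculation in this legend can be sketched as follows. This is a minimal illustration, not the code used to produce the figure: the function names and the choice of bin edges are hypothetical, and each larva's bout lengths are assumed to be available in seconds.

```python
import numpy as np

def bout_length_frequencies(bout_lengths_s, bin_edges):
    """Fraction of one larva's inactive bouts falling in each length bin.

    Bouts shorter than 1 s are excluded, as in the legend.
    bin_edges is a hypothetical choice, e.g. np.arange(1, 181, 10).
    """
    lengths = np.asarray(bout_lengths_s, dtype=float)
    lengths = lengths[lengths >= 1.0]
    counts, _ = np.histogram(lengths, bins=bin_edges)
    total = counts.sum()
    return counts / total if total > 0 else np.zeros(len(counts))

def fold_change(ko_freqs, ctrl_freqs):
    """Ratio of mean knockout to mean control frequency, per bin.

    ko_freqs, ctrl_freqs: arrays of shape (n_larvae, n_bins).
    Bins where the control mean is zero are returned as NaN.
    """
    ko_mean = np.mean(ko_freqs, axis=0)
    ctrl_mean = np.mean(ctrl_freqs, axis=0)
    out = np.full(ko_mean.shape, np.nan)
    np.divide(ko_mean, ctrl_mean, out=out, where=ctrl_mean > 0)
    return out
```

      With the legend's example values, a mean knockout frequency of 0.05% against a mean control frequency of 0.01% in the same bin gives 0.05 / 0.01 = 5×.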

      Ultimately, this criticism seems challenging to address definitively by experiment. A possible approach could be to use a closed-loop system which, after one minute of inactivity, triggers a stimulus that is sufficient to startle an awake larva but not an asleep larva. If psen2 knockout larvae indeed sleep more during the day, the stimulus should usually not be sufficient to startle them. Nevertheless, we believe the two analyses presented here are consistent with psen2 knockout larvae genuinely sleeping more during the day, so we decided to keep this label. We agree with the reviewer that the one-minute inactivity definition has limitations, especially for day-time inactivity.

      (5) The conclusions for the serotonin section are overstated. Behavioural pharmacology purports to predict a signaling pathway disrupted with sorl1 KO. But is it not just possible that the drug acts in parallel to the true disrupted pathway in these fish? There is no direct evidence for serotonin dysfunction - that conclusion is based on response to the drug. Moreover, it is just one drug - is the same phenotype present with another SSRI? Likewise, language should be toned down in the discussion, as this hypothesis is not "confirmed" by the results (consider "supported"). The lack of measured serotonin differences further raises concern that this is not the true pathway. This is another major point that deserves further experimental evidence, because without it, the entire approach (behavioral pharm screen) seems more shaky as a way to identify mechanisms. There are any number of testable hypotheses to pursue such as a) Using transient transgenesis to visualize 5HT neuron morphology (is development perturbed: cell number, neurite morphology, synapse formation); b) Using transgenic Ca reporters to assay 5HT neuron activity.

      Regarding the comment, “is it not just possible that the drug acts in parallel to the true disrupted pathway”, we think no, assuming we understand correctly the question. Key to our argument is the fact that sorl1 knockout larvae react differently to the drug(s) than control larvae. As an example, take night-time sleep bout length, which was not affected by knockout of sorl1 (Fig. 4–supplement 4). For the sake of the argument, say only dopamine signalling (the “true disrupted pathway”) was affected in sorl1 knockouts and that serotonin signalling was intact. Assuming that citalopram specifically alters serotonin signalling, then treatment should cause the same increase in sleep bout length in both knockouts and controls as serotonin signalling is intact in both. This is not what we see, however. Citalopram caused a greater increase in sleep bout length in sorl1 knockouts than in scrambled-injected larvae. In other words, the effect is non-additive, in the sense that citalopram did not add the same number of z-scores to sorl1 knockouts or controls. We think this shows that serotonin signalling is somehow different in sorl1 knockouts. Nonetheless, we concede that the experiment does not necessarily say much about the importance of the serotonin disruption caused by loss of Sorl1. It could be, for example, that the most salient consequence of loss of Sorl1 is cholinergic disruption (see reply to Reviewer #1 above) and that serotonin signalling is a minor theme.

      Furthermore, we agree with the reviewer and Reviewer #2 that the conclusions were overly confident. As suggested, we decided to repeat this experiment with another SSRI, fluvoxamine. Please find the results of this experiment in Fig. 5–supplement 1. The suggestions to further test the serotonin system in the sorl1 knockouts are excellent as well; however, we do not plan to pursue them at this stage.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major Comments:

      - Data are presented in a variety of different ways, occasionally making comparisons across figures difficult. Perhaps at a minimum, behavioral fingerprints as in Figure 3 - Supplementary Figure 1 should be presented for all mutants in the main figures.

      We like this suggestion! Thank you. We brought the behavioural fingerprints figure (previously Fig. 4–supplement 5) as main Fig. 4, and put the figure focused on the sorl1 knockout behavioural phenotype in supplementary, with the other gene-by-gene figures.

      - It is not clear why some data were selected for supplemental rather than main figures. In many cases, detailed phenotypic data is provided for one example mutant in the main figures, and then additional mutants are described in detail in the supplement. Again, to facilitate comparisons between mutants, fingerprints could be provided for all mutants in a main figure, with detailed analyses moved to the supplements.

      The logic was to dedicate one main figure to psen2 (Fig. 3) as an example of an early-onset Alzheimer’s risk gene, and one to sorl1 (previously Fig. 4) as an example of a late-onset Alzheimer’s risk gene. We focused on them in main figures as they are both tested again later (Fig. 5 and Fig. 6). Having said that, we agree that the fingerprints may be a better use of main figure space than the parameters plots. In addition to the above (fingerprints of late-onset Alzheimer’s risk genes in main figure), we rearranged the figures in the early-onset AD section to have the psen2 F0 knockout fingerprint in main.

      - The explication of the utility of behavioral fingerprinting on page 35 is somewhat confusing. The authors describe drugs used to treat depression as enriched among small molecules anti-correlating with the sorl1 fingerprint. However, in Figure 5 - Supplementary Figure 1, drugs used to treat depression are biased toward positive cosines, which are indicated as having a more similar fingerprint to sorl1. These drugs should be described as more present among compounds positively correlating with the sorl1 fingerprint.

      Sorry, the confusion is about “(anti-)correlating”. Precisely, we meant “correlating and/or anti-correlating”, not just anti-correlating. We changed to that wording. In short, the analysis is by design agnostic to whether compounds with a given annotation are found more on the positive cosines side (left side in Fig. 5–supplement 1a) or the negative cosines side (right side). This is because the dataset often includes both agonists and antagonists to a given pathway but these are difficult to annotate. For example, say 10 compounds in the dataset target the dopamine D4 receptor, but these are an unknown mix of agonists and antagonists. In this case, we want ZOLTAR to generate a low p-value when all 10 compounds are found at extreme ends of the list, regardless of which end(s) that is (e.g. top 8 and bottom 2 should give an extremely low p-value). Initially, we were splitting the list, for each annotation, into positive-cosine fingerprints and negative-cosine fingerprints and testing enrichment on both separately, but we think the current approach is better as it reflects better the cases we want to detect and considers all available examples for a given annotation in one test. In sum, yes, in this case drugs used to treat depression were mostly in the positive-cosine side, but the other drugs on the negative-cosine side also contributed to what the p-value is, so it reflects better the analysis to say “correlating and/or anti-correlating”. You can read more about our logic for the analysis in Methods (section Behavioural pharmacology from sorl1 F0 knockout’s fingerprint).

      - The authors conclude the above-described section by stating: "sorl1 knockout larvae behaved similarly to larvae treated with small molecules targeting serotonin signaling, suggesting that the loss of Sorl1 disrupted serotonin signaling." Directionality here may be important. Are all of the drugs targeting the serotonin transporter SSRIs or similar? If so, then a correct statement would be that loss of Sorl1 causes similar phenotypes to drugs enhancing serotonin signaling. Finally, based on the correlation between serotonin transporter inhibitor trazodone and the sorl1 crispant phenotype, it is potentially surprising that the SSRI citalopram caused the opposite phenotype from sorl1, that is, increased sleep during the day and night. It is potentially interesting that this result was enhanced in mutants, and suggests dysfunction of serotonin signaling, but the statement that "our behavioral pharmacology approach correctly predicted from behaviour alone that serotonin signaling was disrupted" is too strong a conclusion.

      We understand “disrupt” as potentially going either way, but this may not be the common usage. We changed to “altered”.

      The point regarding directionality is excellent, however. We tested the proportion of serotonin transporter agonists and antagonists (SSRIs) on each side of the ranked list of small molecule fingerprints. We used the STITCH database for this analysis as it has more drug–target interactions, but is likely less curated, than the Therapeutic Target Database (Szklarczyk et al., 2016). As with the Therapeutic Target Database, most fingerprints of compounds interacting with the serotonin transporter SLC6A4 were found on the side of positive cosines (p ~ 0.005 using the custom permutation test), which replicates Fig. 5a with a different source for the drug–target annotations (Author response image 5). On the side of positive cosines (small molecules which generate behavioural fingerprints correlating with the sorl1 fingerprint), there were 2 agonists and 26 antagonists. On the side of negative cosines (small molecules which generate behavioural fingerprints anti-correlating with the sorl1 fingerprint), there were 3 agonists and 2 antagonists. Using a Chi-squared test, this suggests a significant (p = 0.002) over-representation of antagonists (SSRIs) on the positive side (expected count = 24, vs. 26 observed) and agonists on the negative side (expected count = 1, vs. 3 observed). If SLC6A4 antagonists, i.e. SSRIs, indeed tend to cause a behavioural phenotype similar to that of sorl1 knockout, this would point in the direction of our original interpretation of the citalopram experiment, which was that excessive serotonin signalling is what causes the sorl1 behavioural phenotype.
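
      As an illustration of the arithmetic behind this test, the sketch below runs a Pearson Chi-squared test on the 2 × 2 table of counts quoted above. It assumes no continuity correction, which is our guess rather than something stated in the response; the function name is hypothetical.

```python
import math

def chi_squared_2x2(table):
    """Pearson chi-squared test on a 2x2 table (assumption: no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count in each cell if side and drug class were independent.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # With 1 degree of freedom, the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value

# Rows: positive-cosine side, negative-cosine side. Columns: agonists, antagonists.
observed = [[2, 26], [3, 2]]
expected, chi2, p_value = chi_squared_2x2(observed)
print(expected, chi2, p_value)
```

      Under this assumption, the expected counts for antagonists on the positive side and agonists on the negative side round to the values of 24 and 1 quoted above.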

      Author response image 5.

      Using the STITCH database as source of annotations also predicts SLC6A4 as an enriched target for the sorl1 behavioural fingerprint. Same figures as Fig. 5a,b but using the STITCH database (Szklarczyk et al., 2016) as source for the drug targets. a, Compounds annotated by STITCH as interacting with the serotonin transporter SLC6A4 tend to generate behavioural phenotypes similar to the sorl1 F0 knockout fingerprint. 40,522 compound–target protein pairs (vertical bars; 1,592 unique compounds) are ranked from the fingerprint with the most positive cosine to the fingerprint with the most negative cosine in comparison with the mean sorl1 F0 knockout fingerprint. Fingerprints of drugs that interact with SLC6A4 are coloured in yellow. Simulated p-value = 0.005 for enrichment of drugs interacting with SLC6A4 at the top (positive cosine) and/or bottom (negative cosine) of the ranked list by a custom permutation test. b, Result of the permutation test for top and/or bottom enrichment of drugs interacting with SLC6A4 in the ranked list. The absolute cosines of the fingerprints of drugs interacting with SLC6A4 (n = 52, one fingerprint per compound) were summed, giving sum of cosines = 15.9. To simulate a null distribution, 52 fingerprints were randomly drawn 100,000 times, generating a distribution of 100,000 random sum of cosines. Here, only 499 random draws gave a larger sum of cosines, so the simulated p-value was p = 499/100,000 = 0.005 **.
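
      The permutation test described in this legend can be sketched as follows. This is a minimal illustration rather than the ZOLTAR implementation: the function name, the toy cosines, and the choice to draw the random sets from the full ranked list are assumptions, and the number of draws is reduced from 100,000 to keep the example fast.

```python
import random

def permutation_p_value(cosines, is_target, n_draws=2000, seed=0):
    """Simulated p-value for enrichment at the top and/or bottom of a ranked list."""
    rng = random.Random(seed)
    abs_cosines = [abs(c) for c in cosines]
    target_values = [a for a, t in zip(abs_cosines, is_target) if t]
    observed_sum = sum(target_values)
    n_targets = len(target_values)
    larger = 0
    for _ in range(n_draws):
        # Draw as many fingerprints as there are annotated ones (assumed: from the whole list).
        if sum(rng.sample(abs_cosines, n_targets)) > observed_sum:
            larger += 1
    # Fraction of random draws with a larger sum of absolute cosines.
    return observed_sum, larger / n_draws

# Hypothetical toy data: 201 fingerprints with cosines from -1 to 1,
# with annotated compounds at the bottom 2 and top 8 positions.
cosines = [i / 100 - 1 for i in range(201)]
is_target = [i < 2 or i >= 193 for i in range(201)]
observed_sum, p_value = permutation_p_value(cosines, is_target)
print(observed_sum, p_value)
```

      Summing absolute cosines is what makes the test agnostic to which end of the list the annotated compounds fall on.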

      If this were true, we would expect, as the reviewer suggested, SSRI treatment (citalopram or fluvoxamine) on control larvae to give a behavioural phenotype similar to that of sorl1 knockout. However, this generally did not appear to be the case (sorl1 knockout fingerprint vs. SSRI-treated control fingerprint, cosine = 0.08 ± 0.35; Author response image 6).

      Author response image 6.

      sorl1 F0 knockouts in comparison to controls treated with SSRIs. a, sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H<sub>2</sub>O fingerprint from the citalopram experiment) in comparison with the scrambled-injected + citalopram (1 or 10 µM) fingerprints. Each dot represents the mean deviation from the same-clutch scrambled-injected H<sub>2</sub>O-treated mean for that parameter (z-score, mean ± SEM). b, As in a), sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H<sub>2</sub>O fingerprint from the fluvoxamine experiment) in comparison with the scrambled-injected + fluvoxamine (10 µM) fingerprint.

      The comparison with trazodone is an interesting observation, but it is only a weak serotonin reuptake inhibitor (Ki for SLC6A4 = 690 nM, vs. 8.9 nM for citalopram; Owens et al., 1997) and it has many other targets, as agonist or antagonist, including serotonin, adrenergic, and histamine receptors (Mijur, 2011). In any case, the average trazodone fingerprint does not correlate particularly well with the sorl1 knockout fingerprint (cos = 0.3). Finally, the sorl1 knockout behavioural phenotype could be primarily caused by altered serotonin signalling in the hypothalamus, where we found both the biggest difference in tph1a/1b/2 HCR signal intensity (Fig. 5f) and the highest expression of sorl1 across scRNA-seq clusters (Fig. 1–supplement 2). In this case, it would be correct to expect sorl1 knockouts to react differently to SSRIs than controls, but it would be incorrect to expect SSRI treatment to cause the same behavioural phenotype, as it concurrently affects every other serotonergic neuron in the brain.

      Finally, we agree the quoted conclusion was too strong given the current evidence. We since tested another SSRI, fluvoxamine, on sorl1 knockouts.

      - Also in reference to Figure 5: in panel c, data are presented as deviation from vehicle treated. Because of this data presentation choice, it's no longer possible to determine whether, in this experiment, sorl1 crispants sleep less at night relative to their siblings. Does citalopram rescue / reverse sleep deficits in sorl1 mutants?

      On your first point, please see our response to Reviewer #3 (2)c and Author Response 2b above.

      On “does citalopram rescue/reverse sleep deficits in sorl1 mutants”: citalopram (and fluvoxamine) tends to reverse the key aspects of the sorl1 knockout behavioural phenotype by reducing night-time activity (% time active and total Δ pixels), increasing night-time sleep, and shortening sleep latency (Author response image 7). Extrapolating from the hypothesis presented in Discussion, this may be interpreted as a hint that sorl1 knockouts have reduced levels of 5-HT receptors, as increasing serotonin signalling using an SSRI tends to rescue the phenotype. However, we do not think that focusing on the significant behavioural parameters necessarily makes sense here. Rather, one should take all parameters into account to conclude whether knockouts react differently to the drug than wild types (also see answer to Reviewer #3, (7) on this). For example, citalopram increased the night-time sleep bout length of sorl1 knockouts more than that of controls (Fig. 5), but this parameter was not modified by knockout of sorl1 (Fig. 4). To explain the rationale more informally, citalopram is only used as a tool here to probe serotonin signalling in sorl1 knockouts. Whether it worsens or rescues the behavioural phenotype is somewhat secondary; the key question is whether knockouts react differently than controls.

      Author response image 7.

      Comparing untreated sorl1 F0 knockouts vs. treated with SSRIs. a, sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H<sub>2</sub>O fingerprint from the citalopram experiment) in comparison with the sorl1 knockout + citalopram (1 or 10 µM) fingerprints. Each dot represents the mean deviation from the same-clutch scrambled-injected H<sub>2</sub>O-treated mean for that parameter (z-score, mean ± SEM). b, As in a), sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H<sub>2</sub>O fingerprint from the fluvoxamine experiment) in comparison with the sorl1 + fluvoxamine (10 µM) fingerprint.

      - Possible molecular pathways targeted by tinidazole, fenoprofen, and betamethasone are not described.

      Tinidazole is an antibiotic, fenoprofen is a non-steroidal anti-inflammatory drug (NSAID), and betamethasone is a steroidal anti-inflammatory drug. Interestingly, long-term use of NSAIDs reduces the risk of AD (in ’t Veld Bas A. et al., 2001). Several mechanisms are possible (Weggen et al., 2007), including reduction of Aβ42 production by interacting with γ-secretase (Eriksen et al., 2003). However, we did not explore the mechanism of action of these drugs on psen2 knockouts so do not feel comfortable speculating. We do not know, for example, whether these findings apply to betamethasone.

      Minor Comments:

      - On page 25, panel "g" should be labeled as "f".

      Thank you!

      - On page 35, a reference should be provided for the statement "From genomic studies of AD, we know that mutations in genes such as SORL1 modify risk by disrupting some biological processes.".

      Thank you, this is now corrected. These were the same studies as mentioned in Introduction.

      - On page 43, the word "and" should be added - "in wild-type rats and mice, overexpressing mutated human APP and PSEN1, AND restricting sleep for 21 days...".

      Right, this sentence could be misread, we edited it. “overexpressing […]” only applied to the mice, not the rats (as they are wild-type); and both are sleep-deprived.

      - On page 45, a reference should be provided for the statement "SSRIs can generally be used continuously with no adverse effects" and this statement should potentially be softened.

      The reference is at the end of that sentence (Cirrito et al., 2011). You are correct though; we reformulated this statement to: “SSRIs can generally be used safely for many years”. SSRIs indeed have side effects.

      - On page 54, a 60-minute rolling average is described as 45k rows, but this seems to be a 30-minute rolling average.

      Thank you! We corrected. It should have been 90k rows, as in: 25 frames-per-second × 60 seconds × 60 minutes.
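
      The row count can be checked with a short calculation. The helper name below is hypothetical; the 25 frames-per-second rate is the one quoted above.

```python
def rolling_window_rows(frames_per_second, window_minutes):
    """Number of frame-by-frame rows covered by a rolling window of the given length."""
    return frames_per_second * 60 * window_minutes

rows_60 = rolling_window_rows(25, 60)  # 60-minute window: 90,000 rows
rows_30 = rolling_window_rows(25, 30)  # 30-minute window: 45,000 rows
print(rows_60, rows_30)
```

      This confirms the reviewer's reading: 45k rows corresponds to a 30-minute window at 25 frames per second.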

      Reviewer #2 (Recommendations For The Authors):

      "As we observed in the scRNA-seq data, most genes tested (appa, appb, psen1, psen2, apoea, cd2ap, sorl1) were broadly expressed throughout the 6-dpf brain (Fig. 1d and Fig. 1–supplement 3 and 4)."

      - apoea and appb are actually not expressed highly in the scRNA-seq data, and the apoea in situ looks odd, as if it has no expression. The appb gene mysteriously does not look as though it has high expression in the Raj data, but it is clearly expressed based on the in situ. I had previously noticed the same discrepancy, and I attribute it to the transcriptome used to map the Raj data, as the new DanioCell data uses a new transcriptome and indicates high appb expression in the brain. Please point out the discrepancy and possible explanation, perhaps in the figure legend.

      All excellent points, thank you. We included them directly in Results text.

      "most of these were expressed in the brain of 5-6-dpf zebrafish larvae, suggesting they play a role in early brain development or function."

      - Evidence of expression does not suggest function, particularly not a function in brain development. As one example, almost half of the genome is expressed prior to the maternal-zygotic transition but does not have a function in those earliest stages of development. There are numerous other instances where expression does not equal function. Please change the sentence even as simply as "it is possible that they".

      We mostly agree and edited to “[…], so they could play a role […]”.

      Out of curiosity, we plotted, for each zebrafish developmental stage, the proportion of Alzheimer’s risk gene orthologues expressed in comparison to the proportion of all genes expressed (Author response image 8). We defined “all genes” as every gene that is expressed in at least one of the developmental stages (n = 24,856), not the complete transcriptome, to avoid including genes that are never expressed in the brain or whose expression is always below detection limit. We counted a gene as “expressed” if at least three cells had detectable transcripts. Using these definitions, 82 ± 7% of genes are expressed during development. For every developmental stage except 5 dpf (so 11/12), a larger proportion of Alzheimer’s risk genes than all genes are expressed (+5 ± 4%).

      Author response image 8.

      Proportion of Alzheimer’s risk genes orthologues expressed throughout zebrafish development. Proportion of Alzheimer’s risk genes orthologues (n = 42) and all genes (n = 24,856) expressed in the zebrafish brain at each developmental stage, from 12 hours post-fertilisation (hpf) to 15 days post-fertilisation (dpf). “All genes” corresponds to every gene expressed in the brain at any of the developmental stages, not the complete transcriptome. A gene is considered “expressed” (green) if at least three cells had detectable transcripts. Single-cell RNA-seq dataset from Raj et al., 2020.
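
      The definition of “expressed” used in this analysis can be sketched as below. The gene names and cell counts are hypothetical toy values, not data from Raj et al., 2020, and the function name is our own.

```python
def proportion_expressed(cells_with_transcripts, genes, min_cells=3):
    """Fraction of `genes` detected in at least `min_cells` cells at one stage."""
    expressed = [g for g in genes if cells_with_transcripts.get(g, 0) >= min_cells]
    return len(expressed) / len(genes)

# Hypothetical number of cells with detectable transcripts at one developmental stage.
stage_counts = {"geneA": 10, "geneB": 2, "geneC": 3, "geneD": 0}
all_genes = ["geneA", "geneB", "geneC", "geneD"]
risk_genes = ["geneA", "geneC"]

prop_all = proportion_expressed(stage_counts, all_genes)    # 2 of 4 genes pass the threshold
prop_risk = proportion_expressed(stage_counts, risk_genes)  # 2 of 2 genes pass the threshold
print(prop_all, prop_risk)
```

      Repeating this for each developmental stage gives the two proportions compared in the figure.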

      "This frame-by-frame analysis has several advantages over previous methods that analysed activity data at the one-minute resolution."

      - Which methods are these? There are no citations. There are certainly existing methods in the zebrafish field that can produce similar data to the method developed for this project. This new package is useful, as most existing software is not written in R, so it would help scientists who prefer this programming language. However, I would be careful not to oversell its novelty, since many methods do exist that produce similar results.

      We added the references. They were referenced above after “we combined previous sleep/wake analysis methods”, but should have been referenced again here.

      We are not convinced by this criticism. We would obviously not claim that the FramebyFrame package is as sophisticated and versatile as video-tracking tools like SLEAP or DeepLabCut, but we do think it answers a genuine need that was not addressed by other methods. Specifically, we know of many labs recording pixel count data across multiple days using the Zebrabox or DanioVision (we added support for DanioVision data after submission), but there were no packages to extract behavioural parameters from these data. Other methods involved standalone scripts with no documentation or version tracking. We would concede the FramebyFrame package is mostly targeted at these labs, but we already know of six labs routinely using it and were recently contacted by a researcher tracking Daphnia in the Zebrabox.

      "F0 knockouts of both cutches" - "clutches"

      Thank you!

      Reviewer #3 (Recommendations For The Authors):

      I would suggest totally revamping the Introduction section, and being sure to provide readers with the context and background they need for the data that comes thereafter. Key areas to touch on, in no particular order, include:

      • Far more detail on the behavioral pharm screen upon which this paper builds, as a brief overview of that approach and the data generated are needed.

      Thank you for the suggestion, we added a sentence hinting at this work in the last Introduction paragraph.

      • Limitations of current zebrafish sleep/arousal assays that motivated the authors to develop a new, temporally high-resolution system.

      We think this is better explained in Results, as is currently. For example, we need to point to Fig. 2–supplement 2a,b,c to explain that one-minute methods were missing sleep bouts and how FramebyFrame resolves this issue.

      • A paragraph about sleep and AD, that does a better job of citing work in humans, mammalian, and invertebrate models that motivate the interest in the connection pursued here.

      Sorry, we think this would place too much focus on sleep and AD. We want the main topic of the paper to be the behavioural pharmacology approach, not AD or sleep per se. As the Introduction states, we see Alzheimer’s risk genes as a case study for the behavioural pharmacology approach, rather than the reason why the approach was developed. Additionally, presenting sleep and AD in Introduction risks sounding like ZOLTAR is specifically designed for this context, while we conceived of it as much more generalisable and explicitly encourage its use to study genes associated to other diseases. Note that the paragraph you suggest is, we think, mostly present in Discussion (section Disrupted sleep and serotonin signalling […]).

      • I modestly suggest eliminating making such a strong case for a gene-first approach being the best way to understand disease. It is not a zero-sum game, and there is plenty to learn from proteomics, metabolomics, etc. I suspect nobody will argue with the authors saying they leveraged the strength of their system and focused on key AD genes of interest.

      From your point below, we understand the following quote is the source of the issue: “For finding causal processes, studying the genome, rather than the transcriptome or epigenome, is advantageous because the chronology from genomic variant to disease is unambiguous […]”. We did not want to suggest it is a zero-sum game, but we now understand how it can be read this way. We slightly adapted the wording. What we want to do is highlight the causality argument as the advantage of the genomics approach. We feel we do not read this argument often enough, while it remains a ‘magic power’ of genomics. One essentially does not have to worry about causality when studying a pathogenic germline variant, while it is a constant concern when studying the transcriptome or epigenome (i.e. did the change in this transcript’s level cause disease, or vice-versa?). To take an example in the context of AD, arguments based on genomics (e.g. Down syndrome or APP duplication) are often the definitive arbiters when debating the amyloid hypothesis, exactly because their causality cannot be doubted.

      Minor comments

      (1) The opening of the introduction is perhaps overly broad, spending an entire paragraph on genome vs transcriptome, etc and making the claim that a gene-first approach is the best path. It isn't zero-sum, and the authors could just get right into AD and study genes of interest. Similar issues occur throughout the manuscript, with sentences/paragraphs that are not necessarily needed.

      Please see our answer to your previous point. On the introduction being overly broad, we perfectly agree it is broad, but related to your point about presenting sleep and AD in the Introduction, we wish to talk about finding causal processes from genomics findings using behavioural pharmacology. We purposefully present research on AD as one instance of this broader goal, not the primary topic of the paper.

      Another example are these sentences, which could be totally removed as the following paragraph starts off making the same point much more succinctly. "From genomic studies of AD, we know that mutations in genes such as SORL1 modify risk by disrupting some biological processes. Presumably, the same processes are disrupted in zebrafish sorl1 knockouts, and some caused the behavioural alterations we observed. Can we now follow the thread backwards and predict some of the biological processes in which Sorl1 is involved based on the behavioural profile of sorl1 knockouts?"

      Thanks for the suggestion, but we think these sentences are useful to place this Results section back in the context of the Introduction. Think of the paper as mainly about the behavioural pharmacology approach, not about Alzheimer’s risk genes. The function of the paragraph here is not simply to explain the method by which we decided to study sorl1; it is to reiterate the rationale behind the behavioural pharmacology approach so that the reader understands where this Results section fits in the overall structure.

      (2) Related to the above, the authors use lecanemab as an example to support their approach, but there has been a great deal of controversy regarding this drug. I don't think such extensive justification is needed. This study uses AD risk genes as a case study in a newly developed behavioral pharm pipeline. A great deal of the rest of the intro seems to just fill space and could be more focused on the study at hand. Interestingly, after gene selection, the next step in their pipeline is sleep/wake analysis yet nothing is covered about AD and sleep in the intro. Some justification of that approach (why focus on sleep/wake as a starting point for behavioral pharm rather than learning and memory?) would be a better use of intro space.

      There has indeed been controversy about lecanemab, but even the harshest critics of the amyloid hypothesis concede that it slows down cognitive decline (Espay et al., 2023). That is all that is needed to support our argument, which is that research on AD started primarily from genomics and thereby yielded a disease-modifying drug. The controversy seems mostly focused on whether this effect size is clinically significant, and we think we correctly represent this uncertainty (e.g. “antibodies against Aβ such as lecanemab show promise in slowing down disease progression” and “the beneficial effects from targeting Aβ aggregation currently remain modest”).

      Your next point is entirely fair. We mostly answered it above. To explain further, the primary reason why we measured sleep/wake behaviour is to match the behavioural dataset from Rihel et al., 2010 so we can use it to make predictions, not to study sleep in the context of AD per se. Sure, perhaps learning and memory would have been interesting, but we do not know of any study testing thousands of small molecules on zebrafish larvae during a memory task. We understand it can be slightly confusing though, as we then spend a paragraph of Discussion on sleep as a causal process in AD, but we obviously need to discuss this topic given the findings. However, to reiterate, we purposefully designed FramebyFrame and ZOLTAR to be useful beyond studying sleep/wake behaviour. For example, FramebyFrame would not calculate 17 behavioural parameters if the only goal was to measure sleep. We now mention the Rihel et al., 2010 study in the Introduction as you suggested above (“Far more detail on the behavioral pharm screen […]”), as that is the real reason why sleep/wake behaviour was measured in the first place.

      (3) Also related to the above, another more relevant point that could be talked about in the intro is the need for more refined approaches to analyze sleep in zebrafish, given the effort that went into the new analysis system described here. Again, I think the context for why the authors developed this system would be more meaningful than the current content.

      Thank you, we think we answered this point above (especially below Limitations of current zebrafish sleep/arousal assays […]).

      (4) GWAS can stand for Genome-wide association studies (plural) so I do not think the extra "s" is needed (GWASs).

      Indeed, that seems to be the common usage. Thank you.

      (5) AD candidate risk genes were determined from loci using "mainly statistic colocalization". Can the authors add a few more details about what was done and what the "mainly" caveat refers to?

      “Mainly” simply refers to the fact that other methods were used by Schwartzentruber et al. (2021) to annotate the GWAS loci with likely causal genes, but that most calls were ultimately made from statistic colocalisation. Readers can refer to this work to learn more about the methods used.

      (6) The authors write "The loss of psen1 only had mild effects on behaviour" but I think they mean "sleep behaviors" as there could be many other behaviors that are disrupted but were not assessed. The same issue a few sentences later with "Behaviour during the day was not affected" and at the end of the following paragraph.

      Yes, that would be more precise, thank you.

      (7) For the Sorl1 pharmacology data, it is very hard to understand what is being measured behaviorally. Are the authors measuring sleep +/- citalopram, or something else, and why the change to Euclidean distance rather than all the measures we were just introduced to earlier in the manuscript?

      We understand these plots (Fig. 5c,d) are less intuitive, but it is important that we show the difference in behaviour compared to H<sub>2</sub>O-treated larvae of the same genotype. The claim is that citalopram has a larger effect on knockouts than on controls, so the reader needs to focus on the effect of the drug on each genotype, not on the effect of sorl1 knockout. We added the standard fingerprints (i.e. setting controls to z-score = 0) here in Author response figures.

      Euclidean distance takes as input all the measures we introduced. The point is precisely not to select a single measure. For example, say we were only plotting active bout number during the day, we would conclude that 10 µM citalopram has the same effect on knockouts and controls. Conversely, if we had taken sleep bout length at night, we would conclude 10 µM has a stronger effect on knockouts. What is the correct parameter to select? Using Euclidean distance resolves this by taking all parameters into account, rather than arbitrarily choosing one.
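
      A minimal sketch of this argument follows. The three-parameter fingerprints are hypothetical z-scores chosen for illustration (the real fingerprints have 17 parameters), and the function name is our own.

```python
import math

def euclidean_distance(fingerprint_a, fingerprint_b):
    """Euclidean distance between two behavioural fingerprints (lists of z-scores)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fingerprint_a, fingerprint_b)))

# Hypothetical fingerprints, each expressed relative to the H2O-treated larvae
# of the same genotype, so the vehicle fingerprint is all zeros.
vehicle = [0.0, 0.0, 0.0]
control_drug = [1.0, 0.5, -0.5]
knockout_drug = [1.0, 2.0, -1.5]

# The first parameter alone suggests the same drug effect in both genotypes...
same_on_first_parameter = control_drug[0] == knockout_drug[0]
# ...while the distance over all parameters shows a larger effect in knockouts.
control_effect = euclidean_distance(control_drug, vehicle)
knockout_effect = euclidean_distance(knockout_drug, vehicle)
print(same_on_first_parameter, control_effect, knockout_effect)
```

      The distance summarises the drug effect over every parameter at once, which avoids having to pick one parameter arbitrarily.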

      And what exactly is a "given spike in serotonin"? and how is this hypothesis the conclusion based on the lack of evidence for the second hypothesis? As the authors say, there could be other ways sorl1 knockouts are more sensitive to citalopram, so the absence of evidence for one hypothesis certainly does not support the other hypothesis.

      We mean a given release of serotonin in the synaptic cleft. We have fixed this wording. 

      We tend to disagree on the second point. We can think of two ways that sorl1 knockouts are more sensitive to citalopram: 1) they produce more serotonin, so blocking reuptake causes a larger spike in knockouts; or 2) blocking reuptake causes the same increase in both knockouts and wild-types but knockouts react more strongly to serotonin. We cannot in fact think of another way to explain the citalopram results. Not finding overwhelming evidence for 1) surely supports 2) somewhat, even if we do not have direct evidence for it. As an analogy, if two diagnoses are possible for a patient, testing negative for the first one supports the other one, even before it is directly tested.

      (8) Again some language is used without enough care. Fish are referred to as "drowsier" under some drug conditions. How do the authors know the animal is drowsy? The phenotype is more specific - more sleep, less activity.

      Thank you, we switched to “Furthermore, fenoprofen worsened the day-time hypoactivity of psen2 knockout larvae […]”.

      (9) This sentence is misleading as it gives the impression that results in this manuscript suggest the conclusion: "Our observation that disruption of genes associated with AD diagnosis after 65 years reduces sleep in 7-day zebrafish larvae suggest that disrupted sleep may be a common mechanism through which these genes exert an effect on risk." That idea is widely held in the field, and numerous other previous manuscripts/reviews should be cited for clarity of where this hypothesis came from.

      This idea is not widely held in the field. You likely read this point as “disrupted sleep is a risk factor for AD”, which, yes, is widely discussed in the field, but is not precisely what we are saying. We hypothesise that mutations in some of the Alzheimer’s risk genes cause disrupted sleep, possibly from a very early age, which then causes AD decades later. Studies and reviews on sleep and AD rarely make this hypothesis, at least not explicitly. The closest we know of are a few recent human genetics studies, typically using Mendelian Randomisation, finding that higher genetic risk of AD correlates with some sleep phenotypes, such as sleep duration (Chen et al., 2022; Leng et al., 2021). The work of Muto et al. (2021) is particularly interesting as it found correlations between higher genetic risk of AD and some sleep phenotypes in men in their early twenties, which seems unlikely to be a consequence of early pathology (Muto et al., 2021). Note, however, that even these studies do not mention sleep possibly being disrupted early in development, which is what our findings in zebrafish larvae support. As we mention, we think a team should test whether sleep is different in infants at higher genetic risk of AD, essentially performing an analogous, but obviously much more difficult, experiment as we did in zebrafish larvae. We do not know of any study testing this or even raising this idea, so evidently it is not widely held. Having said that, the studies we mention here were not referenced in the Discussion paragraph. We have now corrected this.

      Ashlin TG, Blunsom NJ, Ghosh M, Cockcroft S, Rihel J. 2018. Pitpnc1a Regulates Zebrafish Sleep and Wake Behavior through Modulation of Insulin-like Growth Factor Signaling. Cell Rep 24:1389–1396. doi:10.1016/j.celrep.2018.07.012

      Chen D, Wang X, Huang T, Jia J. 2022. Sleep and Late-Onset Alzheimer’s Disease: Shared Genetic Risk Factors, Drug Targets, Molecular Mechanisms, and Causal Effects. Front Genet 13. doi:10.3389/fgene.2022.794202

      Cirrito JR, Disabato BM, Restivo JL, Verges DK, Goebel WD, Sathyan A, Hayreh D, D’Angelo G, Benzinger T, Yoon H, Kim J, Morris JC, Mintun MA, Sheline YI. 2011. Serotonin signaling is associated with lower amyloid-β levels and plaques in transgenic mice and humans. Proc Natl Acad Sci U S A 108:14968–14973. doi:10.1073/pnas.1107411108

      Dean DC, Jerskey BA, Chen K, Protas H, Thiyyagura P, Roontiva A, O’Muircheartaigh J, Dirks H, Waskiewicz N, Lehman K, Siniard AL, Turk MN, Hua X, Madsen SK, Thompson PM, Fleisher AS, Huentelman MJ, Deoni SCL, Reiman EM. 2014. Brain Differences in Infants at Differential Genetic Risk for Late-Onset Alzheimer Disease: A Cross-sectional Imaging Study. JAMA Neurol 71:11–22. doi:10.1001/jamaneurol.2013.4544

      Eriksen JL, Sagi SA, Smith TE, Weggen S, Das P, McLendon DC, Ozols VV, Jessing KW, Zavitz KH, Koo EH, Golde TE. 2003. NSAIDs and enantiomers of flurbiprofen target γ-secretase and lower Aβ42 in vivo. J Clin Invest 112:440–449. doi:10.1172/JCI18162

      Espay AJ, Herrup K, Kepp KP, Daly T. 2023. The proteinopenia hypothesis: Loss of Aβ42 and the onset of Alzheimer’s Disease. Ageing Res Rev 92:102112. doi:10.1016/j.arr.2023.102112

      Hoffman EJ, Turner KJ, Fernandez JM, Cifuentes D, Ghosh M, Ijaz S, Jain RA, Kubo F, Bill BR, Baier H, Granato M, Barresi MJF, Wilson SW, Rihel J, State MW, Giraldez AJ. 2016. Estrogens Suppress a Behavioral Phenotype in Zebrafish Mutants of the Autism Risk Gene, CNTNAP2. Neuron 89:725–733. doi:10.1016/j.neuron.2015.12.039

      in ’t Veld Bas A, Ruitenberg A, Hofman A, Launer LJ, van Duijn CM, Stijnen T, Breteler MMB, Stricker BHC. 2001. Nonsteroidal Antiinflammatory Drugs and the Risk of Alzheimer’s Disease. N Engl J Med 345:1515–1521. doi:10.1056/NEJMoa010178

      Jagirdar R, Fu C-H, Park J, Corbett BF, Seibt FM, Beierlein M, Chin J. 2021. Restoring activity in the thalamic reticular nucleus improves sleep architecture and reduces Aβ accumulation in mice. Sci Transl Med 13:eabh4284. doi:10.1126/scitranslmed.abh4284

      Jiang H, Newman M, Lardelli M. 2018. The zebrafish orthologue of familial Alzheimer’s disease gene PRESENILIN 2 is required for normal adult melanotic skin pigmentation. PLOS ONE 13:e0206155. doi:10.1371/journal.pone.0206155

      Jiang H, Pederson SM, Newman M, Dong Y, Barthelson K, Lardelli M. 2020. Transcriptome analysis indicates dominant effects on ribosome and mitochondrial function of a premature termination codon mutation in the zebrafish gene psen2. PloS One 15:e0232559. doi:10.1371/journal.pone.0232559

      Joo W, Vivian MD, Graham BJ, Soucy ER, Thyme SB. 2021. A Customizable Low-Cost System for Massively Parallel Zebrafish Behavioral Phenotyping. Front Behav Neurosci 14.

      Joubert L, Hanson B, Barthet G, Sebben M, Claeysen S, Hong W, Marin P, Dumuis A, Bockaert J. 2004. New sorting nexin (SNX27) and NHERF specifically interact with the 5-HT4a receptor splice variant: roles in receptor targeting. J Cell Sci 117:5367–5379. doi:10.1242/jcs.01379

      Leng Y, Ackley SF, Glymour MM, Yaffe K, Brenowitz WD. 2021. Genetic Risk of Alzheimer’s Disease and Sleep Duration in Non-Demented Elders. Ann Neurol 89:177–181. doi:10.1002/ana.25910

      Mitchell PB, Hadzi-Pavlovic D. 2000. Lithium treatment for bipolar disorder. Bull World Health Organ 78:515–517.

      Mittur A. 2011. Trazodone: properties and utility in multiple disorders. Expert Rev Clin Pharmacol 4:181–196. doi:10.1586/ecp.10.138

      Munoz-Torrero D. 2008. Acetylcholinesterase Inhibitors as Disease-Modifying Therapies for Alzheimer’s Disease. Curr Med Chem 15:2433–2455. doi:10.2174/092986708785909067

      Muto V, Koshmanova E, Ghaemmaghami P, Jaspar M, Meyer C, Elansary M, Van Egroo M, Chylinski D, Berthomier C, Brandewinder M, Mouraux C, Schmidt C, Hammad G, Coppieters W, Ahariz N, Degueldre C, Luxen A, Salmon E, Phillips C, Archer SN, Yengo L, Byrne E, Collette F, Georges M, Dijk D-J, Maquet P, Visscher PM, Vandewalle G. 2021. Alzheimer’s disease genetic risk and sleep phenotypes in healthy young men: association with more slow waves and daytime sleepiness. Sleep 44. doi:10.1093/sleep/zsaa137

      Myers-Turnbull D, Taylor JC, Helsell C, McCarroll MN, Ki CS, Tummino TA, Ravikumar S, Kinser R, Gendelev L, Alexander R, Keiser MJ, Kokel D. 2022. Simultaneous analysis of neuroactive compounds in zebrafish. doi:10.1101/2020.01.01.891432

      Owens MJ, Morgan WN, Plott SJ, Nemeroff CB. 1997. Neurotransmitter receptor and transporter binding profile of antidepressants and their metabolites. J Pharmacol Exp Ther 283:1305–1322.

      Özcan GG, Lim S, Leighton PL, Allison WT, Rihel J. 2020. Sleep is bi-directionally modified by amyloid beta oligomers. eLife 9:e53995. doi:10.7554/eLife.53995

      Quiroz YT, Schultz AP, Chen K, Protas HD, Brickhouse M, Fleisher AS, Langbaum JB, Thiyyagura P, Fagan AM, Shah AR, Muniz M, Arboleda-Velasquez JF, Munoz C, Garcia G, Acosta-Baena N, Giraldo M, Tirado V, Ramírez DL, Tariot PN, Dickerson BC, Sperling RA, Lopera F, Reiman EM. 2015. Brain Imaging and Blood Biomarker Abnormalities in Children With Autosomal Dominant Alzheimer Disease: A Cross-Sectional Study. JAMA Neurol 72:912–919. doi:10.1001/jamaneurol.2015.1099

      Relkin NR. 2007. Beyond symptomatic therapy: a reexamination of acetylcholinesterase inhibitors in Alzheimer’s disease. Expert Rev Neurother 7:735–748. doi:10.1586/14737175.7.6.735

      Rihel J, Prober DA, Arvanites A, Lam K, Zimmerman S, Jang S, Haggarty SJ, Kokel D, Rubin LL, Peterson RT, Schier AF. 2010. Zebrafish Behavioral Profiling Links Drugs to Biological Targets and Rest/Wake Regulation. Science 327:348–351. doi:10.1126/science.1183090

      Sleegers K, Brouwers N, Gijselinck I, Theuns J, Goossens D, Wauters J, Del-Favero J, Cruts M, van Duijn CM, Van Broeckhoven C. 2006. APP duplication is sufficient to cause early onset Alzheimer’s dementia with cerebral amyloid angiopathy. Brain J Neurol 129:2977–2983. doi:10.1093/brain/awl203

      Sun L, Zhou R, Yang G, Shi Y. 2017. Analysis of 138 pathogenic mutations in presenilin-1 on the in vitro production of Aβ42 and Aβ40 peptides by γ-secretase. Proc Natl Acad Sci 114:E476–E485. doi:10.1073/pnas.1618657114

      Szklarczyk D, Santos A, von Mering C, Jensen LJ, Bork P, Kuhn M. 2016. STITCH 5: augmenting protein–chemical interaction networks with tissue and affinity data. Nucleic Acids Res 44:D380–D384. doi:10.1093/nar/gkv1277

      Weggen S, Rogers M, Eriksen J. 2007. NSAIDs: small molecules for prevention of Alzheimer’s disease or precursors for future drug development? Trends Pharmacol Sci 28:536–543. doi:10.1016/j.tips.2007.09.004

      Wiltschko AB, Tsukahara T, Zeine A, Anyoha R, Gillis WF, Markowitz JE, Peterson RE, Katon J, Johnson MJ, Datta SR. 2020. Revealing the structure of pharmacobehavioral space through motion sequencing. Nat Neurosci 23:1433–1443. doi:10.1038/s41593-020-00706-3

      Yang T, Arslanova D, Gu Y, Augelli-Szafran C, Xia W. 2008. Quantification of gamma-secretase modulation differentiates inhibitor compound selectivity between two substrates Notch and amyloid precursor protein. Mol Brain 1:15. doi:10.1186/1756-6606-1-15

    1. eLife Assessment

      This study examined how multidimensional social relationships influence social attention in rhesus macaques, linking individual and group-level behaviors to attentional processes. The findings that oxytocin altered social attention and its relationship to both social tendencies and dyadic relationships are important, as recent technological advances allow for the exploration of neuronal activities and mechanisms in free-moving macaques. This work is convincing and will be of interest to those studying the interplay between social dynamics and information processing in primates.

    2. Reviewer #1 (Public review):

      Summary:

      This study aims to investigate the links between social behaviors observed in free-moving situations and behavioral performances measured in well-controlled, laboratory settings. The authors assessed general social tendencies and dyadic relationships among four monkeys in a group by scoring agonistic (aggression) and affiliative (grooming and proximity) behaviors in each pair. By measuring the saccadic reaction time in a classic social interference task, the authors reported that the monkeys with higher SEIs (i.e., more social individuals) were less distracted by the faces of other monkeys. These effects were enhanced when the distractors were out-group monkey faces rather than in-group ones. Lastly, oxytocin administration increased the impact of the out-group monkey faces in the social interference task, while reducing the magnitude of general social tendencies measured with SEI.

      Strengths:

      (1) The combination of behavioral data obtained in a colony room and in a laboratory environment is rare and important.

      (2) The evaluation of social interactions was successfully performed based on an automated target detection algorithm. The resulting multi-dimensional, complicated social interactions were summarized into simple indices (SEI and IEI). These indices provide a good measure for the social tendencies of each monkey.

      (3) Well-designed and robust experiments in the laboratory environment that are linked nicely with the general social tendencies observed in spontaneous behaviors.

      Weaknesses:

      (1) While the overall results are interesting, I am somewhat left confused about how to interpret the difference in the scores derived from different conditions. For example, the authors stated "Comparing the weights for in-group and out-group distractors, the effect of proximity was larger than that of aggression and grooming" in p.8. Does this mean that the proximity is indeed the type of behavior most affected in the out-group condition compared to the in-group condition? The out-group effects are difficult to examine with actual behavioral data, but some in-group effects such as those involving OT can be tested, which possibly provides good insights into interpreting the differences of the weights observed across the experimental conditions.

      (2) I think it is important to provide how variable spontaneous social interactions were across sessions and how impactful the variability of the interactions is on the SEI and IEI, as it helps to understand how meaningful the differences of weights are across the conditions, but such data are missing. In line with this point, although the conclusions still hold as those data were obtained during the same experimental periods, shouldn't the weights in Fig. 3f and Figs. 4g and 4h (saline) be expected to be similar, if not the same?

      Comments on revisions: I do not have further comments.

    3. Reviewer #2 (Public review):

      Summary:

      The study presents significant findings that elucidate the relationship between multi-dimensional social relationships and social attention in rhesus macaques. By integrating advanced computational methods, behavioral analyses, and neuroendocrine manipulation, the authors provide strong evidence for how oxytocin modulates attention within social networks. The results are robust and address critical gaps in understanding the dynamics of social attention in primates.

      Strengths:

      (1) The use of YOLOv5 for automatic behavioral detection is an exceptional methodological advance. The combination of automated analyses with manual validation enhances confidence in the data.

      (2) The study's focus on three distinct dimensions of social interaction (aggression, grooming, and proximity) is comprehensive and provides nuanced insights into the complexity of primate social networks.

      (3) The investigation of oxytocin's role adds a compelling neuroendocrine dimension to the findings, providing a bridge between behavioral and neural mechanisms.

      Weaknesses:

      (1) The study's conclusions are based on observations of only four monkeys, which limits the generalizability of the findings. Larger sample sizes could strengthen the validity of the results.

      (2) The limited set of stimulus images (in-group and out-group faces) may introduce unintended biases. This could be addressed by increasing the diversity of stimuli or incorporating a broader range of out-group members.

      Comments on revisions: I have no further comments!

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Weaknesses:

      (1) While the overall results are interesting, I am somewhat left confused about how to interpret the difference in the scores derived from different conditions. For example, the authors stated "Comparing the weights for in-group and out-group distractors, the effect of proximity was larger than that of aggression and grooming" in p.8. Does this mean that the proximity is indeed the type of behavior most affected in the out-group condition compared to the in-group condition? The out-group effects are difficult to examine with actual behavioral data, but some in-group effects such as those involving OT can be tested, which possibly provides good insights into interpreting the differences of the weights observed across the experimental conditions.

      Thank you for your thoughtful comments and for highlighting an important aspect of our findings. The statement on page 8 refers to the relative impact of different social behaviors (proximity, aggression, and grooming) on the derived weights for in-group and out-group distractors. Specifically, the data suggest that proximity exerts a stronger influence than aggression or grooming in differentiating the effects of out-group versus in-group distractors. Regarding the out-group condition, we acknowledge that it presents challenges for direct behavioral observation, as interactions involving out-group members are often more difficult to quantify in naturalistic settings. However, we agree with your suggestion to test certain in-group effects, particularly those influenced by oxytocin (OT), as they offer a more controlled framework to validate and interpret the observed differences in weights across experimental conditions. In line with this, we examined specific in-group behaviors under OT administration to disentangle their contributions to attentional dynamics (Fig. 4 and Fig. 5 e to h). By integrating controlled experimental manipulations, we think these results could provide deeper insights into how social relationships shape the observed patterns of attention.

      (2) I think it is important to provide how variable spontaneous social interactions were across sessions and how impactful the variability of the interactions is on the SEI and IEI, as it helps to understand how meaningful the differences of weights are across the conditions, but such data are missing. In line with this point, although the conclusions still hold as those data were obtained during the same experimental periods, shouldn't the weights in Fig. 3f and Figs. 4g and 4h (saline) be expected to be similar, if not the same?

      Thank you for your insightful comments. As highlighted, we utilized the entire experimental period as the dataset to evaluate the monkeys' social interactions. The experiments presented in Figures 3 and 4 were designed to examine how social relationships correlate with patterns of social attention under two distinct conditions: without manipulation (Fig. 3) and with nebulized exposure to oxytocin and saline (Fig. 4). Theoretically, the weights observed in the unmanipulated condition and the nebulized saline condition should be similar. However, our results indicate that distractor biases shifted significantly following nebulized saline exposure (Fig. 4) compared to the unmanipulated condition (Fig. 3) (MK: p = 9.3×10⁻³, ML: p = 9.77×10⁻⁴, MC: p = 9.77×10⁻⁴, MA: p = 0.09; n₁ = n₂ = 12 experimental days; two-sided Wilcoxon signed-rank test). This suggests that the nebulization process itself, despite acclimating the monkeys to saline exposure for approximately two weeks prior to the experiments, still influenced their attentional behaviors.
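
      To help read the p-values reported above, the sketch below computes the exact null distribution of the Wilcoxon signed-rank statistic for 12 paired days. It is an illustration rather than the authors' code, and it assumes an exact test with no ties and no zero differences; the function names are ours. Under these assumptions only 2¹² = 4,096 sign patterns exist, so the two smallest attainable two-sided p-values are 2/4096 ≈ 4.88×10⁻⁴ and 4/4096 ≈ 9.77×10⁻⁴.

```python
def signed_rank_counts(n):
    """counts[w] = number of sign patterns whose positive-rank sum equals w.

    Assumes n paired differences with no ties and no zeros, so the ranks
    are exactly 1..n and each can be positive or negative.
    """
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Iterate downward so each rank contributes at most once.
        for w in range(max_sum, rank - 1, -1):
            counts[w] += counts[w - rank]
    return counts


def two_sided_p(w_min, n):
    """Exact two-sided p-value when the smaller rank sum equals w_min."""
    counts = signed_rank_counts(n)
    tail = sum(counts[: w_min + 1])
    return min(1.0, 2 * tail / 2 ** n)


# Smallest attainable p-values for n = 12 paired days.
for w in range(3):
    print(w, two_sided_p(w, 12))
```

      With so few paired observations the p-values are coarse, which is why identical values can recur across subjects.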

      While the primary goal of nebulization was to assess the effects of oxytocin on social attention, our main conclusions remain robust, even considering the impact of nebulization on distractor biases. We acknowledge that variability in spontaneous social interactions across days or experimental sessions could be an important factor influencing the SEI and IEI. The dynamic nature of social interactions within the colony is likely affected by numerous variables. Future research will aim to integrate these factors into a more comprehensive and dynamic framework to better interpret their influence on social attention metrics.

      Reviewer #2 (Public review):

      Weaknesses:

      (1) The study's conclusions are based on observations of only four monkeys, which limits the generalizability of the findings. Larger sample sizes could strengthen the validity of the results.

      Thank you for your valuable comment. We acknowledge that the relatively small sample size could influence the generalizability of the findings. However, despite this limitation, our work systematically examined multifaceted social relationships among monkeys and their attentional strategies within a well-controlled experimental setup. We reported results across sessions and conditions (e.g., in-group vs. out-group; saline vs. oxytocin), which strengthens the reliability of the observed effects of social networks within this context. We agree that increasing the sample size would improve the generalizability of the results. Future studies with a larger cohort will be critical for confirming the robustness of our findings and expanding their broader applicability. We have acknowledged this limitation in the revised manuscript and highlighted the potential for further research with larger sample sizes to validate and extend our conclusions.

      (2) The limited set of stimulus images (in-group and out-group faces) may introduce unintended biases. This could be addressed by increasing the diversity of stimuli or incorporating a broader range of out-group members.

      Thank you for your thoughtful comment. We acknowledge that the use of a limited set of six monkey faces as stimuli for in-group and out-group conditions could potentially introduce biases. To address this concern, we conducted an additional analysis to minimize the potential impact of individual images on our findings using the current dataset. Specifically, we randomly excluded one in-group and one out-group image and reanalyzed distractor biases using the remaining two images (Supplementary Fig. 3a). For each subject, this approach generated three sets of two distractors per group, resulting in 81 (3⁴) combinations across four monkey subjects, and a total of 81 × 81 subject-distractor pairings. We statistically compared distractor biases between in-group and out-group faces for each combination (Supplementary Fig. 3b). As shown in Supplementary Fig. 3c, 99.30% of the 6,561 combinations demonstrated significantly lower distractor biases towards in-group faces compared to out-group faces (two-sided Wilcoxon signed-rank test, p < 0.05). These results suggest that the observed differences in social attention between in-group and out-group monkeys are unlikely to be driven by specific images within the stimulus set. That said, we agree that increasing the diversity of stimulus images or incorporating a broader range of out-group members would improve the generalizability of the results. We have acknowledged this limitation in the revised manuscript and highlighted the potential for further research to incorporate a more diverse stimulus set to validate and extend our findings.
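
      The counting in this leave-one-image-out analysis can be reproduced with a short enumeration. This is a sketch of one plausible reading of the scheme, not the authors' code: it assumes each subject has three in-group and three out-group face images, and that the image dropped is chosen independently for each subject and each group. The image labels are hypothetical.

```python
from itertools import combinations, product

SUBJECTS = ["MK", "ML", "MC", "MA"]

# Hypothetical image labels: three faces per group for each subject.
in_group = ["in1", "in2", "in3"]
out_group = ["out1", "out2", "out3"]

# Dropping one of three images leaves three possible sets of two.
in_pairs = list(combinations(in_group, 2))
out_pairs = list(combinations(out_group, 2))

# One set of two per subject: 3**4 = 81 choices for each group.
in_choices = list(product(in_pairs, repeat=len(SUBJECTS)))
out_choices = list(product(out_pairs, repeat=len(SUBJECTS)))

# Every in-group choice paired with every out-group choice.
pairings = list(product(in_choices, out_choices))
print(len(in_choices), len(out_choices), len(pairings))
```

      Each of the resulting pairings corresponds to one in-group versus out-group comparison in the resampling analysis.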

      “However, these conclusions may be constrained by the relatively small sample size and the homogeneity of stimulus set in the study. Future research focusing on larger, more diverse cohorts and incorporating a broader range of stimuli will enhance the generalizability and applicability of the findings.”

      Reviewer #1 (Recommendations for the authors):

      It is difficult to distinguish "Getting fighted" and "Fighting partner" in Fig. 1b (esp. when printed). I thought Actor showed "Fighting partner" several times in Session 2, but it seems to be "Getting fighted" judging from Figs. 1c and 1d. Is this correct? If so, I would suggest to change the color to improve visibility.

      Thank you for your valuable comment. We apologize for the confusion in the previous version. To improve clarity, we have changed the two terms to “begin fighting” and “being fought”. As shown in Figure 1b, we now explicitly define the identities of the two monkeys as the actor (K) and the partner (L), with all behaviors described from the perspective of the actor. For example, when the actor (K) initiates the fight, it is marked as “begin fighting”, whereas when the partner (L) initiates the fight, the actor (K) is the recipient and labeled as “being fought”. Additionally, we have implemented your suggestion by changing the colors to enhance visibility, especially for the terms “begin fighting” and “being fought”.

      Reviewer #2 (Recommendations for the authors): 

      I have some minor concerns:

      (1) Figure1B, caption for x axis is missing, 4 means 4 days?

      Thank you so much for the comment. We have clarified the x-axis in Figure 1B, where the label "4" corresponds to 4 hours of video typing on each experimental day. The revised figure now includes the appropriate label for better clarity. We appreciate your careful attention to this detail.

      (2) I am slightly concerned about animal safety. How do the experimenters ensure the animals' safety and well-being in cases of aggressive interactions or attacks?

      Thank you for your comment. We share your concern regarding animal safety and take the well-being of the monkeys in the study seriously. All experimental procedures were reviewed and approved by the Institutional Animal Care and Use Committee at the Institute of Biophysics, Chinese Academy of Sciences (IBP-NHP-002(22)). The monkeys were housed together in the same colony room for over four years, in interconnected cages that allowed for direct physical interaction. Animal behaviors in cages were closely monitored via a live video system to ensure their safety. To prevent potential injuries, a sliding partition system was in place, enabling the isolation of individual animals when necessary, minimizing risks to their well-being.

    1. eLife Assessment

      This study reveals a novel mechanism of glutamine synthetase (GS) regulation in Methanosarcina mazei, demonstrating that 2-oxoglutarate (2-OG) directly promotes GS activity by stabilizing its dodecameric assembly. Using mass photometry, activity assays, and cryo-electron microscopy, the authors show that GS transitions from a dimeric, inactive form at low 2-OG concentrations to a fully active dodecameric complex at saturating 2-OG levels, highlighting 2-OG as a key effector in C/N sensing. The findings are valuable, supported by solid data, and provide new insights into archaeal GS regulation, though further clarification of interactions with known partners like GlnK1 and Sp26 is needed.

    2. Reviewer #1 (Public review):

      Summary:

      This study shows a new mechanism of GS regulation in the archaeon Methanosarcina mazei and clarifies the direct activation of GS activity by 2-oxoglutarate, thus featuring another way in which 2-oxoglutarate acts as a central status reporter of C/N sensing.

      Strengths:

      Mass photometry reveals the dynamic effect of 2-OG on the oligomerization state of GS. Single-particle cryo-EM reveals the mechanism of 2-OG-mediated dodecamer formation.

      Weaknesses:

      It is not entirely clear how very high 2-OG concentrations activate GS beyond dodecamer formation.

      In the revised version, most of my concerns were adequately addressed. In the summary it is stated that glutamine acts as an allosteric inhibitor of dodecameric GS. This is not correct: glutamine binds to the active site and is therefore not allosteric. This kind of feedback inhibition is a type of product inhibition.

    3. Reviewer #2 (Public review):

      Summary:

      Herdering et al. introduced research on an archaeal glutamine synthetase (GS) from Methanosarcina mazei, which exhibits sensitivity to the environmental presence of 2-oxoglutarate (2-OG). While previous studies have indicated 2-OG's ability to enhance GS activity, the precise underlying mechanism remains unclear. Initially, the authors utilized biophysical characterization, primarily employing a nanomolar-scale detection method called mass photometry, to explore the molecular assembly of Methanosarcina mazei GS (M. mazei GS) in the absence or presence of 2-OG. Similar to other GS enzymes, the target M. mazei GS forms a stable dodecamer, with two hexameric rings stacked in tail-to-tail interactions. Despite approximately 40% of M. mazei GS existing as monomeric or dimeric entities in the detectable solution, the majority spontaneously assemble into a dodecameric state. Upon mixing 2-OG with M. mazei GS, the population of the dodecameric form increases proportionally with the concentration of 2-OG, indicating that 2-OG either promotes or stabilizes the assembly process. The cryo-electron microscopy (cryo-EM) structure reveals that 2-OG is positioned near the interface of two hexameric rings. At a resolution of 2.39 Å, the cryo-EM map vividly illustrates 2-OG forming hydrogen bonds with two individual GS subunits as well as with solvent water molecules. Moreover, local sidechain reorientation and conformational changes of loops in response to 2-OG further delineate the 2-OG-stabilized assembly of M. mazei GS.

      Strengths & Weaknesses:

      The investigation delves into the impact of 2-oxoglutarate (2-OG) on the assembly of Methanosarcina mazei glutamine synthetase (M. mazei GS). Utilizing cutting-edge mass photometry, the authors scrutinized the population dynamics of GS assembly in response to varying concentrations of 2-OG. Notably, the findings demonstrate a promising and straightforward correlation, revealing that dodecamer formation can be stimulated by 2-OG concentrations of up to 10 mM, although GS assembly never reaches 100% dodecamerization in this study. Furthermore, catalytic activities showed a remarkable enhancement, escalating from 0.0 U/mg to 7.8 U/mg with increasing concentrations of 2-OG, peaking at 12.5 mM. However, an intriguing gap arises between the incomplete dodecameric formation observed at 10 mM 2-OG, as revealed by mass photometry, and the continued increase in activity from 5 mM to 10 mM 2-OG for M. mazei GS. This prompts questions regarding the inability of M. mazei GS to achieve complete dodecamer formation and the underlying factors that further enhance GS activity within this concentration range of 2-OG.

      Moreover, the cryo-electron microscopy (cryo-EM) analysis provides additional support for the biophysical and biochemical characterization, elucidating the precise localization of 2-OG at the interface of two GS subunits within two hexameric rings. The observed correlation between GS assembly facilitated by 2-OG and its catalytic activity is substantiated by structural reorientations at the GS-GS interface, confirming the previously reported phenomenon of "funnel activation" in GS. However, the authors did not present the cryo-EM structure of M. mazei GS in complex with ATP and glutamate in the presence of 2-OG, which could have shed light on the differences in glutamine biosynthesis between previously reported GS enzymes and the 2-OG-bound M. mazei GS.

      Furthermore, besides revealing the cryo-EM structure of 2-OG-bound GS, the study also observed the filamentous form of GS, suggesting that filament formation may be a universal stacking mechanism across archaeal and bacterial species. However, efforts to enhance resolution to investigate whether the stacked polymer is induced by 2-OG or other factors such as ions or metabolites were not undertaken by the authors, leaving room for further exploration into the mechanisms underlying filament formation in GS.

      Comments on revisions:

      My comments have been addressed adequately.

      I recognize that determining the structure of the GS complex bound to ATP and/or other ligands would enhance this study by offering a more comprehensive understanding of 2-oxoglutarate-mediated dodecameric assembly and activation. However, I accept the authors' explanation for not including this aspect in the current work.

    4. Reviewer #3 (Public review):

      The current manuscript investigates the effect of 2-oxoglutarate (2OG) as a modulator of glutamine synthetase (GS). To do this, the authors rely on mass photometry, specific activity measurements and single particle cryo-EM data.

      From the results, the authors conclude that the GS from Methanosarcina mazei shifts from a dimeric, non-active state under low concentrations of 2OG, to a dodecameric and fully active complex at saturating concentrations of 2OG.

      GS is a crucial enzyme in all domains of life. The dodecameric fold of GS is recurrent amongst prokaryotic and archaeal organisms but the enzyme activity can be regulated in distinct ways. This is a very interesting work combining protein biochemistry with structural biology.

      A novel role for 2OG is presented for this mesophilic methanoarchaeon, as a crucial effector for the enzyme oligomerization and full reactivity.

      The conclusions of this paper are mostly well supported by data, but some aspects of this GS regulation and interaction with known partners like GlnK1 and sP26 need to be clarified and extended.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This study shows a new mechanism of GS regulation in the archaeon Methanosarcina mazei and clarifies the direct activation of GS activity by 2-oxoglutarate, thus featuring another way in which 2-oxoglutarate acts as a central status reporter of C/N sensing.

      Mass photometry and single-particle cryo-EM structure analysis convincingly show the direct regulation of GS activity by 2-OG-promoted formation of the dodecameric structure of GS. The previously recognized small proteins GlnK1 and sP26 seem to play a subordinate role in GS regulation, which is in good agreement with previous data. Although these data are quite clear now, there remains one major open question: how does 2-OG further increase GS activity once the full dodecameric state is achieved (at 5 mM)? This point needs to be reconsidered.

      Weaknesses:

      It is not entirely clear, how very high 2-OG concentrations activate GS beyond dodecamer formation.

      The data presented in this work are in stark contrast to the previously reported structure of M. mazei GS by the Schumacher lab. This is very confusing for the scientific community and requires clarification. The discussion should consider possible reasons for the contradictory results.

      Importantly, it is puzzling how Schumacher could achieve an apo-structure of dodecameric GS. If 2-OG is necessary for dodecamer formation, this should be discussed. If GlnK1 doesn't form a complex with the dodecameric GS, how could such a complex be resolved there?

      In addition, the text is in principle clear but could be improved by professional editing. Most obviously there is insufficient comma placement.

      We thank Reviewer #1 for the professional evaluation and for raising important points. We will address those comments in the updated manuscript and especially improve the discussion with respect to the two points of concern.

      (1) How can GlnA1 activity be further stimulated by increasing 2-OG after the dodecamer is already fully assembled at 5 mM 2-OG?

      We assume a two-step requirement for 2-OG: the dodecameric assembly and the priming of the active sites. The assembly step is based on cooperative effects of 2-OG and does not require the presence of 2-OG in all 2-OG-binding pockets: 2-OG binding to one binding pocket also causes a domino effect of conformational changes in the adjacent 2-OG-unbound subunit, as also described for Methanothermococcus thermolithotrophicus GS in Müller et al. 2023. Due to the introduction of these conformational changes, the dodecameric form becomes more favourable even without all 2-OG binding sites being occupied. With higher 2-OG concentrations present (> 5 mM), the activity increased further until finally all 2-OG-binding pockets were occupied, resulting in the priming of all active sites (all subunits) and thereby reaching the maximal activity.

      (2) The contradictory results with previously published data on the structure of M. mazei by Schumacher et al. 2023.

      We certainly agree that it is confusing that Schumacher et al. 2023 obtained a dodecameric structure without the addition of 2-OG, which we claim to be essential for the dodecameric form. 2-OG is a cellular metabolite that is naturally present in E. coli, the heterologous expression host both groups used. Since our main question focused on analysing the 2-OG effect on GS, we have performed thorough dialysis of the purified protein to remove all 2-OG before performing MP experiments. In the absence of 2-OG we never observed significant enzyme activity and always detected a fast disassembly after incubation on ice. We thus assume that a dodecamer without 2-OG in Schumacher et al. 2023 is an inactive oligomer of a once 2-OG-bound form, stabilized e.g. by the presence of 5 mM MgCl2.

      The GlnA1-GlnK1-structure (crystallography) by Schumacher et al. 2023 is in stark contrast to our findings that GlnK1 and GlnA1 do not interact as shown by mass photometry with purified proteins. A possible reason for this discrepancy might be that at the high protein concentrations used in the crystallization assay, complexes are formed based on hydrophobic or ionic protein interactions, which would not form under physiological concentrations.

      Reviewer #2 (Public Review):

      Summary:

      Herdering et al. introduced research on an archaeal glutamine synthetase (GS) from Methanosarcina mazei, which exhibits sensitivity to the environmental presence of 2-oxoglutarate (2-OG). While previous studies have indicated 2-OG's ability to enhance GS activity, the precise underlying mechanism remains unclear. Initially, the authors utilized biophysical characterization, primarily employing a nanomolar-scale detection method called mass photometry, to explore the molecular assembly of Methanosarcina mazei GS (M. mazei GS) in the absence or presence of 2-OG. Similar to other GS enzymes, the target M. mazei GS forms a stable dodecamer, with two hexameric rings stacked in tail-to-tail interactions. Despite approximately 40% of M. mazei GS existing as monomeric or dimeric entities in the detectable solution, the majority spontaneously assemble into a dodecameric state. Upon mixing 2-OG with M. mazei GS, the population of the dodecameric form increases proportionally with the concentration of 2-OG, indicating that 2-OG either promotes or stabilizes the assembly process. The cryo-electron microscopy (cryo-EM) structure reveals that 2-OG is positioned near the interface of two hexameric rings. At a resolution of 2.39 Å, the cryo-EM map vividly illustrates 2-OG forming hydrogen bonds with two individual GS subunits as well as with solvent water molecules. Moreover, local side-chain reorientation and conformational changes of loops in response to 2-OG further delineate the 2-OG-stabilized assembly of M. mazei GS.

      Strengths & Weaknesses:

      The investigation studies the impact of 2-oxoglutarate (2-OG) on the assembly of Methanosarcina mazei glutamine synthetase (M. mazei GS). Utilizing cutting-edge mass photometry, the authors scrutinized the population dynamics of GS assembly in response to varying concentrations of 2-OG. Notably, the findings demonstrate a promising and straightforward correlation, revealing that dodecamer formation can be stimulated by 2-OG concentrations of up to 10 mM, although GS assembly never reaches 100% dodecamerization in this study. Furthermore, catalytic activities showed a remarkable enhancement, escalating from 0.0 U/mg to 7.8 U/mg with increasing concentrations of 2-OG, peaking at 12.5 mM. However, an intriguing gap arises between the incomplete dodecameric formation observed at 10 mM 2-OG, as revealed by mass photometry, and the continued increase in activity from 5 mM to 10 mM 2-OG for M. mazei GS. This prompts questions regarding the inability of M. mazei GS to achieve complete dodecamer formation and the underlying factors that further enhance GS activity within this concentration range of 2-OG.

      Moreover, the cryo-electron microscopy (cryo-EM) analysis provides additional support for the biophysical and biochemical characterization, elucidating the precise localization of 2-OG at the interface of two GS subunits within two hexameric rings. The observed correlation between GS assembly facilitated by 2-OG and its catalytic activity is substantiated by structural reorientations at the GS-GS interface, confirming the previously reported phenomenon of "funnel activation" in GS. However, the authors did not present the cryo-EM structure of M. mazei GS in complex with ATP and glutamate in the presence of 2-OG, which could have shed light on the differences in glutamine biosynthesis between previously reported GS enzymes and the 2-OG-bound M. mazei GS.

      Furthermore, besides revealing the cryo-EM structure of 2-OG-bound GS, the study also observed the filamentous form of GS, suggesting that filament formation may be a universal stacking mechanism across archaeal and bacterial species. However, efforts to enhance resolution to investigate whether the stacked polymer is induced by 2-OG or other factors such as ions or metabolites were not undertaken by the authors, leaving room for further exploration into the mechanisms underlying filament formation in GS.

      We thank Reviewer #2 for the detailed assessment and valuable input. We will address those comments in the updated manuscript and clarify the message.

      (1) The discrepancy between the dodecamer formation (max. at 5 mM 2-OG) and the enzyme activity (max. at 12.5 mM 2-OG). We assume that there are two effects caused by 2-OG: 1. cooperativity of binding (less 2-OG needed to facilitate dodecamer formation) and 2. priming of each active site (see also Reviewer #1 R.1). We assume this is the reason why the activity of dodecameric GlnA1 can be further enhanced by increased 2-OG concentration until all catalytic sites are primed.

      (2) The lack of the structure of a 2-OG and ATP-bound GlnA1. Although we strongly agree that this would be a highly interesting structure, it seems out of the scope of a typical revision to request new cryo-EM structures. We evaluate the findings of our present study concerning the 2-OG effects as important insights into the strongly discussed field of glutamine synthetase regulation, even without the requested additional structures.

      (3) The observed GlnA1-filaments are an interesting finding. We certainly agree with the referee on that point, that the stacked polymers are potentially induced by 2-OG or ions. However, it is out of the main focus of this manuscript to further explore those filaments. Nevertheless, this observation could serve as an interesting starting point for future experiments.

      Reviewer #3 (Public Review):

      Summary:

      The current manuscript investigates the effect of 2-oxoglutarate and the GlnK1 protein as modulators of the enzymatic reactivity of glutamine synthetase. To do this, the authors rely on mass photometry, specific activity measurements, and single-particle cryo-EM data.

      From the results obtained, the authors convey that glutamine synthetase from Methanosarcina mazei exists in a non-active monomeric/dimeric form under low concentrations of 2-oxoglutarate, and its oligomerization into a dodecameric complex is triggered by higher concentration of 2-oxoglutarate, also resulting in the enhancement of the enzyme activity.

      Strengths:

      Glutamine synthetase is a crucial enzyme in all domains of life. The dodecameric fold of GS is recurrent amongst prokaryotic and archaeal organisms, while the enzyme activity can be regulated in distinct ways. This is a very interesting work combining protein biochemistry with structural biology.

      The role of 2-OG is here highlighted as a crucial effector for enzyme oligomerization and full reactivity.

      Weaknesses:

      Various opportunities to enhance the current state-of-the-art were missed. In particular, omission of the ligand-bound state of GlnK1 leaves unexplained the lack of its interaction with GS (in contradiction with previous results from the authors). A finer dissection of the effect and role of 2-oxoglutarate is missing and important questions remain unanswered (e.g. are dimers relevant during early stages of the interaction, or why do previous GS dodecameric structures not show 2-oxoglutarate?).

      We thank Reviewer #3 for the expert evaluation and inspiring criticism.

      (1) Encouragement to examine ligand-bound states of GlnK1. We agree and plan to perform the suggested experiments exploring the conditions under which GlnA1 and GlnK1 might interact. We will perform the MP experiments in the presence of ATP. In GlnA1 activity test assays when evaluating the presence/effects of GlnK1 on GlnA1 activity, however, ATP was always present in high concentrations and still we did not observe a significant effect of GlnK1 on the GlnA1 activity.

      (2) The exact role of 2-OG could have been dissected much better. We agree on that point and will improve the clarity of the manuscript. See also Reviewer #1 R.1.

      (3) The lack of studies on dimers. This is actually an interesting point, which we did not consider while writing the manuscript. Having now re-analysed all our MP data in this respect, we find that the smallest GlnA1 species is likely a dimer. Consequently, we will add more supplementary data that support this observation and change the text accordingly.

      (4) Previous studies and structures did not show the 2-OG. We assume that for other structures, no additional 2-OG was added, and the groups did not specifically analyse for this metabolite either. All methanoarchaea perform methanogenesis and contain only the oxidative part of the TCA cycle, used exclusively for the generation of glutamate (anabolism), rather than a closed TCA cycle; this enables them to use the internal 2-OG concentration as an internal signal for nitrogen availability. In bacterial GS from organisms with a closed TCA cycle used for energy metabolism (oxidation of acetyl-CoA), e.g. E. coli, the formation of an active dodecameric GS form is governed by another mechanism independent of 2-OG. In the case of the recent M. mazei GS structures published by Schumacher et al. 2023, the dodecameric structure is probably a result of the heterologous expression in and purification from E. coli (see also Reviewer #1 R.2). One example of a methanoarchaeal glutamine synthetase that does in fact contain 2-OG in the structure is described in Müller et al. 2023.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Specific issues:

      L 141: 2-OG levels increase due to slowing GOGAT reaction (due to Gln limitation as a consequence of N-starvation).... (2-OG also increases in bacteria that lack GDH...)

      As the GS-GOGAT cycle is the major route of ammonium assimilation, consumption of 2-OG by GDH is probably only relevant under high ammonium concentrations.

      In methanoarchaea, GS is strictly regulated and its expression is strongly repressed under nitrogen sufficiency. Thus, under N sufficiency, glutamate for anabolism is mainly generated by GDH, consuming 2-OG delivered by the oxidative part of the TCA cycle (methanogenesis is the energy metabolism in methanoarchaea; a closed TCA cycle is not present). Consequently, 2-OG increases under nitrogen limitation, when no NH3 is available for GDH.

      L148: it is not clear what is meant by: "and due to the indirect GS activity assay"

      We apologize for not being clear here. The GS activity assay used is the classical assay by Shapiro & Stadtman 1970, a coupled optical assay (coupling the ATP consumption of the GS reaction to the oxidation of NADH by lactate dehydrogenase). Because of this coupling, measurements of low activities show a high deviation. We have now added this information to the revised MS.

      L 177: arguing about 2-OG affinities: more precisely, the 0.75 mM 2-OG is the EC50 concentration of 2-OG for triggering dodecameric formation; it might not directly reflect the total 2-OG affinity, since the affinity may be modulated by (anti)cooperative effects, or by additional sites... as there may be different 2-OG binding sites involved... (same in line 201)

      Thank you for the valuable input. We changed KD to EC50 within the entire manuscript. Concerning possible additional 2-OG binding sites: we did not see any other 2-OG in the cryo-EM structure aside from the described one and we therefore assume that the one described in the manuscript is the main and only one. Considering the high amounts of 2-OG (12.5 mM) used in the structure, it is quite unlikely that additional 2-OG sites exist since they would have unphysiologically low affinities.

      In this respect, instead of the rather poor assay shown in Figure 1D, a more detailed determination of catalytic activation by different 2-OG concentrations should be done (similar to 1A)... This would allow a direct comparison between dodecamerization and enzymatic activation.

      We agree and performed the respective experiments, which are now presented in the revised Fig. 1D.

      Discussion: the role of 2-OG as a direct activator, comparison with other prokaryotic GS: in other cases, 2-OG affects GS indirectly by being sensed by PII proteins or other 2-OG sensing mechanisms (like 2OG-NtcA-mediated repression of IF factors in cyanobacteria)

      We agree and have added that information in the discussion as suggested.

      290. Unclear: As a second step of activation, the allosteric binding of 2-OG causes a series of conformational.... where is this site located? According to the catalytic effects (compare 1A and 1D) this site should have a lower affinity …

      Thank you very much for pointing this out. Binding of 2-OG only occurs in one specific allosteric binding site. Binding, however, has two effects on GlnA1: dodecamer assembly and priming of the active site (with two distinct EC50 values, which are now shown in Fig. 1A and D).

      See also public comment #1 (1).

      Reviewer #2 (Recommendations For The Authors):

      The primary concern for me is that mass photometry might lead to incorrect conclusions. The differences in the forms of GS seen in SEC and MP suggest that GS can indeed form a stable dodecamer when the concentration of GS is high enough, as shown in Figure S1B. I strongly suggest using an additional biophysical method to explore the connection between GS and 2-OG in terms of both assembly and activity, to truly understand 2-OG's role in the process of assembly and catalysis.

      We apologize if we did not present this clearly enough. However, the MP analysis of GlnA1 in the absence of 2-OG always showed (monomers/)dimers; dodecamers were only present in the presence of 2-OG. The SEC analysis in Fig. S1B was performed in the presence of 12.5 mM 2-OG. We realized that this information was missing from the figure legend and have now added it in the revised version. In addition, the 2-OG is visible in the cryo-EM structure. Thus, we do not agree that additional biophysical experiments are needed.

      As for the other experimental findings, they appear satisfactory to me, and I have no reservations regarding the cryoEM data.

      (1) Mass photometry is a fancy technique that uses only a tiny amount of protein to study how they come together. However, the concentration of the protein used in the experiment might be lower than what's needed for them to stick together properly. So, the authors saw a lot of single proteins or pairs instead of bigger groups. They showed in Figure S1B that the M. mazei GS came out earlier than a 440-kDa reference protein, indicating it's actually a dodecamer. But when they looked at the dodecamer fraction using mass photometry, they found smaller bits, suggesting the GS was breaking apart because the concentration used was too low. To fix this, they could try using a technique called analytic ultracentrifuge (AUC) with different amounts of 2-OG to see if they can spot single proteins or pairs when they use a bit more GS. They could also try another technique called SEC-MALS to do similar tests. If they do this, they could replace Figure 1A with new data showing fully formed GS dodecamers when they use the right amount of 2-OG.

      Thank you for this input. In MP we looked at dodecamer formation after removing the 2-OG entirely and re-adding it in the respective concentration. We think that GlnA1 is much more unstable in its monomeric/dimeric fraction and that the complete and harsh removal of 2-OG results in some dysfunctional protein which does not recover the dodecameric conformation after dialysis and re-addition of 2-OG. Looking at the dodecamer-peak right after SEC however, we exclusively see dodecamers, which is now included as an additional supplementary figure (suppl. Fig. 1C). Consequently, we did not perform additional experiments.

      (2) Building on the last point, the estimated binding strength (Kd) between 2-OG and GS might be lower than it really is, because the GS often breaks apart from its dodecameric form in this experiment, even though 2-OG helps keep the pairs together, as seen with cryoEM. What if they used 5-10 times more GS in the mass photometry experiment? Would the estimated bond strength stay the same? Could they use AUC or other techniques like ITC to find out the real, not just estimated, strength of the bond?

      We agree that the term KD is not suitable. We have changed the term KD to EC50 as suggested by reviewer #1, which describes the effective concentration required for 50 % dodecamer assembly. Furthermore, we disagree that the dodecamer breaks apart when the concentrations are as low as in MP experiments. The actual reason for the breaking is rather the harsh dialysis to remove all 2-OG before MP experiments. Right after SEC, we exclusively see dodecamers in MP (suppl. Fig. S1C). See also #2 (1).
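
      To illustrate what an EC50 for dodecamer assembly means in practice, the following minimal Python sketch evaluates a Hill-type dose-response curve and recovers the EC50 by a crude grid search. It is not the authors' analysis: the Hill form, the Hill coefficient of 2.0, the grid ranges and the function names are assumptions made for illustration, and the data points are synthetic and noise-free, generated from the model itself. Only the 0.75 mM EC50 and the 62 % plateau are taken from the text above, and the EC50 is read here as the half-maximal concentration relative to that plateau.

```python
def hill(conc_mM, ec50_mM, n, top):
    """Hill-type dose-response: % dodecamer at a given 2-OG concentration (mM)."""
    if conc_mM <= 0:
        return 0.0
    return top * conc_mM ** n / (ec50_mM ** n + conc_mM ** n)


def fit_ec50(concs, responses, top, n_grid=(0.5, 1.0, 1.5, 2.0, 3.0, 4.0)):
    """Least-squares grid search over EC50 (log-spaced, 0.01-10 mM) and Hill coefficient."""
    best = None
    for i in range(401):
        ec50 = 10 ** (-2 + 3 * i / 400)
        for n in n_grid:
            sse = sum((hill(c, ec50, n, top) - r) ** 2
                      for c, r in zip(concs, responses))
            if best is None or sse < best[0]:
                best = (sse, ec50, n)
    return best[1], best[2]


# Synthetic illustration only: these are NOT measured values.
concs = [0.01, 0.1, 0.5, 1.0, 5.0, 12.5]       # mM 2-OG
top = 62.0                                     # % dodecamer plateau quoted in the review
responses = [hill(c, 0.75, 2.0, top) for c in concs]

ec50_fit, n_fit = fit_ec50(concs, responses, top)
print(f"EC50 ~ {ec50_fit:.2f} mM, Hill coefficient {n_fit}")
```

      With real, noisy mass photometry data a proper non-linear fit with confidence intervals would be preferable to this grid search; the sketch only shows that the EC50 is the concentration giving half of the plateau response and is not an affinity constant.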

      (3) The fact that the GS hardly works without 2-OG is interesting. I tried to understand the experiment setup, but it wasn't clear as the protocol mentioned in the author's 2021 FEBS paper referred to an old paper from 1970. The "coupled optical test assay" they talked about wasn't explained well. I found other papers that used phosphometry assays to see how much ATP was used up. I suggest the authors give a better, more detailed explanation of their experiments in the methods section. Also, it's unclear why the GS activity keeps going up from 5 to 12.5 mM 2-OG, even though they said it's saturated. They suggested there might be another change happening from 5 to 12.5 mM 2-OG. If that's the case, they should try to get a cryo-EM picture of the GS with lots of 2-OG, both with and without ATP/glutamate (or the Met-Sox-P-ADP inhibitor), to see what's happening at a structural level during this change caused by 2-OG.

      We agree with the reviewer that the GS assay was not explained in detail (since it has been published and known for several years). We have now added a more detailed description of the assay to the revised MS. The assay also measures the ATP used up by GS, but couples the generation of ADP to an optical readout: pyruvate kinase present in the assay uses the generated ADP to produce pyruvate from PEP, and the pyruvate is then reduced to lactate by lactate dehydrogenase, consuming NADH. The resulting decrease in NADH is monitored at 340 nm.
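
      As a worked example of how such a coupled readout translates into a specific activity, the sketch below converts a measured decrease in A340 into U/mg using the Beer-Lambert law. It assumes 1:1 stoichiometry between ATP consumed by GS and NADH oxidized, a 1 cm path length by default, and the commonly used molar absorption coefficient of NADH at 340 nm (6.22 mM^-1 cm^-1). The function name, parameters and numbers are hypothetical and are not taken from the manuscript.

```python
NADH_EPSILON_MM = 6.22  # mM^-1 cm^-1 at 340 nm (standard literature value)


def specific_activity_u_per_mg(delta_a340_per_min, assay_volume_ml,
                               enzyme_mg, path_cm=1.0):
    """Specific activity (U/mg, with 1 U = 1 umol/min) from the A340 slope.

    Assumes one NADH is oxidized per ATP hydrolysed by GS (coupled via
    pyruvate kinase and lactate dehydrogenase).
    """
    if enzyme_mg <= 0 or path_cm <= 0 or assay_volume_ml <= 0:
        raise ValueError("enzyme amount, path length and volume must be positive")
    # Beer-Lambert: rate of NADH concentration change in mM per minute.
    nadh_rate_mm_per_min = abs(delta_a340_per_min) / (NADH_EPSILON_MM * path_cm)
    # mM * mL = umol, so this is umol NADH oxidized per minute in the cuvette.
    umol_per_min = nadh_rate_mm_per_min * assay_volume_ml
    return umol_per_min / enzyme_mg


# Hypothetical numbers: slope of -0.622 A340/min, 1 mL assay, 10 ug enzyme.
activity = specific_activity_u_per_mg(-0.622, 1.0, 0.01)
print(f"{activity:.1f} U/mg")
```

      Because the activity scales linearly with the A340 slope, small slopes close to the background drift give large relative errors, which is consistent with the high deviation at low activities mentioned for this assay.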

      The still increasing activity of GS after dodecamer formation (max. at 5 mM 2-OG) and the continuously increasing enzyme activity (max. at 12.5 mM 2-OG): See also public reviews, we assume that there are two effects caused by 2-OG: 1. cooperativity of binding (less 2-OG needed to facilitate dodecamer formation) and 2. priming of each active site.

      The suggested additional experiments with and without ATP/Glutamate: Although we strongly agree that this would be a highly interesting structure, it seems out of the scope of a typical revision to request new cryo-EM structures. We evaluate the findings of our present study concerning the 2-OG effects as important insights into the strongly discussed field of glutamine synthetase regulation, even without the requested additional structures.

      (4) Please remake Figure S2, the panels are too small to read the words. At least I have difficulty doing so.

      We assume the reviewer is pointing to Suppl. Fig S3, we now changed this figure accordingly.

      Line 153, the reference Schumacher et al. 23, should be 2023?

      Yes, thank you. We corrected that.

      Line 497. I believe it's UCSF ChimeraX, not Chimera.

      We apologize and corrected accordingly.

      Reviewer #3 (Recommendations For The Authors):

      Recent studies on the Methanothermococcus thermolithotrophicus glutamine synthetase, published by Müller et al., 2024, have identified the binding site for 2-oxoglutarate as well as the conformational changes that were induced in the protein by its presence. In the present study, the authors confirm these observations and additionally establish a link between the presence of 2-oxoglutarate and the dodecameric fold and full activation of GS.

      Curiously, here, the authors could not confirm their own findings that the dodecameric GS can directly interact with the PII-like GlnK1 protein and the small peptide sP26. However, the lack of mention of the GlnK-bound state in these studies is very alarming since it certainly is highly relevant here.

      We agree with the reviewer that we have not observed the interaction with GlnK1 and sP26 in the recent study. Consequently, we speculate that yet unknown cellular factor(s) might be required for an interaction of GlnA1 with GlnK1 and sP26, which were not present in the in vitro experiments using purified proteins, however they were present in the previous pull-down approaches (Ehlers et al. 2005, Gutt et al. 2021). Another reason might be that post-translational modifications occur in M. mazei, which might be important for the interaction, which are also not present in purified proteins expressed in E. coli.

      The manuscript's interest could have been substantially increased if the authors had done finer biochemical and enzymatic analyses of the oligomerization process of GS, had used GlnK1 bound to known effectors in their assays, and had made more effort to extrapolate their findings (even if to a small niche) to related glutamine synthetases.

      We thank the reviewer for their valuable encouragement to explore ligand-bound-states of GlnK1. However, in this manuscript we mainly focused on 2-OG as activator of GlnA1 and decided to dedicate future experiments to the exploration of conditions that possibly favor GlnK1-binding.

      In principle, we have explored the ATP bound GlnK1 effects on GlnA1 activity in the activity assays (Fig. 2E) since ATP (3.6 mM) is present. GlnK1 however showed no effects on GlnA1 activity.

      In general, the manuscript is poorly written, with grammatically incorrect sentences that at times stand in the way of conveying the message of the manuscript.

      Particular points:

      (1) It is mentioned that 2-OG induces the active oligomeric (dodecamer, 12-mer) state of GlnA1 without detectable intermediates. However, only 62 % of the starting inactive enzyme yields active 12-mers. Note that this is contradicted in line 212.

      Thanks for pointing out this discrepancy. After removing all 2-OG as we did before MP-experiments, GlnA1 doesn’t reach full dodecamers anymore when 2-OG is re-added. This is not because the 2-OG amount is not enough to trigger full assembly, but because the protein is much more unstable in the absence of 2-OG, so we predict that some GlnA1 breaks during dialysis. See also answer reviewer #2 (1) and supplementary figure S1C.

      Is there any protein precipitation upon the addition of 2-OG? Is all protein being detected in the assay, meaning, is monomer/dimer + dodecamer yields close to 100% of the total enzyme in the assay?

      There is no protein precipitation upon the addition of 2-OG, indeed, GlnA1 is much more stable in the presence of 2-OG. In the mass photometry experiments, all particles are measured, precipitated protein would be visible as big entities in the MP.

      Please add to Figure 1 the amount of monomer/dimer during titration. Some debate why there is no full conversion should be tentatively provided.

      We agree with the reviewer and have included the amount of monomer/dimer in the figure, as well as some discussion of why it is not fully converted again. GlnA1 is unstable without 2-OG and it was dialysed against buffer without 2-OG before MP measurements. This sample mistreatment resulted in incomplete re-assembly after re-adding 2-OG, although the protein was fully dodecameric before dialysis (suppl. Fig. S1C).

      (2) Figure 1B reflects an exemplary result. Here, the addition of 0.1 mM 2-OG seems to promote monomer to dimer transition. Why was this not studied in further detail? It seems highly relevant to know from which species the dodecamer is assembled.

      We thank the reviewer for their comment. However, we would like to point out that, although not shown in the figure, GlnA1 is always mainly present as dimers as the smallest entity. As suggested earlier, we have added the amount of monomers/dimers to Figure 1A, which shows low monomer counts at all 2-OG concentrations (Fig. 1A). Although not depicted in the graph, which starts at 0.01 mM 2-OG, we also see mainly dimers at 0 mM 2-OG.

      How does the y-axis compare to the number and percentage of counts assigned to the peaks? In line 713, it is written that the percentage of dodecamer considers the total number of counts, and this was plotted against the 2-OG concentration.

      We thank the reviewer for addressing this lack of clarity. Line 713 corresponds to Figure 1A, where we indeed plotted the percentage of dodecamer against the 2-OG concentration. There, the percentage of dodecamer corresponds to the percentage calculated from the Gaussian fit of the MP dodecamer peak. In Figure 1B, however, the y-axis displays the relative amount of counts per mass; multiple similar masses then add up to the percentage of the respective peak (Gaussian fit over similar masses).

      (3) Lines 714 and 721 (and elsewhere): Why only partial data is used for statistical purposes?

      In general, we show only one exemplary biological replicate, since the quality of the respective GlnA1 purification sometimes varied (maximum activity ranging from 5 to 10 U/mg). Therefore, we only compared activities within the same protein purification. For the EC50 calculations of all measurements, we refer to the supplement.

      (4) Lines 192-193: It is claimed that GlnK1 was previously shown to both regulate the activity of GlnA1 and form a complex with GlnA1. Please mention the ratio between GlnK1 and GlnA1 in this complex.

      We have now included the requested information (GlnA1:GlnK1 1:1, Ehlers et al. 2005; His6-GlnA1 (0.95 μM), His6-GlnK1 (0.65 μM), 2:1.4, Gutt et al. 2021).

      It is also known that PII proteins such as GlnK1 can bind ADP, ATP, and 2-OG. Interestingly, however, for various described PII proteins, 2-OG can only bind after the binding of ATP.

      So, the crucial question here is what is the binding state of GlnK1? 

      Were these assays performed in the absence of ATP? This is key to fully understand and connect the results to the previous observations. For example, if the GlnK1 used was bound to ADP but not to ATP, then the added 2-OG might indeed only be able to affect GlnA1 (leading to its activation/oligomerization). If this were true and according to the data reported, ADP would prevent GlnK1 from interacting with any oligomeric form of GlnA1. However, if GlnK1 bound to ATP is the form that interacts with GlnA1 (potentially validating previous results?) then, 2-OG would first bind to GlnK1 (assuming a higher affinity of 2-OG to GlnK1), eventually causing its release from GlnA1 followed by binding and activation of GlnA1.

      These experiments need to be done as they are essential to further understand the process. Given the ability of the authors to produce the protein and run such assays, it is unclear why they were not done here. As written in line 203, in this case, "under the conditions tested" is not a good enough statement, considering what is known in the field and how many more conclusions could easily be taken from such a setup.

      Thanks for the encouragement to investigate the ligand-bound states of GlnK1. We agree and plan to perform the suggested mass photometry experiments exploring the conditions under which GlnA1 and GlnK1 might interact in future work. In GlnA1 activity test assays, when evaluating the presence/effects of GlnK1 on GlnA1 activity, however, ATP was always present in high concentrations and still we did not observe a significant effect of GlnK1 on the GlnA1 activity.

      (5) Figure 2D legend claims that the graphic shows the percentage of dodecameric GlnA1 as a function of the concentration of 2-OG. This is not what the figure shows; Figure 2D shows the dodecamer/dimer (although legend claims monomer was used, in line 732) ratio as a function of 2-OG (stated in line 736!). If this is true, a ratio of 1 means 50 % of dodecamers and dimers co-exist. This appears to be the case when GlnK1 was added, while in the absence of GlnK1 higher ratios are shown for higher 2-OG concentration implying that about 3 times more dodecamers were formed than dimers. However, wouldn't a 50 % ratio be physiologically significant?

      We apologize for the partially incorrect and also misleading figure legend and corrected it. Indeed, the ratio of dodecamers and dimers is shown. Furthermore, we did not use monomeric GlnA1 (the smallest entity is mainly a dimer, see Fig 1A), however, the molarity was calculated based on the monomer-mass. Concerning the significance of the difference between the maximum ratio of GlnA1 and GlnK1: The ratio does appear higher, but this is mostly because adding large quantities of GlnK1 broadens all peaks at low molecular weight. This happens because the GlnK1 signal starts overlapping with the signal from GlnA1, leading to inflated GlnA1 dimer counts. We therefore do not think that this is biologically significant, especially as the activities do not differ under these conditions.
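
      To make the arithmetic in this exchange concrete, a dodecamer/dimer ratio can be converted into the share of counted particles that are dodecamers. The sketch below is illustrative only: the function name is hypothetical, it assumes that only these two species are counted, and it treats the ratio as a ratio of particle counts rather than of protein mass.

```python
def dodecamer_fraction(ratio: float) -> float:
    """Fraction of counted particles that are dodecamers.

    `ratio` is the dodecamer/dimer count ratio; only these two species
    are assumed to be present (an assumption made for illustration).
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1.0 + ratio)


# A ratio of 1 corresponds to equal numbers of dodecamers and dimers,
# a ratio of 3 to three dodecamers for every dimer.
for r in (0.5, 1.0, 3.0):
    print(f"ratio {r:g} -> {dodecamer_fraction(r):.0%} dodecamers")
```

      Under these assumptions, a ratio of 1 corresponds to 50 % dodecamers and a ratio of 3 to 75 %, which is the reading the reviewer applies to Figure 2D.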

      (6) Is it possible that the uncleaved GlnA1 tag is preventing interaction with GlnK1? This should be discussed.

      This is of course a very important point. We however realized that Schumacher et al. also used an N-terminal His-tag, so we assume that the N-terminal tag is not hampering the interaction.

      (7) Line 228: Please detail the reported discrepancies in rmsd between the current protein and the gram-negative enzymes.

      The differences in rmsd between our M. mazei GlnA1 structure and the structures of gram-negative enzymes are caused by a) sequence similarity: e.g., M. mazei GlnA1 and B. subtilis GlnA have a sequence identity of 58.47 %; b) ligands in the structure: the B. subtilis structure contains L-methionine-S-sulfoximine phosphate, a transition state inhibitor, while the M. mazei structure contains 2-OG; c) methodology: the structure determination methods also contribute to these differences. B. subtilis GlnA was determined using X-ray crystallography, while the M. mazei GlnA1 structure was resolved using cryo-EM, where the protein behaves differently in ice compared to a crystal.

      (8) Line 747: The figure title claims "dimeric interface" although the manuscript body only refers to "hexameric interface" or "inter-hexamer interface" (line 224). Moreover, the figure 4 legend uses terms such as vertical and horizontal dimers and this too should be uniformized within the manuscript.

      Thank you for your valuable feedback. We have updated the figure title, the figure legend, and the main text to ensure consistency in the description.

      (9) Line 752: The description of the color scheme used here is somehow unclear.

      Thanks for pointing this out. We changed the description to make it more comprehensive.

      (10) Please label H14/H15 and H14′/H15′ in Fig 4C zoom.

      We agree that this has not been very clear. We added helix labels.

      (11) In Figure 4D legend, make sure to note that the binding sites for the substrate are based on homologies with another enzyme poised with these molecules.

      The same should be clear in the text: sites are not known, they are assumed to be, based on homologies (paragraph starting at line 239).

      Concerning this comment we want to point out that we studied the exact same enzyme as the Schumacher group, except that we used 2-OG in our experiments, which they did not.

      (12) Figure 3 appears redundant in light of Figure 4. 

      (13) Line 235: When mentioning F24, please refer to Figure 5.

      Thank you, we changed that accordingly.

      (14) Please provide the distances for the bonds depicted in Figure 4B.

      Thanks for pointing this out. We added distance labels to Figure 4B; for clarity, only three H-bonds are labeled.

      (15) Line 241: D57 is likely serving to abstract a proton from ammonium, what is residue Glu307 potentially doing? The information seems missing in light of how the sentence is built.

      Thanks for pointing this out. According to previous studies, both residues are likely involved in proton abstraction, first from ammonium and then from the formed gamma-ammonium group. Additionally, they contribute to shielding the active site from bulk solvent to prevent hydrolysis of the formed phospho-glutamate.

      (16) Why do the authors assume that increased concentrations of 2-OG are a signal for N starvation only in M. mazei and not in all prokaryotic equivalent systems (line 288)?

      In line 288, we did not claim that this is a unique signal for M. mazei. It is also the central N-starvation signal in Cyanobacteria but not directly perceived by the cyanobacterial GS through binding directly to GS.

      The authors should look into the residues that bind 2-OG and check if they are conserved in other GS. The results of this sequence analysis should be discussed in line with the variable prokaryotic glutamine synthetase types of activity modulation that were exposed in the introduction and Figure 7.

      Please refer to supplementary figure S5, where we already aligned the mentioned glutamine synthetase sequences. Since this was also already discussed in Müller et al. 2024, we did not want to repeat their observations and refer to our supplementary figure in too much detail.

      (17) Figure 5 title: Replace TS by transition state structures of homology enzymes, or alike.

      Thank you for this suggestion. We did not change the title however, since it is not a homologue but the exact same glutamine synthetase from Methanosarcina mazei.

      (18) Line 249: D170 is not shown in Figure 5A or elsewhere in Figure 5.

      Thank you for pointing this out. We added D170 to figure 5A.

      (19) Representative density for the residues binding 2-OG should be provided, maybe in a supplemental figure.

      Thank you for the suggestion. We added the densities of the 2-OG-binding residues to Figure 4B.

      (20) Line 260: Please add a reference when describing the phosphoryl transfer.

      We thank the reviewer for this important point and added that accordingly.

      (21) Line 296: The binding of 2-OG indeed appears to be cooperative, such that at concentrations above its binding affinity to the protein, only dodecamers are seen (under experimental conditions). However, claiming that the oligomerization is fast is not correct when the experimental setup includes 10 minutes of incubation before measurements are done. Please correct this within the entire manuscript.

      A (fast) continuous kinetic assay could have confirmed this point and revealed the oligomerization steps and the intermediaries in the process (maybe monomer/dimers, then dimers/hexamers, and then hexamers/dodecamers). Such assays would have been highly valuable to this study.

      We thank the reviewer for this suggestion, but disagree. It is indeed a rather fast regulation (activity assays without pre-incubation take only 1 min longer to reach full activity, see the newly included suppl. Fig S6). Compared with other regulatory mechanisms, e.g. transcriptional or translational regulation, an activation that takes only 60 s is actually quite quick.

      (22) Line 305 (and elsewhere in the manuscript): the authors state that 2-OG primes the active site for a transition state. This appears incorrect. The transition state is the highest energy state in an enzymatic reaction progressing from substrate to product. Meaning, the transition state is a state that has a more or less modified form of the original substrate bound to the active site. This is not the case.

      In line 366 an "active open state" appears much more adequate to use. 

      We agree and changed accordingly throughout the manuscript.

      (23) Line 330: Please delete "found". Eventually replace it with "confirmed": As the authors write, others have described this residue as a ligand to glutamine.

      Thanks, we changed that accordingly, although previous descriptions were just based on homologies without the experimental validation.

      (24) The discussion summarizes the results again at various points. It should be trimmed and improved.

      (25) Line 381: replace "two fast" with "fast"?

      We thank the reviewer for this suggestion, but disagree on this point. We especially wanted to highlight that there are two central nitrogen-metabolites involved in the direct regulation of GlnA1, that means TWO fast direct processes mediated by 2-OG and glutamine.

    1. eLife Assessment

      This important paper reports functional interactions between L1TD1, an RNA binding protein (RBP), and its ancestral LINE-1 retrotransposon which is not modulated at the translational level. The evidence for the association between L1TD1 and LINE-1 ORF1p is solid. The work implies that the transposon-derived RNA binding protein in the human genome can interact with the ancestral transposable element from which this protein was initially derived. This work spurs interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.

    2. Reviewer #1 (Public review):

      Summary:

      In their manuscript entitled 'The domesticated transposon protein L1TD1 associates with its ancestor L1 ORF1p to promote LINE-1 retrotransposition', Kavaklıoğlu and colleagues delve into the role of L1TD1, an RNA binding protein (RBP) derived from a LINE1 transposon. L1TD1 proves crucial for maintaining pluripotency in embryonic stem cells and is linked to cancer progression in germ cell tumors, yet its precise molecular function remains elusive. Here, the authors uncover an intriguing interaction between L1TD1 and its ancestral LINE-1 retrotransposon.

      The authors delete the DNA methyltransferase DNMT1 in a haploid human cell line (HAP1), inducing widespread DNA hypo-methylation. This hypomethylation prompts abnormal expression of L1TD1. To scrutinize L1TD1's function in a DNMT1 knock-out setting, the authors create DNMT1/L1TD1 double knock-out cell lines (DKO). Curiously, while the loss of global DNA methylation doesn't impede proliferation, additional depletion of L1TD1 leads to DNA damage and apoptosis.

      To unravel the molecular mechanism underpinning L1TD1's protective role in the absence of DNA methylation, the authors dissect L1TD1 complexes in terms of protein and RNA composition. They unveil an association with the LINE-1 transposon protein L1-ORF1 and LINE-1 transcripts, among others.

      Surprisingly, the authors note fewer LINE-1 retro-transposition events in DKO cells compared to DNMT1 KO alone.

      Strengths:

      The authors present compelling data suggesting the interplay of a transposon-derived human RNA binding protein with its ancestral transposable element. Their findings spur interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.

      Weaknesses:

      The finding that L1TD1/DNMT1 DKO cells exhibit increased apoptosis and DNA damage but decreased L1 retro-transposition is unexpected. Considering the DNA damage associated with retro-transposition and the DNA damage and apoptosis observed in L1TD1/DNMT1 DKO cells, one would anticipate the opposite outcome. Could it be that the observation of fewer transposition-positive colonies stems from the demise of the most transposition-positive colonies? Future studies are bound to further explore this intriguing phenomenon.

    3. Reviewer #2 (Public review):

      In this study, Kavaklıoğlu et al. investigated and presented evidence for a role for domesticated transposon protein L1TD1 in enabling its ancestral relative, L1 ORF1p, to retrotranspose in HAP1 human tumor cells. The authors provided insight into the molecular function of L1TD1 and shed some clarifying light on previous studies that showed somewhat contradictory outcomes surrounding L1TD1 expression. Here, L1TD1 expression was correlated with L1 activation in a hypomethylation-dependent manner, due to DNMT1 deletion in the HAP1 cell line. The authors then identified L1TD1-associated RNAs using RIP-Seq, which display a disconnect between transcript and protein abundance (via Tandem Mass Tag multiplex mass spectrometry analysis). This disconnect, for which the one exception was L1TD1 itself, is consistent with a model in which the RNA transcripts associated with L1TD1 are not directly regulated at the translation level. Instead, the authors found L1TD1 protein associated with L1-RNPs, and this interaction is associated with increased L1 retrotransposition, at least in the context of HAP1 cells. Overall, these results support a model in which L1TD1 is restrained by DNA methylation, but in the absence of this repressive mark, L1TD1 is expressed and collaborates with L1 ORF1p (either directly or through interaction with L1 RNA, which remains unclear based on current results), leading to enhanced L1 retrotransposition. These results establish feasibility of this relationship existing in vivo in either development or disease, or both.

      Comments on revised version:

      Thank you for this revised manuscript and for addressing our concerns and suggestions. These improvements have significantly enhanced the quality and reliability of the results presented and have addressed all our questions.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In their manuscript entitled 'The domesticated transposon protein L1TD1 associates with its ancestor L1 ORF1p to promote LINE-1 retrotransposition', Kavaklıoğlu and colleagues delve into the role of L1TD1, an RNA binding protein (RBP) derived from a LINE1 transposon. L1TD1 proves crucial for maintaining pluripotency in embryonic stem cells and is linked to cancer progression in germ cell tumors, yet its precise molecular function remains elusive. Here, the authors uncover an intriguing interaction between L1TD1 and its ancestral LINE-1 retrotransposon.

      The authors delete the DNA methyltransferase DNMT1 in a haploid human cell line (HAP1), inducing widespread DNA hypo-methylation. This hypomethylation prompts abnormal expression of L1TD1. To scrutinize L1TD1's function in a DNMT1 knock-out setting, the authors create DNMT1/L1TD1 double knock-out cell lines (DKO). Curiously, while the loss of global DNA methylation doesn't impede proliferation, additional depletion of L1TD1 leads to DNA damage and apoptosis.

      To unravel the molecular mechanism underpinning L1TD1's protective role in the absence of DNA methylation, the authors dissect L1TD1 complexes in terms of protein and RNA composition. They unveil an association with the LINE-1 transposon protein L1-ORF1 and LINE-1 transcripts, among others.

      Surprisingly, the authors note fewer LINE-1 retro-transposition events in DKO cells compared to DNMT1 KO alone.

      Strengths:

      The authors present compelling data suggesting the interplay of a transposon-derived human RNA binding protein with its ancestral transposable element. Their findings spur interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.

      Weaknesses:

      Suggestions for refinement:

      The initial experiment, inducing global hypo-methylation by eliminating DNMT1 in HAP1 cells, is intriguing and warrants more detailed description. How many genes experience misregulation or aberrant expression? What phenotypic changes occur in these cells? Why did the authors focus on L1TD1? Providing some of this data would be helpful to understand the rationale behind the thorough analysis of L1TD1.

      The finding that L1TD1/DNMT1 DKO cells exhibit increased apoptosis and DNA damage but decreased L1 retro-transposition is unexpected. Considering the DNA damage associated with retro-transposition and the DNA damage and apoptosis observed in L1TD1/DNMT1 DKO cells, one would anticipate the opposite outcome. Could it be that the observation of fewer transposition-positive colonies stems from the demise of the most transposition-positive colonies? Further exploration of this phenomenon would be intriguing.

      Reviewer #2 (Public review):

      In this study, Kavaklıoğlu et al. investigated and presented evidence for a role for domesticated transposon protein L1TD1 in enabling its ancestral relative, L1 ORF1p, to retrotranspose in HAP1 human tumor cells. The authors provided insight into the molecular function of L1TD1 and shed some clarifying light on previous studies that showed somewhat contradictory outcomes surrounding L1TD1 expression. Here, L1TD1 expression was correlated with L1 activation in a hypomethylation-dependent manner, due to DNMT1 deletion in the HAP1 cell line. The authors then identified L1TD1-associated RNAs using RIP-Seq, which display a disconnect between transcript and protein abundance (via Tandem Mass Tag multiplex mass spectrometry analysis). This disconnect, for which the one exception was L1TD1 itself, is consistent with a model in which the RNA transcripts associated with L1TD1 are not directly regulated at the translation level. Instead, the authors found L1TD1 protein associated with L1-RNPs, and this interaction is associated with increased L1 retrotransposition, at least in the context of HAP1 cells. Overall, these results support a model in which L1TD1 is restrained by DNA methylation, but in the absence of this repressive mark, L1TD1 is expressed and collaborates with L1 ORF1p (either directly or through interaction with L1 RNA, which remains unclear based on current results), leading to enhanced L1 retrotransposition. These results establish feasibility of this relationship existing in vivo in either development or disease, or both.

      Comments on revised version:

      In general, the authors did an acceptable job addressing the major concerns throughout the manuscript. This revision is much clearer and has improved in terms of logical progression.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The authors have addressed all my questions in the revised version of the manuscript.

      Reviewer #2 (Recommendations for the authors):

      Revised comments:

      A few points we'd like to see addressed are our comments about the model (Figure S7C), as this is important for the readership to understand this complex finding. Please try to apply some quantification, if possible (question 8). Please do your best to tone down the direct relationship of these findings to embryology (question 11). Based on both reviewers' comments, we believe addressing Reviewer #1's "Suggestions for refinement" (2 points) would help us change our view from solid to convincing.

      Responses to changes:

      Major

      (1) The study only used one knockout (KO) cell line generated by CRISPR/Cas9.

      Considering the possibility of an off-target effect, I suggest the authors attempt one or both of these suggestions.

      A)  Generate or acquire a similar DNMT1 deletion that uses distinct sgRNAs, so that the likelihood of off-targets is negligible. A few simple experiments such as qRT-PCR would be sufficient to suggest the same phenotype.

      B)  Confirm the DNMT1 depletion also by siRNA/ASO KD to phenocopy the KO effect.

      (2) In addition to the strategies to demonstrate reproducibility, a rescue experiment restoring DNMT1 to the KO or KD cells would be more convincing. (Partial rescue would suffice in this case, as exact endogenous expression levels may be hard to replicate).

      We have undertaken several approaches to study the effect of DNMT1 loss or inactivation: As described above, we have generated a conditional KO mouse with ablation of DNMT1 in the epidermis. DNMT1-deficient keratinocytes isolated from these mice show a significant increase in L1TD1 expression. In addition, treatment of primary human keratinocytes and two squamous cell carcinoma cell lines with the DNMT inhibitor aza-deoxycytidine led to upregulation of L1TD1 expression. Thus, the derepression of L1TD1 upon loss of DNMT1 expression or activity is not a clonal effect.

      Also, the spectrum of RNAs identified in RIP experiments as L1TD1-associated transcripts in HAP1 DNMT1 KO cells showed a strong overlap with the RNAs isolated by a related yet different method in human embryonic stem cells. When it comes to the effect of L1TD1 on L1 retrotransposition, a recent study has reported a similar effect of L1TD1 upon overexpression in HeLa cells [4].

      All of these points together convince us that our findings with HAP1 DNMT1 KO cells are in agreement with results obtained in various other cell systems and are therefore not due to off-target effects. With that in mind, we would pursue the suggestion of Reviewer 1 to analyze the effects of DNA hypomethylation upon DNMT1 ablation.

      Thank you for addressing this concern. The reference to Beck 2021 and the additional cell lines (R2: keratinocytes and R3: squamous cell carcinoma) provide sufficient evidence that this result is unlikely to be a result of clonal expansion or off-targets.

      Question: Was the human ES Cell RIP Experiment shown here? What is the overlap?

      We refer to the recently published study by Jin et al. (PMID: 38165001). As stated in the Discussion, the majority of L1TD1-associated transcripts in HAP1 cells (69%) identified in our study were also reported as L1TD1 targets in hESCs suggesting a conserved binding affinity of this domesticated transposon protein across different cell types.  

      (3) As stated in the introduction, L1TD1 and ORF1p share "sequence resemblance" (Martin 2006). Is the L1TD1 antibody specific or do we see L1 ORF1p if Fig 1C were uncropped?

      (6) Is it possible the L1TD1 antibody binds L1 ORF1p? This could make Figure 2D somewhat difficult to interpret. Some validation of the specificity of the L1TD1 antibody would remove this concern (see minor concern below).

      This is a relevant question. We are convinced that the L1TD1 antibody does not cross-react with L1 ORF1p for the following reasons: Firstly, the antibody does not recognize L1 ORF1p (40 kDa) in the uncropped Western blot for Figure 1C (Figure R4A). Secondly, the L1TD1 antibody gives only background signals in DKO cells in the indirect immunofluorescence experiment shown in Figure 1E of the manuscript.

      Thirdly, the immunogen sequence of L1TD1 that determines the specificity of the antibody was checked in the antibody data sheet from Sigma Aldrich. The corresponding epitope is not present in the L1 ORF1p sequence.

      Finally, we have shown that the ORF1p antibody does not cross-react with L1TD1 (Figure R4B).

      Response: Thank you for sharing these images. These full images relieve concerns about specificity. The increase of ORF1P in R4B and Main figure 3C is interesting and pointed out in the manuscript. Not for the purposes of this review, but the observation of reduced transposition despite increased ORF1P could be an interesting follow up to this study (combined with the similar UPF1 result could indicate a complex of some kind).

      (4) In abstract (P2), the authors mentioned that L1TD1 works as an RNA chaperone, but in the result section (P13), they showed that L1TD1 associates with L1 ORF1p in an RNA independent manner. Those conclusions appear contradictory. Clarification or revision is required.

      Our findings that both proteins bind L1 RNA, and that L1TD1 interacts with ORF1p are compatible with a scenario where L1TD1/ORF1p heteromultimers bind to L1 RNA. The additional presence of L1TD1 might thereby enhance the RNA chaperone function of ORF1p. This model is visualized now in Suppl. Figure S7C.

      Response: Thank you for the model. To further clarify, do you mean that L1TD1 can bind L1 RNA, but this is not needed for the effect, however this "bonus" binding (that is enabled by heteromultimerization) appears to enhance the retrotransposition frequency? Do you think L1TD1 is binding L1 RNA in this context or simply "stabilizing" ORF1P (Trimer) RNP?

      Based on our data, L1TD1 associates with L1 RNA and interacts with L1 ORF1p. Both features might contribute to the enhanced retrotransposition frequency. Interestingly, the L1TD1 protein shares with its ancestor L1 ORF1p the non-canonical RNA recognition motif and the coiled-coil motif required for the trimerization but has two copies instead of one of the C-terminal domain (CTD), a structure with RNA binding and chaperone function. We speculate that the presence of an additional CTD within the L1TD1 protein might thereby enhance the RNA binding and chaperone function of L1TD1/ORF1p heteromultimers.

      (5) Figure 2C fold enrichment for L1TD1 and ARMC1 is a bit difficult to fully appreciate. A 100 to 200-fold enrichment does not seem physiological. This appears to be a "divide by zero" type of result, as the CT for these genes was likely near 40 or undetectable. Another qRT-PCR based approach (absolute quantification) would be a more revealing experiment.

      This is the validation of the RIP experiments and the presentation mode is specifically developed for quantification of RIP assays (Sigma Aldrich RIP-qRT-PCR: Data Analysis Calculation Shell). The unspecific binding of the transcript in the absence of L1TD1 in DNMT1/L1TD1 DKO cells is set to 1 and the value in KO cells represents the specific binding relative to the unspecific binding. The calculation also corrects for potential differences in the abundance of the respective transcript in the two cell lines. This is not a physiological value but the quantification of specific binding of transcripts to L1TD1. GAPDH as negative control shows no enrichment, whereas specifically associated transcripts show strong enrichment. We have explained the details of the RIP-qRT-PCR evaluation in Materials and Methods (page 14) and the legend of Figure 2C in the revised manuscript.
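
      The normalization described in this response can be sketched with the widely used ΔΔCt scheme for RIP-qPCR. This is an illustration under stated assumptions, not the authors' actual spreadsheet: the function names, the 10 % input fraction, and all Ct values are hypothetical. Each IP is first normalized to its own input (which corrects for transcript abundance in that cell line), and the DKO value serves as the unspecific-binding reference that is set to 1.

```python
import math


def delta_ct(ct_ip: float, ct_input: float, input_fraction: float = 0.1) -> float:
    """Ct of the IP relative to its input, with the input scaled to 100 %.

    `input_fraction` is the share of lysate kept as input (assumed 10 % here).
    """
    adjusted_input = ct_input - math.log2(1.0 / input_fraction)
    return ct_ip - adjusted_input


def fold_enrichment(ct_ip_ko: float, ct_input_ko: float,
                    ct_ip_dko: float, ct_input_dko: float,
                    input_fraction: float = 0.1) -> float:
    """Specific binding in KO cells relative to unspecific binding in DKO cells."""
    ddct = (delta_ct(ct_ip_ko, ct_input_ko, input_fraction)
            - delta_ct(ct_ip_dko, ct_input_dko, input_fraction))
    return 2.0 ** (-ddct)


# Hypothetical Ct values: a bound transcript versus a negative control.
bound = fold_enrichment(ct_ip_ko=24.0, ct_input_ko=22.0,
                        ct_ip_dko=31.0, ct_input_dko=22.0)
control = fold_enrichment(ct_ip_ko=30.0, ct_input_ko=20.0,
                          ct_ip_dko=30.0, ct_input_dko=20.0)
print(f"bound transcript: {bound:.1f}-fold, negative control: {control:.1f}-fold")
```

      With these made-up numbers, a 7-cycle difference between the KO and DKO immunoprecipitates already yields an enrichment of about 128-fold, which illustrates how values in the 100 to 200 range can arise when the DKO signal is close to background.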

      Response: Thank you for the clarification and additional information in the manuscript.

      (6) Is it possible the L1TD1 antibody binds L1 ORF1p? This could make Figure 2D somewhat difficult to interpret. Some validation of the specificity of the L1TD1 antibody would remove this concern (see minor concern below).

      See response to (3).

      Response: Thanks.

      (7) Figure S4A and S4B: There appear to be a few unusual aspects of these figures that should be pointed out and addressed. First, there doesn't seem to be any ORF1p in the Input (if there is, the exposure is too low). Second, there might be some L1TD1 in the DKO (lane 2) and lane 3. This could be non-specific, but the size is concerning. Overexposure would help see this.

      The ORF1p IP gives rise to strong ORF1p signals in the immunoprecipitated complexes even after short exposure. Under these conditions ORF1p is hardly detectable in the input. Regarding the faint band in DKO HAP1 cells, this might be due to a technical problem during Western blot loading. Therefore, the input samples were loaded again on a Western blot and analyzed for the presence of ORF1p, L1TD1 and beta-actin (as loading control) and shown as separate panel in Suppl. Figure S4A.

      The enhanced image is clearer. Thanks.

      S4A and S4B now appear to be S6A and S6B, is that correct? (This is due to the addition of new S1 and S2, but please verify image orders were not disturbed).

      Yes, the input is shown now as a separate panel in Suppl. Figure S6A.

      (8) Figure S4C: This is related to our previous concerns involving antibody cross-reactivity. Figure 3E partially addresses this, where it looks like the L1TD1 "speckles" outnumber the ORF1p puncta, but overlap with all of them. This might be consistent with the antibody cross-reacting. The western blot (Figure 3C) suggests an upregulation of ORF1p by at least 23x in the DKO, but the IF image in 3E is hard to tell if this is the case (slightly more signal, but fewer foci). Can you return to the images and confirm the contrast are comparable? Can you massively overexpose the red channel in 3E to see if there is residual overlap?

      In Figure 3E the L1TD1 antibody gives no signal in DNMT1/L1TD1 DKO cells confirming that it does not recognize ORF1p. In agreement with the Western blot in Figure 3C the L1 ORF1p signal in Figure 3E is stronger in DKO cells. In DNMT1 KO cells the L1 ORF1p antibody does not recognize all L1TD1 speckles. This result is in agreement with the Western blot shown above in Figure R4B and indicates that the L1 ORF1p antibody does not recognize the L1TD1 protein. The contrast is comparable and after overexposure there are still L1TD1 specific speckles. This might be due to differences in abundance of the two proteins.

      Response: Suggestion: Would it be possible to use a program like ImageJ to supplement the western blot observation? Qualitatively, In figure 3E, it appears that there is more signal in the DKO, but this could also be due to there being multiple cells clustered together or a particularly nicely stained region. Could you randomly sample 20-30 cells across a few experiments to see if this holds up. I am interested in whether the puncta in the KO image(s) is a very highly concentrated region and in the DKO this is more disperse. Also, the representative DKO seems to be cropped slightly wrong. (Please use puncta as a guide to make the cropping more precise)

      As suggested by the reviewer, we have quantified the signals of 60 KO cells and 56 DKO cells in three different IF experiments with ImageJ. We measured a 1.4-fold higher expression level of L1 ORF1p in DKO cells. However, the difference is not statistically significant. This is most probably due to the change in cell size and protein content during the cell cycle, with protein content increasing from G1 to G2. Western blot analysis provides signals of comparable protein amounts representing an average expression level over tens of thousands of cells. Nevertheless, the quantification results reflect in principle the IF pictures shown in Figure 3E, but IF is probably not the best method to quantify protein amounts. We have also corrected Figure 3E.

      Author response image 1.
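
      A per-cell comparison of the kind discussed above can be sketched as a fold change of mean intensities plus a permutation test. This is only an illustration: the response does not state which statistical test was used, the function names are hypothetical, and the intensities below are simulated rather than measured. Only the group sizes (60 KO and 56 DKO cells) are taken from the response.

```python
import random
from statistics import mean


def fold_change(ko, dko):
    """Mean DKO intensity divided by mean KO intensity."""
    return mean(dko) / mean(ko)


def permutation_p_value(ko, dko, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(dko) - mean(ko))
    pooled = list(ko) + list(dko)
    n_ko = len(ko)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_ko:]) - mean(pooled[:n_ko]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Simulated per-cell intensities with broad cell-to-cell variation
# (arbitrary units; not data from the study).
sim = random.Random(1)
ko_cells = [sim.lognormvariate(0.0, 0.8) for _ in range(60)]
dko_cells = [1.4 * sim.lognormvariate(0.0, 0.8) for _ in range(56)]

print(f"fold change: {fold_change(ko_cells, dko_cells):.2f}")
print(f"permutation p-value: {permutation_p_value(ko_cells, dko_cells):.3f}")
```

      The point of the sketch is that large cell-to-cell variation, for example from differences in cell size across the cell cycle, can make a modest fold change hard to establish with a few dozen cells per group.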

      (9) The choice of ARMC1 and YY2 is unclear. What are the criteria for the selection?

      ARMC1 was one of the top hits in a pilot RIP-seq experiment (IP versus input and IP versus IgG IP). In the actual RIP-seq experiment with DKO HAP1 cells instead of IgG IP as a negative control, we found ARMC1 as an enriched hit, although it was not among the top 5 hits. The results from the second RIP-seq further confirmed the validity of ARMC1 as an L1TD1-interacting transcript. YY2 was of potential biological relevance as an L1TD1 target because it is a processed pseudogene originating from YY1 mRNA as a result of retrotransposition. This is mentioned on page 6 of the revised manuscript.

      Response: Appreciated!

      (10) (P16) L1 is the only protein-coding transposon that is active in humans. This is perhaps too generalized of a statement as written. Other examples are readily found in the literature.

      Please clarify.

      We will tone down this statement in the revised manuscript.

      Response: Appreciated! To further clarify, the term "active" when it comes to transposable elements, has not been solidified. It can span "retrotransposition competent" to "transcripts can be recovered". There are quite a few reports of GAG transcripts and protein from various ERV/LTR subfamilies in various cells and tissues (in mouse and human at least), however whether they contribute to new insertions is actively researched.

      (11) In both the abstract and last sentence in the discussion section (P17), embryogenesis is mentioned, but this is not addressed at all in the manuscript. Please refrain from implying normal biological functions based on the results of this study unless appropriate samples are used to support them.

      Much of the published data on L1TD1 function are related to embryonic stem cells [3-7]. Therefore, it is important to discuss our findings in the context of previous reports.

      Response: It is well established that embryonic stem cells are not perfect or direct proxies for the inner cell mass of embryos, as multiple reports have demonstrated transcriptomic, epigenetic, and chromatin accessibility differences. The exact origin of ES cells is also considered controversial. We maintain that embryos/embryogenesis and the results presented in the manuscript are not yet interchangeable. An important exception would be complex models of embryogenesis such as embryoids (or synthetic/artificial embryo models that have been carefully termed as such so as not to suggest direct implications for embryos). https://www.nature.com/articles/ncb2965

      https://link.springer.com/article/10.1007/s00018-018-2965-y  

      https://www.cell.com/developmental-cell/abstract/S1534-5807(24)00363-0?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS1534580724003630%3Fshowall%3Dtrue

      We have deleted the corresponding paragraph in the Discussion.

      (12) Figure 3E: The format of Figures 1A and 3E are internally inconsistent. Please present similar data/images in a cohesive way throughout the manuscript.

      We now show consistent IF Figures in the revised manuscript.

      Response: Thanks

      Minor:

      In general:

      Still need checking for typos, mostly in Materials and Methods section; Please keep a consistent writing style throughout the whole manuscript. If you use L1 ORF1p, then please use L1 instead of LINE-1, or if you keep LINE-1 in your manuscript, then you should use LINE-1 ORF1p.

      A lab member from the US checked again the Materials and Methods section for typos. We keep the short version L1 ORF1p.

      (1) Intro:

      - Is L1TD1 in mice and humans? How "conserved" is it and does this suggest function?

      Murine and human L1TD1 proteins share 44% identity at the amino acid level, and it was suggested that the corresponding genes were under positive selection during evolution with functions in transposon control and maintenance of pluripotency [8].

      - Why HAP1? (Haploid?) The importance of this cell line is not clear.

      HAP1 is a nearly haploid human cancer cell line derived from the KBM-7 chronic myelogenous leukemia (CML) cell line [9, 10]. Due to its haploidy, it is perfectly suited and widely used for loss-of-function screens and gene editing. After gene editing, cells can be used in the nearly haploid or in the diploid state. We usually perform all experiments with diploid HAP1 cell lines. Importantly, in contrast to other human tumor cell lines, this cell line tolerates ablation of DNMT1. We have included a corresponding explanation in the revised manuscript on page 5, first paragraph.

      - Global methylation status in DNMT1 KO? (Methylations near L1 insertions, for example?)

      The HAP1 DNMT1 KO cell line with a 20 bp deletion in exon 4 used in our study was validated in the study by Smits et al. [11]. The authors report a significant reduction in overall DNA methylation. However, we are not aware of a DNA methylome study on this cell line. We show now data on the methylation of L1 elements in HAP1 cells and upon DNMT1 deletion in the revised manuscript in Suppl. Figure S1B.

      Response: Looks great!

      (2) Figure 1:

      - Figure 1C. Why is LMNB used instead of Actin (Fig1D)?

      We show now beta-actin as loading control in the revised manuscript.

      - Figure 1G shows increased Caspase 3 in KO, while the matching sentence in the result section skips over this. It might be more accurate to mention this and suggest that the single KO has perhaps an intermediate phenotype (Figure 1F shows a slight but not significant trend).

      We fully agree with the reviewer and have changed the sentence on page 6, 2nd paragraph accordingly.

      - Would 96 hrs trend closer to significance? An interpretation is that L1TD1 loss could speed up this negative consequence.

      We thank the reviewer for the suggestion. We have performed a time course experiment with 6 biological replicates for each time point up to 96 hours and found significant changes in viability upon loss of DNMT1, and a further significant reduction in viability upon additional loss of L1TD1 (shown in Figure 1F). These data suggest that, as expected, loss of DNMT1 leads to a significant reduction in viability and that additional ablation of L1TD1 further enhances this effect.

      Response: Looks good!

      - What are the "stringent conditions" used to remove non-specific binders and artifacts (negative control subtraction?)

      Yes, we considered only hits from both analyses, L1TD1 IP in KO versus input and L1TD1 IP in KO versus L1TD1 IP in DKO. This is now explained in more detail in the revised manuscript on page 6, 3rd paragraph.

      (3) Figure 2:

      - Figure 2A is a bit too small to read when printed.

      We have changed this in the revised manuscript.

      - Since WT and DKO lack detectable L1TD1, would you expect any difference in RIP-Seq results between these two?

      Due to the lack of DNMT1 and the resulting DNA hypomethylation, DKO cells are more similar to KO cells than WT cells with respect to the expressed transcripts.

      - Legend says selected dots are in green (it appears blue to me).

      We have changed this in the revised manuscript.

      - Would you recover L1 ORF1p and its binding partners in the KO? (Is the antibody specific in the absence of L1TD1 or can it recognize L1?) I noticed an increase in ORF1p in the KO in Figure 3C.

      Thank you for the suggestion. Yes, L1 ORF1p shows slightly increased expression in the proteome analysis and we have marked the corresponding dot in the Volcano plot (Figure 3A).

      - Should the figure panel reference near the (Rosspopoff & Trono) reference instead be Sup S1C as well? Otherwise, I don't think S1C is mentioned at all.

      - What are the red vs. green dots in 2D? Can you highlight ERV and ALU with different colors?

      We added the reference to Suppl. Figure S1C (now S3C) in the revised manuscript. In Figure 2D L1 elements are highlighted in green, ERV elements in yellow, and other associated transposon transcripts in red.

      Response: Much better, thanks!

      - Which L1 subfamily from Figure 2D is represented in the qRT-PCR in 2E "LINE-1"? Do the primers match a specific L1 subfamily? If so, which?

      We used primers specific for the human L1.2 subfamily.

      - Pulling down SINE element transcripts makes some sense, as many insertions "borrow" L1 sequences for non-autonomous retro transposition, but can you speculate as to why ERVs are recovered? There should be essentially no overlap in sequence.

      In the L1TD1 evolution paper [8], a potential link between L1TD1 and ERV elements was discussed:

      "Alternatively, L1TD1 in sigmodonts could play a role in genome defense against another element active in these genomes. Indeed, the sigmodontine rodents have a highly active family of ERVs, the mysTR elements [46]. Expansion of this family preceded the death of L1s, but these elements are very active, with 3500 to 7000 species-specific insertions in the L1-extinct species examined [47]. This recent ERV amplification in Sigmodontinae contrasts with the megabats (where L1TD1 has been lost in many species); there are apparently no highly active DNA or RNA elements in megabats [48]. If L1TD1 can suppress retroelements other than L1s, this could explain why the gene is retained in sigmodontine rodents but not in megabats."

      Furthermore, Jin et al. report the binding of L1TD1 to repetitive sequences in transcripts [12]. It is possible that some of these sequences are also present in ERV RNAs.

      Response: Interesting, thanks for sharing

      - Is S2B a screenshot? (the red underline).

      No, it is a Powerpoint figure, and we have removed the red underline.

      (4) Figure 3:

      - Text refers to Figure 3B as a western blot. Figure 3B shows a volcano plot. This is likely 3C but would still be out of order (3A>3C>3B referencing). I think this error is repeated in the last result section.

      - Figure and legends fail to mention what gene was used for ddCT method (actin, gapdh, etc.).

      - In general, the supplemental legends feel underwritten and could benefit from additional explanations. (Main figures are appropriate but please double-check that all statistical tests have been mentioned correctly).

      Thank you for pointing this out. We have corrected these errors in the revised manuscript.
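
      For readers less familiar with the ddCT (comparative Ct) method raised in the comments above, the standard calculation can be summarized as follows. This is a generic sketch rather than the authors' exact procedure: the reference gene is written simply as "ref" because the source does not state which one was used, and the formula assumes roughly 100% amplification efficiency for both amplicons.

```latex
\begin{align}
\Delta C_t &= C_t^{\text{target}} - C_t^{\text{ref}} \\
\Delta\Delta C_t &= \Delta C_t^{\text{sample}} - \Delta C_t^{\text{control}} \\
\text{fold change} &= 2^{-\Delta\Delta C_t}
\end{align}
```

      Because every value is normalized to the reference gene in the first step, the choice of that gene (actin, GAPDH, etc.) directly affects the reported fold change, which is why it needs to be named in the figure legends.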

      (5) Discussion:

      - AluY connection is interesting. Is there an "Alu retrotransposition reporter assay" to test whether L1TD1 enhances this as well?

      Thank you for the suggestion. There is indeed an Alu retrotransposition reporter assay reported by Dewannieux et al. [13]. The assay is based on a Neo selection marker. We have previously tested a Neo selection-based L1 retrotransposition reporter assay, but this system failed to work properly in HAP1 cells; therefore, we switched to a blasticidin-based L1 retrotransposition reporter assay. A corresponding blasticidin-based Alu retrotransposition reporter assay might be interesting for future studies (mentioned in the Discussion, page 11, paragraph 4 of the revised manuscript).

      (6) Material and Methods :

      - The number of typos in the materials and methods is too numerous to list. Instead, please refer to the next section that broadly describes the issues seen throughout the manuscript.

      Writing style

      (1) Keep a consistent style throughout the manuscript: for example, L1 or LINE-1 (also L1 ORF1p or LINE-1 ORF1p); per or "/"; knockout or knock-out; min or minute; 3 times or three times; media or medium. Additionally, as TE naming conventions are not uniform, it is important to maintain internal consistency so as to not accidentally establish an imprecise version.

      (2) There's a period between "et al" and the comma, and "et al." should be italic.

      (3) The authors should explain what the key jargon is when it is first used in the manuscript, such as "retrotransposon" and "retrotransposition".

      (4) The authors should show the full spelling of some acronyms when they use it for the first time, such as RNA Immunoprecipitation (RIP).

      (5) Use a space between numbers and alphabets, such as 5 μg.

      (6) 2.0 × 10⁵ cells, that's not an "x".

      (7) Numbers in the reference section are lacking (hard to parse).

      (8) In general, there are a significant number of typos in this draft which at times becomes distracting. For example, (P3) Introduction: Yet, co-option of TEs thorough (not thorough, it should be through) evolution has created so-called domesticated genes beneficial to the gene network in a wide range of organisms. Please carefully revise the entire manuscript for these minor issues that collectively erode the quality of this submission.

      Thank you for pointing out these mistakes. We have corrected them in the revised manuscript. A native speaker from our research group has carefully checked the paper. In summary, we have added Supplementary Figure S7C and have changed Figures 1C, 1E, 1F, 2A, 2D, 3A, 4B, S3A-D, S4B and S6A based on these comments.

      Response: Thank you for taking these comments on board!

    1. eLife Assessment

      The study reports valuable findings from a very rich EEG-fMRI dataset including 107 participants, which was collected during nocturnal naps. The authors link activity in memory-related brain regions (e.g., hippocampus, thalamus, and medial prefrontal cortex), and their functional connectivity, to the occurrence of canonical sleep rhythms, namely spindles and slow oscillations in non-rapid eye movement sleep. This work could contribute to further understanding of sleep neural dynamics, although the evidence for some of the main claims is incomplete at present.

    2. Reviewer #1 (Public review):

      Wang et al., recorded concurrent EEG-fMRI in 107 participants during nocturnal NREM sleep to investigate brain activity and connectivity related to slow oscillations (SO), sleep spindles, and in particular their co-occurrence. The authors found SO-spindle coupling to be correlated with increased thalamic and hippocampal activity, and with increased functional connectivity from the hippocampus to the thalamus and from the thalamus to the neocortex, especially the medial prefrontal cortex (mPFC). They concluded the brain-wide activation pattern to resemble episodic memory processing, but to be dissociated from task-related processing and suggest that the thalamus plays a crucial role in coordinating the hippocampal-cortical dialogue during sleep.

      The paper offers an impressively large and highly valuable dataset that provides the opportunity for gaining important new insights into the network substrate involved in SOs, spindles, and their coupling. However, the paper does unfortunately not exploit the full potential of this dataset with the analyses currently provided, and the interpretation of the results is often not backed up by the results presented.

      I have the following specific comments.

      (1) The introduction is lacking sufficient review of the already existing literature on EEG-fMRI during sleep and the BOLD-correlates of slow oscillations and spindles in particular (Laufs et al., 2007; Schabus et al., 2007; Horovitz et al., 2008; Laufs, 2008; Czisch et al., 2009; Picchioni et al., 2010; Spoormaker et al., 2010; Caporro et al., 2011; Bergmann et al., 2012; Hale et al., 2016; Fogel et al., 2017; Moehlman et al., 2018; Ilhan-Bayrakci et al., 2022). The few studies mentioned are not discussed in terms of the methods used or insights gained.

      (2) The paper falls short in discussing the specific insights gained into the neurobiological substrate of the investigated slow oscillations, spindles, and their interactions. The validity of the inverse inference approach ("Open ended cognitive state decoding"), assuming certain cognitive functions to be related to these oscillations because of the brain regions/networks activated in temporal association with these events, is debatable at best. It is also unclear why eventually only episodic memory processing-like brain-wide activation is discussed further, even though 16 of 50 feature terms from the NeuroSynth v3 dataset were significant (episodic memory, declarative memory, working memory, task representation, language, learning, faces, visuospatial processing, category recognition, cognitive control, reading, cued attention, inhibition, and action).

      (3) Hippocampal activation during SO-spindles is stated as a main hypothesis of the paper - for good reasons - however, other regions (e.g., several cortical as well as thalamic) would be equally expected given the known origin of both oscillations and the existing sleep-EEG-fMRI literature. However, this focus on the hippocampus contrasts with the focus on investigating the key role of the thalamus instead in the Results section.

      (4) The study included an impressive number of 107 subjects. It is surprising though that only 31 subjects had to be excluded under these difficult recording conditions, especially since no adaptation night was performed. Since only subjects were excluded who slept less than 10 min (or had excessive head movements) there are likely several datasets included with comparably short durations and only a small number of SOs and spindles and even less combined SO-spindle events. A comprehensive table should be provided (supplement) including for each subject (included and excluded) the duration of included NREM sleep, number of SOs, spindles, and SO+spindle events. Also, some descriptive statistics (mean/SD/range) would be helpful.

      (5) Was the 20-channel head coil dedicated for EEG-fMRI measurements? How were the electrode cables guided through/out of the head coil? Usually, the 64-channel head coil is used for EEG-fMRI measurements in a Siemens PRISMA 3T scanner, which has a cable duct at the back that allows to guide the cables straight out of the head coil (to minimize MR-related artifacts). The choice for the 20-channel head coil should be motivated. Photos of the recording setup would also be helpful.

      (6) Was the EEG sampling synchronized to the MR scanner (gradient system) clock (the 10 MHz signal; not referring to the volume TTL triggers here)? This is a requirement for stable gradient artifact shape over time and thus accurate gradient noise removal.

      (7) The TR is quite long and the voxel size is quite large in comparison to state-of-the-art EPI sequences. What was the rationale behind choosing a sequence with relatively low temporal and spatial resolution?

      (8) The anatomically defined ROIs are quite large. It should be elaborated on how this might reduce sensitivity to sleep rhythm-specific activity within sub-regions, especially for the thalamus, which has distinct nuclei involved in sleep functions.

      (9) The study reports SO & spindle amplitudes & densities, as well as SO+spindle coupling, to be larger during N2/3 sleep compared to N1 and REM sleep, which is trivial but can be seen as a sanity check of the data. However, the amount of SOs and spindles reported for N1 and REM sleep is concerning, as per definition there should be hardly any (if SOs or spindles occur in N1 it becomes by definition N2, and the interval between spindles has to be considerably large in REM to still be scored as such). Thus, on the one hand, the report of these comparisons takes too much space in the main manuscript as it is trivial, but on the other hand, it raises concerns about the validity of the scoring.

      (10) Why was electrode F3 used to quantify the occurrence of SOs and spindles? Why not a midline frontal electrode like Fz (or a number of frontal electrodes for SOs) and Cz (or a number of centroparietal electrodes) for spindles to be closer to their maximum topography?

      (11) Functional connectivity (hippocampus -> thalamus -> cortex (mPFC)) is reported to be increased during SO-spindle coupling and interpreted as evidence for coordination of hippocampo-neocortical communication likely by thalamic spindles. However, functional connectivity was only analysed during coupled SO+spindle events, not during isolated SOs or isolated spindles. Without the direct comparison of the connectivity patterns between these three events, it remains unclear whether this is specific for coupled SO+spindle events or rather associated with one or both of the other isolated events. The PPIs need to be conducted for those isolated events as well and compared statistically to the coupled events.

      (12) The limited temporal resolution of fMRI does indeed not allow for easily distinguishing between fMRI activation patterns related to SO-up- vs. SO-down-states. For this, one could try to extract the amplitudes of SO-up- and SO-down-states separately for each SO event and model them as two separate parametric modulators (with the risk of collinearity as they are likely correlated).

      (13) L327: "It is likely that our findings of diminished DMN activity reflect brain activity during the SO DOWN-state, as this state consistently shows higher amplitude compared to the UP-state within subjects, which is why we modelled the SO trough as its onset in the fMRI analysis." This conclusion is not justified as the fact that SO down-states are larger in amplitude does not mean their impact on the BOLD response is larger.

      (14) Line 77: "In the current study, while directly capturing hippocampal ripples with scalp EEG or fMRI is difficult, we expect to observe hippocampal activation in fMRI whenever SOs-spindles coupling is detected by EEG, if SOs- spindles-ripples triple coupling occurs during human NREM sleep". Not all SO-spindle events are associated with ripples (Staresina et al., 2015), but hippocampal activation may also be expected based on the occurrence of spindles alone (Bergmann et al., 2012).
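
      The collinearity risk raised in comment (12) can be checked before fitting any fMRI model. The following is a minimal sketch, not the authors' pipeline: it assumes per-event SO up-state and down-state amplitudes are already available as two lists, and it uses made-up example values. With only two modulators, the variance inflation factor reduces to 1 / (1 - r²), where r is their Pearson correlation.

```python
import math


def mean_center(values):
    """Subtract the mean, as is usual for parametric modulators."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    xc, yc = mean_center(x), mean_center(y)
    num = sum(a * b for a, b in zip(xc, yc))
    den = math.sqrt(sum(a * a for a in xc) * sum(b * b for b in yc))
    return num / den


def variance_inflation_factor(x, y):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


if __name__ == "__main__":
    # Hypothetical per-event amplitudes in microvolts (illustrative only).
    up_amp = [18.0, 22.0, 25.0, 30.0, 27.0, 21.0]
    down_amp = [-30.0, -33.0, -41.0, -44.0, -45.0, -32.0]
    down_mag = [abs(v) for v in down_amp]  # compare magnitudes

    r = pearson_r(up_amp, down_mag)
    vif = variance_inflation_factor(up_amp, down_mag)
    print(f"r = {r:.2f}, VIF = {vif:.2f}")
```

      A high VIF would indicate that the two modulators explain largely the same variance, in which case their separate effect estimates become unstable and should be interpreted with caution.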

      References:

      Bergmann TO, Molle M, Diedrichs J, Born J, Siebner HR (2012) Sleep spindle-related reactivation of category-specific cortical regions after learning face-scene associations. Neuroimage 59:2733-2742.
      Caporro M, Haneef Z, Yeh HJ, Lenartowicz A, Buttinelli C, Parvizi J, Stern JM (2011) Functional MRI of sleep spindles and K-complexes. Clin Neurophysiol.
      Czisch M, Wehrle R, Stiegler A, Peters H, Andrade K, Holsboer F, Samann PG (2009) Acoustic oddball during NREM sleep: a combined EEG/fMRI study. PLoS One 4:e6749.
      Fogel S, Albouy G, King BR, Lungu O, Vien C, Bore A, Pinsard B, Benali H, Carrier J, Doyon J (2017) Reactivation or transformation? Motor memory consolidation associated with cerebral activation time-locked to sleep spindles. PLoS One 12:e0174755.
      Hale JR, White TP, Mayhew SD, Wilson RS, Rollings DT, Khalsa S, Arvanitis TN, Bagshaw AP (2016) Altered thalamocortical and intra-thalamic functional connectivity during light sleep compared with wake. Neuroimage 125:657-667.
      Horovitz SG, Fukunaga M, de Zwart JA, van Gelderen P, Fulton SC, Balkin TJ, Duyn JH (2008) Low frequency BOLD fluctuations during resting wakefulness and light sleep: a simultaneous EEG-fMRI study. Hum Brain Mapp 29:671-682.
      Ilhan-Bayrakci M, Cabral-Calderin Y, Bergmann TO, Tuscher O, Stroh A (2022) Individual slow wave events give rise to macroscopic fMRI signatures and drive the strength of the BOLD signal in human resting-state EEG-fMRI recordings. Cereb Cortex 32:4782-4796.
      Laufs H (2008) Endogenous brain oscillations and related networks detected by surface EEG-combined fMRI. Hum Brain Mapp 29:762-769.
      Laufs H, Walker MC, Lund TE (2007) 'Brain activation and hypothalamic functional connectivity during human non-rapid eye movement sleep: an EEG/fMRI study'--its limitations and an alternative approach. Brain 130:e75; author reply e76.
      Moehlman TM, de Zwart JA, Chappel-Farley MG, Liu X, McClain IB, Chang C, Mandelkow H, Ozbay PS, Johnson NL, Bieber RE, Fernandez KA, King KA, Zalewski CK, Brewer CC, van Gelderen P, Duyn JH, Picchioni D (2018) All-Night Functional Magnetic Resonance Imaging Sleep Studies. J Neurosci Methods.
      Picchioni D, Horovitz SG, Fukunaga M, Carr WS, Meltzer JA, Balkin TJ, Duyn JH, Braun AR (2010) Infraslow EEG oscillations organize large-scale cortical-subcortical interactions during sleep: A combined EEG/fMRI study. Brain Res.
      Schabus M, Dang-Vu TT, Albouy G, Balteau E, Boly M, Carrier J, Darsaud A, Degueldre C, Desseilles M, Gais S, Phillips C, Rauchs G, Schnakers C, Sterpenich V, Vandewalle G, Luxen A, Maquet P (2007) Hemodynamic cerebral correlates of sleep spindles during human non-rapid eye movement sleep. Proc Natl Acad Sci U S A 104:13164-13169.
      Spoormaker VI, Schroter MS, Gleiser PM, Andrade KC, Dresler M, Wehrle R, Samann PG, Czisch M (2010) Development of a large-scale functional brain network during human non-rapid eye movement sleep. J Neurosci 30:11379-11387.
      Staresina BP, Bergmann TO, Bonnefond M, van der Meij R, Jensen O, Deuker L, Elger CE, Axmacher N, Fell J (2015) Hierarchical nesting of slow oscillations, spindles and ripples in the human hippocampus during sleep. Nat Neurosci 18:1679-1686.

    3. Reviewer #2 (Public review):

      In this study, Wang and colleagues aimed to explore brain-wide activation patterns associated with NREM sleep oscillations, including slow oscillations (SOs), spindles, and SO-spindle coupling events. Their findings reveal that SO-spindle events corresponded with increased activation in both the thalamus and hippocampus. Additionally, they observed that SO-spindle coupling was linked to heightened functional connectivity from the hippocampus to the thalamus, and from the thalamus to the medial prefrontal cortex, three key regions involved in memory consolidation and episodic memory processes.

      This study's findings are timely and highly relevant to the field. The authors' extensive data collection, involving 107 participants sleeping in an fMRI scanner while undergoing simultaneous EEG recording, deserves special recognition. If shared, this unique dataset could lead to further valuable insights. While the conclusions seem overall well supported by the data, some aspects with regard to the detection of sleep oscillations need clarification.

      The authors report that coupled SO-spindle events were most frequent during NREM sleep (2.46 ± 0.06 events/min), but they also observed a surprisingly high occurrence of these events during N1 and REM sleep (2.23 ± 0.09 and 2.32 ± 0.09 events/min, respectively), where SO-spindle coupling would not typically be expected. Combined with the relatively modest SO amplitudes reported (~25 µV, whereas >75 µV would be expected when using mastoids as reference electrodes), this raises the possibility that the parameters used for event detection may not have been conservative enough, or that sleep staging was inaccurately performed. This issue could present a significant challenge, as the fMRI findings are largely dependent on the reliability of these detected events.

    4. Reviewer #3 (Public review):

      Summary:

      Wang et al. examined the brain activity patterns during sleep, especially when locked to canonical sleep rhythms such as SOs, spindles, and their coupling. Analyzing data from a large sample, the authors found significant coupling between spindles and SOs, particularly during the upstate of the SO. Moreover, the authors examined the patterns of whole-brain activity locked to these sleep rhythms. To understand the functional significance of these brain activities, the authors further conducted open-ended cognitive state decoding and found that a variety of cognitive processes may be involved during SO-spindle coupling and during other sleep events. The authors next conducted functional connectivity analyses and found enhanced connectivity between the hippocampus, the thalamus, and the medial PFC. These results reinforced the theoretical model of sleep-dependent memory consolidation, such that SO-spindle coupling is conducive to systems-level memory reactivation and consolidation.

      Strengths:

      There are obvious strengths in this work, including the large sample size, state-of-the-art neuroimaging and neural oscillation analyses, and the richness of results.

      Weaknesses:

      Despite these strengths and the insights gained, there are weaknesses in the design, the analyses, and inferences.

      A repeating statement in the manuscript is that brain activity could indicate memory reactivation and thus consolidation. This is indeed a highly relevant question that could be informed by the current data/results. However, an inherent weakness of the design is that there is no memory task before and after sleep. Thus, it is difficult (if not impossible) to make a strong argument linking SO/spindle/coupling-locked brain activity with memory reactivation or consolidation.

      Relatedly, to understand the functional implications of the sleep rhythm-locked brain activity, the authors employed the "open-ended cognitive state decoding" method. While this method is interesting, it is rather indirect given that there were no behavioral indices in the manuscript. Thus, discussions based on these analyses are speculative at best. Please either tone down the language or find additional evidence to support these claims.

      Moreover, the results from this method are difficult to understand. Figure 3e showed that for all three types of sleep events (SO, spindle, SO-spindle), the same mental states (e.g., working memory, episodic memory, declarative memory) showed opposite directions of activation (left and right panels showed negative and positive activation, respectively). How to interpret these conflicting results? This ambiguity is also reflected by the term used: declarative memory and episodic memories are both indexed in the results. Yet these two processes can be largely overlapped. So which specific memory processes do these brain activity patterns reflect? The Discussion shall discuss these results and the limitations of this method.

      The coupling strength is somehow inconsistent with prior results (Hahn et al., 2020, eLife, Helfrich et al., 2018, Neuron). Specifically, Helfrich et al. showed that among young adults, the spindle is coupled to the peak of the SO. Here, the authors reported that the spindles were coupled to down-to-up transitions of SO and before the SO peak. It is possible that participants' age may influence the coupling (see Helfrich et al., 2018). Please discuss the findings in the context of previous research on SO-spindle coupling.

      The discussion is rather superficial with only two pages, without delving into many important arguments regarding the possible functional significance of these results. For example, the authors wrote, "This internal processing contrasts with the brain patterns associated with external tasks, such as working memory." This statement comes without any references to working memory and without delineating why WM is considered an external task, even though working memory operations can be internal. Similarly, for the interesting results on SO and reduced DMN activity, the authors wrote "The DMN is typically active during wakeful rest and is associated with self-referential processes like mind-wandering, daydreaming, and task representation (Yeshurun, Nguyen, & Hasson, 2021). Its reduced activity during SOs may signal a shift towards endogenous processes such as memory consolidation." This argument is flawed. DMN is active during self-referential processing and mind-wandering, i.e., when the brain shifts from external stimuli processing to internal mental processing. During sleep, endogenous memory reactivation and consolidation are also part of the internal mental processing given the lack of external environmental stimulation. So why would DMN activity be reduced during SOs or during memory consolidation? Were there differences in DMN activity between SO and SO-spindle coupling events?

    5. Author response:

      Reviewer #1 (Public review):

      Wang et al., recorded concurrent EEG-fMRI in 107 participants during nocturnal NREM sleep to investigate brain activity and connectivity related to slow oscillations (SO), sleep spindles, and in particular their co-occurrence. The authors found SO-spindle coupling to be correlated with increased thalamic and hippocampal activity, and with increased functional connectivity from the hippocampus to the thalamus and from the thalamus to the neocortex, especially the medial prefrontal cortex (mPFC). They concluded the brain-wide activation pattern to resemble episodic memory processing, but to be dissociated from task-related processing and suggest that the thalamus plays a crucial role in coordinating the hippocampal-cortical dialogue during sleep.

      The paper offers an impressively large and highly valuable dataset that provides the opportunity for gaining important new insights into the network substrate involved in SOs, spindles, and their coupling. However, the paper does unfortunately not exploit the full potential of this dataset with the analyses currently provided, and the interpretation of the results is often not backed up by the results presented. I have the following specific comments.

      Thank you for your thoughtful and constructive feedback. We greatly appreciate your recognition of the strengths of our dataset and findings. Below, we address your specific comments and provide responses to each point you raised to ensure our methods and results are as transparent and comprehensible as possible. We hope these revisions address your comments and further strengthen our manuscript.

      (1) The introduction is lacking sufficient review of the already existing literature on EEG-fMRI during sleep and the BOLD-correlates of slow oscillations and spindles in particular (Laufs et al., 2007; Schabus et al., 2007; Horovitz et al., 2008; Laufs, 2008; Czisch et al., 2009; Picchioni et al., 2010; Spoormaker et al., 2010; Caporro et al., 2011; Bergmann et al., 2012; Hale et al., 2016; Fogel et al., 2017; Moehlman et al., 2018; Ilhan-Bayrakci et al., 2022). The few studies mentioned are not discussed in terms of the methods used or insights gained.

      We acknowledge the need for a more comprehensive review of prior EEG-fMRI studies investigating BOLD correlates of slow oscillations and spindles. However, not all of these articles concern sleep SOs or spindles: several (Hale et al., 2016; Horovitz et al., 2008; Laufs, 2008; Laufs, Walker, & Lund, 2007; Spoormaker et al., 2010) mainly focus on EEG-fMRI methodology, sleep stages, or brain networks, which are not the focus of our study. Thank you again for your attention to the comprehensiveness of our literature review. We will expand the introduction to include a more detailed discussion of the existing literature, ensuring that the contributions of previous EEG-fMRI sleep studies are adequately acknowledged.

      Introduction, Page 4 Lines 62-76

      “Investigating these sleep-related neural processes in humans is challenging because it requires tracking transient sleep rhythms while simultaneously assessing their widespread brain activation. Recent advances in simultaneous EEG-fMRI techniques provide a unique opportunity to explore these processes. EEG allows for precise event-based detection of neural signal, while fMRI provides insight into the broader spatial patterns of brain activation and functional connectivity (Horovitz et al., 2008; Huang et al., 2024; Laufs, 2008; Laufs, Walker, & Lund, 2007; Schabus et al., 2007; Spoormaker et al., 2010). Previous EEG-fMRI studies on sleep have focused on classifying sleep stages or examining the neural correlates of specific waves (Bergmann et al., 2012; Caporro et al., 2012; Czisch et al., 2009; Fogel et al., 2017; Hale et al., 2016; Ilhan-Bayrakcı et al., 2022; Moehlman et al., 2019; Picchioni et al., 2011). These studies have generally reported that slow oscillations are associated with widespread cortical and subcortical BOLD changes, whereas spindles elicit activation in the thalamus, as well as in several cortical and paralimbic regions. Although these findings provide valuable insights into the BOLD correlates of sleep rhythms, they often do not employ sophisticated temporal modeling (Huang et al., 2024), to capture the dynamic interactions between different oscillatory events, e.g., the coupling between SOs and spindles.”

      (2) The paper falls short in discussing the specific insights gained into the neurobiological substrate of the investigated slow oscillations, spindles, and their interactions. The validity of the inverse inference approach ("Open ended cognitive state decoding"), assuming certain cognitive functions to be related to these oscillations because of the brain regions/networks activated in temporal association with these events, is debatable at best. It is also unclear why eventually only episodic memory processing-like brain-wide activation is discussed further, despite the activity of 16 of 50 feature terms from the NeuroSynth v3 dataset were significant (episodic memory, declarative memory, working memory, task representation, language, learning, faces, visuospatial processing, category recognition, cognitive control, reading, cued attention, inhibition, and action).

      Thank you for pointing this out, particularly regarding the use of inverse inference approaches such as “open-ended cognitive state decoding.” Given the concerns about the indirectness of this approach, we decided to remove its related content and results from Figure 3 in the main text and include it in Supplementary Figure 7. We will refocus the main text on direct neurobiological insights gained from our EEG-fMRI analyses, particularly emphasizing the hippocampal-thalamocortical network dynamics underlying SO-spindle coupling, and we will acknowledge the exploratory nature of these findings and highlight their limitations.

      Discussion, Page 17-18 Lines 323-332

      “To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”

      (3) Hippocampal activation during SO-spindles is stated as a main hypothesis of the paper - for good reasons - however, other regions (e.g., several cortical as well as thalamic) would be equally expected given the known origin of both oscillations and the existing sleep-EEG-fMRI literature. However, this focus on the hippocampus contrasts with the focus on investigating the key role of the thalamus instead in the Results section.

      We appreciate your insight regarding the relative emphasis on hippocampal and thalamic activation in our study. We recognize that the manuscript may currently present an inconsistency between our initial hypothesis and the main focus of the results. To address this concern, we will ensure that our Introduction and Discussion sections explicitly discuss both regions, highlighting the complementary roles of the hippocampus (memory processing and reactivation) and the thalamus (spindle generation and cortico-hippocampal coordination) in SO-spindle dynamics.

      Introduction, Page 5 Lines 87-103

      “To address this gap, our study investigates brain-wide activation and functional connectivity patterns associated with SO-spindle coupling, and employs a cognitive state decoding approach (Margulies et al., 2016; Yarkoni et al., 2011)—albeit indirectly—to infer potential cognitive functions. In the current study, we used simultaneous EEG-fMRI recordings during nocturnal naps (detailed sleep staging results are provided in the Methods and Table S1) in 107 participants. Although directly detecting hippocampal ripples using scalp EEG or fMRI is challenging, we expected that hippocampal activation in fMRI would coincide with SO-spindle coupling detected by EEG, given that SOs, spindles, and ripples frequently co-occur during NREM sleep. We also anticipated a critical role of the thalamus, particularly thalamic spindles, in coordinating hippocampal-cortical communication.

      We found significant coupling between SOs and spindles during NREM sleep (N2/3), with spindle peaks occurring slightly before the SO peak. This coupling was associated with increased activation in both the thalamus and hippocampus, with functional connectivity patterns suggesting thalamic coordination of hippocampal-cortical communication. These findings highlight the key role of the thalamus in coordinating hippocampal-cortical interactions during human sleep and provide new insights into the neural mechanisms underlying sleep-dependent brain communication. A deeper understanding of these mechanisms may contribute to future neuromodulation approaches aimed at enhancing sleep-dependent cognitive function and treating sleep-related disorders.”

      Discussion, Page 16-17 Lines 292-307

      “When modeling the timing of these sleep rhythms in the fMRI, we observed hippocampal activation selectively during SO-spindle events. This suggests the possibility of triple coupling (SOs–spindles–ripples), even though our scalp EEG was not sufficiently sensitive to detect hippocampal ripples—key markers of memory replay (Buzsáki, 2015). Recent iEEG evidence indicates that ripples often co-occur with both spindles (Ngo, Fell, & Staresina, 2020) and SOs (Staresina et al., 2015; Staresina et al., 2023). Therefore, the hippocampal involvement during SO-spindle events in our study may reflect memory replay from the hippocampus, propagated via thalamic spindles to distributed cortical regions.

      The thalamus, known to generate spindles (Halassa et al., 2011), plays a key role in producing and coordinating sleep rhythms (Coulon, Budde, & Pape, 2012; Crunelli et al., 2018), while the hippocampus is found essential for memory consolidation (Buzsáki, 2015; Diba & Buzsáki, 2007; Singh, Norman, & Schapiro, 2022). The increased hippocampal and thalamic activity, along with strengthened connectivity between these regions and the mPFC during SO-spindle events, underscores a hippocampal-thalamic-neocortical information flow. This aligns with recent findings suggesting the thalamus orchestrates neocortical oscillations during sleep (Schreiner et al., 2022). The thalamus and hippocampus thus appear central to memory consolidation during sleep, guiding information transfer to the neocortex, e.g., mPFC.”

      (4) The study included an impressive number of 107 subjects. It is surprising though that only 31 subjects had to be excluded under these difficult recording conditions, especially since no adaptation night was performed. Since only subjects were excluded who slept less than 10 min (or had excessive head movements) there are likely several datasets included with comparably short durations and only a small number of SOs and spindles and even less combined SO-spindle events. A comprehensive table should be provided (supplement) including for each subject (included and excluded) the duration of included NREM sleep, number of SOs, spindles, and SO+spindle events. Also, some descriptive statistics (mean/SD/range) would be helpful.

      We appreciate your recognition of our sample size and the challenges associated with simultaneous EEG-fMRI sleep recordings. We acknowledge the importance of transparently reporting individual subject data, particularly regarding sleep duration and the number of detected SOs, spindles, and SO-spindle events. To address this, we will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics (Table S1), as well as detailed information about sleep waves at each sleep stage for all 107 subjects (Tables S2-S4), listing for each subject: (1) duration of each sleep stage; (2) number of detected SOs; (3) number of detected spindles; (4) number of detected SO-spindle coupling events; (5) density of detected SOs; (6) density of detected spindles; (7) density of detected SO-spindle coupling events.

      However, most of the excluded participants were unable to fall asleep or slept too briefly and therefore had essentially no NREM sleep, so NREM sleep duration and the numbers of SOs, spindles, and coupling events could not be computed for them.

      Supplementary Materials, Page 42-54, Table S1-S4

      (Given their length, we do not reproduce the tables here. Please refer to the revised manuscript.)

      (5) Was the 20-channel head coil dedicated for EEG-fMRI measurements? How were the electrode cables guided through/out of the head coil? Usually, the 64-channel head coil is used for EEG-fMRI measurements in a Siemens PRISMA 3T scanner, which has a cable duct at the back that allows to guide the cables straight out of the head coil (to minimize MR-related artifacts). The choice for the 20-channel head coil should be motivated. Photos of the recording setup would also be helpful.

      Thank you for your comment regarding our choice of the 20-channel head coil for EEG-fMRI measurements. We acknowledge that the 64-channel head coil is commonly used in Siemens PRISMA 3T scanners; however, the 20-channel coil was selected due to specific practical and technical considerations in our study. In particular, the 20-channel head coil was compatible with our EEG system and ensured sufficient signal-to-noise ratio (SNR) for both EEG and fMRI acquisition. The EEG electrode cables were guided through the lateral and posterior openings of the head coil, secured with foam padding to reduce motion and minimize MR-related artifacts. Moreover, given the extended nature of nocturnal sleep recordings, the 20-channel coil allowed us to maintain participant comfort while still achieving high-quality simultaneous EEG-fMRI data.

      We have made this clearer in the revised manuscript.

      Methods, Page 20 Lines 385-392

      “All MRI data were acquired using a 20-channel head coil on a research-dedicated 3-Tesla Siemens Magnetom Prisma MRI scanner. Earplugs and cushions were provided for noise protection and head motion restriction. We chose the 20-channel head coil because it was compatible with our EEG system and ensured sufficient signal-to-noise ratio (SNR) for both EEG and fMRI acquisition. The EEG electrode cables were guided through the lateral and posterior openings of the head coil, secured with foam padding to reduce motion and minimize MR-related artifacts. Moreover, given the extended nature of nocturnal sleep recordings, the 20-channel coil helped maintain participant comfort while still achieving high-quality simultaneous EEG-fMRI data.”

      (6) Was the EEG sampling synchronized to the MR scanner (gradient system) clock (the 10 MHz signal; not referring to the volume TTL triggers here)? This is a requirement for stable gradient artifact shape over time and thus accurate gradient noise removal.

      Thank you for raising this important point. We confirm that the EEG sampling was synchronized to the MR scanner’s 10 MHz gradient system clock, ensuring a stable gradient artifact shape over time and enabling accurate artifact removal. This synchronization was achieved using the standard clock synchronization interface of the EEG amplifier, minimizing timing jitter and drift. As a result, the gradient artifact waveform remained stable across volumes, allowing for more effective artifact correction during preprocessing. We appreciate your attention to this critical aspect of EEG-fMRI data acquisition.

      We have made this clearer in the revised manuscript.

      Methods, Page 19-20 Lines 371-383

      “EEG was recorded simultaneously with fMRI data using an MR-compatible EEG amplifier system (BrainAmps MR-Plus, Brain Products, Germany), along with a specialized electrode cap. The recording was done using 64 channels in the international 10/20 system, with the reference channel positioned at FCz. In order to adhere to polysomnography (PSG) recording standards, six electrodes were removed from the EEG cap: one for electrocardiogram (ECG) recording, two for electrooculogram (EOG) recording, and three for electromyogram (EMG) recording. EEG data was recorded at a sample rate of 5000 Hz, the resistance of the reference and ground channels was kept below 10 kΩ, and the resistance of the other channels was kept below 20 kΩ. To synchronize the EEG and fMRI recordings, the BrainVision recording software (BrainProducts, Germany) was utilized to capture triggers from the MRI scanner. The EEG sampling was synchronized to the MR scanner’s 10 MHz gradient system clock, ensuring a stable gradient artifact shape over time and enabling accurate artifact removal. This was achieved via the standard clock synchronization interface of the EEG amplifier, minimizing timing jitter and drift.”

      (7) The TR is quite long and the voxel size is quite large in comparison to state-of-the-art EPI sequences. What was the rationale behind choosing a sequence with relatively low temporal and spatial resolution?

      We acknowledge that our chosen TR and voxel size are relatively long and large compared to state-of-the-art EPI sequences. This decision was made to optimize the signal-to-noise ratio (SNR) and reduce susceptibility-related distortions, which are particularly critical in EEG-fMRI sleep studies where head motion and physiological noise can be substantial. A longer TR allowed us to sample whole-brain activity with sufficient coverage, while a larger voxel size helped enhance BOLD sensitivity and minimize partial volume effects in deep brain structures such as the thalamus and hippocampus, which are key regions of interest in our study. We appreciate your concern and hope this clarification provides sufficient rationale for our sequence parameters.

      We have made this clearer in the revised manuscript.

      Methods, Page 20-21 Lines 398-408

      “Then, the “sleep” session began after the participants were instructed to try and fall asleep. For the functional scans, whole-brain images were acquired using k-space and steady-state T2*-weighted gradient echo-planar imaging (EPI) sequence that is sensitive to the BOLD contrast. This measures local magnetic changes caused by changes in blood oxygenation that accompany neural activity (sequence specification: 33 slices in interleaved ascending order, TR = 2000 ms, TE = 30 ms, voxel size = 3.5 × 3.5 × 4.2 mm<sup>3</sup>, FA = 90°, matrix = 64 × 64, gap = 0.7 mm). A relatively long TR and larger voxel size were chosen to optimize SNR and reduce susceptibility-related distortions, which are critical in EEG-fMRI sleep studies where head motion and physiological noise can be substantial. The longer TR allowed whole-brain coverage with sufficient temporal resolution, while the larger voxel size helped enhance BOLD sensitivity and minimize partial volume effects in deep brain structures (e.g., the thalamus and hippocampus), which are key regions of interest in this study.”

      (8) The anatomically defined ROIs are quite large. It should be elaborated on how this might reduce sensitivity to sleep rhythm-specific activity within sub-regions, especially for the thalamus, which has distinct nuclei involved in sleep functions.

      We appreciate your insight regarding the use of anatomically defined ROIs and their potential limitations in detecting sleep rhythm-specific activity within sub-regions, particularly in the thalamus. Given the distinct functional roles of thalamic nuclei in sleep processes, we acknowledge that using a single, large thalamic ROI may reduce sensitivity to localized activity patterns. To address this, we will discuss this limitation in the revised manuscript, acknowledging that our approach prioritizes whole-structure effects but may not fully capture nucleus-specific contributions.

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      (9) The study reports SO & spindle amplitudes & densities, as well as SO+spindle coupling, to be larger during N2/3 sleep compared to N1 and REM sleep, which is trivial but can be seen as a sanity check of the data. However, the amount of SOs and spindles reported for N1 and REM sleep is concerning, as per definition there should be hardly any (if SOs or spindles occur in N1 it becomes by definition N2, and the interval between spindles has to be considerably large in REM to still be scored as such). Thus, on the one hand, the report of these comparisons takes too much space in the main manuscript as it is trivial, but on the other hand, it raises concerns about the validity of the scoring.

      We appreciate your concern regarding the reported presence of SOs and spindles in N1 and REM sleep and its potential implications. Our methods for detecting SOs, spindles, and their coupling were originally designed only for N2 and N3 sleep data, based on the characteristics of those data, and are widely recognized and used in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because the detection methods for SOs and spindles are percentile-based, they will always detect a certain number of events when applied to data from other stages (N1 and REM), and how these events differ from those detected in N2/N3 remains unclear. We will explain the reasons for these results in the Methods section and emphasize that they are used only as sanity checks.

      Methods, Page 25 Lines 515-524

      “We note that the above methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).”
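
      The point about percentile-based detection can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `detect_by_percentile`, the 75th-percentile cutoff, and the simulated amplitudes are all hypothetical and chosen only for illustration. The sketch shows that a threshold defined relative to the recording itself flags the same fraction of candidate waves in any stage, even when the absolute amplitudes are much smaller.

```python
import numpy as np

def detect_by_percentile(amplitudes, percentile=75.0):
    """Return indices of candidate waves whose amplitude exceeds the
    given percentile of the same recording (illustrative cutoff)."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    threshold = np.percentile(amplitudes, percentile)
    return np.flatnonzero(amplitudes > threshold)

rng = np.random.default_rng(0)
# Simulated candidate-wave amplitudes in arbitrary units (not real data).
n2_like = rng.normal(loc=80.0, scale=20.0, size=400)   # large waves
rem_like = rng.normal(loc=15.0, scale=5.0, size=400)   # small waves

n2_events = detect_by_percentile(n2_like)
rem_events = detect_by_percentile(rem_like)

# Both stages yield the same number of "events" because the threshold
# is relative, although the flagged amplitudes differ greatly.
print(len(n2_events), len(rem_events))
print(n2_like[n2_events].mean(), rem_like[rem_events].mean())
```

      Under these assumptions the event count alone says little about whether the flagged waves in N1 or REM resemble genuine SOs or spindles, which is consistent with treating those stages only as a sanity check.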

      (10) Why was electrode F3 used to quantify the occurrence of SOs and spindles? Why not a midline frontal electrode like Fz (or a number of frontal electrodes for SOs) and Cz (or a number of centroparietal electrodes) for spindles to be closer to their maximum topography?

      We appreciate your suggestion regarding electrode selection for SO and spindle quantification. Our choice of F3 was primarily based on previous studies (Massimini et al., 2004; Molle et al., 2011), where bilateral frontal electrodes are commonly used for detecting SOs and spindles. Additionally, we considered the impact of MRI-related noise and, after a comprehensive evaluation, determined that F3 provided an optimal balance between signal quality and artifact minimization. We also acknowledge that alternative electrode choices, such as Fz for SOs and Cz for spindles, could provide additional insights into their topographical distributions.

      (11) Functional connectivity (hippocampus -> thalamus -> cortex (mPFC)) is reported to be increased during SO-spindle coupling and interpreted as evidence for coordination of hippocampo-neocortical communication likely by thalamic spindles. However, functional connectivity was only analysed during coupled SO+spindle events, not during isolated SOs or isolated spindles. Without the direct comparison of the connectivity patterns between these three events, it remains unclear whether this is specific for coupled SO+spindle events or rather associated with one or both of the other isolated events. The PPIs need to be conducted for those isolated events as well and compared statistically to the coupled events.

      We appreciate your critical perspective on our functional connectivity analysis and the interpretation of hippocampus-thalamus-cortex (mPFC) interactions during SO-spindle coupling. We acknowledge that, in the original analysis, functional connectivity was only examined during coupled SO-spindle events, without direct comparison to isolated SOs or isolated spindles. To address this concern, we have conducted PPI analyses for all three ROIs (hippocampus, thalamus, mPFC) and all three event types (SO-spindle couplings, isolated SOs, and isolated spindles). Our results indicate that neither isolated SOs nor isolated spindles yielded significant connectivity changes in any of the three ROIs, as none survived multiple comparison correction. This suggests that the observed connectivity increase is specific to SO-spindle coupling, rather than being independently driven by either SOs or spindles alone.

      Results, Page 14 Lines 248-255

      “Crucially, the interaction between FC and SO-spindle coupling revealed that only the functional connectivity of hippocampus -> thalamus (ROI analysis, t<sub>(106)</sub> = 1.86, p = 0.0328) and thalamus -> mPFC (ROI analysis, t<sub>(106)</sub> = 1.98, p = 0.0251) significantly increased during SO-spindle coupling, with no significant changes in all other pathways (Fig. 4e). We also conducted PPI analyses for the other two events (SOs and spindles), and neither yielded significant connectivity changes in the three ROIs, as all failed to survive whole-brain FWE correction at the cluster level (p < 0.05). Together, these findings suggest that the thalamus, likely via spindles, coordinates hippocampal-cortical communication selectively during SO-spindle coupling, but not isolated SOs or spindle events alone.”
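
      For readers unfamiliar with PPI, the interaction term at the heart of such an analysis can be sketched as follows. This is a deliberately simplified illustration, not the authors' implementation: the function names, the simulated time series, and the effect sizes are hypothetical, and the handling of the haemodynamic response that full PPI pipelines include is omitted.

```python
import numpy as np

def ppi_design(seed_ts, event_regressor):
    """Build a design matrix with intercept, seed (physiological) term,
    event (psychological) term, and their product (the PPI term)."""
    seed = seed_ts - seed_ts.mean()
    psych = event_regressor - event_regressor.mean()
    interaction = seed * psych
    return np.column_stack([np.ones_like(seed), seed, psych, interaction])

def fit_ols(design, target_ts):
    """Ordinary least-squares estimate of the regression weights."""
    betas, *_ = np.linalg.lstsq(design, target_ts, rcond=None)
    return betas

rng = np.random.default_rng(1)
n_volumes = 600
seed_ts = rng.normal(size=n_volumes)                     # e.g. a seed ROI signal
events = (rng.random(n_volumes) < 0.3).astype(float)     # 1 = event present

design = ppi_design(seed_ts, events)
# Simulated target ROI: weak baseline coupling (0.2) plus stronger
# coupling when events are present (interaction weight 0.8).
true_betas = np.array([0.0, 0.2, 0.0, 0.8])
target_ts = design @ true_betas + 0.1 * rng.normal(size=n_volumes)

betas = fit_ols(design, target_ts)
print(betas[3])  # estimated PPI (interaction) weight
```

      A positive interaction weight means that seed-target coupling is stronger when the event is present, which is the kind of effect the comparison across coupled and isolated events is meant to test.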

      (12) The limited temporal resolution of fMRI does indeed not allow for easily distinguishing between fMRI activation patterns related to SO-up- vs. SO-down-states. For this, one could try to extract the amplitudes of SO-up- and SO-down-states separately for each SO event and model them as two separate parametric modulators (with the risk of collinearity as they are likely correlated).

      We appreciate your insightful comment regarding the challenge of distinguishing fMRI activation patterns related to SO-up vs. SO-down states due to the limited temporal resolution of fMRI. While our current analysis does not differentiate between these two phases, we acknowledge that separately modeling SO-up and SO-down states using parametric modulators could provide a more refined understanding of their distinct neural correlates. However, as you note, this approach carries the risk of collinearity, and there is indeed a high correlation between the two amplitudes across all subjects in our results (r = 0.98). Future studies could address this by leveraging high-temporal-resolution techniques. While implementing this in the current study is beyond our scope, we will acknowledge this limitation in the Discussion section.
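
      To give a sense of how severe a correlation of r = 0.98 is for a two-modulator model, the variance inflation factor (VIF) can be computed. This is a short worked example rather than part of the reported analysis; the function name is hypothetical, and the formula VIF = 1 / (1 - r²) applies to the case of two correlated regressors.

```python
def variance_inflation(r):
    """Variance inflation factor for two regressors with Pearson
    correlation r: VIF = 1 / (1 - r**2)."""
    return 1.0 / (1.0 - r ** 2)

vif = variance_inflation(0.98)
print(round(vif, 2))  # about 25.25
```

      A VIF of roughly 25 means the variance of each modulator's estimate would be inflated about 25-fold relative to uncorrelated regressors, which supports the view that the two states cannot be separated reliably in this design.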

      Discussion, Page 17 Lines 308-322

      “An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given that it occurs during deep sleep. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained if modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from the DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.”

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      (13) L327: "It is likely that our findings of diminished DMN activity reflect brain activity during the SO DOWN-state, as this state consistently shows higher amplitude compared to the UP-state within subjects, which is why we modelled the SO trough as its onset in the fMRI analysis." This conclusion is not justified as the fact that SO down-states are larger in amplitude does not mean their impact on the BOLD response is larger.

      We appreciate your concern regarding our interpretation of diminished DMN activity as reflecting the SO down-state. We acknowledge that the original wording was somewhat misleading. Our interpretation is that the effect could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained if modelled at the SO peak instead). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from the DOWN-state. We will make this clear in the Discussion section.

      Discussion, Page 17 Lines 308-322

      “An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given that it occurs during deep sleep. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained if modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from the DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.”

      (14) Line 77: "In the current study, while directly capturing hippocampal ripples with scalp EEG or fMRI is difficult, we expect to observe hippocampal activation in fMRI whenever SOs-spindles coupling is detected by EEG, if SOs- spindles-ripples triple coupling occurs during human NREM sleep". Not all SO-spindle events are associated with ripples (Staresina et al., 2015), but hippocampal activation may also be expected based on the occurrence of spindles alone (Bergmann et al., 2012).

      We appreciate your clarification regarding the relationship between SO-spindle coupling and hippocampal ripples. We acknowledge that not all SO-spindle events are necessarily accompanied by ripples (Staresina et al., 2015). However, previous research indicates that hippocampal ripples are significantly more likely to occur during SO-spindle coupling events. This suggests that while ripple occurrence is not guaranteed, SO-spindle coupling creates a favorable network state for ripple generation and potential hippocampal activation. To ensure accuracy, we will delete this misleading sentence from the Introduction and acknowledge in the Discussion that we could not directly observe the triple coupling of SOs, spindles, and hippocampal ripples.

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      Reviewer #2 (Public review):

      In this study, Wang and colleagues aimed to explore brain-wide activation patterns associated with NREM sleep oscillations, including slow oscillations (SOs), spindles, and SO-spindle coupling events. Their findings reveal that SO-spindle events corresponded with increased activation in both the thalamus and hippocampus. Additionally, they observed that SO-spindle coupling was linked to heightened functional connectivity from the hippocampus to the thalamus, and from the thalamus to the medial prefrontal cortex-three key regions involved in memory consolidation and episodic memory processes.

      This study's findings are timely and highly relevant to the field. The authors' extensive data collection, involving 107 participants sleeping in an fMRI scanner while undergoing simultaneous EEG recording, deserves special recognition. If shared, this unique dataset could lead to further valuable insights. While the conclusions seem overall well supported by the data, some aspects with regard to the detection of sleep oscillations need clarification.

      The authors report that coupled SO-spindle events were most frequent during NREM sleep (2.46 ± 0.06 events/min), but they also observed a surprisingly high occurrence of these events during N1 and REM sleep (2.23 ± 0.09 and 2.32 ± 0.09 events/min, respectively), where SO-spindle coupling would not typically be expected. Combined with the relatively modest SO amplitudes reported (~25 µV, whereas >75 µV would be expected when using mastoids as reference electrodes), this raises the possibility that the parameters used for event detection may not have been conservative enough - or that sleep staging was inaccurately performed. This issue could present a significant challenge, as the fMRI findings are largely dependent on the reliability of these detected events.

      Thank you very much for your thorough and encouraging review. We appreciate your recognition of the significance and relevance of our study and dataset, particularly in highlighting how simultaneous EEG-fMRI recordings can provide complementary insights into the temporal dynamics of neural oscillations and their associated spatial activation patterns during sleep. In the sections that follow, we address each of your comments in detail. We have revised the text and conducted additional analyses wherever possible to strengthen our argument and clarify our methodological choices. We believe these revisions improve the clarity and rigor of our work, and we thank you for helping us refine it.

      We appreciate your insightful comments regarding the detection of sleep oscillations. Our methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM. We will acknowledge the reasons for these results in the Methods section and emphasize that they are used only for sanity checks.
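
The point that a percentile-based criterion will always return some events can be illustrated with a minimal sketch. It assumes, purely for illustration, that each stage supplies a list of candidate-event amplitudes and that detection keeps candidates above a within-stage percentile; the function names, the 75th-percentile cutoff, and the synthetic amplitudes are hypothetical and are not taken from the manuscript.

```python
import random

def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def detect(amplitudes, q=75.0):
    """Keep candidates whose amplitude exceeds the q-th percentile of the
    candidates from the same stage (a relative, not absolute, threshold)."""
    cutoff = percentile(amplitudes, q)
    return [a for a in amplitudes if a > cutoff]

random.seed(0)
# Synthetic candidate amplitudes in µV; the values are illustrative only.
stages = {
    "N2/N3": [random.gauss(40.0, 10.0) for _ in range(400)],
    "REM": [random.gauss(10.0, 3.0) for _ in range(400)],
}

kept_fraction = {}
for name, amps in stages.items():
    kept = detect(amps)
    kept_fraction[name] = len(kept) / len(amps)
    # Same fraction survives in both stages, although the surviving
    # REM "events" are far smaller in absolute amplitude.
    print(name, len(kept), round(min(kept), 1))
```

Under these assumptions the retained fraction is the same in both stages, so the event count alone says little about whether the retained events are physiologically comparable across stages.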

      Regarding the reported SO amplitudes (~25 µV), during preprocessing we applied the Signal Space Projection (SSP) method to more effectively remove MRI gradient artifacts and cardiac pulse noise. While this approach enhances data quality, it also reduces overall signal power, leading to systematically lower reported amplitudes. Despite this, the SOs we detected in NREM sleep (especially N2/N3) remain physiologically meaningful and are consistent with previous fMRI studies using similar artifact removal techniques. We appreciate your careful evaluation and valuable suggestions.

      In addition, we will provide comprehensive tables in the supplementary materials, containing descriptive information about sleep-related characteristics (Table S1) as well as detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: (1) Different sleep stage duration; (2) Number of detected SOs; (3) Number of detected spindles; (4) Number of detected SO-spindle coupling events; (5) Density of detected SOs; (6) Density of detected spindles; (7) Density of detected SO-spindle coupling events.

      Methods, Page 25 Lines 515-524

      “We note that the above methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).”

      Supplementary Materials, Page 42-54, Table S1-S4

      (Given their length, we do not list all the tables here. Please refer to the revised manuscript.)

      Reviewer #3 (Public review):

      Summary:

      Wang et al. examined the brain activity patterns during sleep, especially when locked to canonical sleep rhythms such as SOs, spindles, and their coupling. Analyzing data from a large sample, the authors found significant coupling between spindles and SOs, particularly during the upstate of the SO. Moreover, the authors examined the patterns of whole-brain activity locked to these sleep rhythms. To understand the functional significance of these brain activities, the authors further conducted open-ended cognitive state decoding and found that a variety of cognitive processes may be involved during SO-spindle coupling and during other sleep events. The authors next conducted functional connectivity analyses and found enhanced connectivity between the hippocampus, the thalamus, and the medial PFC. These results reinforced the theoretical model of sleep-dependent memory consolidation, such that SO-spindle coupling is conducive to systems-level memory reactivation and consolidation.

      Strengths:

      There are obvious strengths in this work, including the large sample size, state-of-the-art neuroimaging and neural oscillation analyses, and the richness of results.

      Weaknesses:

      Despite these strengths and the insights gained, there are weaknesses in the design, the analyses, and inferences.

      Thank you for your detailed and thoughtful review of our manuscript. We are delighted that you recognize our advanced analysis methods, the richness of our neuroimaging and neural oscillation results, and the large sample size. In the following sections, we provide detailed responses to each of your comments. We have revised the text and conducted additional analyses to strengthen our arguments and clarify our methodological choices. We believe these revisions enhance the clarity and rigor of our work, and we sincerely appreciate your thoughtful feedback in helping us refine the manuscript.

      (1) A repeating statement in the manuscript is that brain activity could indicate memory reactivation and thus consolidation. This is indeed a highly relevant question that could be informed by the current data/results. However, an inherent weakness of the design is that there is no memory task before and after sleep. Thus, it is difficult (if not impossible) to make a strong argument linking SO/spindle/coupling-locked brain activity with memory reactivation or consolidation.

      We appreciate your suggestion regarding the lack of a pre- and post-sleep memory task in our study design. We acknowledge that, in the absence of behavioral measures, it is hard to directly link SO-spindle coupling to memory consolidation in an outcome-driven manner. Our interpretation is instead based on the well-established role of these oscillations in memory processes, as demonstrated in previous studies. We sincerely appreciate this feedback and will adjust our Discussion accordingly to reflect a more precise interpretation of our findings.

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      (2) Relatedly, to understand the functional implications of the sleep rhythm-locked brain activity, the authors employed the "open-ended cognitive state decoding" method. While this method is interesting, it is rather indirect given that there were no behavioral indices in the manuscript. Thus, discussions based on these analyses are speculative at best. Please either tone down the language or find additional evidence to support these claims.

      Moreover, the results from this method are difficult to understand. Figure 3e showed that for all three types of sleep events (SO, spindle, SO-spindle), the same mental states (e.g., working memory, episodic memory, declarative memory) showed opposite directions of activation (left and right panels showed negative and positive activation, respectively). How to interpret these conflicting results? This ambiguity is also reflected by the term used: declarative memory and episodic memories are both indexed in the results. Yet these two processes can be largely overlapped. So which specific memory processes do these brain activity patterns reflect? The Discussion shall discuss these results and the limitations of this method.

      We appreciate your critical assessment of the open-ended cognitive state decoding method and its interpretational challenges. Given the concerns about the indirectness of this approach, we decided to remove its related content and results from Figure 3 in the main text and include it in Supplementary Figure 7.

      Due to the complexity of memory-related processes, we acknowledge that distinguishing between episodic and declarative memory based solely on this approach is not straightforward. We will revise the Supplementary Materials to explicitly discuss these limitations and clarify that our findings do not isolate specific cognitive processes but rather suggest general associations with memory-related networks.

      Discussion, Page 17-18 Lines 323-332

      “To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”

      (3) The coupling strength is somehow inconsistent with prior results (Hahn et al., 2020, eLife, Helfrich et al., 2018, Neuron). Specifically, Helfrich et al. showed that among young adults, the spindle is coupled to the peak of the SO. Here, the authors reported that the spindles were coupled to down-to-up transitions of SO and before the SO peak. It is possible that participants' age may influence the coupling (see Helfrich et al., 2018). Please discuss the findings in the context of previous research on SO-spindle coupling.

      We appreciate your concern regarding the temporal characteristics of SO-spindle coupling. We acknowledge that the SO-spindle coupling phase results in our study are not identical to those reported by Hahn et al. (2020) and Helfrich et al. (2018). However, these differences may arise from slight variations in event detection parameters, which can influence the precise phase estimation of coupling. Notably, Hahn et al. (2020) also reported slight discrepancies in their group-level coupling phase results, highlighting that methodological differences can contribute to variability across studies. Furthermore, our findings are consistent with those of Schreiner et al. (2021), further supporting the robustness of our observations.

      That said, we acknowledge that our original description of SO-spindle coupling as occurring at the "transition from the lower state to the upper state" was not entirely precise. The -π/2 phase represents the true transition point, while our observed coupling phase is actually closer to the SO peak rather than strictly at the transition. We will revise this statement in the manuscript to ensure clarity and accuracy in describing the coupling phase.
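
The phase landmarks mentioned above can be made concrete with a small sketch. It assumes the common convention in which the SO is treated as a cosine of its phase, so that 0 marks the UP-state peak, ±π the DOWN-state trough, and -π/2 the down-to-up transition; the helper names and the example phases are hypothetical and are not the study's data.

```python
import math

def circular_mean(phases):
    """Circular mean (radians, in (-pi, pi]) of a list of phase angles."""
    s = sum(math.sin(p) for p in phases)
    c = sum(math.cos(p) for p in phases)
    return math.atan2(s, c)

def nearest_landmark(phase):
    """Name the SO landmark closest to a phase on the circle."""
    landmarks = {
        "down-to-up transition": -math.pi / 2,
        "UP-state peak": 0.0,
    }
    # Angular distance wrapped into [0, pi].
    dist = {
        name: abs(math.atan2(math.sin(phase - ref), math.cos(phase - ref)))
        for name, ref in landmarks.items()
    }
    return min(dist, key=dist.get)

# Hypothetical SO phases (radians) at which spindle peaks occurred.
example_phases = [-0.6, -0.4, -0.5, -0.3, -0.7]
preferred = circular_mean(example_phases)
print(round(preferred, 2), nearest_landmark(preferred))
```

A preferred phase that is negative but much closer to 0 than to -π/2 corresponds to spindles peaking shortly before the SO peak rather than at the transition itself.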

      Discussion, Page 16 Lines 283-291

      “Our data provide insights into the neurobiological underpinnings of these sleep rhythms. SOs, originating mainly in neocortical areas such as the mPFC, alternate between DOWN- and UP-states. The thalamus generates sleep spindles, which in turn couple with SOs. Our finding that spindle peaks consistently occurred slightly before the UP-state peak of SOs (in 83 out of 107 participants), concurs with prior studies, including Schreiner et al. (2021). Yet it differs from some results suggesting spindles might peak right at the SO UP-state (Hahn et al., 2020; Helfrich et al., 2018). Such discrepancies could arise from differences in detection algorithms, participant age (Helfrich et al., 2018), or subtle variations in cortical-thalamic timing. Nonetheless, these results underscore the importance of coordinated SO-spindle interplay in supporting sleep-dependent processes.”

      (4) The discussion is rather superficial with only two pages, without delving into many important arguments regarding the possible functional significance of these results. For example, the author wrote, "This internal processing contrasts with the brain patterns associated with external tasks, such as working memory." This statement is made without any references to working memory and without delineating why WM is considered an external task, even though working memory operations can be internal. Similarly, for the interesting results on SO and reduced DMN activity, the authors wrote "The DMN is typically active during wakeful rest and is associated with self-referential processes like mind-wandering, daydreaming, and task representation (Yeshurun, Nguyen, & Hasson, 2021). Its reduced activity during SOs may signal a shift towards endogenous processes such as memory consolidation." This argument is flawed. DMN is active during self-referential processing and mind-wandering, i.e., when the brain shifts from external stimuli processing to internal mental processing. During sleep, endogenous memory reactivation and consolidation are also part of the internal mental processing given the lack of external environmental stimulation. So why would DMN activity be reduced during SOs or during memory consolidation? Were there differences in DMN activity between SO and SO-spindle coupling events?

      We appreciate your concerns regarding the brevity of the discussion and the need for clearer theoretical arguments. We will expand this section to provide more in-depth interpretations of our findings in the context of prior literature. Regarding working memory (WM), we acknowledge that our phrasing was ambiguous. We will modify this statement in the Discussion section.

      For the SO-related reduction in DMN activity, we recognize the need for a more precise explanation. This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of the DMN during SO events is related to a shift from internal cognition to external responses, given that it occurs during deep sleep. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained if modelled at the SO peak instead). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from the DOWN-state.

      To address your final question, we conducted an additional post hoc comparison of DMN activity between isolated SOs and SO-spindle coupling events. Our results indicate that DMN activation during SOs was significantly lower than during SO-spindle coupling (t<sub>(106)</sub> = -4.17, p < 1e-4). This suggests that SO-spindle coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. We appreciate your constructive feedback and will integrate these expanded analyses and discussions into our revised manuscript.
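
For readers unfamiliar with the notation, the sketch below shows how a within-subject comparison of this kind yields a t statistic with n - 1 degrees of freedom. It assumes a paired t-test, which is consistent with 106 degrees of freedom for 107 participants but is not stated explicitly in the response; the function name and the toy values are hypothetical.

```python
import math
import statistics

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Standard error of the mean difference.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical per-participant DMN estimates for a toy sample of five.
dmn_so = [-0.30, -0.10, -0.25, -0.40, -0.05]
dmn_coupling = [0.05, -0.15, 0.10, -0.10, 0.00]

t_stat, df = paired_t(dmn_so, dmn_coupling)
print(round(t_stat, 2), df)
```

A negative statistic here means the first condition (isolated SOs) has the lower mean, matching the direction of the reported effect.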

      Results, Page 11 Lines 199-208

      “Spindles were correlated with positive activation in the thalamus (ROI analysis, t<sub>(106)</sub> = 15.39, p < 1e-4), the anterior cingulate cortex (ACC), and the putamen, alongside deactivation in the DMN (Fig. 3c). Notably, SO-spindle coupling was linked to significant activation in both the thalamus (ROI analysis, t<sub>(106)</sub> = 3.38, p = 0.0005) and the hippocampus (ROI analysis, t<sub>(106)</sub> = 2.50, p = 0.0070, Fig. 3d). However, no decrease in DMN activity was found during SO-spindle coupling, and DMN activity during SO was significantly lower than during coupling (ROI analysis, t<sub>(106)</sub> = -4.17, p < 1e-4). For more detailed activation patterns, see Table S5-S7. We also varied the threshold used to detect SO events to assess its effect on hippocampal activation during SO-spindle coupling and observed that hippocampal activation remained significant when the percentile thresholds for SO detection ranged between 71% and 80% (see Fig. S6).”

      Discussion, Page 17-18 Lines 308-332

      “An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given it is during deep sleep time. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SOs events was obtained if modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.

      To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”

      Reviewing Editor Comment:

      The reviewers think that you are working on a relevant and important topic. They are praising the large sample size used in the study. The reviewers are not all in line regarding the overall significance of the findings, but they all agree the paper would strongly benefit from some extra work, as all reviewers raise various critical points that need serious consideration.

      We appreciate your recognition of the relevance and importance of our study, as well as your acknowledgment of the large sample size as a strength of our work. We understand that there are differing perspectives regarding the overall significance of our findings, and we value the constructive critiques provided. We are committed to addressing the key concerns raised by all reviewers, including refining our analyses, clarifying our interpretations, and incorporating additional discussions to strengthen the manuscript. Below, we address your specific recommendations and provide responses to each point you raised to ensure our methods and results are as transparent and comprehensible as possible. We believe that these revisions will significantly enhance the rigor and impact of our study, and we sincerely appreciate your thoughtful feedback in helping us improve our work.

      Reviewer #1 (Recommendations for the authors):

      (1) The phrase "overnight sleep" suggests an entire night, while these were rather "nocturnal naps". Please rephrase.

      Thank you for pointing this out. We have revised the phrasing in our manuscript to "nocturnal naps" instead of "overnight sleep" to more accurately reflect the duration of the sleep recordings.

      (2) Sleep staging results (macroscopic sleep architecture) should be provided in more detail (at least min and % of the different sleep stages, sleep onset latency, total sleep duration, total recording duration), at least mean/SD/range.

      Thank you for this suggestion. We will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics. This information will help provide a clearer overview of the macroscopic sleep architecture in our dataset.

      Supplementary Materials, Page 42, Table S1

      Author response table 1.

      Descriptive results of demographic information and sleep characteristics. Note: The total recorded time is equal to the awake time plus the total sleep time. The sleep onset latency is the time taken to reach the first sleep epoch. The sleep efficiency is the ratio of actual sleep time to total recording time.

      Reviewer #2 (Recommendations for the authors):

      In order to allow for a better estimation of the reliability of the detected sleep events, please:

      (1) Provide densities and absolute numbers of all detected SOs and spindles (N1, NREM, and REM sleep).

      Thank you for pointing this out. We will provide comprehensive tables in the supplementary materials containing detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: 1) Different sleep stage duration; 2) Number of detected SOs; 3) Number of detected spindles; 4) Number of detected SO-spindle coupling events; 5) Density of detected SOs; 6) Density of detected spindles; 7) Density of detected SO-spindle coupling events.

      Supplementary Materials, Page 43-54, Table S2-S4

      (Given their length, we do not list all the tables here. Please refer to the revised manuscript.)

      (2) Show ERPs for all detected SOs and spindles (per sleep stage).

      Thank you for the suggestion. We will provide ERPs for all detected SOs and spindles, separated by sleep stage (N1, N2&N3, and REM) in supplementary Fig. S2-S4. These ERP waveforms will help illustrate the characteristic temporal profiles of SOs and spindles across different sleep stages.

      Methods, Page 25, Line 525-532

      “Event-related potentials (ERP) analysis. After completing the detection of each sleep rhythm event, we performed ERP analyses for SOs, spindles, and coupling events in different sleep stages. Specifically, for SO events, we took the trough of the DOWN-state of each SO as the zero-time point, then extracted data in a [-2 s to 2 s] window from the broadband (0.1–30 Hz) EEG and used [-2 s to -0.5 s] for baseline correction; the results were then averaged across 107 subjects (see Fig. S2a). For spindle events, we used the peak of each spindle as the zero-time point and applied the same data extraction window and baseline correction before averaging across 107 subjects (see Fig. S2b). Finally, for SO-spindle coupling events, we followed the same procedure used for SO events (see Fig. 2a, Figs. S3–S4).”
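
The epoching and baseline-correction steps described above can be sketched as follows. This is a minimal illustration that assumes a single-channel signal stored as a list of samples and event positions given as sample indices; the function names, the 100 Hz sampling rate, and the toy signal are hypothetical, and the sketch averages across events only rather than reproducing the across-subject averaging.

```python
def extract_epoch(signal, event_idx, sfreq, tmin=-2.0, tmax=2.0,
                  base=(-2.0, -0.5)):
    """Cut [tmin, tmax] s around event_idx and subtract the baseline mean.
    Returns None when the window runs past the edges of the recording."""
    start = event_idx + int(round(tmin * sfreq))
    stop = event_idx + int(round(tmax * sfreq)) + 1
    if start < 0 or stop > len(signal):
        return None
    epoch = signal[start:stop]
    # Baseline window expressed as indices within the epoch.
    b0 = int(round((base[0] - tmin) * sfreq))
    b1 = int(round((base[1] - tmin) * sfreq)) + 1
    baseline = sum(epoch[b0:b1]) / (b1 - b0)
    return [v - baseline for v in epoch]

def average_epochs(epochs):
    """Sample-wise mean over equally long epochs (the ERP)."""
    return [sum(col) / len(col) for col in zip(*epochs)]

sfreq = 100  # Hz, hypothetical sampling rate
# Toy signal: constant offset of 5 with a sharp dip at each event sample.
signal = [5.0] * 2000
events = [500, 1000, 1500]
for e in events:
    signal[e] = -15.0

epochs = [ep for ep in (extract_epoch(signal, e, sfreq) for e in events)
          if ep is not None]
erp = average_epochs(epochs)
zero_idx = int(round(2.0 * sfreq))  # index of the zero-time point
print(len(erp), erp[zero_idx])
```

Subtracting the mean of the [-2 s, -0.5 s] window removes the constant offset, so the averaged waveform reflects only the deflection time-locked to the event.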

      Supplementary Materials, Page 36-38, Fig. S2-S4

      Author response image 1.

      ERPs of SOs and spindles coupling during different sleep stages across all 107 subjects. a. ERP of SOs in different sleep stages using the broadband (0.1–30 Hz) EEG data. We align the trough of the DOWN-state of each SO at time zero (see Methods for details). The orange line represents the SO ERP in the N1 stage, the black line represents the SO ERP in the N2&N3 stage, and the green line represents the SO ERP in the REM stage. b. ERP of spindles in different sleep stages using the broadband (0.1–30 Hz) EEG data. We align the peak of each spindle at time zero (see Methods for details). The color scheme is the same as in panel a.

      Author response image 2.

      ERP and time-frequency patterns of SO-spindle coupling in the N1 stage. The averaged temporal frequency pattern and ERP across all instances of SO-spindle coupling, computed over all subjects, following the same procedure as in Fig. 2a, but for N1 stage.

      Author response image 3.

      ERP and time-frequency patterns of SO-spindle coupling in the REM stage. The averaged temporal frequency pattern and ERP across all instances of SO-spindle coupling, computed over all subjects, again following the same procedure as in Fig. 2a, but for REM stage.

      (3) Provide detailed info concerning sleep characteristics (time spent in each sleep stage etc.).

      Thank you for this suggestion. As in our response above, we will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics.

      Supplementary Materials, Page 42, Table S1 (same as above)

      (4) What would happen if more stringent parameters were used for event detection? Would the authors still observe a significant number of SO spindles during N1 and REM? Would this affect the fMRI-related results?

      Thank you for this suggestion. Our methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).

      Furthermore, in order to explore the impact of this on our fMRI results, we conducted an additional sensitivity analysis by applying different detection parameters for SOs. Specifically, we adjusted amplitude percentile thresholds for SO detection (the parameter that has the greatest impact on the results). We used the hippocampal activation value during N2&N3 stage SO-spindle coupling as an anchor value and found that when the parameters gradually became stricter, the results were similar to or even better than the current results. However, when we continued to increase the threshold, the results began to gradually decrease until the threshold was increased to 80%, and the results were no longer significant. This indicates that our results are robust within a specific range of parameters, but as the threshold increases, the number of trials decreases, ultimately weakening the statistical power of the fMRI analysis.
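      The trade-off described above, in which a stricter percentile threshold leaves fewer detected events and therefore fewer trials for the fMRI model, can be illustrated with a minimal sketch. It uses synthetic amplitudes and two hypothetical helpers (`percentile`, `count_events_above_percentile`); it is not the authors' detection pipeline, only an illustration of why a percentile criterion always returns some events and why the event count shrinks as the threshold rises.

```python
import random


def percentile(values, pct):
    """Linear-interpolation percentile (pct in [0, 100]) of a list of numbers."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def count_events_above_percentile(amplitudes, pct):
    """Count candidate events whose amplitude exceeds the pct-th percentile."""
    cutoff = percentile(amplitudes, pct)
    return sum(1 for a in amplitudes if a > cutoff)


# Synthetic candidate amplitudes (assumption: stand-in for real EEG measurements).
random.seed(0)
amplitudes = [abs(random.gauss(0.0, 1.0)) for _ in range(1000)]

# Sweep the threshold: a stricter threshold keeps fewer events (fewer trials).
thresholds = [70, 75, 80, 85, 90]
counts = {pct: count_events_above_percentile(amplitudes, pct) for pct in thresholds}
for pct in thresholds:
    print(f"threshold {pct}%: {counts[pct]} events retained")
```

      Because the cutoff is defined relative to the data, some events are retained at every threshold regardless of sleep stage, which is why the nature of events detected in N1 or REM needs separate scrutiny.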

      Thank you again for your suggestions on sleep rhythm event detection. We will add the results in Supplementary and revise our manuscript accordingly.

      Results, Page 11, Line 199-208

      “Spindles were correlated with positive activation in the thalamus (ROI analysis, t<sub>(106)</sub> = 15.39, p < 1e-4), the anterior cingulate cortex (ACC), and the putamen, alongside deactivation in the DMN (Fig. 3c). Notably, SO-spindle coupling was linked to significant activation in both the thalamus (ROI analysis, t<sub>(106)</sub> = 3.38, p = 0.0005) and the hippocampus (ROI analysis, t<sub>(106)</sub> = 2.50, p = 0.0070, Fig. 3d). However, no decrease in DMN activity was found during SO-spindle coupling, and DMN activity during SO was significantly lower than during coupling (ROI analysis, t<sub>(106)</sub> = -4.17, p < 1e-4). For more detailed activation patterns, see Table S5-S7. We also varied the threshold used to detect SO events to assess its effect on hippocampal activation during SO-spindle coupling and observed that hippocampal activation remained significant when the percentile thresholds for SO detection ranged between 71% and 80% (see Fig. S6).”

      Supplementary Materials, Page 40, Fig. S6

      Author response image 4.

      Influence of the percentile threshold for SO detection on hippocampal activation (ROI) during SO-spindle coupling. We changed the percentile threshold for SO event detection in the EEG data analysis and then reconstructed the GLM design matrix based on the SO events detected at each threshold. The brain-wide activation pattern of SO-spindle couplings in the N2/3 stage was extracted using the same method as shown in Fig. 3. The gray horizontal line represents the significant range (71%–80%). * p < 0.05.

      Finally, we sincerely thank you all again for your thoughtful and constructive feedback. Your insights have been invaluable in refining our analyses, strengthening our interpretations, and improving the clarity and rigor of our manuscript. We appreciate the time and effort you have dedicated to reviewing our work, and we are grateful for the opportunity to enhance our study based on your recommendations.

      References:

      Bergmann, T. O., Mölle, M., Diedrichs, J., Born, J., & Siebner, H. R. (2012). Sleep spindle-related reactivation of category-specific cortical regions after learning face-scene associations. NeuroImage, 59(3), 2733-2742.

      Buzsáki, G. (2015). Hippocampal sharp wave‐ripple: A cognitive biomarker for episodic memory and planning. Hippocampus, 25(10), 1073-1188.

      Caporro, M., Haneef, Z., Yeh, H. J., Lenartowicz, A., Buttinelli, C., Parvizi, J., & Stern, J. M. (2012). Functional MRI of sleep spindles and K-complexes. Clinical neurophysiology, 123(2), 303-309.

      Coulon, P., Budde, T., & Pape, H.-C. (2012). The sleep relay—the role of the thalamus in central and decentral sleep regulation. Pflügers Archiv-European Journal of Physiology, 463, 53-71.

      Crunelli, V., Lőrincz, M. L., Connelly, W. M., David, F., Hughes, S. W., Lambert, R. C., Leresche, N., & Errington, A. C. (2018). Dual function of thalamic low-vigilance state oscillations: rhythm-regulation and plasticity. Nature Reviews Neuroscience, 19(2), 107-118.

      Czisch, M., Wehrle, R., Stiegler, A., Peters, H., Andrade, K., Holsboer, F., & Sämann, P. G. (2009). Acoustic oddball during NREM sleep: a combined EEG/fMRI study. PloS one, 4(8), e6749.

      Diba, K., & Buzsáki, G. (2007). Forward and reverse hippocampal place-cell sequences during ripples. Nature Neuroscience, 10(10), 1241.

      Diekelmann, S., & Born, J. (2010). The memory function of sleep. Nature Reviews Neuroscience, 11(2), 114-126.

      Fogel, S., Albouy, G., King, B. R., Lungu, O., Vien, C., Bore, A., Pinsard, B., Benali, H., Carrier, J., & Doyon, J. (2017). Reactivation or transformation? Motor memory consolidation associated with cerebral activation time-locked to sleep spindles. PloS one, 12(4), e0174755.

      Hahn, M. A., Heib, D., Schabus, M., Hoedlmoser, K., & Helfrich, R. F. (2020). Slow oscillation-spindle coupling predicts enhanced memory formation from childhood to adolescence. Elife, 9, e53730.

      Halassa, M. M., Siegle, J. H., Ritt, J. T., Ting, J. T., Feng, G., & Moore, C. I. (2011). Selective optical drive of thalamic reticular nucleus generates thalamic bursts and cortical spindles. Nature Neuroscience, 14(9), 1118-1120.

      Hale, J. R., White, T. P., Mayhew, S. D., Wilson, R. S., Rollings, D. T., Khalsa, S., Arvanitis, T. N., & Bagshaw, A. P. (2016). Altered thalamocortical and intra-thalamic functional connectivity during light sleep compared with wake. NeuroImage, 125, 657-667.

      Helfrich, R. F., Lendner, J. D., Mander, B. A., Guillen, H., Paff, M., Mnatsakanyan, L., Vadera, S., Walker, M. P., Lin, J. J., & Knight, R. T. (2019). Bidirectional prefrontal-hippocampal dynamics organize information transfer during sleep in humans. Nature Communications, 10(1), 3572.

      Helfrich, R. F., Mander, B. A., Jagust, W. J., Knight, R. T., & Walker, M. P. (2018). Old brains come uncoupled in sleep: slow wave-spindle synchrony, brain atrophy, and forgetting. Neuron, 97(1), 221-230. e224.

      Horovitz, S. G., Fukunaga, M., de Zwart, J. A., van Gelderen, P., Fulton, S. C., Balkin, T. J., & Duyn, J. H. (2008). Low frequency BOLD fluctuations during resting wakefulness and light sleep: A simultaneous EEG‐fMRI study. Human brain mapping, 29(6), 671-682.

      Huang, Q., Xiao, Z., Yu, Q., Luo, Y., Xu, J., Qu, Y., Dolan, R., Behrens, T., & Liu, Y. (2024). Replay-triggered brain-wide activation in humans. Nature Communications, 15(1), 7185.

      Ilhan-Bayrakcı, M., Cabral-Calderin, Y., Bergmann, T. O., Tüscher, O., & Stroh, A. (2022). Individual slow wave events give rise to macroscopic fMRI signatures and drive the strength of the BOLD signal in human resting-state EEG-fMRI recordings. Cerebral Cortex, 32(21), 4782-4796.

      Laufs, H. (2008). Endogenous brain oscillations and related networks detected by surface EEG‐combined fMRI. Human brain mapping, 29(7), 762-769.

      Laufs, H., Walker, M. C., & Lund, T. E. (2007). ‘Brain activation and hypothalamic functional connectivity during human non-rapid eye movement sleep: an EEG/fMRI study’—its limitations and an alternative approach. Brain, 130(7), e75.

      Margulies, D. S., Ghosh, S. S., Goulas, A., Falkiewicz, M., Huntenburg, J. M., Langs, G., Bezgin, G., Eickhoff, S. B., Castellanos, F. X., & Petrides, M. (2016). Situating the default-mode network along a principal gradient of macroscale cortical organization. Proceedings of the National Academy of Sciences, 113(44), 12574-12579.

      Massimini, M., Huber, R., Ferrarelli, F., Hill, S., & Tononi, G. (2004). The sleep slow oscillation as a traveling wave. Journal of Neuroscience, 24(31), 6862-6870.

      Moehlman, T. M., de Zwart, J. A., Chappel-Farley, M. G., Liu, X., McClain, I. B., Chang, C., Mandelkow, H., Özbay, P. S., Johnson, N. L., & Bieber, R. E. (2019). All-night functional magnetic resonance imaging sleep studies. Journal of neuroscience methods, 316, 83-98.

      Mölle, M., Bergmann, T. O., Marshall, L., & Born, J. (2011). Fast and slow spindles during the sleep slow oscillation: disparate coalescence and engagement in memory processing. Sleep, 34(10), 1411-1421.

      Ngo, H.-V., Fell, J., & Staresina, B. (2020). Sleep spindles mediate hippocampal-neocortical coupling during long-duration ripples. Elife, 9, e57011.

      Picchioni, D., Horovitz, S. G., Fukunaga, M., Carr, W. S., Meltzer, J. A., Balkin, T. J., Duyn, J. H., & Braun, A. R. (2011). Infraslow EEG oscillations organize large-scale cortical– subcortical interactions during sleep: a combined EEG/fMRI study. Brain research, 1374, 63-72.

      Schabus, M., Dang-Vu, T. T., Albouy, G., Balteau, E., Boly, M., Carrier, J., Darsaud, A., Degueldre, C., Desseilles, M., & Gais, S. (2007). Hemodynamic cerebral correlates of sleep spindles during human non-rapid eye movement sleep. Proceedings of the National Academy of Sciences, 104(32), 13164-13169.

      Schreiner, T., Kaufmann, E., Noachtar, S., Mehrkens, J.-H., & Staudigl, T. (2022). The human thalamus orchestrates neocortical oscillations during NREM sleep. Nature Communications, 13(1), 5231.

      Schreiner, T., Petzka, M., Staudigl, T., & Staresina, B. P. (2021). Endogenous memory reactivation during sleep in humans is clocked by slow oscillation-spindle complexes. Nature Communications, 12(1), 3112.

      Singh, D., Norman, K. A., & Schapiro, A. C. (2022). A model of autonomous interactions between hippocampus and neocortex driving sleep-dependent memory consolidation. Proceedings of the National Academy of Sciences, 119(44), e2123432119.

      Spoormaker, V. I., Schröter, M. S., Gleiser, P. M., Andrade, K. C., Dresler, M., Wehrle, R., Sämann, P. G., & Czisch, M. (2010). Development of a large-scale functional brain network during human non-rapid eye movement sleep. Journal of Neuroscience, 30(34), 11379-11387.

      Staresina, B. P., Bergmann, T. O., Bonnefond, M., van der Meij, R., Jensen, O., Deuker, L., Elger, C. E., Axmacher, N., & Fell, J. (2015). Hierarchical nesting of slow oscillations, spindles and ripples in the human hippocampus during sleep. Nature Neuroscience, 18(11), 1679-1686.

      Staresina, B. P., Niediek, J., Borger, V., Surges, R., & Mormann, F. (2023). How coupled slow oscillations, spindles and ripples coordinate neuronal processing and communication during human sleep. Nature Neuroscience, 1-9.

      Yarkoni, T., Poldrack, R. A., Nichols, T. E., Van Essen, D. C., & Wager, T. D. (2011). Large-scale automated synthesis of human functional neuroimaging data. Nature methods, 8(8), 665-670.

      Yeshurun, Y., Nguyen, M., & Hasson, U. (2021). The default mode network: where the idiosyncratic self meets the shared social world. Nature Reviews Neuroscience, 1-12.

    1. eLife Assessment

      The authors use deep mutational scanning to assess the effect of ~6,600 protein-coding variants in MC4R, a G protein coupled receptor associated with obesity. They develop new, more precise approaches to deep mutational scanning, enabling them to probe molecular phenotypes directly relevant to the development of drugs that target this receptor. In this important work, the authors provide compelling evidence that variants impact signaling through MC4R in different ways, that some defective variants are amenable to a corrector drug and that deep mutational scanning data could guide compound optimization.

    2. Reviewer #1 (Public review):

      Summary:

      Howard et al. performed deep mutational scanning on the MC4R gene, using a reporter assay to investigate two distinct downstream pathways across multiple experimental conditions. They validated their findings with ClinVar data and previous studies. Additionally, they provided insights into the application of DMS results for personalized drug therapy and differential ligand responses across variant types.

      Strengths:

      They captured over 99% of variants with robust signals and investigated subtle functionalities, such as pathway-specific activities and interactions with different ligands, by refining both the experimental design and analytical methods.

      They provided additional details regarding the quality of the library, including the even composition of variants, sufficient readout from tested cells, and adequate sequencing depth. Additionally, they clarified the underlying assay mechanisms, effectively demonstrating the robustness of their results.

    3. Reviewer #2 (Public review):

      Overview

      In this manuscript the authors use deep mutational scanning to assess the effect of ~6,600 protein-coding variants in MC4R, a G protein-coupled receptor associated with obesity. Reasoning that current deep mutational scanning approaches are insufficiently precise for some drug development applications, they focus on articulating new, more precise approaches. These approaches, which include a new statistical model and innovative reporter assay, enable them to probe molecular phenotypes directly relevant to the development of drugs that target this receptor with high precision and statistical rigor.

      They use the resulting data for a variety of purposes, including probing the relationship between MC4R's sequence and structure, analyzing the effect of clinically important variants, identifying variants that disrupt downstream MC4R signaling via one but not both pathways, identifying loss of function variants that are amenable to a corrector drug and exploring how deep mutational scanning data could guide small molecule drug optimization.

      Strengths

      The analysis and statistical framework developed by the authors represent a significant advance. In particular, it makes use of barcode-level internally replicated measurements to more accurately estimate measurement noise. The framework allows variant effects to be compared across experimental conditions, a task which is currently hard to do with rigor. Thus, this framework will be applicable to a large number of existing and future deep mutational scanning experiments.

      The authors refine their existing barcode transcription-based assay for GPCR signaling, and develop a clever new "relay" reporter system to boost signaling in a particular pathway. They show that these reporters can be used to measure both gain of function and loss of function effects, which many deep mutational scanning approaches cannot do.

      The use of systematic approaches to integrate and then interrogate high-dimensional deep mutational scanning data is a big strength. For example, the authors applied PCA to the variant effect results from reporters for two different MC4R signaling pathways and were able to discover variants that biased signaling through one or the other pathway. This approach paves the way for analyses of higher dimensional deep mutational scans.

      The authors use the deep mutational scanning data they collect to map how different variants impact the way small molecule agonists activate MC4R signaling. This is an exciting idea because developing small-molecule protein-targeting therapeutics is difficult, and this manuscript suggests a new way to map small molecule-protein interactions.

      Weaknesses

      The authors derive insights into the relationship between MC4R signaling through different pathways and its structure. While these make sense based on what is already known, the manuscript would be stronger if some of these insights were validated using methods other than deep mutational scanning.

      Likewise, the authors use their data to identify positions where variants disrupt MC4R activation by one small molecule agonist but not another. They hypothesize these effects point to positions that are more or less important for the binding of different small molecule agonists. The manuscript would be stronger if some of these insights were explored further.

      Impact

      In this manuscript the authors present new methods, including a statistical framework for analyzing deep mutational scanning data that will have a broad impact. They also generate MC4R variant effect data that is of interest to the GPCR community.

      Comments on revisions:

      I do not have additional comments, and feel that the authors addressed most of my concerns!

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public reviews):

      Summary

      Howard et al. performed deep mutational scanning on the MC4R gene, using a reporter assay to investigate two distinct downstream pathways across multiple experimental conditions. They validated their findings with ClinVar data and previous studies. Additionally, they provided insights into the application of DMS results for personalized drug therapy and differential ligand responses across variant types.

      Strengths

      They captured over 99% of variants with robust signals and investigated subtle functionalities, such as pathway-specific activities and interactions with different ligands, by refining both the experimental design and analytical methods.

      Weaknesses

      While the study generated informative results, it lacks a detailed explanation regarding the input library, replicate correlation, and sequencing depth for a given number of cells. Additionally, there are several questions that it would be helpful for authors to clarify.

      (1) It would be helpful to clarify the information regarding the quality of the input library and experimental replicates. Are variants evenly represented in the library? Additionally, have the authors considered using long-read sequencing to confirm the presence of a single intended variant per construct? Finally, could the authors provide details on the correlation between experimental replicates under each condition?

      Are variants evenly represented in the library?

      We strive to achieve as evenly balanced a library as possible at every stage of the DMS process (e.g., from initial cloning in E. coli through integration into human cells). Below is a representative plot showing the number of barcodes per amino acid variant at each position in a given ~60 amino acid subregion of MC4R, which highlights how evenly variants are represented at the E. coli cloning stage.

      Author response image 1.

      We also make similar measurements after the library is integrated into HEK293T cell lines, and see similarly even coverage across all variants, as shown in the plot below:

      Author response image 2.

      Additionally, have the authors considered using long-read sequencing to confirm the presence of a single intended variant per construct?

      We agree long-read sequencing would be an excellent way to confirm that our constructs contain a single intended variant. However, we elected to use an alternate method (outlined in more detail in Jones et al. 2020) that leverages multiple layers of validation. First, the oligo chip-synthesized portions of the protein containing the variants are cloned into a sequence-verified plasmid backbone, which greatly decreases the chances of spuriously generating a mutation in a different portion of the protein. We then sequence both the oligo portion and random barcode using overlapping paired-end reads during barcode mapping to avoid sequencing errors and to help detect DNA synthesis errors. At this stage, we computationally reject any constructs that have more than one variant. Given this, the vast majority of remaining unintended variants would come from mutations introduced by the E. coli cloning or replication process, which should be low frequency. We have used our in-house full plasmid sequencing method, OCTOPUS, to sample and spot check this for several other DMS libraries we have generated using the same cloning methods. We have found variants in the plasmid backbone in only ~1% of plasmids in these libraries. Our statistical model also helps correct for this by accounting for barcode-specific variation. Finally, we believe this provides further motivation for having multiple barcodes per variant, which dilutes the effect of any unintended additional variants.

      Finally, could the authors provide details on the correlation between experimental replicates under each condition?

      Certainly! In general, the Gs reporter had higher correlation between replicates than the Gq system (r ~ 0.5 vs r ~ 0.4). The plots below, which have been added as a panel to Supplementary Figure 1, show two representative correlations of barcode read counts at the RNA-seq stage between replicates of the low α-MSH conditions.

      We added the following text to reference this panel:

      (see Methods > Sequence processing for barcode expression): “The correlation (r) of barcode read counts between replicates was ~0.5 and ~0.4 for the Gs and Gq assays, respectively (Supplementary Fig. 1E).”

      One important advantage of our statistical model is that it’s able to leverage information from barcodes regardless of the number of replicates they appear in.
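      As a minimal sketch of how a replicate correlation of the kind reported above could be computed, the snippet below calculates Pearson's r over barcodes shared between two replicates. The barcode names, counts, and the `pearson_r` helper are hypothetical; the source does not say whether counts were transformed (e.g., log-scaled) before correlation, so raw counts are assumed here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical barcode read counts for two replicates of one condition.
rep1 = {"bc1": 8, "bc2": 12, "bc3": 3, "bc4": 20, "bc5": 7}
rep2 = {"bc1": 10, "bc2": 9, "bc3": 5, "bc4": 15, "bc6": 4}

# Only barcodes observed in both replicates can contribute to the correlation.
shared = sorted(set(rep1) & set(rep2))
r = pearson_r([rep1[b] for b in shared], [rep2[b] for b in shared])
print(f"{len(shared)} shared barcodes, r = {r:.3f}")
```

      Restricting to shared barcodes is needed only for this pairwise summary; as noted above, the authors' model can use barcodes regardless of how many replicates they appear in.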

      (2) Since the functional readout of variants is conducted through RNA sequencing, it seems crucial to sequence a sufficient number of cells with adequate sequencing saturation. Could the authors clarify the coverage depth used for each RNA-seq experiment and how this depth was determined? Additionally, how many cells were sequenced in each experiment?

      The text has been added in the manuscript as follows:

      (in Methods > Running DMS Assays): “Given the seeding density (~17x10<sup>6</sup> cells per 150 mm replicate dish), time from seeding to collection, and doubling time of HEK293T cells, approximately 25.5x10<sup>6</sup> cells were collected per replicate. This translates to approximately 30-60x cellular coverage per amino acid variant in each replicate.”

      (in Methods > Sequence processing for barcode expression): “Total mapped reads per replicate at the RNA-seq stage were as follows:

      - Gs/CRE: 9.1-18.2 million mapped reads, median=12.3

      - Gq/UAS: 8.6-24.1 million mapped reads, median=14.5

      - Gs/CRE+Chaperone: 6.4-9.5 million mapped reads, median=7.5”

      “The median read counts per sample per barcode were 8, 10, and 6 reads for Gs/CRE, Gq/UAS, and Gs/CRE+Chaperone assays, respectively. The median number of barcodes per variant across all samples (the “median of medians”) was 56 for Gs/CRE, 28 for Gq/UAS, and 44 for Gs/CRE+Chaperone.”

      (3) It appears that the frequencies of individual RNA-seq barcode variants were used as a proxy for MC4R activity. Would it be important to also normalize for heterogeneity in RNA-seq coverage across different cells in the experiment? Variability in cell representation (i.e., the distribution of variants across cells) could lead to misinterpretation of variant effects. For example, suppose barcode_a1 represents variant A and barcode_b1 represents variant B. If the RNA-seq results show 6 reads for barcode_a1 and 7 reads for barcode_b1, it might initially appear that both variants have similar effect sizes. However, if these reads correspond to 6 separate cells each containing 1 copy of barcode_a1, and only 1 cell containing 7 copies of barcode_b1, the interpretation changes significantly. Additionally, if certain variants occupy a larger proportion of the cell population, they are more likely to be overrepresented in RNA sequencing.

      We account for this heterogeneity in several ways. First, as shown above (see Response to Reviewer 1, Question 1), we aim to have even representation of variants within our libraries. Second, we utilize compositional control conditions like forskolin or unstimulated conditions to obtain treatment-independent measurements of barcode abundance and, consequently, of mutant-vs-WT effects that are due to compositional rather than biological variability. We expect that variability observed under these controls is due to subtle effects of molecular cloning, gene expression, and stochasticity. Using these controls, we observe that mutant-vs-WT effects are generally close to zero in these normalization conditions (e.g., in untreated Gq, see Supplementary Figure 3) as compared to treated conditions. For example, premature stops behave similarly to WT in normalization conditions. This indicates that mutant abundance is relatively homogeneous. Where there are barcode-dependent effects on abundance, we can use information from these conditions to normalize that effect. Finally, our mixed-effect model accounts for barcode-specific deviations from the expected mutant effect (e.g., a “high count” barcode consistently being high relative to the mean).
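      The idea of using a control condition to separate compositional from biological effects can be sketched as a simple difference of log ratios. This is an illustrative simplification under stated assumptions, not the authors' mixed-effect model: the function names, the pseudocount, and the example counts are all hypothetical.

```python
import math


def log2_mutant_vs_wt(mutant_count, wt_count, pseudocount=0.5):
    """Log2 ratio of mutant to WT read counts, with a pseudocount to avoid log(0)."""
    return math.log2((mutant_count + pseudocount) / (wt_count + pseudocount))


def normalized_effect(mut_treated, wt_treated, mut_control, wt_control, pseudocount=0.5):
    """Treated mutant-vs-WT effect minus the same effect in a control condition.

    The control term captures differences in abundance that are unrelated to
    receptor signaling (e.g., uneven library composition).
    """
    treated = log2_mutant_vs_wt(mut_treated, wt_treated, pseudocount)
    control = log2_mutant_vs_wt(mut_control, wt_control, pseudocount)
    return treated - control


# A variant that is simply half as abundant as WT in both conditions: no net effect.
print(normalized_effect(50, 100, 50, 100))

# A variant equally abundant in control but depleted under treatment: negative effect.
print(normalized_effect(25, 100, 50, 100))
```

      A variant that is under-represented in the library shifts both terms equally, so the difference isolates the treatment-dependent component.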

      (4) Although the assay system appears to effectively represent MC4R functionality at the molecular level, we are curious about the potential disparity between the DMS score system and physiological relevance. How do variants reported in gnomAD distribute within the DMS scoring system?

      Figure 2D shows DMS scores (variant effect on Gs signaling) relative to human population frequency for all MC4R variants reported in gnomAD as of January 8, 2024.

      (5) To measure Gq signaling, the authors used the GAL4-VPR relay system. Is there additional experimental data to support that this relay system accurately represents Gq signaling?

      The full Gq reporter uses an NFAT response element from the IL-2 promoter to regulate the expression of the GAL4-VPR relay. In this system, the activation of Gq signaling results in the activation of the NFAT response element, and this signal is then amplified by the GAL4-VPR relay. The NFAT response element has been previously well-validated to respond to the activation of Gq signaling (e.g., Boss, Talpade, and Murphy 1996). We have added this reference to the text (see Results > Assays for disease-relevant mechanisms) to further support the use of the Gq assay.

      (6) Identifying the variants responsive to the corrector was impressive. However, we are curious about how the authors confirmed that the restoration of MC4R activity was due to the correction of the MC4R protein itself. Is there a possibility that the observed effect could be influenced by other factors affected by the corrector? When the corrector was applied to the cells, were any expected or unexpected differential gene expression changes observed?

      While we do not directly measure whether Ipsen-17 has effects on other signaling processes, previous work has shown that Ipsen-17 treatment does not indirectly alter signaling kinetics such as receptor internalization (Wang et al., 2014). Furthermore, our analysis methods inherently account for this by normalizing variant effects to WT signaling levels. Any observed rescue of a given variant inherently means that the variant is specifically more responsive to Ipsen-17 than WT, and the fact that different variants exhibit different levels of rescue is reassuring that the mechanism is on target to MC4R. Lastly, Ipsen-17 is known to be an antagonist of alpha-MSH activity and is thought to bind directly to the same site on MC4R (Wang et al., 2014).

      We have revised text in the Methods section as follows (see Running DMS Assays) to better articulate this: “For chaperone experiments, cells were washed 3x with 10 mL DMEM to remove Ipsen 17 prior to agonist stimulation as it has been shown to be an antagonist of α-MSH activity and is thought to bind directly to the same site on MC4R (Wang et al. 2014).”

      (7) As mentioned in the introduction, gain-of-function (GoF) variants are known to be protective against obesity. It would be interesting to see further studies on the observed GoF variants. Do the authors have any plans for additional research on these variants?

      We agree this would be an excellent line of inquiry, but due to changes in company priorities we unfortunately do not have any plans for additional research on these variants.

      Reviewer 2 (Public reviews):

      Overview

      In this manuscript, the authors use deep mutational scanning to assess the effect of ~6,600 protein-coding variants in MC4R, a G protein-coupled receptor associated with obesity. Reasoning that current deep mutational scanning approaches are insufficiently precise for some drug development applications, they focus on articulating new, more precise approaches. These approaches, which include a new statistical model and innovative reporter assay, enable them to probe molecular phenotypes directly relevant to the development of drugs that target this receptor with high precision and statistical rigor.

      They use the resulting data for a variety of purposes, including probing the relationship between MC4R's sequence and structure, analyzing the effect of clinically important variants, identifying variants that disrupt downstream MC4R signaling via one but not both pathways, identifying loss of function variants that are amenable to a corrector drug and exploring how deep mutational scanning data could guide small molecule drug optimization.

      Strengths

      The analysis and statistical framework developed by the authors represent a significant advance. In particular, the study makes use of barcode-level internally replicated measurements to more accurately estimate measurement noise.

      The framework allows variant effects to be compared across experimental conditions, a task that is currently hard to do with rigor. Thus, this framework will be applicable to a large number of existing and future deep mutational scanning experiments.

      The authors refine their existing barcode transcription-based assay for GPCR signaling, and develop a clever new "relay" reporter system to boost signaling in a particular pathway. They show that these reporters can be used to measure both gain of function and loss of function effects, which many deep mutational scanning approaches cannot do.

      The use of systematic approaches to integrate and then interrogate high-dimensional deep mutational scanning data is a big strength. For example, the authors applied PCA to the variant effect results from reporters for two different MC4R signaling pathways and were able to discover variants that biased signaling through one or the other pathway. This approach paves the way for analyses of higher dimensional deep mutational scans.

      The authors use the deep mutational scanning data they collect to map how different variants impact the way small molecule agonists activate MC4R signaling. This is an exciting idea, because developing small-molecule protein-targeting therapeutics is difficult, and this manuscript suggests a new way to map small-molecule-protein interactions.

      Weaknesses

      The authors derive insights into the relationship between MC4R signaling through different pathways and its structure. While these make sense based on what is already known, the manuscript would be stronger if some of these insights were validated using methods other than deep mutational scanning.

      Likewise, the authors use their data to identify positions where variants disrupt MC4R activation by one small molecule agonist but not another. They hypothesize these effects point to positions that are more or less important for the binding of different small molecule agonists. The manuscript would be stronger if some of these insights were explored further.

      Impact

      In this manuscript, the authors present new methods, including a statistical framework for analyzing deep mutational scanning data that will have a broad impact. They also generate MC4R variant effect data that is of interest to the GPCR community.

      Recommendations for the authors:

      (1) Page 7 - the Gq reporter relay system is clever. Could the authors include the original data showing that the simpler design didn't work at all, or at least revise the text to say more precisely what "not suitable due to weak SNR" means?

      We added a panel (D) to Supplementary Figure 2 showing that the native NFAT reporter was ~10x weaker than the CRE reporter, and the relay system amplified the NFAT signal to be comparable to the CRE reporter.

      (2) Page 7 - Even though the relay system gives some signal, it's clearly less sensitive/higher background than Gs. How does that play out in the quantitative analysis?

      —AND—

      (4) Page 10 - The Gq library had fewer barcodes per variant, and, as noted above, the Gq reporter doesn't work quite as well as the Gs one. It would be nice if the authors could comment on how these aspects of the Gq experiments affected data quality/power to detect effects.

      Following the reviewer's excellent suggestion, we updated Supplementary Figure 2B to better contextualize the quantitative effects of the difference in signal-to-noise ratio between the Gq and Gs reporter systems (see changes below). These distributions show the Z-statistic for testing either each stop mutation (red) or all possible coding variants against WT. Thus, |Z| > 1.96 corresponds to p < 0.05 in a two-sided Wald test. In the Gs reporter, 95% of the stops are nominally significantly different from WT (visualized above, with the majority of the red distribution lying below -1.96). In contrast, only 64% of stops are nominally significantly different from WT in Gq. This implies that it will be more difficult to detect effects in the Gq system, especially those less severe than stops.
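
      The correspondence between the |Z| threshold and the nominal p-value quoted above can be checked with a short sketch. It assumes only that the Wald Z-statistic is an effect estimate divided by its standard error and that it follows a standard normal distribution under the null; the function names are illustrative and are not taken from the authors' framework.

```python
import math


def wald_z(estimate: float, std_error: float, null_value: float = 0.0) -> float:
    """Wald Z-statistic: distance of the estimate from the null, in standard errors."""
    return (estimate - null_value) / std_error


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a Z-statistic under a standard normal null.

    For a standard normal variable, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # The conventional threshold quoted in the response.
    print(round(two_sided_p(1.96), 4))
    # A hypothetical variant effect of -0.5 with a standard error of 0.25.
    z = wald_z(-0.5, 0.25)
    print(z, two_sided_p(z))
```

      Because the p-value depends only on |Z|, a strongly negative Z (as for most stop variants) is treated the same as an equally large positive one.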

      In addition to the overall signal-to-noise ratio being lower in the Gq system, there were also fewer barcodes per variant (28 vs 56 barcodes per variant on average for Gq vs Gs). As demonstrated in Supplementary Figure 2C, the error bars on our estimates are related to the number of barcodes per variant (Standard Error ~ 1 / sqrt(Number of Barcodes), as shown in the plot below). This suggests that our estimates of mutant effects will be less certain in the Gq library than in the Gs library. For example, the average standard error in the Gq library was 0.260, which was ~1.58 times larger than the Gs library's 0.165. Finally, we believe this further underscores the power of our statistical framework, as it naturally enables formalized hypothesis testing that takes these errors into account when making comparisons both within reporters and across reporters.
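
      As a rough worked check of the scaling stated above, the sketch below compares the standard-error ratio predicted from barcode counts alone with the ratio of the reported average standard errors. It assumes equal per-barcode variance in both libraries, which is a simplification introduced here and not a claim from the response.

```python
import math


def expected_se_ratio(n_a: float, n_b: float) -> float:
    """Ratio SE_a / SE_b when SE is proportional to 1 / sqrt(n)."""
    return math.sqrt(n_b / n_a)


# Average barcodes per variant reported in the response (Gq = 28, Gs = 56).
barcode_ratio = expected_se_ratio(28, 56)

# Ratio of the reported average standard errors (Gq = 0.260, Gs = 0.165).
observed_ratio = 0.260 / 0.165

if __name__ == "__main__":
    print(f"predicted from barcode counts alone: {barcode_ratio:.2f}")
    print(f"observed: {observed_ratio:.2f}")
```

      Barcode count alone predicts a ratio of about 1.41, somewhat below the observed 1.58. That gap would be compatible with the lower signal-to-noise ratio of the Gq reporter also contributing, although this arithmetic cannot establish that.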

      (3) Page 9 - it would be nice to see the analysis framework applied to a few existing datasets from other types of assays, to really judge its performance. That's not the main point of this paper, and it's fine, but it would be lovely!

      We agree with the reviewer and hope others apply our framework to their problems to further refine its utility and applicability! To that end, we’ve open-sourced it under a permissive license to help encourage the community to use it. Part of the challenge in applying it to other existing datasets is that few DMS experiments leverage variant-level replication through barcodes. While we re-analyzed an older DMS dataset from Jones et al. 2020 to produce the distributions in Supplementary Figure 2B, a more thorough comparison is outside the scope of this paper. That said, we have two additional manuscripts in preparation that leverage this framework to analyze DMS data in different proteins and assay types.

      (5) Page 10 - In discussing the relationship of the data to ClinVar and AM, the authors use qualitative comparisons like "majority" and "typically." Just giving numbers would better help the reader appreciate how the data compare.

      We added specific proportions for these statements to the text for the ClinVar and AlphaMissense comparisons as follows:

      (See Results > Comprehensive Deep Mutational Scanning of MC4R): “For example, the majority (63.3%, 31/49) of human MC4R variants classified as pathogenic or likely pathogenic in ClinVar (Landrum et al., 2014) lead to a significant reduction of Gs signaling under low α-MSH stimulation conditions (significance threshold: false discovery rate (FDR) < 1%; Fig. 2C). Variants that are significantly loss-of-function in this condition are rarer in the human population, and more common human variants have no significant effect on MC4R function (significance threshold: FDR < 1%; Fig. 2D). Loss-of-function variants by our DMS assay are also typically (e.g., AlphaMissense: 93.4%, 1894/2028) predicted to be deleterious by commonly used variant effect predictors like AlphaMissense (Cheng et al., 2023) and popEVE (Orenbuch et al., 2023) (Supplementary Fig. 5).”

      (6) Pages 10-12, Figures 2C, E. The data look really nice, but the correlation with clinvar and the Huang data is not perfect (e.g. many pathogenic variants are classified as WT and partial LoF variants too). Can the authors comment on this discrepancy? For ClinVar, they should say when ClinVar was accessed and also how they filtered variants. I would recommend using variants with at least 1 star. Provided they did use high-quality clinical classifications, do they think the classifications are wrong, or their data? The same goes for Huang.

      —AND—

      (7) Page 13 - similar to previous comments, I'm curious about the 5 path/likely path ClinVar variants that are not LoF in the assay. Are they high noise/fewer barcodes? Or does the assay just miss some aspect of human biology?

      ClinVar data was accessed on January 5, 2024 (see Methods: Comparison to human genetics data and variant effect predictors). No annotation quality filtering was performed, and we have revised the text as follows to clarify this:

      (see Methods > Comparison to Human Genetics Data and Variant Effect Predictors): “Pathogenicity classifications of MC4R missense and nonsense variants were obtained from ClinVar (Landrum et al., 2014) on January 5, 2024, and all available annotations were included in the analysis regardless of ClinVar review status metric.”

      A substantial proportion of the discrepancy between our data and ClinVar is, as the reviewer suggests, likely due to low quality ClinVar annotations. Of the five variants that the reviewer notes were reported as pathogenic/likely pathogenic but did not result in loss of protein function in any of our DMS assays, two (V50M and V166I) have been reclassified in ClinVar to uncertain or conflicting interpretation since we accessed annotations in early 2024. An additional two of the five discrepant variants (Q43K and S58C) currently have 0 star ratings to support their pathogenic/likely pathogenic annotation. The remaining discrepant variant (S94N) has a 1 star rating supporting an annotation of “likely pathogenic.”

      The Huang et al. paper did an admirably thorough job of aggregating variant annotations from more than a dozen primary literature sources that each reported functional validation data for small panels of variants. However, one inherent limitation of this approach is that the resulting annotation classes are based on experiments that were carried out using inconsistent methods and/or scoring criteria. For example, classifications in the Huang et al. paper are based on an inconsistent mix of functional assay types (e.g., Gs signaling, Gq signaling, protein cell surface expression, etc.), and different variants were tested in different cell types (e.g., HEK293T, CHO, Cos-7, etc.). In principle, DMS assays should provide a more accurate assessment of the relative quantitative differences between alleles since each variant was tested using identical experimental conditions and analysis parameters.

      That being said, while very good, our assays are likely missing or only indirectly reporting on at least some aspects of MC4R biology. For example, in addition to Gs and Gq signaling, MC4R interfaces with β-arrestin. Variants that are protective against obesity-related phenotypes have been shown to increase recruitment of β-arrestin to MC4R, and we did not directly assess this function.

      (8) Page 15, Fig 3C - The three variants they highlight all have paradoxical changes in bias as α-MSH dose is increased (e.g. the bias inverts). I'm not a GPCR expert, but this seems interesting and a little weird. Perhaps the authors could comment on it?

      We agree this is an interesting observation that deserves further study, but unfortunately is outside the scope of our priorities at the moment. As noted, all three highlighted variants in this region have a biased basal activity, and this bias inverts upon stimulation. While we don’t have a good explanation for why this would be the case, this phenomenon has been previously observed for 158R (Paisdzior et al., 2020). Our DMS data emphasizes how diverse biased effects can be and further highlights the importance of characterizing these effects. It would be interesting if further studies could elucidate the mechanistic basis for this behavior and how it may be related to G protein coupling in this region.

      (9) Page 16 - I'm not familiar with the A21x1 formalism. For the general reader, maybe the authors could introduce this formalism.

      Given the shared structural topology of GPCRs, others have developed a variety of numbering schemes for residue positions that allow more direct comparisons between different GPCRs. We use the GPCRDB.org numbering scheme (e.g., F202<sup>5x48</sup>) as it takes experimentally determined structures into account. Roughly speaking, the number preceding the “x” indicates which transmembrane domain (one through seven) or region the residue is located in. The number following the “x” indicates where that residue is located in that region relative to a structurally conserved residue that is always assigned 50. For example, F202<sup>5x48</sup> means that F202 is located in the 5th transmembrane helix and is 2 residues before the most conserved residue, M204<sup>5x50</sup>. We updated the text to clarify this accordingly:

      (see Results > Structural Insights into Biased Signaling): “Upon ligand binding, W258 (W258<sup>6x48</sup> in https://gpcrdb.org/ nomenclature, where 6 corresponds to the 6th transmembrane helix and 48 denotes 258 is 2 residues before the most conserved residue in that helix (Isberg et al., 2015)) of the conserved CWxP motif undergoes a conformational rearrangement that is translated to L133<sup>3x36</sup> and I137<sup>3x40</sup>, of the conserved PIF motif (MIF in melanocortin receptors).”
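
      The numbering convention described above can be illustrated with a small parser. This is a minimal sketch under stated assumptions: the label format ("F202 5x48"), the class, and the function name are hypothetical conveniences, only two-digit positions are handled, and the sequence offset is assumed to equal the index offset (that is, no helix bulges or constrictions between the residue and the x50 reference).

```python
import re
from typing import NamedTuple


class GpcrdbPosition(NamedTuple):
    residue: str          # one-letter amino acid code, e.g. "F"
    sequence_number: int  # position in the receptor sequence, e.g. 202
    segment: int          # number before the "x" (helix or region)
    index: int            # number after the "x" (50 = reference residue)

    @property
    def offset_from_x50(self) -> int:
        """Negative values lie before the reference residue, positive after."""
        return self.index - 50

    @property
    def x50_sequence_number(self) -> int:
        """Sequence number of the reference residue, assuming no bulges or gaps."""
        return self.sequence_number - self.offset_from_x50


# Hypothetical label format for this sketch: residue, sequence number, space, NxMM.
_LABEL = re.compile(r"^([A-Z])(\d+)\s+(\d+)x(\d{2})$")


def parse_label(label: str) -> GpcrdbPosition:
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    residue, seq, segment, index = match.groups()
    return GpcrdbPosition(residue, int(seq), int(segment), int(index))


if __name__ == "__main__":
    for label in ("F202 5x48", "W258 6x48", "L133 3x36", "I137 3x40"):
        pos = parse_label(label)
        print(label, pos.segment, pos.offset_from_x50, pos.x50_sequence_number)
```

      For F202 5x48 the sketch reports helix 5, an offset of -2, and a reference residue at sequence position 204, matching the worked example in the response.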

      (10) Page 17, Figure 3A - Since 137, 254, and 140 are not picked out on the structure, I have no idea where they are. If the authors want to show readers these residues, perhaps they could be annotated or a panel added. Since ~1 entire page of the manuscript is dedicated to this cascade, it might make sense to add a panel. Just amplifying the comment above as regards position 79, others were discussed in that paragraph but not highlighted.

      We updated Supplementary Fig. 6C,D to label all of the listed residues on the protein structure for easy reference.

    1. eLife Assessment

      This manuscript describes an important study of the giant virus Jyvaskylavirus. The characterisation presented is compelling. The work will be of interest to virologists working on giant viruses as well as those working with other members of the PRD1/Adenoviridae lineage.

    2. Reviewer #1 (Public review):

      This study presents Jyvaskylavirus, a new member of the Marseilleviridae family, infecting Acanthamoeba castellanii. The study provides a detailed and comprehensive genomic and structural analysis of Jyvaskylavirus. The authors identified ORF142 as the capsid penton protein and additional structural proteins that comprise the virion. Using a combination of imaging techniques the authors provide new insights into the giant virus architecture and lifecycle. The study could be improved by providing atomic coordinates and refinement statistics, comparisons with available giant virus structures could be expanded, and the novelty in terms of the first isolated example of a giant virus from Finland could be expounded upon.

      The study contributes new structural and genomic diversity to the Marseilleviridae family, hinting at a broader distribution and ecological significance of giant viruses than previously thought.

      Comments on revisions: I'm satisfied with the authors' responses to the review, and request no further changes.

    3. Reviewer #2 (Public review):

      This paper describes the molecular characterisation of a new isolate of the giant virus Jyvaskylavirus, a member of the Marseilleviridae family infecting Acanthamoeba castellanii. The isolate comes from a boreal environment in Finland, showcasing that giant viruses can thrive in this ecological niche. The authors came up with a non-trivial isolation procedure that can be applied to characterise other members of the family and will be beneficial for the virology field. The genome shows typical Marseilleviridae features and phylogenetically belongs to their clade B. The structural characterisation was performed on the level of isolated virion morphology by negative stain EM, virions associated with cells either during the attachment or release by helium microscopy, the visualisation of the virus assembly inside cells using stained thin sections, and lastly on the protein secondary structure level by reconstructing ~6 Å icosahedral map of the massive virion using cryoEM. The cryoEM density combined with gene product structure prediction enabled the identification and functional assessment of various virion proteins. The visualisation of ongoing virus assembly inside virus factories raises interesting hypotheses about the process that, however, need to be verified in future studies.

      Strengths:

      The detailed description of the virus isolation protocol is the largest strength of the paper and I believe it can be modified for isolating various viruses infecting small eukaryotes. The cryoEM map allows us to understand how exceptionally large virions of these viruses are stabilised by minor capsid proteins and nicely demonstrates the integration of medium-resolution cryoEM with protein structure prediction in deciphering virion protein function.

      Weaknesses:

      No mass spectrometry data are presented to supplement and confirm the identity of the virion proteins whose predicted models were fitted into the cryoEM density.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This manuscript describes an important study of the giant virus Jyvaskylavirus. The characterisation presented is solid, although, in the current form, it is not clear to what extent these findings change our perception of how giant viruses, especially those isolated from a cold environment, function. The work will be of interest to virologists working on giant viruses as well as those working with other members of the PRD1/Adenoviridae lineage.

      Thank you for the revision and positive comments. We decided to submit our revised version of the manuscript with changes made in light of the comments made by the editorial team and the reviewers. We hope that the manuscript is now in better shape and addresses all comments received. Major changes made were:

      - We changed the author order considering reviewer 2 comments (point 11). Note that no author was added or removed, we just rearranged the order of authorship.

      - We included a new supplementary table with the Jyvaskylavirus genome annotation. This is now supplementary table 2.

      - We included a supplementary figure 9 to support our changes based on reviewer 2 comments (point 6).

      - Figures 2,5,6,7 and the supplementary figure 2 were updated to accommodate our answers to different reviewer comments.

      - Three new references were added to support some of our changes.

      Below you will find our responses to each specific point raised by the reviewers.

      Public Reviews:

      Reviewer #1 (Public review):

      This study presents Jyvaskylavirus, a new member of the Marseilleviridae family, infecting Acanthamoeba castellanii. The study provides a detailed and comprehensive genomic and structural analysis of Jyvaskylavirus. The authors identified ORF142 as the capsid penton protein and additional structural proteins that comprise the virion. Using a combination of imaging techniques the authors provide new insights into the giant virus architecture and lifecycle. The study could be improved by providing atomic coordinates and refinement statistics, comparisons with available giant virus structures could be expanded, and the novelty in terms of the first isolated example of a giant virus from Finland could be expounded upon.

      The study contributes new structural and genomic diversity to the Marseilleviridae family, hinting at a broader distribution and ecological significance of giant viruses than previously thought.

      Thank you for your constructive comments. We have addressed each point raised in our rebuttal letter and revised the manuscript accordingly. By following your specific comments, we improved the manuscript regarding atomic coordinates, refinement statistics and novelty of finding a Finnish marseillevirus. Details are provided in the specific answers to your points.

      Reviewer #2 (Public review):

      Summary:

      This paper describes the molecular characterisation of a new isolate of the giant virus Jyvaskylavirus, a member of the Marseilleviridae family infecting Acanthamoeba castellanii. The isolate comes from a boreal environment in Finland, showcasing that giant viruses can thrive in this ecological niche. The authors came up with a non-trivial isolation procedure that can be applied to characterise other members of the family and will be beneficial for the virology field. The genome shows typical Marseilleviridae features and phylogenetically belongs to their clade B. The structural characterisation was performed on the level of isolated virion morphology by negative stain EM, virions associated with cells either during the attachment or release by helium microscopy, the visualisation of the virus assembly inside cells using stained thin sections, and lastly on the protein secondary structure level by reconstructing ~6 Å icosahedral map of the massive virion using cryoEM. The cryoEM density combined with gene product structure prediction enabled the identification and functional assessment of various virion proteins.

      Strengths:

      The detailed description of the virus isolation protocol is the largest strength of the paper and this reviewer believes it can be modified for isolating various viruses infecting small eukaryotes. The cryoEM map allows us to understand how exceptionally large virions of these viruses are stabilised by minor capsid proteins and nicely demonstrates the integration of medium-resolution cryoEM with protein structure prediction in deciphering virion protein function. The visualisation of ongoing virus assembly inside virus factories raises interesting hypotheses about the process that, however, need to be verified in future studies.

      Weaknesses:

      The conclusions from helium microscopy images are overinterpreted, as the native membrane structure cannot be preserved in a fixed and dehydrated sample. In the image, there are many other parts of the curved membrane and a lot of virions; to me, it seems the specific position of the highlighted virion could arise by random chance. The claim that the cells were imaged in the near-original state by this method should therefore be omitted. Also, no mass spectrometry data are presented that would supplement and confirm the identity of the virion proteins whose predicted models were fitted into the cryoEM density. For a general virology reader outside of the giant virus field, the results presented in the current state might not have enough influence and the section should be rewritten to better showcase the novelty of findings.

      Thank you for your constructive comments. We thank reviewer #2 for highlighting these weaknesses, giving us the opportunity to improve our study. We have removed the claim that the cells were imaged in a near-original state. Additionally, we agree that the positions of the virions on the cell surface could result from a random distribution. However, the specific virion in panel 3C is situated halfway into a crevice, and it cannot be ruled out that this particular one is in the process of being taken up by endocytosis. This is why we used the term "probably" while referring to this finding. Regarding the mass spectrometry data, while we understand that MS data would provide an additional layer of evidence to validate the specific proteins present in the virion, they would not confirm the precise location or role of these proteins within the virion.

      We have addressed each point raised in our rebuttal letter and revised the manuscript accordingly.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      I have only minor comments which should be relatively simple to address:

      (1) Atomic coordinates should be deposited in the PDB, and refinement statistics for the models provided, for example by expanding Table S2.

      We thank reviewer #1 for the suggestion. In the ‘Data availability’ statement of the original submission, we stated that ‘Predicted Jyvaskylavirus PDB models using ModelAngelo and Alphafold have been deposited at BioStudies under the accession number S-BSST1654’. The atomic coordinates of all predicted models are therefore publicly available at https://www.ebi.ac.uk/biostudies/; for additional clarity, we also added the link to the ‘Data availability’ statement in the revised version.

      Our reason for not depositing them in the Protein Data Bank in association with our EMD-51613 entry is that they remain predicted models rigid-body fitted into the Jyvaskylavirus density map of 6.3 Å resolution. However, we have added to our BioStudies deposition (BSST1654) the whole Jyvaskylavirus pentameric assembly model (including all identified and predicted major and minor capsid proteins) rigid-body fitted into the Jyvaskylavirus map, and it can be easily downloaded.

      We did not perform the real-space ‘minimization_global’ refinement of the predicted models corresponding to the ORFs of Melbournevirus (or Jyvaskylavirus) into the corresponding available Melbournevirus densities (entries EMD-37188, 37189, 37190 at ~3.5 Å resolution, obtained by block-based reconstruction methods), as these maps were generated and deposited by other authors. Instead, we performed the rigid-body fit-into-map procedure of the individual predicted Jyvaskylavirus models into the previously deposited Melbournevirus maps using ChimeraX, demonstrating a fold-map alignment and assignment (see for example the individual stereo views in Supplementary Figure 6).

      In the revised version, we now provide the refinement statistics for the complete Jyvaskylavirus pentameric assembly (inclusive of peripentonal major capsid and minor capsid proteins) rigid-body fitted as a whole into the Melbournevirus 5-block reconstruction map using PHENIX, resulting in a CC<sub>mask</sub> of 57.3% (this is also stated in Supplementary Figure 7). The same pentameric assembly model was then placed into our lower-resolution 6.3 Å Jyvaskylavirus 3D density map in ChimeraX and rigid-body refined as a whole in PHENIX, yielding a predictably lower CC<sub>mask</sub> of 33%. This pentameric assembly model has now also been included in the BioStudies entry.

      The procedure for this rigid body fitting and refinement has been clarified and added to the 'Materials and Methods' section as follows:

      “Then, the corresponding full 3D models were predicted using AlphaFold3 and fitted into the Melbournevirus and Jyvaskylavirus cryoEM density using the fit-into-map routine in ChimeraX together with the peripentonal capsomers (Meng et al 2023). To assess the metric of this fitting (Supplementary Figure 7), the 3.5 Å five-fold Melbournevirus block 3D density (EMDB-37190) was boxed around the pentameric assembly model and refined as a whole using rigid-body refinement in PHENIX, yielding a CC<sub>mask</sub> of 57.3%. The same pentameric model was subsequently fitted into the 6.3 Å Jyvaskylavirus 3D cryo-EM density (previously boxed around the model), resulting in a lower CC<sub>mask</sub> of 33%, consistent with the limited resolution of the capsid map and below regions.”

      (2) The results section 'Jyvaskylavirus three-dimensional architecture' could be expanded to compare and contrast with other giant virus structures, in terms of T-number, diameter, and features on and inside the capsid. This is not essential but would help focus claims of novelty with regard to structure.

      We have added a few lines, as suggested by reviewer #1, to place Jyvaskylavirus in morphological context with other NCLDV viruses, as follows:

      “Both the capsid organization and virion size are similar to those of other Marseilleviruses, such as Melbournevirus and Tokyovirus. Pacmanvirus, considered to be at the crossroads between Asfarviridae and Faustoviruses, also possesses the same T number (309) and a comparable diameter to Jyvaskylavirus. In contrast, other giant viruses, such as African swine fever virus (ASFV), representative of the Asfarviridae family, have a T number of 277 and a diameter of approximately 2,100 Å, while PBCV-1, a member of the Phycodnaviridae family, has a T number of 169 and an average diameter of 1,900 Å. All of the above-mentioned viruses have been shown to possess a major capsid protein with a vertical double jelly-roll fold that composes the capsid shell, along with an internal membrane bilayer. Minor capsid proteins have been identified and structurally modelled for the smaller virions ASFV and PBCV-1 (Wang et al. 2019; Shao et al. 2022).”
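
      For readers less familiar with triangulation numbers, the T values quoted above can be related to lattice indices through the Caspar-Klug relation T = h² + hk + k². The sketch below enumerates the non-negative integer pairs (h, k) that produce a given T; it is an illustration added here, the function name is hypothetical, and it is not part of the authors' analysis.

```python
def lattice_indices(t_number: int) -> list[tuple[int, int]]:
    """Return all (h, k) with 0 <= h <= k and h*h + h*k + k*k == t_number."""
    pairs = []
    h = 0
    # With h <= k, T is at least 3*h*h, which bounds the outer loop.
    while 3 * h * h <= t_number:
        k = h
        while h * h + h * k + k * k <= t_number:
            if h * h + h * k + k * k == t_number:
                pairs.append((h, k))
            k += 1
        h += 1
    return pairs


if __name__ == "__main__":
    quoted = [
        ("Jyvaskylavirus / Pacmanvirus", 309),
        ("ASFV", 277),
        ("PBCV-1", 169),
    ]
    for name, t in quoted:
        print(name, t, lattice_indices(t))
```

      Under this relation T = 309 corresponds to (h, k) = (7, 13) and T = 277 to (7, 12). T = 169 admits two solutions, (0, 13) and (7, 8), so the T number alone does not always identify the lattice.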

      (3) The authors highlight one of the main novelties of the virus as being the first to be isolated from Finland. The first isolation of a giant virus from the region is indeed a success but reported isolation experiments for giant viruses are still relatively few. To help shed light on the likely distribution of Jyvaskylavirus-like viruses in the region, and further afield, the genome of Jyvaskylavirus could be searched against relevant available metagenomes.

      In the last decade, interest in finding giant viruses by metagenomics has increased. However, the focus has been on marine environments, where these viruses are shown to be prevalent. Besides the few isolates from the Northern hemisphere mentioned in the manuscript, northern giant viruses were detected in metagenome datasets from glacier samples, epishelf lakes, the permafrost, the Nordic seas and a deep-sea hydrothermal vent. Most of the genomic hits are for mimivirus-like or phycodnavirus-like sequences. A few marseilleviruses were found in the Loki’s castle deep-sea vent, and we have already included these sequences in the analysis shown in Supplementary Figure 3. In this case, the deep-sea vent viruses cluster outside the conventional clades of the Marseilleviridae family, evidencing their uniqueness.

      In response to the suggestion of exploring the distribution of Jyvaskylavirus, we utilized the MGnify-database to search for DNA polymerase (DNApol) and major capsid protein (MCP) sequences. Our findings revealed multiple hits with significantly low E-values (< 1e-80), where both DNApol and MCP were detected from the same studies, indicating the presence of similar virus-like particles (VLPs) globally. Of particular interest was the detection of similar sequences in metagenomes and transcriptomes obtained from drinking water distribution systems of ground and surface waterworks in central and eastern Finland (https://www.ebi.ac.uk/metagenomics/studies/MGYS00005650#overview). We have acknowledged this in the manuscript and cited the appropriate references, as follows:

      Results: “Searching the Jyvaskylavirus major capsid protein and DNA polymerase sequences in the MGnify-database (Richardson et al 2023) yields multiple hits with significantly low E-values (< 1e-80), as expected from the apparent ubiquity of marseilleviruses. Of note was the detection of similar sequences in metagenomes and transcriptomes obtained from drinking water distribution systems of ground and surface waterworks in central and eastern Finland, evidencing that marseilleviruses are prevalent but still unexplored in this region (Tiwari et al 2022)”.

      Discussion: “Marseillevirus DNA polymerase sequences are present in metagenomes from Finnish drinking water distribution systems (Tiwari et al 2022), hinting to a wide distribution of these viruses and still unknown ecological role in Central and Eastern Finland.”

      Reviewer #2 (Recommendations for the authors):

      Apart from the major comments in the weaknesses section, I have these additional minor comments to the authors:

      (1) I do not understand why the authors emphasized the uniqueness of isolating a giant virus from Finland. I think the manuscript would benefit if they rather emphasize that the virus comes from a boreal environment.

      The first giant virus, APMV, was described in 2003. In the following years, the apparent ubiquity of these viruses became evident on two fronts. Metagenomics made clear that giant viruses are found almost everywhere, with a bias towards the oceans. Isolation efforts brought new virus groups to light but have so far been biased towards samples from central Europe and South America. The closest isolated giant viruses to Jyvaskylavirus would be either an uncharacterized Swedish cedratvirus or a few microalgae-infecting mimivirus-like and phycodnavirus-like isolates from Norway. Among marseilleviruses, Jyvaskylavirus is the northernmost isolate so far. Other marseilleviruses from the northern hemisphere were found in France, India, Japan and Algeria only.

      We still believe that finding a giant virus in Finland is relevant, considering that no other is known to date, whether as an isolate or detected by genomics. We have made these observations clearer in the manuscript, giving emphasis to the boreal environment as well.

      (2) All discussed AlphaFold models should be added as Supplementary PDB data.

      We thank reviewer #2 for the suggestion. In the ‘Data availability’ statement of the original submission, we stated that ‘Predicted Jyvaskylavirus PDB models using ModelAngelo and Alphafold have been deposited at BioStudies under the accession number S-BSST1654’. The atomic coordinates of all predicted models are therefore publicly available at https://www.ebi.ac.uk/biostudies/; for additional clarity, we also added the link to the ‘Data availability’ statement in the revised version.

      Our reason for not depositing them in the Protein Data Bank in association with our EMD-51613 entry is that they remain predicted models rigid-body fitted into the Jyvaskylavirus density map of 6.3 Å resolution. However, we have added to our BioStudies deposition (BSST1654) the whole Jyvaskylavirus pentameric assembly model (including all identified and predicted major and minor capsid proteins) rigid-body fitted into the Jyvaskylavirus map, and it can be easily downloaded.

      We did not perform the real-space ‘minimization_global’ refinement of the predicted models corresponding to the ORFs of Melbournevirus (or Jyvaskylavirus) into the corresponding available Melbournevirus densities (entries EMD-37188, 37189, 37190 at ~3.5 Å resolution, obtained by block-based reconstruction methods), as these maps were generated and deposited by other authors. Instead, we performed the rigid-body fit-into-map procedure of the individual predicted Jyvaskylavirus models into the previously deposited Melbournevirus maps using ChimeraX, demonstrating a fold-map alignment and assignment (see for example the individual stereo views in Supplementary Figure 6).

      In the revised version, we now provide the refinement statistics for the complete Jyvaskylavirus pentameric assembly (inclusive of peripentonal major capsid and minor capsid proteins) rigid-body fitted as a whole into the Melbournevirus 5-block reconstruction map using PHENIX, resulting in a CC<sub>mask</sub> of 57.3% (this is also stated in Supplementary Figure 7).

      The same pentameric assembly model was then placed into our lower-resolution 6.3 Å Jyvaskylavirus 3D density map in ChimeraX and rigid-body refined as a whole in PHENIX, yielding a predictably lower CC<sub>mask</sub> of 33%. This pentameric assembly model has now also been included in the BioStudies entry.

      The procedure for this rigid body fitting and refinement has been clarified and added to the 'Materials and Methods' section as follows:

      “Then, the corresponding full 3D models were predicted using AlphaFold3 and fitted into the Melbournevirus and Jyvaskylavirus cryoEM density using the fit-into-map routine in ChimeraX together with the peripentonal capsomers (Meng et al 2023). To assess the metric of this fitting (Supplementary Figure 7), the 3.5 Å five-fold Melbournevirus block 3D density (EMDB-37190) was boxed around the pentameric assembly model and refined as a whole using rigid-body refinement in PHENIX, yielding a CC<sub>mask</sub> of 57.3%. The same pentameric model was subsequently fitted into the 6.3 Å Jyvaskylavirus 3D cryo-EM density (previously boxed around the model), resulting in a lower CC<sub>mask</sub> of 33%, consistent with the limited resolution of the capsid map and below regions.”

      (3) Figure 2A: Could ORFs that encode structural proteins discussed in the paper be somehow highlighted?

      We have updated Figure 2A to include this information.

      (4) Figure 2C: Could the members on which structural characterisation was conducted be somehow highlighted (e.g. by some symbol next to the name)?

      We have updated Figure 2C to include this information.

      (5) Figure 5A: Could the central bid be shown in a lower threshold (you can retain the threshold for the protein shell)? It would be interesting to see some details of the interior, rather than a massive blob.

      We have decreased the threshold level of the map as suggested.

      (6) Figure 6: the density corresponding to MCPs, minor capsid, and penton proteins respectively could be colour-zoned in Chimera(X). This would better visualise where each entity lies.

      About ORF142 - what other virus protein possesses this fold? Is it similar to the penton protein in other PRD1/Adenoviridae viruses? Maybe some comparison could be presented?

      We have incorporated the feedback from reviewer #2 by modifying the corresponding panel A in Figure 6. We have colour-zoned the penton (ORF142), some of the density region corresponding to the MCPs (ORF184) and to the minor capsid proteins (ORF121). We have kept in grey the density corresponding to other minor proteins; those we were able to identify are introduced later and shown as individual coloured cartoon tube models fitted into the density in panel A of Figure 7.

      Regarding ORF142, we have included a reference in the Discussion section to a new Supplementary Figure 9, where we provide a side-by-side comparison of the predicted Jyvaskylavirus penton protein model with experimentally derived penton protein models of PRD1 and HCIV-1. In light of this comparison, we have also added a brief clarification in the Discussion as follows:

      “However, in ORF142, the CHEF strands are predicted to be tilted relative to the BIDG strands, with an estimated angle of approximately 60° based on visual inspection (Supplementary Figure 9).”

      (7) Figure 7B: Could the density around the protein be zoned (rather than side view clipped), as this would better showcase how it fits the density?

      Initially, we presented a side view of the clipped surface to highlight the correspondence between the wall-shaped density, characteristic of a low-resolution beta-barrel, and the beta-barrel of the predicted model. Following the Reviewer’s suggestion, we have now surface-zoned the density and provided a stereo view of the density with the model fitted into the map using ChimeraX. While we recognize that stereo views are no longer commonly used in main text figures, we believe they remain valuable for visually assessing the overall match in low-resolution 3D density maps.

      (8) The authors did not try to reconstruct the asymmetric feature of the virion by classifying pentons, which may have identified a special vertex, one they claim might be required for genome packaging in "open particles". I understand the number of particles is low, but even low-resolution classification in C5 might be of interest in the field.

      We thank reviewer #2 for this valuable comment. The potential existence of a unique vertex in Marseilleviruses remains an open and intriguing question. Further investigations, including a significant increase in the number of particles, may help clarify this issue, and we plan to explore this topic in future structural studies.

      (9) Supplementary Figure 2: It would be interesting how the titre changes after the 12 hours, will it plateau? Could you add a bar showing the original titre to the chart showing stability after 109 days? I like the data in this figure and think it should be transferred to the main text.

      The titre at the 12 h time point is very close to the titre we usually obtain in our stocks, indicating that it is indeed close to peaking. For comparison, the titre at the 12-hour time point was 10<sup>11.55</sup> TCID50/ml, whereas our stock has a titre of 10<sup>11.66</sup> TCID50/ml. Our growth curve originally included further time points up to 48 h, but we lost the later ones because the viral load was higher than predicted and could not be counted with the dilutions used. Showing the first 12 hours was enough for our initial purpose, which was to show a quick replication cycle for Jyvaskylavirus, in accordance with the other marseilleviruses for which the timing of the replication cycle has been observed (see the answer to point 10 below).
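
      As a rough illustration of how close these two titres are, the log10 values quoted above can be converted into a fold difference. The sketch below is only arithmetic on the two reported numbers; the variable names are illustrative and no assay uncertainty is modelled.

```python
# Titres as reported in the response, in log10(TCID50/ml).
log10_titre_12h = 11.55    # growth curve, 12-hour time point
log10_titre_stock = 11.66  # typical stock titre

# A difference of delta log10 units corresponds to a 10**delta fold difference.
delta_log10 = log10_titre_stock - log10_titre_12h
fold_difference = 10 ** delta_log10

print(f"log10 difference: {delta_log10:.2f}")
print(f"stock / 12 h titre: {fold_difference:.2f}-fold")
```

      On these numbers, the stock titre is roughly 1.3-fold higher than the 12 h titre.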

      We have added a bar representing the original titre of the stock used for the stability experiment as suggested.

      While preparing the draft, we were divided on whether to place the growth and stability figure in the main text or in the supplementary material. Our decision was to move these data to the supplementary material and keep the focus of the main text on the discovery, genome analysis and structural data, as these are the main findings of our work. The specifics regarding stability, growth and other uncharacterized VLPs went to the supplementary material for those in the field who are interested in looking deeper. That being said, we would keep these data as supplementary material if you and the editor agree.

      (10) In the Discussion, the authors should focus on how our perception of giant viruses changes by this study - compare with other growth curves, stability assays, and structures of giant viruses, showcasing how prevalent those stabilising minor capsid proteins are, etc. My impression is that in the current form, it is just not clear if/how substantial these findings are and such a comparison and putting the results in a bigger picture would considerably increase the impact of the paper.

      Our comparisons with other marseilleviruses were based on genomic and structural characteristics, the two fronts on which we had data from the literature and databases to compare with. Unfortunately, there is not much information regarding the stability and growth of other isolates that could be used for an in-depth comparison. For example, although marseilleviruses are known to have a fast replication cycle, this has been measured by DAPI staining of DNA inside infected cells to evaluate viral factory formation (Boyer et al 2009), or by time-series observations of viral cycle stages by electron microscopy (Fabre et al 2017), and not by viral titration as done here. We included a mention of these references in the results:

      “A fast replication cycle is a feature also shown for other marseilleviruses (Boyer et al 2009 ; Fabre et al 2017).”

      The literature also does not report virion stability for other isolates, making a comparison with Jyvaskylavirus impossible. A comparative study testing different isolates side by side is definitely of relevance and interest, but it would be difficult to do in a short time because other isolates would first have to be obtained. We believe the results in this manuscript might set some parameters to be used for comparing with other marseilleviruses, by our groups and others, in the future.

      Regarding the prevalence of the minor capsid proteins, we have expanded and clarified the identification of ORFs in Melbournevirus in the ‘Results’ and ‘Discussion’ sections. The revised Supplementary Table 4 has been updated accordingly and referenced in the results to clarify that the identification of Melbournevirus ORFs was carried out in BLASTp by querying the Jyvaskylavirus minor protein sequences exclusively against the Melbournevirus isolate 1 (NCBI Reference Sequence: NC_025412.1). BLASTp was then performed against the full sequence database, and homologous sequences were primarily retrieved from other marseilleviruses. These results have been compiled in a new Supplementary Table 5.

      However, Supplementary Table 5 also shows that the hits for Melbournevirus are not ranked at the top, and in some cases, they do not appear among the top hits.

      The ‘Results’ section now contains the following text:

      “To this end, we identified the corresponding Jyvaskylavirus ORFs in Melbournevirus through sequence comparison with Melbournevirus isolate 1 (NCBI Reference Sequence: NC_025412.1) (Supplementary Table 4). However, when the identified Jyvaskylavirus ORF sequences were analyzed using BLASTp without restricting the search to the Melbournevirus reference, many hits were observed in other giant viruses, primarily marseillevirus. Remarkably, some of these hits scored higher than those for Melbournevirus, supporting the presence of homologous proteins in these viruses (Supplementary Table 5).”

      The ‘Discussion’ section now contains the following text:

      “Additionally, the observation that the identified Jyvaskylavirus minor capsid protein sequences are shared across other marseillaviruses supports their essential structural and stabilizing roles in these viruses.”

      At the same time, we have modified the ‘Materials and Methods’ section to include a reference to Supplementary Figure 5, where the use of ModelAngelo is mentioned. Additionally, a new Supplementary Figure 10 has been included to clarify how the residues built into the Melbournevirus density using ModelAngelo (without prior knowledge of any sequence) are subsequently matched with the Jyvaskylavirus sequences.

      (11) Based on the author's statement, Iker Arriaga did all the cryoEM experiments. It is strange to me they are not placed higher on the author's list.

      We thank you for this observation and agree with your comment. This manuscript has been in preparation for a few years, and the first draft had the author order defined before the structural data collection and analyses were completed. Iker's participation was indeed important and substantial from the first draft to the submitted version, and he definitely deserves a better author placement. We have modified the author order to accommodate this. Note that only the author order changed and that no author has been included or removed.

    1. eLife Assessment

      This manuscript presents solid evidence suggesting that the loss of ZNRF3 and RNF43, two E3 ubiquitin ligases, leads to dysregulation of EGFR signaling in cancer. The authors propose that EGFR is a direct substrate of ZNRF3/RNF43. While the authors provide immunoprecipitation data showing increased detection of ubiquitinated species, this evidence does not definitively establish that EGFR itself is ubiquitinated by RNF43/ZNRF3. The absence of direct evidence for EGFR ubiquitination is a major limitation, although the findings are useful as they may provide novel insights into the mechanisms underlying EGFR-driven cancers and open new therapeutic avenues.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors provide strong evidence that the cell surface E3 ubiquitin ligases RNF43 and ZNRF3, which are well known for their role in regulating cell surface levels of WNT receptors encoded by FZD genes, also target EGFR for degradation. This is a newly identified function for these ubiquitin ligases beyond their role in regulating WNT signaling. Loss of RNF43/ZNRF3 expression leads to elevated EGFR levels and signaling, suggesting a potential new axis to drive tumorigenesis, whereas overexpression of RNF43 or ZNRF3 decreases EGFR levels and signaling. Furthermore, RNF43 and ZNRF3 directly interact with EGFR through their extracellular domains.

      Strengths:

      The data showing that RNF43 and ZNRF3 interact with EGFR and regulate its levels and activity are thorough and convincing, and the conclusions are largely supported.

      Weaknesses:

      Prior work established a clear role for RNF43 and ZNRF3 in regulating cell surface levels of FZD, a class of WNT receptors. These new findings that these E3 ubiquitin ligases also target EGFR add a new layer of complexity, and it remains unclear to what extent WNT signaling versus EGFR signaling are impacted in cancer settings. The authors acknowledge this gap in our understanding, which will likely be the topic of follow-up studies.

      Comments on revisions:

      The authors addressed my main concerns in this revised version and in their rebuttal comments. I have no further critiques to add.

    3. Reviewer #2 (Public review):

      1st Public review:

      Using proteogenomic analysis of human cancer datasets, Yu et al. found that EGFR protein levels negatively correlate with ZNRF3/RNF43 expression across multiple cancers. Interestingly, they found that CRC harbouring the frequent RNF43 G659Vfs*41 mutation exhibits higher levels of EGFR when compared to RNF43 wild-type tumors. This is highly interesting since this mutation is generally not thought to influence Frizzled levels and Wnt-β-catenin pathway activity. Using CRISPR knockouts and overexpression experiments, the authors show that EGFR levels are modulated by ZNRF3/RNF43. Supporting these findings, modulation of ZNRF3/RNF43 activity using Rspondin also leads to increased EGFR levels. Mechanistically, the authors show that ZNRF3/RNF43 ubiquitinate EGFR and lead to its degradation. Finally, the authors present functional evidence that loss of ZNRF3/RNF43 unleashes EGFR-mediated cell growth in 2D culture and organoids and promotes tumor growth in vivo.

      Overall, the conclusions of the manuscript are well supported by the data presented, but some aspects of the mechanism presented need to be reinforced to fully support the claims made by the authors. Additionally, the title of the paper suggests that ZNRF3 and RNF43 loss leads to hyperactivity of EGFR and that its signalling activity contributes to cancer initiation/progression. I don't think the authors convincingly showed this in their study.

      Major points:

      (1) EGFR ubiquitination. All of the experiments supporting that ZNRF3/RNF43 mediate EGFR ubiquitination are performed under overexpression conditions. A major caveat is also that none of the ubiquitination experiments are performed under denaturing conditions. Therefore, it is impossible to claim that the ubiquitin immunoreactivity observed on the western blots presented in Fig. 4 corresponds to ubiquitinated-EGFR species.

      Another issue is that in Figure 4A, the experiments suggest that the RNF43-dependent ubiquitination of EGFR is promoted by EGF. However, there is no control showing the ubiquitination of EGFR in the absence of EGF but under RNF43 overexpression. According to the other experiments presented in Figures 4B, 4C and 4F, there seems to be a constitutive ubiquitination of EGFR upon overexpression. How do the authors reconcile the role of ZNRF3/RNF43 vs c-cbl?

      (2) EGFR degradation vs internalization. In Figure 3C, the authors show experiments that demonstrate that RNF43 KO increases steady state levels of EGFR and prevents its EGF-dependent proteolysis. Using flow cytometry they then present evidence that the reduction in cell surface levels of EGFR mediated by EGF is inhibited in the absence of RNF43. The authors conclude that this is due to inhibition of EGF-induced internalization of surface EGF. However, the experiments are not designed to study internalization and rather merely examine steady state levels of surface EGFR pre and post treatment. These changes are an integration of many things (retrograde and anterograde transport mechanisms presumably modulated by EGF). What process(es) is/are specifically affected by ZNRF3/RNF43? Are these processes differently regulated by c-cbl? If the authors are specifically interested in internalization/recycling, cell surface biotinylation experiments and time courses are needed to examine the effect of EGF in the presence or absence of the E3 ligases.

      (3) RNF43 G659fs*41. The authors make a point in Figure 1D that this mutant leads to elevated EGFR in cancers but do not present evidence that this mutant is ineffective in mediating ubiquitination and degradation of EGFR. As this mutant maintains its ability to promote Frizzled ubiquitination and degradation, it would be important to show side by side that it does not affect EGFR. This would perhaps imply differential mechanisms for these two substrates.

      (4) "Unleashing EGFR activity". The title of the paper implies that ZNRF3/RNF43 loss leads to increased EGFR expression and hence increased activity that underlies cancer. However, I could find only one piece of direct evidence, showing that increased proliferation of the HT29 cell line mutant for RNF43 could be inhibited by the EGFR inhibitor Erlotinib. All the other evidence presented that I could find is correlative or indirect (e.g. RPPA showing increased phosphorylation of pathway members upon RNF43 KO, increased proliferation of a cell line upon ZNRF3/RNF43 KO, decreased proliferation of a cell line upon ZNRF3/RNF43 OE in vitro or in xeno...). Importantly, the authors claim that cancer initiation/progression in ZNRF3/RNF43 mutants may in some contexts be independent of their regulation of Wnt-β-catenin signaling and rely on EGFR activity upregulation. However, this has not been tested directly. Could the authors leverage their znrf3/RNF43 prostate cancer model to test whether EGFR inhibition could lead to reduced cancer burden whereas a Frizzled or Wnt inhibitor does not?

      More broadly, if EGFR signaling were to be unleashed in cancer, then one prediction would be that these cells would be more sensitive to EGFR pathway inhibition. Could the authors provide evidence that this is the case? Perhaps using isogenic cell lines or a panel of patient derived organoids (with known genotypes).

      Comments on revisions:

      The most important criticism of this manuscript that I raised in my original review has not been addressed. Indeed, the authors claim that EGFR is a direct substrate of the RNF43/ZNRF3 E3 ligase. This has not been directly demonstrated. Indeed, showing increased detection of ubiquitinated species in an immunoprecipitate could mean that a protein is directly modified. However, an alternative explanation is that a protein that is co-immunoprecipitated with the target protein is ubiquitinated (such as several EGFR adapters and interacting partners). Performing these experiments under denaturing conditions is one way to determine that EGFR is the substrate. Alternatively, a quantitative MS approach to quantify an increase in ubiquitinated peptides would also enable the authors to conclude that EGFR is indeed a substrate.

      In addition, one of the main conclusions of the authors is that EGFR activity is unleashed in cancer following ZNRF3 and/or RNF43 loss (as the title suggests). There is still no direct evidence in the manuscript that this is the case. I appreciate the new data showing that MEFs with knockout of RNF43/ZNRF3 are sensitive to an EGFR inhibitor (and not a porcupine inhibitor), but what are the data supporting that EGFR activity is "unleashed" in cancer? The authors still claim that ZNRF3 and RNF43 loss could impact cancer initiation/development in a Wnt-independent fashion (see lines 341-343). I believe this conclusion is based on correlative staining of nuclear β-catenin (which is in itself not a reliable readout of active signaling) and not on functional data... I suggested in my original review that the authors should test the efficacy of an EGFR inhibitor and a Wnt inhibitor in the prostate cancer model that they present in Figure 7, which would have enabled them to draw firm conclusions about their relative contributions. This was largely handwaved in their rebuttal letter... Doing experiments in WT cells is not the same as addressing this question in the context of cancer.

      Finally, the authors use CRISPR KO experiments without assessing editing or KO efficiencies throughout the manuscript and simply assume that the gRNAs work. In my opinion this is an unacceptable practice.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors provide strong evidence that the cell surface E3 ubiquitin ligases RNF43 and ZNRF3, which are well known for their role in regulating cell surface levels of WNT receptors encoded by FZD genes, also target EGFR for degradation. This is a newly identified function for these ubiquitin ligases beyond their role in regulating WNT signaling. Loss of RNF43/ZNRF3 expression leads to elevated EGFR levels and signaling, suggesting a potential new axis to drive tumorigenesis, whereas overexpression of RNF43 or ZNRF3 decreases EGFR levels and signaling. Furthermore, RNF43 and ZNRF3 directly interact with EGFR through their extracellular domains.

      Strengths:

      The data showing that RNF43 and ZNRF3 interact with EGFR and regulate its levels and activity are thorough and convincing, and the conclusions are largely supported.

      Weaknesses:

      While the data support that EGFR is a target for RNF43/ZNRF3, some of the authors' interpretations of the data on EGFR's role relative to WNT's roles downstream of RNF43/ZNRF3 are overstated. The authors, perhaps not intentionally, promote the effect of RNF43/ZNRF3 on EGFR while minimizing their role in WNT signaling. This is the case in most of the biological assays (cell and organoid growth and mouse tumor models). For example, the conclusion of "no substantial activation of Wnt signaling" (page 14) in the prostate cancer model is currently not supported by the data and requires further examination. In fact, examination of the data presented here indicates effects on WNT/β-catenin signaling, consistent with previous studies.

      Cancers in which RNF43 or ZNRF3 are deleted are often considered to be "WNT addicted", and inhibition of WNT signaling generally potently inhibits tumor growth. In particular, treatment of WNT-addicted tumors with Porcupine inhibitors leads to tumor regression. The authors should test to what extent PORCN inhibition affects tumor (and APC-min intestinal organoid) growth. If the biological effects of RNF43/ZNRF3 loss are mediated primarily or predominantly through EGFR, then PORCN inhibition should not affect tumor or organoid growth.

      We thank the reviewer for appreciating the key strengths of our study. We fully agree with the reviewer that RNF43/ZNRF3 play key roles in restraining WNT signaling and that their deletion activates WNT signaling, which leads to cancer promotion, as discussed and cited in our manuscript (Hao et al, 2012; Koo et al, 2012). We have revised the language in this manuscript to avoid any confusion or appearance of downplaying this known signaling pathway in cancer progression.

      What we would like to highlight in this work is that our study uncovered an effect of RNF43/ZNRF3 on EGFR, leading to biological impact in multiple model systems. In particular, we included the APC-mutated human cancer cell line HT29 and Apc min mouse intestinal tumor organoids. In the context of APC mutations, β-catenin stabilization and the activation of WNT target genes are essentially decoupled from upstream WNT ligand binding to WNT receptors; thus, we could primarily focus on the effect of RNF43/ZNRF3 on EGFR. Our statement of “no substantial activation of WNT signaling”, as cited by the reviewer, was made in describing the data in Fig. 7E, where we did not observe β-catenin accumulation in the nucleus and therefore inferred no substantial activation of canonical WNT signaling. We agree that further examination would help strengthen the conclusion and appreciate the reviewer’s suggestion of PORCN inhibition experiments. While PORCN inhibition is a valuable experiment in models with an abundance of WNT ligands/receptors and non-mutationally activated regulators of WNT signaling (Yu et al, 2020), in biological scenarios with existing APC mutations, another group has previously demonstrated that PORCN inhibition had no observable effect on WNT signaling in APC-deficient cells (PMID: 29533772). In our initial submission, we confirmed this predicted low response to manipulation of WNT signaling components upstream of a mutated APC. We showed that addition of RSPO1 in Apc min mouse intestinal tumor organoids failed to further activate WNT target expression (Fig. 6G). Furthermore, in this revised manuscript, we added new data on EGFR inhibition and PORCN inhibition in WT and Znrf3 KO MEFs (Fig. 6L). PORCN inhibition had no impact on cell growth in either WT or Znrf3 KO MEFs, suggesting that the growth-promoting effect of Znrf3 KO in MEFs is WNT independent. In contrast, inhibition of EGFR downstream signaling components (Fig. 6L) significantly blocked MEF growth and abolished the impact of Znrf3 KO on MEF growth. This new evidence further supports our main conclusion that RNF43/ZNRF3 controls EGFR signaling to regulate cell growth.

      Reviewer #2 (Public Review):

      Using proteogenomic analysis of human cancer datasets, Yu et al. found that EGFR protein levels negatively correlate with ZNRF3/RNF43 expression across multiple cancers. Interestingly, they found that CRC harbouring the frequent RNF43 G659Vfs*41 mutation exhibits higher levels of EGFR when compared to RNF43 wild-type tumors. This is highly interesting since this mutation is generally not thought to influence Frizzled levels and Wnt-β-catenin pathway activity. Using CRISPR knockouts and overexpression experiments, the authors show that EGFR levels are modulated by ZNRF3/RNF43. Supporting these findings, modulation of ZNRF3/RNF43 activity using Rspondin also leads to increased EGFR levels. Mechanistically, the authors show that ZNRF3/RNF43 ubiquitinate EGFR and lead to its degradation. Finally, the authors present functional evidence that loss of ZNRF3/RNF43 unleashes EGFR-mediated cell growth in 2D culture and organoids and promotes tumor growth in vivo.

      Overall, the conclusions of the manuscript are well supported by the data presented, but some aspects of the mechanism presented need to be reinforced to fully support the claims made by the authors. Additionally, the title of the paper suggests that ZNRF3 and RNF43 loss leads to the hyperactivity of EGFR and that its signalling activity contributes to cancer initiation/progression. I don't think the authors convincingly showed this in their study.

      We thank the reviewer for commenting that our “conclusions of the manuscript are well supported by the data presented.” We address the concerns raised by this reviewer point by point below:

      Major points:

      (1) EGFR ubiquitination. All of the experiments supporting that ZNRF3/RNF43 mediates EGFR ubiquitination are performed under overexpression conditions. A major caveat is also that none of the ubiquitination experiments are performed under denaturing conditions. Therefore, it is impossible to claim that the ubiquitin immunoreactivity observed on the western blots presented in Figure 4 corresponds to ubiquitinated-EGFR species. Another issue is that in Figure 4A, the experiments suggest that the RNF43-dependent ubiquitination of EGFR is promoted by EGF. However, there is no control showing the ubiquitination of EGFR in the absence of EGF but under RNF43 overexpression. According to the other experiments presented in Figures 4B, 4C, and 4F, there seems to be a constitutive ubiquitination of EGFR upon overexpression. How do the authors reconcile the role of ZNRF3/RNF43 vs c-cbl?

      We agree with the reviewer about the limitations of overexpression experiments. In this manuscript, we leveraged both overexpression and knockout systems to demonstrate that ZNRF3/RNF43 regulates EGFR ubiquitination: in Fig 4A, we showed that overexpression of RNF43 increased EGFR ubiquitination; in Fig 4B&C and Fig S3A, we showed that RNF43 knockout decreased EGFR ubiquitination; in Fig 4F, we showed that overexpression of ZNRF3 WT increased EGFR ubiquitination, whereas overexpression of the ZNRF3 RING domain deletion mutant failed to do so.

      We also appreciate the rigor with which the reviewer has approached our methodology. We acknowledge that denaturing conditions can provide additional validation, but the technical challenges associated with denaturing conditions include the potential disruption of epitope structures recognized by these antibodies. Our methodology was chosen to balance the need for accurate detection with the preservation of protein structure and function, which are crucial for understanding the biological implications of EGFR ubiquitination. Moreover, our immunoprecipitation and subsequent Western blotting were stringent with high SDS and 2-ME, optimized to minimize non-specific binding and enhance the specificity of detection. We believe that the data presented are robust and contribute significantly to the existing body of knowledge on EGFR ubiquitination.

      CBL is a well-known E3 ligase of EGFR, and it induces EGFR ubiquitination upon EGF ligand stimulation. Therefore, in order to have a fair comparison of RNF43 and CBL on EGFR ubiquitination, we designed Fig 4A and related experiments in the setting of EGF stimulation. We observed that RNF43 overexpression increased EGFR ubiquitination as potently as CBL did. Following this result, we further demonstrated that knockout of RNF43 decreased the endogenous ubiquitinated EGFR level in the unstimulated/basal condition (Fig 4B) as well as in the EGF-stimulated condition (Fig 4C). We acknowledge the importance of, and interest in, fully understanding how ZNRF3/RNF43 interplays with the functions of CBL in regulating EGFR ubiquitination. This line of investigation indeed holds the potential to uncover novel regulatory mechanisms in detail. However, the primary focus of the current study was to establish a foundational understanding of the role of ZNRF3/RNF43 in regulating EGFR ubiquitination. We look forward to exploring this further in future work.

      (2) EGFR degradation vs internalization. In Figure 3C, the authors show experiments that demonstrate that RNF43 KO increases steady-state levels of EGFR and prevents its EGF-dependent proteolysis. Using flow cytometry they then present evidence that the reduction in cell surface levels of EGFR mediated by EGF is inhibited in the absence of RNF43. The authors conclude that this is due to inhibition of EGF-induced internalization of surface EGF. However, the experiments are not designed to study internalization and rather merely examine steady-state levels of surface EGFR pre and post-treatment. These changes are an integration of many things (retrograde and anterograde transport mechanisms presumably modulated by EGF). What process(es) is/are specifically affected by ZNRF3/RNF43? Are these processes differently regulated by c-cbl? If the authors are specifically interested in internalization/recycling, cell surface biotinylation experiments and time courses are needed to examine the effect of EGF in the presence or absence of the E3 ligases.

      We agree that our study design primarily assesses EGFR levels on the cell surface before and after EGF treatment and does not comprehensively measure the whole internalization process. In response to the reviewer’s comments, we have revised the relevant sections of the manuscript to clarify that our current findings are focused on changes in cell surface EGFR and do not extend to the detailed mechanisms of EGF-induced internalization or recycling.

      (3) RNF43 G659fs*41. The authors make a point in Figure 1D that this mutant leads to elevated EGFR in cancers but do not present evidence that this mutant is ineffective in mediating ubiquitination and degradation of EGFR. As this mutant maintains its ability to promote Frizzled ubiquitination and degradation, it would be important to show side by side that it does not affect EGFR. This would perhaps imply differential mechanisms for these two substrates.

      Fig 1D is based on bioinformatic analysis of colon cancer patient samples, showing that RNF43 G659Vfs*41 mutant tumors exhibited significantly higher levels of EGFR protein compared to RNF43 WT tumors. Following this lead, we investigated whether this RNF43 G659fs*41 hotspot mutation lost its role in downregulating EGFR. To this end, we transfected the same amount of control vector, RNF43 WT, RING deletion mutant, or G659fs*41 mutant DNA into 293T cells and measured the level of EGFR (co-transfected). As shown in Author response image 1, overexpression of RNF43 WT decreased EGFR level while overexpression of the RING deletion mutant had no impact on EGFR level as compared with the Vector group, which is consistent with our findings in the manuscript. Cells transfected with the RNF43 G659Vfs*41 mutant exhibited nearly normal levels of EGFR; however, we also observed that RNF43 G659Vfs*41 was less expressed than WT, even though the same amounts of DNA were transfected. Therefore, the insubstantial impact on EGFR levels could be attributed to either functional loss or compromised stability of RNF43 G659Vfs*41 mRNA or protein. Further investigation of RNF43 G659Vfs*41 mRNA and protein stability vs. RNF43 G659Vfs*41 protein function is needed to draw a solid conclusion.

      Author response image 1.

      (4) "Unleashing EGFR activity". The title of the paper implies that ZNRF3/RNF43 loss leads to increased EGFR expression and hence increased activity that underlies cancer. However, I could find only one piece of direct evidence showing that increased proliferation of the HT29 cell line mutant for RNF43 could be inhibited by the EGFR inhibitor Erlotinib. All the other evidence presented that I could find is correlative or indirect (e.g. RPPA showing increased phosphorylation of pathway members upon RNF43 KO, increased proliferation of a cell line upon ZNRF3/RNF43 KO, decreased proliferation of a cell line upon ZNRF3/RNF43 OE in vitro or in xeno...). Importantly, the authors claim that cancer initiation/progression in ZNRF3/RNF43 mutants may in some contexts be independent of their regulation of Wnt-bcatenin signaling and rely on EGFR activity upregulation. However, this has not been tested directly. Could the authors leverage their znrf3/RNF43 prostate cancer model to test whether EGFR inhibition could lead to reduced cancer burden whereas a Frizzled or Wnt inhibitor does not?

      More broadly, if EGFR signaling were to be unleashed in cancer, then one prediction would be that these cells would be more sensitive to EGFR pathway inhibition. Could the authors provide evidence that this is the case? Perhaps using isogenic cell lines or a panel of patient-derived organoids (with known genotypes).

      We appreciate the reviewer’s suggestion to provide more direct evidence demonstrating the importance of the ZNRF3/RNF43-EGFR axis in cancer cell proliferation. In this revised manuscript, we further studied this issue in the WT vs. Znrf3 KO MEF cells. We observed that treatment with the EGFR inhibitor erlotinib did not affect WT MEF but stunted the growth advantage of Znrf3 KO MEF cells (Fig. 6L). On the other hand, treatment with the porcupine inhibitor C59 did not impact either WT or Znrf3 KO MEF cells (Fig. 6L), suggesting a more important role of the ZNRF3/RNF43-EGFR axis in mediating the enhanced cell growth of MEF caused by Znrf3 knockout. Furthermore, considering EGFR is often mutated in human cancer, to increase the clinical relevance of our study, we also tested the effect of RNF43 knockout on EGFR L858R (Fig. 2D), a common oncogenic EGFR mutant, and found that RNF43 knockout in HT29 boosted levels of this EGFR mutant detected by its FLAG tag, suggesting that RNF43 degrades both WT and mutated EGFR and its loss can enhance signaling of both WT EGFR and its oncogenic mutant. However, we emphasize again that this manuscript is in no way written to diminish the proven importance of the ZNRF3/RNF43-WNT-β-catenin axis in cancer and development.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The main conclusion that EGFR is targeted for degradation by RNF43 and ZNRF3 is well supported and documented. Figures 1-5 and associated supplemental figures contain largely convincing data. Figures 6 and 7, however, require some modifications, as follows in order of appearance:

      Figure 6C: Growth of intestinal tumor organoids from Apcmin mice does not require Rspo, however, the authors show that these organoids grow larger in the presence of Rspo, an effect they attribute to increased EGFR activity, rather than increased WNT activity. While this conclusion may be correct, the authors should address this possibility by treating the organoids with PORCN inhibitor. The prediction would be that Rspo treatment still increases organoid size in the presence of PORCN inhibition. A further prediction would be that blocking EGFR (e.g. with Cetuximab) will abrogate the RSPO1 effect.

      Yes, we attributed the impact of Rspo on Apc min organoid growth to enhanced EGFR activity because we observed increased EGFR levels (Fig 6F) but no detectable increase in eight WNT target genes assayed. We agree that further pharmacologic experiments would further boost our conclusion, but our few attempts at treating organoids encountered technical difficulties. Hence, we switched to testing PORCN inhibition vs EGFR inhibition in WT and Znrf3 KO MEFs. As shown in the revised Fig. 6L, EGFR inhibition significantly reversed the growth advantage caused by Znrf3 KO but C59 did not.

      Figure 6G: It is unclear why the authors provide "8-day RSPO1 treatment" data. Here, EGFR mRNA appears to be elevated 2-fold (perhaps not statistically significant), and the Wnt targets Lef1 and Axin2 are decreased, as indicated by the statistical significance. What point is being made here?

      Our observations of increased size of APC min mouse intestinal tumor organoids and increased EGFR protein levels were made at 8 days of RSPO1 treatment. Therefore, we measured mRNA levels at the same time point, with the 2-day time point also included for comparison. The goal of this qPCR experiment was to detect the contribution of WNT signaling, and we did not detect an increased transcriptional readout. We included EGFR mRNA levels for comparison, and we did not detect a statistically significant increase, consistent with our experiments concluding that ZNRF3/RNF43 regulate EGFR at the protein level. As stated in the preceding response, these data led us to attribute the impact of Rspo on Apc min organoid growth to enhanced EGFR activity.

      Figure 7A: This requires quantitation. How many mice were used per cell line? The data shown is not particularly convincing, with ZNRF3 overexpressing HT29 cells growing detectably. Showing representative mice is fine, but this should be supplemented with quantitation of all mice.

      We had provided this data. The BLI signal quantification was shown below the representative BLI images. Seven mice were used per cell line, as annotated at the top of the graph.

      Figure 7B: The authors assert that "canonical WNT signaling, based on levels of active-β-Catenin (non-phosphorylated at Ser33/37/Thr41; Figure 7B), remained unaffected". As shown, 2 of the 3 Myc-Znrf3 tumors have increased active-b-catenin signal over the GFP tumors. This indicates to me that canonical Wnt signaling was affected. The authors either need to present quantitative data that supports this claim or modify their conclusions. As presented, I don't think it is appropriate to decouple the effect of Znrf3 overexpression on EGFR from its effect on WNT.

      As requested, we have quantified the level of non-phospho β-Catenin at Ser33/37/Thr41 and found no significant differences (p > 0.05) between the control group vs. ZNRF3 overexpression group. We once again note that our manuscript was not meant to dispute the proven signaling and biological significance of WNT signaling regulation by ZNRF3/RNF43, and we have proof-read the manuscript multiple times to ensure that we did not make any generalized or misleading statements in this aspect.

      Author response image 2.

      Figure 7E: Here the authors assert that "no substantial activation of canonical Wnt signaling" in the Z&R KO tumors, however, the figure shows a substantial increase in active b-catenin staining. The current resolution is insufficient to claim that there is no increase in nuclear b-catenin. The authors' claim that WNT signaling is not involved here is not supported by the data presented here. One way to demonstrate that this effect is through EGFR activation and not through WNT activation is to treat mice with PORCN inhibitor. WNT-addicted tumors, such as by Rnf43 or Znrf3 deletion, regress upon PORCN inhibition. In this case, if the effect of Z&R KO is mediated through EGFR rather than WNT, then there should be no effect on tumor growth upon PORCN inhibition. This is a critical experiment in order to make this point.

      We appreciate the reviewer’s comments and suggestion of experiments. We based our initial statement on insubstantial nuclear β-catenin staining, but we agree that immunohistochemical staining lacks the resolution suitable for quantification. We could not generate the adequate number of KO animals for these in vivo experiments in the window of time planned for this revision. Rather, as shown in the newly added Fig. 6L, we tested EGFR inhibition and PORCN inhibition in Znrf3 KO MEFs and obtained strong data further supporting EGFR in mediating Znrf3 KO promotion of MEF growth. Notwithstanding, we have carefully revised our description of the in vivo data in Fig 7E to avoid any confusion or over-interpretation.

      Minor points:

      Figure 2A: provide quantitation of this immunoblot.

      We have revised the manuscript to show the quantification result next to the immunoblot.

      Figure 2B: provide more detail in the figure legend and in the Materials and Methods section on how the KO MEFs were generated. Confirmation that Znrf3 (or in cases of Rnf43 KO) expression is lost in KO would be advisable.

      We have confirmed Znrf3 KO by genotyping and RNF43 KO by immunofluorescent staining. We have also tested multiple commercial anti-ZNRF3 antibodies and anti-RNF43 antibodies for Western blotting, but they all failed.

      Figure 4C is a little misleading. The schematic indicates that ECD-TM and TM-ICD truncations were analyzed for both ZNRF3 and RNF43. However, Figure 4 only shows data for ZNRF3, and the corresponding Figure S4 lacks data for the TM-ICD of Rnf43. A recommendation is to show only those schematics for which data is presented in that figure. On a related topic, the results using the deltaRING constructs (Figure S5) are not mentioned/described in the text.

      We think that the reviewer meant Fig 5C. We have revised Fig 5C by removing the RNF43 label, and we confirm that the Results section does include the data in Fig S5.

      Figure S4A: Only ZNRF3 is indicated in this figure. Please explain why RNF43 is not represented here. Also, indicate what is plotted along the x-axis.

      We only detected the endogenous ZNRF3-EGFR interaction, possibly because the RNF43 protein level is relatively low in the cell line we used for the mass spec experiment. The x-axis shows the proteins ordered by their y-axis values, as detailed in the figure legend: each data point was arranged along the x-axis based on the fold change of iBAQ of EGFR-associated proteins identified in EGF-stimulated vs. control in the log2 scale, from low to high (from left to right on the x-axis). We have added the phrase “Proteins detected by Mass-Spec” as the x-axis label.

      Reviewer #2 (Recommendations For The Authors):

      Minor Points.

      (1) In Figure 2B, the authors claim that Znrf3 KO enhanced both EGFR and p-EGFR levels both in the absence and presence of EGF. Although it is clear in the presence of EGF, the increase in p-EGFR in the absence of EGF is less than clear.

      We have revised the manuscript to more clearly state the result in Fig 2B.

      (2) Importantly the authors validated their findings using three independent RNF43 gRNA (fig S2D) but they do not show the editing efficiency obtained with the gRNA.

      We did not include RNF43 IB in this Figure due to the lack of specific antibodies for detecting RNF43 in IB. We have no reason to doubt adequate efficiency of knockout since EGFR was increased compared to the control group. As a result, we did not perform deep sequencing to validate knockout efficacy.

      (3) In S2E, the authors show that KO of either ZNRF3 or RNF43 enhances HER2 levels. This suggests that there is no redundancy between these E3 ligases, at least in this context. How do the authors reconcile that?

      The reviewer raised an interesting issue. Due to the lack of WB antibodies for these two proteins, we could not easily assess the feedback impact of knockout of either gene on the protein levels of the other gene. We speculate that there may be a threshold level of the sum of the two proteins that is needed for adequate degradation of HER2, leading to HER2 increase when either gene is knocked out. A detailed study of this issue is beyond the scope of the current work.

      (4) Experiments performed in Fig 3C are performed in only one clone. The authors need to repeat in an additional clone or rescue this phenotype using a RNF43 cDNA.

      Our RNF43 KO HT29 line is a pool of KO cells, not a single clone.

      (5) In Figure 7E, the authors suggest that the absence of nuclear bcatenin means that canonical Wnt signaling is unaffected. It is widely known that nuclear bcatenin is often not correlating with pathway activity.

      As stated above, we have revised the manuscript to avoid confusion and misinterpretation.

      (6) What is the nature of the error bars in Fig 3c? Are the differences statistically significant?

      As mentioned in the figure legend, the error bars are SEM. The result is statistically significant, and p-value is noted in the graph.

      (7) In the Figure legends, it should be stated clearly how many biological replicates were performed for each experiment and single data points should be plotted where applicable (e.g. qPCR data). It would be helpful if the uncropped and unprocessed Western blot membranes and replicates that are not shown would be accessible to allow the reader a more comprehensive view of the acquired data, especially for blots that were quantified (e.g. Figure 2F, Figure 3C, there is clearly some defect on the blot).

      For WB representation, it would be helpful to include more size markers on the Western blots (especially on the IPs that show ubiquitin smear) and in general to use a reference protein (GAPDH, Actin, Vinculin) that is closer to the protein being assessed.

      More details should be added in the Methods section to explain how protocols were performed in detail. For example, it should be explained how the viruses used for infecting cells were produced (which plasmids were transfected using which transfection reagent, how long was the virus collected for, etc). Then, it should be stated how long the cells were undergoing selection before being harvested. Because the expression of the viral constructs potentially has an effect on cell proliferation through EGFR, this information is quite relevant. This is just an example, there are details missing in nearly every section (Flow: washing protocols, gating protocols (Live/dead stain?), WB: RIPA lysis buffer composition? How much protein was loaded on blots? How was protein quantification done? IP: how were washes performed and how often repeated?)

      Missing: antibody dilutions for IF, IHC, and WB, plasmid backbones, sequences and availability, qPCR primer sequences from Origene.

      Incucyte experiments are not described.

      We have revised the relevant sections to include more details.

      (8) Line 141: revise text: 2x mRNA abundance in the same sentence.

      Line 162: define intermediate expression better.

      Line 197/198: revise text ('the predominant one'?).

      Line 218/219: revise text (Internalisation of surface EGFR?).

      Line 245: clarify in text that it is endogenous EGFR that is being pulled down.

      Line 264: typo: conserved instead of conservative.

      Line 324: revise text (What does 'unknown significance' mean).

      Line 396/397: revise text: 2x Co-IP in the same sentence.

      Figure 3 D/E: more details on the Method in the figure legend.

      We have revised them accordingly.

    1. eLife Assessment

      This manuscript presents a clever and powerful approach to examining differential roles of Nav1.2 and Nav1.6 channels in excitability of neocortical pyramidal neurons, by engineering mice in which a sulfonamide inhibitor of both channels has reduced affinity for one or the other channels. Overall, the results in the manuscript are compelling and give important information about differential roles of Nav1.6 and Nav1.2 channels. Activity-dependent inactivation of NaV1.6 was also found to attenuate seizure-like activity in cells, demonstrating the promise of activity-dependent NaV1.6-specific pharmacotherapy for epilepsy.

    2. Reviewer #1 (Public review):

      Summary:

      Prior research indicates that NaV1.2 and NaV1.6 have different compartmental distributions, expression timelines in development, and roles in neuron function. The lack of subtype-specific tools to control Nav1.2 and Nav1.6 activity however has hampered efforts to define the role of each channel in neuronal behavior. The authors attempt to address the problem of subtype specificity here by using aryl sulfonamides (ASCs) to stabilize channels in the inactivated state in combination with mice carrying a mutation that renders NaV1.2 and/or NaV1.6 genetically resistant to the drug. Using this innovative approach, the authors find that action potential initiation is controlled by NaV1.6 while both NaV1.2 and NaV1.6 are involved in backpropagation of the action potential to the soma, corroborating previous findings. Additionally, NaV1.2 inhibition paradoxically increases the firing rate, as has also been observed in genetic knockout models. Finally, the potential anticonvulsant properties of ASCs were tested. NaV1.6 inhibition but not NaV1.2 inhibition was found to decrease action potential firing in prefrontal cortex layer 5b pyramidal neurons in response to current injections designed to mimic inputs during seizure. This result is consistent with studies of loss-of-function Nav1.6 models and knockdown studies showing that these animals are resistant to certain seizure types. These results lend further support for the therapeutic promise of activity-dependent, NaV1.6-selective, inhibitors for epilepsy.

      Strengths:

      (1) The chemogenetic approaches used to achieve selective inhibition of NaV1.2 and NaV1.6 are innovative and help resolve long-standing questions regarding the role of Nav1.2 and Nav1.6 in neuronal electrogenesis.

      (2) The experimental design is overall rigorous, with appropriate controls included.

      (3) The assays to elucidate the effects of channel inactivation on typical and seizure-like activity were well selected.

      Weaknesses:

      (1) The potential impact of the YW->SR mutation in the voltage sensor does not appear to have been sufficiently assessed. The activation/inactivation curves in Figure 1E show differences in both activation and inactivation at physiologically relevant membrane voltages, which may be significant even though the V1/2 and slope factors are roughly similar.

      (2) Additional discussion of the fact that channels are only partially blocked by the ASC and that ASCs act in a use-dependent manner would improve the manuscript and help readers interpret these results.

      (3) NaV1.6 was described as being exclusively responsible for the change in action potential threshold, but when NaV1.6 alone was inactivated, the effect was significantly reduced from the condition in which both channels were inactivated (Figure 4E). Similarly, Figure 6C shows that blockade of both channels causes threshold depolarization prior to the seizure-like event, but selective inactivation of NaV1.6 does not. As NaV1.2 does not appear to be involved in action potential initiation and threshold change, what is the mechanism of this dissimilarity between the NaV1.6 inactivation and combined NaV1.6/ NaV1.2 inactivation?

      (4) The idea that use-dependent VGSC-acting drugs may be effective antiseizure medications is well established. Additional discussion or at least acknowledgement of the existing, widely used, use-dependent VGSC drugs should be included (e.g. Carbamazepine, Lamotrigine, Phenytoin). Also, the idea that targeting NaV1.6 may be effective for seizures is established by studies using genetic models, knockdown, and partially selective pharmacology (e.g. NBI-921352). Additional discussion of how the results reported here are consistent with or differ from studies using these alternative approaches would improve the discussion.

    3. Reviewer #2 (Public review):

      The authors used a clever and powerful approach to explore how Nav1.2 and Nav1.6 channels, which are both present in neocortical pyramidal neurons, differentially control firing properties of the neurons. Overall, the approach worked very well, and the results show very interesting differences when one or the other channel is partially inhibited. The experimental data are solid and are very nicely complemented by a computational model incorporating the different localization of the two types of sodium channels.

      In my opinion the presentation and interpretation of the results could be improved by a more thorough discussion of the fact that only incomplete inhibition of the channels can be achieved by the inhibitor under physiological recording conditions and I thought the paper could be easier to digest if the figures were re-organized. However, the key results are well-documented.

    4. Reviewer #3 (Public review):

      Summary:

      The authors used powerful and novel reagents to carefully assess the roles of the voltage gated sodium channel (NaV) isoforms in regulating the neural excitability of principal neurons of the cerebral cortex. Using this approach, they were able to confirm that two different isoforms, NaV1.2 and NaV1.6 have distinct roles in electrogenesis of neocortical pyramidal neurons.

      Strengths:

      Development of very powerful transgenic mice in which NaV1.2 and/or NaV1.6 were modified to be insensitive to ASCs, a particular class of NaV blocker. This allowed them to test for roles of the two isoforms in an acute setting, without concerns of genetic or functional compensation that might result from a NaV channel knockout.

      Careful biophysical analysis of ASC effects on different NaV isoforms.

      Extensive and rigorous analysis of electrogenesis - action potential production - under conditions of blockade of either NaV1.2 or NaV1.6 or both.

      Weaknesses:

      Some results are overstated in that the representative example records provided do not directly support the conclusions.

      Results from a computational model are provided to make predictions of outcomes, but the computational approach is highly underdeveloped.

    5. Author response:

      We thank the reviewers and editors for these careful and constructive comments. Based on these comments, we plan to perform new experiments and revised analyses, summarized as follows:

      (1) A more thorough analysis and experimental test of the effects of YW->SR variants on baseline AP excitability in neurons in the absence of any pharmacology.

      (2) More details on modeling of selective block of NaV1.2 and NaV1.6.

      (3) Revisions to text, figure contents, and figure order to better convey key points and better frame these findings in the context of current clinically available anti-seizure medications that interact with sodium channels.

    1. eLife Assessment

      This valuable study provides a novel framework for leveraging longitudinal field observations to examine the effects of aging on stone tool use behaviour in wild chimpanzees. However, the analysis and interpretation are currently incomplete and would benefit from a more robust consideration of additional sources of variance for the data (e.g., foraging ecology, nut and tool properties, etc.). Despite the low sample size of five individuals, this study is of broad interest to ethologists, primatologists, archaeologists, and psychologists.

    2. Reviewer #1 (Public review):

      Summary:

      Howard-Spink et al. investigated how older chimpanzees changed their behavior regarding stone tool use for nut-cracking over a period of 17 years, from late adulthood to old age. This behavior is cognitively demanding, and it is a good target for understanding aging in wild primates. They used several factors to follow the aging process of five individuals, from attendance at the nut-cracking outdoor laboratory site to time to select tools and efficiency in nut-cracking, to check if older chimpanzees changed their behavior.

      Indeed, older chimpanzees reduced their visits to the outdoor lab, which was not observed in the younger adults. The authors discuss several reasons for that; the main ones being physiological changes, cognitive and physical constraints, and changes in social associations. Much of the discussion is hypothetical, but a good starting point, as there is not much information about senescence in wild chimpanzees.

      The efficiency for nut-cracking was variable, with some individuals taking a long time to crack nuts while others showed little variance. As this is not compared with the younger individuals and the sample is small (only five individuals), it is difficult to be sure if this is also partly a normal variance caused by other factors (ecology) or is only related to senescence.

      Strengths:

      (1) 17 years of longitudinal data in the same setting, following the same individuals.

      (2) Using stone tool use, a cognitively demanding behavior, to understand the aging process.

      Weaknesses:

      A lack of comparison of the stone tool use behavior with younger individuals in the same period, to check if the changes observed are only related to age or if it is an overall variance. The comparison with younger chimpanzees was only done for one of the variables (attendance).

    3. Reviewer #2 (Public review):

      Summary:

      Primates are a particularly important and oft-applied model for understanding the evolution of, e.g., life history and senescence in humans. Although there is a growing body of work on aging in primates, there are three components of primate senescence research that have been underutilized or understudied: (1) longitudinal datasets, (2) wild populations, and (3) (stone) tool-use behaviors. Therefore, the goal of this study was to (1) use a 17-year longitudinal dataset (2) of wild chimpanzees in the Bossou forest, (3) visiting a site for field experiments on nut-cracking. They sampled and analyzed data from five field seasons for five chimpanzees of old age. From this sample, Howard-Spink and colleagues noted a decline in tool-use and tool-use efficiency in some individuals, but not in others. The authors then conclude that there is a measurable effect of senescence on chimpanzee behavior, but that it varies individually. The study has major intellectual value as a building block for future research, but there are several major caveats.

      Strengths:

      With this study, Howard-Spink and colleagues make a foray into a neglected topic of research: the impact of the physiological and cognitive changes due to senescence on stone tool use in chimpanzees. Based on novelty alone, this is a valuable study. The authors cleverly make use of a longitudinal record covering 17 years of field data, which provides a window into long-term changes in the behavior of wild chimpanzees, which I agree cannot be understood through cross-sectional comparisons.

      The metrics of 'efficiency' (see caveats below) are suitable for measuring changes in technological behavior over time, as specifically tailored to the nut-cracking (e.g., time, number of actions, number of strikes, tool changes). The ethogram and the coding protocol are also suitable for studying the target questions and objectives. I would recommend, however, the inclusion of further variables that will assist in improving the amount of valid data that can be extrapolated (see also below).

      With this pilot, Howard-Spink and colleagues have established a foundation upon which future research can be designed, including further investigation with the Bossou dataset and other existing video archives, but especially future targeted data collection, which can be designed to overcome some of the limits and confounds that can be identified in the current study.

      Weaknesses:

      Although I agree with the reasoning behind conducting this research and understand that, as the authors state, there are logistical considerations that have to be made when planning and executing such a study, there are a number of methodological and theoretical shortcomings that either need to be more explicitly stated by the authors or would require additional data collection and analysis.

      One of the main limitations of this study is the small sample size. There are only 5 of the old-aged individuals, which is not enough to draw any inferences about aging for chimpanzees more generally. Howard-Spink and colleagues also study data from only five of the 17 years of recorded data at Bossou. The selection of this subset of data requires clarification: why were these intervals chosen, why this number of data points, and how do we know that it provides a representative picture of the age-related changes of the full 17 years?

      With measuring and interpreting the 'efficiency' of behaviors, there are in-built assumptions about the goals of the agents and how we can define efficiency. First, it may be that efficiency is not an intentional goal for nut-cracking at all, but rather, e.g., productivity as far as the number of uncrushed kernels (cf. Putt 2015). Second, what is 'efficient' for the human observer might not be efficient for the chimpanzee who is performing the behavior. More instances of tool-switching may be considered inefficient, but it might also be a valid strategy for extracting more from the nuts, etc. Understanding the goals of chimpanzees may be a difficult proposition, but these are uncertainties that must be kept in mind when interpreting and discussing 'decline' or any change in technological behaviors over time.

      For the study of the physiological impact of senescence on tool use (i.e., on strength and coordination), the study would benefit from the inclusion of variables like grip type and (approximate) stone size (Neufuss et al., 2016). The size and shape of stones for nut-cracking have been shown to influence the efficacy and 'efficiency' of tool use (i.e., the same metrics of 'efficiency' implemented by Howard-Spink et al. in the current study), meaning raw material properties are a potential confound that the authors have not evaluated.

      Similarly, inter- and intraspecific variation in the properties of nuts being processed is another confound (Falótico et al., 2022; Proffitt et al., 2022). If oil palm nuts were varying year-to-year, for example, this would theoretically have an effect on the behavioral forms and strategies employed by the chimpanzees, and thus, any metric of efficiency being collected and analyzed. Further, it is perplexing that the authors analyze only one year where the coula nuts were provided at the test site, but these were provided during multiple field seasons. It would be more useful to compare data from a similar number of field seasons with both species if we are to study age-related changes in nut processing over time (one season of coula nut-cracking certainly does not achieve this).

      Both individual personality (especially neophilia versus neophobia; e.g., Forss & Willems, 2022) and motivation factors (Tennie & Call, 2023) are further confounds that, if considered, would contribute to a more valid interpretation of the patterns found. To draw any conclusions about age-related changes in diet and food preferences, we would need to have data on the overall food intake/preferences of the individuals and the food availability in the home range. The authors refer briefly to this limitation, but the implications for the interpretation of the data are not sufficiently underlined (e.g., for the relevance of age-related decline in stone tool-use ability for individual survival).

      Generally speaking, there is a lack of consideration for temporal variation in ecological factors. As a control for these, Howard-Spink and colleagues have examined behavioral data for younger individuals from Bossou in the same years, to ostensibly show that patterns in older adults are different from patterns in younger adults, which is fair given the available data. Nonetheless, they seem to focus mostly on the start and end points and not patterns that occur in between. For example, there is a curious drop in attendance rate for all individuals in the 2008 season, the implications of which are not discussed by the authors.

      As far as attendance, Howard-Spink and colleagues also discuss how this might be explained by changes in social standing in later life (i.e., chimpanzees move to the fringes of the social network and become less likely to visit gathering sites). This is not senescence in the sense of physiological and cognitive decline with older age. Instead, the reduced attendance due to changes in social standing seems to exacerbate signs of aging rather than to be an indicator of aging itself. The authors also mention a flu-like epidemic that caused the death of 5 individuals; the subsequent population decline and related changes in demography also warrant more discussion and characterization in the manuscript.

      Understandably, some of these issues cannot be evaluated or corrected with the presented dataset. Nonetheless, these undermine how certain and/or deterministic their conclusions can really be considered. Howard-Spink et al. have not strongly 'demonstrated' the validity of relationships between the variables of the study. If anything, their cursory observations provide us with methods to apply and hypotheses to test in future studies. It is likely that with higher-resolution datasets, the individual variability in age-related decline in tool-use abilities will be replicated. For now, this can be considered a starting point, which will hopefully inspire future attempts to research these questions.

      Falótico, T., Valença, T., Verderane, M. & Fogaça, M. D. Stone tools differences across three capuchin monkey populations: food's physical properties, ecology, and culture. Sci. Rep. 12, 14365 (2022).
      Forss, S. & Willems, E. The curious case of great ape curiosity and how it is shaped by sociality. Ethology 128, 552-563 (2022).
      Neufuss, J., Humle, T., Cremaschi, A. & Kivell, T. L. Nut-cracking behaviour in wild-born, rehabilitated bonobos (Pan paniscus): a comprehensive study of hand-preference, hand grips and efficiency. Am. J. Primatol. 79, e22589 (2016).
      Proffitt, T., Reeves, J. S., Pacome, S. S. & Luncz, L. V. Identifying functional and regional differences in chimpanzee stone tool technology. R. Soc. Open Sci. 9, 220826 (2022).
      Putt, S. S. The origins of stone tool reduction and the transition to knapping: An experimental approach. J. Archaeol. Sci.: Rep. 2, 51-60 (2015).
      Tennie, C. & Call, J. Unmotivated subjects cannot provide interpretable data and tasks with sensitive learning periods require appropriately aged subjects: A Commentary on Koops et al. (2022) "Field experiments find no evidence that chimpanzee nut cracking can be independently innovated". ABC 10, 89-94 (2023).

    4. Author response:

      We thank both reviewers for their comments on our manuscript. We are pleased that the value of this research has been communicated effectively, and that the reviewers agree that whilst our sample size of individuals is relatively small, it offers a unique perspective for understanding the effects of aging for wild chimpanzees’ technological behaviors. Whilst only yielding data on a few individuals, the Bossou archive is the only available data source with which we can currently address these questions over extended timescales, and is key for understanding longitudinal effects of aging for specific individuals. This is particularly true if we are to understand the life-long dynamics of chimpanzees’ technical skills during tasks which require the organization of multiple movable elements. Bossou is the only community where chimpanzees both perform nut cracking with moveable hammer and anvil stones, and have been systematically studied over a period of decades. Moreover, given the dwindling population at Bossou (N = 3 as of 2025), we must make every effort to understand these effects with existing data. We agree that this work will likely form a valuable foundation for future studies, which may aim to either replicate our results, or use our findings to design more specific research questions and approaches.

      In the next iteration of the manuscript, we will elaborate on our choice of field seasons more clearly. However, this was a logistical tradeoff between needing to sample across a long lifespan using fine-granularity behavior coding, versus the time constraints for our project and the likely yield of data collection. We sampled from the middle of individuals’ prime age, up until the oldest recorded ages of individuals’ lifespans (17 years). Where possible we aimed to use consistent time intervals (approximately 4 years); however, this was not always possible, as in some years data was not collected by researchers at Bossou (for example, during years where there were Ebola outbreaks affecting the region). In such instances, we sampled the closest available year that offered sufficient data to meet our sampling requirements.

      Reviewer 2 raises the possibility that there may be a disconnect between how human observers and chimpanzees conceive of efficiency when nut cracking, and supports this idea with a citation to previous work on efficiency of Oldowan stone knapping. We agree that knowing precisely how chimpanzees perceive their own efficiency during tool use is not available through observation alone, nor can we assess the true extent to which chimpanzees are concerned about the efficiency of their nut-cracking. However, following previous studies, it is reasonable to assume that adult chimpanzees embody some level of efficiency, given that adults often select tools which aid efficient nut cracking (Braun et al. 2025, J. Hum. Evol.; Carvalho et al. 2008, J. Hum. Evol.; Sirianni et al. 2015, Animal Behav.); perform nut cracking using more streamlined combinations of actions than less experienced individuals (Howard-Spink et al. 2024, Peer J; Inoue-Nakamura & Matsuzawa 1997, J. Comp. Psychol.), and consequently end up cracking nuts using fewer hammer strikes, indicating a higher level of skill (Biro et al. 2003, Animal Cogn.; Boesch et al. 2019, Sci. Rep.). Ultimately, these factors suggest that across adulthood, experienced chimpanzees perform nut cracking with a level of efficiency which exceeds that of novice individuals, including across the chaîne opératoire.

      To account for the multiple ways in which reduced efficiency may manifest later in life, we provide one of the most flexible measures of efficiency in wild chimpanzee tool use to date, which incorporates more classical measures of time and hammer strikes (see previous examples of Biro et al. 2003, Animal Cogn.; Boesch et al. 2019, Sci. Rep.) as well as additional variables which aim to characterize how streamlined behavioral sequences are (tool rotations, tool swaps, nut replacements, etc.; see Berdugo et al. 2024, Nat. Hum. Behav. for other analyses using similar metrics). In the case of swapping out tools, Reviewer 2 suggests that some of these tool swaps may in fact be to aid nut cracking, by maintaining kernel integrity (a key result relating to Yo’s coula nut cracking efficiency). This however seems unlikely, given that these behaviors were performed extremely rarely by chimpanzees in early field seasons, and were not performed more frequently by other individuals with aging. We will provide additional information behind our metrics for measuring efficiency, with reference to earlier work, and also will incorporate the points raised by Reviewer 2 concerning the limits on how well we can infer chimpanzees’ goals, and how efficiently they meet them.

      Reviewer 1 questioned why we did not sample efficiency data for younger individuals, and compare this data with older individuals to detect the effects of aging. Throughout our manuscript, we compared aging individuals’ nut-cracking efficiency with their efficiency in previous years (thus, at younger ages). This offered a personalized benchmark of efficiency in early life for each individual, and allowed us to identify aging effects whilst controlling for long-term interindividual variation in skill levels. Indeed, previous analyses at Bossou find that across the majority of adulthood, efficiency varies between individuals, but is relatively stable within individuals (see Berdugo et al. 2024, Nat. Hum. Behav.). As focal aging chimpanzees cracked multiple nuts each field season (and each encounter), we had ample data to fit models that examine individuals’ efficiency over field seasons, using random slopes to model correlations for each individual. By taking this approach, our paper offers a novel perspective by being able to report the longitudinal effects of aging on tool-using efficiency, rather than averaged cross-sectional effects between young and old cohorts. As random slope models (and not just random intercept models) offered the best explanation for variation in aging individuals’ efficiency over our sample period, this implies that focal chimpanzees were experiencing individual-level changes in efficiency over time, thus giving us key evidence that interindividual variation in tool-using efficiency can be compounded by aging.
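
      The within-individual logic described above can be illustrated with a deliberately simplified sketch. It is not the authors' analysis: it fits a separate ordinary least-squares slope to each individual's efficiency over field seasons, whereas a random-slope mixed model estimates such slopes jointly with partial pooling. The individuals "A" and "B", the season indices, and the strike counts are synthetic placeholders invented for illustration, not Bossou data.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (requires at least two distinct xs)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def per_individual_slopes(records):
    """Group (individual, season_index, value) rows and fit one slope each.

    Each individual's earlier seasons act as that individual's own baseline,
    so stable between-individual differences in skill do not affect the slope.
    """
    grouped = {}
    for individual, season, value in records:
        xs, ys = grouped.setdefault(individual, ([], []))
        xs.append(season)
        ys.append(value)
    return {ind: ols_slope(xs, ys) for ind, (xs, ys) in grouped.items()}


# Synthetic toy data (hypothetical): mean hammer strikes per nut by season.
TOY_RECORDS = [
    ("A", 0, 4.0), ("A", 1, 4.0), ("A", 2, 4.0), ("A", 3, 4.0), ("A", 4, 4.0),
    ("B", 0, 3.0), ("B", 1, 3.0), ("B", 2, 3.0), ("B", 3, 5.0), ("B", 4, 7.0),
]

if __name__ == "__main__":
    for individual, slope in per_individual_slopes(TOY_RECORDS).items():
        print(f"{individual}: change in strikes per nut per season = {slope:.2f}")
```

      In this toy example, individual "A" has a flat trend while "B" needs more strikes in later seasons, even though "B" started out more efficient than "A". A cross-sectional comparison of cohort averages could mask that kind of individual-level change.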

      We argue that the reductions in efficiency observed for some individuals (e.g. Yo & Velu) are unlikely to be due to environmental changes (e.g. nuts becoming harder in later field seasons): if this were the case, these effects would be detected across the behaviors of all individuals (which was not observed). Additionally, in the specific case of the hardness of nuts, nuts used in our experiment were sourced from local communities, and were moderately aged. This avoided the use of young nuts which are harder to crack, or older nuts which are often worm-eaten or can be empty (Sakura & Matsuzawa 1991, Ethology). We will update our manuscript with this information.

      Whilst other factors may introduce general variation into our efficiency data (such as different stones used on different encounters, or more general variation in nut hardness across encounters), very few of these factors predict directional long-term changes in efficiency. Rather, if these factors were driving the majority of variation in our data, we would expect them to lead to variation across visits during earlier field seasons (such as 1999-2008) and later field seasons (2011 onwards) equally, and in a way which does not necessarily correlate with age. This does not match the pattern we observed in our data, where for some individuals (e.g. Yo & Velu), efficiency in nut cracking reduced in later field seasons only, and was relatively consistent across field seasons prior to 2011. Moreover, for Yo, the individual who exhibited the greatest reductions in tool-using efficiency, efficiency continued to decrease across the three latest sampled field seasons. Thus, it is more likely Yo was experiencing deleterious effects of aging. We do however agree that additional data on these variables would help us to rule out confounding factors more rigorously – we will include recommendations for this data to be collected in future studies.

      When modelling the effect of aging on attendance at the outdoor laboratory, we could not use the same approach we used when modelling tool-using efficiency, as we could only acquire one datapoint (attendance rate) per individual for each field season. We therefore had to adapt our analysis, and introduce attendance rates for younger individuals as a baseline to compare against the attendance rates of older individuals across years. We observed a significant interaction effect, where across field seasons, attendance dropped significantly more rapidly for older individuals than younger ones. Reviewer 2 has asked why we do not consider inter-annual variability across this time period, and suggested that we ignored intervening years. This is not the case. When fitting models that examined the effects of aging on attendance, we used all data across all field seasons. We reported an approximate effect size for this significant correlation using a digestible comparison of the attendance rates in the initial and final field seasons sampled. We will ensure that this is clear in the next iteration of our manuscript.

      Reviewer 2 noted that many factors may have influenced the decision for chimpanzees to attend the outdoor laboratory in older field seasons, and the current data may not be used to make strong arguments for changes in attendance rates being due to dietary preferences. We agree that many factors may have influenced these attendance rates, and that is what we have aimed to transparently report within our discussion, where we raise an extensive, non-exhaustive list of hypotheses for why we have observed this age-related change in our data. We will aim to ensure that this is exceptionally clear prior to resubmission, and where relevant, will further emphasize points raised by Reviewer 2. We consider some points raised by Reviewer 2 to be unlikely to apply for our study; for example, it is unlikely neophobia has influenced the behaviors of chimpanzees, as these chimpanzees habitually attended the outdoor laboratory of their own accord for over a decade prior to the earliest year we sampled in this study (reflecting extremely high levels of habituation to the experimental setup). Previous studies at Bossou have surveyed the ecology of stone tool use across the home range, and confirm that the outdoor laboratory is visited by chimpanzees during ranging as a food patch (Almeida-Warren et al. 2022, Int. J. Primatol.).

      Reviewer 2 suggested that it would be helpful to have additional data on variables such as hand grip, as this may reveal further information about how cognitive and physiological senescence influences reductions in tool-using efficiency. We agree that whilst further data on hand grips are not required to detect reductions in efficiency per se, it would be profitable for future analyses to collect similar data – we will add this as a recommendation to our discussion.

      Finally, Reviewer 2 commented that they found our discussion of coula-nut cracking disruptive to the flow of the manuscript, given that we could not compare with coula-nut cracking in earlier years. We reported the coula nut cracking of Yo in 2011 as it was part of our sampled data, and we felt that the comparison with other individuals in the same year was an interesting discussion point; however, we acknowledge this limitation. We will move all data and discussion of coula-nut cracking to the Supplementary Materials, which we will present as an interesting additional observation which may warrant further investigation using additional data from the Bossou archive. Data collection for this future project could include collecting data on the additional variables raised by both reviewers (e.g. hand grips).

      We thank both reviewers for their comments. We believe that their feedback will improve the quality of our reporting, and the validity of our interpretations.

    1. eLife Assessment

      The conclusions of this work are based on valuable simulations of a detailed model of striatal dopamine dynamics. Establishing that a lower dopamine uptake rate can lead to a 'tonic' level of dopamine in the ventral but not dorsal striatum, and that dopamine concentration changes at short delays can be tracked by D1 but not D2 receptor activation, is of value and will be of interest to dopamine aficionados. However, the simulations are incomplete, providing only partial support for the key claims. Several things can be done to strengthen the conclusions, including, for example, but not exclusively, a demonstration of how the results would change as a function of changes in D2 affinity.

    2. Reviewer #1 (Public review):

      Ejdrup, Gether, and colleagues present a sophisticated simulation of dopamine (DA) dynamics based on a substantial volume of striatum with many DA release sites. The key observation is that a reduced DA uptake rate in the ventral striatum (VS) compared to the dorsal striatum (DS) can produce an appreciable "tonic" level of DA in VS and not DS. In both areas they find that a large proportion of D2 receptors are occupied at "baseline"; this proportion increases with simulated DA cell phasic bursts but has little sensitivity to simulated DA cell pauses. They also examine, in a separate model, the effects of clustering dopamine transporters (DAT) into nanoclusters and say this may be a way of regulating tonic DA levels in VS. I found this work of interest and I think it will be useful to the community. At the same time, there are a number of weaknesses that should be addressed, and the authors need to more carefully explain how their conclusions are distinct from those based on prior models.

      (1) The conclusion that even an unrealistically long (1s) and complete pause in DA firing has little effect on DA receptor occupancy is potentially important. The ability to respond to DA pauses has been thought to be a key reason why D2 receptors (may) have high affinity. This simulation instead finds evidence that DA pauses may be useless. This result should be highlighted in the abstract and discussed more.

      (2) The claim of "DAT nanoclustering as a way to shape tonic levels of DA" is not very well supported at present. None of the panels in Figure 4 simply show mean steady-state extracellular DA as a function of clustering. Perhaps mean DA is not the relevant measure, but then the authors need to better define what is and why. This issue may be linked to the fact that DAT clustering is modeled separately (Figure 4) to the main model of DA dynamics (Figures 1-3) which per the Methods assumes even distribution of uptake. Presumably, this is because the spatial resolution of the main model is too coarse to incorporate DAT nanoclusters, but it is still a limitation. As it stands it is convincing (but too obvious) that DAT clustering will increase DA away from clusters, while decreasing it near clusters. I.e. clustering increases heterogeneity, but how this could be relevant to striatal function is not made clear, especially given the different spatial scales of the models.

      (3) I question how reasonable the "12/40" simulated burst firing condition is, since to my knowledge this is well outside the range of firing patterns actually observed for dopamine cells. It would be better to base key results on more realistic values (in particular, fewer action potentials than 12).

      (4) There is a need to better explain why "focality" is important, and justify the measure used.

      (5) Line 191: "D1 receptors (-Rs) were assumed to have a half maximal effective concentration (EC50) of 1000 nM"
      The assumptions about receptor EC50s are critical to this work and need to be better justified. It would also be good to show what happens if these EC50 numbers are changed by an order of magnitude up or down.

      (6) Line 459: "we based our receptor kinetics on newer pharmacological experiments in live cells (Agren et al., 2021) and properties of the recently developed DA receptor-based biosensors (Labouesse & Patriarchi, 2021). Indeed, these sensors are mutated receptors but only on the intracellular domains with no changes of the binding site (Labouesse & Patriarchi, 2021)"
      This argument is diminished by the observation that different sensors based on the same binding site have different affinities (e.g. in Patriarchi et al. 2018, dLight1.1 has Kd of 330nM while dlight1.3b has Kd of 1600nM).

      (7) Estimates of Vmax for DA uptake are entirely based on prior fast-scan voltammetry studies (Table S2). But FSCV likely produces distorted measures of uptake rate due to the kinetics of DA adsorption and release on the carbon fiber surface.

      (8) It is assumed that tortuosity is the same in DS and VS - is this a safe assumption?

      (9) More discussion is needed about how the conclusions derived from this more elaborate model of DA dynamics are the same, and different, to conclusions drawn from prior relevant models (including those cited, e.g. from Hunger et al. 2020, etc).

    3. Reviewer #2 (Public review):

      The work presents a model of dopamine release, diffusion, and reuptake in a small (100 micrometer^2 maximum) volume of striatum. This extends previous work by this group and others by comparing dopamine dynamics in the dorsal and ventral striatum and by using a model of immediate dopamine-receptor activation inferred from recent dopamine sensor data. From their simulations, the authors report two main conclusions. The first is that the dorsal striatum does not appear to have a sustained, relatively uniform concentration of dopamine driven by the constant 4Hz firing of dopamine neurons; rather that constant firing appears to create hotspots of dopamine. By contrast, the lower density of release sites and lower rate of reuptake in the ventral striatum creates a sustained concentration of dopamine. The second main conclusion is that D1 receptor (D1R) activation is able to track dopamine concentration changes at short delays but D2 receptor activation cannot.

      The simulations of the dorsal striatum will be of interest to dopamine aficionados as they throw some doubt on the classic model of "tonic" and "phasic" dopamine actions, further show the disconnect between dopamine neuron firing and consequent release, and thus raise issues for the reward-prediction error theory of dopamine.

      There is some careful work here checking the dependence of results on the spatial volume and its discretisation. The simulations of dopamine concentration are checked over a range of values for key parameters. The model is good, the simulations are well done, and the evidence for robust differences between dorsal and ventral striatum dopamine concentration is good.

      However, the main weakness here is that neither of the main conclusions is strongly evidenced as yet. The claim that the dorsal striatum has no "tonic" dopamine concentration is based on the single example simulation of Figure 1 not the extensive simulations over a range of parameters. Some of those later simulations seem to show that the dorsal striatum can have a "tonic" dopamine concentration, though the measurement of this is indirect. It is not clear why the reader should believe the example simulation over those in the robustness checks, for example by identifying which range of parameter values is more realistic.

      The claim that D1Rs can track rapid changes in dopamine is not well supported. It is based on a single simulation in Figure 1 (DS) and 2 (VS) by visual inspection of simulated dopamine concentration traces - and even then it is unclear that D1Rs actually track dynamics because they clearly do not track rapid changes in dopamine that are almost as large as those driven by bursts (cf Figure 1i). The claim also depends on two things that are poorly explained. First, the model of binding here is missing from the text. It seems to be a simple bound-fraction model, simulating a single D1 or D2 receptor. It is unclear whether more complex models would show the same thing. Second, crucial to the receptor model here is the inference that D1 receptor unbinding is rapid; but this inference is made based on the kinetics of dopamine sensors and is superficially explained - it is unclear why sensor kinetics should let us extrapolate to receptor kinetics, and unclear how safe is the extrapolation of the linear regression by an order of magnitude to get the D1 unbinding rate.

    4. Author response:

      eLife Assessment

      The conclusions of this work are based on valuable simulations of a detailed model of striatal dopamine dynamics. Establishing that a lower dopamine uptake rate can lead to a 'tonic' level of dopamine in the ventral but not dorsal striatum, and that dopamine concentration changes at short delays can be tracked by D1 but not D2 receptor activation, is of value and will be of interest to dopamine aficionados. However, the simulations are incomplete, providing only partial support for the key claims. Several things can be done to strengthen the conclusions, including, for example, but not exclusively, a demonstration of how the results would change as a function of changes in D2 affinity.

      We sincerely thank the Editors and Reviewers for their insightful comments on our manuscript. We are pleased that our simulations are recognized as interesting, sophisticated and valuable. Moreover, we fully agree that many of the findings will be of particular interest to dopamine aficionados. While we maintain that our simulations provide a solid basis for the key claims, we acknowledge that the conclusions can be further strengthened by the revisions suggested below.

      Reviewer #1 (Public review):

      Ejdrup, Gether, and colleagues present a sophisticated simulation of dopamine (DA) dynamics based on a substantial volume of striatum with many DA release sites. The key observation is that a reduced DA uptake rate in the ventral striatum (VS) compared to the dorsal striatum (DS) can produce an appreciable "tonic" level of DA in VS and not DS. In both areas they find that a large proportion of D2 receptors are occupied at "baseline"; this proportion increases with simulated DA cell phasic bursts but has little sensitivity to simulated DA cell pauses. They also examine, in a separate model, the effects of clustering dopamine transporters (DAT) into nanoclusters and say this may be a way of regulating tonic DA levels in VS. I found this work of interest and I think it will be useful to the community. At the same time, there are a number of weaknesses that should be addressed, and the authors need to more carefully explain how their conclusions are distinct from those based on prior models.

      (1) The conclusion that even an unrealistically long (1s) and complete pause in DA firing has little effect on DA receptor occupancy is potentially important. The ability to respond to DA pauses has been thought to be a key reason why D2 receptors (may) have high affinity. This simulation instead finds evidence that DA pauses may be useless. This result should be highlighted in the abstract and discussed more.

      We appreciate that the reviewer finds our work interesting and useful to the community. However, we acknowledge that in the revised version we need to better describe how our conclusions are different from those reached based on previous models.

      We will also carry out new simulations across a range of D2R affinities to assess how this will affect the finding that even a long pause in DA firing has little effect on D2 receptor occupancy. As also suggested, the results will be highlighted and further discussed.

      (2) The claim of "DAT nanoclustering as a way to shape tonic levels of DA" is not very well supported at present. None of the panels in Figure 4 simply show mean steady-state extracellular DA as a function of clustering. Perhaps mean DA is not the relevant measure, but then the authors need to better define what is and why. This issue may be linked to the fact that DAT clustering is modeled separately (Figure 4) to the main model of DA dynamics (Figures 1-3) which per the Methods assumes even distribution of uptake. Presumably, this is because the spatial resolution of the main model is too coarse to incorporate DAT nanoclusters, but it is still a limitation.

      We will improve our definitions and descriptions relating to nanoclustering of DAT in the revised version of the manuscript. We fully agree that the spatial resolution of the main model is a limitation and, ideally, that the nanoclustering should be combined with the large-scale release simulations. Unfortunately, this would require many orders of magnitude more computational power than currently available.

      As it stands it is convincing (but too obvious) that DAT clustering will increase DA away from clusters, while decreasing it near clusters. I.e. clustering increases heterogeneity, but how this could be relevant to striatal function is not made clear, especially given the different spatial scales of the models.

      Thank you for raising this important point. While it is true that DAT clustering increases heterogeneity in DA distribution at the microscopic level, the diffusion rate is, in most circumstances, too fast to permit concentration differences on a spatial scale relevant for nearby receptors. Accordingly, we propose that the primary effect of DAT nanoclustering is to decrease the overall uptake capacity, which in turn increases overall extracellular DA concentrations. Thus, homogeneous changes in extracellular DA concentrations can arise from regulating heterogeneous DAT distribution. An exception to this would be the circumstance where the receptor is located directly next to a dense cluster, i.e. within nanometers. In such cases, local DA availability may be more directly influenced by clustering effects. This will be further discussed in the revised manuscript.

      (3) I question how reasonable the "12/40" simulated burst firing condition is, since to my knowledge this is well outside the range of firing patterns actually observed for dopamine cells. It would be better to base key results on more realistic values (in particular, fewer action potentials than 12).

      We fully agree that this typically is outside the physiological range. The values are included to showcase what extreme situations would look like.

      (4) There is a need to better explain why "focality" is important, and justify the measure used.

      We will expand on the intention of this measure in the revised manuscript. Thank you for pointing out this lack of clarification.

      (5) Line 191: " D1 receptors (-Rs) were assumed to have a half maximal effective concentration (EC50) of 1000 nM" The assumptions about receptor EC50s are critical to this work and need to be better justified. It would also be good to show what happens if these EC50 numbers are changed by an order of magnitude up or down.

      We agree that these assumptions are critical. Simulations on effective off-rates across a range of EC50 values will be included in the revised version.

      (6) Line 459: "we based our receptor kinetics on newer pharmacological experiments in live cells (Agren et al., 2021) and properties of the recently developed DA receptor-based biosensors (Labouesse & Patriarchi, 2021). Indeed, these sensors are mutated receptors but only on the intracellular domains with no changes of the binding site (Labouesse & Patriarchi, 2021)"

      This argument is diminished by the observation that different sensors based on the same binding site have different affinities (e.g. in Patriarchi et al. 2018, dLight1.1 has Kd of 330nM while dlight1.3b has Kd of 1600nM).

      We sincerely thank the reviewer for highlighting this important point. We fully recognize the fundamental importance of absolute and relative DA receptor kinetics for modeling DA actions and acknowledge that differences in affinity estimates from sensor-based measurements highlight the inherent uncertainty in selecting receptor kinetics parameters. While we have based our modeling decisions on what we believe to be the most relevant available data, we acknowledge that the choice of receptor kinetics is a topic of ongoing debate. Importantly, we are making our model available to the research community, allowing others to test their own estimates of receptor kinetics and assess their impact on the model’s behavior. In our revised manuscript, we will further discuss the rationale behind our parameter choices, including: Our selection of a Kd value of 1000 nM for D1R (based on the observed affinities for D1R sensors) and an extrapolated Koff of 19.5 s<sup>-1</sup> (Labouesse & Patriarchi, 2021). Our use of a Kd value of 7 nM and an extrapolated Koff of 0.2 s<sup>-1</sup> for D2R, consistent with recent binding studies (Ågren et al., 2021).

      (7) Estimates of Vmax for DA uptake are entirely based on prior fast-scan voltammetry studies (Table S2). But FSCV likely produces distorted measures of uptake rate due to the kinetics of DA adsorption and release on the carbon fiber surface.

      We fully agree that this is a limitation of FSCV. However, most of the cited papers attempt to correct for this by way of fitting the output to a multi-parameter model for DA kinetics. If newer literature brings the Vmax values estimated into question, we have made the model publicly available to rerun the simulations with new parameters.

      (8) It is assumed that tortuosity is the same in DS and VS - is this a safe assumption?

      The original paper cited does not specify which region the values are measured in. However, a separate paper estimates the rat cerebellum has a comparable tortuosity index (Nicholson and Phillips, J Physiol. (1981)), suggesting it may be a rather uniform value across brain regions.

      (9) More discussion is needed about how the conclusions derived from this more elaborate model of DA dynamics are the same, and different, to conclusions drawn from prior relevant models (including those cited, e.g. from Hunger et al. 2020, etc).

      As part of our revision, we will expand the current discussion of our findings in the context of previous models in the manuscript.

      Reviewer #2 (Public review):

      The work presents a model of dopamine release, diffusion, and reuptake in a small (100 micrometer^2 maximum) volume of striatum. This extends previous work by this group and others by comparing dopamine dynamics in the dorsal and ventral striatum and by using a model of immediate dopamine-receptor activation inferred from recent dopamine sensor data. From their simulations, the authors report two main conclusions. The first is that the dorsal striatum does not appear to have a sustained, relatively uniform concentration of dopamine driven by the constant 4Hz firing of dopamine neurons; rather that constant firing appears to create hotspots of dopamine. By contrast, the lower density of release sites and lower rate of reuptake in the ventral striatum creates a sustained concentration of dopamine. The second main conclusion is that D1 receptor (D1R) activation is able to track dopamine concentration changes at short delays but D2 receptor activation cannot.

      The simulations of the dorsal striatum will be of interest to dopamine aficionados as they throw some doubt on the classic model of "tonic" and "phasic" dopamine actions, further show the disconnect between dopamine neuron firing and consequent release, and thus raise issues for the reward-prediction error theory of dopamine.

      There is some careful work here checking the dependence of results on the spatial volume and its discretisation. The simulations of dopamine concentration are checked over a range of values for key parameters. The model is good, the simulations are well done, and the evidence for robust differences between dorsal and ventral striatum dopamine concentration is good.

      However, the main weakness here is that neither of the main conclusions is strongly evidenced as yet. The claim that the dorsal striatum has no "tonic" dopamine concentration is based on the single example simulation of Figure 1 not the extensive simulations over a range of parameters. Some of those later simulations seem to show that the dorsal striatum can have a "tonic" dopamine concentration, though the measurement of this is indirect. It is not clear why the reader should believe the example simulation over those in the robustness checks, for example by identifying which range of parameter values is more realistic.

      We appreciate that the reviewer finds our work interesting and carefully performed. The reviewer is correct that DA dynamics, including the presence and level of tonic DA, are parameter-dependent in both the dorsal striatum (DS) and ventral striatum (VS). Indeed, our simulations across a broad range of biological parameters were intended to help readers understand how such variation would impact the model’s outcomes, particularly since many of the parameters remain contested. Naturally, altering these parameters results in changes to the observed dynamics. However, to derive possible conclusions, we selected a subset of parameters that we believe best reflect the physiological conditions, as elaborated in the manuscript. This is eventually required in computational modelling of biological systems. In response to the reviewer’s comment, we will place greater emphasis on clarifying which parameter regimes produce a "tonic" versus "non-tonic" DA state in the DS. Additionally, we will underscore that the distinction between tonic and non-tonic states is not a binary outcome but a parameter-dependent continuum—one that our model now allows researchers to explore systematically. Finally, we will highlight how our simulations across parameter space not only capture this continuum but also identify the regimes that produce the most heterogeneous DA signaling, both within and across striatal regions.

      The claim that D1Rs can track rapid changes in dopamine is not well supported. It is based on a single simulation in Figure 1 (DS) and 2 (VS) by visual inspection of simulated dopamine concentration traces - and even then it is unclear that D1Rs actually track dynamics because they clearly do not track rapid changes in dopamine that are almost as large as those driven by bursts (cf Figure 1i).

      We would also like to draw attention to Fig. S1, where the claim that D1Rs track rapid changes is supported in more depth. According to this figure, upon coordinated burst firing, the D1R occupancy rapidly increased as diffusion no longer equilibrated the extracellular concentrations on a timescale faster than the receptors – and D1R occupancy closely tracked extracellular DA with a delay on the order of tens of milliseconds. Note that the increases in [DA] from uncoordinated stochastic release events from tonic firing in Fig. 1i are too brief to drive D1 signaling, as the DA diffuses into the remaining extracellular space on a timescale of 1-5 ms. This is faster than the receptors' response rate, and does not lead to any downstream signaling according to our simulations. This means D1 kinetics are rapid enough to track coordinated signaling on a ~50 ms timescale and slower, but not fast enough to respond to individual release events from tonic activity. In our revised manuscript we will expand the discussion of this topic to provide greater clarity.

      The claim also depends on two things that are poorly explained. First, the model of binding here is missing from the text. It seems to be a simple bound-fraction model, simulating a single D1 or D2 receptor. It is unclear whether more complex models would show the same thing.

      We realize that this is not made clear in the methods and, accordingly, we will update the method section to elaborate on how we model receptor binding. The model simulates occupied fraction of D1R and D2R in every single voxel of the simulation space.
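      As an illustration only, a single-voxel bound-fraction model of the kind described above can be sketched as follows. This is not the authors' code: it assumes simple first-order binding, dB/dt = k_on·[DA]·(1 − B) − k_off·B with k_on = k_off/Kd, and it reuses the Kd and Koff values quoted in the response to comment (6). All function and variable names are hypothetical.

```python
import math

# Parameter values quoted in the author response (Kd in nM, k_off in 1/s).
D1R = {"kd_nM": 1000.0, "k_off": 19.5}
D2R = {"kd_nM": 7.0, "k_off": 0.2}


def k_on(kd_nM, k_off):
    """Association rate constant (1/(nM*s)) implied by Kd = k_off / k_on."""
    return k_off / kd_nM


def equilibrium_occupancy(da_nM, kd_nM):
    """Steady-state bound fraction at a constant DA concentration."""
    return da_nM / (da_nM + kd_nM)


def relaxation_time(da_nM, kd_nM, k_off):
    """Time constant (s) with which occupancy approaches equilibrium."""
    return 1.0 / (k_on(kd_nM, k_off) * da_nM + k_off)


def occupancy_after(dt, b0, da_nM, kd_nM, k_off):
    """Exact solution of dB/dt = k_on*DA*(1-B) - k_off*B for constant DA."""
    b_inf = equilibrium_occupancy(da_nM, kd_nM)
    tau = relaxation_time(da_nM, kd_nM, k_off)
    return b_inf + (b0 - b_inf) * math.exp(-dt / tau)


if __name__ == "__main__":
    # Unbinding time constants at vanishing DA: 1 / k_off.
    print("D1R tau at low DA (s):", relaxation_time(0.0, **D1R))
    print("D2R tau at low DA (s):", relaxation_time(0.0, **D2R))
    # Occupancy reached during a hypothetical 2 ms, 1000 nM transient.
    print("D1R after 2 ms:", occupancy_after(0.002, 0.0, 1000.0, **D1R))
```

      Under these assumptions, the D1R unbinding time constant is 1/19.5 s, roughly 50 ms, and the D2R time constant is 1/0.2 s = 5 s, which is in line with the timescales discussed in the response. A transient lasting only a few milliseconds moves D1R occupancy by a small fraction of its equilibrium value.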

      Second, crucial to the receptor model here is the inference that D1 receptor unbinding is rapid; but this inference is made based on the kinetics of dopamine sensors and is superficially explained - it is unclear why sensor kinetics should let us extrapolate to receptor kinetics, and unclear how safe is the extrapolation of the linear regression by an order of magnitude to get the D1 unbinding rate.

      We chose to use the sensors because it was possible to estimate precise affinities/off-rates from the fluorescent measurements. Although there might be some variation in affinities that could be attributable to the mutations introduced in the sensors, the data clearly separated D1R and D2R with a D1R affinity of ~1000 nM and a D2R affinity of ~7 nM (Labouesse & Patriarchi, 2021), consistent with earlier predictions of receptor affinities. From our assessment of the literature we found that this was the most reasonable way to estimate affinities and thereby off-rates. Importantly, the model has been made publicly available, so should new measurements arise, the simulations can be rerun with tweaks to the input parameters.

    1. eLife Assessment

      This valuable study provides convincing evidence that specific proteins on the surface of cancer cells undergo a particular form of recycling and are redirected toward the cell-cell contact with T cells, a type of immune cell. However, the characterization of the consequences of T cell activation resulting from perturbing the recycling pathway is incomplete. Furthermore, relevant literature has not been sufficiently cited.

    2. Reviewer #1 (Public review):

      Summary:

      This study by Xu et al. focuses on the impact of clathrin-independent endocytosis in cancer cells on T cell activation. In particular, by using a combination of biochemical approaches and imaging, the authors identify ICAM1, the ligand for T cell-expressed integrin LFA-1, as a novel cargo for EndoA3-mediated endocytosis. Subsequently, the authors aim to identify functional implications for T cell activation, using a combination of cytokine assays and imaging experiments.

      They find that the absence of EndoA3 leads to a reduction in T cell-produced cytokine levels. Additionally, they observe slightly reduced levels of ICAM1 at the immunological synapse and an enlarged contact area between T cells and cancer cells. Taken together, the authors propose a mechanism where EndoA3-mediated endocytosis of ICAM1, followed by retrograde transport, supplies the immunological synapse with ICAM1. In the absence of EndoA3, T cells attempt to compensate for suboptimal ICAM1 levels at the synapse by enlarging their contact area, which proves insufficient and leads to lower levels of T cell activation.

      Strengths:

      The authors utilize a rigorous and innovative experimental approach that convincingly identifies ICAM1 as a novel cargo for EndoA3-mediated endocytosis.

      Weaknesses:

      The characterization of the effects of EndoA3 absence on T cell activation appears incomplete. Key aspects, such as surface marker upregulation, T cell proliferation, integrin signalling and, most importantly, the killing of cancer cells, are not comprehensively investigated.

      As Endo- and exocytosis are intricately linked with the biophysical properties of the cellular membrane (e.g. membrane tension), which can significantly impact T-cell activation and cytotoxicity, the authors should address this possibility and ideally address it experimentally to some degree.

      Crucially, key literature relevant to this research, addressing the role of ICAM1 endocytosis in antigen-presenting cells, has not been taken into consideration.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Xu et al. studies the relevance of endophilin A3-dependent endocytosis and retrograde transport of immune synapse components in the activation of cytotoxic CD8 T cells. First, the authors show that ICAM1 and ALCAM, known components of immune synapses, are endocytosed via EndoA3-dependent endocytosis and retrogradely transported to the Golgi. The authors then show that blocking internalization or retrograde trafficking reduces the activation of CD8 T cells. Moreover, this diminished CD8 T cell activation resulted in the formation of an enlarged immune synapse with reduced ICAM1 recruitment.

      Strengths:

      The authors show a novel EndoA3-dependent endocytic cargo and provide strong evidence linking EndoA3 endocytosis to the retrograde transport of ALCAM and ICAM1.

      Weaknesses:

      The role of EndoA3 in the process of T cell activation is shown in a cell that requires exogenous expression of this gene. Moreover, the authors claim that their findings are important for polarized redistribution of cargoes, but failed to show convincingly that the cargoes they are studying are polarized in their experimental system. The statistics of the manuscript also require some refinement.

    4. Reviewer #3 (Public review):

      Summary:

      Shiqiang Xu and colleagues have examined the importance of ICAM-1 and ALCAM internalization and retrograde transport in cancer cells on the formation of a polarized immunological synapse with cytotoxic CD8+ T cells. They find that internalization is mediated by Endophilin A3 (EndoA3) while retrograde transport to the Golgi apparatus is mediated by the retromer complex. The paper is building on previous findings from corresponding author Henri-François Renard showing that ALCAM is an EndoA3-dependent cargo in clathrin-independent endocytosis.

      Strengths:

      The work is interesting as it describes a novel mechanism by which cancer cells might influence CD8+ T cell activation and immunological synapse formation, and the authors have used a variety of cell biology and immunology methods to study this. However, there are some aspects of the paper that should be addressed more thoroughly to substantiate the conclusions made by the authors.

      Weaknesses:

      In Figure 2A-B, the authors show micrographs from live TIRF movies of HeLa and LB33-MEL cells stably expressing EndoA3-GFP and transiently expressing ICAM-1-mScarlet. The ICAM-1 signal appears diffuse across the plasma membrane while the EndoA3 signal is partially punctate and partially lining the edge of membrane patches. Previous studies of EndoA3-mediated endocytosis have indicated that this can be observed as transient cargo-enriched puncta on the cell surface. In the present study, there is only one example of such an ICAM-1 and EndoA3 positive punctate event. Other examples of overlapping signals between ICAM-1 and EndoA3 are shown, but these either show retracting ICAM-1 positive membrane protrusions or large membrane patches encircled by EndoA3. While these might represent different modes of EndoA3-mediated ICAM-1 internalization, any conclusion on this would require further investigation.

      Moreover, in Figure 2C-E, uptake of the previously established EndoA3 endocytic cargo ALCAM is analyzed by quantifying total internal fluorescence of antibody-labelled ALCAM in LB33-MEL cells following both overexpression and siRNA-mediated knockdown of EndoA3, showing increased and decreased uptake respectively. Why has the same quantification not been done for the proposed novel EndoA3 endocytic cargo ICAM-1? Furthermore, if endocytosis of ICAM-1 and ALCAM is diminished following EndoA3 knockdown, the expression level on the cell surface would presumably increase accordingly. This has been shown for ALCAM previously and should also be quantified for ICAM-1.

      In Figure 4A the authors show micrographs from a live-cell Airyscan movie (Movie S6) of a CD8+ T cell incubated with HeLa cells stably expressing HLA-A*68012 and transiently expressing ICAM1-EGFP. From the movie, it seems that some ICAM-1 positive vesicles in one of the HeLa cells are moving towards the T cell. However, it does not appear like the T cell has formed a stable immunological synapse but rather perhaps a motile kinapse. Furthermore, to conclude that the ICAM-1 positive vesicles are transported toward the T cell in a polarized manner, vesicles from multiple cells should be tracked and their overall directionality should be analyzed. It would also strengthen the paper if the authors could show additional evidence for polarization of the cancer cells in response to T-cell interaction.

      Finally, in Figures 4D-G, the authors show that the contact area between CD8+ T cells and LB33-MEL cells is increased in response to siRNA-mediated knockdown of EndoA3 and VPS26A. While this could be caused by reduced polarized delivery of ICAM-1 and ALCAM to the interface between the cells, it could also be caused by other factors such as increased cell surface expression of these proteins due to diminished endocytosis, and/or morphological changes in the cancer cells resulting from disrupted membrane traffic. More experimental evidence is needed to support the working model in Figure 4H.

    1. eLife Assessment

      This study presents valuable insights into the role of two proteins, Rab27A and SYTL5, which control vesicle transport and delivery. While the data is clear, the overall evidence is somewhat incomplete. Strengthening the mechanistic aspect would enhance the study, making it of greater interest to cell biologists studying membrane trafficking and mitochondria.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Ana Lapao et al. investigated the roles of the Rab27 effector SYTL5 in cellular membrane trafficking pathways. The authors found that SYTL5 localizes to mitochondria in a Rab27A-dependent manner. They demonstrated that SYTL5-Rab27A positive vesicles containing mitochondrial material are formed under hypoxic conditions, and thus they speculate that SYTL5 and Rab27A play roles in mitophagy. They also found that both SYTL5 and Rab27A are important for normal mitochondrial respiration. Cells lacking SYTL5 undergo a shift from mitochondrial oxygen consumption to glycolysis, a common process in cancer cells known as the Warburg effect. Based on a cancer patient database, the authors noticed that low SYTL5 expression is related to reduced survival for adrenocortical carcinoma patients, indicating SYTL5 could be a negative regulator of the Warburg effect and potentially tumorigenesis.

      Strengths:

      The authors take advantage of multiple techniques and novel methods to perform the experiments.

      (1) Live-cell imaging revealed that stably inducible expression of SYTL5 co-localized with filamentous structures positive for mitochondria. This result was further confirmed by using correlative light and EM (CLEM) analysis and western blotting from purified mitochondrial fraction.

      (2) In order to investigate whether SYTL5 and RAB27A are required for mitophagy in hypoxic conditions, two established mitophagy reporter U2OS cell lines were used to analyze the autophagic flux.

      Weaknesses:

      This study revealed a potential function of SYTL5 in mitophagy and mitochondrial metabolism. However, the mechanistic evidence that establishes the relationship between SYTL5/Rab27A and mitophagy is insufficient. The involvement of SYTL5 in ACC needs more investigation. Furthermore, images and results supporting the major conclusions need to be improved.

    3. Reviewer #2 (Public review):

      Summary:

      The authors provide convincing evidence that Rab27 and SYTL5 work together to regulate mitochondrial activity and homeostasis.

      Strengths:

      The development of models that allow the function to be dissected, and the rigorous approach and testing of mitochondrial activity.

      Weaknesses:

      There may be unknown redundancies in both pathways in which Rab27 and SYTL5 are working which could confound the interpretation of the results.

      Suggestions for revision:

      Given that Rab27A and SYTL5 are members of protein families it would be important to exclude any possible functional redundancies coming from Rab27B expression or one of the other SYTL family members. For Rab27 this would be straightforward to test in the assays shown in Figure 4 and Supplementary Figure 5. For SYTL5 it might be sufficient to include some discussion about this possibility.

      Suggestions for Discussion:

      Both Rab27A and SYTL5 localize to other membranes, including the endolysosomal compartments. How do the authors envisage the mechanism or cellular modifications that allow these proteins, either individually or in complex, to also regulate mitochondrial function? It would be interesting to have some views.

    4. Reviewer #3 (Public review):

      Summary:

      In the manuscript by Lapao et al., the authors uncover a role for the RAB27A effector protein SYTL5 in regulating mitochondrial function and turnover. The authors find that SYTL5 localizes to mitochondria in a RAB27A-dependent way and that loss of SYTL5 (or RAB27A) impairs lysosomal turnover of an inner mitochondrial membrane mitophagy reporter but not a matrix-based one. As the authors see no co-localization of GFP/mScarlet tagged versions of SYTL5 or RAB27A with LC3 or p62, they propose that lysosomal turnover is independent of the conventional autophagy machinery. Finally, the authors go on to show that loss of SYTL5 impacts mitochondrial respiration and ECAR and as such may influence the Warburg effect and tumorigenesis. Of relevance here, the authors go on to show that SYTL5 expression is reduced in adrenocortical carcinomas and this correlates with reduced survival rates.

      Strengths:

      There are clearly interesting and new findings here that will be relevant to those following mitochondrial function, the endocytic pathway, and cancer metabolism.

      Weaknesses:

      The data feel somewhat preliminary in that the conclusions rely on exogenously expressed proteins and reporters, which do not always align.

      As the authors note there are no commercially available antibodies that recognize endogenous SYTL5, hence they have had to stably express GFP-tagged versions. However, it appears that the level of expression dictates co-localization from the examples the authors give (though it is hard to tell as there is a lack of any kind of quantitation for all the fluorescent figures). Therefore, the authors may wish to generate an antibody themselves or tag the endogenous protein using CRISPR.

      In relation to quantitation, the authors found that SYTL5 localizes to multiple compartments or potentially a few compartments that are positive for multiple markers. Some quantitation here would be very useful as it might inform on function.

      The authors find that under hypoxia/hypoxia-like conditions, punctate structures of SYTL5 and RAB27A form that are positive for Mitotracker, and that a very specific mitophagy assay based on the pSu9-Halo system is impaired by siRNA of SYTL5/RAB27A, but another, distinct mitophagy assay (Matrix EGFP-mCherry) shows no change. I think this work would strongly benefit from some measurements with endogenous mitochondrial proteins, both via immunofluorescence and western blot-based flux assays.

      A really interesting aspect is the apparent independence of this mitophagy pathway from the conventional autophagy machinery. However, this is only based on a lack of co-localization of p62 or LC3 with LAMP1 and GFP/mScarlet tagged SYTL5/RAB27A. I would not expect them to greatly colocalize in lysosomes, as both p62 and LC3 will become rapidly degraded, while the eGFP and mScarlet tags are relatively resistant to lysosomal hydrolysis. Experiments with and without a lysosome inhibitor might help here, and ideally the functional mitophagy assays should be repeated in autophagy KOs.

      The link to tumorigenesis and cancer survival is very interesting but it is not clear if this is due to the mitochondrially-related aspects of SYTL5 and RAB27A. For example, increased ECAR is seen in the SYTL5 KO cells but not in the RAB27A KO cells (Fig.5D), implying that mitochondrial localization of SYTL5 is not required for the ECAR effect. More work to strengthen the link between the two sections in the paper would help with future directions and impact with respect to future cancer treatment avenues to explore.

    1. eLife Assessment

      This important study characterizes the mechanics and stability of bolalipids from archaeal membranes using molecular dynamics simulations. A mesoscale model of bolalipids is presented and evaluated across a series of membrane configurations. The evidence supporting the conclusions is convincing, demonstrating that mixtures of bolalipids and regular bilayer lipids in archaeal membranes enhance fluidity and stability.

    2. Reviewer #1 (Public review):

      Summary:

      Amaral et al. present a study investigating the mesoscale modelling and dynamics of bolalipids.

      Strengths:

      The figures in this paper are exceptional. Both those to outline and introduce the lipid types, but also the quality and resolution of the plots. The data held within also appears to be outstanding and of significant (hopefully) general interest.

      Weaknesses:

      In the introduction, I would like to have read more specifics on the biological role of bolalipids. Archaea are mentioned, but this kingdom is huge - there must be specific species that can be discussed where bolalipids are integral to archaeal life. The authors should go beyond 'extremophiles'. In short, they should unpack why the general audience should be interested in these lipids, within a subset of organisms that are often forgotten about.

    3. Reviewer #2 (Public review):

      Summary:

      The authors aimed to understand the biophysical properties of archaeal membranes made of bolalipids. Bacterial and eukaryotic membranes are made of lipids that self-assemble into bilayers. Archaea, instead, use bolalipids, lipids that have two headgroups and can span the entire bilayer. The authors wanted to determine if the unique characteristics of archaea, which are often extremophiles, are in part due to the fact that their membranes contain bolalipids.

      The authors develop a minimal computational model to compare the biophysics of bilayers made of lipids, bolalipids, and mixtures of the two. Their model enables them to determine essential parameters such as bilayer phase diagrams, mechanical moduli, and the bilayer behavior upon cargo inclusion and remodeling.

      The authors demonstrate that bolalipid bilayers behave as binary mixtures, containing bolalipids organized either in a straight conformation, spanning the entire bilayer, or in a U-shaped one, confined to a single leaflet. This dynamic mixture allows bolalipid bilayers to be very sturdy while still permitting remodeling. However, remodeling is energetically more expensive than with standard lipids. The authors speculate that this might be why lipids were more abundant in the evolutionary process.

      Strengths:

      This is a wonderful paper, a very fine piece of scholarship. It is interesting from the point of view of biology, biophysics, and material science. The authors mastered the modeling and analysis of these complex systems. The evidence for their findings is really strong and complete. The paper is written superbly, the language is precise and the reading experience is very pleasant. The plots are very well-thought-out.

      Weaknesses:

      I would not talk about weaknesses, because this is really a nice paper. If I really had to find one, I would have liked to see some clear predictions of the model expressed in such a way that experimentalists could design validation experiments.

    4. Reviewer #3 (Public review):

      Summary:

      The authors have studied the mechanics of bolalipid and archaeal mixed-lipid membranes via comprehensive molecular dynamics simulations. The Cooke-Deserno 3-bead-per-lipid model is extended to bolalipids with 6 beads. Phase diagrams, bending rigidity, mechanical stability of curved membranes, and cargo uptake are studied. Effects such as the formation of U-shaped bolalipids, pore formation in highly curved regions, and changes in membrane rigidity are studied and discussed. The main aim has been to show how the mixture of bolalipids and regular bilayer lipids in archaeal membrane models enhances the fluidity and stability of these membranes.

      Strengths:

      The authors have presented a wide range of simulation results for different membrane conditions and conformations. For the most part, the analyses and their results are presented clearly and concisely. Figures, supplementary information, and movies very well present what has been studied. The manuscript is well-written and is easy to follow.

      Major issues:

      The Cooke-Deserno model, while very powerful for biophysical analysis of membranes at the mesoscale, is largely devoid of chemical information. It is parameterized such that it is good at producing fluid membranes and predicting values for bending rigidity, compressibility, and even thermal expansion coefficient that fall in the accepted range of values for bilayer membranes. But it still represents a generic membrane. Now, the authors have suggested a similar model for the archaeal bolalipids, which are chemically different lipids (the presence of cyclopentane rings for one), and there is no good justification for using the same pairwise interactions between their representative beads in the coarse-grained model. This does not necessarily diminish the worth of all the authors' analyses. What is at risk here is the confusion between "what we observe this model of bolalipid- or mixed-membranes do" and "how real bolalipid-containing archaeal membranes behave at these mechanical and thermal conditions."

      Another more specific, major issue has to do with using the Hamm-Kozlov model for fitting the power spectrum of thermal undulations. The 1/q^2 term can very well be attributed to membrane tension. While a barostat is indeed used, have the authors made absolutely sure that the deviation from 1/q^4 behavior does not correspond to lateral tension? I got more worried when I noticed in the SI that the simulations had been done with combined "fix langevin" and "fix nph" LAMMPS commands. This combination does not result in a proper isothermal-isobaric ensemble. The importance of tilt terms for bolalipids is indeed very interesting, but I believe more care is needed to establish that.
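      For reference, the ambiguity raised here can be made explicit by writing the two height spectra side by side. The block below is an editorial sketch using the standard textbook forms, not expressions taken from the manuscript; the symbols ($\kappa$ bending rigidity, $\kappa_\theta$ tilt modulus, $\sigma$ lateral tension, $A$ projected area) are assumed, and the overall prefactor depends on the Fourier normalization convention used.

```latex
% Tilt-corrected (Hamm-Kozlov-type) spectrum of a tensionless membrane:
\langle |h_q|^2 \rangle = \frac{k_B T}{A}\left(\frac{1}{\kappa q^4} + \frac{1}{\kappa_\theta q^2}\right)

% Helfrich spectrum of a membrane under lateral tension, without tilt:
\langle |h_q|^2 \rangle = \frac{k_B T}{A\left(\kappa q^4 + \sigma q^2\right)}
```

      At small $q$ the tension term dominates the second expression, which then scales as $1/q^2$, whereas at large $q$ the tilt term dominates the first. A fit over a limited range of $q$ may therefore struggle to separate the two contributions, which is why the residual tension needs to be checked independently.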

      This issue is reinforced when considering Figure 3B. These results suggest that increasing the fraction of regular lipids increases the tilt modulus, with the maximum value achieved for a normal Cooke-Deserno bilayer void of bolalipids. But this is contradictory. For these bilayers, we don't need the tilt modulus in the first place.

      Also, from the SI, I gathered that the authors have neglected the longest wavelength mode because it is not equilibrated. If this is indeed the case, it is a dangerous thing to do, because with a small membrane patch, this mode can very well change the general trend of the power spectrum. As a lot of other analyses in the manuscript rely on these measurements, I believe more elaboration is in order.

      The authors have found that "there is a strong dependency of the bending rigidity on the membrane mean curvature of stiffer bolalipids." The effect is negative, with the membrane becoming less stiff at higher mean curvatures. Why is that? I would assume that with more flexible bolalipids, the possibility of reorganization into U-shaped chains should affect the bending rigidity more (as Figure 2E suggests), whereas for a stiff bolalipid, not much would change if you increase the mean curvature. This should be either a tilt effect, or have to do with asymmetry between the leaflets. But on the other hand, the tilt modulus is shown to decrease with increasing bolalipid rigidity. The authors get back to this issue only on page 10, when they consider U-shaped lipids in the inner and outer leaflets and write, "this suggested that an additional membrane-curving mechanism must be involved." But then again, in the Discussion, the authors write, "It is striking that membranes made from stiffer bolalipids showed a curvature-dependent bending modulus, which is a clear signature that bolalipid membranes exhibit plastic behavior during membrane reshaping," adding to the confusion.

      This issue is repeated when the authors study nanoparticle uptake. They write: "to reconcile these seemingly conflicting observations we reason that the bending rigidity, similar to Figure 2F, is not constant but softens upon increasing membrane curvature, due to dynamic change in the ratio between bolalipids in straight and U-shaped conformation. Hence, bolalipid membranes show stroking plastic behavior as they soften during reshaping." But the softening effect that they refer to, as shown in Figure 4B, occurs for very stiff bolalipids, for which not much switching to U-shaped conformation should occur.

      Another major issue is with what the authors refer to as the "effective temperature". While plotting phase diagrams against kT/eps is absolutely valid, I'm not a fan of calling this an effective temperature. It is a dimensionless quantity that scales linearly with temperature, but it is not a temperature. It is usually called a "reduced temperature". Then the authors refer to their findings as studying the stability of archaeal membranes at high temperatures. I have to disagree, because eps is not the only potential parameter in the simulations (there are at least the excluded-volume and angle-bending stiffnesses), so one cannot identify changing eps with changing the global simulation temperature. This only works when you have one potential parameter, like an LJ gas.
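
      To make the point concrete, a short sketch with assumed notation: $u_{\mathrm{attr}}$, $u_{\mathrm{rep}}$, and $u_{\mathrm{angle}}$ are placeholder names for the dimensionless attractive, excluded-volume, and angle-bending contributions, and $\epsilon_{\mathrm{rep}}$ and $k_{\mathrm{angle}}$ are their prefactors in energy units. None of these symbols are taken from the manuscript:

      ```latex
      % Boltzmann exponent split by energy scale (illustrative decomposition)
      \frac{U}{k_B T} = \frac{\epsilon}{k_B T}\,u_{\mathrm{attr}}
                      + \frac{\epsilon_{\mathrm{rep}}}{k_B T}\,u_{\mathrm{rep}}
                      + \frac{k_{\mathrm{angle}}}{k_B T}\,u_{\mathrm{angle}}
      ```

      Varying $T^* = k_B T/\epsilon$ by changing $\epsilon$ alone rescales only the first term, whereas a true temperature change rescales all three terms by the same factor.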

      Minor issues:

      As the authors have noted, the fact that the membrane curvature can change the ratio of U-shaped to straight bolalipids would render the curvature elasticity non-linear (though the term "plastic" should not be used, as this is still structurally reversible when the stress is removed. Technically, it is hypoelastic behavior, possibly with hysteresis.) With this in mind, when the authors use essentially linear elastic models for fluctuation analysis, they should make a comparison of maximum curvatures occurring in simulations with a range that causes significant changes in bolalipid conformational ratios.

      The Introduction section of the manuscript is written with a biochemical approach, with very minor attention to the simulation works on this system. Some molecular dynamics works are only cited as existing previous work, without mentioning what has already been studied in archaeal membranes. Meanwhile, some information, like the binding of ESCRT proteins to archaeal membranes, though interesting, helps little to place the study within the discipline. The Introduction should be revised to show what has already been studied with simulations (as the authors mention in the Discussion) and how the presented research complements it.

      The authors have been a bit loose with using the term "stability". I'd like to see the distinction in each case, as in "chemical/thermal/mechanical/conformational stability".

      In the original Cooke-Deserno model, a so-called "poor man's angle-bending term" is used, which is essentially a bond-stretching term between the first and third particle. However, I notice the authors using the full harmonic angle-bending potential. This should be mentioned.
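
      For reference, the two functional forms being contrasted can be sketched as follows. The first is the form as usually quoted for the original model, with $r_{13}$ the distance between the first and third bead and $\sigma$ the bead diameter. The second is a generic harmonic angle potential with an assumed stiffness $k_\theta$ and a straight rest angle; whether the factor of $1/2$ is included depends on the simulation package's convention:

      ```latex
      % "Poor man's" bending: a harmonic spring between beads 1 and 3
      V_{\mathrm{bend}}(r_{13}) = \tfrac{1}{2}\,k_{\mathrm{bend}}\,\left(r_{13} - 4\sigma\right)^2

      % Full harmonic angle potential (k_theta is an assumed symbol)
      V_{\mathrm{angle}}(\theta) = \tfrac{1}{2}\,k_\theta\,\left(\theta - \pi\right)^2
      ```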

      The analysis of energy of U-shaped lipids with the linear model E=c_0 + c_1 * k_bola is indeed very interesting. I am curious, can this also be corroborated with mean energy measurements? The minor issue is calling the source of the favorability of U-shaped lipids "entropic", while clearly an energetic contribution is found. The two conformations, for example, might differ in the interactions with the neighboring lipids.

      The authors write in the Discussion, "In any case, our results indicate that membrane remodelling, such as membrane fission during membrane traffic, is much more difficult in bolalipid membranes [34]." Firstly, I'm not sure if studying the dependence of budding behavior on adhesion energy with nanoparticles is enough to make claims about membrane fission. Secondly, why is the 2015 paper by Markus Deserno cited here?

      In the SI, where the measurement of the diffusion coefficient is discussed, the expression for D is missing the power 2 of displacement.
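
      For reference, the standard Einstein relation for lateral (two-dimensional) diffusion, which is presumably the intended expression (an assumption, since the SI formula is not reproduced here), reads:

      ```latex
      % Mean squared displacement of a lipid, averaged over lipids and time origins
      D = \lim_{t \to \infty} \frac{\left\langle \left| \mathbf{r}(t) - \mathbf{r}(0) \right|^2 \right\rangle}{4t}
      ```

      The factor 4 is $2d$ with $d = 2$; for three-dimensional diffusion it would be 6.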

      Where cargo uptake is discussed, the term "adsorption energy" is used. I think the more appropriate term would be "adhesion energy".

      Typos:<br /> Page 1, paragraph 2: Adaption → Adaptation.<br /> Page 10, paragraph 1: Stroking → Striking.

    1. eLife Assessment

      This important study explores the commander-independent function of COMMD3-Arf1 in endosomal recycling. The evidence supporting the authors' claims is solid; however, the inclusion of additional validation experiments and control conditions would have further strengthened the study. The findings will be of significant interest to cell biologists working on membrane trafficking.

    2. Reviewer #1 (Public review):

      G. Squiers et al. analyzed a previously reported CRISPR genetic screening dataset of engineered GLUT4 cell-surface presentation and identified the Commander complex subunit COMMD3 as being required for endosomal recycling of specific cargo proteins, such as transferrin receptor (TfR), to the cell surface. Through comparison of COMMD3-KO and other Commander subunit-KO cells, they demonstrated that the role of COMMD3 in mediating TfR recycling is independent of the Commander complex. Structural analysis and co-immunoprecipitation followed by mass spectrometry revealed that TfR recycling by COMMD3 relies on ARF1. COMMD3 interacts with ARF1 through its N-terminal domain (NTD) to stabilize ARF1. A mutation in the NTD of COMMD3, which disrupts the NTD-ARF1 interaction, failed to rescue cell surface TfR in COMMD3-KO cells. In conclusion, the authors assert that COMMD3 stabilizes ARF1 in a Commander complex-independent manner, which is essential for recycling specific cargo proteins from endosomes to the plasma membrane.

      The conclusions of this paper are generally supported by data, but some validation experiments and control conditions should be included to strengthen the study.

      (1) Commander-Independent Role of COMMD3:<br /> While the authors provided evidence to support the Commander-independent role of COMMD3 (such as the absence of other Commander subunits in the CRISPR screen and the undiminished COMMD3 levels in other subunit-KO cells), direct evidence is lacking. The mutation that specifically disrupts the COMMD3-ARF1 interaction could serve as a valuable tool to directly address this question.

      (2) Role of ARF1 in Cargo Selection:<br /> The Commander-independent function of COMMD3 appears cargo-dependent and relies on ARF1's role in cargo selection. The authors should investigate whether KO/KD of ARF1 reduces cell surface levels of ITGA6 and TfR.

      (3) Impact on TfR Stability:<br /> Figure 7D suggests that TfR protein levels are reduced in COMMD3-KO cells, potentially due to degradation caused by disrupted recycling. This raises the question of whether the observed reduction in cell surface TfR is due to impaired endosomal recycling or decreased total protein levels. The authors should quantify the ratio of cell surface protein to total protein for TfR, GLUT4-SPR, and ITGA6 in COMMD3-KO cells.

    3. Reviewer #2 (Public review):

      Summary:

      The Commander complex is a key player in endosomal recycling that recruits cargo proteins and facilitates the formation of tubulo-vesicular carriers. Squiers et al. found that COMMD3, a subunit of the Commander complex, can interact directly with ARF1 and regulate endosomal recycling.

      Strengths:

      Overall, this is a nice study that provides some interesting knowledge on the function of the Commander complex.

      Weaknesses:

      Several issues should be addressed.

      (1) All existing data suggest that COMMD3 is a subunit of the Commander complex. Is there any evidence that COMMD3 can exist as a monomer?

      (2) In Figure 9, the authors emphasize COMMD3-dependent cargo and Commander-dependent cargo. Can the authors speculate what distinguishes these two types of cargo? Do they contain sequence-specific motifs?

      (3) What could be the possible mechanism underlying the observation that the knockout of COMMD3 results in larger early endosomes? How is the disruption of cargo retrieval related to the increase in endosome size?

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript by Squiers and colleagues uncovers a Commander-independent function for COMMD3 in endosomal recycling. The authors identified COMMD3 as a regulator of endosomal recycling for GLUT4-SPR through unbiased genetic screens. Subsequently, the authors performed COMMD3 knockout experiments to assess endosomal morphology and trafficking, demonstrating that COMMD3 regulates endosomal trafficking in a Commander-independent manner. Furthermore, the authors identified and confirmed that the N-terminal domain (NTD) of COMMD3 interacts with the GTPase Arf1. Using structure-guided mutations, they demonstrated that the COMMD3-Arf1 interaction is critical for the Commander-independent function of COMMD3.

      Overall, the manuscript presents compelling evidence for a Commander-independent role of COMMD3, and I agree with the authors' interpretations. The manuscript uses a combination of genetic screening, microscopy, and structural and biochemical approaches to examine and support the conclusions. This is an excellent and intriguing study and I have only a few comments and suggestions to improve the manuscript further.

    1. eLife Assessment

      This manuscript makes a valuable contribution to the field by uncovering a molecular mechanism for miRNA intracellular retention, mediated by the interaction of PCBP2, SYNCRIP, and specific miRNA motifs. These findings advance our understanding of RNA-binding protein-mediated miRNA sorting and provide deeper insights into miRNA dynamics. While the conclusions are supported by solid experimental evidence, additional controls and clarification of the precise intracellular interactions would further strengthen the study.

    2. Reviewer #1 (Public review):

      In this study, Marocco and colleagues perform a deep characterization of the complex molecular mechanism guiding the recognition of a particular CELLmotif previously identified in hepatocytes in another publication. Having miR-155-3p with or without this CELLmotif as the initial focus, the authors identify 21 proteins differentially binding to these two miRNA versions. From there, they decide to focus on PCBP2. They elegantly demonstrate PCBP2 binding to the miR-155-3p WT version but not to the CELLmotif-mutated version. miR-155-3p contains a hEXOmotif identified in a different report, whose recognition is largely mediated by another RNA-binding protein called SYNCRIP. Interestingly, mutation of the hEXOmotif contained in miR-155-3p blunted not only SYNCRIP binding but also PCBP2 binding, despite the maintenance of the CELLmotif. This indicates that somehow SYNCRIP binding is a prerequisite for PCBP2 binding. An EMSA confirms that SYNCRIP is necessary for PCBP2 binding to miR-155-3p, while PCBP2 is not needed for SYNCRIP binding. The authors aim to extend these findings to other miRNAs containing both motifs. For that, they perform small-RNA-Seq of EVs released from PCBP2-knockdown cells versus control cells, identifying a subset of miRNAs whose expression either increases or decreases. The assumption is that those miRNAs containing the PCBP2-binding CELLmotif should now be less retained in the cell and go more to extracellular vesicles, thus reflecting a higher EV expression. The specific subset of miRNAs having both the CELLmotif and hEXOmotif (9 miRNAs) whose expression increases in EVs due to PCBP2 reduction is also affected by knocking down SYNCRIP, in the sense that reduction of SYNCRIP leads to lower EV sorting. Further experiments confirm that PCBP2 and SYNCRIP bind to these 9 miRNAs and that knocking down SYNCRIP impairs their EV sorting.

      While the process studied in this work is novel and interesting, there are several aspects of this manuscript that should be improved:

      (1) First of all, the nature of the CELLmotif and the hEXOmotif they are studying is extremely confusing. For the CELLmotif, the authors seem to focus on the Core CELLmotif AUU A/G in some experiments and the extended 7-nucleotide version in others. The fact that these CELLmotif and hEXOmotif are not shown anywhere in the figures (I mean with the full nucleotide variability described in the original publications) but only referred to in the text further complicates the identification of the motifs and the understanding of the experiments. Moreover, I am not convinced that the sequences they highlight in grey correspond to the original CELLmotif in all cases. For instance, in the miR-155-3p sequence, GCAUU is highlighted in grey. However, the original CELLmotif is basically 7 nucleotides long: C, A/U, G/A/C, U, U/A, C/G/A, A/U/C or CAGUUCA in its more abundant version. I can only see clearly the presence of the Core CELLmotif AUUA in miR-155-3p; however, the last A is not highlighted in grey. It is true that there is some nucleotide variability in each position in the originally reported CELLmotif by the authors in ref. 5 and the hEXOmotifs in ref. 7; however, not all nucleotides are equally likely to be found in each position. This fact does not seem to have been taken into account by the authors, as they took basically any sequence of any length and almost any sequence combination as a valid CELLmotif. This means that I cannot identify the CELLmotif in many cases among the ones they highlight in grey. Instead, they should really focus on the most predominant CELLmotif sequence or, alternatively, take a reduced subset of "more abundant" CELLmotif versions from the ones that could fit in the originally described CELLmotif. Altogether, the authors need to explain much better what they have considered as the CELLmotif, what is the Core CELLmotif and what is hEXOmotif in each case and restrict to the most likely versions of the CELLmotif and hEXOmotif.

      (2) Validation of EV isolation method: first, a large part of Supplementary Figure 2 is not readable. EV markers seem to be enriched in EV isolates; however, more EV and cell markers should be assayed to fulfill MISEV guidelines.

      (3) A key variable is missing in Supplementary Figure 2, which is whether PCBP2 or SYNCRIP knockdowns impair EV secretion rates. A quantification of the number of vesicles released per cell upon knocking down each of these factors would be essential to rule out that any of the effects seen throughout the paper are due to reduced or enhanced EV production rather than miRNA sorting/retention.

      (4) The EMSA experiment is important to support their claims. Given the weak bands that are shown, the authors need to show all their replicates to convince the readers that it is reproducible.

      (5) Although the bindings of SYNCRIP and PCBP2 to miR-155-3p and other miRNAs having both hEXOmotif and CELLmotif seem clear, the need for SYNCRIP binding to allow for PCBP2-mediated cellular retention is counterintuitive. What happens to those miRNAs that only contain a CELLmotif in terms of cellular retention and SYNCRIP dependence for cellular retention? In this regard, a representative miRNA (miR-31-3p) is analyzed in several experiments, showing that PCBP2 does not bind to it unless a hEXOmotif is introduced (Figure 3). However, this type of experiment should definitely be extended to other miRNAs containing only CELLmotif without hEXOmotif.

      (6) Along the same line, I am missing another important experiment: the artificial incorporation of a CELLmotif. For example, miR-365-2-5p lacks a CELLmotif but has a hEXOmotif. Does PCBP2 bind to this miRNA upon incorporation of a CELLmotif? Does this now lead to enhanced cellular retention of this miRNA?

      (7) What would be the net effect of knocking down both SYNCRIP and PCBP2 at the same time? Would this neutralize each other's effect or would the lack of one impose on the other? That could help in understanding the complex interplay between these two factors for mediating cellular retention and EV sorting.

      (8) The authors have here a great opportunity to shed some light on an unclear aspect of miRNA EV sorting and cellular retention: whether the RBPs go together with the miRNA to the EVs or not. While the original paper describing hEXOmotif found SYNCRIP in EVs, another publication (Jeppesen et al, Cell 2019; PMID: 30951670) later found this RBP being very scarce in small EVs compared to cellular bodies or large EVs (Supplementary Tables 3 and 4 in that publication). Can the authors find SYNCRIP and PCBP2 in the EVs? Another important question would be the colocalization of these RBPs in the place where the miRNA selection is supposed to take place: in multivesicular bodies (MVB). Is there a colocalization of these RBPs with MVBs in the cell?

      (9) In Figure 4C, the authors state in the text that the CELLmotif and hEXOmotif are present in the extra-seed region; however, for miR-181d-5p and miR-122-3p this is not true, as their CELLmotifs fall within the seed sequence.

      (10) The authors need to describe how they calculate the EV/cell ratio in gene expression in some experiments (for instance, Figures 1H, 4D, etc). Did they use any housekeeping gene for EV RNA content, the same RNA load, or some other alternative method to normalize EV vs cell RNA content?

      (11) I would suggest that the authors speculate a bit in the discussion section on how the interaction between PCBP2 and SYNCRIP takes place. Do they contain any potential interacting domain? Would the binding of one to the miRNA impose a topological interference on the binding of the other?

    3. Reviewer #2 (Public review):

      Summary:

      The authors of this manuscript aimed to uncover the mechanisms behind miRNA retention within cells. They identified PCBP2 as a crucial factor in this process, revealing a novel role for RNA-binding proteins. Additionally, the study discovered that SYNCRIP is essential for PCBP2's function, demonstrating the cooperative interaction between these two proteins. This research not only sheds light on the intricate dynamics of miRNA retention but also emphasizes the importance of protein interactions in regulating miRNA behavior within cells.

      Strengths:

      This paper makes important progress in understanding how miRNAs are kept inside cells. It identifies PCBP2 as a key player in this process, showing a new role for proteins that bind RNA. The study also finds that SYNCRIP is needed for PCBP2 to work, highlighting how these proteins work together. These discoveries not only improve our knowledge of miRNA behavior but also suggest new ways to develop treatments by controlling miRNA locations to influence cell communication in diseases. The use of liver cell models and thorough experiments ensures the results are reliable and show their potential for RNA-based therapies.

      Weaknesses:

      Despite its strengths, the manuscript has several notable limitations. The study's exclusive focus on hepatocytes limits the applicability of the findings to other cell types and physiological contexts. While the interaction between PCBP2 and SYNCRIP is well-characterized, the manuscript lacks detailed insights into the structural basis of this interaction and the dynamic regulation of their binding. The generalization of the findings to a broader spectrum of miRNAs and RNA-binding proteins (RBPs) remains underexplored, leaving gaps in understanding the full scope of miRNA compartmentalization.

      Furthermore, the therapeutic implications of these findings, though promising, are not directly connected to specific disease models or clinical scenarios, reducing their immediate translational impact. The manuscript would also benefit from a deeper discussion of potential upstream regulators of PCBP2 and SYNCRIP and the influence of cellular or environmental factors on their activity. Additionally, it is important to note that SYNCRIP has already been recognized as a major regulator of miRNA loading in extracellular vesicles (EVs). However, the purity of EVs is a concern, as the authors only performed crude extraction methods without further purification using an iodixanol density gradient. The study also lacks in vivo evidence of PCBP2's role in exosomal miRNA export.

    1. eLife Assessment

      This important study explores the conserved role of IgM in both systemic and mucosal antiviral immunity in teleosts, challenging established views on the differential roles of IgT and IgM. The findings have theoretical and practical implications for immunology and aquaculture. However, the strength of the evidence is incomplete due to insufficient validation of the monoclonal antibodies used to deplete IgM, which limits confidence in the main claims. Addressing these methodological weaknesses would significantly enhance the study's impact.

    2. Joint Public Review:

      In this manuscript, Weiguang Kong et al. investigate the role of immunoglobulin M (IgM) in antiviral defense in the teleost largemouth bass (Micropterus salmoides). The study employs an IgM depletion model, viral infection experiments, and complementary in vitro assays to explore the role of IgM in systemic and mucosal immunity. The authors conclude that IgM is crucial for both systemic and mucosal antiviral defense, highlighting its role in viral neutralization through direct interactions with viral particles. The study's findings have theoretical implications for understanding immunoglobulin function across vertebrates and practical relevance for aquaculture immunology.

      Strengths:

      The manuscript applies multiple complementary approaches, including IgM depletion, viral infection models, and histological and gene expression analyses, to address an important immunological question. The study challenges established views that IgT is primarily responsible for mucosal immunity, presenting evidence for a dual role of IgM at both systemic and mucosal levels. If validated, the findings have evolutionary significance, suggesting the conserved role of IgM as an antiviral effector across jawed vertebrates for over 500 million years. The practical implications for vaccine strategies targeting mucosal immunity in fish are noteworthy, addressing a key challenge in aquaculture.

      Weaknesses:

      Several conceptual and technical issues undermine the strength of the evidence:

      Monoclonal Antibody (MoAb) Validation: The study relies heavily on a monoclonal antibody to deplete IgM, but its specificity and functionality are not adequately validated. The epitope recognized by the antibody is not identified, and there is no evidence excluding cross-reactivity with other isotypes. Mass spectrometry, immunoprecipitation, or Western blot analysis using tissue lysates with varying immunoglobulin expression levels would strengthen the claim of IgM-specific depletion.

      IgM Depletion Kinetics: The rapid depletion of IgM from serum and mucus (within one day) is unexpected and inconsistent with prior literature. Additional evidence, such as Western blot analyses comparing treated and control fish, is necessary to confirm this finding.

      Novelty of Claims: The manuscript claims a novel role for IgM in viral neutralization, despite extensive prior literature demonstrating this role in fish. This overstatement detracts from the contribution of the study and requires a more accurate contextualization of the findings.

      Support for IgM's Crucial Role: The mortality data following IgM depletion do not fully support the claim that IgM is indispensable for antiviral defense. The survival of IgM-depleted fish remains high (75%) compared to non-primed controls (~50%), suggesting that other immune components may compensate for IgM loss.

      Presentation of IgM Depletion Model: The study describes the IgM depletion model as novel, although similar models have been previously published (e.g., Ding et al., 2023). This should be clarified to avoid overstating its novelty.

      While the manuscript attempts to address an important question in teleost immunology, the current evidence is insufficient to fully support the authors' conclusions. Addressing the validation of the monoclonal antibody, re-evaluating depletion kinetics, and tempering claims of novelty would strengthen the study's impact. The findings, if rigorously validated, have important implications for understanding the evolution of vertebrate immunity and practical applications in fish health management.

      This work is of interest to immunologists, evolutionary biologists, and aquaculture researchers. The methodological framework, once validated, could be valuable for studying immunoglobulin function in other non-model organisms and for developing targeted vaccine strategies. However, the current weaknesses limit its broader applicability and impact.