  1. Feb 2025
    1. Reviewer #2 (Public review):

      Summary:

      NRDE-3 is a nuclear WAGO-clade Argonaute that, in somatic cells, binds small RNAs amplified in response to the ERGO-class 26G RNAs that target repetitive sequences. This manuscript reports that, in the germline and early embryos, NRDE-3 interacts with a different set of small RNAs that target mRNAs. This class of small RNAs was previously shown to bind to a different WAGO-clade Argonaute called CSR-1, which is cytoplasmic, unlike nuclear NRDE-3. The switch in NRDE-3 specificity parallels recent findings in Ascaris where the Ascaris NRDE homolog was shown to switch from sRNAs that target repetitive sequences to CSR-class sRNAs that target mRNAs.

      The manuscript also correlates the change in NRDE-3 specificity with the appearance in embryos of cytoplasmic condensates that accumulate SIMR-1, a scaffolding protein that the authors previously implicated in sRNA loading for a different nuclear Argonaute, HRDE-1. By analogy, and through a set of correlative evidence, the authors argue that SIMR foci arise in embryogenesis to facilitate the change in NRDE-3 small RNA repertoire. The paper presents lots of data that beautifully document the appearance and composition of the embryonic SIMR-1 foci, including evidence that a mutated NRDE-3 that cannot bind sRNAs accumulates in SIMR-1 foci in a SIMR-1-dependent fashion.

    2. Reviewer #3 (Public review):

      Summary:

      Chen and Phillips present intriguing work that extends our view on the C. elegans small RNA network significantly. While the precise findings are rather C. elegans specific there are also messages for the broader field, most notably the switching of small RNA populations bound to an argonaute, and RNA granules behavior depending on developmental stage. The work also starts to shed more light on the still poorly understood role of the CSR-1 argonaute protein and supports its role in the decay of maternal transcripts. Overall, the work is of excellent quality, and the messages have a significant impact.

      Strengths:

      Compelling evidence for major shift in activities of an argonaute protein during development, and implications for how small RNAs affect early development. Very balanced and thoughtful discussion.

      Weaknesses:

      The switch between maternal and zygotic NRDE-3 remains unaddressed

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Chen and Phillips describe the dynamic appearance of cytoplasmic granules during embryogenesis analogous to SIMR germ granules, and distinct from CSR-1-containing granules, in the C. elegans germline. They show that the nuclear Argonaute NRDE-3, when mutated to abrogate small RNA binding, or in specific genetic mutants, partially colocalizes to these granules along with other RNAi factors, such as SIMR-1, ENRI-2, RDE-3, and RRF-1. Furthermore, NRDE-3 RIP-seq analysis in early vs. late embryos is used to conclude that NRDE-3 binds CSR-1-dependent 22G RNAs in early embryos and ERGO-1-dependent 22G RNAs in late embryos. These data lead to their model that NRDE-3 undergoes small RNA substrate "switching" that occurs in these embryonic SIMR granules and functions to silence two distinct sets of target transcripts: maternal, CSR-1-targeted mRNAs in early embryos, and duplicated genes and repeat elements in late embryos.

      Strengths:

      The identification and function of small RNA-related granules during embryogenesis is a poorly understood area and this study will provide the impetus for future studies on the identification and potential functional compartmentalization of small RNA pathways and machinery during embryogenesis.

      Weaknesses:

      (1) While the authors acknowledge the following issue, their finding that loss of SIMR granules has no apparent impact on NRDE-3 small RNA loading puts the functional relevance of these structures into question. As they note in their Discussion, it is entirely possible that these embryonic granules may be "incidental condensates." It would be very welcomed if the authors could include some evidence that these SIMR granules have some function; for example, does the loss of these SIMR granules have an effect on CSR-1 targets in early embryos and ERGO-1-dependent targets in late embryos?

      We appreciate reviewer 1’s concern that we do not provide enough evidence for the function of the SIMR granules. As suggested, we examined the NRDE-3-bound small RNAs more deeply, and we do observe a slight but significant increase in CSR-class 22G-RNA binding to NRDE-3 in late embryos of simr-1 and enri-2 mutants (see below, right). We hypothesize that this result could be due to a slower switch from CSR to ERGO 22G-RNAs in the absence of SIMR granules. We added these data to Figure 6G.

      (2) The analysis of small RNA class "switching" requires some clarification. The authors re-define ERGO-1-dependent targets in this study to arrive at a very limited set of genes and their justification for doing this is not convincing. What happens if the published set of ERGO-1 targets is used?

      As we mentioned in the manuscript, we initially attempted to use the previously defined ERGO targets. However, the major concern is that fewer than half the genes classified as ERGO targets by Manage et al. and Fischer et al. overlap with one another (Figure 6—figure supplement 1D and below). We reason this might be because the gene sets were defined as genes that lose small RNAs in various ERGO pathway mutants and because different criteria were used to define the lists, as discussed in the manuscript (lines 471-476). As a result, some of the previously defined ERGO target genes may actually be indirect targets of the pathway. Here we focus on genes targeted by small RNAs enriched in an ERGO pathway Argonaute IP, which should be more specific.

      In this manuscript, we are interested specifically in the ERGO targets bound by NRDE-3, thus we utilized the IP-small RNA sequencing data from young adult animals (Seroussi et al, 2023) to define a new ERGO list. We are confident about this list because 1) most of our new ERGO genes fall within the overlap between the ERGO-Manage and ERGO-Fischer lists (see Figure 6—figure supplement 1D in our manuscript and below), and 2) we observed the most significant decrease of small RNA levels and increase of mRNA levels in the nrde-3 mutants using our newly defined list (see Figure 6—figure supplement 1E-F in our manuscript).

      To further address reviewer 1’s concern about whether the data would look significantly different when using the ERGO-Manage and ERGO-Fischer lists, we made new scatter plots shown in Author response image 1 panels A-C below (ERGO-Manage – purple, ERGO-Fischer – yellow, and the overlap – yellow with purple ring). We found that the small RNA switching pattern of NRDE-3 is consistent with our newly defined list, particularly if we look at the overlap of the ERGO-Manage and ERGO-Fischer lists (Author response image 1 panels D-F below, red).

      Author response image 1.

      Further, the NRDE-3 RIP-seq data is used to conclude that NRDE-3 predominantly binds CSR-1 class 22G RNAs in early embryos, while ERGO-1-dependent 22G RNAs are enriched in late embryos. a) The relative ratios of each class of small RNAs are given in terms of unique targets. What is the total abundance of sequenced reads of each class in the NRDE-3 IPs?

      To address the reviewer’s question about the total abundance of sequenced reads of each class in the NRDE-3 IPs: Author response image 2 panels A-B below show the total RPM of CSR and ERGO class sRNAs in inputs and IPs at different stages. Focusing on late embryos, the total abundance of ERGO-dependent sRNAs is similar to that of CSR-class sRNAs in input but much higher in the IP, indicating an enrichment of ERGO-dependent 22G-RNAs in NRDE-3, consistent with our log2FC (IP vs input) in Figure 6B. These data support our conclusion that NRDE-3 preferentially binds to ERGO targets in late embryos.
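
      The two quantities used in this response, reads per million (RPM) and the log2 fold change of IP over input, can be illustrated with a minimal sketch. The function names and the pseudocount of 1 are assumptions made for illustration; the response does not describe the exact normalization pipeline that was used.

```python
import math


def rpm(class_reads: int, total_mapped_reads: int) -> float:
    """Reads per million: scale a class's read count to a library of 1e6 reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return class_reads * 1_000_000 / total_mapped_reads


def log2_fold_change(ip_rpm: float, input_rpm: float, pseudocount: float = 1.0) -> float:
    """log2(IP / input); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((ip_rpm + pseudocount) / (input_rpm + pseudocount))
```

      Under this sketch, a class with similar RPM in input but much higher RPM in the IP yields a positive log2FC, which is the pattern the response describes for ERGO-dependent sRNAs in late embryos.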

      Author response image 2.

      b) The "switching" model is problematic given that even in late embryos, the majority of 22G RNAs bound by NRDE-3 is the CSR-1 class (Figure 5D).

      It is important to keep in mind the difference in the total number of CSR target genes (3834) and ERGO target genes (119). The pie charts shown in Figure 6D are looking at the total proportion of the genes enriched in the NRDE-3 IP that are CSR or ERGO targets. For the NRDE-3 IP in late embryos, 70/119 (58.8%) of ERGO targets are enriched, while 172/3834 (4.5%) of CSR targets are enriched. These data are also supported by the RPM graphs shown in Author response image 2 panels A-B above, which show that the majority of the small RNAs bound by NRDE-3 in late embryos map to ERGO targets. Nonetheless, NRDE-3 still binds to some CSR targets, as shown in Figure 6D and panel B, which may be because the amount of CSR-class 22G-RNAs is reduced gradually across embryonic development as the maternally deposited NRDE-3 loaded with CSR-class 22G-RNAs is diluted by newly transcribed NRDE-3 loaded with ERGO-dependent 22G-RNAs (lines 857-862).
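
      The arithmetic behind this argument can be checked with a short sketch. The counts are those quoted in the response; the function and variable names are illustrative, and the "share of enriched genes" line considers only the CSR and ERGO classes, which is an assumption since the pie charts may include other categories.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Counts quoted in the author response for the late-embryo NRDE-3 IP.
ERGO_ENRICHED, ERGO_TOTAL = 70, 119
CSR_ENRICHED, CSR_TOTAL = 172, 3834

# Per-class view: what fraction of each target class is enriched in the IP?
ergo_pct = percent(ERGO_ENRICHED, ERGO_TOTAL)
csr_pct = percent(CSR_ENRICHED, CSR_TOTAL)

# Pie-chart view: what fraction of the enriched genes belongs to the CSR class?
# Only the two classes are counted here (an assumption for illustration).
csr_share_of_enriched = percent(CSR_ENRICHED, CSR_ENRICHED + ERGO_ENRICHED)

print(f"ERGO targets enriched: {ergo_pct:.1f}%")
print(f"CSR targets enriched:  {csr_pct:.1f}%")
print(f"CSR share of enriched genes: {csr_share_of_enriched:.1f}%")
```

      The two views diverge because the CSR class is roughly thirty times larger than the ERGO class: a small fraction of a large class can still outnumber a large fraction of a small one.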

      c) A major difference between NRDE-3 small RNA binding in eri-1 and simr-1 mutants appears to be that NRDE-3 robustly binds CSR-1 22G RNAs in eri-1 but not in simr-1 in late embryos. This result should be better discussed.

      In the eri-1 mutant, we hypothesize that NRDE-3 robustly binds CSR-class 22G-RNAs because ERGO-class 22G-RNAs are not synthesized during mid-embryogenesis, so NRDE-3 is either unloaded (in granules at the 100-cell stage in Figure 2A) or mis-loaded with CSR-class 22G-RNAs (in the nucleus at the 100-cell stage in Figure 2A). We don’t have a robust method to measure the proportion of loaded vs. unloaded NRDE-3, so it is difficult to address the degree to which NRDE-3 is mis-loaded in the eri-1 mutant. In the simr-1 mutant, both classes of small RNAs are present and NRDE-3 is still preferentially loaded with ERGO-dependent 22G-RNAs, though we do see a subtle increase in association with CSR-class 22G-RNAs. These data could suggest a less efficient loading of NRDE-3 with ERGO-dependent 22G-RNAs, but we would need more precise methods to address the loading dynamics in the simr-1 mutant.

      (3) Ultimately, if the switching is functionally important, then its impact should be observed in the expression of their targets. RNA-seq or RT-qPCR of select CSR-1 and ERGO-1 targets should be assessed in nrde-3 mutants during early vs late embryogenesis.

      The function of NRDE-3 at ERGO targets has been well studied (Guang et al, 2008) and is also assessed in our H3K9me3 ChIP-seq analysis in Figure 7E where, in mixed-stage embryos, the H3K9me3 level on ERGO targets (labeled as ‘NRDE-3 targets in young adults’) is reduced significantly in the nrde-3 mutant.

      To understand the function of NRDE-3 binding on CSR targets in early embryos, we attempted to do RT-qPCR, smFISH, and anti-H3K9me3 CUT&Tag-seq on early embryos, and we either failed to obtain enough signal or failed to detect any significant difference (data not shown). We additionally tested the possibility that NRDE-3 functions with CSR-class 22G-RNAs in oocytes. We present new data showing that NRDE-3 represses RNA Pol II in oocytes to promote global transcriptional repression at the oocyte-to-embryo transition. We have now included these data in Figure 8.

      Reviewer #2 (Public review):

      Summary:

      NRDE-3 is a nuclear WAGO-clade Argonaute that, in somatic cells, binds small RNAs amplified in response to the ERGO-class 26G RNAs that target repetitive sequences. This manuscript reports that, in the germline and early embryos, NRDE-3 interacts with a different set of small RNAs that target mRNAs. This class of small RNAs was previously shown to bind to a different WAGO-clade Argonaute called CSR-1, which is cytoplasmic, unlike nuclear NRDE-3. The switch in NRDE-3 specificity parallels recent findings in Ascaris where the Ascaris NRDE homolog was shown to switch from sRNAs that target repetitive sequences to CSR-class sRNAs that target mRNAs.

      The manuscript also correlates the change in NRDE-3 specificity with the appearance in embryos of cytoplasmic condensates that accumulate SIMR-1, a scaffolding protein that the authors previously implicated in sRNA loading for a different nuclear Argonaute, HRDE-1. By analogy, and through a set of correlative evidence, the authors argue that SIMR foci arise in embryogenesis to facilitate the change in NRDE-3 small RNA repertoire. The paper presents lots of data that beautifully document the appearance and composition of the embryonic SIMR-1 foci, including evidence that a mutated NRDE-3 that cannot bind sRNAs accumulates in SIMR-1 foci in a SIMR-1-dependent fashion.

      Weaknesses:

      The genetic evidence, however, does not support a requirement for SIMR-1 foci: the authors detected no defect in NRDE-3 sRNA loading in simr-1 mutants. Although the authors acknowledge this negative result in the discussion, they still argue for a model (Figure 7) that is not supported by genetic data. My main suggestion is that the authors give equal consideration to other models - see below for specifics.

      We appreciate reviewer 2’s comments on the genetic evidence for the function of SIMR foci. A similar concern was also brought up by reviewer 1. By re-examining our sequencing data, we found that there is a modest but significant increase in NRDE-3 association with CSR-class sRNAs in simr-1 and enri-2 mutants in late embryos. We believe that these data support our model that SIMR-1 and ENRI-2 are required for an efficient switch of NRDE-3-bound small RNAs. Please refer to our response to reviewer 1, point (1), and Figure 6G in the updated manuscript.

      Reviewer #3 (Public review):

      Summary:

      Chen and Phillips present intriguing work that extends our view on the C. elegans small RNA network significantly. While the precise findings are rather C. elegans specific there are also messages for the broader field, most notably the switching of small RNA populations bound to an argonaute, and RNA granules behavior depending on developmental stage. The work also starts to shed more light on the still poorly understood role of the CSR-1 argonaute protein and supports its role in the decay of maternal transcripts. Overall, the work is of excellent quality, and the messages have a significant impact.

      Strengths:

      Compelling evidence for major shift in activities of an argonaute protein during development, and implications for how small RNAs affect early development. Very balanced and thoughtful discussion.

      Weaknesses:

      Claims on colocalization of specific 'granules' are not well supported by quantitative data.

      We have now included zoomed images of individual granules to better show the colocalization in Figure 4 and Figure 4—figure supplement 1, and performed Pearson’s colocalization analysis between different sets of proteins in Figure 4B.

      Reviewer #2 (Recommendations for the authors):

      - The manuscript is very dense and the gene names are not helpful. For example, the authors mention ERGO-1 without clarifying the type of protein, etc. I suggest the authors include a figure to go with the introduction that describes the different classes of primary and secondary sRNAs, associated Argonautes, and other accessory proteins. Also include a table listing relevant gene names, protein classes, main localizations, and proposed functions for easy reference by the readers.

      We agree that the gene names in different small RNA pathways are easily confused. We added a diagram and table in Figure 1—figure supplement 1 depicting the ERGO/NRDE and CSR pathways and added clarification about the ERGO/NRDE-3 pathway in the text in lines 126-128.

      - Line 424 - the wording here and elsewhere seems to imply that SIMR-1 and ENRI-2, although not essential, contribute to NRDE-3 sRNA loading. The sequencing data, however, do not support this - the authors should be clearer on this. If the authors believe there are subtle but significant differences, they should show them perhaps by adding a panel in Figure 5 that directly compares the NRDE-3 IPs in wildtype versus simr-1 mutants. Figure 5H however does not support such a requirement.

      As brought up by reviewer 1, we do not see a difference in binding of ERGO-dependent sRNAs in the simr-1 mutant in late embryos. We do, however, see a modest, but significant, increase of CSR-sRNAs bound by NRDE-3 in simr-1 and enri-2 mutants, which we hypothesize could be due to a less efficient loading of ERGO-dependent 22G-RNAs by NRDE-3. The updated data are now in Figure 6G. We have also edited the text and model figure to soften these conclusions.

      - Condensates of PGL proteins appear at a similar time and place (somatic cells of early embryos) as the embryonic SIMR-1 foci. The PGL foci correspond to autophagy bodies that degrade PGL proteins. Is it possible that SIMR-1 foci also correspond to degradative structures? The possibility that SIMR-1 foci are targeted for autophagy and not functional would fit with the finding that simr-1 mutants do not affect NRDE-3 loading in embryos.

      We appreciate reviewer 2’s comments on the possibility of SIMR granules acting as sites for degradation of SIMR-1 and NRDE-3. We think this is not the case for the following reasons: 1) if SIMR granules are sites of autophagic degradation, then we would expect that embryonic SIMR granules in somatic cells, like PGL granules, should only be observed in autophagy mutants; however, we see them in wild-type embryos; 2) we would not expect a functional Tudor domain to be required for granule localization; however, in Figure 1—figure supplement 2B, we show that a point mutation in the Tudor domain of SIMR-1 abrogates SIMR granule formation; and 3) if NRDE-3(HK-AA) is recruited to SIMR granules for degradation while wild-type NRDE-3 is cytoplasmic, then NRDE-3(HK-AA) should show a significantly reduced protein level compared to wild-type NRDE-3. In the western blot in Figure 2—figure supplement 1B, NRDE-3 and NRDE-3(HK-AA) protein levels are similar, indicating that NRDE-3(HK-AA) is not degraded despite being unloaded. This is in contrast to what we have observed previously for HRDE-1, which is degraded in its unloaded state. If SIMR-1 played a role directly in promoting degradation of NRDE-3(HK-AA), we would similarly expect to see a change in NRDE-3 or NRDE-3(HK-AA) expression in a simr-1 mutant. We performed a western blot and did not observe a significant change in protein expression for NRDE-3 (Figure 3—figure supplement 1A).

      Although SIMR granules do not appear to be sites of autophagic degradation under wild-type conditions, upon treatment with lgg-1 (an autophagy protein) RNAi, we found that SIMR-1, as well as many other germ granule and embryonic granule-localized proteins, increase in abundance in late embryos. These data demonstrate that ZNFX-1, CSR-1, SIMR-1, MUT-2/RDE-3, RRF-1, and unloaded NRDE-3 are removed by autophagic degradation, similar to what has been shown previously for PGL-1 proteins (Zhang et al, 2009, Cell). We added these data to Figure 5. It is important to emphasize, however, that the timing of degradation differs for each granule assayed (lines 447-450), indicating that there must be multiple waves of autophagy to selectively degrade subsets of proteins when they are no longer needed by the embryo.

      - The observation that an NRDE-3 mutant that cannot load sRNAs localizes to SIMR-1 foci does not necessarily imply that wild-type unloaded NRDE-3 would also localize there. Unless the authors have additional data to support this idea, the authors should acknowledge that this hypothesis is speculative. In fact, why does cytoplasmic NRDE-3 not localize to granules in the rde-3; ego-1 degron strain shown in Figure 6B? Is it possible that the NRDE-3 mutant accumulates in SIMR-1 foci because it is unfolded and needs to be degraded?

      We believe that wild-type NRDE-3 also localizes to SIMR foci when unloaded. This is supported by the localization of wild-type NRDE-3 in eri-1 and rde-3 mutants, where a subset of small RNAs is depleted. Wild-type NRDE-3 localizes to both somatic SIMR-1 granules and the nucleus, depending on embryo stage (Figure 2A, Figure 2—figure supplement 1C). The granule numbers in eri-1 and rde-3 mutants are lower than in the nrde-3(HK-AA) mutant, consistent with the imaging data showing that NRDE-3 only partially localizes to somatic granules (Figure 2A – 100-cell stage).

      In the rde-3; ego-1 double mutant, the embryos have severe developmental defects: they cannot divide properly after the 4-8-cell stage and exhibit morphology defects after that stage. In wild-type, SIMR foci do not appear until around the 8-28-cell stage (shown in Figure 1C), so we believe that the reason cytoplasmic NRDE-3 does not localize to foci in the double mutant is the timing.

      - The authors propose that NRDE-3 functions in nuclei to target mRNAs also targeted in the cytoplasm by CSR-1. If so, how do they propose that NRDE-3 might do this since little transcription occurs in oocytes/early embryos? Are the authors suggesting that NRDE-3 targets germline genes for silencing specifically at the times that zygotic transcription comes back on, or already in maturing oocytes? Is the transcription of most CSR-1 targets silenced in early embryos?

      We appreciate the suggestion to check the function of NRDE-3 in oocytes. We tested this possibility and found it to be correct. NRDE-3 functions in oocytes for transcriptional repression by inhibiting RNA Pol II elongation. We added these data to Figure 8. We also attempted to do RT-qPCR, smFISH, and anti-H3K9me3 CUT&Tag-seq on early embryos to further test the hypothesis that NRDE-3 acts with CSR-class 22G-RNAs in early embryos, but we either failed to obtain enough signal or failed to detect any significant difference (data not shown). Therefore, we think that the primary role for NRDE-3 bound to CSR-class 22G-RNAs may be global transcriptional repression in oocytes prior to fertilization.

      - Line 684-686: "In summary, this work investigating the role of SIMR granules in embryos, together with our previous study of SIMR foci in the germline (Chen and Phillips 2024), has identified a new mechanism for small RNA loading of nuclear Argonaute proteins in C. elegans". This statement appears overstated/incorrect since there is no evidence that SIMR-1 foci are required for sRNA loading of NRDE-3. The authors should emphasize other models, as suggested above.

      We have revised the text on lines 869-871 to emphasize that SIMR granules regulate the localization of nuclear Argonaute proteins, rather than suggesting a direct role in controlling small RNA loading. We also edited the title, text, and legend for our model in Figure 9.

      Reviewer #3 (Recommendations for the authors):

      Issues to be addressed:

      - The authors show a switch in 22G RNA binding by NRDE-3 during embryogenesis. While the data is convincing, it would be great if it could be tested if the preferred NRDE-3 replacement model is indeed correct. This could be done relatively easily by giving NRDE-3 a Dendra tag, allowing one to colour-switch the maternal WAGO-3 pool before the zygotic pool comes up. Such data would significantly enhance the manuscript, as this would allow the authors to follow the fate of maternal NRDE-3 more precisely, perhaps identifying a period of sharp decline of maternal NRDE-3.

      We think the NRDE-3 Dendra tag experiment suggested by the reviewer is a clever approach and we will consider generating this strain in the future. However, we feel that optimization of the color-switching tag between the maternal germline and the developing embryos is beyond the scope of this manuscript. To partially address the question about NRDE-3 fate during embryogenesis, we examined the single-cell sequencing data of C. elegans embryos from the 1-cell to the 16-cell stage (Tintori et al, 2016, Dev Cell; visualization tool from the John I Murray lab). As shown in Author response image 3 Panel A below, the NRDE-3 transcript level increases as the embryo develops, indicating that zygotic NRDE-3 is being actively expressed starting very early in development. We hypothesize that maternal NRDE-3 will either be diluted as the embryo develops or actively degraded during early embryogenesis.

      Author response image 3.

      - Figure 3A: * should mark PGCs, but this seems incorrect. At the 8-cell stage there still is only one PGC (P4), not two, and at 100 cells there are only two, not three germ cells. Also, the identification of PGCs with a marker (PGL for instance) would be much more convincing.

      We apologize for the confusion in Figure 3A. We changed the figure legend to clarify that the * indicates nuclear NRDE-3 localization in somatic cells for 8- and 100-cell stage embryos rather than the germ cells.

      - Overall, the authors should address colocalization more robustly. In the current manuscript, just one image is provided, and often rather zoomed-out. How robust are the claims on colocalization, or lack thereof? With the current data, this cannot be assessed. Pearson correlation, combined with line-scans through a multitude of granules in different embryos will be required to make strong claims on colocalization. This applies to all figures (main and supplement) where claims on different granules are derived from.

      We thank reviewer 3 for this important suggestion. To better address the colocalization, we included insets of individual granules in Figure 2D and Figure 4. We also performed colocalization analysis by calculating the Pearson’s R value between different groups of proteins in Figure 4B, to highlight that SIMR-1 colocalizes with ENRI-2, NRDE-3(HK-AA), RDE-3, and RRF-1, while CSR-1 colocalizes with EGO-1.

      For the proteins that lack colocalization in Figure 4—figure supplement 1, we also added insets of individual granules. Additionally, we included a new set of panels showing SIMR-1 localization compared to tubulin::GFP (Figure 4—figure supplement 1I) in response to a recent preprint (Jin et al, 2024, BioRxiv), which finds NRDE-3 (expressed under a mex-5 promoter) associating with pericentrosomal foci and the spindle in early embryos. We do not see SIMR-1 (or NRDE-3, data not shown) at centrosomes or spindles in wild-type conditions but made a similar observation for SIMR-1 in a mut-16 mutant (Figure 4E). Each localization pattern was examined in at least 5 individual 100-cell stage embryos showing the same pattern.
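
      The Pearson’s R colocalization measure mentioned in this response can be sketched as follows. This is a generic illustration, assuming two equal-length lists of pixel intensities sampled from the same positions in two fluorescence channels; it is not the authors' analysis pipeline, and the function name is hypothetical.

```python
import math
from typing import Sequence


def pearson_r(channel_a: Sequence[float], channel_b: Sequence[float]) -> float:
    """Pearson correlation between paired pixel intensities of two channels."""
    n = len(channel_a)
    if n != len(channel_b) or n < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    # Sum of co-deviations and sums of squared deviations from each mean.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / math.sqrt(var_a * var_b)
```

      Values near 1 indicate that the two signals rise and fall together across pixels, values near 0 indicate no linear relationship, and negative values indicate mutually exclusive signals.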

      - Figure 7: Its title is: Function of cytoplasmic granules. This is a much stronger statement than provided in the nicely balanced discussion. The role of the granules remains unclear, and they may well be just a reflection of activity, not a driver. While this is nicely discussed in the text, figure 7 misses this nuance. For instance, the title suggests function, and also the legend uses phrases like 'recruited to granule X'. If granules are the results of activity, 'recruitment' is really not the right way to express the findings. The nuance that is so nicely worded in the discussion should come out fully in this figure and its legend as well.

      We have changed the title of Figure 7 (now Figure 9) to “Model for temporally- and developmentally-regulated NRDE-3 function” to deemphasize the role of the granules and to highlight the different functions of NRDE-3. Similarly, we have rephrased the text in the figure and legend and added some details about our new results.

      Minor:

      Typo: line 663 Acaris

      We corrected the typo.

    1. eLife Assessment

      This useful study presents findings on the efficacy and mechanisms of linalool protection against Saprolegnia parasitica oomycetes in the grass carp model. The evidence presented is solid since the methods, data and analyses broadly support the claims with only minor weaknesses. This work will be of great interest to scientists within the fields of aquaculture, ichthyology, microbiology, and drug discovery.

    2. Reviewer #1 (Public review):

      Summary:

      The work seeks to investigate the efficacy of linalool as a natural alternative for combating Saprolegnia parasitica infections, which would provide great benefit to aquaculture. This paper shows the effect of linalool in vitro using a variety of techniques, including changes in S. parasitica membrane integrity following linalool exposure and alterations in cell metabolism and ribosome function. Additionally, this work goes on to show that prophylactic and concurrent treatment with linalool at the time of S. parasitica infection can improve survival and reduce tissue damage in vivo in their grass carp infection model. The conclusions of the paper are partially supported by the data, and the corrections made by the authors improve clarity such that I believe there is merit in the work.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, the authors aimed to delineate the antimicrobial activity of linalool and tried to investigate the mode of action of linalool against S. parasitica infection. One of the main focuses of this work was to identify the in vitro and in vivo mechanisms associated with the protective role of linalool against S. parasitica infection.

      Strengths:

      (1) The authors have used a variety of techniques to test their hypothesis.
      (2) An adequate number of replicates was used in their studies.
      (3) Their findings showed a protective role of linalool against oomycetes and make it an attractive future antimicrobial in the aquaculture industry.

      Weaknesses: The revised version of the manuscript is more thoroughly written with clearer explanations; however, there are a few weaknesses in this manuscript.

      (1) Although the introduction section was rewritten with rationale, it's still lengthy and not very much to the point.
      (2) The claim of linalool regulating the gut microbiota is based on the correlation analysis only. It's not super convincing and requires experimental validation to strengthen the claim.

      Overall, the conclusions drawn by the authors are justified by the data. Importantly, this paper has discovered the novelty of the compound linalool as a potent antimicrobial agent and might open up future possibilities to use this compound in the aquaculture industry.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) Adding microscopy of the untreated group to compare Figure 2A with would further strengthen the findings here.

      First of all, we would like to thank Reviewer #1 for their comments and efforts on our manuscript. We have carefully revised it. We used a time-lapse method to capture images at 0 minutes, before any drugs were added. We will change '0 min' to 'untreated,' which will further strengthen the findings.

      (2) Quantification of immune infiltration and histological scoring of kidney, liver, and spleen in the various treatment groups would increase the impact of Figure 4.

      Thank you very much to Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. We conducted quantitative analysis of immune infiltration in the kidney, liver, and spleen across different treatment groups. However, due to the extremely low number of abnormal cells in the negative control, treatment, and prophylactic groups, neither the instrument nor manual methods could reliably gate the cells. Consequently, quantification of immune infiltration and histological scoring were not performed.

      (3) The data in Figure 6 I is not sufficiently convincing as being significant.

      Thank you to Reviewer #1 for the comments and efforts on our manuscript. We have revised it carefully. Previous research has shown that antibiotics and other drugs can alter the gut microbiota. Therefore, we plan to study the effects of antibiotics on gut microbiota. To conduct this research, we need to isolate these microbes from the gut. Although this process is challenging, we still aim to explore the gut microbiota. If possible, we will continue to investigate how antibiotics affect gut microbiota in future studies.

      (4) Comparisons of the global transcriptomic analysis of the untreated group to the PC, LP, and LT groups would strengthen the author's claims about the immunological and transcriptomic changes caused by linalool and provide a true baseline.

      Thanks so much for Reviewer #1 comments and efforts for our manuscript. We have revised it carefully. Due to the initial research design and data analysis strategy, we have focused on comparisons among the PC, LP, and LT groups to more directly explore the differences under various treatment conditions. Specifically, while the transcriptomic data from the untreated group could provide a basic reference, it has shown limited relevance to the core hypotheses of our study. Our research has aimed to investigate the immunological and transcriptomic changes among the treatment groups rather than comparing treated and untreated states. We believe that the current experimental design and data analysis have effectively revealed the mechanisms of linalool and that the additional comparisons among the treatment groups have further supported our conclusions. We hope the reviewer understands the rationale behind our experimental design. If there are additional suggestions, we are more than willing to further optimize the content of our manuscript.

      Reviewer #2 (Public review):

      (1) The authors have taken for granted that the readers already know the experiments/assays used in the manuscript. There was not enough explanation for the figures as well as figure legends.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. We will provide more detailed explanations of the experiments and assays used in the manuscript, as well as enhance the descriptions in the figure legends, to ensure that readers have a clear understanding of the figures and their context.

      (2) The authors missed adding the serial numbers to the references.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. We will add serial numbers to the references to ensure proper citation and improve the clarity of our manuscript.

      (3) The introduction section does not provide adequate rationale for their work, rather it is focused more on the assays done.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. We will add a section to the introduction that provides a rationale for our work, specifically focusing on the impact of plant extract on immunoregulation.

      (4) Full forms are missing in many places (both in the text and figure legends), also the resolution of the figures is not good. In some figures, the font size is too small.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We will ensure that all abbreviations are expanded where necessary, both in the text and figure legends. Additionally, we will improve the resolution of the figures and increase the font size where needed to enhance clarity.

      (5) There is much mislabeling of the figure panels in the main text. A detailed explanation of why and how they did the experiments and how the results were interpreted is missing.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. We will improve the labeling of the figure panels, provide detailed explanations of the experimental methods, including their rationale and interpretation, and clarify the connections between the methods.

      (6) There is not enough experimental data to support their hypothesis on the mechanism of action of linalool. Most of the data comes from pathway analysis, and experimental validation is missing.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. In fact, the transcriptomic data in our manuscript do not stand alone: we carried out many experiments to substantiate the changes inferred from the transcriptomic data, such as SEM, TEM, CLSM, molecular docking, RT-qPCR, and histopathological examinations. The details are listed below.

      As shown in Figure 2, we combined the transcriptomic data related to membrane and organelle with SEM, TEM, and CLSM images. After analyzing these data and observations together, we showed that the cell membrane may be a potential target for linalool.

      As shown in Figure 3, we carried out molecular docking to explore the specific ribosome-related protein bound by linalool; the ribosome was identified as a potential target of linalool by the transcriptomic data.

      As shown in Figure 5, transcriptomic data illustrated that linalool enhanced the host complement and coagulation system. To substantiate these changes, we carried out RT-qPCR to measure the expression of important immune-related genes, and found that the RT-qPCR results were consistent with the expression trends in the transcriptome analysis.

      As shown in Figures 4 and 5, transcriptomic data revealed that linalool promoted wound healing, tissue repair, and phagocytosis (Figure 5E). To confirm these findings, we carried out histopathological examinations and found that linalool alleviated the tissue damage caused by S. parasitica infection on the dorsal surface of grass carp and enhanced the healing capacity (Figure 4G).

      Overall, we will conduct additional experiments to verify the mechanism of action of linalool in the future.

      Reviewer #1 (Recommendations for the authors):

      (1) Figure 1 Panel G is not referenced in the legend, this should be fixed

      Thanks so much for Reviewer #1 comments and efforts for our manuscript. We have revised it carefully. Please check the Figure 1. The order of Panel F and G in Figure 1 is wrong. We have modified the order of Figure 1.

      (2) Statistical comparisons between groups in Figure 4 Panels C-F is lacking and should be added.

      Thanks so much for Reviewer #1 comments and efforts for our manuscript. We have revised it carefully. Please check the Figure 4 C-F. We have added statistical comparisons between groups in Figure 4 Panels C-F.

      (3) Capitalize Kidney label in Figure 4G.

      Thank you to Reviewer #1 for the comments and efforts on our manuscript. We have revised it carefully. Please check Figure 4G. We have capitalized the K of kidney.

      Reviewer #2 (Recommendations for the authors):

      (1) The authors missed adding the serial numbers to the references. I could not go through the references to cross-check if they cited the right ones because it's extremely difficult to figure out which one corresponds to which reference number.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check the references. We have added the serial numbers to the references.

      (2) In the last paragraph of the introduction section, most of the techniques in the paper were summarized which does not go with the flow of the paper. The introduction should not be focused on the different techniques used the focus should be more on the rationale of the work. It would be nice if the last paragraph could be rewritten.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check it in Line 85-94. We have added a section to the introduction that provides a rationale for our work, specifically focusing on the impact of plant extract on immunoregulation.

      (3) The resolution of the figures is not good.

      Thank you for your suggestion. We have revised it carefully. Please check all the figures. We have increased the resolution and size of all the figures.

      (4) Mostly, the figure legends sound like results, with not enough explanation. Full forms are missing in many places which would make the readers go back to the text/other figures each time.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check it throughout the manuscript and all the figure legends. We have added full names and abbreviations to both the manuscript and all the figure legends so that we don't make the readers go back to the text/other figures each time.

      (5) Figure 1:

      Figure 1A: there is not enough explanation for this panel. It's not clear from the text which other EOs than Linalool are referred to here. Which EOs were extracted from daidai flowers?

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check it in the Figure 1A. Figure 1A is divided into “Essential oils (EOs)” and “The main compounds of EOs” to make it easier to distinguish.

      Figure 1B: do the three different wells of each set represent three replicates? If so, are they biological/technical replicates? Also, I'm not sure how the MFC was determined from this figure (line 116) because clearly this panel only corresponds to the determination of MICs, not MFCs.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Lines 126-130. The three wells of each set represent three biological replicates. After 5 μL of resazurin dye was added and wells had turned pink, the linalool concentration in the first non-pink well corresponded to the MIC. The culture liquid from the wells where no mycelium growth was seen was marked onto the plate and incubated at 25°C for 7 days. The lowest linalool concentration with no mycelium growth was identified as the MFC.
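      The readout rule described in this response can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lowest_clear_concentration`, the rule that all three replicates must be clear, and every readout value below are hypothetical and are not taken from the manuscript. Only the concentration series (0.4% to 0.00625%) comes from the response.

```python
from typing import Mapping, Optional, Sequence


def lowest_clear_concentration(
    readout: Mapping[float, Sequence[bool]]
) -> Optional[float]:
    """Walk from the highest concentration downward and return the lowest
    concentration at which no replicate shows growth.

    `readout` maps concentration (%) to one boolean per replicate well,
    True meaning growth (pink well for MIC, visible mycelium for MFC).
    Returns None if even the highest concentration shows growth.
    """
    lowest = None
    for conc in sorted(readout, reverse=True):
        if any(readout[conc]):  # assumption: one positive replicate counts as growth
            break
        lowest = conc
    return lowest


# Hypothetical resazurin readout (True = well turned pink).
pink = {
    0.4: [False, False, False],
    0.2: [False, False, False],
    0.1: [False, False, False],
    0.05: [False, False, False],
    0.025: [True, True, True],
    0.0125: [True, True, True],
    0.00625: [True, True, True],
}

# Hypothetical subculture readout after 7 days (True = mycelium regrew).
regrowth = {
    0.4: [False, False, False],
    0.2: [False, False, False],
    0.1: [False, False, False],
    0.05: [True, False, True],
}

mic = lowest_clear_concentration(pink)
mfc = lowest_clear_concentration(regrowth)
print(f"MIC = {mic}%, MFC = {mfc}%")
```

      Applying the same rule twice mirrors the distinction the reviewer asked about: the MIC is read from the dye plate, while the MFC needs the separate subculture step.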

      Figure 1C: the figure legend says that the effect of linalool on mycelium growth inhibition was done over a 6hr timepoint but according to the figure the timepoint was 60hr. I am also confused about the concentrations of linalool used. Although a range of concentration from 0 to 0.4% is mentioned, I only see the time vs diameter curves for 7 concentrations.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Line 983 and Figure 1C. We have changed 6 h to 60 h in the figure legend. Only 7 time vs diameter curves are visible in Figure 1C because 0.4%, 0.2% and 0.1% linalool inhibited mycelial growth to the same extent, so their curves coincide. We now show the time vs diameter curves for the 0.4%, 0.2% and 0.1% concentrations as three dotted lines of different colors and sizes in Figure 1C.

      Figure 1D: mislabeled as 1G in the figure panel.

      Figures 1E and 1G: Figure 1E is missing and I do not see any figure legend for Figure 1G.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Figure 1. The order of Panels F and G in Figure 1 was wrong. We have relabeled the panels of Figure 1 as A-F; there is no longer a Panel G.

      Overall, Figure 1 is very confusing and needs rewriting. Also, there is a need to add more explanation of the figure panels in the results section.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Figure 1. We have corrected all the problems in Figure 1. We have also added more explanation of the figure panels in the results section and strengthened the connections between methods, to show how the experiments were carried out logically and how the results were interpreted. Please check Lines 126-130, 144-147, 174-179, 213-217, 343-345, 677-682.

      (6) Figure 2:

      The authors could justify the reason for doing the experiments before moving into the results they got.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check the methods and results in the manuscript, in Lines 126-130, 144-147, 174-179, 213-217, 343-345, 677-682. We have added more explanation of the figure panels in the results section and strengthened the connections between methods, to show how the experiments were carried out logically and how the results were interpreted.

      What concentration of linalool was used?

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Lines 992-996. The mycelium treated with 6×MIC (0.3%) linalool was observed by confocal laser scanning microscopy (CLSM), and the mycelium treated with 1×MIC (0.05%) linalool was observed by scanning electron microscopy (SEM) and transmission electron microscopy (TEM).

      The full form of DEGs has been mentioned later, but it should be mentioned in the figure legend of Figure 2 as this is the first time the term was used. Also, what is the full form of DEPs?

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check it in Line 168, 175, 182, 631, 998, 1001. The word DEPs in Figure 2I was incorrect, and we have changed DEPs to DEGs.

      Is there a particular reason for looking into the cellular component rather than molecular function and biological processes in the GO analysis? (what I see is that Figure 2H indicates the prevalence of catalytic activity, binding, cellular, and metabolic processes as well). Also, there is not enough explanation of the observation from Figure 2I (both in the results section and figure legend).

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Lines 174-179, 998-1002 (Figure 2I). We looked at cellular components rather than molecular functions and biological processes in the GO analysis because we focused on effects on cell membranes and cell walls. These results are closely related to, and support, our scanning electron microscopy (SEM) and transmission electron microscopy (TEM) results. We have added further explanation to the results and figure legend sections to describe the observations from Figure 2I.

      (7) Figure 3:

      Figures 3A and 3B: The adjusted p value is already indicated in the figures, so there is no need to add statistical significance (asterisks) to each bar. The resolution for these panels is not good and the font is too small.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Figures 3A and 3B. We have removed the statistical significance markers (asterisks) from Figures 3A and 3B. If we are lucky, we will upload the clearest figures when the manuscript is published.

      Figure 3C: the figure legend is missing (wrongly added as KEGG analysis, which should be network analysis). The numbering for the figure legends is wrong. What do the node sizes (5, 22, 40, 58) mentioned in the figure represent? Also, I wonder why ribosome biogenesis in eukaryotes has been indicated as the most enriched pathway despite its weaker connection to the other nodes.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check the Figure 3C. Figure 3C is KEGG analysis generated by software, not network analysis. For the convenience of readers, we have made a new Figure of KEGG analysis.

      Figure 3D: KEGG enrichment and GO analysis: global/local search? Which database was used as a reference?

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Lines 633-635. Functional enrichment analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. KEGG pathway analysis was conducted using Goatools.

      Figure 3E: why were the RNA pol structures compared? The authors did not mention anything about this panel in their results.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Line 207. We found that many DEGs related to ribosome biogenesis (Figure 3D) and RNA polymerase (Figure 3E) are downregulated. Because RNA polymerase is closely related to ribosome biogenesis, the downregulation of RNA polymerase directly affects the synthesis of ribosome-related RNAs, including rRNA, mRNA, and tRNA, thereby inhibiting ribosome production. This relationship is particularly significant in cell growth, division, and the response to external environmental changes.

      Figures 3F and 3G: please mention which model is illustrated (ribbon/sphere model).

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check the line 1010-1015. The tertiary structure of NOP1 was displayed using a cartoon representation. Molecular docking of linalool with NOP1 was performed by enlarging the regions binding to the NOP1 activation pocket to showcase the detailed amino acid structures, which were presented using a surface model, while the small molecule was displayed with a ball-and-stick representation.

      Figure 3H: this panel needs more explanation. Why were some of the ABC transporters upregulated while some were downregulated?

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. It is a common phenomenon that microorganisms adjust the expression of genes related to substance transport in response to different environmental stimuli to optimize their survival strategies. The expression of ATP-binding cassette (ABC) transporters can be upregulated or downregulated due to various factors, such as environmental stimuli, metabolic demands, energy consumption, species specificity, and signaling molecules. This explains why some ABC transporters are upregulated while others are downregulated.

      (8) Figure 4:

      There was no statistical significance shown in the figures (D-F) which makes me wonder how they worked out that there was any significant increase/decrease, as mentioned in the text. What are the p values? What is the number of replicates? What concentration of linalool was used?

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Figure 4D-F. In this study, 4 groups were established: (1) Positive control (PC) group (10 fish infected with S. parasitica). (2) Linalool therapeutic (LT) group (10 fish infected with S. parasitica, soaked in 0.00039% linalool in a 20 L tank for 7 days). (3) Linalool prophylactic (LP) group (10 uninfected fish soaked in 0.00039% linalool in a 20 L tank for 2 days, followed by the addition of 1×10<sup>6</sup> spores/mL secondary zoospores). (4) Negative control (NC) group (10 uninfected fish without linalool treatment). Each group had 3 replicate tanks. In each group, 8 fish were utilized for immunological assays, and on day 7, blood samples were collected from the tail veins using heparinized syringes and left to coagulate overnight at 4°C. Kits from Nanjing Jiancheng Institute (Nanjing, China) were used to measure lysozyme (LZY) activity, superoxide dismutase (SOD) activity, and alkaline phosphatase (AKP) activity.

      (9) Figure 5:

      Again, the resolution and font size are off. Please mention the full forms of the terms used in the figure legend. The interpretation of the in vivo protective mechanism of linalool is completely based on GO enrichment and KEGG pathway analysis (also some transcriptional analysis). The only wet lab validation done was by checking the mRNA level of some cytokines but that does not necessarily validate what the authors claim.

      Thank you for your suggestion. We have revised it carefully. Please check all the figures and figure legends. We have increased the resolution and size of all the figures and used the full forms of the terms in the figure legends. If we are lucky, we will upload the clearest figures when the manuscript is published. Currently, in the field of aquaculture research, validation beyond mRNA quantification faces numerous challenges compared to model organisms like mice and zebrafish, primarily due to the lack of available antibodies. For instance, antibodies related to grass carp have not yet been commercialized, making protein-level studies and validations significantly more difficult. This lack of antibodies limits the progress of protein verification. However, we hope to design more experiments and validation tests in the future to gradually overcome these technical bottlenecks and provide stronger support for our findings.

      (10) Figure 6:

      There is not enough explanation on why and how the experiments were done. It seems like the authors already presumed that the readers know the experiments. The interpretation of the PCA plot is not clear. Why are the quadrant sizes different? How was the heat map plotted? Also, the claim of linalool regulating the gut microbiota is only dependent on the correlation analysis and there is no wet lab validation for this. The data represented in this figure is not enough to prove their hypothesis and needs further investigation.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check the Figure 6. We will improve the labeling of the figure panels, provide detailed explanations of the experimental methods, including their rationale and interpretation, and clarify the connections between the methods.

      The goal of PCoA is to preserve the distance relationships between samples as much as possible through the principal coordinates, thereby revealing the differences or patterns in microbial composition among different groups. For example, in our study, PCoA analysis demonstrated that the microbial compositions of the positive control (PC), linalool prophylactic (LP), and linalool therapeutic (LT) groups showed significant differences in the reduced dimensional space, possibly indicating that these treatments had a notable impact on the microbial community.
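      The idea that PCoA preserves between-sample distances can be illustrated with a minimal sketch of classical PCoA in plain Python. It is not the Majorbio Cloud Platform implementation, which the response does not describe. The function name `pcoa_first_axis`, the use of power iteration, and the toy distance matrix are all assumptions made for illustration, and the sketch assumes the distances can be embedded in Euclidean space.

```python
import math
from typing import List


def pcoa_first_axis(dist: List[List[float]], iterations: int = 200) -> List[float]:
    """Return sample coordinates on the first principal coordinate.

    `dist` is a symmetric matrix of pairwise distances between samples.
    """
    n = len(dist)
    d2 = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d2]
    grand_mean = sum(row_mean) / n
    # Gower double-centering of the squared distances
    # (column means equal row means because the matrix is symmetric).
    b = [
        [-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]
    # Power iteration for the leading eigenpair of the centered matrix.
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    # Scale the unit eigenvector by sqrt(eigenvalue) to get coordinates.
    return [math.sqrt(eigenvalue) * x for x in v]


# Toy example: three hypothetical samples lying on a line at 0, 1 and 3.
toy = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
coords = pcoa_first_axis(toy)
print(coords)
```

      In this toy case the first axis reproduces the original spacing between samples. The coordinates are centered on zero but are not symmetric around it, which is one reason the quadrants of an ordination plot can differ in size.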

      In our study, the heatmap was generated using the Majorbio Cloud Platform. This platform visualized the preprocessed microbial community data, providing an intuitive representation of the differences in microbial composition and relative abundance among samples. The platform automatically performed steps such as data normalization, color mapping, and clustering analysis, offering convenience for data analysis and interpretation.

      Previous research has shown that antibiotics and other drugs can alter the gut microbiota. Therefore, we plan to study the effects of antibiotics on gut microbiota. To conduct this research, we need to isolate these microbes from the gut. Although this process is challenging, we still aim to explore the gut microbiota. If possible, we will continue to investigate how antibiotics affect gut microbiota in future studies.

      (11) Figure 7:

      This figure does not clarify how they did the interpretation. The in vivo study does not phenocopy their in vitro studies.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. We have carefully reviewed and confirmed the current experimental design and data analysis. Although we have not made any changes to Figure 7, we have further clarified the interpretation of the results in the revised manuscript, especially concerning the discrepancies between the in vivo and in vitro studies. We have added more experimental background information to help explain the possible reasons for these differences. We hope the reviewer will understand our explanation and we look forward to your further feedback.

      (12) Minor comments:

      Line 61: what's meant by "et al"?

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check it in Line 61. We have removed "et al".

      Line 87-88: please add a citation referring to the earlier studies.

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check it in Line 109.

      Line 151-152: the term "related to" has been used a couple of times. Mentioning it once in the beginning and avoiding repeating the same word might be better.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Lines 168-171. We have rewritten this paragraph to avoid repeating the phrase “related to”.

      How did they reconstitute the EO compounds?

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. The EO compounds we used in our experiments were partially extracted from essential oils in the laboratory and partially purchased from ThermoFisher (USA).

      Line 544: needs explanation of how there was a 2-fold dilution in the concentrations shown in the figure compared to the concentrations mentioned here.

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. We set the concentrations for the mycelium MIC assay to 0.8%, 0.4%, 0.2%, 0.1%, 0.05%, 0.025%, 0.0125%, and 0.00625%, and the concentrations for the spore MIC assay to 0.4%, 0.2%, 0.1%, 0.05%, 0.025%, 0.0125%, and 0.00625%. Figure 1B shows the MIC determination of linalool on spores, while the MIC determination on mycelium is not shown.
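      The two concentration lists in this response are both two-fold serial dilutions, which a short Python sketch makes explicit. The helper name `twofold_series` is hypothetical; the top concentrations and the number of steps are taken from the lists above.

```python
from typing import List


def twofold_series(top: float, steps: int) -> List[float]:
    """Return `steps` concentrations, each half of the previous one."""
    return [top / 2 ** i for i in range(steps)]


mycelium = twofold_series(0.8, 8)  # 0.8% down to 0.00625%
spores = twofold_series(0.4, 7)    # 0.4% down to 0.00625%

for conc in mycelium:
    print(f"{conc:g}%")
```

      Laid out this way, the spore series is simply the mycelium series without its highest concentration, which accounts for the two-fold offset the reviewer noticed between the text and Figure 1B.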

      Line 546: remove "were".

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check it in Line 573. We have removed "were".

      Line 555: what concentration of malachite green and tween 20 was used?

      Thank you to Reviewer #2 for the comments and efforts on our manuscript. We have revised it carefully. Please check Lines 579-580. 2.5 mg/mL malachite green and 1% Tween 20 were used.

    1. eLife Assessment

      This useful study uses a model of Streptococcus suis (a pig pathogen) infection in mice using an intranasal route, the natural route of infection ignored in most of the literature. The study aims to understand how capsular polysaccharides (CPS) contribute to neuropathology and virulence. The findings suggest that the olfactory route may lead to meningitis before bacteremia occurs and that CPS down-regulation may play a role in this process. However, the study remains incomplete as presented.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Wang et al. investigates the relationship between Streptococcus suis (S. suis) growth phases and levels of the virulence factor capsular polysaccharide (CPS) in the bacterial cell wall. They use an understudied mouse intranasal infection model to connect growth phase related CPS abundance to the pathogenicity of the bacteria in the nose, blood, and other organs. Adoptive transfer of serum against either CPS or V5 (five other virulence factors) reinforces their discovery of CPS levels on S. suis in different organs and stages of infection. Vaccination against bacterial infections can be difficult, and understanding how the serotype of a bacterial pathogen changes between infection sites and systemic disease is critical. Further, understanding host-pathogen interactions at early time points in the upper respiratory tract may have broad implications for vaccine development. While some of the results are interesting and compelling, others are not supported by the data and require further experimental work.

      Strengths:

      The model of intranasal infection is compelling to expand upon work previously done in vitro and with systemic routes of infection. The histology and fluorescent imaging of the olfactory epithelium and olfactory bulb complement work in Figure 2 about the attachment of S. suis to epithelial cells and the bacterial burden over time in different organs of Figure 3. Histology was performed at 1 hour and 9 days after intranasal infection with stationary phase S. suis and drives home that this pathogen can invade the olfactory nerve and may potentially cause bacterial meningitis seen in some infected swine.

      The adoptive transfer of either anti-CPS or anti-V5 to mice before infection at both longer (12 hr) and shorter (0.5 hr) time points is useful to demonstrate that the changes in cell wall composition between the NALT/CSF and blood compartments result in different efficacy in clearing bacteria from those locations. This is fundamental for the development of vaccines for the swine industry and begs those developing other bacterial vaccines to consider what virulence factors are the most useful as neutralizing antibody targets at the site of bacterial invasion.

      Demonstrating that the amount of CPS within the cell wall of S. suis is related to the growth phase of the bacteria is an important consideration for vaccine development. While others had previously shown that CPS levels were higher in the blood than in the CNS, and that CPS decreases the invasion of epithelial cells, the close look at the olfactory epithelium at an early time point ties together in vitro findings. The control of a CPS-negative strain was critical to understanding their findings. The location and the microbial community that bacterial pathogens live within may change the growth phase and therefore also the cell wall components.

      Weaknesses:

      The authors present compelling data that is relevant to the development of anti-bacterial vaccinations and show a relationship between CPS levels and pathogenicity. However, the study uses a laboratory murine model requiring acetic acid pre-treatment and a high i.n. dose. Therefore, the findings presented may not represent what occurs in swine. Furthermore, several conclusions are not supported by the data and require substantial new experimental support. Thus, major concerns remain that impact the validity of the findings.

      Major concerns for the manuscript:

      The intranasal infections were done with S. suis in the stationary phase, which has been shown to have less CPS on the cell wall. While this mimics the literature that shows S. suis to have less CPS in the CNS, the difference in the pathogenesis of a log phase vs. stationary phase intranasal infection would be interesting. Especially because the bacterium is a part of the natural microbial community of swine tonsils, it is curious if the change in growth phase and therefore CPS levels may be a causative reason for pathogenic invasion in some pigs. To take this line of thought a step further, the authors should consider taking the bacteria from NALT/CSF and blood and comparing the lag times that bacteria from different organs take to enter a log growth phase, to show whether the difference in CPS is because S. suis in each location is in a different growth phase. If log phase bacteria were intranasally delivered, would they adopt a stationary phase life strategy? How long would that take? Lastly, the authors should be cautious about claims about S. suis downregulating CPS in the NALT for increased invasion and upregulating CPS to survive phagocytosis in blood. While it is true that the data show that there are different levels of CPS in these locations, the regulation and mechanism of the recorded and observed cell wall difference are not investigated past the correlation to the growth phase. While mechanistic work is outside the scope of the current work, readers should keep in mind that these results may be explained in multiple ways. In addition, the mouse model is used rather than the usual host of a pig. The NALTs of conventional pigs and SPF mice certainly have unique microbial communities and this may affect the pathogenesis of S. suis in the mouse, therefore influencing the results. Because the authors show a higher infection rate in the mouse with acetic acid, they may want to consider investigating what the mouse NALT microenvironment is naturally doing to exclude more bacterial invasion in future studies. Is it simply a host mismatch or is there something about the microbiome or steady-state immune system in the nose of mice that is different from pigs?

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1:

      (1) Some conclusions are not completely supported by the present data, and at times the manuscript is disjoint and hard to follow. While the work has some interesting observations, additional experiments and controls are warranted to support the claims of the manuscript.

      Thank you for the comments. We revised some of the claims and conclusions to be more objective and result-supportive.

      (2) While the authors present compelling data that is relevant to the development of anti-bacterial vaccinations, the data does not completely match their assertions and there are places where some further investigation would further the impact of their interesting study.

      We do not fully agree with the reviewer's comments. We have demonstrated that changes in CPS levels during infection are associated with pathogenesis, which will guide future studies on the underlying mechanisms. Studying mechanisms requires a significant amount of effort and is beyond the scope of this research. We concur with the reviewer that assertions should be made cautiously until further studies are conducted. We have revised these assertions to align with the data and to avoid extrapolating the results (page 7, lines 126, 133-136; page 11, lines 216-218; page 13, line 264; and page 18, lines 378-383).

      (3) The difference in the pathogenesis of a log phase vs. stationary phase intranasal infection would be interesting. Especially because the bacterium is a part of the natural microbial community of swine tonsils, it is curious if the change in growth phase and therefore CPS levels may be a causative reason for pathogenic invasion in some pigs.

      S. suis is a part of the natural microbial community of swine tonsils but not of mouse NALT. It would be interesting to know whether CPS levels are low in pig tonsils, since CPS is hydrophilic and not conducive to bacterial adhesion. In the study, mice were i.n. infected with a high dose of the bacteria, which could increase opportunities for dissemination (acetic acid may not be a contributor, since outcomes were similar with or without it). The spread of S. suis from pig tonsils into other body compartments might be triggered by other conditions, such as viral coinfection, nasal cavity inflammation, cold weather, and decreased immunity.

      Experiments with pig blood and phagocytes have shown that genes involved in the synthesis of CPS are upregulated in pig blood. In contrast, these genes are downregulated [1]. In addition, the absence of CPS correlated with increased hydrophobicity and phagocytosis, suggesting that S. suis undergoes CPS phase variation, which could play a role in the different steps of S. suis infection [2]. We showed direct evidence of encapsulation modulation associated with S. suis pathogenesis in mice. A pig infection model is required to confirm these findings.

      (4) The authors should consider taking the bacteria from NALT/CSF and blood and comparing the lag times that bacteria from different organs take to enter a log growth phase, to show whether the difference in CPS is because S. suis in each location is in a different growth phase. If log phase bacteria were intranasally delivered, would they adopt a stationary phase life strategy? How long would that take?

      What causes CPS regulation in vivo is not known. CPS changes across culture stages, indicating that stress, such as nutrient levels, is one of the signals triggering CPS regulation. The microenvironment in the body compartments is far more complex than in vitro conditions; host cells, immune factors and other elements may affect CPS regulation, individually or collectively. The reviewer's question is important, but the suggested experiment is impracticable since few bacteria can be recovered from organs, and culturing the bacteria in vitro would obliterate their in vivo status.

      (5) Authors should be cautious about claims about S. suis downregulating CPS in the NALT for increased invasion and upregulating CPS to survive phagocytosis in blood. While it is true that the data shows that there are different levels of CPS in these locations, the regulation and mechanism of the recorded and observed cell wall difference are not investigated past the correlation to the growth phase.

      We have toned down the claim, which now reads: “suggest a correlation between lower CPS in the NALT and a greater capacity for cellular association, whereas elevated CPS levels in the blood are linked to improved resistance against bactericidal activity. However, the mechanisms behind these associations remain unknown.” (page 7, lines 133-136).

      (6) The mouse model used in this manuscript is useful but cannot reproduce the nasal environment of the natural pig host. It is not clear if the NALTs of pigs and mice have similar microbial communities and how this may affect the pathogenesis of S. suis in the mouse. Because the authors show a higher infection rate in the mouse with acetic acid, they may want to consider investigating what the mouse NALT microenvironment is naturally doing to exclude more bacterial invasion. Is it simply a host mismatch or is there something about the microbiome or steady-state immune system in the nose of mice that is different from pigs?

      This is a very interesting comment. The mice are SPF grade, so the microenvironment in SPF mouse NALT should differ significantly from that of conventional pig tonsils. Although NALT in mice resembles pig tonsils in function, many factors may contribute to the sensitivity to S. suis colonization in the pig nasal cavity, such as the microbiome and the local steady-state immune system. A more complex microbiota in tonsils could be one of these factors. Comparing SPF and conventional pigs to analyze what makes S. suis inclined to colonize pig tonsils would be an ideal experiment to answer the question.

      (7) Have some concerns regarding the images shown for neuroinvasion because I think the authors mistake several compartments of the mouse nasal cavity as well as the olfactory bulb. These issues are critical because neuroinvasion is one of the major conclusions of this work.

      Thank you for your comments. The olfactory epithelium (OE) is located directly underneath the olfactory bulb in the olfactory mucosa area and lines approximately half of the nasal cavity. The remaining surface of the nasal cavity is lined by respiratory epithelium, which lacks neurons. The olfactory receptor neurons in the OE are stained green in the images by β-tubulin III, a neuron-specific marker. The respiratory epithelium is colorless due to the absence of nerve cells. Similarly, the green color stained by β-tubulin III identifies the olfactory bulb. The accuracy of the anatomic compartments of the mouse nasal cavity has been checked and confirmed by referring to related literature [3, 4].

      References

      (1) Wu Z, Wu C, Shao J, Zhu Z, Wang W, Zhang W, Tang M, Pei N, Fan H, Li J, Yao H, Gu H, Xu X, Lu C. The Streptococcus suis transcriptional landscape reveals adaptation mechanisms in pig blood and cerebrospinal fluid. RNA. 2014 Jun;20(6):882-98.

      (2) Charland N, Harel J, Kobisch M, Lacasse S, Gottschalk M. Streptococcus suis serotype 2 mutants deficient in capsular expression. Microbiology (Reading). 1998 Feb;144(Pt 2):325-332.

      (3) Pägelow D, Chhatbar C, Beineke A, Liu X, Nerlich A, van Vorst K, Rohde M, Kalinke U, Förster R, Halle S, Valentin-Weigand P, Hornef MW, Fulde M. The olfactory epithelium as a port of entry in neonatal neurolisteriosis. Nat Commun. 2018;9(1):4269.

      (4) Sjölinder H, Jonsson AB. Olfactory nerve--a novel invasion route of Neisseria meningitidis to reach the meninges. PLoS One. 2010 Nov 18;5(11):e14034.

      Reviewer 2:

      (1) However, there are serious concerns about data collection and interpretation that require further data to provide an accurate conclusion. Some of these concerns are highlighted below:

      Both reviewers were concerned about some of the interpretations of the results. We modified the interpretations in related lines throughout the manuscript (Please see the related responses to Reviewer 1).

      (2) In figure 2, the authors conclude that high levels of CPS confer resistance to phagocytic killing in blood exposed S. suis. However, it seems equally likely that this is resistance against complement mediated killing. It would be important to compare S. suis killing in animals depleted of complement components (C3 and C5-9).

      We thank the reviewer for the comment. The experiment should be described as a bactericidal assay rather than anti-phagocytosis killing. CPS is a main inhibitor of C3b deposition [1]. It interferes with complement-mediated and receptor-mediated phagocytosis, as well as direct killing. Data in Figure 2C are now expressed as “% of bacterial survival in whole blood” for clarity (page 8, Fig. 2C and page 23, lines 489-490).

      (3) Intranasal administration of non-CPS antisera provides a nice contrast to intravenous administration, especially in light of the recently identified "blood-olfactory barrier". Can the authors provide any insight into how long and where this antibody would be located after intranasal administration? Would this be antibody-mediated cellular resistance, or something akin to simple antibody "neutralization"?

      Anti-V5 may not stay long locally following intranasal administration. Efficient reduction of S. suis colonization in NALT supports that anti-V5 could recognize and neutralize the bacteria in NALT quickly, thereby reducing further dissemination in the body. Antibody-mediated phagocytosis may not play a major role because neutrophils are mainly present in the blood but not in the tissues.  

      (4) The micrographs in Figure 7 depict anatomy from the respiratory mucosa. While there is no histochemical identification of neurons, the tissues labeled OE are almost certainly not olfactory and in fact respiratory. However, more troubling is that in figures 7A,a,b,e, and f, the lateral nasal organ has been labeled as the olfactory bulb. This undermines the conclusion of CNS invasion, and also draws into question other experiments in which the brain and CSF are measured.

      We understand the significance of your concerns and appreciate your careful review of Figure 7. The olfactory epithelium (OE) is situated directly beneath the olfactory bulb in the olfactory mucosa area and covers about half of the nasal cavity. This positioning allows information transduction between the olfactory bulb and the olfactory epithelium. The remaining surface of the nasal cavity is lined with respiratory epithelium, which does not contain neurons and primarily serves as a protective barrier. In contrast, the olfactory epithelium consists of basal cells, sustentacular cells, and olfactory receptor neurons. The olfactory receptor neurons are specifically stained green in the images using β-tubulin III, a marker that is unique to neurons. The respiratory epithelium appears colorless due to the lack of nerve cells. Similarly, the green staining with β-tubulin III also highlights the olfactory bulb. The anatomical structures indicated in the images are consistent with those described in the literature [2, 3], confirming that the anatomy of the nasal cavity has been accurately identified.

      (5) Micrographs of brain tissue in 7B are taken from distal parts of the brain, whereas if olfactory neuroinvasion were occurring, the bacteria would be expected to arrive in the olfactory bulb. It's also difficult to understand how an inflammatory process would be developed to this point in the brain, even if we were looking at the appropriate region of the brain, within an hour of inoculation (is there a control for acetic acid induced brain inflammation?). Some explanations about the speed of the immune responses recorded are warranted.

      Thank you for highlighting this issue. Cerebrospinal fluid (CSF) flows into the subarachnoid space surrounding the spinal cord and the brain. There are direct connections from this subarachnoid space to lymphatic vessels that wrap around the olfactory nerves as they cross the cribriform plate towards the nasal submucosa. This connection allows for the drainage of CSF into the nasal submucosal lymphatics in mice [4, 5]. Bacteria may utilize this CSF outflow channel in the opposite direction, which explains the development of brain inflammation in the distal areas of brain tissue adjacent to the subarachnoid space. We have included additional relevant information in the revised manuscript (page 16, lines 323-325).

      (6) The detected presence of S. suis in the CSF 0.5hr following intranasal inoculation is difficult to understand from an anatomical perspective. This is especially true when the amount of S. suis is nearly the same as that found within the NALT. Even motile pathogens would need far longer than 0.5hr to get into the brain, so it's exceedingly difficult to understand how this could occur so extensively in under an hour. The authors are quantifying CSF as anything that comes out of the brain after mincing. Firstly, this should more accurately be referred to as "brain", not CSF. Secondly, is it possible that the lateral nasal organ (which is mistakenly identified as olfactory bulb in figure 7) is being included in the CNS processing? This would explain the equivalent amounts of S. suis in NALT and "CSF".

      The high dose of inoculation used in the experiment may explain the rapid presence of S. suis in the CSF. Mice exhibit low sensitivity to S. suis infection, and the range for the effective intranasal infectious dose is quite narrow. Higher doses lead to the quick death of the mice, while lower doses do not initiate an infection at all. The dose used in this study is empirical and is intended to facilitate the observation of the progression of S. suis infection in mice.

      The NALT tissue and CSF samples were collected separately. After the NALT tissue was obtained, the nasal portion was carefully separated from the rest of the head along the line of the eyeballs. The brain tissue was then extracted from the remaining part of the head to collect the CSF, and it was lacerated to expose the subarachnoid space without being minced. This procedure aims to preserve the integrity of the brain tissue as much as possible. Further details about the CSF collection process can be found in the Materials and Methods section (page 24, lines 508-512).

      (7) To support their conclusions about neuroinvasion along the olfactory route and /CSF titer the authors should provide more compelling images to support this conclusion: sections stained for neurons and S. suis, images of the actual olfactory bulb (neurons, glomerular structure etc).

      Thank you. We respectfully disagree with the reviewer. We stained neurons using a neuron-specific marker to identify the anatomical structures of the olfactory bulb and olfactory epithelium (in green). We used an S. suis-specific antibody to highlight the bacteria present in these areas (in orange and red). The images, along with the bacteria found in the cerebrospinal fluid (CSF) and the brain inflammation observed early in the infection, strongly support our conclusion regarding brain invasion through the olfactory pathway. Please see the response to question 4 for further clarification.

      References

      (1) Seitz M, Beineke A, Singpiel A, Willenborg J, Dutow P, Goethe R, Valentin-Weigand P, Klos A, Baums CG. Role of capsule and suilysin in mucosal infection of complement-deficient mice with Streptococcus suis. Infect Immun. 2014 Jun;82(6):2460-71.

      (2) Sjölinder H, Jonsson AB. Olfactory nerve--a novel invasion route of Neisseria meningitidis to reach the meninges. PLoS One. 2010 Nov 18;5(11):e14034.

      (3) Pägelow D, Chhatbar C, Beineke A, Liu X, Nerlich A, van Vorst K, Rohde M, Kalinke U, Förster R, Halle S, Valentin-Weigand P, Hornef MW, Fulde M. The olfactory epithelium as a port of entry in neonatal neurolisteriosis. Nat Commun. 2018;9(1):4269.

      (4) Yoon JH, Jin H, Kim HJ, Hong SP, Yang MJ, Ahn JH, Kim YC, Seo J, Lee Y, McDonald DM, Davis MJ, Koh GY. Nasopharyngeal lymphatic plexus is a hub for cerebrospinal fluid drainage. Nature. 2024 Jan;625(7996):768-777.

      (5) Spera I, Cousin N, Ries M, Kedracka A, Castillo A, Aleandri S, Vladymyrov M, Mapunda JA, Engelhardt B, Luciani P, Detmar M, Proulx ST. Open pathways for cerebrospinal fluid outflow at the cribriform plate along the olfactory nerves. EBioMedicine. 2023 May;91:104558.

      Response to Recommendations for the authors:

      Reviewer 1:

      Minor concerns for the manuscript:

      (1) In the introduction, please consider giving a little more background about the bacteria itself and how it causes pathogenesis.

      We appreciate your suggestion. We have included additional background on the virulent factors and the pathogenesis of the bacteria in the introduction to enhance understanding of the results (page 4, lines 63-69).

      (2) Figure 2C would be more correct to say percent survival, as the CFUs before and after are what are being compared and not whether the bacteria are being phagocytosed. Flow cytometry of the leukocytes and a fluorescent S. suis would show phagocytosis. Unless that experiment is performed, the authors cannot claim that there is a resistance to phagocytosis.

      Thank you for your feedback. We agree with the reviewer that the experiment should be described as a bactericidal assay rather than anti-phagocytosis killing. CPS interferes with complement-mediated and receptor-mediated phagocytosis, as well as direct killing. To enhance clarity, the data in Fig. 2C are now presented as “% of bacterial survival in whole blood” (page 8).

      (3) There are two different legends present for Figure 1. Please resolve.

      We apologize for the oversight. The redundant figure legend has been removed (page 6).

      (4) There are places such as in lines 194-195, that there are assertions and interpretations about the data that are not directly drawn from the data. These hypotheses are valuable, but please move them to the discussion.

      Thank you for your suggestion. The hypothesis has been moved to the Discussion section (page 19, lines 402 - 405).

      (5) In Figure 4B, higher resolution images would strengthen the ability of non-microbiologists to see the differences in CPS levels in the cell wall.

      We achieved the highest resolution possible for clearer distinctions in CPS levels. To enhance the visualization of the different CPS levels in the images, we revised the description of the CPS changes in Figure 4B within the results section (page 11, lines 208-213).

      (6) In Figure 5 there is no D. Further, the schematics throughout would be easier to parse with the text if the challenge occurred at time 0. Consider revising them for clarity.

      Thank you for highlighting the error. We have removed "i.v + i.n (Fig. 5)" from Figure 5A and made adjustments to the schematic illustrations in Figures 5 and 6 as recommended by the reviewer (page 14).

      (7) What is the control for the serum? The findings for figures 5 and 6 would be much stronger if a non-S. suis isotype control serum was also infused.

      We used a naive serum as a control to avoid interference from a non-S. suis isotype control that targets other surface molecules of S. suis serotypes.

      (8) Figure 6 legend does not include the anti-CPS treatment.

      Thank you. We have added anti-CPS serum in the legend (page 15, line 249).

      (9) Figure 7 legend does not include the time point for panel 7A.

      Thank you. The time point is shown on Fig.7A (page 17).

      (10) Figure 7 should show OB micrographs or entire brain including the OB.

      The neuron-specific marker, β-tubulin III, identifies the neuronal cells in the olfactory bulb (OB) as shown in Fig. 7A. Unfortunately, we were unable to provide an image of the entire brain that includes the OB due to limitations in our section preparation. We apologize for the mislabeled structure in Fig. 7A, which may have caused confusion. We have corrected the labeling for consistency (see page 15, lines 257-260). Additionally, we included a drawing of the sagittal plane of the rodent's nose, depicting the compartments of the OB, olfactory epithelium (OE), nasal cavity (NC), and brain. This illustration, presented in Fig. 7B on page 17, aims to clarify the structural and functional connections between the nasopharynx and the CNS.

      (11) Some conclusions may be better drawn if figures were to be consolidated. As noted above, the data at times feels disjointed and the importance is more difficult for readers to follow because data are presented further apart. Particularly figures 5 and 6 which are similar with different time points and controls of antisera administrative routes; placing these figures together would be an example of increasing continuity throughout the paper.

      Thank you for the valuable suggestion. Figures 5 and 6, along with their related descriptions in the results section, have been combined for better cohesiveness (pages 14-15).

      Reviewer #2:

      To support their conclusions about neuroinvasion along the olfactory route and /CSF titer the authors should provide more compelling images to support this conclusion: sections stained for neurons and S. suis, images of the actual olfactory bulb (neurons, glomerular structure etc).

      Please refer to our responses to Reviewer 1's Question 7, Reviewer 2's Questions 4 and 7 in the public reviews, and Reviewer 1's Question 10 in the authors' recommendations.

    1. eLife Assessment

      This valuable study reports the link between a disruption in testicular mineral (phosphate) homeostasis, FGF23 expression, and Sertoli cell dysfunction. The data supporting the conclusion are solid. This work will be of interest to biomedical researchers working on testis biology and male infertility. The assessment is based on the editors' critical evaluation of the authors' responses.

    2. Reviewer #1 (Public review):

      The authors have strengthened their conclusions by providing additional information about the specificity of their antibodies, but at the same time the authors have revealed concerning information about the source of their antibodies.

      It appears that many of the antibodies used in this study have been discontinued because the supplier company was involved in a scandal of animal cruelty and all their goats and rabbits Ab products were sacrificed. The authors acknowledge that this is unfortunate but they also claim that the issue is out of their hands.

      The authors' statement is false; the authors ought to not use these antibodies, just as the providing company chose to discontinue them, as those antibodies are tied to animal cruelty. The issue that the authors feel OK with using them is of concern. In short, please remove any results from unethical antibodies.

      Removal of such results also best serves science. That is, any of their results using the discontinued antibodies means that the authors' results are non-reproducible and we should be striving to publish good, reproducible science.

      For the antibodies that do not have unethical origins the authors claim that their antibodies have been appropriately validated, by "testing in positive control tissue and/or Western blot or in situ hybridization". This is good but needs to be expanded upon. It is a strong selling point that the Abs are validated and I want to see additional information in their Supplementary Table 2 stating for each Ab specifically:

      (1) What +ve control tissue was used in the validation of each Ab and which species that +ve control came from. Likewise, if competition assays to confirm validity were used, please also specify.

      (2) Which assay was the Ab validated for (WB, IHC, ELISA, all etc)

      (3) For Antibodies that were validated for, or using WBs please let the reader know if there were additional bands showing.

      (4) Include references to the literature that supports these validations. That is, please make it easy for the reader to appreciate the hard work that went into the validation of the Antibodies.

      Finally, for the Abs, when the authors write that "All antibodies used have been validated by testing in positive control tissue and/or Western blot or in situ hybridization" I fail to understand what in situ hybridisation means in this context. I am under the impression that in situ hybridisation is some nucleic acid -hybridising-to-organ or tissue slice. Not polypeptide binding.

    3. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public review):

      The authors have strengthened their conclusions by providing additional information about the specificity of their antibodies, but at the same time the authors have revealed concerning information about the source of their antibodies.

      It appears that many of the antibodies used in this study have been discontinued because the supplier company was involved in a scandal of animal cruelty and all their goats and rabbits Ab products were sacrificed. The authors acknowledge that this is unfortunate but they also claim that the issue is out of their hands.

      The authors' statement is false; the authors ought to not use these antibodies, just as the providing company chose to discontinue them, as those antibodies are tied to animal cruelty. The issue that the authors feel OK with using them is of concern. In short, please remove any results from unethical antibodies.

      Removal of such results also best serves science. That is, any of their results using the discontinued antibodies means that the authors' results are non-reproducible and we should be striving to publish good, reproducible science.

      For the antibodies that do not have unethical origins the authors claim that their antibodies have been appropriately validated, by "testing in positive control tissue and/or Western blot or in situ hybridization". This is good but needs to be expanded upon. It is a strong selling point that the Abs are validated and I want to see additional information in their Supplementary Table 2 stating for each Ab specifically:

      (1) What +ve control tissue was used in the validation of each Ab and which species that +ve control came from. Likewise, if competition assays to confirm validity were used, please also specify.

      (2) Which assay was the Ab validated for (WB, IHC, ELISA, all etc)

      (3) For Antibodies that were validated for, or using WBs please let the reader know if there were additional bands showing.

      (4) Include references to the literature that supports these validations. That is, please make it easy for the reader to appreciate the hard work that went into the validation of the Antibodies.

      Finally, for the Abs, when the authors write that "All antibodies used have been validated by testing in positive control tissue and/or Western blot or in situ hybridization" I fail to understand what in situ hybridisation means in this context. I am under the impression that in situ hybridisation is some nucleic acid -hybridising-to-organ or tissue slice. Not polypeptide binding.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Remove results that have been obtained by unethically-sourced antibody reagents.

      Strengthen the readers' confidence about the appropriateness & validity of your antibodies.

      First, we want to stress that Reviewer 1 has raised his critique of the use of antibodies from Santa Cruz Biotechnology not only through the journal. The head of our department and two others were contacted by Reviewer 1 directly, without going through the journal or informing or approaching the corresponding or first author. It is our opinion that this debate and critique should be handled through the journal and editorial office and not with people without actual involvement in the project.

      It is correct that we have purchased mouse, rabbit and goat antibodies from Santa Cruz Biotechnology, as stated in the correspondence with the reviewer.

      As stated in our previous rebuttal – the goat antibodies from Santa Cruz were discontinued due to inadequate treatment of goats after settling with the authorities in 2016.

      https://www.nature.com/articles/nature.2016.19411

      https://www.science.org/content/blog-post/trouble-santa-cruz-biotechnology

      We have used 11 mouse, rabbit or goat antibodies from Santa Cruz Biotechnology in the manuscript, as listed in supplementary table 2 of the manuscript. All of them have been carefully validated in other control tissues, supported by ISH and/or WB, and many of them have already been used in several publications by our group (https://pubmed.ncbi.nlm.nih.gov/34612843/, https://pubmed.ncbi.nlm.nih.gov/33893301/, https://pubmed.ncbi.nlm.nih.gov/32931047/, https://pubmed.ncbi.nlm.nih.gov/32729975/, https://pubmed.ncbi.nlm.nih.gov/30965119/, https://pubmed.ncbi.nlm.nih.gov/29029242/, https://pubmed.ncbi.nlm.nih.gov/23850520/, https://pubmed.ncbi.nlm.nih.gov/23097629/, https://pubmed.ncbi.nlm.nih.gov/22404291/, https://pubmed.ncbi.nlm.nih.gov/20362668/, https://pubmed.ncbi.nlm.nih.gov/20172873/) and by other research groups. All antibodies used in this manuscript were purchased before the mistreatment of goats, which only became evident several years later, was publicly known.

      We do not support animal cruelty in any way, but the antibodies from Santa Cruz Biotechnology were purchased long before the mistreatment was reported. Moreover, antibodies from Santa Cruz Biotechnology are used in thousands of publications annually. The company has been punished for its misconduct and has subsequently been granted permission by the relevant authorities to produce antibodies again.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Despite the study being a collation of important results likely to have an overall positive effect on the field, methodological weaknesses and suboptimal use of statistics make it difficult to give confidence to the study's message.

      Strengths:

      Relevant human and mouse models approached with in vivo and in vitro techniques.

      Weaknesses:

      The methodology, statistics, reagents, analyses, and manuscript's language all lack rigour.

      (1) The authors used statistics to generate P-values and Rsquare values to evaluate the strength of their findings.

      However, it is unclear how stats were used and/or whether stats were used correctly. For instance, the authors write: "Gaussian distribution of all numerical variables was evaluated by QQ plots". But why? For statistical tests that fall under the umbrella of General Linear Models (like ANOVA, t-tests, and correlations (Pearson's)), there are several assumptions that ought to be checked, including typically:

      (a) Gaussian distribution of residuals.

      (b) Homoskedasticity of the residuals.

      (c) Independence of Y, but that's assumed to be valid due to experimental design.

      So what is the point of evaluating the Gaussian distribution of the data themselves? It is not necessary. In this reviewer's opinion, it is irrelevant, not a good use of statistics, and we ought to be leading by example here.

      Additionally, it is not clear whether the homoscedasticity of the residuals was checked. Many of the data appear to have particularly heteroskedastic residuals. In many respects, homoscedasticity matters more than the normal distribution of the residuals. In GraphPad analyses, if ANOVA is used and equal variances are assumed when the variances among groups are in fact unequal, then the standard deviations assigned to each group will be wrong and thus incorrect p values are calculated.
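      The checks listed in (a) and (b) can be illustrated with a minimal sketch. It assumes a one-way design, uses made-up numbers rather than any data from the manuscript, and relies on the Brown-Forsythe (median-centred Levene) statistic as one common way to screen for unequal variances; the function names are illustrative only.

```python
from statistics import mean, median

def residuals_by_group(groups):
    """Residuals of a one-way ANOVA model: each value minus its group mean."""
    return {name: [y - mean(ys) for y in ys] for name, ys in groups.items()}

def brown_forsythe_w(groups):
    """Brown-Forsythe (median-centred Levene) statistic for equal variances.

    Larger W means stronger evidence that the group variances differ.
    Under the null hypothesis W follows an F distribution with
    (k - 1, N - k) degrees of freedom.
    """
    # Absolute deviations of each observation from its group median.
    z = {name: [abs(y - median(ys)) for y in ys] for name, ys in groups.items()}
    k = len(z)
    n_total = sum(len(v) for v in z.values())
    grand_mean = mean([x for v in z.values() for x in v])
    between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in z.values())
    within = sum((x - mean(v)) ** 2 for v in z.values() for x in v)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical measurements: similar centres, very different spread.
groups = {
    "control": [1.0, 1.1, 0.9, 1.0, 1.05],
    "treated": [0.2, 2.5, 1.0, 4.0, 0.1],
}

res = residuals_by_group(groups)   # inspect these (e.g. with a QQ plot), not the raw data
w = brown_forsythe_w(groups)       # compare against F(k - 1, N - k)
print(res)
print(f"Brown-Forsythe W = {w:.2f}")
```

      The point of the sketch is that the residuals, not the raw values, are what the normality and equal-variance assumptions refer to; when W is large, a test that does not assume equal variances (for example Welch's ANOVA) or a non-parametric alternative is the safer choice.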

      Based on the incomplete and/or wrong statistical analyses it is difficult to evaluate the study in greater depth.

      We agree with the reviewer that we should lead by example and improve clarity on the use of the different statistical tests and their application. In response to the reviewer’s suggestion, we have extended the statistical section, focusing on the analyses used, and we have specified the statistical test used in the legend of each figure. We did check for Gaussian distribution and homoskedasticity of residuals before conducting a general linear model test, and this has now been specified in the revised manuscript. Where the assumptions were not met, we have specified which non-parametric test was used.

      While on the subject of stats, it is worth mentioning this misuse of statistics in Figure 3D, where the authors added the Slc34a1 transcript levels from controls in the correlation analyses, thereby driving the intercept down. Without the Control data there does not appear to be a correlation between the Slc34a1 levels and tumor size.

      We agree with the reviewer that a correlation analysis is inappropriate here and have removed this part of the figure.
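      The concern about pooling controls into a correlation can be sketched with hypothetical numbers (not the Figure 3D data). Here the tumour samples alone show essentially no relationship between marker level and tumour size, while adding a cluster of controls at size zero produces a strong apparent correlation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical tumour-bearing samples: size and marker level are unrelated.
tumor_size = [2.0, 3.0, 4.0, 5.0, 6.0]
tumor_level = [1.0, 0.6, 1.2, 0.7, 1.0]

# Hypothetical controls: no tumour (size 0) and a very different marker level.
control_size = [0.0, 0.0, 0.0]
control_level = [5.0, 5.5, 4.8]

r_tumors = pearson_r(tumor_size, tumor_level)
r_pooled = pearson_r(tumor_size + control_size, tumor_level + control_level)

print(f"tumours only:        r = {r_tumors:.2f}")
print(f"tumours + controls:  r = {r_pooled:.2f}")
```

      In this toy example the pooled coefficient reflects the difference between the two groups, not a dose-response relationship within the tumour group, which is why a group comparison is the more appropriate analysis.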

      There is more. The authors make statements (e.g. in the figure legends) such as: "Correlations indicated by R2." What does that mean? In a simple correlation, the P value is used to evaluate the strength of the slope being different from zero. The authors also give R2 values for the correlations but they do not provide R2 values for the other stats (like ANOVAs). Why not?

      We agree with the reviewer and have replaced the R2 values with the Pearson correlation coefficient in combination with the P value.
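      For readers unsure how these quantities relate, the following sketch (hypothetical data, illustrative function name) shows that in a simple linear regression R2 is the square of the Pearson coefficient r, and that the P value is derived from a t statistic with n - 2 degrees of freedom. Converting t to a P value needs a t distribution, which the Python standard library does not provide; `scipy.stats.pearsonr` returns both r and a two-sided P value.

```python
from math import sqrt

def simple_regression_summary(xs, ys):
    """Pearson r, regression R^2 and the t statistic for testing slope = 0."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

    r = sxy / sqrt(sxx * syy)

    # Least-squares line and its coefficient of determination.
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_res / syy

    # t statistic for H0: slope (equivalently, correlation) is zero.
    t_stat = r * sqrt((n - 2) / (1 - r ** 2))
    return {"r": r, "r_squared": r_squared, "t": t_stat, "df": n - 2}

# Hypothetical paired measurements.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 1.9, 3.2, 3.8, 5.1, 4.9]

summary = simple_regression_summary(xs, ys)
print(summary)
```

      Reporting r together with its P value conveys both the direction and the strength of the association, whereas R2 alone discards the sign.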

      (2) The authors used antibodies for immunos and WBs. I checked those antibodies online and it was concerning:

      (a) Many are discontinued.

      Many of the antibodies we have used were from the major antibody provider Santa Cruz Biotechnology (SCBT). SCBT was involved in an animal cruelty scandal and all their goats and rabbits were sacrificed, which explains why several antibodies were discontinued, while production of the mouse antibodies was allowed to continue. This is unfortunate but out of our hands.

      (b) Many are not validated.

      We agree with the reviewer that antibody validation is essential. All antibodies used in this manuscript have been validated. The minimal validation has been to evaluate cellular expression in positive control tissue for instance bone, kidney, or mamma. Moreover, many of the antibodies have been used and validated in previous publications (doi: 10.1593/neo.121164, doi:10.1096/fj.202000061RR, doi: 10.1093/cvr/cvv187) including knockout models. Moreover, many antibodies but not all have been validated by western blot or in situ hybridization. We have included the following in the Materials and Methods section: “All antibodies used have been validated by testing in positive control tissue and/or Western blot or in situ hybridization”.

      (c) Many performed poorly in the Immunos, e.g. FGF23, FGFR1, and Klotho are not really convincing. PO5F1 (gene: OCT4) is the one that looks convincing as it is expressed in the correct cell types.

      We fail to understand the criticism raised by the reviewer regarding the specificity of these specific antibodies. We believe the FGF23 and Klotho antibodies are performing exceptionally well, and FGFR1 is abundantly expressed in many cell types in the testis. As illustrated in Figure 2E, the expression of Klotho, FGF23, and FGFR1 is very clear, specific, and convincing. FGF23 is not expressed in normal testis – which is in accordance with no RNA present there either. However, it is abundantly expressed in GCNIS where RNA is present. On the other hand, Klotho is abundantly expressed in germ cells from normal testis but not expressed in GCNIS.

      (d) Others like NPT2A (product of gene SLC34A1) are equally unconvincing. Shouldn't the immuno show them to be in the plasma membrane?

      If there is some brown staining, this does not mean the antibodies are working. If your antibodies are not validated then you ought to omit the immunos from the manuscript.

      We acknowledge your concerns regarding the NPT2A, NPT2B, and NPT2C staining. While the NPT2A antibody is performing well, we understand your reservations about the other antibodies. It is worth noting that NPT2A is not expressed in normal testis (no RNA either) but is expressed in GCNIS, where the RNA is also present. Although it is typically present in the plasma membrane, cytoplasmic expression can be acceptable as membrane availability is crucial for regulating NPT2A function, particularly in the kidney where FGF23 controls membrane availability. We are currently involved in a comprehensive study exploring these phosphate transporters in the organs lining the male reproductive tract. In functional animal models, we have observed very specific staining with this NPT2A antibody following exposure to high phosphate or FGF23. Additionally, we are conducting Western blot analyses with this antibody, which reinforce our belief that the antibody binds specifically.

      Reviewer #2 (Public Review):

      Summary:

      This study set out to examine microlithiasis associated with an increased risk of testicular germ cell tumors (TGCT). This reviewer considers this to be an excellent study. It raises questions regarding exactly how aberrant Sertoli cell function could induce osteogenic-like differentiation of germ cells but then all research should raise more questions than it answers.

      Strengths:

      Data showing the link between a disruption in testicular mineral (phosphate) homeostasis, FGF23 expression, and Sertoli cell dysfunction are compelling.

      Weaknesses:

      Not sure I see any weaknesses here, as this study advances this area of inquiry and ends with a hypothesis for future testing.

      We thank the reviewer for the acknowledgment and highlighting that this is an important message that addresses several ways to develop testicular microlithiasis, which indicates that it is not only due to malignant disease but also frequent in benign conditions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I applaud the authors' approach to nomenclature for rodent and human genes and proteins (italicised for genes, all caps for humans, capitalised only for rodents, etc), but the authors frequently got it wrong when referring to genes or proteins. A couple of examples include:

      (1) SLC34A1 (italics) refers to the gene (correct use by the authors), but the authors then use e.g. SLC34A1 (not italics) to refer to the protein product of the SLC34A1 (italics) gene. In fact, the protein product of the SLC34A1 (italics) gene is called NPT2A (non-italics).

      (2) OCT4 (italics) refers to the gene (correct use by the authors), but the authors then use e.g. OCT4 (not italics) to refer to the protein product of the OCT4 (italics) gene. In fact, the protein product of the OCT4 (italics) gene is called PO5F1 (non-italics).

      The problem with their incorrect and inconsistent nomenclature is widespread in the manuscript making further evaluation difficult.

      Please consult a reliable protein-based database like Uniprot to derive the correct protein names for the genes. You got NANOG correct though.

      We thank the reviewer for addressing this important point. We have corrected the nomenclature throughout the manuscript as suggested.

      (3) The authors use the word "may" too many times. Also often in conjunction with words like "indicates", and "suggests". Examples of phrases that reflect that the authors lack confidence in their own results, conclusions, and understanding of the literature are:

      "...which could indicate that the bone-specific RUNX2 isoform may also be expressed... "

      "...which indicates that the mature bone may have been..."

      Are we shielding ourselves from being wrong in the future because "may" also means "may not"? It is far more engaging to read statements that have a bit more tooth to them, and some assertion too. How about turning the above statements around, to :

      "...which shows that the bone-specific RUNX2 isoform is also expressed... "

      "...which reveals that the mature bone were..."

      ...then revisit ambiguous language ("may", "might" "possibly", "could", "indicate" etc.) throughout the manuscript?

      It's OK to make a statement and be found wrong in the future. Being wrong is integral to Science.

      Thank you for addressing this. We agree with the reviewer that it is fair to be more direct and have revised many of these vague phrases throughout the manuscript.

      (4) The authors use the word "transporter" which in itself is confusing. For instance, is SLC34A1 an importer or an exporter of phosphate? Or both? Do SLC34As move phosphate in or out of the cells or cellular compartments? "Transporter" sounds too vague a word.

      We understand that it might be easier for the reader with the term "importer". However, we should use the specific nomenclature or "wording" that applies to these transporters. The exact terminology is a co-transporter or sodium-dependent phosphate cotransporter as reported here (doi: 10.1152/physrev.00008.2019). Thus, we will use the terms “co-transporter” and “transporter” throughout the revised manuscript.

    1. eLife Assessment

      This study investigates a dietary intervention that employs a smartphone app to promote meal regularity, findings that have theoretical or practical implications for a subfield and may be clinically useful. The intervention to entice participants to adhere to specific meal times represents a restrictive diet (even though it does not ask to limit caloric intake) similar to a time-restricted feeding diet, while the control subjects are not experiencing or adhering to dietary restrictions. The authors report significant weight loss but did not rigorously assess caloric intake which remains a weakness of this study as food diaries are notoriously unreliable. While the concept is very interesting, the study is considered incomplete, and the rigor of the results should be strengthened in follow-up studies to add more stringent methods to assess caloric intake. Additionally, the study hypothesizes that the intervention resets the circadian clock. However, the study needs an objective method for assessing circadian rhythms, such as actigraphy, in addition to a subjective questionnaire.

    2. Reviewer #3 (Public review):

      In this study, the authors tested a dietary intervention focused on improving meal regularity. Participants first utilized a smartphone application to track their meal frequencies, and then they were asked to restrict their meal intake to times when they most often eat to enhance meal regularity for six weeks. This, supposedly, resulted in some weight loss, supposedly independent of changes in caloric intake.

      The concept is appealing, and it is interesting to use a smartphone app in participants' typical everyday environment to regularize food intake. It asks participants to stick to meal intake times that are supported in many cultures, and it asks them not to eat outside of these times, avoiding what are likely unhealthy habits such as grazing a refrigerator late at night. In essence, this is a restrictive diet, not restricting caloric intake but the timing of food intake, and it has many parallels to time-restricted feeding. It is important to note that there are many restrictive diets, and a common problem with restrictive diets is that while they allow one to lose a couple of pounds for a couple of months, just as with this diet, the long-term success is very poor because they depend on restriction. This issue is still not discussed.

      Further, why the participants lose weight remains to be determined: whether this is indeed due to a reduction in food intake, as implied, or whether the weight loss occurred without a reduction in caloric intake, as first stated by the authors and now suggested. The food diary lacks rigor as a method to assess caloric intake, as has been well established, and it has been shown again and again to be misleading. Many readers without that knowledge nevertheless draw conclusions from such data, which would best have been omitted.

      The authors hypothesize that the intervention improves metabolism by improving circadian rhythmicity. That's plausible, but the study provides only a subjective questionnaire and lacks more objective measures such as actigraphy.

      While the authors now state that this is a pilot study, the study falls short of providing mechanistic insights into what underlies the weight loss, and the many correlations provided do not make up for this weakness.

      Overall, while this pilot study introduces an interesting approach to meal regularity, its limitations highlight the need for more rigorous studies to validate these findings.

      (1) Unreliable method of caloric intake

      The trial's reliance on self-reported caloric intake is problematic, as participants tend to underreport intake. As pointed out earlier by me and now cited in the revised manuscript, the NEJM paper (DOI: 10.1056/NEJM199212313272701) reported that some participants underreported caloric intake by approximately 50%, rendering such data unreliable and hence misleading. The question is, why include such unreliable data that is more misleading than informative at all? These data should have been omitted. More rigorous methods for assessing food intake should have been utilized. I understand this requires more effort, such as providing participants with meals, or using better methods that photograph and weigh the meals, etc., but it is certainly feasible. It has been done many times in other studies. Further, the control group was not asked to restrict their diet in any way, and hence, asking for a restriction in timing in the treatment group may be sufficient to reduce caloric intake and induce weight loss.

      Merely acknowledging the unreliability of self-reported caloric intake is insufficient, as it still leaves the reader with the impression that this weight loss is independent of caloric intake when, in reality, we actually have no idea if food intake contributes to it. A more robust approach to assessing food intake is imperative. Even if a decrease in caloric intake is observed through rigorous measurement, as I am convinced a more rigorous study would unveil testing this paradigm, this intervention may merely represent another restrictive diet among countless others that show that one may lose weight by going on a diet. Seemingly, any restrictive diet works for a few months. The trouble is they do not work long-term because they depend on restriction. I agree with the authors that their intervention seems common sense and has little downside, but one also needs to be realistic about the prospects of this intervention.

      (2) Lack of objective data regarding circadian rhythm

      The assessment of circadian rhythm using the MCTQ, a self-reported measure of chronotype, is subjective. More objective methods like actigraphy would have strengthened the study.

      Actigraphy is considered better than a sleep questionnaire for assessing circadian rhythms because it provides objective data on activity patterns over time, offering a more accurate picture of sleep-wake cycles compared to subjective self-reported information from a questionnaire.

      The authors' responses to my prior review are misleading.

      I understand that this is a pilot study. Is it appropriate to point out weaknesses and flaws in the conclusion drawn from a pilot study? Absolutely, that is the reviewer's job.

      I also understand that food intake can affect circadian rhythm, which was part of the rationale behind the study. Is it appropriate to criticize the study for not examining the effect of the intervention on circadian rhythm using objective measures provided by actigraphy? Yes, it is, as this would have provided mechanistic insights that are more rigorous. I understand that this was not the declared goal, but it should have been examined in a pilot study. To jump to the conclusion that based on prior studies, the intervention will improve circadian rhythms as the authors do is not rigorous and hence a weakness.

      A less rigorous method, such as a food questionnaire, to assess caloric intake can result in inadequately supported and potentially misleading conclusions. By including it, the reader may conclude that there was no change in caloric intake when indeed we do not know. I disagree with the authors that this is a minor issue. The associations and correlations the authors provide do not solve the issue. Hence, to make it very clear, it remains to be studied if this intervention reduces weight by reducing caloric intake or other mechanisms. Including this data reduces the study's rigor as it suggests that there is no difference in food intake.

      I did not suggest to only use an actimeter (which is a device); I suggested actigraphy. Actigraphy is widely recognized in the field for its utility in circadian rhythm research and provides objective data, while the questionnaire used is subjective. The authors do quote papers comparing their survey to actigraphy by correlation analysis, but the fundamental difference of the two approaches remains. Does an objective measure increase rigor compared to a subjective assessment? Yes, it does.

      Similarly, I did not state "that any form of imposed diet appears to lead to weight loss over several months." I said that many forms of restrictive diets do induce weight loss of a similar magnitude to this diet.

      The authors should have discussed the fundamental confounder of the study in that the treatment group is asked to restrict food intake to specific times while the control group is not asked to restrict in any way and the potential contribution of this to the weight loss observed.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      We would like to remind the editors and reviewers that the present project is a pilot study that does not claim to produce definitive results. Pilot studies are exploratory preliminary studies to test the validity of hypotheses, the feasibility of a study as well as the research methods and the study design. From our point of view, our hypotheses and the feasibility of the pilot study have been confirmed to such an extent that the implementation of a larger study is justified. At the same time, it became clear during the pilot that the methods and design need to be adapted in some areas in order to increase the reliability of the results - a finding that pilot studies are usually conducted to obtain. We discussed these limitations in detail in order to explain the planned changes in the follow-up study. What the reviewers and editors interpret as incompleteness is therefore due to the nature of a pilot study.  We consider it necessary that appropriate standards are taken into account in the evaluation of the present work.

      In addition, we would like to make a counterstatement as to what our main claims, which should be used to assess the strength of evidence, are - and what they are not:

      In the introduction, we describe the background that led to the formation of our hypotheses: Previous animal and human studies show that food, along with light, serves as the main Zeitgeber for circadian clocks. It has also been shown that chrononutrition can lead to weight loss and improved well-being. Based on this, we hypothesized that individualized meal timing can enhance these positive effects. This hypothesis has been validated on the basis of the available results. Contrary to what the editors and reviewers stated, the assumption that the observed beneficial effects are indeed related to an alteration or resetting of endogenous circadian rhythms was not intended to be investigated in this study and is not one of our main claims. This has already been sufficiently demonstrated and, in our view, need not and should not be repeated in every study on chrononutrition. Accordingly, this assumption was not formulated as a working hypothesis or main claim. It is described in the paper as a potential mechanism, the assumption of which is justified on the basis of previous studies. The lack of a corresponding examination and the erroneous insinuation that corresponding results were nevertheless listed by us in the paper as a main claim should therefore not be used as a criterion for downgrading the assessment of the strength of evidence.

      The main criticism of our study is the collection of data using self-reported food and food quantities. This form of data collection is indeed prone to error, as there is little control over the accuracy of the reported data. However, we believe that this problem is limited in scope.

      (1) Contrary to what the editors and reviewers claim, at no point do we write that we are convinced that food intake has not changed. On the contrary, in Figure 2 we explicitly show that there was a change in what some participants reported to us regarding their food intake. We make it clear throughout the text that we could not find any correlation between weight change and the changes in the reports of food quantities/meals. These statements are correct and only what are actual and formulated main claims should be included in the evaluation of the study.

      (2) As previously stated, we conducted analyses that suggest that an unreported reduction in food intake is unlikely to be the cause of weight loss. For the most part, participants did not change their reporting behavior during the exploration and intervention phases. That is, participants who underreported food intake reported similar amounts in both phases of the study, but lost weight only in the intervention phase. To explain their weight loss with imprecise reporting, it would have to be assumed that these participants began to eat less in the intervention phase and at the same time report more in order to achieve similar calorie counts and food composition in the evaluation. We consider such behavior to be very unlikely, especially since it would apply to numerous participants.

      (3) The editors and reviewers reduce the results to the absence of a correlation between weight loss and reported food quantity and composition. In their assessment of the significance of the findings, however, they ignore the fact that we did find a significant correlation in our analyses, namely between weight loss and an increase in the regularity of food intake. There is no correlation between an increase in regularity and a reduction in reported calories (R² = 0.01472). This is credible in our view, as it is unlikely that the more regularly participants ate, the more pronounced the error in their reports was (while in reality they ate less than before).

      (4) We also had the requirement for the study design that the participants could carry out the intervention in their normal everyday life and environment in order to test and ensure implementation in real life. We consider it unrealistic to be able to monitor food intake continuously and without interruption over a period of several weeks under these conditions. We therefore see no alternative to self-reporting. As the reviewers and editors did not suggest any alternative methods of data collection that would fulfil the requirements of our study, we assume that, despite criticism and reservations, they generally agree with our assessment and take this into account in their evaluation.

      It is still criticized that some confounding factors are present. The reviewer makes no reference to the fact that we either eliminated these in the last version submitted (age range), identified them as unproblematic (unmatched cohorts, menstrual cycle, shift work) or even deliberately used them in order to be able to test our hypothesis more validly (inclusion of individuals with normal weight, overweight, and obesity).

      Besides, the use of actimeters to determine circadian rhythms as proposed by the editors and reviewers is not valid for this study and the requirement to use them to determine a circadian reset in the eLife assessment is misleading and inappropriate. This instrument only measures physical activity, but not the physiological parameters that are relevant for an investigation in this field of research.

      For the assessment of chronotype alone, the MCTQ questionnaire is a valid instrument that has been validated several times against actimetry (e.g., DOIs: 10.1080/07420528.2022.2025821, 10.1080/07420528.2023.2202246, 10.1016/j.ijpsycho.2016.07.433, 10.1155/2018/5646848). The reviewer's statement that the MCTQ questionnaire is unreliable for determining chronotype is unsupported and incorrect.

      Equally unproven is the statement that any form of imposed diet appears to lead to weight loss over a period of several months.

      Nevertheless, in order to prevent further misunderstandings, we have revised our text in a number of places and clarified that our statements are not irrefutable assertions, but potential interpretations of the results obtained in the pilot study, which are to be analyzed in more detail with regard to the planned more comprehensive study.

    1. eLife Assessment

      This study provides a comprehensive exploration of the role of IL-1β signaling during development of lung injury induced by a combination of underlying inflammation and mechanical ventilation. The data are convincing, and while the translatability of the findings related to therapeutic hypothermia may be somewhat complicated, they have the potential to be very valuable to the field.

    2. Reviewer #1 (Public review):

      Summary:

      The authors found that IL-1b signaling is pivotal for hypoxemia development and can modulate NETs formation in LPS+HVV ALI model.

      Strengths:

      They used IL1R1 ko mice and proved that IL1R1 is involved in ALI model proving that IL1b signalling leads towards ARDS. In addition, hypothermia reduces this effect, suggesting a therapeutic option.

      Weaknesses:

      (1) IL1R1 binds IL1a and IL1b. What would be the role of IL1a in this scenario?

      (2) The authors depleted neutrophils using anti-Ly6G. What about MDSCs? Are these latter cells involved in ARDS and VILI?

      (3) The authors found that TH inhibited IL-1β release from macrophages, which led to less NET formation and albumin leakage in the alveolar space in their lung injury model. A graphical abstract suggesting a cellular mechanism could be included.

      (4) If Macrophages are responsible for IL1b release that via IL1R1 induces NETosis, what happens if you deplete macrophages? what is the role of epithelial cells?

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Nosaka et al is a comprehensive study exploring the involvement of IL1beta signaling in a 2-hit model of lung injury + ventilation, with a focus on modulation by hypothermia.

      Strengths:

      The authors demonstrate quite convincingly that interleukin 1 beta plays a role in the development of ventilator-induced lung injury in this model, and that this role includes the regulation of neutrophil extracellular trap formation. The authors use a variety of in vivo animal-based and in vitro cell culture work, and interventions including global gene knockout, cell-targeted knockout and pharmacological inhibition, which greatly strengthen the ability to make clear biological interpretations.

      Weaknesses:

      A primary point for open discussion is the translatability of the findings to patients. The main model used, one of intratracheal LPS plus mechanical ventilation is well accepted for research exploring the pathogenesis and potential treatments for acute respiratory distress syndrome (ARDS). However, the interpretation may still be open to question - in the model here, animals were exposed to LPS to induce inflammation for only 2 hours, and seemingly displayed no signs of sickness, before the start of ventilation. This would not be typical for the majority of ARDS patients, and whether hypothermia could be effective once substantial injury is already present remains an open question. The interaction between LPS/infection and temperature is also complicated - in humans, LPS (or infection) induces a febrile, hyperthermic response, whereas in mice LPS induces hypothermia (eg. Ganeshan K, Chawla A. Nat Rev Endocrinol. 2017;13:458-465). Given this difference in physiological response, it is therefore unclear whether hypothermia in mice and hypothermia in humans are easily comparable. Finally, the use of only young, male animals such as in the current study has been typical but may be criticised as limiting translatability to people.

      Therefore while the conclusions of the paper are well supported by the data, and the biological pathways have been impressively explored, questions still remain regarding the ultimate interpretations.

    4. Author response:

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      The authors found that IL-1b signaling is pivotal for hypoxemia development and can modulate NETs formation in LPS+HVV ALI model.  

      Strengths: 

      They used IL1R1 ko mice and proved that IL1R1 is involved in ALI model proving that IL1b signalling leads towards ARDS. In addition, hypothermia reduces this effect, suggesting a therapeutic option.  

      We thank the Reviewer for recognizing the strengths of our study and their positive feedback.

      Weaknesses: 

      (1) IL1R1 binds IL1a and IL1b. What would be the role of IL1a in this scenario? 

      Thank you for asking this question. We addressed this in our previous paper (Nosaka et al. Front Immunol 2020;11; 207), where we used anti-IL-1a treatment and IL-1a KO mice in our model and found that neither anti-IL-1a treated mice nor IL-1a KO mice were protected. Thus, IL-1b, but not IL-1a, plays a role in inducing hypoxemia during LPS+HVV. We will now add this point to the discussion of our revised manuscript.

      (2) The authors depleted neutrophils using anti-Ly6G. What about MDSCs? Are these latter cells involved in ARDS and VILI?

      Neutrophil depletion with anti-Ly6G may also affect G-MDSCs (Blood Adv 2022 Jul 29;7(1):73–86); however, we have not looked directly at G-MDSCs. If these cells had been depleted, we would have expected to see an increase in inflammation, which we did not.

      Instead, anti-Ly6G treated mice were protected. Thus, we cannot comment on any presumed role of G-MDSCs in the LPS+HVV-induced severe ALI model that we used.

      (3) The authors found that TH inhibited IL-1β release from macrophages, which led to less NET formation and albumin leakage in the alveolar space in their lung injury model. A graphical abstract suggesting a cellular mechanism could be included.

      Thanks for summarizing our findings and for the suggestion. Unfortunately, eLife does not publish graphical abstracts. We have tried to describe this mechanism in the discussion.

      (4) If macrophages are responsible for IL1b release that, via IL1R1, induces NETosis, what happens if you deplete macrophages? What is the role of epithelial cells?

      Previous studies have found that macrophage depletion is protective in several models of ALI (Eyal. Intensive Care Med. 2007;33:1212–1218; Lindauer. J Immunol. 2009;183:1419–1426), and other researchers have found that airway epithelial cells did not contribute to IL-1β secretion (Tang. PLoS ONE. 2012;7:e37689). We have previously reported that epithelial cells produce IL-18 without an LPS priming signal during LPS+HVV (Nosaka et al. Front Immunol 2020;11; 207). Thus, IL-18 is not sufficient to induce hypoxemia, as Saline+HVV treated mice do not develop hypoxemia (Nosaka et al. Front Immunol 2020;11; 207). We will now add this point to the revised discussion of the manuscript.

      Reviewer #2 (Public review): 

      Summary: 

      The manuscript by Nosaka et al is a comprehensive study exploring the involvement of IL1beta signaling in a 2-hit model of lung injury + ventilation, with a focus on modulation by hypothermia. 

      Strengths: 

      The authors demonstrate quite convincingly that interleukin 1 beta plays a role in the development of ventilator-induced lung injury in this model, and that this role includes the regulation of neutrophil extracellular trap formation. The authors use a variety of in vivo animal-based and in vitro cell culture work, and interventions including global gene knockout, cell-targeted knockout and pharmacological inhibition, which greatly strengthen the ability to make clear biological interpretations. 

      We thank the Reviewer for their positive feedback.

      Weaknesses: 

      A primary point for open discussion is the translatability of the findings to patients. The main model used, one of intratracheal LPS plus mechanical ventilation, is well accepted for research exploring the pathogenesis and potential treatments for acute respiratory distress syndrome (ARDS). However, the interpretation may still be open to question - in the model here, animals were exposed to LPS to induce inflammation for only 2 hours, and seemingly displayed no signs of sickness, before the start of ventilation. This would not be typical for the majority of ARDS patients, and whether hypothermia could be effective once substantial injury is already present remains an open question. The interaction between LPS/infection and temperature is also complicated - in humans, LPS (or infection) induces a febrile, hyperthermic response, whereas in mice LPS induces hypothermia (e.g., Ganeshan K, Chawla A. Nat Rev Endocrinol. 2017;13:458-465). Given this difference in physiological response, it is therefore unclear whether hypothermia in mice and hypothermia in humans are easily comparable. Finally, the use of only young, male animals such as in the current study has been typical but may be criticised as limiting translatability to people.

      Therefore, while the conclusions of the paper are well supported by the data, and the biological pathways have been impressively explored, questions still remain regarding the ultimate interpretations.

      We agree with the reviewer that at two hours post LPS there is only minimal pulmonary inflammation (Dagvadorj et al. Immunity 42, 640–653). This is a limitation of the experimental model we used in our study. Additionally, as the reviewer pointed out, LPS induces hyperthermia in humans, but it is also well established that physiological hypothermia occurs in humans with severe infections and sepsis (Baisse. Am J Emerg Med. 2023 Sep;71:134-138; Werner. Am J Emerg Med. 2025 Feb;88:64-78). Therefore, the difference between human and mouse responses to sepsis or infections may be more nuanced. Furthermore, it is important to distinguish between physiological hypothermia (just <36°C) and therapeutic hypothermia (typically 32-34°C). We will add to the discussion whether hypothermia serves as a protective response and whether the transition from normothermia to hyperthermia could have detrimental effects. As the Reviewer points out, we only used young male mice in our study; we will also add this point to the revised discussion as a limitation of our study.

    1. eLife Assessment

      This study highlights ITCH as a regulator of SARS-CoV-2 replication by promoting K63-linked ubiquitination of M and E proteins. While the findings are potentially useful, the approaches are overly reliant on ectopic expression models and lack direct mechanistic evidence that ubiquitination of M and E has functional relevance. Accordingly, the strength of evidence is incomplete, as further experiments are needed to validate the findings and address potential confounding factors.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigated the role of an E3 ubiquitin ligase, ITCH, in regulating the viral life cycle of SARS-CoV-2. The authors showed that ITCH mediates ubiquitination of the membrane (M) and envelope (E) proteins of SARS-CoV-2. Ubiquitination of E and M results in enhanced interactions between the structural proteins and redistribution of the structural proteins into autophagosomes. The authors claim that the enhanced interactions between structural proteins and trafficking of the structural proteins into autophagosomes contribute to SARS-CoV-2 replication and egress, prompting ITCH as a potential antiviral target. ITCH also alters the cellular distribution of host proteases important for spike cleavage, which protects and stabilizes spike against cleavage. The authors also demonstrated that SARS-CoV-2 replication is augmented by ITCH, as virus replication is significantly impaired in cells lacking ITCH expression.

      Strengths:

      The authors provided high-quality data with appropriate experimental controls to justify their claims and conclusions. The mechanistic analyses are excellent and presented in a logical manner. The investigation of the role of ubiquitination in coronavirus assembly and egress is novel as most previous studies focused on its role in mediating innate immune responses.

      Weaknesses:

      Although the authors showed that ITCH ubiquitinates E and M proteins, the claim that such ubiquitination promotes virion assembly and egress is circumstantial. The enhanced interaction between the structural proteins and targeting of ubiquitinated structural proteins into autophagosomes does not necessarily result in increased virion production and release as suggested by the authors. There is a disconnect between the ubiquitination of structural proteins and the role of ITCH in augmenting virus replication as shown in Fig. 6A and B. In addition, the authors showed that the catalytic activity of ITCH is important for the localization and maturation of host proteases. However, the underlying mechanism is unknown. Also, it is unclear how protection of spike from cleavage conferred by ITCH explains its role in promoting replication, as a lack of spike cleavage would inevitably compromise entry. The major weakness of the manuscript is the lack of experimental data that explains the molecular role of ITCH in relation to its phenotype observed during SARS-CoV-2 infection.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Qiwang Xiang et al. investigated the role of the E3 ubiquitin ligase ITCH in the life cycle of SARS-CoV-2. They claim the following:

      i) ITCH promotes virion assembly by interacting with E and M proteins and enhancing their K63-linked ubiquitination.

      ii) ITCH-mediated ubiquitination promotes autophagosome-dependent secretion of viral particles.

      iii) ITCH stabilizes the viral spike protein by impairing its processing by furin and cathepsin L proteases.

      The manuscript provides an interesting exploration of ITCH's role in the SARS-CoV-2 life cycle but requires additional work to strengthen key claims and address potential confounding factors.

      Strengths:

      The experiments are sufficiently clear in documenting that ITCH activity is critical for efficient SARS-CoV-2 replication and for K63-linked ubiquitination of the M and E proteins.

      Weaknesses:

      - The manuscript does not convincingly demonstrate how ITCH-mediated ubiquitination of E and M impacts virus assembly and release. Identifying the specific lysine residues in M and E targeted by ITCH, and generating mutant VLPs or recombinant viruses, would strengthen the conclusions.

      - Most of the conclusions rely on ITCH overexpression data, which may have off-target effects on Golgi integrity and vesicular trafficking. For instance, Figure 4F provides evidence of altered Golgi morphology and TGN46 fragmentation, raising concerns that ITCH overexpression could indirectly mislocalize furin, affecting S1/S2 cleavage of the spike protein. In addition, inhibition of furin activity may also lead to off-target effects, given its role in processing numerous host proteins.

      - Similarly, ITCH overexpression is likely to indirectly affect cathepsin-L maturation. In addition, the manuscript does not clarify how impaired cathepsin L activity would influence virus assembly or release.

      - A major concern is also the lack of quantification and statistical analysis of immunofluorescence images throughout the manuscript, which undermines the reliability of these observations.

    4. Reviewer #3 (Public review):

      Summary:

      Xiang et al. investigated the role of ubiquitin E3 ligase ITCH in SARS-CoV-2 replication. First, they described the role of ITCH on the structural proteins. Here, the ubiquitination of E and M (but not S) leads to an enhanced interaction and presumably virion assembly. In addition, E and M ubiquitination seems to be necessary for p62-guided sequestration into autophagosomes for secretion. Furthermore, ITCH regulates S proteolytic cleavage by changing furin localization and inhibiting CTSL protease maturation. In addition, SARS-CoV-2 infection upregulates ITCH phosphorylation, whereas knockout of ITCH reduces SARS-CoV-2 replication.

      Strengths:

      The proposed study is of interest to the virology community because it aims to elucidate the role of ubiquitination by ITCH in SARS-CoV-2 proteins. Understanding these mechanisms will address broadly applicable questions about coronavirus biology and enhance our knowledge of ubiquitination's diverse functions in cell biology.

      Weakness:

      The involvement of ubiquitin ligases in SARS-CoV-2 replication is not entirely new (see E3 Ubiquitin Ligase RNF5; Yuan et al., 2022; Li et al., 2023). While the data generally support the conclusions, additional work is needed to confirm the role of ITCH in SARS-CoV-2 replication in a biologically relevant context. The vast majority of data is based on transient overexpression experiments of ITCH, which ultimately leads to massive ubiquitination of several viral and host cell factors, including potentially low-affinity substrates not typically recognized under physiological conditions. In addition to that, nearly all experiments were done in cells co-overexpressing ITCH and the viral structural proteins (or cellular proteases) in HEK293T cells. Therefore, a proteomic analysis of protein ubiquitination in a) SARS-CoV-2-infected cells (ideally several cell types) and b) SARS-CoV-2-infected v2T-ITCH-KO cells would verify the ITCH-related ubiquitination of e.g., E and M and would strengthen the whole manuscript. In addition, the few key experiments using SARS-CoV-2 infected cells were performed in VeroE6 cells, which are neither human nor lung-derived. Only in one experiment were lung-derived Calu3 cells included.

      Moreover, the manuscript names ITCH as a central regulator of SARS-CoV-2 replication. If ITCH is beneficial for E and M interaction and thereby aids virion assembly, showing its effect on VLP production would be desirable. Clarifications regarding data acquisition and data analysis could strengthen the manuscript and its conclusions.

    1. eLife Assessment

      NCX1 is an important cardiac Ca2+/Na+ exchanger whose activity is tightly regulated. This manuscript describes the structural basis of activation by the lipid PIP2 and inhibition by binding of a small molecule to NCX1. These results provide key insights into NCX1 regulation and cellular Ca2+ signaling, but the evidence presented is still incomplete.

    2. Reviewer #1 (Public review):

      This study uses structural and functional approaches to investigate the regulation of the Na/Ca exchanger NCX1 by an activator, PIP2, and an inhibitor, SEA0400.

      State-of-the-art methods are employed, and the data are of high quality and presented very clearly. The manuscript combines two rather different studies (one on PIP2; and one on SEA0400) neither of which is explored in the depth one might have hoped to form robust conclusions and significantly extend knowledge in the field.

      The novel aspect of this work is the study of PIP2. Unfortunately, technical limitations precluded structural data on binding of the native PIP2, so an unnatural short-chained analog, di-C8 PIP2, was used instead. This raises the question of whether these two molecules, which have similar but clearly distinct profiles of activation, actually share the same binding pocket and mode of action. In an effort to address this, the authors mutate key residues predicted to be important in forming the binding site for the phosphorylated head group of PIP2. However, none of these mutations prevent PIP2 activation. The only ones that have a significant effect also influence the Na-dependent inactivation process independently of PIP2, thus casting doubt on their role in PIP2 binding, and thus on the identification of the PIP2 binding site. A more extensive mutagenic study, based on the di-C8 PIP2 binding site, would have given more depth to this work and might have been more revealing mechanistically.

      The SEA0400 aspect of the work does not integrate particularly well with the rest of the manuscript. This study confirms the previously reported structure and binding site for SEA0400 but provides no further information. While interesting speculation is presented regarding the connection between SEA0400 inhibition and Na-dependent inactivation, further experiments to test this idea are not included here.

    3. Reviewer #2 (Public review):

      The study by Xue et al. reports the structural basis for the regulation of the human cardiac sodium-calcium exchanger, NCX1, by the endogenous activator PIP2 and the small molecule inhibitor SEA0400. This well-written study contextualizes the new data within the existing literature on NCX1 and the broader NCX family. This work builds upon the authors' previous study (Xue et al., 2023), which presented the cryo-EM structures of human cardiac NCX1 in both inactivated and activated states. The 2023 study highlighted key structural differences between the active and inactive states and proposed a mechanism where the activity of NCX1 is regulated by the interactions between the ion-transporting transmembrane domain and the cytosolic regulatory domain. Specifically, in the inward-facing state and at low cytosolic calcium levels, the transmembrane (TM) and cytosolic domains form a stable interaction that results in the inactivation of the exchanger. In contrast, calcium binding to the cytosolic domain at high cytosolic calcium levels disrupts the interaction with the TM domain, leading to active ion exchange.

      In the current study, the authors present two mechanisms explaining how PIP2 stimulates NCX1 activity by destabilizing the protein's inactive state (i.e., by disrupting the interaction between the TM domain and the cytosolic domain) and how SEA0400 stabilizes this interaction, thereby acting as a specific inhibitor of the system.

      The first part of the results section addresses the effect of PIP2 and PIP2 diC8 on NCX1 activity. This is pertinent as the authors use the diC8 version of this lipid (which has a shorter acyl chain) in their subsequent cryo-EM structure due to the instability of native PIP2. I am not an electrophysiology expert; however, my main comment would be to ask whether there is sufficient data here to characterise fully the differences between PIP2 and PIP2 diC8 on NCX1 function. It appears from the text that this study is the first to report these differences, so perhaps this data needs to be more robust. The spread of the data points in Figure 1B is possibly a little unconvincing given that only six measurements were taken. Why is there one outlier in Figure 1A? Were these results taken using the same batch of oocytes? Are these technical or biological replicates? Is the convention to use statistical significance for these types of experiments?

      I am also somewhat skeptical about the modelling of the PIP2 diC8 molecule. The authors state, "The density of the IP3 head group from the bound PIP2 diC8 is well-defined in the EM map. The acyl chains, however, are flexible and could not be resolved in the structure (Fig. S2)."

      However, the density appears rather ambiguous to me, and the ligand does not fit well within the density. Specifically, there is a large extension in the volume near the phosphate at the 5' position, with no corresponding volume near the 4' phosphate. Additionally, there is no bifurcation of the volume near the lipid tails. I attempted to model cholesterol hemisuccinate (PDB: Y01) into this density, and it fits reasonably well - at least as well as PIP2 diC8. I am also concerned that if this site is specific for PIP2, then why are there no specific interactions with the lipid phosphates? How can the authors explain the difference between PIP2 and PIP2 diC8 if the acyl chains don't make any direct interactions with the TM domain? In short, the structures do not explain the functional differences presented in Figure 1.

      The side chain densities for Arg167 and Arg220 are also quite weak. While there is some density for the side chain of Lys164, it is also very weak. I would expect that if this site were truly specific for PIP2, it should exhibit greater structural rigidity - otherwise, how is this specific?

      Given this observation, have the authors considered using other PIP2 variants to determine if the specificity lies with PI4,5P2 as opposed to PI3,5P2 or PI3,4P2? A lack of specificity may explain the observed poor density.

      I also noticed many lipid-like densities in the maps for this complex. Is it possible that the authors overlooked something? For instance, there is a cholesterol-like density near Val51, as well as something intriguing near Trp763, where I could model PIP2 diC8 (though this leads to a clash with Trp763). I wonder if the authors are working with mixed populations in their dataset. The accompanying description of the structural changes is well-written (assuming it is accurate).

      I would recommend that the authors update the figures associated with this section, as they are currently somewhat difficult to interpret without prior knowledge of NCX architecture. My suggestions include:

      - Including the density for the PIP2 diC8 in Figure 2A.

      - Adding membrane boundaries (cytosolic vs. extracellular) in Figure 2B.

      - Labeling the cytosolic domains in Figure 2B.

      - Adding hydrogen bond distances in Figure 2A.

      - Detailing the domain movements in Figure 2B (what is the significance of the grey vs. blue structures?).

      The section on the mechanism of SEA0400-induced inactivation is strong. The maps are of better quality than those for the PIP2 diC8 complex, and the ligand fits well. However, I noticed a density peak below F02 on SEA0400 that lies within hydrogen bonding distance of Asp825. Is this a water molecule? If so, is this significant?

      Furthermore, there are many unmodeled regions that are likely cholesterol hemisuccinate or detergent molecules, which may warrant further investigation.

      The authors introduce SEA0400 as a selective inhibitor of NCX1; however, there is little to no comparison between the binding sites of the different NCX proteins. This section could be expanded. Perhaps Fig. 4C could include sequence conservation data.

      Additionally, is the fenestration in the membrane physiological, or is it merely a hole forced open by the binding of SEA0400? I was unclear as to whether the authors were suggesting a physiological role for this feature, similar to those observed in sodium channels.

    4. Reviewer #3 (Public review):

      NCXs are key Ca2+ transporters located on the plasma membrane, essential for maintaining cellular Ca2+ homeostasis and signaling. The activities of NCX are tightly regulated in response to cellular conditions, ensuring precise control of intracellular Ca2+ levels, with profound physiological implications. Building upon their recent breakthrough in determining the structure of human NCX1, the authors obtained cryo-EM structures of NCX1 in complex with its modulators, including the cellular activator PIP2 and the small molecule inhibitor SEA0400. Structural analyses revealed mechanistically informative conformational changes induced by PIP2 and elucidated the molecular basis of inhibition by SEA0400. These findings underscore the critical role of the interface between the transmembrane and cytosolic domains in NCX regulation and small molecule modulation. Overall, the results provide key insights into NCX regulation, with important implications for cellular Ca2+ homeostasis.

    1. eLife Assessment

      This valuable paper reports machine learning-based image analysis pipelines for the automated segmentation of micronuclei and the detection and sorting of micronuclei-containing cells. These are powerful new tools for researchers who study micronuclei and their physiologic consequences. The analysis of the new tools and their benchmarking is rigorous and convincing; applications and remaining limitations are well explained in the paper.

    2. Reviewer #1 (Public review):

      DiPeso et al. develop two tools to i) classify micronucleated (MN) cells, which they call VCS MN, and ii) segment micronuclei and nuclei with MNFinder. They then use these tools to identify transcriptional changes in MN cells.

      The strengths of this study are:

      - Developing highly specialized tools to speed up the analysis of specific cellular phenomena such as MN formation and rupture is likely valuable to the community and neglected by developers of more generalist methods.

      - A lot of work and ideas have gone into this manuscript. It is clearly a valuable contribution.

      - Combining automated analysis, single-cell labeling, and cell sorting is an exciting approach to enrich for phenotypes of interest, which the authors demonstrate here.

      The authors addressed my original concerns related to the first version of this manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      Micronuclei are aberrant nuclear structures frequently seen following the missegregation of chromosomes. The authors present two image analysis methods, one robust and another rapid, to identify micronuclei (MN) bearing cells. To analyse their software efficacy, the authors study images of cells treated with MPS1 inhibitor to induce chromosome missegregation. Next, the authors use RNA-seq to assess the outcomes of their MN-identifying methods: they do not observe a transcriptomic signature specific to MN but find changes that correlate with aneuploidy status. Overall, this work offers new tools to identify MN-presenting cells, and it sets the stage with clear benchmarks for further software development.

      Strengths:

      Currently, there are no robust MN classifiers with a clear quantification of their efficiency across cell lines (mIoU score). The software presented here tries to address this gap. GitHub material (images, ground truth labels, tools, protocols, etc.) provided is a great asset to computational biologists. The method has been tested in more than one cell line. This method can help integrate cell biology and 'omics' data, making it suitable for multimodal studies.

      Weaknesses:

      Although the classifier outperforms available tools for MN segmentation by providing mIoU, it's not yet at a point where it can be reliably applied to functional genomics assays where we expect a range of phenotypic penetrance in most cell lines (e.g., misshapen, multinucleated, and lagging DNA in addition to micronucleated cells). The discussion considers the nature and proportion of MN in RPE1 cells, and how the classifier is well-suited for RPE1 that predominantly display MN structures. Whether the classifier can rigorously assign MN-presenting cells amidst drastic nuclear aberrancies following a spindle checkpoint loss needs to be tested in the future.
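      For readers unfamiliar with the mIoU score mentioned above, the following is a minimal sketch of how a mean intersection-over-union can be computed for segmentation masks. It is an illustration only, not the authors' benchmarking code: the function names `iou` and `mean_iou`, the representation of masks as sets of pixel coordinates, the class labels, and the toy pixel values are all assumptions made for this example.

```python
def iou(pred: set, truth: set) -> float:
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = pred | truth
    if not union:
        # Both masks empty: treated as perfect agreement (a convention assumed here).
        return 1.0
    return len(pred & truth) / len(union)


def mean_iou(pred_by_class: dict, truth_by_class: dict) -> float:
    """Average the per-class IoU over every class present in either input."""
    classes = sorted(set(pred_by_class) | set(truth_by_class))
    if not classes:
        raise ValueError("at least one class is required")
    scores = [
        iou(pred_by_class.get(c, set()), truth_by_class.get(c, set()))
        for c in classes
    ]
    return sum(scores) / len(scores)


# Hypothetical toy masks, not data from the manuscript.
predicted = {
    "nucleus": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "micronucleus": {(5, 5)},
}
ground_truth = {
    "nucleus": {(0, 0), (0, 1), (1, 0)},
    "micronucleus": {(5, 5), (5, 6)},
}

print(mean_iou(predicted, ground_truth))  # (3/4 + 1/2) / 2 = 0.625
```

      Because small objects such as micronuclei contain few pixels, a single missed pixel changes their IoU far more than it would for a nucleus, which is one reason per-class scores can be more informative than a single average.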

    4. Reviewer #3 (Public review):

      Summary:

      The authors develop automated methods to visually identify micronuclei (MN) and MN-containing cells. The authors then use these methods to isolate MN-containing RPE-1 cells post-photoactivation and analyze transcriptional changes in cells with and without micronuclei. The authors find that RPE-1 cells with MN have similar transcriptomic changes as aneuploid cells and that MN rupture does not lead to vast changes in the transcriptome.

      Strengths:

      The authors develop a method that allows for automating measurements and analysis of micronuclei. This has been something that the field has been missing for a long time. Using such a method has the potential to greatly enhance the field's ability to analyze micronuclei and understand the downstream consequences. The authors also develop a method to identify cells with micronuclei in real-time, mark them using photoconversion, and then isolate them via cell sorting, which could change the way we isolate and study MN-containing cells, and the scale at which we do it. The authors use this method to look at the transcriptome. This method is very powerful as it can allow for the separation of a heterogenous population and subsequent analysis with a much higher sample number than previously possible.

      Weaknesses:

      The major weakness of this paper is the transcriptomic analysis of MN. There is in general large variance between replicates in experiments looking at cells with ruptured versus intact micronuclei. This limits our ability to assess whether the lack of changes is due to a true absence of differences between these populations or to experimental limitations. More transcriptomic analysis will be necessary to fully understand the downstream consequences of MN rupture.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      DiPeso et al. develop two tools to (i) classify micronucleated (MN) cells, which they call VCS MN, and (ii) segment micronuclei and nuclei with MNFinder. They then use these tools to identify transcriptional changes in MN cells.

      The strengths of this study are:

      (1) Developing highly specialized tools to speed up the analysis of specific cellular phenomena such as MN formation and rupture is likely valuable to the community and neglected by developers of more generalist methods.

      (2) A lot of work and ideas have gone into this manuscript. It is clearly a valuable contribution.

      (3) Combining automated analysis, single-cell labeling, and cell sorting is an exciting approach to enrich phenotypes of interest, which the authors demonstrate here.

      Weaknesses:

      (1) Images and ground truth labels are not shared for others to develop potentially better analysis methods.

      We regret this omission and thank the reviewer for pointing it out. Both the images and ground truth labels for VCS MN and MNFinder are now available on the lab’s github page and described in the README.txt files. VCS MN: https://github.com/hatch-lab/fast-mn. MNFinder: https://github.com/hatch-lab/mnfinder.

      (2) Evaluations of the methods are often not fully explained in the text.

      The text has been extensively updated to include a full description of the methods and choices made to develop the VCS MN and MNFinder image segmentation modules.

      (3) To my mind, the various metrics used to evaluate VCS MN reveal it not to be terribly reliable. Recall and PPV hover in the 70-80% range except for the PPV for MN+. It is what it is - but do the authors think one has to spend time manually correcting the output or do they suggest one uses it as is?

      VCS MN attempts to balance precision and recall with speed to reduce the fraction of MN that change state from intact to ruptured within a single cell cycle during a live-cell isolation experiment. In addition, we chose to prioritize inclusion of small MN adjacent to the nucleus in our positive calls. This meant that there were more false positives (lower PPV) than obtained by other methods but allowed us to include this highly biologically relevant class of MN in our MN+ population. Thus, for a comprehensive understanding of the consequences of MN formation and rupture, we recommend using the finder as is. However, for other visual cell sorting applications where a small number of highly pure MN positive and negative cells is preferred, such as clonal outgrowth or metastasis assays, we would recommend using the slower, but more precise, MNFinder to get higher precision at the cost of temporal resolution. In addition, MNFinder, with its higher flexibility and object coverage, is recommended for all fixed cell analyses.
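      The trade-off described above between false positives and missed micronuclei can be made concrete with a short sketch of how positive predictive value (PPV, also called precision) and recall are derived from classification counts. This is an illustration under stated assumptions, not the evaluation code used for VCS MN or MNFinder: the function name `ppv_and_recall` and the example counts are hypothetical.

```python
def ppv_and_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (PPV, recall) from true-positive, false-positive, and false-negative counts.

    PPV    = TP / (TP + FP): the fraction of positive calls that are correct.
    Recall = TP / (TP + FN): the fraction of true positives that were found.
    A zero denominator yields 0.0 (a convention assumed for this sketch).
    """
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return ppv, recall


# Hypothetical counts, not taken from the manuscript.
strict = ppv_and_recall(tp=60, fp=5, fn=40)      # few false positives, many misses
inclusive = ppv_and_recall(tp=80, fp=20, fn=20)  # more false positives, fewer misses

print(f"strict:    PPV={strict[0]:.2f}, recall={strict[1]:.2f}")
print(f"inclusive: PPV={inclusive[0]:.2f}, recall={inclusive[1]:.2f}")
```

      In this toy comparison, the inclusive setting accepts a lower PPV in exchange for higher recall, which mirrors the choice to include small MN adjacent to the nucleus in the positive calls.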

      Reviewer #2 (Public review):

      Summary:

      Micronuclei are aberrant nuclear structures frequently seen following the missegregation of chromosomes. The authors present two image analysis methods, one robust and another rapid, to identify micronuclei (MN) bearing cells. The authors induce chromosome missegregation using an MPS1 inhibitor to check their software outcomes. In missegregation-induced cells, the authors do not distinguish cells that have MN from those that have MN with additional segregation defects. The authors use RNAseq to assess the outcomes of their MN-identifying methods: they do not observe a transcriptomic signature specific to MN but find changes that correlate with aneuploidy status. Overall, this work offers new tools to identify MN-presenting cells, and it sets the stage with clear benchmarks for further software development.

      Strengths:

      Currently, there are no robust MN classifiers with a clear quantification of their efficiency across cell lines (mIoU score). The software presented here tries to address this gap. GitHub material (tools, protocols, etc.) provided is a great asset to naive and experienced computational biologists. The method has been tested in more than one cell line. This method can help integrate cell biology and 'omics' studies.

      Weaknesses:

      Although the classifier outperforms available tools for MN segmentation by providing mIoU, it's not yet at a point where it can be reliably applied to functional genomics assays where we expect a range of phenotypic penetrance.

      We agree that the MNFinder module has limitations with regard to the degree of nuclear atypia and cell density that can be tolerated. Based on the recall and PPV values and their consistency across the majority of conditions analyzed, we believe that MNFinder can provide reliable results for MN frequency, integrity, shape, and label characteristics in a functional genomics assay in many commonly used adherent cell lines. We also added a discussion of caveats for these analyses, including the facts that highly lobulated nuclei will have higher false positive rates and that high cell confluency may require additional markers to ensure highly accurate assignment of MN to nuclei.

      Spindle checkpoint loss (e.g., MPS1 inhibition) is expected to cause a variety of nuclear atypia: misshapen, multinucleated, and micronucleated cells. It may be difficult to obtain a pure MN population following MPS1 inhibitor treatment, as many cells are likely to present MN among multinucleated or misshapen nuclear compartments. Given this situation, the transcriptomic impact of MN is unlikely to be retrieved using this experimental design, but this does not negate the significance of the work. The discussion will have to consider the nature, origin, and proportion of MN/rupture-only states - for example, lagging chromatids and unaligned chromosomes can result in different states of micronuclei and also distinct cell fates.

      We appreciate the reviewer’s comments and now quantify the frequency of other nuclear atypias and MN chromosome content in RPE1 cells after 24 h Mps1 inhibition (Fig. S1). In summary, we find only small increases in nuclear atypia, including multinucleate cells, misshapen nuclei, and chromatin bridges, compared to the large increase in MN formation. This contrasts with what is observed when mitosis is delayed using nocodazole or CENPE inhibitors where nuclear atypia is much more frequent. Importantly, after Mps1 inhibition, RPE1 cells with MN were only slightly more likely to have a misshapen nucleus compared to cells without MN (Fig. S1C).

      Interestingly, this analysis showed that the VCS MN pipeline, which uses the Deep Retina segmenter to identify nuclei, has a strong bias against lobulated nuclei and frequently fails to find them (Fig. S2B). Therefore, the cell populations analyzed by RNAseq were largely depleted of highly misshapen nuclei and differences in nuclear atypia frequency between MN+ and MN- cells in the starting population were lost (Fig. S9A, compare to Fig. S1C). This strongly suggests that the transcript changes we observed reflect differences in MN frequency and aneuploidy rather than differences in nuclei morphology.

      We agree with the reviewer that MN rupture frequency and formation, and downstream effects on cell proliferation and DNA damage, are sensitive to the source of the missegregated chromatin. In the revised manuscript we make clear that we chose Mps1 inhibition because it is strongly biased towards whole chromosome MN (Fig. S1E), limiting signal from DNA damage products, including chromosome fragments and chromatin bridges. This provides a baseline to disambiguate the consequences of micronucleation and DNA damage in more complex chromosome missegregation processes, such as DNA replication disruption and irradiation.

      Reviewer #3 (Public review):

      Summary:

      The authors develop a method to visually analyze micronuclei using automated methods. The authors then use these methods to isolate MN post-photoactivation and analyze transcriptional changes in cells with and without micronuclei of RPE-1 cells. The authors observe in RPE-1 cells that MN-containing cells show similar transcriptomic changes as aneuploidy, and that MN rupture does not lead to vast changes in the transcriptome.

      Strengths:

      The authors develop a method that allows for automating measurements and analysis of micronuclei. This has been something that the field has been missing for a long time. Using such a method has the potential to advance micronuclei biology. The authors also develop a method to identify cells with micronuclei in real time and mark them using photoconversion and then isolate them via FACS. The authors use this method to study the transcriptome. This method is very powerful as it allows for the sorting of a heterogenous population and subsequent analysis with a much higher sample number than could be previously done.

      Weaknesses:

      The major weakness of this paper is that the results from the RNA-seq analysis are difficult to interpret as very few changes are found to begin with between cells with MN and cells without. The authors have to use a 1.5-fold cut-off to detect any changes in general. This is most likely due to the sequencing read depth used by the authors. Moreover, there are large variances between replicates in experiments looking at cells with ruptured versus intact micronuclei. This limits our ability to assess if the lack of changes is due to truly not having changes between these populations or experimental limitations. Moreover, the authors use RPE-1 cells which lack cGAS, which may contribute to the lack of changes observed. Thus, it is possible that these results are not consistent with what would occur in primary tissues or just in general in cells with a proficient cGAS/STING pathway.

      We agree with the reviewer’s assessment of the limitations of our RNA-Seq analysis. After additional analysis, we propose an alternative explanation for the lower expression changes we observe in the MN+ and Mps1 inhibitor RNA-Seq experiments. In summary, we find that VCS MN has a strong bias against highly lobulated nuclei that depletes this class of cells from both the bulk analysis and the micronucleated cell populations (Fig. S9A). Based on this result, we propose that our analysis reduces the contribution of nuclear atypia to these transcriptional changes and that nuclear morphology changes are likely a signaling trigger associated with aneuploidy.

      We believe that this finding strengthens our overall conclusion that MN formation and rupture do not cause transcriptional changes, as suppressing the signaling associated with nuclei atypia should increase sensitivity to changes from the MN. However, we cannot completely rule out that MN formation or rupture cause a broad low-level change in transcription that is obscured by other signals in the dataset.

      As to cGAS signaling, several follow-up papers and even the initial studies from the Greenberg lab show that MN rupture does not activate cGAS and does not cause cGAS/STING-dependent signaling in the first cell cycle (see citations and discussion in text). Therefore, we expect that the absence of cGAS in RPE1 cells will have no effect in the first cell cycle but could alter the transcriptional profile after mitosis. Although analysis of RPE1 cGAS+ cells or primary cells in these experiments will be required to definitively address this point, we believe that our interpretation of our RNAseq results is sufficiently backed up by the literature to warrant our conclusion that MN formation and rupture do not induce a transcriptional response in the first cell cycle.

      Reviewer #1 (Recommendations for the authors):

      I do not recommend additional experimental or computational work. Instead, I just recommend adapting the claims of the manuscript to what has been done. I am just asking for further clarification and minor rewriting.

      (1) The manuscript is written like a molecular biology paper with sparse explanations of the authors' reasoning, especially in the development of their algorithms. I was often lost as to why they did things in one way or another.

      The revised manuscript has thorough explanations and additional data and graphics defining how and why the VCS MN and MNFinder modules were developed. We hope that this clears up many of the questions the reviewer had and appreciate their guidance on making it more readable for scientists from different backgrounds.

      (2) Evaluations of their method are often not fully explained, for example:

      "On average, 75% of nuclei per field were correctly segmented and cropped."

      "MN segments were then assigned to 'parent' nuclei by proximity, which correctly associated 97% of MN."

      Were there ground truth images and labels created? How many? For example, I don't know how the authors could even establish a ground-truth for associating MNs to nuclei if MNs happened to be almost equidistant between two nuclei in their images.

      I suggest a separate subsection early in the Results section where the underlying imaging data + labels are presented.

      We added new sections to the text and figures at the beginning of the VCS MN and MNFinder subsections (Fig. S2 and Fig. S5) with specific information about how ground truth images and labels were generated for both modules and how these were broken up for training, validation, and testing.

      We also added information and images to explain how ground truth MN/nucleus associations were derived. In summary, we took advantage of the fact that 2xDendra-NLS is present at low levels in the cytoplasm to identify cell boundaries. Combined with a subconfluent cell population, this allowed us to unambiguously group MN and nuclei for an estimated 98% of MN. These identifications were used to generate ground truth labels and analyze how well proximity defines MN/nuclei groups (Figs. S1 and S2).

      (3) Overall, I find the sections long and more subtitles would help me better navigate the manuscript.

      Where possible, we have added subtitles.

      (4) Everything following "To train the model, H2B channel images were passed to a Deep Retina neural net ..." is fully automated, it seems to me. Thus, there seems to be no human intervention to correct the output before it is used to train the neural network. Therefore, I do not understand why a neural network was trained at all if the pipeline for creating ground truth labels worked fully automatically. At least, the explanations are insufficient.

      We apologize for the initial lack of clarity in the text and included additional details in the revision. We used the Deep Retina segmenter to crop the raw images to areas around individual nuclei to accelerate ground truth labeling of MN. A trained user went through each nucleus crop and manually labeled pixels belonging to MN to generate the ground truth dataset for training, validation, and imaging in VCS MN (Fig. S2A).

      (5) To my mind, the various metrics used to evaluate VCS MN reveal it not to be terribly reliable. Recall and PPV hover in the 70-80% range except for the PPV for MN+. It is what it is - but do the authors think one has to spend time manually correcting the output or do they suggest one uses it as is? I understand that for bulk transcriptomics, enrichment may be sufficient but for many other questions, where the wrong cell type could contaminate the population, it is not.

      Remarks in the Results section on what the various accuracies mean for different applications would be good (so one does not need to wait for the Discussion section).

      One of the strengths of the visual cell sorting system is that any image analysis pipeline can be used with it. We used VCS MN for the transcriptomics experiment, but for other applications a user could run visual cell sorting in conjunction with MNFinder for increased purity while maintaining a reasonable recall or use a pre-existing MN segmentation program that gives 100% purity but captures only a specific subgroup of micronucleated cells (e.g. PIQUE). 

      To maintain readability, especially with the expansion of the results sections, we kept the discussion of how we envision using visual cell sorting for other MN-based applications in the discussion section.

      (6) I am confused about what "cell" is referring to in much of the manuscript. Is it the nucleus + MNs only? Is it the whole cell, which one would ordinarily think it is? If so, are there additional widefield images, where one can discern cell boundaries? I found the section "MNFinder accurately ..." very hard to read and digest for this reason and other ambiguous wording. I suggest the authors take a fresh look at their manuscript and see whether the text can be improved for clarity. I did not find it an easy read overall, especially the computational part.

      After re-examining how “cell” was used, we updated the text to limit its use to the MNFinder arm tasked with identifying MN-nucleus associations where the convex hull defined by these objects is used to determine the “cell” boundary. In all other cases we have replaced cell with “nucleus” because, as the reviewer points out, that is what is being analyzed and converted. We hope this is clearer.

      (7) Post-FACS PPVs are not that great (Figure 3c). It depends on the question one wants to answer whether ~70% PPV is good enough. Again, would be good to comment on.

      We added discussion of this result to the revision. In summary, a likely reason for the reduced PPV is that, although we maintain the cells in buffer with a Cdk1 inhibitor, we know that some proportion of the cells go through mitosis post-sorting. Since MN are frequently reincorporated into the nucleus after mitosis (Hatch et al., 2013; Zhang et al., 2015), we expect this to reduce the MN+ population. Thus, we expect that the PPV in the RNAseq population is higher than what we can measure by plating post-sorted cells and analyzing them later.

      (8) I am thoroughly confused as to why the authors claim that their system works in the "absence of genetic perturbations" and why they emphasize the fact that their cells are non-transformed: They still needed a fluorescent label and they induce MNs with a chemical Mps1 inhibitor. (The latter is not a genetic manipulation, of course, but they still need to enrich MNs somehow. That is, their method has not been tested on a cell population in which MNs occur naturally, presumably at a very low rate, unless I missed something.) A more careful description of the benefits of their method would be good.

      We apologize for the confusion on these points and hope this is clarified in the revision. We were comparing our system, which can be made using transient transfection, if desired, to current tools that disambiguate aneuploidy and MN formation by deleting parts of chromosomes or engineering double strand breaks with CRISPR to generate single chromosome-specific missegregation events. Most of these systems require transformed cancer cells to obtain high levels of recombination. In contrast, visual cell sorting can isolate micronucleated cells from any cell line that can exogenously express a protein, including primary cells and non-transformed cells like RPE1s.

      Other minor points:

      (1) The authors should not refer to "H2B channels" but to "H2B-emiRFP703 channels". It may seem obvious to the authors but for someone reading the manuscript for the very first time, it was not. I was not sure whether there were additional imaging modalities used for H2B/nucleus/chromatin detection before I went back and read that only fluorescence images of H2B-emiRFP703 were used. To put it another way, the authors are detecting fluorescence, not histones -- unless I misunderstood something.

      To address this point, we altered the text to read “H2B-emiRFP703” when discussing images of this construct. For MNFinder some images were of cells expressing H2B-GFP, which has also been clarified.

      (2) If the level of zoom on my screen is such that I can comfortably read the text, I cannot see much in the figure panels. The features that I should be able to see are the size of a title. The image panels should be magnified.

      In the revision, the images are appended to the end at full resolution to overcome this difficulty. Thank you for your forbearance.

      Reviewer #2 (Recommendations for the authors):

      The methods are adequately explained. The Results text narrating experiments and data analysis is clear. Interpretation of a few results could be clarified and strengthened as explained below.

      (1) RNAseq experiments are a good proof of principle. To strengthen their interpretation in Figures 4 and 6, I would recommend the authors cite published work on checkpoint/MPS1 loss-induced chromosome missegregation (PMID: 18545697, PMID: 33837239, PMC9559752) and consider in their discussion the 'origin' and 'proportion' of micronucleated cells and irregularly shaped nuclei expected in RPE1 lines. This will help interpret Figure 6 findings on aneuploidy signature accurately. Not being able to see an MN-specific signature could be due to the way the biological specimen is presented with a mixture of cells with 'MN only' or 'rupture' or 'MN along with misshapen nuclei'. These features may all link to aneuploidy rather than 'MN' specifically.

      We appreciate the reviewer’s suggestion and added a new analysis of nuclear atypia after Mps1 inhibition in RPE1 cells to Fig. S1. Overall, we found that Mps1 inhibition significantly, but modestly, increased the proportion of misshapen nuclei and chromatin bridges. Multinucleate cells were so rare that instead of giving them their own category we included them in “misshapen nuclei.” These results are consistent with images of Mps1i-treated RPE1 cells from He et al. 2019 and Santaguida et al. 2017 and distinct from the stronger changes in nuclear morphology observed after delaying mitosis by nocodazole or CENPE inhibition.

      We also found that the Deep Retina segmenter used to identify nuclei in VCS MN had a significant bias against highly lobulated nuclei (Fig. S2B) that led to misshapen nuclei being largely excluded from the RNAseq analyses. As a result we found no enrichment of misshapen nuclei, chromatin bridges, or dead/mitotic nuclear morphologies in MN+ compared to MN- nuclei in our RNASeq experiments (Fig. S9A).

      (2) As the authors clarify in the response letter, one round of ML is unlikely to result in fully robust software; additional rounds of ML with other markers will make the work robust. It will be useful to indicate other ML image analysis tools that have improved through such reiterations. They could use reviews on challenges and opportunities using ML approaches to support their statement. Also in the introduction, I would recommend labelling as 'rapid' instead of 'rapid and precise' method.

      We updated the text to reference review articles that discuss the benefit of additional training for increasing ML accuracy and changed the text to “rapid.”

      (3) The lack of live-cell studies does not allow the authors to distinguish the origin of MN (lagging chromatids or unaligned chromosomes). As explained in 1, considering these aspects in discussion would strengthen their interpretation. Live-cell studies can help reduce the dependencies on proximity maps (Figure S2).

      The revised text includes new references and data (Fig. S1E) demonstrating that Mps1 inhibition strongly biases towards whole chromosome missegregation and that MN are most likely to contain a single centromere positive chromosome rather than chromatin fragments or multiple chromosomes.

      (4) Mean Intersection over Union (mIOU) is a good measure to compare outcomes against ground truth. However, the mIOU is relatively low (Figure 2D) for HeLa-based functional genomics applications. It will help to discuss mIOU for other classifiers (non-MN classifiers) so that they can be used as a benchmark (this is important since the authors state in their response that they are the first to benchmark an MN classifier). There are publications for mitochondria, cell cortex, spindle, nuclei, etc. where IOU has been discussed.

      We added references to classifiers for other small cellular structures. We also evaluated major sources of error in MNFinder and found that false negatives are enriched in very small MN (3 to 9 pixels, or about 0.4 µm² – 3 µm², Fig. S6B). A similar result was obtained for VCS MN (Fig. S3B). Because small changes in the number of pixels identified in small objects can have outsized effects on mIoU scores, we suspect that this is exerting downward pressure on the mIoU value. Based on the PPV and recall values we identified, we believe that MNFinder is robust enough to use for functional genomics and screening applications with reasonable sample sizes.
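
      The sensitivity of IoU to object size can be illustrated with a minimal sketch. It assumes square objects and a one-pixel shift between the ground-truth mask and the predicted mask; the shapes, sizes, and function names are hypothetical and are not taken from MNFinder.

```python
def iou(truth: set, pred: set) -> float:
    """Intersection over union of two pixel sets."""
    union = truth | pred
    return len(truth & pred) / len(union) if union else 1.0


def square(side: int, offset: int = 0) -> set:
    """Pixel coordinates of a side x side square, shifted by `offset` columns."""
    return {(r, c + offset) for r in range(side) for c in range(side)}


# Hypothetical masks: the prediction is the ground truth shifted by one pixel.
small_iou = iou(square(3), square(3, offset=1))    # 9 px object
large_iou = iou(square(30), square(30, offset=1))  # 900 px object

print(f"3x3 object:   IoU = {small_iou:.2f}")
print(f"30x30 object: IoU = {large_iou:.2f}")
```

      Under these assumptions the same one-pixel error halves the IoU of the 9-pixel object while leaving the larger object above 0.9, so a dataset rich in very small objects would be expected to have a lower mean IoU.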

      (5) Figure 5 figure legend title is an overinterpretation. MN and rupture-initiated transcriptional changes could not be isolated with this technique where several other missegregation phenotypes are buried (see point 1 above).

      We decided to keep the figure legend title based on our analysis of known missegregation phenotypes in Figs. S1 and S9, which shows that there is no difference in major classes of nuclear atypia between MN+ and MN- populations in this analysis. Although we cannot rule out that other correlated changes exist, we believe that the title represents the most parsimonious interpretation.

      Minor comments

      (1) The sentence in the introduction needs clarification and reference. "However, these interventions cause diverse "off-target" nuclear and cellular changes, including chromatin bridges, aneuploidy, and DNA damage." Off-target may not be the correct description since inhibiting MPS1 is expected to cause a variety of problems based on its role as a master kinase in multiple steps of the chromosome segregation process. Consider one of the references in point 1 for a detailed live-cell view of MPS1 inhibitor outcomes.

      We have changed “off-target” to “additional” for clarity.

      (2) In Figure 3 or S3, did the authors notice any association between the cell cycle phase and MN or rupture presence? Is this possible to consider based on FACS outcomes or nuclear shapes?

      Previous work by our lab and others has shown that MN rupture frequency increases during the cell cycle (Hatch et al., 2013; Joo et al., 2023). Whether this is stochastic or regulated by the cell cycle may depend on what chromosome is in the MN (Mammel et al., 2021) and likely the cell line. Unfortunately, the H2B-emiRFP703 fluorescence in our population is too variable to identify cell cycle stage from FACS or nuclear fluorescence analysis.

      (3) Figure 5 - Please explain "MA plot".

      An MA plot, or log fold-change (M) versus average (A) gene expression, is a way to visualize differentially expressed genes between two conditions in an RNASeq experiment and is used as an alternative to volcano plots. We chose MA plots for our paper because most of the expression changes we observed were small and of similar significance. Compared to a volcano plot, the MA plot spreads out such data and allows better visualization of trends across the population.
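
      For readers unfamiliar with the plot, the two coordinates can be sketched as follows. This uses the classical definition (M as the difference and A as the mean of log2 expression) with a pseudocount of 1 to avoid taking the log of zero; the pseudocount, the function name, and the example counts are assumptions for illustration and do not describe the analysis pipeline used in the manuscript.

```python
import math


def ma_values(control: float, treated: float, pseudocount: float = 1.0) -> tuple[float, float]:
    """Return (M, A) for one gene from normalized expression in two conditions."""
    log_c = math.log2(control + pseudocount)
    log_t = math.log2(treated + pseudocount)
    m = log_t - log_c          # M: log2 fold change (y-axis)
    a = 0.5 * (log_t + log_c)  # A: average log2 expression (x-axis)
    return m, a


# Hypothetical normalized counts for a single gene.
m, a = ma_values(control=7, treated=31)
print(f"M = {m:.1f}, A = {a:.1f}")
```

      Each gene contributes one (A, M) point, so genes with small fold changes cluster around M = 0 across the full range of expression levels.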

      (4) Page 7: "our results strongly suggest that protein expression changes in MN+ and rupture+ cells are driven mainly by increased aneuploidy rather than cellular sensing of MN formation and rupture.". This is an overstatement considering the mIOU limits of the software tool and the non-exclusive nature of MN in their samples.

      We agree that we cannot rule out that an unknown masking effect is inhibiting our ability to observe small broad changes in transcription after MN formation or rupture. However, we believe we have minimized the most likely sources of masking effects, including nuclear atypia and large scale aneuploidy differences, and thus our interpretation is the most likely one.

      Reviewer #3 (Recommendations for the authors):

      Overall, the authors need to explain their methods better, define some technical terms used, and more thoroughly explain the parameters and rationale used when implementing these two protocols for identifying micronuclei; primarily as this is geared toward a more general audience that does not necessarily work with machine learning algorithms.

      (1) A clearer description in the methods as to how accuracy was calculated. Were micronuclei counted by hand or another method to assess accuracy?

      We significantly expanded the section on how the machine learning models were trained and tested, including how sensitivity and specificity metrics were calculated, in both the results and the methods sections. The code used to compare ground truth labels to computed masks is also now included in the MNFinder module available on the lab GitHub page.

      (2) Define positive predictive value.

      The text now says “the positive predictive value (PPV, the proportion of true positives, i.e. specificity) and recall (the proportion of MN found by the classifier, i.e. sensitivity)…”.
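
      As a minimal sketch of how these two metrics are conventionally computed from object-level counts (PPV is also commonly called precision), consider the following. The counts are hypothetical and the function is not part of MNFinder or VCS MN.

```python
def ppv_and_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Return (PPV, recall) from counts of true positives, false positives, and false negatives."""
    called = true_pos + false_pos   # everything the classifier labeled as MN
    actual = true_pos + false_neg   # every MN present in the ground truth
    ppv = true_pos / called if called else 0.0
    recall = true_pos / actual if actual else 0.0
    return ppv, recall


# Hypothetical counts, for illustration only.
ppv, recall = ppv_and_recall(true_pos=80, false_pos=20, false_neg=10)
print(f"PPV = {ppv:.2f}, recall = {recall:.2f}")
```

      In this example 80 of the 100 objects called as MN are real (PPV 0.80), and 80 of the 90 real MN are found (recall about 0.89).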

      (3) Why is it a problem to use the VCS MN at higher magnifications where undersegmentation occurs? What do the authors mean by diminished performance (what metrics are they using for this?).

      We have included a representative image and calculated mIoU and recall for 40x magnification images analyzed by MNFinder after rescaling in Fig. 2A. In summary, VCS MN only correctly labeled a few pixels in the MN, which was sufficient to call the adjacent nucleus “MN+” but not sufficient for other applications, such as quantifying MN area. In addition, VCS MN did much worse at identifying all the MN in 40x images with a recall, or sensitivity, metric of 0.36. We are not sure why. Developing MNFinder provided a module that was well suited to quantify MN characteristics in fixed cell images, an important use case in MN biology.

      (4) The authors should compare MN that are analyzed and not analyzed using these methods and define parameters. Is there a size limitation? Closeness to the main nucleus?

      We added two new figures defining what contributes to module error for both VCS MN (Fig. S3) and MNFinder (Fig. S6). For VCS MN, false negatives are enriched in very large or very small MN and tend to be dimmer and farther from the nucleus than true positives. False positives are largely misclassification of small dim objects in the image as MN. For MNFinder, the most missed class of MN are very small ones (3-9 px in area) and the majority of false positives are misclassifications of elongated nuclear blebs as MN.

      (5) Are there parameters in how confluent an image must be to correctly define that the micronucleus belongs to the correct cell? The authors discussed that this was calculated based on predicted distance. However, many factors might affect proper calling on MN. And the authors should test this by staining for a cytosolic marker and calculating accuracy.

      We updated the text with more information about how the cytoplasm was defined using leaky 2x-Dendra2-NLS signal to analyze the accuracy of MN/nucleus associations (Fig. S2G-H). In addition, we quantified cell confluency and distance to the first and second nearest neighbor for each MN in our training and testing image datasets. We found that, as anticipated, cells were imaged at subconfluent concentrations with most fields having a confluency around 30% cell coverage (Fig. S2E) and that the average difference in distance between the closest nucleus to an MN and the next closest nucleus was 3.3 fold (Fig. S2F). We edited the discussion section to state that the ability of MN/nuclear proximity to predict associations at high cell confluencies would have to be experimentally validated.
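
      The proximity rule and the nearest-neighbor distance comparison described above can be sketched as follows. This assumes centroid-to-centroid Euclidean distances in arbitrary units; how distances were actually measured (for example, edge-to-edge) is not stated here, and the function name and coordinates are hypothetical.

```python
import math


def assign_by_proximity(mn_xy, nuclei_xy):
    """Return (index of the nearest nucleus, second-nearest / nearest distance ratio)."""
    dists = sorted((math.dist(mn_xy, n), i) for i, n in enumerate(nuclei_xy))
    nearest_d, nearest_i = dists[0]
    if len(dists) < 2 or nearest_d == 0:
        return nearest_i, math.inf  # no competing nucleus, or MN sits on the centroid
    return nearest_i, dists[1][0] / nearest_d


# Hypothetical coordinates: one MN and three candidate parent nuclei.
idx, ratio = assign_by_proximity((0.0, 0.0), [(3.0, 4.0), (6.0, 8.0), (30.0, 40.0)])
print(f"assigned to nucleus {idx}; next-nearest nucleus is {ratio:.1f}x farther away")
```

      A ratio close to 1 flags an MN that is nearly equidistant between two nuclei, which is the ambiguous case expected to become more common at high confluency.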

      (6) The authors measure the ratio of Dendra2(Red) v. Dendra2 (Green) in Figure 3B to demonstrate that photoconversion is stable. This measurement, to me, is confusing, as in the end, the authors need to show that they have a robust conversion signal and are able to isolate these data. The authors should directly demonstrate that the Red signal remains by analyzing the percent of the Red signal compared to time point 0 for individual cells.

      We found a bulk analysis to be more powerful than trying to reidentify individual cells due to how much RPE1 cells move during the 4 and 8 hours between image acquisitions. In addition, we sort on the ratio between red and green fluorescence per cell, rather than the absolute fluorescence, to compensate for variation in 2xDendra-NLS protein expression between cells. Therefore, demonstrating that distinct ratios remained present throughout the time course is the most relevant to the downstream analysis.

      To address the reviewer’s concern, we replotted the data in Fig. 3B to highlight changes over time in the raw levels of red and green Dendra fluorescence (Fig. S7D). As expected, we see an overall decrease in red fluorescence intensity, and complementary increase in green fluorescence intensity, over 8 hours, likely due to protein turnover. We also observe an increase in the number of nuclei lacking red fluorescence. This is expected since the well was only partially converted and we expect significant numbers of unconverted cells to move into the field between the first image and the 8 hour image.
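
      The rationale for gating on the red:green ratio rather than on absolute red fluorescence can be illustrated with a minimal sketch. The intensity values are hypothetical, background fluorescence is ignored, and the function is not part of the visual cell sorting software.

```python
def red_green_ratio(red: float, green: float) -> float:
    """Ratio of photoconverted (red) to unconverted (green) Dendra2 signal for one cell."""
    return red / green if green > 0 else float("inf")


# Hypothetical (red, green) intensities in arbitrary units.
cells = {
    "dim, converted": (50.0, 50.0),
    "bright, converted": (500.0, 500.0),
    "bright, unconverted": (50.0, 950.0),
}

ratios = {name: red_green_ratio(r, g) for name, (r, g) in cells.items()}
for name, value in ratios.items():
    print(f"{name}: red = {cells[name][0]:.0f}, ratio = {value:.2f}")
```

      In this example the dim converted cell and the bright unconverted cell have the same absolute red signal, but only the ratio separates them, while the two converted cells share a ratio despite a tenfold difference in expression.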

      (7) The authors isolate and subsequently use RNA-sequencing to identify changes between Mps1i and DMSO-treated cells. One concern is that even with the less stringent cut-off of 1.5 fold there is a very small change between DMSO and MPS1i treated cells, with only 63 genes changing, none of which were affected above a 2-fold change. The authors should carefully address this, including why their dataset sees changes in many more pathways than in the He et al. and Santaguida et al. studies. Is this due to just having a decreased cut-off?

      The reviewer correctly points out that we observed an overall reduction in the strength of gene expression changes between our dataset of DMSO versus Mps1i treated RPE1 cells compared to similar studies. We suggest a couple of reasons for this. One is that the log2 fold changes observed in the other studies are not huge and vary between 2.5 and -3.8 for He et al., 3.3 and -2.3 for Santaguida et al., and -0.8 and 1.6 for our study. This variability is within a reasonable range for different experimental conditions and library prep protocols. A second is that our protocol minimizes a potential source of transcriptional change – nuclear lobulation – that is present in the other datasets.

      For the pathway analysis we did not use a fold-change cut-off for any data set, instead opting to include all the genes found to be significantly different between control and Mps1i treated cells for all three studies. Our read-depth was higher than that of the two published experiments, which could contribute to an increased DEG number. However, we hypothesize that our identification of a broader number of altered pathways most likely arises from increased sensitivity due to the loss of covering signal from transcriptional changes associated with increased nuclear atypia. Additional visual cell sorting experiments sorting on misshapen nuclei instead of MN would allow us to determine the accuracy of this hypothesis.

      (8) Moreover, clustering (in Figure 5E) of the replicates is a bit worrisome as the variances are large and therefore it is unclear if, with such large variance and low screening depth, one can really make such a strong conclusion that there are no changes. The authors should prove that their conclusion that rupture does not lead to large transcriptional changes, is not due to the limitations of their experimental design.

      We agree with the reviewer that additional rounds of RNAseq would improve the accuracy of our transcriptomic analysis and could uncover additional DEGs. However, we believe the overall conclusion to be correct based on the results of our attempt to validate changes in gene expression by immunofluorescence. We analyzed two of the most highly upregulated genes in the ruptured MN dataset, ATF3 and EGR1. Although we saw a statistically significant increase in ATF3 intensity between cells without MN and those with ruptured MN, the fold change was so small compared to our positive control (100x less) that we believe it is more consistent with a small increase in the probability of aneuploidy than with a specific signature of MN rupture.

      (9) The authors also need to address the fact that they are using RPE-1 cells more clearly and that the lack of effect in transcriptional changes may be simply due to the loss of cGAS-STING pathway (Mackenzie et al., 2017; Harding et al., 2017; etc.).

      As we discuss above in the public comments section, the literature is clear that MN do not activate cGAS in the first cell cycle after their formation, even upon rupture. Therefore, we do not expect any changes in our results when applied to cGAS-competent cells. However, this expectation needs to be experimentally validated, which we plan to address in upcoming work.

    1. eLife Assessment

      This valuable study introduces a new method for detecting RNA modification. Since it does not rely on chemical modification of RNA, which often results in RNA degradation and therefore loss of RNA molecules, it complements other approaches for detecting RNA modification, and it might be of particular interest for sites where modifications are found in only a minority of interrogated molecules. The information provided is incomplete, however, to allow for comparison with other methods, since there is uncertainty regarding false positive and false negative rates.

    2. Reviewer #2 (Public review):

      The fledgling field of epitranscriptomics has encountered various technical roadblocks with implications as to the validity of early epitranscriptomics mapping data. As a prime example, the low specificity of (supposedly) modification-specific antibodies for the enrichment of modified RNAs, has been ignored for quite some time and is only now recognized for its dismal reproducibility (between different labs), which necessitates the development of alternative methods for modification detection. Furthermore, early attempts to map individual epitranscriptomes using sequencing-based techniques are largely characterized by the deliberate avoidance of orthogonal approaches aimed at confirming the existence of RNA modifications that have been originally identified.

      Improved methodology, the inclusion of various controls, and better mapping algorithms as well as the application of robust statistics for the identification of false-positive RNA modification calls have allowed revisiting original (seminal) publications whose early mapping data allowed making hyperbolic claims about the number, localization and importance of RNA modifications, especially in mRNA. Besides the existence of m6A in mRNA, the detectable incidence of RNA modifications in mRNAs has drastically dropped.

      As for m5C, the subject of the manuscript submitted by Zhou et al., its identification in mRNA goes back to Squires et al., 2012 reporting on >10.000 sites in mRNA of a human cancer cell line, followed by intermittent findings reporting on pretty much every number between 0 to > 100.000 m5C sites in different human cell-derived mRNA transcriptomes. The reason for such discrepancy is most likely of a technical nature. Importantly, all studies reporting on actual transcript numbers that were m5C-modified relied on RNA bisulfite sequencing, an NGS-based method, that can discriminate between methylated and non-methylated Cs after chemical deamination of C but not m5C. RNA bisulfite sequencing has a notoriously high background due to deamination artifacts, which occur largely due to incomplete denaturation of double-stranded regions (denaturing-resistant) of RNA molecules. Furthermore, m5C sites in mRNAs have now been mapped to regions that have not only sequence identity but also structural features of tRNAs. Various studies revealed that the highly conserved m5C RNA methyltransferases NSUN2 and NSUN6 do not only accept tRNAs but also other RNAs (including mRNAs) as methylation substrates, which in combination account for most of the RNA bisulfite-mapped m5C sites in human mRNA transcriptomes. Is m5C in mRNA only a result of the Star activity of tRNA or rRNA modification enzymes, or is their low stoichiometry biologically relevant?

      In light of the short-comings of existing tools to robustly determine m5C in transcriptomes, other methods, like DRAM-seq, aiming to map m5C independently of ex situ RNA treatment with chemicals, are needed to arrive at a more solid "ground state", from which it will be possible to state and test various hypotheses as to the biological function of m5C, especially in lowly abundant RNAs such as mRNA.

      Importantly, the identification of >10.000 m5C-containing sites through DRAM-Seq increases the number of potential m5C marks in human cancer cells from a couple of hundred (after rigorous post-hoc analysis of RNA bisulfite sequencing data) by orders of magnitude. This raises the question of whether the application of these editing tools results in editing artefacts that overstate the number of actual m5C sites in the human cancer transcriptome.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #2:

      (1) The use of two m<sup>5</sup>C reader proteins is likely a reason for the high number of edits introduced by the DRAM-Seq method. Both ALYREF and YBX1 are ubiquitous proteins with multiple roles in RNA metabolism including splicing and mRNA export. It is reasonable to assume that both ALYREF and YBX1 bind to many mRNAs that do not contain m<sup>5</sup>C.

      To substantiate the authors' claim that ALYREF or YBX1 binds m<sup>5</sup>C-modified RNAs to an extent that would allow distinguishing its binding to non-modified RNAs from binding to m<sup>5</sup>C-modified RNAs, it would be recommended to provide data on the affinity of these, supposedly proven, m<sup>5</sup>C readers to non-modified versus m<sup>5</sup>C-modified RNAs. To do so, this reviewer suggests performing experiments as described in Slama et al., 2020 (doi: 10.1016/j.ymeth.2018.10.020). However, using dot blots, as in so many published studies, to show modification-specific antibody or protein binding is insufficient as an argument because no antibody, nor protein, encounters nanograms to micrograms of a specific RNA identity in a cell. This issue remains a major caveat in all studies using so-called RNA modification reader proteins as bait for detecting RNA modifications in epitranscriptomics research. It becomes a pertinent problem if used as a platform for base editing similar to the work presented in this manuscript.

      The authors have tried to address the point made by this reviewer. However, rather than performing an experiment with recombinant ALYREF-fusions and m<sup>5</sup>C-modified versus unmodified RNA oligos for testing the enrichment factor of ALYREF in vitro, the authors resorted to citing two manuscripts. One manuscript is cited by everybody when it comes to ALYREF as m<sup>5</sup>C reader, however none of the experiments have been repeated by another laboratory. The other manuscript is reporting on YBX1 binding to m<sup>5</sup>C-containing RNA and mentions PAR-CLiP experiments with ALYREF, the details of which are nowhere to be found in doi: 10.1038/s41556-019-0361-y. Furthermore, the authors have added RNA pull-down assays that should substitute for the requested experiments. Interestingly, Figure S1E shows that ALYREF binds equally well to unmodified and m<sup>5</sup>C-modified RNA oligos, which contradicts doi:10.1038/cr.2017.55, and supports the conclusion that wild-type ALYREF is not a specific m<sup>5</sup>C binder. The necessity of always including overexpression of ALYREF-mut in parallel DRAM experiments makes the developed method better controlled but not easy to handle (expression differences of the plasmid-driven proteins etc.).

      Thank you for pointing this out. First, we would like to correct our previous response: the binding ability of ALYREF to m<sup>5</sup>C-modified RNA was initially reported in doi: 10.1038/cr.2017.55, (and not in doi: 10.1038/s41556-019-0361-y), where it was observed through PAR-CLIP analysis that the K171 mutation weakens its binding affinity to m<sup>5</sup>C -modified RNA.

      Our previous experimental approach was not optimal: the protein concentration in the INPUT group was too high, leading to overexposure in the experimental group. Additionally, we did not conduct a quantitative analysis of the results at that time. In response to your suggestion, we performed RNA pull-down experiments with YBX1 and ALYREF, rather than with the pan-DRAM protein, to better validate and reproduce the previously reported findings. Our quantitative analysis revealed that both ALYREF and YBX1 exhibit a stronger affinity for m<sup>5</sup>C-modified RNAs. Furthermore, mutating the key amino acids involved in m<sup>5</sup>C recognition significantly reduced the binding affinity of both readers. These results align with previous studies (doi: 10.1038/cr.2017.55 and doi: 10.1038/s41556-019-0361-y), confirming that ALYREF and YBX1 are specific readers of m<sup>5</sup>C-modified RNAs. However, our detection system has certain limitations. Despite mutating the critical amino acids, both readers retained a weak binding affinity for m<sup>5</sup>C, suggesting that while the mutation helps reduce false positives, it is still challenging to precisely map the distribution of m<sup>5</sup>C modifications. To address this, we plan to further investigate the protein structure and function to obtain more accurate m<sup>5</sup>C sequencing of the transcriptome in future studies. Accordingly, we have updated our results and conclusions in lines 294-299 and discuss these limitations in lines 109-114.

      In addition, while the m<sup>5</sup>C assay can be performed using the DRAM system alone, comparing it with the DRAM<sup>mut</sup>C control enhances the accuracy of m<sup>5</sup>C region detection. To minimize the variations in transfection efficiency across experimental groups, it is recommended to use the same batch of transfections. This approach not only ensures more consistent results but also improves the standardization of the DRAM assay, as discussed in the section added on lines 308-312.

      (2) Using sodium arsenite treatment of cells as a means to change the m<sup>5</sup>C status of transcripts through the downregulation of the two major m<sup>5</sup>C writer proteins NSUN2 and NSUN6 is problematic and the conclusions from these experiments are not warranted. Sodium arsenite is a chemical that poisons every protein containing thiol groups. Not only do NSUN proteins contain cysteines but also the base editor fusion proteins. Arsenite will inactivate these proteins, hence the editing frequency will drop, as observed in the experiments shown in Figure 5, which the authors explain with fewer m<sup>5</sup>C sites to be detected by the fusion proteins.

      The authors have not addressed the point made by this reviewer. Instead the authors state that they have not addressed that possibility. They claim that they have revised the results section, but this reviewer can only see the point raised in the conclusions. An experiment would have been to purify base editors via the HA tag and then perform some kind of binding/editing assay in vitro before and after arsenite treatment of cells.

      We appreciate the reviewer’s insightful comment. We fully agree with the concern raised. In the original manuscript, our intention was to use sodium arsenite treatment to downregulate NSUN-mediated m<sup>5</sup>C levels and subsequently decrease DRAM editing efficiency, with the aim of monitoring m<sup>5</sup>C dynamics through the DRAM system. However, as the reviewer pointed out, sodium arsenite may inactivate both NSUN proteins and the base editor fusion proteins, and any such inactivation would likely result in reduced DRAM editing. This confounds the interpretation of our experimental data.

      As demonstrated in Appendix A, western blot analysis confirmed that sodium arsenite indeed decreased the expression of fusion proteins. In addition, we attempted in vitro fusion protein purification using multiple fusion tags (HIS, GST, HA, MBP) for DRAM fusion protein expression, but unfortunately, we were unable to obtain purified proteins. However, using the Promega TNT T7 Rapid Coupled In Vitro Transcription/Translation Kit, we successfully purified the DRAM protein (Appendix B). Despite this success, subsequent in vitro deamination experiments did not yield the expected mutation results (Appendix C), indicating that further optimization is required. This issue is further discussed in lines 314-315.

      Taken together, the above evidence indicates that the sodium arsenite treatment experiment was confounded, and we decided to remove the corresponding results from the main text of the revised manuscript.

      Author response image 1.

      (3) The authors should move high-confidence editing site data contained in Supplementary Tables 2 and 3 into one of the main Figures to substantiate what is discussed in Figure 4A. However, the data needs to be visualized in another way than Excel format. Furthermore, Supplementary Table 2 does not contain a description of the columns, while Supplementary Table 3 contains a single row with letters and numbers.

      The authors have not addressed the point made by this reviewer. Figure 3F shows the screening process for DRAM-seq assays and principles for screening high-confidence genes rather than the data contained in Supplementary Tables 2 and 3 of the former version of this manuscript.

      Thank you for your valuable suggestion. We have visualized the data from Supplementary Tables 2 and 3 in Figure 4A as a circlize diagram (described in lines 213-216), illustrating the distribution of mutation sites detected by the DRAM system across each chromosome. Additionally, to improve the presentation and clarity of the data, we have revised Supplementary Tables 2 and 3 by adding column descriptions, merging the DRAM-ABE and DRAM-CBE sites, and including overlapping m<sup>5</sup>C genes from previous datasets.

    1. eLife Assessment

      This important study shows how genetic variation is associated with fecundity following a period of reproductive diapause in female Drosophila. The work identifies the olfactory system as central to successful diapause with associated changes in longevity and fecundity. While the methods used are convincing, a limitation of the study, as of any other laboratory-based investigation, is the challenge of demonstrating how well measures of fitness related to diapause and its recovery correlate with realities encountered during development in the wild.

    2. Reviewer #1 (Public review):

      Summary:

      The paper begins with phenotyping the DGRP for post-diapause fecundity, which is used to map genes and variants associated with fecundity. There are overlaps with genes mapped in other studies and also functional enrichment of pathways including most surprisingly neuronal pathways. This somewhat explains the strong overlap with traits such as olfactory behaviors and circadian rhythm. The authors then go on to test genes by knocking them down effectively at 10 degrees. Two genes, Dip-gamma and sbb, are identified as significantly associated with post-diapause fecundity, and they also find the effects to be specific to neurons. They further show that the neurons in the antenna but not the arista are required for the effects of Dip-gamma and sbb. They show that removing the antenna has a diapause-specific lifespan-extending effect, which is quite interesting. Finally, ionotropic receptor neurons are shown to be required for the diapause-associated effects.

      Strengths:

      Overall I find the experiments rigorously done and interpretations sound. I have no further suggestions except an ANOVA to estimate heritability of the post-diapause fecundity trait, which is routinely done in the DGRP and offers a global parameter regarding how reliable phenotyping is.

      Weaknesses:

      A minor point is I cannot find how many DGRP lines are used.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      The paper begins with phenotyping the DGRP for post-diapause fecundity, which is used to map genes and variants associated with fecundity. There are overlaps with genes mapped in other studies and also functional enrichment of pathways including most surprisingly neuronal pathways. This somewhat explains the strong overlap with traits such as olfactory behaviors and circadian rhythm. The authors then go on to test genes by knocking them down effectively at 10 degrees. Two genes, Dip-gamma and sbb, are identified as significantly associated with post-diapause fecundity, and they also find the effects to be specific to neurons. They further show that the neurons in the antenna but not the arista are required for the effects of Dip-gamma and sbb. They show that removing the antenna has a diapause-specific lifespan-extending effect, which is quite interesting. Finally, ionotropic receptor neurons are shown to be required for the diapause-associated effects. 

      Strengths and Weaknesses: 

      Overall I find the experiments rigorously done and interpretations sound. I have no further suggestions except an ANOVA to estimate the heritability of the post-diapause fecundity trait, which is routinely done in the DGRP and offers a global parameter regarding how reliable phenotyping is. 

      We added to the Methods: “We performed a one-way ANOVA to get the mean squares for between-group and within-group variances and calculated broad-sense heritability using the formula: H<sup>2</sup> = (MS<sub>G</sub> - MS<sub>E</sub>) / (MS<sub>G</sub> + (k-1) MS<sub>E</sub>), where MS<sub>G</sub> is the mean square between groups, MS<sub>E</sub> is the mean square within groups, and k is the number of individuals per group. Using this formula, the broad-sense heritability for normalized post-diapause fecundity was found to be 0.51.” 
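
      As an illustration only, the heritability calculation quoted above can be sketched in a few lines of Python. This is not the authors' code: the function name `broad_sense_heritability` and the toy data are hypothetical, the formula is read as (MS<sub>G</sub> - MS<sub>E</sub>) / (MS<sub>G</sub> + (k-1) MS<sub>E</sub>), and a balanced design (the same number of individuals k in every line) is assumed.

```python
from statistics import mean


def broad_sense_heritability(groups):
    """Estimate H^2 from a balanced one-way ANOVA.

    groups: list of lists, one inner list of phenotype values per line
    (hypothetical input layout; every line must have the same size k).
    """
    n_groups = len(groups)
    if n_groups < 2:
        raise ValueError("need at least two groups")
    k = len(groups[0])
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("balanced design with k >= 2 individuals per group required")

    grand_mean = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]

    # Sums of squares between and within groups
    ss_between = k * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

    # Mean squares: MS_G (between groups) and MS_E (within groups)
    ms_g = ss_between / (n_groups - 1)
    ms_e = ss_within / (n_groups * (k - 1))

    return (ms_g - ms_e) / (ms_g + (k - 1) * ms_e)


# Toy example with three lines of three individuals each (made-up numbers)
toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h2 = broad_sense_heritability(toy)
```

      With real data, each inner list would hold the normalized post-diapause fecundity values for one DGRP line; unbalanced designs would need a different estimator of k.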

      We added to the Results: “The broad-sense heritability for normalized post-diapause fecundity was found to be 0.51 (see Methods).”

      A minor point is I cannot find how many DGRP lines are used. 

      Response: We screened 193 lines and have added that to the Results. 

      Reviewer #2 (Public Review):

      Summary

      In this study, Easwaran and Montell investigated the molecular, cellular, and genetic basis of adult reproductive diapause in Drosophila using the Drosophila Genetic Reference Panel (DGRP). Their GWAS revealed genes associated with variation in post-diapause fecundity across the DGRP, and they performed RNAi screens on these candidate genes. They also analyzed the functional implications of these genes, highlighting the role of genes involved in neural and germline development. In addition, in conjunction with other GWAS results, they noted the importance of the olfactory system within the nervous system, which was supported by genetic experiments. Overall, their solid research uncovered new aspects of adult diapause regulation and provided a useful reference for future studies in this field.

      Strengths:

      The authors used whole-genome sequenced DGRP to identify genes and regulatory mechanisms involved in adult diapause. The first Drosophila GWAS of diapause successfully uncovered many QTL underlying post-diapause fecundity variations across DGRP lines. Gene network analysis and comparative GWAS led them to reveal a key role for the olfactory system in diapause lifespan extension and post-diapause fecundity.

      Comments on revised version:

      While the authors have addressed many of the minor concerns raised by the reviewers, they have not fully resolved some of the key criticisms. Notably, two reviewers highlighted significant concerns regarding the phenotype and assay of post-diapause fecundity, which are critical to the study. The authors acknowledged that this assay could be confounded by the 'cold temperature endurance phenotype,' potentially altering the interpretation of their results.

      However, they responded by stating that it is not obvious how to separate these effects experimentally. This leaves the analysis in this research ambiguous, as also noted by Reviewer #3.

      We should have clarified earlier that we actually chose to measure post-diapause fecundity in order to minimize any impact of ‘cold temperature endurance.’ In fact, we chose post-diapause fecundity as the appropriate measure of successful diapause for both technical and conceptual reasons. Conceptually, the benefit of diapause is to perpetuate the species. It seems obvious to us that post-diapause fecundity is more relevant to species propagation than other measures of diapause such as how many egg chambers contain yolk or how many eggs are laid. Technically, we chose 5-week diapause and recovery based on pilot studies that showed that nearly all DGRP lines showed excellent survival at 5 weeks in diapause conditions. Therefore, our experimental design minimized as much as possible any effect of cold temperature endurance - in the sense of the ability to survive at 10°C - on our phenotype. 

      We apologize for not clarifying that point earlier and have added this text to the Results: “We chose 5 weeks based on pilot studies that showed that nearly all DGRP lines showed excellent survival at 5 weeks in diapause conditions while exhibiting sufficient variation in post-diapause fecundity to carry out GWAS. Beyond 5 weeks, fecundity was low and there was insufficient variation to conduct a GWAS.”

      Additionally, I raised concerns about the validity of prioritizing genes with multiple associated variants. Although the authors agreed with this point, they did not revise the manuscript accordingly. The statement that 'Genes with multiple SNPs are good candidates for influencing diapause traits' is not a valid argument within the context of population and quantitative genetics.

      We apologize for neglecting to revise the manuscript accordingly. We have revised Supplemental Table S4 and ranked the genes by p-value.

    1. Author Response:

      Reviewer #1 (Public Review):

      [...] Strengths: This study utilized multiple in vitro approaches, such as proteomics, siRNA, and overexpression, to demonstrate that PCBP2 is an intrinsic factor of BMSC aging.

      Weaknesses:

      This study did not perform in vivo experiments.

      Response: We will continue to conduct animal experiments in subsequent studies.

      Reviewer #2 (Public Review):

      [...] Weaknesses: It is unclear if PCBP2 can also function as an intrinsic factor for BMSC cells in female individuals. More work may be needed to further dissect the mechanism of how PCBP2 impacts FGF2 expression. Could PCBP2 impact the FGF2 expression independent of ROS?

      Response: Thank you very much for your valuable comments, which is also the focus of our follow-up work. We will sort out the data and publish the relevant research results as soon as possible.

      Additional context that would help readers interpret or understand the significance of the work: In the current work, the authors studied the aging process of BMSC cells, which are related to osteoporosis. Aging processes also impact many other cell types and their function, such as in muscle, skin, and the brain.

      Response: Thank you very much for your valuable comments, we will continue to improve the writing logic of the article to make the article more understandable.

    1. eLife Assessment

      This useful manuscript reports mechanisms behind the increase in fecundity in response to sub-lethal doses of pesticides in the crop pest, the brown plant hopper. The authors hypothesize that the pesticide works by inducing the JH titer, which through the JH signaling pathway induces egg development. Evidence for this is, however, incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      Gao et al. have demonstrated that the pesticide emamectin benzoate (EB) treatment of brown planthopper (BPH) leads to increased egg-laying in the insect, which is a common agricultural pest. The authors hypothesize that EB upregulates JH titer resulting in increased fecundity.

      Strengths:

      The finding that a class of pesticide increases fecundity of brown planthopper is interesting.

      Weaknesses:

      (1) EB is an allosteric modulator of GluCl. That means EB physically interacts with GluCl initiating a structural change in the channel protein. Yet the authors' central hypothesis here is about how EB can upregulate the mRNA of GluCl. I do not know whether there is any evidence that an allosteric modulator can function as a transcriptional activator for the same receptor protein. The basic premise of the paper sounds counterintuitive. This is a structural problem and should be addressed by the authors by giving sufficient evidence about such demonstrated mechanisms before.

      (2) I am surprised to see a 4th instar larval application or treatment with EB results in upregulation of JH in the adult stages. Complicating the results further is the observation that a 4th instar EB application results in an immediate decrease in JH titer. There is a high possibility that this late JH titer increase is an indirect effect.

      (3) The writing quality of the paper needs improvement, particularly with respect to describing processes and abbreviations. In several instances authors have not adequately described the processes they have introduced, thus confusing the readers.

      (4) In the section 'EB promotes ovarian development' the authors have shown that EB treatment results in increased detention of eggs which contradicts their own results which show that EB promotes egg laying. Again, this is a serious contradiction that nullifies their hypothesis.

      (5) Furthermore, the results suggest that oogenesis is not affected by EB application. The authors should devote a section to discussing how they are observing increased egg numbers in EB-treated insects while not impacting oogenesis.

      (6) Met is the receptor of JH and to my understanding, remains mostly constant in terms of its mRNA or protein levels throughout various developmental periods in many different insects. Therefore, the presence of JH becomes the major driving factor for physiological events and not the presence of the receptor Met. Here the authors have demonstrated an increase in Met mRNA as a result of EB treatment. Their central hypothesis is that EB increases JH titer to result in enhanced fecundity. JH action will not result in the activation of Met. Although not contradictory to the hypothesis, the increase in mRNA content of Met is contrary to the findings of the JH field thus far.

      (7) As pointed out before, it is hard to rationalize how a 4th instar exposure to EB can result in upregulation of key genes involved in JH synthesis at the adult stage. The authors must consider providing a plausible explanation and discussion in this regard.

      (8) I have strong reservations against such an irrational hypothesis that Met (the receptor for JH) and JH-Met target gene Kr-h1 regulates JH titer (Line 311, Fig 3 supplemental 2D). This would be the first report of such an event in the JH field and therefore must be analysed in depth before it may go to publication. I strongly suggest the authors remove such claims from the manuscript without substantiating it.

      (9) Kr-h1 is JH/Met target gene. The authors demonstrate that silencing of Kr-h1 results in inhibition of FAMeT, which is a gene involved in JH synthesis. The feedback loop in JH synthesis is unreported. Authors must go ahead with a mechanistic detail of Kr-h1 mediated JH upregulation before this can be concluded. Mere qPCR experiments are not sufficient to substantiate a claim that is completely contrary to the current understanding of JH signalling pathway.

      (10) Authors have performed knockdowns of JHAMT, Met and Kr-h1 to demonstrate the effect of these factors on fecundity in BPH. Additionally, they have performed rescue experiments with EB application on these knockdown insects (Figure 3K-M). This I believe is a very flawed experiment. The authors demonstrate EB works through JHAMT in upregulating JH titer. In the absence of JHAMT, EB application is not expected to rescue the phenotype. But authors have reported a complete rescue here. In the absence of Met, the receptor of JH, either EB or JH is not expected to rescue the phenotype. But a complete rescue has been reported. These two experimental results contradict their own hypothesis.

      (11) A significant section of the paper deals with how EB upregulates JH titer. JH is a hormone synthesized in the corpora allata. Yet the authors have chosen to use the whole body for all of their experiments. Whole-body changes in the mRNA of enzymes involved in JH synthesis may not reflect the situation in the corpora allata. Although working with corpora allata is challenging, discarding the abdomen and thorax region and working with the head and neck region of the insect is easily doable. Results from such sampling are always more convincing when it comes to JH synthesis studies.

      (12) The phenomenon reported was specific for BPH and not found in other insects. This limits the implications of the study.

      (13) Overall, the molecular experiments are very poorly designed and can at best be termed superficial. There are several contradictions within the paper and no discussion or explanation has been provided for that.

      Comments on revisions:

      (1) The onus of making the revisions understandable to the reviewers lies with the authors. In its current form, how the authors have approached the review is hard to follow, in my opinion. Although the authors have taken a lot of effort in answering the questions posed by reviewers, parallel changes in the manuscript are not clearly mentioned. In many cases, the authors have acknowledged the criticism in response to the reviewer, but have not changed their narrative, particularly in the results section.

      (2) In the response to reviewers, the authors have mentioned line numbers in the main text where changes were made. But very frequently, those lines do not refer to the changes or mention just a subsection of changes done. The problem occurs throughout the document, making it very difficult to follow the revision and contributing to the point mentioned above.

      (3) The authors need to interpret the performed experiments rationally, without overinterpretation. Currently, many of the claims that the authors are making are unsubstantiated. As a result of the first review process, the authors have acknowledged the discrepancies, but they have failed to alter their interpretations accordingly.

      (4) I would like to point to the fact that there are significant experimental modifications added to the manuscript. The decision from the first cycle of review was given on 8th Nov 2024. The authors re-submitted the manuscript on 20th Nov 2024. It is beyond my understanding how so many experiments can be done in such a short time. The rush in resubmission is evident in the writing quality as well, which I think is now poorer than the original version.

      (5) The writing quality is still extremely poor.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This useful manuscript reports mechanisms behind the increase in fecundity in response to sub-lethal doses of pesticides in the crop pest, the brown plant hopper. The authors hypothesize that the pesticide works by inducing the JH titer, which through the JH signaling pathway induces egg development. Evidence for this is, however, inadequate.

      We greatly appreciate your valuable comments and constructive suggestions for our work. All in all, the manuscript has been carefully edited and improved following your suggestions. We also provide more evidence to support our statements by conducting new experiments. First, we found that EB treatment of adult females can also stimulate egg-laying. Second, EB treatment in female adults increases the number of mature eggs in the ovary and ovarioles. Third, EB treatment in females enhances the expression of the kr-h1 gene in the whole body of BPH. Finally, EB treatment in female adults increases the JHIII titer, but has no impact on the 20E titer.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Gao et al. have demonstrated that the pesticide emamectin benzoate (EB) treatment of brown planthopper (BPH) leads to increased egg-laying in the insect, which is a common agricultural pest. The authors hypothesize that EB upregulates JH titer resulting in increased fecundity.

      Strengths:

      The finding that a class of pesticide increases the fecundity of brown planthopper is interesting.

      We greatly appreciate your positive comments on our work.

      Weaknesses:

      (1) EB is an allosteric modulator of GluCl. That means EB physically interacts with GluCl, initiating a structural change in the channel protein. Yet the authors' central hypothesis here is about how EB can upregulate the mRNA of GluCl. I do not know whether there is any evidence that an allosteric modulator can function as a transcriptional activator for the same receptor protein. The basic premise of the paper sounds counterintuitive. This is a structural problem and should be addressed by the authors by giving sufficient evidence about such demonstrated mechanisms before.

      Thank you for your question. As the reviewer points out, EB physically interacts with its target protein GluCl and thus affects its downstream signaling pathway. In the manuscript, we reported that EB-treated brown planthoppers display increased expression of GluCl in the adult stage (Fig. 5A). Actually, there are many studies showing that insects treated with insecticides can increase the expression of target genes. For example, the relative expression level of the ryanodine receptor gene of the rice stem borer, Chilo suppressalis, was increased 10-fold after treatment with chlorantraniliprole, an insecticide which targets the ryanodine receptor (Peng et al., 2017). Besides this, in Drosophila, starvation (and low insulin) elevates the transcription level of the sNPF and tachykinin receptors (Ko et al., 2015; Root et al., 2011). In brown planthoppers, reduction in mRNA and protein expression of a nicotinic acetylcholine receptor α8 subunit is associated with resistance to imidacloprid (Zhang et al., 2015). RNA interference knockdown of the α8 gene decreased the sensitivity of N. lugens to imidacloprid (Zhang et al., 2015). Hence, expression of receptor genes can be regulated by diverse factors, including insecticide treatment. In our case, we found that EB can upregulate its target gene GluCl. However, we did not claim that EB functions as a transcriptional activator for GluCl, and we still do not know why EB treatment changes the expression of GluCl in the brown planthopper. Because our experiments last several days, the change might be an indirect (or secondary) effect caused by other factors that alter expression of the GluCl gene following EB action on the channel. One possibility is that the allosteric interaction of EB with GluCl makes the channel dysfunctional, and the cellular response is to upregulate the channel/receptor to compensate. We have inserted text on lines 738 - 757 to explain these possibilities.

      (2) I am surprised to see a 4th instar larval application or treatment with EB results in the upregulation of JH in the adult stages. Complicating the results further is the observation that a 4th instar EB application results in an immediate decrease in JH titer. There is a high possibility that this late JH titer increase is an indirect effect.

      Thank you for your question. Treatment with low doses or sublethal doses of insecticides might have a strong and complex impact on insects (Gandara et al., 2024; Gong et al., 2022; Li et al., 2023; Martelli et al., 2022). We kept the 4th instar of brown planthoppers feeding on EB for four days. They will develop to the 5th instar, which is the final nymphal stage of BPH, after four days of treatment. Since the brown planthopper is a hemimetabolous insect, we cannot rule out the possibility that an indirect effect of treatment with EB results in the upregulation of JH in the adult stages. In this revised manuscript, we investigated the impact of EB treatment in the adult stage. We found that female adults treated with EB also laid more eggs than controls (Figure 1-figure supplement 1A). The following experiments were performed in adults to address how EB treatment stimulates egg-laying in adult brown planthoppers.

      (1) We found that EB treatment in adults increases the number of mature eggs in the ovary (new Figure 2-figure supplement 1). We added these results in lines 234 – 238 and 281-285.

      (2) We measured the JH titer after the female adults had been treated with EB. We found that EB can also increase the JH titer but has no impact on the 20E titer in the female adult (Figure 3-S3A and B). We added these results in lines 351 – 356 and 281-285.

      (3) EB treatment in adults increases the gene expression of JHAMT and Kr-h1 (Figure 3-S3C and D). We added these results in lines 378 – 379, lines 387-390 and lines 457-462.

      (3) The writing quality of the paper needs improvement. Particularly with respect to describing processes and abbreviations. In several instances the authors have not adequately described the processes they have introduced, thus confusing readers.

      Thank you for your suggestion. We have thoroughly revised the paper to improve clarity.

      (4) In the section 'EB promotes ovarian development' the authors have shown that EB treatment results in increased detention of eggs which contradicts their own results which show that EB promotes egg laying. Again, this is a serious contradiction that nullifies their hypothesis.

      Thank you for pointing this out. We revised Figure 2B to show the number of mature eggs in the ovary. The number of mature eggs in ovaries of females that fed on EB was higher than in control females. We also show that BPH fed with EB laid more eggs than controls. Thus, our results suggest that EB promotes ovary maturation (and egg production) and also increases egg laying (Figure 1 and Table S1). We added these results in lines 234 – 238.

      (5) Furthermore, the results suggest that oogenesis is not affected by EB application. The authors should devote a section to discussing how they are observing increased egg numbers in EB-treated insects while not impacting Oogenesis.

      Thank you for your suggestions, and apologies for the lack of clarity in our initial explanation. First, we found that EB treatment led to an increase in the number of eggs laid by female brown planthoppers (Figure 1). Through dissection experiments, we observed that EB-treated females had more mature eggs in their ovaries (Figure 2A and B), indicating that the increased egg-laying was due to a larger production of mature eggs in the ovaries after EB treatment. This is now explained on lines 229-238.

      Additionally, since there is no systematic description of oogenesis in the brown planthopper, we were the first to observe the oogenesis process in this species using immunohistochemistry and laser confocal microscopy. Based on the developmental characteristics, we defined the different stages of oogenesis (Figure 2C, Figure 2-figure supplement 2). We did not observe any significant effect of EB treatment on the various stages of oogenesis, indicating that EB treatment does not impair normal egg development (Figure 2D). Instead, the increase in vitellogenin accelerates the production of mature eggs. This is now explained on lines 243-262.

      During the maturation process, eggs require uptake of vitellogenin, and an increase in vitellogenin (Vg) content can accelerate egg maturation, producing more mature eggs. Our molecular data suggest that EB treatment leads to an upregulation of vg expression. Based on these findings, we conclude that the increase in egg-laying caused by EB treatment is due to the upregulation of vg (Figure 3I), which raises vitellogenin content, promoting the uptake of vitellogenin by maturing eggs and resulting in the production of more mature eggs. We have revised the text on lines 389-395 to clarify this point.

      (6) Met is the receptor of JH and to my understanding, remains mostly constant in terms of its mRNA or protein levels throughout various developmental periods in many different insects. Therefore, the presence of JH becomes the major driving factor for physiological events and not the presence of the receptor Met. Here the authors have demonstrated an increase in Met mRNA as a result of EB treatment. Their central hypothesis is that EB increases JH titer to result in enhanced fecundity. JH action will not result in the activation of Met. Although not contradictory to the hypothesis, the increase in mRNA content of Met is contrary to the findings of the JH field thus far.

      Thank you for your comment. Our results showed that EB treatment can mildly increase (about 2-fold) expression of the Met gene in brown planthoppers (Figure 3G). And our data indicated that Met and FAMeT expression levels were not influenced so much by EB compared with kr-h1 and vg (Figure 3H and I). We agree that JH action will not result in the increase of Met. However, we cannot rule out the possibility of other factors (indirect effects), induced by EB treatment that increase the mRNA expression level of Met. One recent paper reported that downregulation of transcription factor CncC will increase met expression in beetles (see Figure 6A in this reference) (Jiang et al., 2023). Many studies have reported that insecticide treatment will activate the CncC gene signaling pathway, which regulates detoxification gene expression (Amezian et al., 2023; Fu et al., 2024; Hu et al., 2021). Hence, it is possible that EB might influence the CncC gene pathway which then induces met expression. This EB effect on met upregulation may be similar to the upregulation of GluCl and some other secondary effects. We have discussed this on lines 725-738.

      (7) As pointed out before, it is hard to rationalize how a 4th instar exposure to EB can result in the upregulation of key genes involved in JH synthesis at the adult stage. The authors must consider providing a plausible explanation and discussion in this regard.

      Thank you for your comments. It must be mentioned that although we first exposed the BPH to EB at the 4th instar, the insects fed on the EB-treated rice plants for four days. After that, they develop into the 5th instar, the final nymphal stage of the brown planthopper. Since brown planthoppers do not have a pupal stage, this might allow the EB presented to the insects to persist for longer, even into the adult stage. Besides this, we found that EB treatment increases the weight of adult females (Figure 1-figure supplement 3E and F), which indicates that EB might increase food intake in BPHs, which in turn might produce more insulin peptide. Insulin might increase JH synthesis at the adult stage. In our revised study we also investigated the effect of EB in adult BPHs. We found that, similar to the nymphal stage, EB treatment in adult BPHs also increases egg laying. Furthermore, the JH titer was increased after treatment of adult BPH with EB. Besides this, the GluCl and kr-h1 genes were also up-regulated after EB treatment in the adult stage. We have discussed this on lines 739-746.

      (8) I have strong reservations against such an irrational hypothesis that Met (the receptor for JH) and JH-Met target gene Kr-h1 regulate JH titer (Line 311, Fig 3 supplemental 2D). This would be the first report of such an event on the JH field and therefore must be analysed in depth. I strongly suggest the authors remove such claims from the manuscript without substantiating it.

      Thank you for your suggestions and comments. We have changed our claims in this revised MS. We found that EB treatment can enhance Kr-h1 expression. We have no evidence to support that JH can induce met expression. We have rewritten the manuscript to avoid confusion (see text on lines 725-735).

      (9) Kr-h1 is JH/Met target gene. The authors demonstrate that silencing of Kr-h1 results in inhibition of FAMeT, which is a gene involved in JH synthesis. A feedback loop in JH synthesis is unreported. It is the view of this reviewer that the authors must go ahead with a mechanistic detail of Kr-h1 mediated JH upregulation before this can be concluded. Mere qPCR experiments are not sufficient to substantiate a claim that is completely contrary to the current understanding of the JH signalling pathway.

      Thank you for your suggestions and comments. We agree that qPCR experiments alone are not enough to support this kind of claim. More evidence needs to be provided to support this. We have revised the MS to avoid confusion (see text on lines 725-735).

      (10) The authors have performed knockdowns of JHAMT, Met, and Kr-h1 to demonstrate the effect of these factors on fecundity in BPH. Additionally, they have performed rescue experiments with EB application on these knockdown insects (Figure 3K-M). This, I believe, is a very flawed experiment. The authors demonstrate EB works through JHAMT in upregulating JH titer. In the absence of JHAMT, EB application is not expected to rescue the phenotype. But the authors have reported a complete rescue here. In the absence of Met, the receptor of JH, either EB or JH is not expected to rescue the phenotype. But a complete rescue has been reported. These two experimental results contradict their own hypothesis.

      Thank you for your comments. We think that this rescue is possible because knockdown of the genes is incomplete when using dsRNA injection (and residual gene expression allows for EB action). It is not a total knockout, and these genes still have a low level of expression in the dsRNA-injected insects. Since EB can upregulate the expression of JHAMT, Met, and Kr-h1, it is reasonable that EB treatment can rescue the down-regulation effects of these three genes and completely rescue fecundity. We have clarified this on lines 411-413.

      (11) A significant section of the paper deals with how EB upregulates JH titer. JH is a hormone synthesized in the Corpora Allata. Yet the authors have chosen to use the whole body for all of their experiment. Changes in the whole body for mRNA of those enzymes involved in JH synthesis may not reflect the situation in Corpora Allata. Although working with Corpora Allata is challenging, discarding the abdomen and thorax region and working with the head and neck region of the insect is easily doable. Results from such sampling are always more convincing when it comes to JH synthesis studies.

      Thank you for your suggestions. The head is very difficult to separate from the thorax region in brown planthoppers, as you can see in Author response image 1. We are now trying to answer how EB regulates JH synthesis using Drosophila as a model.

      Author response image 1.

      The brown planthopper

      (12) The phenomenon reported was specific to BPH and not found in other insects. This limits the implications of the study.

      Thank you for your comments. The brown planthopper is a serious insect pest on rice in Asia. Our findings can guide the use of this insecticide in the field. Besides this, our findings indicated that EB, which targets GluCl can impair the JH titer. Our findings added new implications for how a neuronal system influences the JH signaling pathway. We will further investigate how EB influences JH in the future and will use Drosophila as a model to study the molecular mechanisms.

      (13) Overall, the molecular experiments are very poorly designed and can at best be termed superficial. There are several contradictions within the paper and no discussion or explanation has been provided for that.

      Thank you for your comments. We have revised the paper according to your suggestions and added further explanation of our results in the discussion parts and hope the conclusions are better supported in the new version. We have discussed this on lines 725-746 and 778-799.

      Reviewer #2 (Public Review):

      The brown plant hopper (BPH) is a notorious crop pest and pesticides are the most widespread means of controlling its population. This manuscript shows that in response to sublethal doses of the pesticide (EB), BPH females show enhanced fecundity. This is in keeping with field reports of population resurgence post-pesticide treatment. The authors work out the mechanism behind this increase in fecundity. They show that in response to EB exposure, the expression of its target receptor, GluCl, increases. This, they show, results in an increase in the expression of genes that regulate the synthesis of juvenile hormone (JH) and JH itself, which, in turn, results in enhanced egg-production and egg-laying. Interestingly, these effects of EB exposure are species-specific, as the authors report that other species of plant hoppers either don't show enhanced fecundity or show reduced fecundity. As the authors point out, it is unclear how an increase in GluCl levels could result in increased JH regulatory genes.

      We greatly appreciate your valuable comments and constructive suggestions for our work. In future work, we will try to determine how EB interacts with its molecular target GluCl and then increases JH regulatory genes, using Drosophila as a model.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Overall, the molecular experiments are very poorly designed and can at best be termed superficial. There are several contradictions within the paper and no discussion or explanation has been provided for that.

      The authors should consider a thorough revision.

      Thank you for your comments. We have thoroughly revised the paper according to your suggestions and added further experiments and explanations of our results in the discussion parts.

      Reviewer #2 (Recommendations For The Authors):

      It would help the reader to have more schematics along with the figures. The final figure is helpful, but knowing the JH pathway, and where it acts would help with the interpretations as one reads the manuscript and the figures. The pathways represented in 4N or 5J are helpful but could be improved upon for better presentation.

      It would be nice to have some discussion on how the authors think EB exposure results in an increase in GluCl expression, and how that in turn affects the expression of so many genes.

      Thank you for your comments. We have thoroughly revised the paper according to your suggestions and added further experiments and explanations of how we think EB exposure results in an increase in JH titer and other genes in the discussion parts. We have added the text on lines 753-761.

      References

      Amezian, D., Fricaux, T., de Sousa, G., Maiwald, F., Huditz, H.-I., Nauen, R., Le Goff, G., 2023. Investigating the role of the ROS/CncC signaling pathway in the response to xenobiotics in Spodoptera frugiperda using Sf9 cells. Pesticide Biochemistry and Physiology 195, 105563.

      Fu, B., Liang, J., Hu, J., Du, T., Tan, Q., He, C., Wei, X., Gong, P., Yang, J., Liu, S., Huang, M., Gui, L., Liu, K., Zhou, X., Nauen, R., Bass, C., Yang, X., Zhang, Y., 2024. GPCR–MAPK signaling pathways underpin fitness trade-offs in whitefly. Proceedings of the National Academy of Sciences 121, e2402407121.

      Gandara, L., Jacoby, R., Laurent, F., Spatuzzi, M., Vlachopoulos, N., Borst, N.O., Ekmen, G., Potel, C.M., Garrido-Rodriguez, M., Böhmert, A.L., Misunou, N., Bartmanski, B.J., Li, X.C., Kutra, D., Hériché, J.-K., Tischer, C., Zimmermann-Kogadeeva, M., Ingham, V.A., Savitski, M.M., Masson, J.-B., Zimmermann, M., Crocker, J., 2024. Pervasive sublethal effects of agrochemicals on insects at environmentally relevant concentrations. Science 386, 446-453.

      Gong, Y., Cheng, S., Desneux, N., Gao, X., Xiu, X., Wang, F., Hou, M., 2022. Transgenerational hormesis effects of nitenpyram on fitness and insecticide tolerance/resistance of Nilaparvata lugens. Journal of Pest Science.

      Hu, B., Huang, H., Hu, S., Ren, M., Wei, Q., Tian, X., Esmail Abdalla Elzaki, M., Bass, C., Su, J., Reddy Palli, S., 2021. Changes in both trans- and cis-regulatory elements mediate insecticide resistance in a lepidopteron pest, Spodoptera exigua. PLOS Genetics 17, e1009403.

      Jiang, H., Meng, X., Zhang, N., Ge, H., Wei, J., Qian, K., Zheng, Y., Park, Y., Reddy Palli, S., Wang, J., 2023. The pleiotropic AMPK–CncC signaling pathway regulates the trade-off between detoxification and reproduction. Proceedings of the National Academy of Sciences 120, e2214038120.

      Ko, K.I., Root, C.M., Lindsay, S.A., Zaninovich, O.A., Shepherd, A.K., Wasserman, S.A., Kim, S.M., Wang, J.W., 2015. Starvation promotes concerted modulation of appetitive olfactory behavior via parallel neuromodulatory circuits. eLife 4, e08298.

      Li, Z., Wang, Y., Qin, Q., Chen, L., Dang, X., Ma, Z., Zhou, Z., 2023. Imidacloprid disrupts larval molting regulation and nutrient energy metabolism, causing developmental delay in honey bee Apis mellifera. eLife

      Martelli, F., Hernandes, N.H., Zuo, Z., Wang, J., Wong, C.-O., Karagas, N.E., Roessner, U., Rupasinghe, T., Robin, C., Venkatachalam, K., Perry, T., Batterham, P., Bellen, H.J., 2022. Low doses of the organic insecticide spinosad trigger lysosomal defects, elevated ROS, lipid dysregulation, and neurodegeneration in flies. eLife 11, e73812.

      Peng, Y.C., Sheng, C.W., Casida, J.E., Zhao, C.Q., Han, Z.J., 2017. Ryanodine receptor genes of the rice stem borer, Chilo suppressalis: Molecular cloning, alternative splicing and expression profiling. Pestic. Biochem. Physiol. 135, 69-77.

      Root, Cory M., Ko, Kang I., Jafari, A., Wang, Jing W., 2011. Presynaptic facilitation by neuropeptide signaling mediates odor-driven food search. Cell 145, 133-144.

      Zhang, Y., Wang, X., Yang, B., Hu, Y., Huang, L., Bass, C., Liu, Z., 2015. Reduction in mRNA and protein expression of a nicotinic acetylcholine receptor α8 subunit is associated with resistance to imidacloprid in the brown planthopper, Nilaparvata lugens. Journal of Neurochemistry 135, 686-694.

    1. eLife Assessment

      This valuable study confirms the association between the human leukocyte antigen (HLA)-II region and tuberculosis (TB) susceptibility in genetically admixed South African populations, specifically identifying a near-genome-wide significant association in the HLA-DPB1 gene, which originates from KhoeSan ancestry. The evidence supporting the association between the HLA-II region and TB susceptibility is solid, and the work will be of interest to those studying the genetic basis of tuberculosis susceptibility/infection resistance.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aimed to confirm the association between the human leukocyte antigen (HLA)-II region and tuberculosis (TB) susceptibility within admixed African populations. Building upon previous findings from the International Tuberculosis Host Genetics Consortium (ITHGC), this study sought to address the limitations of small sample size and the inclusion of admixed samples by employing the Local Ancestry Allelic Adjusted (LAAA) model, as well as identify TB susceptibility loci in an admixed South African cohort.

      Strengths:

      The major strengths of this study include the use of multiple TB case-control datasets from diverse South African populations and ADMIXTURE for global ancestry inference.

      Weaknesses:

      The major weaknesses of this study include insufficient significant novel discoveries and reliance on cross-validation. The use of existing models did not add value to this study.

      Appraisal:

      The authors achieved their aims. However, the results still needed to be further validated in the future.

      Impact:

      The innovative use of the LAAA model and the comprehensive dataset in this study may make contributions to the field of genetic epidemiology.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript is about using different analytical approaches to allow ancestry adjustments to GWAS analyses amongst admixed populations. This work is a follow-on from the recently published ITHGC multi-population GWAS (https://doi.org/10.7554/eLife.84394), with the focus on the admixed South African populations. Ancestry adjustment models detected a peak of SNPs in the class II HLA DPB1, distinct from the class II HLA DQA1 loci significant in the ITHGC analysis.

      Strengths:

      Excellent demonstration of GWAS analytical pipelines in highly admixed populations. Particularly the utility of ancestry adjustment to improve study power to detect novel associations. Further confirmation of the importance of the HLA class II locus in genetic susceptibility to TB.

      Weaknesses:

      Limited novelty compared to the group's previous existing publications and the body of work linking HLA class II alleles with TB susceptibility in South Africa or other African populations. This work includes only ~100 new cases and controls from what has already been published. High resolution HLA typing has detected significant signals in both the DQA1 and DPB1 regions identified by the larger ITHGC and in this GWAS analysis respectively (Chihab L et al. HLA. 2023 Feb; 101(2): 124-137).

      Despite the availability of strong methods for imputing HLA from GWAS data (Karnes J et al. PLoS One 2017), the authors did not confirm with HLA typing the importance of their SNP peak in the class II region. This would have supported the importance of this ancestry adjustment versus prior ITHGC analysis.

      The populations consider active TB and healthy controls (from high-burden presumed exposed communities) and do not provide QFT or other data to identify latent TB infection.

      Important methodological points for clarification and for readers to be aware of when reading this paper:

      (1) One of the reasons cited for the lack of African ancestry-specific associations or suggestive peaks in the ITHGC study was the small African sample size. The current association test includes a larger African cohort and yields a near-genome-wide significant threshold in the HLA-DPB1 gene originating from the KhoeSan ancestry. Investigation is needed as to whether the increase in power is due to increased African samples and not necessarily the use of the LAAA model as stated on lines 295 and 296?

      Authors response - The Manhattan plot in Figure 3 includes the results for all four models: the traditional GWAS model (GAO), the admixture mapping model (LAO), the ancestry plus allelic (APA) model and the LAAA model. In this figure, it is evident that only the LAAA model identified the association peak on chromosome 6, which lends support to the argument that the increase in power is due to the use of the LAAA model and not solely due to the increase in sample size.

      Reviewer comment - This data supports the authors' conclusion that the increased power is related to the application of the LAAA model rather than simply to the increased sample size.

      (2) In line 256, the number of SNPs included in the LAAA analysis was 784,557 autosomal markers; the number of SNPs after quality control of the imputed dataset was 7,510,051 SNPs (line 142). It is not clear how or why ~90% of the SNPs were removed. This needs clarification.

      Authors response: In our manuscript (line 194), we mention that "...variants with minor allele frequency (MAF) < 1% were removed to improve the stability of the association tests." A large proportion of imputed variants fell below this MAF threshold and were subsequently excluded from this analysis.

      Reviewers additional comment: The authors should specify the number of SNPs in the dataset before imputation and indicate what proportion of the 784,557 remaining SNPs were imputed. Providing this information might help the reader better understand the rationale behind the imputation process.
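
      The counts quoted in point (2) can be sanity-checked with simple arithmetic. The sketch below uses only the two figures given in the text (7,510,051 SNPs after quality control of the imputed dataset and 784,557 autosomal markers in the LAAA analysis); the variable names are illustrative and are not taken from the authors' pipeline.

```python
# SNP counts quoted in the review (manuscript lines 142 and 256).
snps_after_imputation_qc = 7_510_051
snps_in_laaa_analysis = 784_557

# Fraction of post-QC imputed SNPs that reached the LAAA analysis,
# and the complementary fraction that was filtered out.
retained = snps_in_laaa_analysis / snps_after_imputation_qc
removed = 1 - retained

print(f"retained: {retained:.1%}, removed: {removed:.1%}")
```

      This agrees with the reviewer's figure of roughly 90% of SNPs removed. It does not show how many of the retained markers were imputed, which is the additional information the reviewer requests.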

      (3) The authors have used the significance threshold estimated by STEAM (p-value < 2.5 × 10⁻⁶) in the LAAA analysis. Grinde et al. (2019) implemented their significance threshold estimation approach tailored to admixture mapping (local ancestry (LA) model), where there is a reduction in testing burden. The authors should justify why this threshold would apply to the LAAA model (a joint genotype and ancestry approach).

      Authors response: We describe in the methods (line 189 onwards) that the LAAA model is an extension of the APA model. Since the APA model itself simultaneously performs the null global ancestry only model and the local ancestry model (utilised in admixture mapping), we thus considered the use of a threshold tailored to admixture mapping appropriate for the LAAA model.

      Reviewers additional comment: While the LAAA model is an extension of the APA model, the authors describe the LAAA test as 'models the combination of the minor allele and the ancestry of the minor allele at a specific locus, along with the effect of this interaction,' thus a joint allele and ancestry effects model. Grinde et al. (2019) proposed the significance threshold estimation approach, STEAM, specifically for the LA approach, which tests for ancestry effects alone and benefits from the reduced testing burden. However, it remains unclear why the authors found it appropriate to apply STEAM to the LAAA model, a joint test for both allele and ancestry effects, which does not benefit from the same reduction in testing burden.
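
      To make the "testing burden" argument in point (3) concrete, the sketch below compares the STEAM threshold quoted above with the conventional genome-wide significance threshold of 5 × 10⁻⁸. It assumes, purely for illustration, that each threshold is read as a Bonferroni-style correction at a family-wise error rate of 0.05; STEAM estimates its threshold differently, and neither the 0.05 level nor the variable names come from the manuscript.

```python
import math

# Assumption for illustration only: a family-wise error rate of 0.05.
alpha = 0.05

steam_threshold = 2.5e-6             # threshold quoted in the review
conventional_gwas_threshold = 5e-8   # standard genome-wide significance level

# Number of independent tests each threshold would imply under Bonferroni.
implied_tests_steam = alpha / steam_threshold
implied_tests_gwas = alpha / conventional_gwas_threshold

# How much larger the implied burden is for a genotype-level test.
ratio = implied_tests_gwas / implied_tests_steam

print(f"implied tests (admixture mapping): {implied_tests_steam:,.0f}")
print(f"implied tests (conventional GWAS): {implied_tests_gwas:,.0f}")
print(f"ratio: {ratio:.0f}x")
```

      Under this simplified reading, the admixture-mapping threshold corresponds to a far smaller number of effective tests than a genotype-level scan, which is the reviewer's reason for asking why it should carry over to a joint allele and ancestry test.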

      (4) Batch effect screening and correction (line 174) is a quality control check. This section is discussed after global and local ancestry inferences in the methods. Was this QC step conducted after the inferencing? If so, the authors should justify how the removed SNPs due to the batch effect did not affect the global and local ancestry inferences or should order the methods section correctly to avoid confusion.

      Authors response: The batch effect correction method utilised a pseudo-case-control comparison which included global ancestry proportions. Thus, batch effect correction was conducted after ancestry inference. We excluded 36 627 SNPs that were believed to have been affected by the batch effect. We have amended line 186 to include the exact number of SNPs excluded due to batch effect.

      The ancestry inference by RFMix utilised the entire merged dataset of 7 510 051 SNPs. Thus, the SNPs removed due to the batch effect make up a very small proportion of the SNPs used to conduct global and local ancestry inferences (less than 0.5%). As a result, we do not believe that the removed SNPs would have significantly affected the global and local ancestry inferences. However, we did conduct global ancestry inference with RFMix on each separate dataset as a sanity check. In Author response tables 1 and 2, we show the average global ancestry proportions inferred for each separate dataset, the average global ancestry proportions across all datasets and the average global ancestry proportions inferred using the merged dataset. The SAC and Xhosa cohorts are shown in two separate tables due to the different number of contributing ancestral populations to each cohort. The combined average global ancestry proportions across the separate cohorts do not differ significantly from the global ancestry proportions inferred using the merged dataset.

      This is an excellent response and should remain accessible to readers to clarify this issue.
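
      The proportion quoted in the authors' response to point (4) can be checked directly from the two counts given there. This is a minimal sketch with illustrative variable names; it reproduces only the arithmetic, not the batch effect correction itself.

```python
# Counts quoted in the authors' response to point (4).
snps_merged_dataset = 7_510_051
snps_removed_batch_effect = 36_627

# Share of the SNPs used for ancestry inference that were later excluded.
fraction_removed = snps_removed_batch_effect / snps_merged_dataset

print(f"removed due to batch effect: {fraction_removed:.2%}")
```

      The result is just under 0.5%, consistent with the "less than 0.5%" stated in the response.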

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary: 

      The authors aimed to confirm the association between the human leukocyte antigen (HLA)-II region and tuberculosis (TB) susceptibility within admixed African populations. Building upon previous findings from the International Tuberculosis Host Genetics Consortium (ITHGC), this study sought to address the limitations of small sample size and the inclusion of admixed samples by employing the Local Ancestry Allelic Adjusted (LAAA) model, as well as identify TB susceptibility loci in an admixed South African cohort. 

      Strengths: 

      The major strengths of this study include the use of six TB case-control datasets collected over 30 years from diverse South African populations and ADMIXTURE for global ancestry inference. The former provides a comprehensive dataset and the latter ensures accurate determination of ancestral contributions. In addition, the identified association in the HLA-DPB1 gene shows near-genome-wide significance, enhancing the credibility of the findings. 

      Weaknesses: 

      The major weaknesses of this study are the small number of significant discoveries and the reliance on cross-validation. This study only identified one variant significantly associated with TB status, located in an intergenic region with an unclear link to TB susceptibility. Despite identifying multiple lead SNPs, no other variants reached the genome-wide significance threshold, limiting the overall impact of the findings. The absence of an independent validation cohort, with the study relying solely on cross-validation, is also a major limitation. This approach restricts the ability to independently confirm the findings and evaluate their robustness across different population samples. 

      Appraisal: 

      The authors successfully achieved their aims of confirming the association between the HLA-II region and TB susceptibility in admixed African populations. However, the limited number of significant discoveries, reliance on cross-validation, and insufficient discussion of model performance and SNP significance weaken the overall strength of the findings. Despite these limitations, the results support the conclusion that considering local ancestry is crucial in genetic studies of admixed populations. 

      Impact:  

      The innovative use of the LAAA model and the comprehensive dataset in this study make substantial contributions to the field of genetic epidemiology. 

      Reviewer #2 (Public review): 

      Summary: 

      This manuscript is about using different analytical approaches to allow ancestry adjustments to GWAS analyses amongst admixed populations. This work is a follow-on from the recently published ITHGC multi-population GWAS (https://doi.org/10.7554/eLife.84394), with a focus on the admixed South African populations. Ancestry adjustment models detected a peak of SNPs in the class II HLA DPB1, distinct from the class II HLA DQA1 loci significant in the ITHGC analysis. 

      Strengths: 

      Excellent demonstration of GWAS analytical pipelines in highly admixed populations. Further confirmation of the importance of the HLA class II locus in genetic susceptibility to TB. 

      Weaknesses: 

      Limited novelty compared to the group's previous publications and the body of work linking HLA class II alleles with TB susceptibility in South Africa or other African populations. This work includes only ~100 new cases and controls beyond what has already been published. High-resolution HLA typing has detected significant signals in both the DQA1 and DPB1 regions identified by the larger ITHGC and in this GWAS analysis respectively (Chihab L et al. HLA. 2023 Feb; 101(2): 124-137). Despite the availability of strong methods for imputing HLA from GWAS data (Karnes J et al. PLoS One 2017), the authors did not confirm with HLA typing the importance of their SNP peak in the class II region. This would have supported the importance of this ancestry adjustment versus prior ITHGC analysis. 

      The populations consider active TB and healthy controls (from high-burden presumed exposed communities) and do not provide QFT or other data to identify latent TB infection. 

      Important methodological points for clarification and for readers to be aware of when reading this paper: 

      (1) One of the reasons cited for the lack of African ancestry-specific associations or suggestive peaks in the ITHGC study was the small African sample size. The current association test includes a larger African cohort and yields a near-genome-wide significant association in the HLA-DPB1 gene originating from the KhoeSan ancestry. Investigation is needed as to whether the increase in power is due to the increased number of African samples and not necessarily the use of the LAAA model, as stated on lines 295 and 296. 

      Thank you for your comment. The Manhattan plot in Figure 3 includes the results for all four models: the traditional GWAS model (GAO), the admixture mapping model (LAO), the ancestry plus allelic (APA) model and the LAAA model. In this figure, it is evident that only the LAAA model identified the association peak on chromosome 6, which lends support to the argument that the increase in power is due to the use of the LAAA model and not solely due to the increase in sample size. 

      (2) In line 256, the number of SNPs included in the LAAA analysis was 784,557 autosomal markers; the number of SNPs after quality control of the imputed dataset was 7,510,051 SNPs (line 142). It is not clear how or why ~90% of the SNPs were removed. This needs clarification. 

      Thank you for your recommendation. In our manuscript (line 194), we mention that “…variants with minor allele frequency (MAF) < 1% were removed to improve the stability of the association tests.” A large proportion of imputed variants fell below this MAF threshold, and were subsequently excluded from this analysis. Below, we show the number of imputed variants across MAF bins for one of our datasets [RSA(A)] to substantiate this claim:  

      Author response image 1.
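
      The MAF filter described in the response to point (2) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the bin edges and the toy frequencies are assumptions, and only the 1% threshold comes from the text. For scale, 784,557 of 7,510,051 SNPs is about 10.4%, which is consistent with the reviewer's estimate that roughly 90% of SNPs were removed.

```python
from bisect import bisect_right


def minor_allele_frequency(alt_freq):
    """Return the minor allele frequency given the alternate allele frequency."""
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(alt_freqs, threshold=0.01):
    """Keep variants whose MAF is at or above the threshold (MAF < 1% is removed)."""
    return {
        variant: freq
        for variant, freq in alt_freqs.items()
        if minor_allele_frequency(freq) >= threshold
    }


def count_by_maf_bin(alt_freqs, edges=(0.0, 0.01, 0.05, 0.1, 0.5)):
    """Count variants per MAF bin [edges[i], edges[i+1]); the last bin includes 0.5."""
    counts = [0] * (len(edges) - 1)
    for freq in alt_freqs.values():
        maf = minor_allele_frequency(freq)
        index = min(bisect_right(edges, maf) - 1, len(counts) - 1)
        counts[index] += 1
    return counts


# Hypothetical alternate allele frequencies for four variants (not study data).
toy_freqs = {"var1": 0.004, "var2": 0.02, "var3": 0.995, "var4": 0.5}
kept = filter_by_maf(toy_freqs)
bin_counts = count_by_maf_bin(toy_freqs)
```

      Taking the minimum of the alternate frequency and its complement means a variant such as `var3`, whose alternate allele is nearly fixed, is treated as rare and filtered like `var1`.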

      (3) The authors have used the significance threshold estimated by STEAM (p-value < 2.5x10<sup>-6</sup>) in the LAAA analysis. Grinde et al. (2019) implemented their significance threshold estimation approach tailored to admixture mapping (local ancestry (LA) model), where there is a reduction in testing burden. The authors should justify why this threshold would apply to the LAAA model (a joint genotype and ancestry approach). 

      Thank you for your recommendation. We describe in the methods (line 189 onwards) that the LAAA model is an extension of the APA model. Since the APA model itself simultaneously performs the null global ancestry only model and the local ancestry model (utilised in admixture mapping), we thus considered the use of a threshold tailored to admixture mapping appropriate for the LAAA model.  

      (4) Batch effect screening and correction (line 174) is a quality control check. This section is discussed after global and local ancestry inferences in the methods. Was this QC step conducted after the inferencing? If so, the authors should justify how the removed SNPs due to the batch effect did not affect the global and local ancestry inferences or should order the methods section correctly to avoid confusion. 

      Thank you for your comments. The batch effect correction method utilised a pseudo-case-control comparison which included global ancestry proportions. Thus, batch effect correction was conducted after ancestry inference. We excluded 36 627 SNPs that were believed to have been affected by the batch effect. We have amended line 186 to include the exact number of SNPs excluded due to batch effect. 

      The ancestry inference by RFMix utilised the entire merged dataset of 7 510 051 SNPs. Thus, the SNPs removed due to the batch effect make up a very small proportion of the SNPs used to conduct global and local ancestry inferences (less than 0.5%). As a result, we do not believe that the removed SNPs would have significantly affected the global and local ancestry inferences. However, we did conduct global ancestry inference with RFMix on each separate dataset as a sanity check. In the tables below, we show the average global ancestry proportions inferred for each separate dataset, the average global ancestry proportions across all datasets and the average global ancestry proportions inferred using the merged dataset. The SAC and Xhosa cohorts are shown in two separate tables due to the different number of contributing ancestral populations to each cohort. The combined average global ancestry proportions across the separate cohorts do not differ significantly from the global ancestry proportions inferred using the merged dataset. 

      Author response table 1.

      Comparison of global ancestry proportions across the separate SAC datasets and the merged cohort.

      Author response table 2.

      Comparison of global ancestry proportions in the Xhosa dataset and the merged cohort. 
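
      The sanity check described in the response to point (4) can be sketched as below. This is an illustration under stated assumptions: weighting each dataset's mean by its sample size is one reasonable way to combine per-dataset averages (the text does not say how they were combined), and the ancestry labels and toy values are hypothetical. Only the two SNP counts are taken from the response.

```python
def pooled_mean_proportions(datasets):
    """Sample-size-weighted mean ancestry proportions across datasets.

    datasets: list of (n_individuals, {ancestry_label: mean_proportion}).
    """
    total = sum(n for n, _ in datasets)
    labels = datasets[0][1].keys()
    return {
        label: sum(n * props[label] for n, props in datasets) / total
        for label in labels
    }


def max_abs_difference(props_a, props_b):
    """Largest absolute gap between two sets of ancestry proportions."""
    return max(abs(props_a[label] - props_b[label]) for label in props_a)


# SNP counts quoted in the author response.
BATCH_EFFECT_SNPS = 36_627
MERGED_SNPS = 7_510_051
batch_fraction = BATCH_EFFECT_SNPS / MERGED_SNPS  # share of SNPs removed

# Hypothetical per-dataset means and a hypothetical merged-data estimate.
toy_datasets = [
    (100, {"anc_A": 0.6, "anc_B": 0.4}),
    (300, {"anc_A": 0.2, "anc_B": 0.8}),
]
toy_merged = {"anc_A": 0.31, "anc_B": 0.69}
pooled = pooled_mean_proportions(toy_datasets)
largest_gap = max_abs_difference(pooled, toy_merged)
```

      The computed `batch_fraction` falls below 0.005, in line with the "less than 0.5%" stated in the response.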

      Reviewer #1 (Recommendations for the authors): 

      Suggestions for Improved or Additional Experiments, Data, or Analyses:   

      (1) It might be beneficial to consider splitting the data into separate discovery and validation cohorts rather than relying solely on cross-validation. This approach could provide a stronger basis for independently confirming the findings. 

      Thank you for your suggestion. However, we are hesitant to divide our already modest dataset (n=1544) into separate discovery and validation cohorts, as this would reduce the statistical power to detect significant associations.

      (2) Clearly stating the process of cross-validation in the methods section and reporting relevant validation statistics, such as accuracy, sensitivity, specificity, and area under the curve (AUC), would provide a more comprehensive assessment of the model's performance.  

      Thank you for your recommendation. We would like to highlight this article, “GWAS in the southern African context” (1), which evaluated the performance of the LAAA model compared to other models in three- and five-way admixed populations. Given the thorough evaluation of the model’s performance in that study, we did not find it necessary to reassess its performance in this manuscript.   

      (3) Analysing racial cohorts separately to see if you can replicate previous results and find significant markers in combined non-African populations that are not evident in African-only samples might be useful. 

      Thank you for your suggestion. We would like to respectfully note that race is a social construct, and its use as a proxy for genetic ancestry can be problematic (2). In our study, we rely instead on genetic ancestry inferred using ancestry inference software to provide a more accurate representation of our cohort's genetic diversity. Additionally, our cohort consists mostly of a highly admixed population group, with some individuals exhibiting ancestral contributions from up to five different global populations. Therefore, it is not possible to categorize our samples into distinct “African-only” or “non-African” groups.

      (4) It might be worthwhile to consider using polygenic risk scores (PRS) to combine multiple genetic influences. This approach could help in identifying cumulative genetic effects that are not apparent when examining individual SNPs.  

      Thank you for your recommendation. Constructing a polygenic risk score (PRS) is beyond the scope of the current study, although it is an ongoing interest in our group. We recognize its potential value and will consider incorporating this approach in future research endeavours or a separate publication. A recent publication by Majara et al. showed that PRS accuracy is low for all traits and varies across ancestrally and ethnically diverse South African groups (3).

      Recommendations for Improving the Writing and Presentation: 

      Including a more thorough discussion of the methodological limitations, such as the challenges of studying admixed populations and the potential limitations of the LAAA model, would provide a more balanced perspective. 

      Thank you for your suggestion. To provide a more balanced perspective, we included the limitations of our study in the discussion, from line 429 to line 451.

      Minor Corrections to the Text and Figures: 

      Including all relevant statistics would improve clarity. For example, providing confidence intervals for the odds ratios and discussing any observed trends or outliers would be beneficial. 

      Thank you for your recommendation. We have added 95% confidence intervals to all odds ratios reported in Table 3. However, beyond the association peak identified in the HLA-II region associated with the phenotype, we do not observe any other trends or outliers in our LAAA analysis.  

      Reviewer #2 (Recommendations for the authors): 

      Points for improvement: 

      (1) Related to the different datasets and inclusions in previous publications, it would also be good to better understand the different numbers of cases and controls included across the previous and current analyses, or discussion thereof. For instance, the RSA(M) dataset includes 555/440 cases/controls for this analysis and only 410/405 cases/controls in the ITHGC analysis. Other discrepancies are noted across the other published datasets compared to those included in this analysis, and these always need to be detailed in a supplement or similar to better understand if this could have introduced bias or was in fact correct based on the additional ancestry-related restriction applied.  

      Thank you for your comments. Table 1 of our manuscript lists the number of individuals in the RSA(M) dataset, including related individuals. As described in line 131, related individuals were subsequently excluded during quality control: “Individual datasets were screened for relatedness using KING software (Manichaikul et al., 2010) and individuals up to second degree relatedness were removed.” The ITHGC only reported the number of unrelated individuals included in their analyses, which would account for the discrepancies in the reported number of cases and controls.  

      (2) The imbalance between cases and controls in this analysis is quite striking, and it is unusual to have the imbalance favour cases over controls. This contrasts with the ITHGC, where there are substantially more controls. There is no comment on how this could potentially impact this analysis. 

      Thank you for your comment. We have included a note on our case-control imbalance in the discussion:

      “While many studies discuss methods for addressing case-control imbalances with more controls than cases (which can inflate type 1 error rates (Zhou et al. 2018; Dai et al. 2021; Öztornaci et al. 2023)), few address the implications of a large case-to-control ratio like ours (952 cases to 592 controls). To assess the impact of this imbalance, we used the Michigan genetic association study (GAS) power calculator (Skol et al. 2006). Under an additive disease model with an estimated prevalence of 0.15, a disease allele frequency of 0.3, a genotype relative risk of 1.5, and a default significance level of 7 × 10<sup>-6</sup>, we achieved an expected power of approximately 75%. With a balanced sample size of 950 cases and 950 controls, power would exceed 90%, but it would drop significantly with a smaller balanced cohort of 590 cases and 590 controls. Given these results, we proceeded with our analysis to maximize statistical power despite the case-control imbalance.” 

      Author response image 2.
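
      The reasoning in the quoted paragraph can be illustrated with a rough power sketch. This is an assumption-laden approximation based on a two-sided allelic z-test, not the GAS power calculator itself, so its numbers need not reproduce the 75% figure quoted above. The function names are hypothetical, and coding the additive model as relative risks of 1, GRR and 2·GRR − 1 is an assumption. The parameter values are those given in the text.

```python
from statistics import NormalDist


def case_control_allele_freqs(prevalence, allele_freq, grr):
    """Expected risk-allele frequencies in cases and controls (additive model)."""
    p = allele_freq
    # Hardy-Weinberg genotype frequencies for 0, 1 and 2 risk alleles.
    genotype_freqs = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
    # Assumed additive coding of genotype relative risks.
    relative_risks = (1.0, grr, 2.0 * grr - 1.0)
    baseline = prevalence / sum(g * r for g, r in zip(genotype_freqs, relative_risks))
    penetrances = [baseline * r for r in relative_risks]
    cases = [g * f / prevalence for g, f in zip(genotype_freqs, penetrances)]
    controls = [
        g * (1 - f) / (1 - prevalence) for g, f in zip(genotype_freqs, penetrances)
    ]
    return cases[2] + 0.5 * cases[1], controls[2] + 0.5 * controls[1]


def allelic_test_power(n_cases, n_controls, prevalence, allele_freq, grr, alpha):
    """Approximate power of a two-sided allelic z-test at significance level alpha."""
    p_case, p_ctrl = case_control_allele_freqs(prevalence, allele_freq, grr)
    variance = (
        p_case * (1 - p_case) / (2 * n_cases)
        + p_ctrl * (1 - p_ctrl) / (2 * n_controls)
    )
    noncentrality = (p_case - p_ctrl) / variance ** 0.5
    normal = NormalDist()
    critical = normal.inv_cdf(1 - alpha / 2)
    return normal.cdf(noncentrality - critical) + normal.cdf(-noncentrality - critical)


# Parameter values quoted in the author response.
settings = dict(prevalence=0.15, allele_freq=0.3, grr=1.5, alpha=7e-6)
power_observed = allelic_test_power(952, 592, **settings)
power_balanced_large = allelic_test_power(950, 950, **settings)
power_balanced_small = allelic_test_power(590, 590, **settings)
```

      Under this approximation the ordering matches the authors' argument: the larger balanced design has the highest power, and the observed unbalanced design has more power than the balanced design obtained by discarding cases.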

      Minor comments 

      (1) Referencing around key points of TB epidemiology and disease states seems out of date, given recent epidemiology reviews and seminal nature or lancet review articles. Please update.  

      Thank you for your suggestion. We have included the following recent publications in the introductory paragraph: 

      Zaidi, S. M. A., Coussens, A. K., Seddon, J. A., Kredo, T., Warner, D., Houben, R. M. G. J., & Esmail, H. (2023). Beyond latent and active tuberculosis: a scoping review of conceptual frameworks. EClinicalMedicine, 66, 102332. https://doi.org/10.1016/j.eclinm.2023.102332

      Menzies, N. A., Swartwood, N., Testa, C., Malyuta, Y., Hill, A. N., Marks, S. M., Cohen, T., & Salomon, J. A. (2021). Time Since Infection and Risks of Future Disease for Individuals with Mycobacterium tuberculosis Infection in the United States. Epidemiology, 32(1), 70–78. https://doi.org/10.1097/EDE.0000000000001271  

      Cudahy, P. G. T., Wilson, D., & Cohen, T. (2020). Risk factors for recurrent tuberculosis after successful treatment in a high burden setting: a cohort study. BMC Infectious Diseases, 20(1), 789. https://doi.org/10.1186/s12879-020-05515-4  

      Escombe, A. R., Ticona, E., Chávez-Pérez, V., Espinoza, M., & Moore, D. A. J. (2019). Improving natural ventilation in hospital waiting and consulting rooms to reduce nosocomial tuberculosis transmission risk in a low resource setting. BMC Infectious Diseases, 19(1), 88. https://doi.org/10.1186/s12879-019-3717-9  

      Laghari, M., Sulaiman, S. A. S., Khan, A. H., Talpur, B. A., Bhatti, Z., & Memon, N. (2019). Contact screening and risk factors for TB among the household contact of children with active TB: a way to find source case and new TB cases. BMC Public Health, 19(1), 1274. https://doi.org/10.1186/s12889-019-7597-0  

      Matose, M., Poluta, M., & Douglas, T. S. (2019). Natural ventilation as a means of airborne tuberculosis infection control in minibus taxis. South African Journal of Science, 115(9/10). https://doi.org/10.17159/sajs.2019/5737

      Smith, M. H., Myrick, J. W., Oyageshio, O., Uren, C., Saayman, J., Boolay, S., van der Westhuizen, L., Werely, C., Möller, M., Henn, B. M., & Reynolds, A. W. (2023). Epidemiological correlates of overweight and obesity in the Northern Cape Province, South Africa. PeerJ, 11, e14723. https://doi.org/10.7717/peerj.14723  

      (2) Lines 46 to 48 appear to have two contradictory statements next to each other. The first says there are numerous GWAS investigating TB susceptibility; the second says there are sparse. Please clarify.

      Thank you for bringing this to our attention. We have amended the lines as follows: 

      “Numerous genome-wide association studies (GWASs) investigating TB susceptibility have been conducted across different population groups. However, findings from these studies often do not replicate across population groups (Möller & Kinnear, 2020; Möller et al., 2018; Uren et al., 2017).”

      (3) Add ref in line 69 for two SAC populations.

      Thank you for your recommendation. We have included the citation for the ITHGC meta-analysis paper here: 

      “The authors described possible reasons for the lack of associations, including the smaller sample size compared to the other ancestry-specific meta-analyses, increased genetic diversity within African individuals and population stratification produced by two admixed cohorts from the South African Coloured (SAC) population (Schurz et al. 2024).”

      (4) Write out abbreviations the first time they appear (Line 121).

      Thank you for your recommendation. We have corrected the sentence as follows: 

      “Monomorphic sites were removed. Individuals were screened for deviations in Hardy-Weinberg Equilibrium (HWE) for each SNP and sites deviating from the HWE threshold of 10<sup>-5</sup> were removed.”
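
      The HWE filter in the amended sentence can be sketched as follows. This is a minimal illustration that assumes a one-degree-of-freedom chi-square goodness-of-fit test; the passage does not state which test or software the authors used, and many genotype QC tools apply an exact test instead. The function names and genotype counts are hypothetical, and only the 10<sup>-5</sup> threshold comes from the text.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if any(e == 0 for e in expected):
        return 1.0  # monomorphic site: the test is undefined, nothing to reject
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_hwe(n_aa, n_ab, n_bb, threshold=1e-5):
    """Keep a site unless its HWE p-value falls below the threshold."""
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= threshold


# Hypothetical genotype counts (AA, AB, BB).
in_equilibrium = passes_hwe(25, 50, 25)      # counts match HWE expectations
no_heterozygotes = passes_hwe(500, 0, 500)   # extreme heterozygote deficit
```

      A monomorphic site returns a p-value of 1 here, so it passes this check; the quoted sentence removes such sites in a separate step.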

      (5) It would be good in the supplement to see if there is a SNP peak in chromosome 20 with a hit that reached significance in the Bantu-speaking African ancestry.

      Thank you for your recommendation. We have included a regional plot for the lead variant identified on chromosome 20 originating from Bantu-speaking African ancestry in the supplementary material (Supplementary Figure 3).

      (6) It would be good to mention the p-values of rs28383206 from the ITHGC paper in this cohort for KhoeSan and Bantu-speaking African ancestries. 

      Thank you for your suggestion. We have included the following paragraph from line 352:

      “The lead variant identified in the ITHGC meta-analysis, rs28383206, was not present in our genotype or imputed datasets. The ITHGC imputed genotypes using the 1000 Genomes (1000G) reference panel (4). Variant rs28383206 has an alternate allele frequency of 11.26% in the African population subgroup within the 1000G dataset (https://www.ncbi.nlm.nih.gov/snp/rs28383206). However, rs28383206 is absent from our in-house whole-genome sequencing (WGS) datasets, which include Bantu-speaking African and KhoeSan individuals. This absence suggests that rs28383206 might not have been imputed in our datasets using the AGR reference panel, potentially due to its low alternate allele frequency in southern African populations. Our merged dataset contained two variants located within 800 base pairs of rs28383206: rs482205 (6:32576009) and rs482162 (6:32576019). However, these variants were not significantly associated with TB status in our cohort (Supplementary Table 1).” Supplementary Table 1 can be found in the supplementary material.

      (7) It would improve the readability of the ancestry proportions listed on lines 236 and 237 if these population groups were linked with the corresponding specific population used in Figure 1, as has been done in Table 2.

      Thank you for your suggestion. We have amended Figure 1 to include the corresponding population labels mentioned in Table 2.  

      (8) In line 209, it is not clear why the number of alleles of a specific ancestry at a locus is referred to as a covariate in admixture mapping when the corresponding marginal effect is the parameter of interest. 

      Thank you for bringing this to our attention. We have amended the description as follows: 

      “(2) Local ancestry (LA) model:

      This model is used in admixture mapping to identify ancestry-specific variants associated with a specific phenotype. The LA model evaluates the number of alleles of a specific ancestry at a locus and includes the corresponding marginal effect as a covariate in association analyses.”

      (9) Table 3 would benefit from a column on whether the SNP was genotyped or imputed. 

      Thank you for your suggestion. We have included a column indicating whether the SNP was genotyped or imputed, as well as an additional column with the INFO score for imputed genotypes. 

      (10) The authors should remove the print and download icons in Figure 1 on lines 240 and 241.

      Thank you for your suggestion. We have amended the figure as requested.  

      (11) In the quality control, the authors use a more relaxed threshold for missingness in individuals (90%) and genotypes (5%) and have strayed away from the conventional 97%-98%. An explanation of the choice of these thresholds will be helpful to the reader.

      Thank you for your suggestion. We aimed to use genotype and individual missingness thresholds similar to those outlined by the ITHGC meta-analysis (which utilised a threshold of 10% for both genotype and individual missingness) and by the previous LAAA analysis paper by Swart et al. in 2021. We have amended line 116 for more clarity: 

      “Individuals with genotype call rates less than 90% and SNPs with more than 5% missingness were removed as described previously (5).”
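
      The two missingness filters in the amended sentence can be sketched as below. This is a minimal illustration, not the authors' QC code: the data layout (a dictionary of per-sample genotype lists with `None` for a missing call), the function name and the order of the two steps (individuals first, then SNPs) are assumptions. Only the 90% and 5% thresholds come from the text.

```python
def filter_missingness(genotypes, min_call_rate=0.90, max_snp_missing=0.05):
    """Drop low-call-rate individuals, then SNPs with too many missing calls.

    genotypes: {sample_id: [call, ...]} with None marking a missing call and
    every list covering the same SNPs in the same order.
    Returns the filtered genotypes and the indices of the SNPs that were kept.
    """
    kept_samples = {
        sample: calls
        for sample, calls in genotypes.items()
        if calls
        and sum(call is not None for call in calls) / len(calls) >= min_call_rate
    }
    if not kept_samples:
        return {}, []
    n_samples = len(kept_samples)
    n_snps = len(next(iter(kept_samples.values())))
    kept_snps = [
        j
        for j in range(n_snps)
        if sum(calls[j] is None for calls in kept_samples.values()) / n_samples
        <= max_snp_missing
    ]
    filtered = {
        sample: [calls[j] for j in kept_snps]
        for sample, calls in kept_samples.items()
    }
    return filtered, kept_snps


# Hypothetical genotypes for three samples at ten SNPs (0/1/2 = allele dosage).
toy_genotypes = {
    "sample_a": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0],
    "sample_b": [None, 1, 2, 0, 1, 2, 0, 1, 2, 0],
    "sample_c": [None, None, None, None, None, 2, 0, 1, 2, 0],
}
filtered, kept_snps = filter_missingness(toy_genotypes)
```

      Because SNP missingness is recomputed after individuals are dropped, the order of the two steps can change which SNPs survive.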

      References  

      (1) Swart Y, van Eeden G, Uren C, van der Spuy G, Tromp G, Moller M. GWAS in the southern African context. Cold Spring Harbor Laboratory. 2022;

      (2) Byeon YJJ, Islamaj R, Yeganova L, Wilbur WJ, Lu Z, Brody LC, et al. Evolving use of ancestry, ethnicity, and race in genetics research-A survey spanning seven decades. Am J Hum Genet. 2021 Dec 2;108(12):2215–23.

      (3) Majara L, Kalungi A, Koen N, Tsuo K, Wang Y, Gupta R, et al. Low and differential polygenic score generalizability among African populations due largely to genetic diversity. HGG Adv. 2023 Apr 13;4(2):100184.

      (4) Schurz H, Naranbhai V, Yates TA, Gilchrist JJ, Parks T, Dodd PJ, et al. Multi-ancestry meta-analysis of host genetic susceptibility to tuberculosis identifies shared genetic architecture. eLife. 2024 Jan 15;13.

      (5) Swart Y, Uren C, van Helden PD, Hoal EG, Möller M. Local ancestry adjusted allelic association analysis robustly captures tuberculosis susceptibility loci. Front Genet. 2021 Oct 15;12:716558.

    1. eLife Assessment

      Studying several allergens in different mouse strains, the authors assessed the role of IgM in airway inflammatory responses and show that IgM-deficient mice have reduced airway hyperresponsiveness. Although the findings are useful and interesting and, among others, show the expression of a protein that regulates actin in smooth muscle cells, the study remains incomplete as the data and analyses only partly support the primary claim.

    2. Reviewer #1 (Public review):

      Summary:

      The authors of this study sought to define a role for IgM in responses to house dust mites in the lung.

      Strengths:

      Unexpected observation about IgM biology.

      Combination of experiments to elucidate function.

      Weaknesses:

      Would love more connection to human disease

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Hadebe and colleagues describes a striking reduction in airway hyperresponsiveness in Igm-deficient mice in response to HDM, OVA and papain across the B6 and BALB-c backgrounds. The authors suggest that the deficit is not due to improper type 2 immune responses, nor an aberrant B cell response, despite a lack of class switching in these mice. Through RNA-Seq approaches, the authors identify few differences between the lungs of WT and Igm-deficient mice, but see that two genes involved in actin regulation are greatly reduced in IgM-deficient mice. The authors target these genes by CRISPR-Cas9 in in vitro assays of smooth muscle cells to show that these may regulate cell contraction. While the study is conceptually interesting, there are a number of limitations, which stop us from drawing meaningful conclusions.

      Strengths:

      Fig. 1. The authors clearly show that IgMKO mice have strikingly reduced AHR in the HDM model, despite the presence of a good cellular B cell response.

      Weaknesses:

      Due to several technical and experimental limitations, it is unclear what leads to the reduction in airway hyperresponsiveness in IGM-KO mice. The limitations as outlined previously remain.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: The authors of this study sought to define a role for IgM in responses to house dust mites in the lung.

      Strengths:

      Unexpected observation about IgM biology

      Combination of experiments to elucidate function

      Weaknesses:

      Would love more connection to human disease

      We thank the reviewer for these comments. At the time of this publication, we have not made a concrete link with human disease. There is some anecdotal evidence of diseases such as autoimmune glomerulonephritis, Hashimoto’s thyroiditis, bronchial polyp, SLE, celiac disease and others in people with low IgM. Allergic disorders are also common in people with IgM deficiency, with some studies reporting rates as high as 33-47%. The mechanisms for the high incidence of allergic diseases are unclear as, generally, these patients have normal IgG and IgE levels. IgM deficiency may represent a heterogeneous spectrum of genetic defects, which might explain the heterogeneous nature of disease presentations. 

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Hadebe and colleagues describes a striking reduction in airway hyperresponsiveness in Igm-deficient mice in response to HDM, OVA and papain across the B6 and BALB-c backgrounds. The authors suggest that the deficit is not due to improper type 2 immune responses, nor an aberrant B cell response, despite a lack of class switching in these mice. Through RNA-Seq approaches, the authors identify few differences between the lungs of WT and Igm-deficient mice, but see that two genes involved in actin regulation are greatly reduced in IgM-deficient mice. The authors target these genes by CRISPR-Cas9 in in vitro assays of smooth muscle cells to show that these may regulate cell contraction. While the study is conceptually interesting, there are a number of limitations, which stop us from drawing meaningful conclusions.

      Strengths:

      Fig. 1. The authors clearly show that IgMKO mice have strikingly reduced AHR in the HDM model, despite the presence of a good cellular B cell response.

      Weaknesses:

      Fig. 2. The authors characterize the CD4 T cell response to HDM in IGMKO mice. They have restimulated medLN cells with anti-CD3 for 5 days to look for IL-4 and IL-13, and find no discernible difference between WT and KO mice. The absence of PBS-treated WT and KO mice in this analysis means it is unclear if HDM-challenged mice are showing IL-4 or IL-13 levels above that seen at baseline in this assay.

      We thank the reviewer for this comment. We detected only very minimal levels of IL-4 and IL-13 in PBS-treated mice. We have added a dotted line to the figure to indicate cytokine levels in unstimulated or naïve samples. Please see Author response image 1 below, which shows anti-CD3-stimulated cytokine ELISA data. The levels of these cytokines are very low and do not differ between WT and IgM<sup>-/-</sup> mice; this is also true for PMA/ionomycin-stimulated cells.

      Author response image 1.

      The choice of 5 days is strange, given that the response the authors want to see is in already primed cells. A 1-2 day assay would have been better.

      We agree with the reviewer that a shorter stimulation period would work. Over the years we have settled for 5-day re-stimulation for both anti-CD3 and HDM. We have tried other time points, but we consistently get better secretion of cytokines after 5 days.

      It is concerning that the authors state that HDM restimulation did not induce cytokine production from medLN cells, since countless studies have shown that restimulation of medLN would induce IL-13, IL-5 and IL-10 production from medLN. This indicates that the sensitization and challenge model used by the authors is not working as it should.

      We thank the reviewer for this observation. In our recent paper showing how antigen load affects B cell function, we used very low levels of HDM to sensitise and challenge mice (1 ug and 3 ug respectively); see the article below, Hadebe et al., 2021 JACI. We did so because labs that have used these low HDM levels also suggested that antigen load impacts B cell function, especially their role in germinal centres. We believe we see low or undetectable levels of cytokines because of this low antigen load during sensitisation and challenge. In other manuscripts we have published or are about to publish, we have shown that normal HDM sensitisation load (1 ug or 100 ug) and challenge (10 ug) do induce cytokine release upon restimulation with HDM; see the article below by Khumalo et al., 2020 JCI Insight (Figure 4A).

      Sabelo Hadebe, Jermaine Khumalo, Sandisiwe Mangali, Nontobeko Mthembu, Hlumani Ndlovu, Amkele Ngomti, Martyna Scibiorek, Frank Kirstein, Frank Brombacher. Deletion of IL-4Ra signalling on B cells limits hyperresponsiveness depending on antigen load. doi.org/10.1016/j.jaci.2020.12.635

      Jermaine Khumalo, Frank Kirstein, Sabelo Hadebe, Frank Brombacher. IL-4Rα signalling in regulatory T cells is required for dampening allergic airway inflammation through inhibition of IL-33 by type 2 innate lymphoid cells. JCI Insight. 2020 Oct 15;5(20):e136206. doi: 10.1172/jci.insight.136206

      The IL-13 staining shown in panel c is also not definitive. One should be able to optimize their assays to achieve a better level of staining, to my mind.

      We agree with the reviewer that a much higher frequency of IL-13-producing CD4 T cells should be observed. We don’t think this is a technical glitch or a non-optimal set-up, as we see much higher levels of IL-13-producing CD4 T cells, between 7-20% in WT mice, when using higher doses of HDM to sensitise and challenge (see Author response image 2, lung stimulated with PMA/ionomycin+Monensin; please note that this is for illustration purposes only and is not linked to the current manuscript; it merely demonstrates a point from other experiments we have conducted in the lab).

      Author response image 2.

      In d-f, the authors perform a serum transfer, but they only do this once. The half life of IgM is quite short. The authors should perform multiple naïve serum transfers to see if this is enough to induce FULL AHR.

      We thank the reviewer for this comment. We apologise if this was not clear enough in the figure legend and methods. We did transfer serum 3 times: a day before sensitisation, on the day of sensitisation, and a day before the challenge, to circumvent the short half-life of IgM. In our subsequent experiments, we have now used busulfan to deplete all bone marrow in IgM-deficient mice and replace it with WT bone marrow, and this method restores AHR (Figure 3).

      This now appears in lines 165 to 169 and reads:

      “Adoptive transfer of naïve serum

      Naïve wild-type mice were euthanised and blood was collected via cardiac puncture before being spun down (5500 rpm, 10 min, RT) to collect serum. Serum (200 μL) was injected intraperitoneally into IgM-deficient mice. Serum was injected intraperitoneally at day -1, 0, and a day before the challenge with HDM (day 10).”

      The presence of negative values of total IgE in panel F would indicate some errors in calculation of serum IgE concentrations.

      We thank the reviewer for this observation. For better clarity, we have now indicated these values as undetected in Figure , as they were below our detection limit.

      Overall, it is hard to be convinced that IgM-deficiency does not lead to a reduction in Th2 inflammation, since the assays appear suboptimal.

      We disagree with the reviewer in this instance, because we have shown in 3 different models, in 2 different strains and at 2 doses of HDM (high and low) that Th2 responses remain intact regardless of the conditions. Our reason for choosing low-dose HDM was based on our previous work and that of others, which showed that, depending on antigen load, B cells can either be redundant or have functional roles. Since our interest was to tease out the role of B cells and specifically IgM, it was important that we look at a scenario where B cells are known to have a function (low antigen load). We did obtain similar findings at high-dose HDM: effects on AHR were not as strong, but Th2 was not changed, and in some instances Th2 was higher in IgM-deficient mice.

      Fig. 3. Gene expression differences between WT and KO mice in PBS and HDM challenged settings are shown. PCA analysis does not show clear differences between all four groups, but genes are certainly up and downregulated, in particular when comparing PBS to HDM challenged mice. In both PBS and HDM challenged settings, three genes stand out as being upregulated in WT v KO mice. These are Baiap2l1, Erdr1 and Chil1.

      Noted

      Fig. 4. The authors attempt to quantify BAIAP2L1 in mouse lungs. It is difficult to know if the antibody used really detects the correct protein. A BAIAP2L1-KO is not used as a control for staining, and I am not sure if competitive assays for BAIAP2L1 can be set up. The flow data is not convincing. The immunohistochemistry shows BAIAP2L1 (in red) in many, many cells, essentially throughout the section. There is also no discernible difference between WT and KO mice, which one might have expected based on the RNA-Seq data. So, from my perspective, it is hard to say if/where this protein is located, and whether there truly exists a difference in expression between wt and ko mice.

      We thank the reviewer for this comment. We are certain that the antibody does detect BAIAP2L1. We have used it in 3 assays, which we admit may show varying specificities since it is a polyclonal antibody. However, in our western blot, the antibody detects 1 band at 56.7 kDa and no other bands, apart from what we think are isoforms. We agree that BAIAP2L1 is expressed by many cell types, including CD45+ cells and alpha-smooth muscle-negative cells, and we show this in our Supplementary Figure 9. Where we think there is a difference in expression between WT and IgM-deficient mice is in alpha-smooth muscle-positive cells. We have tested antibodies from different companies, and we obtain similar findings. We do not have access to BAIAP2L1 KO mice. To test specificity, we have instead used single-stain controls with or without secondary antibody and an isotype control, which show no binding in western blot and immunofluorescence assays, and fluorescence-minus-one controls in flow cytometry. On this basis, we are convinced that the signal we are seeing is specific to BAIAP2L1.

      Fig. 5 and 6. The authors use a single cell contractility assay to measure whether BAIAP2L1 and ERDR1 impact on bronchial smooth muscle cell contractility. I am not familiar with the assay, but it looks like an interesting way of analysing contractility at the single cell level.

      The authors state that targeting these two genes with Cas9gRNA reduces smooth muscle cell contractility, and the data presented for contractility supports this observation. However, the efficiency of Cas9-mediated deletion is very unclear. The authors present a PCR in supp fig 9c as evidence of gene deletion, but it is entirely unclear with what efficiency the gene has been deleted. One should use sequencing to confirm deletion. Moreover, if the antibody was truly working, one should be able to use the antibody used in Fig 4 to detect BAIAP2L1 levels in these cells. The authors do not appear to have tried this.

      We thank the reviewer for these observations. We are in the process of optimising this using new polyclonal BAIAP2L1 antibodies from other companies, since the one we have tried doesn’t seem to work well on human cells via western blot. We hope that in our new version we will be able to demonstrate this by immunofluorescence or western blot.

      Other impressions:

      The paper is lacking a link between the deficiency of IgM and the effects on smooth muscle cell contraction.

      The levels of IL-13 and TNF in lavage of WT and IGMKO mice could be analysed.

      We have measured the Th2 cytokine IL-13 in BAL fluid and found no differences between IgM-deficient mice and WT mice challenged with HDM (Author response image 1). We could not detect TNF-alpha in the BAL fluid; it was below the detection limit.

      Author response image 3.

      IL-13 levels are not changed in IgM-deficient mice in the lung. Bronchoalveolar lavage fluid in WT or IgM-deficient mice sensitised and challenged with HDM. TNF-a levels were below the detection limit.

      Moreover, what is the impact of IgM itself on smooth muscle cells? In the Fig. 7 schematic, are the authors proposing a direct role for IgM on smooth muscle cells? Does IgM in cell culture media induce contraction of SMC? This could be tested and would be interesting, to my mind.

      We thank the reviewer for these comments. We are still trying to test this; unfortunately, we have experienced delays in getting reagents such as human IgM to South Africa. We hope that we will be able to add this in subsequent versions of the article. We agree it is an interesting experiment to do, if not for this manuscript then for our general understanding of this interaction, at least in an in vitro system.

      Reviewer #3 (Public Review):

      Summary:

      This paper by Sabelo et al. describes a new pathway by which lack of IgM in the mouse lowers bronchial hyperresponsiveness (BHR) in response to methacholine in several mouse models of allergic airway inflammation in Balb/c mice and C57/Bl6 mice. Strikingly, loss of IgM does not lead to less eosinophilic airway inflammation, Th2 cytokine production or mucus metaplasia, but to a selective loss of BHR. This occurs irrespective of the dose of allergen used. This was important to address since several prior models of HDM allergy have shown that the contribution of B cells to airway inflammation and BHR is dose dependent.

      After a description of the phenotype, the authors try to elucidate the mechanisms. There is no loss of B cells in these mice. However, there is a lack of class switching to IgE and IgG1, with a concomitant increase in IgD. Restoring immunoglobulins with transfer of naïve serum in IgM deficient mice leads to restoration of allergen-specific IgE and IgG1 responses, although the paper does not really explain how this might work. There is also no restoration of IgM responses, and concomitantly, the phenotype of reduced BHR still holds when serum is given, leading the authors to conclude that the mechanism is IgE and IgG1 independent. Wild type B cell transfer also does not restore IgM responses, due to lack of engraftment of the B cells. Next, the authors perform whole lung RNA sequencing and pinpoint reduced BAIAP2L1 mRNA as the culprit of the phenotype of IgM<sup>-/-</sup> mice. However, this cannot be validated fully on protein levels and immunohistology since differences between WT and IgM KO are not statistically significant, and B cell and IgM restoration are impossible. The histology and flow cytometry seem to suggest that expression is mainly found in alpha smooth muscle positive cells, which could still be smooth muscle cells or myofibroblasts. Next, therefore, the authors move to CRISPR knock down of BAIAP2L1 in a human smooth muscle cell line, and show that loss leads to less contraction of these cells in vitro in a microscopic FLECS assay, in which smooth muscle cells bind to elastomeric contractible surfaces.

      Strengths:

      (1) There is a strong reduction in BHR in IgM-deficient mice, without alterations in B cell number, disconnected from effects on eosinophilia or Th2 cytokine production

      (2) BAIAP2L1 has never been linked to asthma in mice or humans

      Weaknesses:

      (1) While the observations of reduced BHR in IgM deficient mice are strong, there is insufficient mechanistic underpinning on how loss of IgM could lead to reduced expression of BAIAP2L1. Since it is impossible to restore IgM levels by either serum or B cell transfer and since protein levels of BAIAP2L1 are not significantly reduced, a causal relationship showing that this explains the lack of BHR in IgM-deficient mice has not been established. It is unclear to the reader whether there is a fundamental (maybe developmental) difference in non-hematopoietic cells in these IgM-deficient mice (which might have accumulated another genetic mutation over the years). In this regard, it would be important to know if littermates were newly generated, or historically bred along with the KO line.

      We thank the reviewer for asking this question and getting us to think of this in a different way. This prompted us to use a different method to try to restore IgM function, and since our animal facility no longer allows irradiation, we opted for busulfan. We present this as new data in Figure 3. We had to go back and breed this strain and then generate bone marrow chimeras. What we have now shown with chimeras is that we can deplete bone marrow from IgM-deficient mice and replace it with congenic WT bone marrow, allowing these mice to rest for 2 months before challenge with HDM (new Supplementary Figure 6 a-c). We also show that AHR (resistance and elastance) is partially restored in this way (Figure 3 a and b), as mice that receive congenic WT bone marrow after chemical irradiation can mount AHR, whereas those that receive IgM-deficient bone marrow cannot mount AHR upon challenge with HDM. If the mice had accumulated an unknown genetic mutation in non-hematopoietic cells, the transfer of WT bone marrow would not make a difference. So, we don’t believe the colony could have gained a mutation that we are unaware of. We have also shipped these mice to other groups, and in their hands this strain still behaves only as an IgM-only knockout mouse. See their publication below.

      Mark Noviski, James L Mueller, Anne Satterthwaite, Lee Ann Garrett-Sinha, Frank Brombacher, Julie Zikherman 2018. IgM and IgD B cell receptors differentially respond to endogenous antigens and control B cell fate. eLife 2018;7:e35074. DOI: https://doi.org/10.7554/eLife.35074

      We have also added methods for bone marrow chimaeras, together with results sections and new figures related to these methods.

      Methods (line 171-182).

      “Busulfan Bone marrow chimeras

      WT (CD45.2) and IgM<sup>-/-</sup> (CD45.2) congenic mice were treated with 25 mg/kg busulfan (Sigma-Aldrich, Aston Manor, South Africa) per day for 3 consecutive days (75 mg/kg in total) dissolved in 10% DMSO and phosphate-buffered saline (0.2 mL, intraperitoneally) to ablate bone marrow cells. Twenty-four hours after the last administration of busulfan, mice were injected intravenously with fresh bone marrow (10x10<sup>6</sup> cells, 100 μL) isolated from hind leg femurs of either WT (CD45.1) or IgM<sup>-/-</sup> mice (33). Animals were then allowed to complement their haematopoietic cells for 8 weeks. In some experiments, the level of bone marrow ablation was assessed 4 days post-busulfan treatment in mice that did not receive donor cells. At the end of the experiment, the level of complemented cells was also assessed in WT and IgM<sup>-/-</sup> mice that received WT (CD45.1) bone marrow.”

      Results (line 491-521)

      “Replacement of IgM-deficient mice with functional hematopoietic cells in busulfan chimeric mice restores airway hyperresponsiveness.

      We then generated bone marrow chimeras by chemical irradiation using busulfan (33). We treated mice with busulfan once daily for 3 consecutive days and after 24 hrs transferred naïve bone marrow from congenic CD45.1 WT mice or CD45.2 IgM<sup>-/-</sup> mice (Fig. 3a and Supplementary Fig. 5a). We showed that recipient mice that did not receive donor bone marrow had, at 4 days post-treatment, significantly reduced lineage markers (CD45+Sca-1+) or lineage negative (Lin-) cells in the bone marrow when compared to untreated or vehicle (10% DMSO) treated mice (Supplementary Figure 5b-c). We allowed mice to reconstitute bone marrow for 8 weeks before sensitisation and challenge with low dose HDM (Figure 3a). We showed that WT (CD45.2) recipient mice that received WT (CD45.1) donor bone marrow had higher airway resistance and elastance and this was comparable to IgM<sup>-/-</sup> (CD45.2) recipient mice that received donor WT (CD45.1) bone marrow (Figure 3b). As expected, IgM<sup>-/-</sup> (CD45.2) recipient mice that received donor IgM<sup>-/-</sup> (CD45.2) bone marrow had significantly lower AHR compared to WT (CD45.2) or IgM<sup>-/-</sup> (CD45.2) recipient mice that received WT (CD45.1) bone marrow (Figure 3b). We confirmed that the differences observed were not due to differences in bone marrow reconstitution as we saw similar frequencies of CD45.1 cells within the lymphocyte populations in the lungs and other tissues (Supplementary Fig. 5d). We observed no significant changes in the lung neutrophils, eosinophils, inflammatory macrophages, CD4 T cells or B cells in WT or IgM<sup>-/-</sup> (CD45.2) recipient mice that received donor WT (CD45.1/CD45.2) or IgM<sup>-/-</sup> (CD45.2) bone marrow when sensitised and challenged with low dose HDM (Fig. 3c).

      Restoring IgM function through adoptive reconstitution with congenic CD45.1 bone marrow in non-chemically irradiated recipient mice or sorted B cells into IgM<sup>-/-</sup> mice (Supplementary Fig.  6a) did not replenish IgM B cells to levels observed in WT mice and as a result did not restore AHR, total IgE and IgM in these mice (Supplementary Fig.  6b-c).”

      The 2 new figures are

      Figure 3 which moved the rest of the Figures down and Supplementary Figure 5, which also moved the rest of the supplementary figures down.

      Discussion appears in line 757-766 of the untracked version of the article.

      To resolve other endogenous factors that could have potentially influenced reduced AHR in IgM-deficient mice, we resorted to busulfan chemical irradiation to deplete bone marrow cells in IgM-deficient mice and replace bone marrow with WT bone marrow. While it is well accepted that busulfan chemical irradiation partially depletes bone marrow cells, in our case it was not possible to pursue other irradiation methods due to changes in ethical regulations and the fact that mice are slow to recover after gamma-ray irradiation. Busulfan chemical irradiation allowed us to show that we could mostly restore AHR in IgM-deficient recipient mice that received donor WT bone marrow when challenged with low dose HDM.

      (2) There is no mention of the potential role of complement in activation of AHR, which might be altered in IgM-deficient mice 

      We thank the reviewer for this comment. We have not directly looked at complement in this instance; however, in our previous work, C3-/- mice showed AHR comparable to WT mice under HDM challenge.

      (3) What is the contribution of elevated IgD in the phenotype of the IgM-deficient mice? It has been described by this group that IgD levels are clearly elevated.

      We thank the reviewer for this question. We believe that IgD is essentially what drives partial class switching to IgG. We have certainly shown, in the case of VSV, Trypanosoma congolense and Trypanosoma brucei brucei, that elevated IgD drives delayed but effective IgG in the absence of IgM (Lutz et al, 2001, Nature). This is also confirmed by the Noviski et al. study, which shows that IgM and IgD share some endogenous antigens, so it is likely that external antigens can activate IgD in a similar manner to prompt class switching.

      (4) How can transfer of naïve serum in class switching deficient IgM KO mice lead to restoration of allergen specific IgE and IgG1?

      We thank the reviewer for these comments. We believe that naïve serum transferred to IgM-deficient mice is able to bind to the surface of B cells via IgM receptors (FcμR / Fcα/μR), which are still present on B cells, and that this is sufficient to facilitate class switching. Our IgM<sup>-/-</sup> mouse lacks both membrane-bound and secreted IgM, and transferred serum contains at least secreted IgM, which can bind to surfaces via its Fc portion. We measured HDM-specific IgE and found very low levels, but these were not different between WT and IgM<sup>-/-</sup> mice adoptively transferred with WT serum. We also detected HDM-specific IgG1 in IgM<sup>-/-</sup> mice transferred with WT sera at the same level as in WT mice, confirming possible class switching. Of course, we can’t rule out that transferred sera also contain some IgG1. We also can’t rule out that elevated IgD levels are partially responsible for class-switched IgG1, as discussed above.

      In the discussion line 804-812, we also added the following

      “We speculate that IgM can directly activate smooth muscle cells by binding a number of its surface receptors including FcμR, Fcα/μR and pIgR(52-54). IgM binds to FcμR strictly, but shares Fcα/μR and pIgR with IgA(5,52,54). Both Fcα/μR and pIgR can be expressed by non-structural cells at mucosal sites(54,55). We would not rule out that the mechanisms of muscle contraction might be through one of these IgM receptors, especially the ones expressed on smooth muscle cells(54,55). Certainly, our future studies will be directed towards characterizing the mechanism by which IgM potentially activates the smooth muscle.”

      We have discussed this section under Discussion section, line 731 to 757. In addition, since we have now performed bone marrow chimaeras we have further added the following in our discussion in line 757-766.

      To resolve other endogenous factors that could have potentially influenced reduced AHR in IgM-deficient mice, we resorted to busulfan chemical irradiation to deplete bone marrow cells in IgM-deficient mice and replace bone marrow with WT bone marrow. While it is well accepted that busulfan chemical irradiation partially depletes bone marrow cells, in our case it was not possible to pursue other irradiation methods due to changes in ethical regulations and the fact that mice are slow to recover after gamma-ray irradiation. Busulfan chemical irradiation allowed us to show that we could mostly restore AHR in IgM-deficient recipient mice that received donor WT bone marrow when challenged with low dose HDM.

      We removed the following lines, after performing bone marrow chimaeras since this changed some aspects.

      Our efforts to adoptively transfer wild-type bone marrow or sorted B cells into IgM-deficient mice were also largely unsuccessful partly due to poor engraftment of wild-type B cells into secondary lymphoid tissues. Natural secreted IgM is mainly produced by B1 cells in the peritoneal cavity, and it is likely that any transfer of B cells via bone marrow transfer would not be sufficient to restore soluble levels of IgM(3,10).

      (5) Alpha smooth muscle antigen is also expressed by myofibroblasts. This is insufficiently worked out. The histology mentions "expression in cells in close contact with smooth muscle". This needs more detail since it is a very vague term. Is it in smooth muscle or in myofibroblasts.

      Response: We appreciate that alpha-smooth muscle actin-positive cells are a small fraction in the lung and even within CD45 negative cells, but their contribution to airway hyperresponsiveness is major. We also concede that by immunofluorescence BAIAP2L1 seems to be expressed by cells adjacent to alpha-smooth muscle actin (Fig. 5b), however, we know that cells close to smooth muscle (such as extracellular matrix and myofibroblasts) contribute to its hypertrophy in allergic asthma.

      James AL, Elliot JG, Jones RL, Carroll ML, Mauad T, Bai TR, et al. Airway Smooth Muscle Hypertrophy and Hyperplasia in Asthma. Am J Respir Crit Care Med [Internet]. 2012;185:1058–64. Available from: https://doi.org/10.1164/rccm.201110-1849OC

      (6) Have polymorphisms in BAIAP2L1 ever been linked to human asthma?

      No. We have looked in asthma GWAS studies, at least in the summary statistics, and we have not seen any SNPs that could be associated with human asthma.

      (7) IgM deficient patients are at increased risk for asthma. This paper suggests the opposite. So the translational potential is unclear

      We thank the reviewer for these comments. At the time of this publication, we have not made a concrete link with human disease. There is some anecdotal evidence of diseases such as autoimmune glomerulonephritis, Hashimoto’s thyroiditis, bronchial polyp, SLE, celiac disease and other diseases in people with low IgM. Allergic disorders are also common in people with IgM deficiency, as the reviewer correctly points out; other studies have reported rates as high as 33-47%. The mechanisms for the high incidence of allergic diseases are unclear, as these patients generally have normal or higher IgG and IgE levels. IgM deficiency may represent a heterogeneous spectrum of genetic defects, which might explain the heterogeneous nature of disease presentations.

    1. eLife Assessment

      This important study describes how a single effector of the Type Six Secretion System (T6SS) has two distinct functions, which may contribute to bacterial survival and the development of novel antibacterials. The authors utilized various methods in biochemistry, microbiology, and microscopy to produce convincing data supporting their claims about the protein's function; however, they could clarify the implications for non-experts to enhance the accessibility of this work. This manuscript is of interest to those studying T6SS, particularly those interested in effectors and bacterial enzymes.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript performs a comprehensive biochemical, structural, and bioinformatic analysis of TseP, a type 6 secretion system effector from Aeromonas dhakensis that includes identification of a domain required for secretion and residues conferring target organism specificity. Through targeted mutations, they have expanded the target range of a T6SS effector to include a gram-positive species, which are not typically susceptible to T6SS attack. Although this is not the first dual domain effector to be described, this is the first time anyone has been able to modify a T6SS effector to have an expanded target species range.

      Strengths:

      The thorough dissection of TseP activity and modulation of target specificity represent a novel contribution to the field of antibacterial research.

      Weaknesses:

      Although the mechanistic activity of TseP is fully dissected here, there are some unaddressed questions regarding the importance/evolution of the dual activity domain organization. For example, does the modified Gram-positive targeting TseP effector still kill Gram-negative bacteria in bacterial mixtures? And if so, what is the evolutionary benefit of having a TseP that cannot target Gram-positives? And can something be inferred about the biology of Aeromonas from this?

      Comments on revisions:

      The comments and critiques from the initial submission have been addressed. However, some of them have only been addressed in the authors' rebuttal. Some of the discussion, particularly regarding the validity of using E. coli PG, the ability of TseP_C4+ to still kill E. coli, and the advantages of having dual domain function effectors, probably should be present in the actual manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      Wang et al. investigate the role of TseP, a Type VI secretion system (T6SS) effector molecule, revealing its dual enzymatic activities as both an amidase and a lysozyme. This discovery significantly enhances the understanding of T6SS effectors, which are known for their roles in interbacterial competition and survival in polymicrobial environments. TseP's dual function is proposed to play a crucial role in bacterial survival strategies, particularly in hostile environments where competition between bacterial species is prevalent.

      Strengths:

      (1) The dual enzymatic function of TseP is a significant contribution, expanding the understanding of T6SS effectors.

      (2) The study provides important insights into bacterial survival strategies, particularly in interbacterial competition.

      (3) The findings have implications for antimicrobial research and understanding bacterial interactions in complex environments.

      Weaknesses:

      (1) The manuscript assumes familiarity with previous work, making it difficult to follow. Mutants and strains need clearer definition and references.

      (2) Figures lack proper controls, quantification, and clarity in some areas, notably in Figures 1A and 1C.

      (3) The Materials and Methods section is poorly organized, hindering reproducibility. Biophysical validation of Zn²⁺ interaction and structural integrity of proteins need to be addressed.

      (4) Discrepancies in protein degradation patterns and activities across different figures raise concerns about data reliability.

      Comments on revisions:

      The authors have addressed most of the comments, significantly improving the manuscript. They provided clear details of mutant constructs and strains, including additional references and a revised strain. Individual data points and statistical analyses were added to key figures, ensuring transparency and reproducibility. Supplemental data, such as protein purification details and loading controls, were included to address concerns about experimental reliability. However, the authors did not perform new experiments, such as isothermal titration calorimetry (ITC) to demonstrate the interaction between Zn<sup>2+</sup> and TsePN or stopped-flow spectroscopy to examine enzymatic kinetics, which could have further strengthened the manuscript. I trust these aspects will be addressed in future studies.

      The revised Materials and Methods section was significantly improved, providing detailed protocols for bioinformatics analyses, microscopic imaging, and enzymatic assays.

      These revisions provide a clearer and more robust presentation of TseP's dual enzymatic functions and their implications in bacterial competition. The manuscript now represents a significant contribution to understanding T6SS effectors, and I recommend it for publication in its current form.

    4. Reviewer #3 (Public review):

      Summary:

      Type VI secretion systems (T6SS) are employed by bacteria to inject competitor cells with numerous effector proteins. These effectors can kill injected cells via an array of enzymatic activities. A common class of T6SS effectors is the peptidoglycan (PG)-lysing enzymes. In this manuscript, the authors characterize a PG-lysing effector, TseP, from the pathogen Aeromonas dhakensis. While the C-terminal domain of TseP was known to have lysozyme activity, the N-terminal domain was uncharacterized. Here, the authors functionally characterize TsePN as a zinc-dependent amidase. This discovery is somewhat novel because it is rare for PG-lysing effectors to have both amidase and lysozyme activity. In the second half of the manuscript, the authors utilize a crystal structure of the lysozyme TsePC domain to inform the engineering of this domain to lyse gram-positive peptidoglycan.

      Strengths:

      The two halves of the manuscript considered together provide a nice characterization of a unique T6SS effector and reveal potentially general principles for lysozyme engineering.

      Weaknesses:

      The advantage of fusing amidase and lysozyme domains in a single effector is not discussed but would appear to be a pertinent question.

      Comments on revisions:

      The authors have adequately addressed my previous comments. The authors did not conduct any additional experiments to address the comments made by other reviewers. However, in most cases it seems that paring down the strength of claims made in the text or adding data to the supplement is sufficient to address these concerns.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript performs a comprehensive biochemical, structural, and bioinformatic analysis of TseP, a type 6 secretion system effector from Aeromonas dhakensis that includes the identification of a domain required for secretion and residues conferring target organism specificity. Through targeted mutations, they have expanded the target range of a T6SS effector to include a gram-positive species, which is not typically susceptible to T6SS attack.

      Strengths:

      All of the experiments presented in the study are well-motivated and the conclusions are generally sound.

      Thank you.

      Weaknesses:

      There are some issues with the clarity of figures. For example, the microscopy figures could have been more clearly presented as cell counts/quantification rather than representative images. Similarly, loading controls for the secreted proteins for the westerns probably should be shown.

      Also, some of the minor/secondary conclusions reached regarding the "independence" of the N and C term domains of the TseP are a bit overreaching.

      We thank the reviewer for pointing out the issues and have carefully revised the manuscript accordingly. We acknowledge the reviewer’s concern regarding the independence of the N- and C-terminal domains, and have toned down the relevant claims.

      Reviewer #2 (Public review):

      Summary:

      Wang et al. investigate the role of TseP, a Type VI secretion system (T6SS) effector molecule, revealing its dual enzymatic activities as both an amidase and a lysozyme. This discovery significantly enhances the understanding of T6SS effectors, which are known for their roles in interbacterial competition and survival in polymicrobial environments. TseP's dual function is proposed to play a crucial role in bacterial survival strategies, particularly in hostile environments where competition between bacterial species is prevalent.

      Strengths:

      (1) The dual enzymatic function of TseP is a significant contribution, expanding the understanding of T6SS effectors.

      (2) The study provides important insights into bacterial survival strategies, particularly in interbacterial competition.

      (3) The findings have implications for antimicrobial research and understanding bacterial interactions in complex environments.

      Thank you.

      Weaknesses:

      (1) The manuscript assumes familiarity with previous work, making it difficult to follow. Mutants and strains need clearer definitions and references.

      Thank you for raising the issue. We have revised the manuscript to include more detailed descriptions of the mutants and strains, along with references to prior work where relevant, to improve clarity.

      (2) Figures lack proper controls, quantification, and clarity in some areas, notably in Figures 1A and 1C.

      We have now added the controls as requested by reviewers.

      (3) The Materials and Methods section is poorly organized, hindering reproducibility. Biophysical validation of Zn<sup>2+</sup> interaction and structural integrity of proteins need to be addressed.

      We have now included more details in the Materials and Methods section. While we recognize the importance of biophysical validation of the Zn<sup>2+</sup> interaction, this analysis lies beyond the primary scope of the current study. We plan to investigate the role of Zn<sup>2+</sup> interaction and the EF-hand domain in greater depth as part of our follow-up studies. Thank you for this suggestion.

      (4) Discrepancies in protein degradation patterns and activities across different figures raise concerns about data reliability.

      We acknowledge the concern about discrepancies in protein degradation patterns. TseP exhibits inherent instability, which might explain the observed variations. We have added an explanation in the detailed response letter and the manuscript.

      Reviewer #3 (Public review):

      Summary:

      Type VI secretion systems (T6SS) are employed by bacteria to inject competitor cells with numerous effector proteins. These effectors can kill injected cells via an array of enzymatic activities. A common class of T6SS effector is peptidoglycan (PG) lysing enzymes. In this manuscript, the authors characterize a PG-lysing effector, TseP, from the pathogen Aeromonas dhakensis. While the C-terminal domain of TseP was known to have lysozyme activity, the N-terminal domain was uncharacterized. Here, the authors functionally characterize TsePN as a zinc-dependent amidase. This discovery is somewhat novel because it is rare for PG-lysing effectors to have amidase and lysozyme activity.

      In the second half of the manuscript, the authors utilize a crystal structure of the lysozyme TsePC domain to inform the engineering of this domain to lyse gram-positive peptidoglycan.

      Strengths:

      The two halves of the manuscript considered together provide a nice characterization of a unique T6SS effector and reveal potentially general principles for lysozyme engineering.

      Thank you.

      Weaknesses:

      The advantage of fusing amidase and lysozyme domains in a single effector is not discussed but would appear to be a pertinent question. Labeling of the figures could be improved to help readers understand the data.

      Thank you for the suggestions. We have revised the manuscript and figures to improve clarity.

      The advantage of having dual-domain functions relative to having just one of the two functions is likely an increase in competitive fitness. Although such dual-functional cell-wall targeting effectors have not been characterized prior to this study, there are examples in which dual functions are encoded by the same secretion module, for example the VgrG1-TseL pair in Vibrio cholerae. The C-terminus of VgrG1 not only catalyzes actin crosslinking but also recognizes and delivers the downstream encoded lipase effector TseL through direct interaction. In this context, the VgrG1-TseL pair also represents a dual-functional module. Therefore, it is likely that fusing effector domains and coupling effector functions are parallel strategies for the evolution of T6SS effectors.

    1. eLife Assessment

      This manuscript reports an important new statistical method for calculating the significance of correlations between two time-series, which provides more accuracy than other methods when the data has few replicates. The proposed method solves a real-life problem that is frequently encountered and is broadly applicable to many realistic datasets in many experimental contexts. The technique is supported with compelling mathematical derivations as well as analysis of both computer-generated and previously published experimental data.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript puts forward a statistical method to more accurately report the significance of correlations within data. The motivation for this study is two-fold. First, the publication of biological studies demands the report of p-values, and it is widely accepted that p-values below the arbitrary threshold of 0.05 give the authors of such studies justification to draw conclusions about their data. Second, many biological studies are limited by the number of replicate samples that are feasible, with fewer than 5 replicates being typical. The authors report a statistical tool that uses a permute-match approach to calculate p-values. Notably, the proposed method reduces p-values from around 0.2 to 0.04 as compared to a standard permutation test with a small sample size. The approach is clearly explained, including detailed mathematical explanations and derivations. The advantage of the approach is also demonstrated through analysis of computer-generated synthetic data with specified correlation and analysis of previously published data related to fish schooling. The authors make a clear case that this method is an improvement over the more standard approach currently used, and also demonstrate the impact of this methodology on the ability to obtain p-values that are the standard for biological research. Overall, this paper is very strong. While the subject matter seems somewhat specialized, I would make the case that this will be an important study that has broad general interest to readers. The findings are very general and applicable to many research contexts. Experimentalists also want to report accurate p-values in their work and better understand how these values are calculated. Although I believe the previous statement is true, I am not sure that many research groups doing biological work are reading specialized statistics journals regularly. Therefore a useful and broadly applicable statistical tool is well placed in this journal.

      Strengths:

      The proposed method is broadly applicable to many realistic datasets in many experimental contexts.

      The power of this method was demonstrated with both real experimental data and "synthetic" data. The advantages of the tool are clearly reported. The zebrafish data is a great example dataset.

      The method solves a real-life problem that is frequently encountered by many experimental groups in the biological sciences.

      The writing of the paper is surprisingly clear, given the technical nature of the subject matter. I would not at all consider myself a statistician or mathematician, but I found the text easy to follow. The authors did an impressive job guiding the reader through material that would often be difficult to grasp. The introduction was also well-written and clearly motivated the goals of the study.

      Weaknesses:

      A few changes could be made if the manuscript is revised. I would consider all of these points minor, but the paper could be improved if these points were addressed.

      (1) The caption of Figure 2 doesn't seem to mention panel D. Figure A-2 also does not mention C in the caption.

      (2) Figure 2D is a little hard to follow. First, the definition of "Power" is not clear, and I couldn't find the precise definition in the text. Second, the legend for the different lines in 2D is only given in Figure A-2. Perhaps a portion of the caption for Figure 2 is missing?

      (3) The concept of circular variance for the fish data was hard to understand/visualize. The equation on line 326 did not help much. If there is a very simple picture that could be added near line 326 that helps to explain Ct and theta, that could be a big help for some readers who do not work on related systems. The analysis performed is understandable; the reader just has to accept that circular variance captures the degree of alignment of the fish.

      (4) For the data discussed in Figure 3, I wasn't 100% sure how the time windows were selected. In the caption, it says "time series to different lengths starting from the first frame". So the 20 s time window was from t = 0 to t = 20 s. Would a different result be obtained if a different 20 s window was chosen (from t = 4 min to t = 4 min 20 s, just to give a specific example)? I suppose by chance one of the time windows would give a p-value less than the target 0.05; that wouldn't be surprising. Maybe a random time window should be selected (although I am not indicating what was reported was incorrect)? A little more discussion on this aspect of the study may be helpful.
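
      To make point (3) more concrete, the following is a minimal sketch of the textbook definition of circular variance, one minus the length of the mean resultant vector of the headings. It is an assumption that the manuscript's Ct follows this standard form, since the equation itself is not reproduced in the review; the function name `circular_variance` is purely illustrative.

```python
import numpy as np

def circular_variance(theta):
    """Textbook circular variance: 1 - |mean(exp(i * theta))|.

    theta: heading angles in radians, one per fish (assumed input format).
    Returns a value near 0 when headings are aligned and near 1 when they
    cancel out.
    """
    theta = np.asarray(theta, dtype=float)
    resultant = np.mean(np.exp(1j * theta))  # mean unit vector of the headings
    return 1.0 - np.abs(resultant)

# Perfectly aligned school: every fish has the same heading.
aligned = circular_variance([0.1, 0.1, 0.1, 0.1])

# Two fish swimming in opposite directions: the unit vectors cancel.
opposed = circular_variance([0.0, np.pi])
```

      Under this definition a low value means the fish point in nearly the same direction, which matches the reviewer's reading that the quantity captures the degree of alignment.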

    3. Reviewer #2 (Public review):

      Summary:

      This paper presented a hypothesis testing procedure for the independence of two time-series that was potentially suitable for nonlinear dependence and for small-sample cases. This should bring potential benefits for biology data.

      Strengths:

      The test offers good flexibility for different kinds of dependence (through adjusting \rho), and seems to have good finite sample performance compared to the literature. The justification regarding the validity of the test procedure is clear.

      Weaknesses:

      (1) The size of the test is not guaranteed to (asymptotically) equal \alpha, which may damage the power.

      (2) The computational time can be an issue for a moderately large sample size when calculating the X / Y-perfect match. It will be beneficial to include discussions on the implementations of the test.
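
      For orientation, a generic between-replicate permutation test of the kind the reviews use as a baseline might look like the sketch below. This is not the authors' permute-match procedure, whose details are not given in the reviews; the statistic (mean absolute Pearson correlation over matched replicates) and the function name `permutation_pvalue` are assumptions chosen for illustration.

```python
from itertools import permutations

import numpy as np

def permutation_pvalue(xs, ys):
    """Exhaustive permutation test across replicates.

    xs, ys: arrays of shape (n_replicates, n_timepoints); row i of xs was
    recorded together with row i of ys. Under the null hypothesis of
    independence, pairing x_i with any y_j is equally plausible.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    n = xs.shape[0]

    def statistic(order):
        # Mean absolute correlation when x_i is paired with y_order[i].
        return np.mean([abs(np.corrcoef(xs[i], ys[j])[0, 1])
                        for i, j in enumerate(order)])

    observed = statistic(range(n))
    null = [statistic(order) for order in permutations(range(n))]
    # Fraction of pairings (including the true one) at least as extreme.
    return sum(s >= observed for s in null) / len(null)

# Simulated data (hypothetical): 3 replicates, y tracks x almost exactly.
rng = np.random.default_rng(0)
xs = rng.normal(size=(3, 50))
ys = xs + 0.01 * rng.normal(size=(3, 50))
p_value = permutation_pvalue(xs, ys)
```

      Because n replicates allow only n! pairings, this baseline cannot return a p-value below 1/n!, however strong the dependence. The enumeration also grows factorially with n, which is relevant to the computational concern raised in point (2).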

    1. eLife Assessment

      This important study uses extensive comparative analysis to examine the relationship between plasma glucose levels, albumin glycation levels, and diet and life history, within the framework of the "pace of life syndrome" hypothesis. The evidence that glucose is positively correlated with glycation levels and lifespan is convincing and, although there are some limitations related to data collection, they likely make the statistically significant findings more conservative. As the first extensive comparative analysis of glycation rates, life history, and glucose levels in birds, the study has the potential to be of interest to evolutionary ecologists and the aging research community more broadly.

    2. Reviewer #2 (Public review):

      Summary

      In this extensive comparative study, Moreno-Borrallo and colleagues examine the relationships between plasma glucose levels, albumin glycation levels, diet and life-history traits across birds. Their results confirmed the expected positive relationship between plasma blood glucose level and albumin glycation rate but also provided findings that are somewhat surprising or contrast with findings of some previous studies (positive relationships between blood glucose and lifespan, or absent relationships between blood glucose and clutch mass or diet). This is the first extensive comparative analysis of glycation rates and their relationships to plasma glucose levels and life history traits in birds that is based on data collected in a single study, with blood glucose and glycation measured using unified analytical methods (except for blood glucose data for 13 species collected from a database).

      Strengths

      This is an emerging topic gaining momentum in evolutionary physiology, which makes this study a timely, novel and important contribution. The study is based on a novel data set collected by the authors from 88 bird species (67 in captivity, 21 in the wild) of 22 orders, except for 13 species, for which data were collected from a database of veterinary and animal care records of zoo animals (ZIMS). This novel data set itself greatly contributes to the pool of available data on avian glycemia, as previous comparative studies either extracted data from various studies or a ZIMS database (therefore potentially containing much more noise due to different methodologies or other unstandardised factors), or only collected data from a single order, namely Passeriformes. The data further represents the first comparative avian data set on albumin glycation obtained using a unified methodology. The authors used LC-MS to determine glycation levels, which does not have problems with specificity and sensitivity that may occur with assays used in previous studies. The data analysis is thorough, and the conclusions are substantiated. Overall, this is an important study representing a substantial contribution to the emerging field of evolutionary physiology focused on ecology and evolution of blood/plasma glucose levels and resistance to glycation.

      Weaknesses

      Unfortunately, the authors did not record handling time (i.e., time elapsed between capture and blood sampling), which may be an important source of noise because handling-stress-induced increase in blood glucose has previously been reported. Moreover, the authors themselves demonstrate that handling stress increases variance in blood glucose levels. Both effects (elevated mean and variance) are evident in Figure ESM1.2. However, this likely makes their significant findings regarding glucose levels and their associations with lifespan or glycation rate more conservative, as highlighted by the authors.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The paper explored cross-species variance in albumin glycation and blood glucose levels as a function of various life-history traits. Their results show that

      (1) blood glucose levels predict albumin glycation rates

      (2) larger species have lower blood glucose levels

      (3) lifespan positively correlates with blood glucose levels and

      (4) diet predicts albumin glycation rates.

      The data presented is interesting, especially due to the relevance of glycation to the ageing process and the interesting life-history and physiological traits of birds. Most importantly, the results suggest that some mechanisms might exist that limit the level of glycation in species with the highest blood glucose levels.

      While the questions raised are interesting and the amount of data the authors collected is impressive, I have some major concerns about this study:

      (1) The authors combine many databases and samples of various sources. This is understandable when access to data is limited, but I expected more caution when combining these. E.g. glucose is measured in all samples without any description of how handling stress was controlled for. Glucose levels can easily double in a few minutes in birds, potentially introducing variation in the data generated. The authors report no consideration of this effect, nor any statistical approaches aiming to check whether handling stress had an effect here, either on glucose or on glycation levels.

      (2) The database with the predictors is similarly problematic. There is information pulled from captivity and wild (e.g. on lifespan) without any confirmation that the different databases are comparable or not (and here I'm not just referring to the correlation between the databases, but also to a potential systematic bias, e.g. captivity-based sources likely consistently report longer lifespans). This is even more surprising, given that the authors raise the possibility of captivity effects in the discussion, and exploring this question would be extremely easy in their statistical models (a simple covariate in the MCMCglmms).

      (3) The authors state that the measurement of one of the primary response variables (glycation) was measured without any replicability test or reference to the replicability of the measurement technique.

      (4) The methods and results are very poorly presented. For instance, new model types and variables are popping up throughout the manuscript, already reporting results, before explaining what these are e.g. results are presented on "species average models" and "model with individuals", but it's not described what these are and why we need to see both. Variables, like "centered log body mass", or "mass-adjusted lifespan" are not explained. The results section is extremely long, describing general patterns that have little relevance to the questions raised in the introduction and would be much more efficiently communicated visually or in a table.

      Reviewer #2 (Public review):

      Summary

      In this extensive comparative study, Moreno-Borrallo and colleagues examine the relationships between plasma glucose levels, albumin glycation levels, diet, and life-history traits across birds. Their results confirmed the expected positive relationship between plasma blood glucose level and albumin glycation rate but also provided findings that are somewhat surprising or contradicting findings of some previous studies (relationships with lifespan, clutch mass, or diet). This is the first extensive comparative analysis of glycation rates and their relationships to plasma glucose levels and life history traits in birds that are based on data collected in a single study and measured using unified analytical methods.

      Strengths

      This is an emerging topic gaining momentum in evolutionary physiology, which makes this study a timely, novel, and very important contribution. The study is based on a novel data set collected by the authors from 88 bird species (67 in captivity, 21 in the wild) of 22 orders, which itself greatly contributes to the pool of available data on avian glycemia, as previous comparative studies either extracted data from various studies or a database of veterinary records of zoo animals (therefore potentially containing much more noise due to different methodologies or other unstandardised factors), or only collected data from a single order, namely Passeriformes. The data further represents the first comparative avian data set on albumin glycation obtained using a unified methodology. The authors used LC-MS to determine glycation levels, which does not have problems with specificity and sensitivity that may occur with assays used in previous studies. The data analysis is thorough, and the conclusions are mostly well-supported (but see my comments below). Overall, this is a very important study representing a substantial contribution to the emerging field of evolutionary physiology focused on the ecology and evolution of blood/plasma glucose levels and resistance to glycation.

      Weaknesses

      My main concern is about the interpretation of the coefficient of the relationship between glycation rate and plasma glucose, which reads as follows: "Given that plasma glucose is logarithm transformed and the estimated slope of their relationship is lower than one, this implies that birds with higher glucose levels have relatively lower albumin glycation rates for their glucose, fact that we would be referring as higher glycation resistance" (lines 318-321) and "the logarithmic nature of the relationship, suggests that species with higher plasma glucose levels exhibit relatively greater resistance to glycation" (lines 386-388). First, only plasma glucose (predictor) but not glycation level (response) is logarithm transformed, and this semi-logarithmic relationship assumed by the model means that an increase in glycation always slows down when blood glucose goes up, irrespective of the coefficient. The coefficient thus does not carry information that could be interpreted as higher (when <1) or lower (when >1) resistance to glycation (this only can be done in a log-log model, see below) because the semi-log relationship means that glycation increases by a constant amount (expressed by the coefficient of plasma glucose) for every tenfold increase in plasma glucose (for example, with glucose values 10 and 100, the model would predict glycation values 2 and 4 if the coefficient is 2, or 0.5 and 1 if the coefficient is 0.5). Second, the semi-logarithmic relationship could indeed be interpreted such that glycation rates are relatively lower in species with high plasma glucose levels. However, the semi-log relationship is assumed here a priori and forced to the model by log-transforming only glucose level, while not being tested against alternative models, such as: (i) a model with a simple linear relationship (glycation ~ glucose); or (ii) a log-log model (log(glycation) ~ log(glucose)) assuming power function relationship (glycation = a * glucose^b). The latter model would allow for the interpretation of the coefficient (b) as higher (when <1) or lower (when >1) resistance in glycation in species with high glucose levels as suggested by the authors.
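
      The reviewer's numerical example can be reproduced with a short sketch. The intercept of zero, the base-10 logarithm, and the helper names `semilog` and `power_law` are assumptions made to match the numbers quoted above; they are not taken from the manuscript's fitted models.

```python
import numpy as np

glucose = np.array([10.0, 100.0, 1000.0])

def semilog(glucose, coef, intercept=0.0):
    """Semi-log model: glycation = intercept + coef * log10(glucose)."""
    return intercept + coef * np.log10(glucose)

def power_law(glucose, a, b):
    """Log-log model on the original scale: glycation = a * glucose**b."""
    return a * glucose ** b

# Semi-log predictions for the two coefficients in the reviewer's example.
pred_coef_2 = semilog(glucose, 2.0)      # 2, 4, 6
pred_coef_half = semilog(glucose, 0.5)   # 0.5, 1, 1.5

# Each tenfold increase in glucose adds the same amount, whatever the slope.
steps_coef_2 = np.diff(pred_coef_2)
steps_coef_half = np.diff(pred_coef_half)

# In the power-law model, glycation per unit glucose falls only when b < 1.
ratio_b_below_1 = power_law(glucose, 1.0, 0.5) / glucose
ratio_b_above_1 = power_law(glucose, 1.0, 1.5) / glucose
```

      In the semi-log model the increments are constant for any coefficient, so the size of the coefficient says nothing about relative resistance. In the power-law form, the exponent b directly separates less-than-proportional from more-than-proportional increases in glycation.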

      Besides, a clear explanation of why glucose is log-transformed when included as a predictor, but not when included as a response variable, is missing.

      We apologize for missing an answer to this part before. Indeed, glucose is always log transformed and this is explained in the text.

      The models in the study do not control for the sampling time (i.e., time latency between capture and blood sampling), which may be an important source of noise because blood glucose increases because of stress following the capture. Although the authors claim that "this change in glucose levels with stress is mostly driven by an increase in variation instead of an increase in average values" (ESM6, line 46), their analysis of Tomasek et al.'s (2022) data set in ESM1 using Kruskal-Wallis rank sum test shows that, compared to baseline glucose levels, stress-induced glucose levels have higher median values, not only higher variation.

      Although the authors calculated the variance inflation factor (VIF) for each model, it is not clear how these were interpreted and considered. In some models, GVIF^(1/(2*Df)) is higher than 1.6, which indicates potentially important collinearity (see for example https://www.bookdown.org/rwnahhas/RMPH/mlr-collinearity.html). This is often the case for body mass or clutch mass (e.g. models of glucose or glycation based on individual measurements).
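
      As a reference for how the quantity GVIF^(1/(2*Df)) is obtained, here is a minimal sketch using the standard determinant formula for the generalized VIF, det(R11) * det(R22) / det(R), where R is the correlation matrix of the predictors and R11 is the block belonging to the term of interest. The data are simulated and the variable names are hypothetical; nothing here reproduces the authors' models.

```python
import numpy as np

def gvif_scaled(X, cols):
    """Return GVIF^(1/(2*Df)) for the term made of the columns `cols` of X.

    X: predictor matrix without intercept, shape (n_obs, n_predictors).
    cols: indices of the columns forming one model term (Df = len(cols)).
    Assumes at least one predictor lies outside `cols`.
    """
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)
    cols = list(cols)
    rest = [j for j in range(X.shape[1]) if j not in cols]
    gvif = (np.linalg.det(R[np.ix_(cols, cols)])
            * np.linalg.det(R[np.ix_(rest, rest)])
            / np.linalg.det(R))
    return gvif ** (1.0 / (2 * len(cols)))

# Simulated predictors (hypothetical): clutch mass strongly tracks body mass.
rng = np.random.default_rng(1)
body_mass = rng.normal(size=200)
clutch_mass = 0.9 * body_mass + 0.3 * rng.normal(size=200)
lifespan = rng.normal(size=200)
X = np.column_stack([body_mass, clutch_mass, lifespan])

body_mass_gvif = gvif_scaled(X, [0])
lifespan_gvif = gvif_scaled(X, [2])
```

      For a single-column term the result is the square root of the ordinary VIF, so the 1.6 threshold mentioned above corresponds to a VIF of about 2.56.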

      It seems that the differences between diet groups other than omnivores (the reference category in the models) were not tested and only inferred using the credible intervals from the models. However, these credible intervals relate to the comparison of each group with the reference group (Omnivore) and cannot be used for pairwise comparisons between other groups. Statistics for these contrasts should be provided instead. Based on the plot in Figure 4B, it seems possible that terrestrial carnivores differed in glycation level not only from omnivores but also from herbivores and frugivores/nectarivores.
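
      One common way to obtain such contrasts from a fitted Bayesian model is to subtract the posterior draws of two group coefficients iteration by iteration and summarize the difference. The sketch below assumes treatment (dummy) coding with Omnivore as reference, so each coefficient is a difference from omnivores; the draws are simulated placeholders, not values from the study, and `contrast_interval` is a hypothetical helper.

```python
import numpy as np

def contrast_interval(draws_a, draws_b, level=0.95):
    """Posterior mean and equal-tailed credible interval of (a - b).

    draws_a, draws_b: posterior draws of two group coefficients taken from
    the same MCMC iterations, so that their posterior correlation is kept.
    """
    diff = np.asarray(draws_a, dtype=float) - np.asarray(draws_b, dtype=float)
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(diff, [tail, 100.0 - tail])
    return diff.mean(), lower, upper

# Placeholder draws standing in for two columns of a posterior sample matrix.
rng = np.random.default_rng(2)
carnivore_vs_omnivore = rng.normal(loc=2.0, scale=0.5, size=4000)
herbivore_vs_omnivore = rng.normal(loc=0.2, scale=0.5, size=4000)

# Contrast between the two non-reference groups.
mean_diff, lower, upper = contrast_interval(carnivore_vs_omnivore,
                                            herbivore_vs_omnivore)
```

      An interval for this difference that excludes zero would support a difference between the two non-reference groups, which the intervals of each group against the reference cannot show on their own.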

      Given that blood glucose is related to maximum lifespan, it would be interesting to also see the results of the model from Table 2 while excluding blood glucose from the predictors. This would allow for assessing if the maximum lifespan is completely independent of glycation levels. Alternatively, there might be a positive correlation mediated by blood glucose levels (based on its positive correlations with both lifespan and glycation), which would be a very interesting finding suggesting that high glycation levels do not preclude the evolution of long lifespans.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Line 84: "glycation scavengers" such as polyamines - can you specify what these polyamines do exactly?

      A clarification of what we mean with "glycation scavengers" is added.

      (2) Line 87-89: specify that the work of Wein et al. and this sentence is about birds.

      This is now clarified.

      (3) Line 95: "88 species" add "OF BIRDS". Also, I think it would be nice if you specified here that you are relying on primary data.

      This is now clarified (line 96).

      (4) Line 90-119: I find this paragraph very long and complex, with too many details on the methodology. For instance, I agree with listing your hypothesis, e.g. that with POL, but then what variables you use to measure the pace of life can go in the materials and methods section (so all lines between 112-119).

      This is explained here as a previous reviewer considered this presentation was indeed needed in the introduction.

      (5) Line 122-124: The first sentence should state that you collected blood samples from various sources, and list some examples: zoos? collaborators? designated wild captures? Stating the sample size before saying what you did to get them is a bit weird. Besides, you skipped a very important detail about how these samples were collected, when, where, and using what protocols. We know very well, that glucose levels can increase quickly with handling stress. Was this considered during the captures? Moreover, you state that you had 484 individuals, but how many samples in total? One per individual or more?

      We kindly ask the reviewer to read the multiple supplementary materials provided, in which the questions of source of the samples, potential stress effects and sample sizes for each model are addressed. Each individual contributed one sample. More details about the general sources employed are given now in lines 125-127.

      (6) Line 135-36: numbers below 10 should be spelled out.

      Ok. Now that is changed.

      (7) Line 136: the first time I saw that you had both wild and captive samples. This should be among the first things to be described in the methods, as mentioned above.

      As stated above, details on this are included in the supplementary materials, but further clarifications have now been included in the main text (question 5).

      (8) Line 137-138: not clear. So you had 46 samples and 9 species. But what does the 3-3-3 sample mean? or for each species you chose 9 samples (no, cause that would be 81 samples in total)?

      This has now been clarified (lines 139-140).

      (9) Line 139-141: what methodological constraints? Too high glucose levels? Too little plasma?

      There were cases in which the device (glucometer) produced an unspecific error. This did not correspond to too high nor too low glucose levels, as these are differently signalled errors. Neither the manual nor the client service provided useful information to discern the cause. This may perhaps be related to the composition of the plasma of certain species, interfering with the measurement. Some clarifications have been added (lines 143-146).

      (10) Line 143: should be ZIMS.

      Corrected.

      (11) Line 120-148: you generally talk about individuals here, but I feel it would be more precise to use 'samples'.

      The two terms are fully interchangeable, as we never measured more than one sample for a given individual within this study. Besides, in some cases, saying “sample” could be less informative.

      (12) Line 150: missing the final number of measurements for glucose and glycation.

      Please, read the ESM6 (Table ESM6.1), where this information is given.

      (13) Line 154-155: so you took multiple samples from the same individual? It's the first time the text indicates so. Or do you mean technical replicates were not performed on the same samples?

      As previously indicated, each individual contributed only one sample. Replicates were done only for some individuals to validate the technique, as it would be unfeasible to perform replicates of all of them. This part of the text refers to the fact that not all samples were analysed at the same time, as it takes a considerable amount of time, and the mass spectrometry devices are shared with other teams and projects. Clarifications in this sense are now added (lines 160-163).

      (14) Line 171-172: "After realizing that diet classifications from AVONET were not always suitable for our purpose" - too informal. Try rephrasing, like "After determining that AVONET diet classifications did not align with our research needs...", but you still need to specify what was wrong with it and what was changed, based on what argument?

      The new formulation suggested by the reviewer has now been applied (lines 181-183). The details are given in the ESM6, as indicated in the text. 

      (15) Line 174-176: You start a new paragraph, talking about missing values, but you do not specify what variable are you talking about. you talk about calculating means, but the last variable you mentioned was diet, so it's even more strange.

      We refer to life history traits. It has now been clarified in the text (line 185).

      (16) Line 177: what longevity records? Coming from where? How did you measure longevity? Maximum lifespan ever recorded? 80-90% longevity, life expectancy???

      We refer to maximum lifespan, as indicated in the introduction and in every other case throughout the manuscript. Clarifications have now been introduced (lines 188-190).

      (17) Line 180-183: using ZIMS can be problematic, especially for maximum longevity. There are often individuals who had a wrong date of birth entered or individuals that were failed to be registered as dead. The extremes in this database are often way off. If you want to combine though, you can check the correlation of lifespans obtained from different sources for the overlapping species. If it's a strong correlation it can be ok, but intuitively this is problematic.

      The species for which we used ZIMS were those for which no other databases reported any values. We could try correlations for other species, but this issue is not necessarily restricted to ZIMS, as the primary origin of the data from other databases is often difficult to trace. Also, ZIMS is potentially more up to date than some of the other databases, mainly the Amniotes database, on which we rely the most, as it includes the highest number of species in the most easily accessible format.

      (18) Line 181-186: in ZIMS you calculate the average of the competing records, otherwise you choose the max. Why use different preferences for the same data?

      This constitutes a misunderstanding, for which we include clarifications now (line 196). We were referring here to the fact that for maximum lifespan the maximum is always chosen, while for other variables an average is calculated. 

      (19) Line 198: Burn-in and thinning interval is quite low compared to your number of iterations. How were model convergences checked?

      Please, check ESM1.

      (20) Line 201-203: What's the argument using these priors? Why not use noninformative ones? Do you have some a priori expectations? If so, it should be explained.

      Models have now been rerun with no expectations on the variance partitions so the priors are less informative, given the lack of firm expectations, and results are similar. Smaller nu values were also tried.

      (21) Line 217: "carried" OUT.

      Corrected (now in line 229).

      (22) Line 233-234: "species average model" - what is this? it was not described in the methods.

      Please, read the ESM6.

      (23) Line 232-246: (a) all this would be better described by a table or plot. You can highlight some interesting patterns, but describing it all in the text is not very useful I think, (b) statistically comparing orders represented by a single species is a bit odd.

      (a) Figure 1 shows this graphically, but previous reviewers found this part too short without descriptions. (b) We recognise this limitation, but this part is not presented as one of the main results of the article; it is only an attempt to illustrate very general patterns in order to guide future research. As glycation has never been measured in most groups, this still constitutes the best illustration of such patterns in the literature.

      (24) Line 281: the first time I saw "mass-adjusted maximum lifespan" - what is this, and how was it calculated? It should be described in the methods. But in any case, neither ratios, nor residuals should be used, but preferably the two variables should be entered side by side in the model.

      Please, see ESM6 for the explanations and justifications for all of this.

      (25) Line 281: there was also no mention of quadratic terms so far. How were polynomial effects tested/introduced in the models? Orthogonal polynomials? or x+ x^2?

      Please, read ESM6.

      (26) Table 1. What is 'Centred Log10Body mass', should be added in the methods.

      Please, read ESM6.

      (27) Table 1: what's the argument behind separating terrestrial and aquatic carnivores?

      This was mostly based on the a priori separation made in AVONET, but it is also used in a similar way by Szarka and Lendvai 2024 (comparative study on glucose in birds), where differences in glucose levels between piscivorous and carnivorous species are reported. We had some reasons to think that certain differences in dietary nutrient composition, as discussed later, can make this difference relevant.

      (28) Table 1: The variable "Maximum lifespan" is discussed and plotted as 'mass-adjusted maximum lifespan' and 'residual maximum lifespan'. First, this is confusing, the same name should be used throughout and it should be defined in the methods section. Second, it seems that non-linear effects were tested by using x + x^2. This is problematic statistically, orthogonal polynomials should be used instead (check the poly function in R). Also, how did you decide to test for non-linear effects in the case of lifespan but not the other continuous predictors? Should be described in the methods again.

      Please, read ESM6. Data exploration was performed prior to carrying out these models. Orthogonal polynomials were considered to complicate the interpretation of the estimates, and therefore of the patterns predicted by the models, so raw polynomials were used. Clarifications have now been included in line 297.
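
      To make the raw-versus-orthogonal distinction concrete, the following is a minimal sketch and is not taken from the manuscript or ESM6. It uses hypothetical predictor values and a plain Gram-Schmidt step (unlike R's poly, the columns are not rescaled to unit length). It shows that x and x^2 are strongly correlated, whereas the orthogonalised linear and quadratic columns are not.

```python
import math

def mean(v):
    return sum(v) / len(v)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def corr(u, v):
    """Pearson correlation of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    return dot(du, dv) / math.sqrt(dot(du, du) * dot(dv, dv))

def orthogonal_quadratic(x):
    """Return centred (linear, quadratic) columns that are mutually
    orthogonal, built by Gram-Schmidt on x and x**2."""
    mx = mean(x)
    lin = [a - mx for a in x]
    sq = [a * a for a in x]
    msq = mean(sq)
    sq_c = [s - msq for s in sq]
    # Remove from the centred square its projection onto the linear column.
    proj = dot(sq_c, lin) / dot(lin, lin)
    quad = [s - proj * l for s, l in zip(sq_c, lin)]
    return lin, quad

# Hypothetical predictor values (illustration only, not study data).
x = [2.0, 4.0, 5.0, 9.0, 12.0, 20.0, 31.0]
raw_r = corr(x, [a * a for a in x])
lin, quad = orthogonal_quadratic(x)
orth_r = corr(lin, quad)
print(f"raw x vs x^2 correlation:       {raw_r:.3f}")
print(f"orthogonal linear vs quadratic: {orth_r:.3f}")
```

      Both parametrisations span the same column space once an intercept is included, so fitted values are identical. What changes is how the individual coefficients are interpreted and how strongly their estimates are correlated.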

      (29) Figure 2. From the figure label, now I see that relative lifespan is in fact residual. This is problematic, see Freckleton, R. P. (2009). The seven deadly sins of comparative analysis. Journal of evolutionary biology, 22(7), 1367-1375. Using body mass and lifespan side by side is preferred. This would also avoid forcing more emphasis on body mass over lifespan meaning that you subjectively introduce body mass as a key predictor, but lifespan and body size are highly correlated, so by this, you remove a large portion of variance that might in fact be better explained by lifespan.

      Please, read ESM6 for justifications on the use of residuals.

      Reviewer #2 (Recommendations for the authors):

      (1) If the semi-logarithmic relationship (glycation ~ log10(glucose)) is to be used to support the hypothesis about higher glycation resistance in species with high blood glucose (lines 318-321 and 386-388), it should be tested whether it is significantly better than the model assuming a simple linear relationship (i.e., glycation ~ glucose). Alternatively, if the coefficient is to be used to determine whether glycation rate slows down or accelerates with increasing glucose levels, log-log model (log10(glycation) ~ log10(glucose)) assuming power function relationship (glycation = a * glucose^b) should be used (as is for example in the literature about relationships between metabolic rates and body size). Probably the best approach would be to compare all three models (linear, semi-logarithmic, and log-log) and test if one performs significantly better. If none of them, then the linear model should be selected as the most parsimonious.

      Different options (linear, both semi-logarithmic combinations and log-log) have now been tested, with similar results. All of the models confirm the pattern of a significant positive relationship between glucose and glycation. Moreover, when standardizing the variables (both glucose and glycation, either log transformed or not), the estimate of the slope is almost equal for all the models. It is also lower than one, which in the case of both the linear and log-log models confirms the stated prediction. The log-log model, showing a much lower DIC than the linear version, is now shown as the final model.
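
      As a worked illustration of the log-log reading discussed here, the sketch below assumes the power-function form named by the reviewer, glycation = a * glucose^b, and uses made-up noiseless values rather than the study data. Taking log10 of both sides gives a straight line whose slope is b, so ordinary least squares on the logged values recovers the exponent. A slope below one means that glycation rises less than proportionally with glucose.

```python
import math

def ols_slope_intercept(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical parameters and glucose values (illustration only).
a_true, b_true = 2.0, 0.6
glucose = [5.0, 8.0, 12.0, 18.0, 25.0]
glycation = [a_true * g ** b_true for g in glucose]

# log10(glycation) = log10(a) + b * log10(glucose)
log_g = [math.log10(g) for g in glucose]
log_y = [math.log10(y) for y in glycation]
b_hat, log_a_hat = ols_slope_intercept(log_g, log_y)
a_hat = 10 ** log_a_hat
print(f"estimated exponent b = {b_hat:.3f}, prefactor a = {a_hat:.3f}")
```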

      (2) ESM6, line 46: Please note that Kruskal-Wallis rank sum test in ESM1 shows that, compared to baseline glucose levels, stress-induced glucose levels have higher median values (not only higher variation). With this in mind, what is the argument here about increased variation being the main driver of stress-induced change in glucose levels based on? It seems that both the median values and variation differ between baseline and stress-induced levels, and this should be acknowledged here.

      As discussed in the public answers, the Kruskal-Wallis test cannot establish differences in means; it only indicates that the groups are “different” (implicitly, in their rank sums, which does not necessarily imply a difference in means), whereas the Levene test we performed signals heteroskedasticity. This makes the latter feature of the data analytically better grounded. Of course, when looking at the data, a higher mean can be perceived, but nothing can be said about its statistical significance. Still, some subtle changes have been introduced in the corresponding section of ESM6.

      (3) Have you recorded the sampling times? If yes, why not control them in the models? It is at least highly advisable to include the sampling times in the data (ESM5).

      As indicated in ESM6 lines 42-43, we do not have sampling times for most of the individuals (only zebra finches and swifts), so this cannot be accounted for in the models.

      (4) If sampling times will remain uncontrolled statistically, I recommend mentioning this fact and its potential consequences (i.e., rather conservative results) in the Methods section of the main text, not only in ESM6.

      A brief description of this has now been included in the main text (lines 129-132), referencing the more detailed discussion in the supplementary materials. Some subtle changes have also been included in the “Possible effects of stress” section of ESM6.

      (5) ESM6, lines 52-53: The lower repeatability in Tomasek et al.'s study compared to your study is irrelevant to the argument about the conservative nature of your results (the difference in repeatability between both studies is most probably due to the broader taxonomic coverage of the current study). The important result in this context is that repeatability is lower when sampling time is not considered within Tomasek et al.'s data set (ESM1). Therefore, I suggest rewording "showing a lower species repeatability than that from our data" to "showing lower species repeatability when sampling time is not considered" to avoid confusion. Please also note that you refer here to species repeatability but, in ESM1, you calculate individual repeatability. Nevertheless, both individual and species repeatabilities are lower when not controlling for sampling time because the main driver, in that case, is an increased residual variance.

      We recognize that the explanation as currently presented is confusing, and have substantially reworded the section. However, we would like to point out that ESM1 shows both species and individual repeatability (for the Tomasek et al. 2022 data; for ours, only species repeatability, as we do not have repeated individual values). Changes have now been made to make this more evident.

      (6) I recommend providing brief guidelines for the interpretation of VIFs to the readers, as well as a brief discussion of the obtained values and their potential importance.

      Thank you for the recommendation. We have included a brief description in lines 230-231, and also in the results section (lines 389-393).

      (7) Line 264: Please note that the variance explained by phylogeny obtained from the models with other (fixed) predictors does not relate to the traits (glucose or glycation) per se but to model residuals.

      We appreciate the indication, and this has been rephrased accordingly (lines 280-286).

      (8) Change the term "confidence intervals" to "credible intervals" throughout the paper, since confidence interval is a frequentist term and its interpretations are different from Bayesian credible interval.

      Thank you for the remark, this has now been changed.

      (9) Besides lifespan, have you also considered quadratic terms for body mass? The plot in Figure 2A suggests there might be a non-linear relationship too.

      A quadratic component of body mass has not shown any significant effect on glucose in an alternative model. Also, a model with linear instead of log glucose (as performed in other studies) did not perform better by comparing the DICs, despite both showing a significant relationship between glucose and body mass. Therefore, this model remains the best option considered as presented in the manuscript.

      (10) ESM6, lines 115-116: It is usually recommended that only factors with at least 6 or 8 levels are included as random effects because a lower number of levels is insufficient for a good estimation of variance.

      In a Bayesian approach this does not apply, as random and fixed factors are estimated similarly. 

      (11) Typos and other minor issues:

      a) Line 66: Delete "related".

      b) Figure 2: "B" label is missing in the plot.

      c) Reference 9: Delete "Author".

      d) References 15 and 83 are duplicated. Keep only ref. 83, which has the correct citation details.

      e) ESM6, line 49: Change "GLLM" to "GLMM".

      Thank you for indicating this. Now it’s corrected.

    1. eLife Assessment

      This important study introduces a fully differentiable variant of the Gillespie algorithm as an approximate stochastic simulation scheme for complex chemical reaction networks, allowing kinetic parameters to be inferred from empirical measurements of network outputs using gradient descent. The concept and algorithm design are convincing and innovative. While the proofs of concept are promising, some questions are left open about implications for more complex systems that cannot be addressed by existing methods. This work has the potential to be of significant interest to a broad audience of quantitative and synthetic biologists.

    2. Reviewer #1 (Public review):

      Summary:

      This work introduces the differentiable Gillespie algorithm, DGA, which is a differentiable variant of the celebrated (and exact) Gillespie algorithm commonly used to perform stochastic simulations across numerous fields, notably in the life sciences. The proposed DGA approximates the exact Gillespie algorithm using smooth functions, yielding a suitable approximate differentiable stochastic system as a proxy for the underlying discrete stochastic system, in which both the reaction indices and the species abundances are continuous. To illustrate their methodology, the authors specifically consider in detail the case of a well-studied two-state promoter gene regulation system that they analyze using a machine learning approach, and by combining simulation data with analytical results. For the two-state promoter gene system, the DGA is benchmarked by accurately reproducing the results of the exact Gillespie algorithm. For this same simple system, the authors also show how the DGA can be used for estimating kinetic parameters of both simulated and real noisy experimental data. This lets them argue convincingly that the DGA can become a powerful computation tool for applications in quantitative and synthetic biology. In order to argue that the DGA can be employed to design circuits with ad-hoc input-output relations, these considerations are then extended to a more complex four-state promoter model of gene regulation. The main strength of the paper is its clarity and its pedagogical presentation of the simulation methods.

      Strengths:

      The main strength of the paper is its clarity and its pedagogical presentation of the simulation methods.

      Weaknesses:

      It would have been useful to have a brief discussion, based on a concrete example, of what can be achieved with the DGA and is totally beyond the reach of the Gillespie algorithm and the numerous existing stochastic simulation methods. A more comprehensive and quantitative analysis of the limitations of the DGA, e.g. for rare events, and how it might be used for stochastic spatial systems would have also been helpful. However, this is arguably beyond the scope of this study whose primary goal is to introduce the DGA and demonstrate that it can achieve tasks like parameter estimation and network design.

      Comments on revisions:

      The authors have made a sound effort to address many of the comments raised in the previous reports. This has helped improve the clarity of the discussion.

    3. Reviewer #2 (Public review):

      Summary:

      In this work, the authors present a differentiable version of the widely-used Gillespie Algorithm. The Gillespie Algorithm has been used for decades to simulate the behavior of stochastic biochemical reaction networks. But while the Gillespie Algorithm is a powerful tool for the forward simulation of biochemical systems given some set of known reaction parameters, it cannot be used for the reverse process, i.e. inferring reaction parameters given a set of measured system characteristics. The Differentiable Gillespie Algorithm ("DGA") overcomes this limitation by approximating two discontinuous steps in the Gillespie Algorithm with continuous functions. This makes it possible to calculate gradients for each step in the simulation process which, in turn, allows the reaction parameters to be optimized via powerful backpropagation techniques. In addition to describing the theoretical underpinnings of DGA, the authors demonstrate different potential use-cases for the algorithm in the context of simple models of stochastic gene expression.

      Overall, the DGA represents an important conceptual step forward for the field and should lay the groundwork for exciting innovations in the analysis and design of stochastic reaction networks. At the same time, significantly more work is needed to establish when the approximations made by DGA are valid and to demonstrate the viability of the algorithm in the context of complicated reaction networks.

      Strengths:

      This work makes an important conceptual leap by introducing a version of the Gillespie Algorithm that is end-to-end differentiable. This idea alone has the potential to drive a number of exciting innovations in the analysis, inference, and design of biochemical reaction networks. Beyond the theoretical adjustments, the authors also implement their algorithm in a Python-based codebase that combines DGA with powerful optimization libraries like PyTorch. This codebase has the potential to be of interest to a wide range of researchers, even if the true scope of the method's applicability remains to be fully determined.

      The authors also demonstrate how DGA can be used in practice both to infer reaction parameters from real experimental data (Figure 7) and to design networks with user-specified input-output characteristics (Figure 8). These illustrations should provide a nice roadmap for researchers interested in applying DGA to their own projects/systems.

      Finally, although it does not stem directly from DGA, the exploration of pairwise parameter dependencies in different network architectures provides an interesting window into the design constraints (or lack thereof) that shape the architecture of biochemical reaction networks.

      Weaknesses:

      While it is clear that the DGA represents an important conceptual advancement, the authors do not do enough in the present manuscript to (i) validate the robustness of DGA inference and (ii) demonstrate that DGA inference works in the kinds of complex biochemical networks where it would actually be of legitimate use.

      It is to the authors' credit that they are open and explicit about the potential limitations of DGA due to breakdowns in its continuous approximations. However, they do not provide the reader with nearly enough empirical (i.e. simulation-based) or theoretical context to assess when, why, and to what extent DGA will fail in different situations. In Figure 2, they compare DGA to GA (i.e. ground-truth) in the context of a simple two-state model of stochastic transcription. Even in this minimal system, we see that DGA deviates notably from ground-truth both in the simulated mRNA distributions (Figure 2A) and in the ON/OFF state occupancy (Figure 2C). This raises the question of how DGA will scale to more complicated systems, or systems with non-steady state dynamics. Will the deviations become more severe? This is important because, in practice, there is really not much need for using DGA with a simple two-state system, since we have analytic solutions for this case. It is the more complex systems where DGA has the potential to move the needle.

      A second concern is that the authors' present approach for parameter inference and error calculation does not seem to be reliable. For example, in Figure 5A, they show DGA inference results for the ON rate of a two-state system. We see substantial inference errors, even though the inference problem should be non-degenerate in this case. One reason for this seems to be that the inference algorithm does not reliably find the global minimum of the loss function (Figure 2B). To turn DGA into a viable approach, it is paramount that the authors find some way to improve this behavior, perhaps by using multiple random initializations to better search the loss space.

      Finally, the authors do a good job of illustrating how DGA might be used to infer biological parameters (Figure 7) and design reaction networks with desired input-output characteristics (Figure 8). However, analytic solutions exist for both of the systems they select for examples. This means that, in practice, there would be no need for DGA in these contexts, since one could directly optimize, e.g., the expressions for the mean and Fano Factor of the system in Figure 7A. I still believe that it is useful to have these examples, but it seems critical to add a use-case where DGA is the only option.

      Comments on revisions:

      I am concerned that the results in Figure 8D may not be correct, or that the authors may be mis-interpreting them. From my reading of the paper they cite (Lammers & Flamholz 2023), the equilibrium sharpness limit for the network they consider in Figure 8 should be 0.25. But both solutions shown in Figure 8D fall below this limit, which means that they have sharpness levels that could have been achieved with no energy expenditure. If this is the case, then it would imply that while both systems do dissipate energy, they are not doing so productively; meaning that the same results could be achieved while holding Phi=0.

      I acknowledge that this could be due to a difference in how they measure sharpness, but wanted to raise it here in case it is, in fact, a genuine issue with the analysis.

      There should be an easy fix for this: just set the sharper "desired response" curve in 8b to be such that it demands non-equilibrium sharpness levels (0.25<S<0.5).

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript introduces a differentiable variant of the Gillespie algorithm (DGA) that allows gradient calculation using backpropagation. The most significant contribution of this work is the development of the DGA itself, a novel approach to making stochastic simulations differentiable. This is achieved by replacing discontinuous operations in the traditional Gillespie algorithm with smooth, differentiable approximations using sigmoid and Gaussian functions. This conceptual advance opens up new avenues for applying powerful gradient-based optimization techniques, prevalent in machine learning, to studying stochastic biological systems.
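
      The smoothing idea summarised above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function names, the steepness parameters a and b, and the toy propensities are all hypothetical. The exact algorithm picks a reaction with a step function on the cumulative propensities; the sketch replaces each step with a sigmoid, which gives a real-valued reaction index. It then replaces the one-hot choice of stoichiometric update with Gaussian weights centred on that index.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def hard_reaction_index(u, propensities):
    """Exact selection: first index whose cumulative normalised
    propensity exceeds the uniform draw u."""
    total = sum(propensities)
    cum = 0.0
    for j, p in enumerate(propensities):
        cum += p / total
        if u < cum:
            return j
    return len(propensities) - 1

def soft_reaction_index(u, propensities, a):
    """Smoothed selection: each step at a cumulative threshold becomes
    a sigmoid of steepness a, so the index varies continuously with u."""
    total = sum(propensities)
    cum = 0.0
    idx = 0.0
    for p in propensities[:-1]:
        cum += p / total
        idx += sigmoid(a * (u - cum))
    return idx

def soft_update(x, soft_idx, stoich, b):
    """Smoothed state update: the one-hot choice of stoichiometric
    change becomes a Gaussian weight of sharpness b around soft_idx."""
    weights = [math.exp(-b * (soft_idx - r) ** 2) for r in range(len(stoich))]
    return x + sum(w * s for w, s in zip(weights, stoich))

# Hypothetical two-reaction system: production (+1) and degradation (-1).
props = [1.0, 3.0]
stoich = [1.0, -1.0]
u = 0.9
j = hard_reaction_index(u, props)
s = soft_reaction_index(u, props, a=200.0)
x_new = soft_update(5.0, s, stoich, b=50.0)
print(j, round(s, 6), round(x_new, 6))
```

      For large a and b the smoothed quantities approach the exact ones. For small values they blur neighbouring reactions together, which is the accuracy-versus-differentiability trade-off discussed in the reviews.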

      The method was tested on a simple two-state promoter model of gene expression. The authors found that the DGA accurately captured the moments of the steady-state distribution and other major qualitative features. However, it was less accurate at capturing information about the distribution's tails, potentially because rare events result from frequent low-probability reaction events where the approximations made by the DGA have a greater impact. The authors also used the DGA to design a four-state promoter model of gene regulation that exhibited a desired input-output relationship. The DGA could learn parameters that produced a sharper response curve, which was achieved by consuming more energy.

      The authors conclude that the DGA is a powerful tool for analyzing and designing stochastic systems. The discussion lays out several open questions in the field and constructively addresses shortcomings of the proposed method as well as potential ways forward.

      Strengths:

      The DGA allows gradient-based optimization techniques to estimate parameters and design networks with desired properties.

      The DGA is effective at estimating kinetic parameters from both synthetic and experimental data. This capability highlights the DGA's potential to extract meaningful biophysical parameters from noisy biological data.

      The DGA can design a four-state promoter architecture that exhibits a desired input-output relationship. This success indicates the potential of the DGA as a valuable tool for synthetic biology, enabling researchers to engineer biological circuits with predefined behaviours.

      Weaknesses:

      The study primarily focuses on analysing the steady-state properties of stochastic systems.

      Comments on revisions:

      Thank you for addressing all the points raised. I am looking forward to seeing the next steps in the DGA's development and performance!

    5. Author response:

      The following is the authors’ response to the current reviews.

      Response to Reviewer 2’s comments:

      I am concerned that the results in Figure 8D may not be correct, or that the authors may be mis-interpreting them. From my reading of the paper they cite (Lammers & Flamholz 2023), the equilibrium sharpness limit for the network they consider in Figure 8 should be 0.25. But both solutions shown in Figure 8D fall below this limit, which means that they have sharpness levels that could have been achieved with no energy expenditure. If this is the case, then it would imply that while both systems do dissipate energy, they are not doing so productively; meaning that the same results could be achieved while holding Phi=0.

      I acknowledge that this could be due to a difference in how they measure sharpness, but wanted to raise it here in case it is, in fact, a genuine issue with the analysis. There should be an easy fix for this: just set the sharper "desired response" curve in 8b to be such that it demands non-equilibrium sharpness levels (0.25<S<0.5).

      Thank you for raising this point regarding the interpretation of our results in Figure 8D. We agree that if the equilibrium sharpness limit for this particular network is around 0.25 (as shown by Lammers & Flamholz 2023), then achieving a sharpness below this threshold could, in principle, be accomplished without any energy expenditure. However, in our current design approach, the loss function is solely designed to enforce agreement with a target mean mRNA level at different input concentrations; it does not explicitly constrain energy dissipation, noise, or other metrics. Consequently, the DGA has no built-in incentive to minimize or optimize energy consumption, which means the resulting solutions may dissipate energy without exceeding the equilibrium sharpness limit.

      In other words, the same input–output relationship could theoretically be achieved with \Phi = 0 if an explicit constraint or regularization term penalizing energy usage had been included. As noted, adding such a term (e.g., penalizing \Phi^2) is conceptually straightforward but falls outside the scope of this study. Our primary goal is to demonstrate the flexibility of the DGA in designing a desired response, rather than to delve into energy–sharpness trade-offs or other biological considerations.
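
      For concreteness, a regularised design loss of the kind mentioned here could look like the sketch below. It is a hypothetical illustration only: the function name design_loss, the penalty_weight parameter, and the numbers are assumptions and do not come from the manuscript or its codebase. The first term enforces agreement with the target mean mRNA levels; the second penalises the squared energy dissipation.

```python
def design_loss(simulated_means, target_means, phi, penalty_weight=0.0):
    """Mean squared mismatch between simulated and target mean outputs,
    plus an optional penalty on squared energy dissipation phi."""
    if len(simulated_means) != len(target_means):
        raise ValueError("simulated and target curves must have equal length")
    n = len(target_means)
    mismatch = sum((s - t) ** 2 for s, t in zip(simulated_means, target_means)) / n
    return mismatch + penalty_weight * phi ** 2

# Two hypothetical designs that fit the target curve equally well
# but dissipate different amounts of energy.
target = [1.0, 2.0, 4.0, 6.0]
fit = [1.1, 1.9, 4.2, 5.9]
unpenalised = (design_loss(fit, target, phi=0.0), design_loss(fit, target, phi=3.0))
penalised = (design_loss(fit, target, phi=0.0, penalty_weight=0.1),
             design_loss(fit, target, phi=3.0, penalty_weight=0.1))
print(unpenalised, penalised)
```

      Without the penalty the two designs are indistinguishable to the optimiser, which is why dissipative solutions can appear below the equilibrium sharpness limit. With the penalty the non-dissipative design scores lower.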

      While we appreciate the suggestion to set a higher target sharpness that exceeds the equilibrium limit, we believe the current example effectively demonstrates the DGA’s ability to design circuits with desired input-output relationships, which is the primary focus of this study. Researchers interested in optimizing energy efficiency, burst size, burst frequency, noise, response time, mutual information, or other system properties can easily extend our approach by incorporating additional terms into the loss function to target these specific objectives.

      We hope this explanation addresses your concern and clarifies that the manuscript provides sufficient context for readers to interpret the results in Figure 8D correctly.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      We thank Reviewer #1 for their thoughtful feedback and appreciation of the manuscript's clarity. Our primary goal is to introduce the DGA as a foundational tool for integrating stochastic simulations with gradient-based optimization. While we recognize the value of providing detailed comparisons with existing methods and a deeper analysis of the DGA’s limitations (such as rare event handling), these topics are beyond the scope of this initial work. Our focus is on presenting the core concept and demonstrating its potential, leaving more extensive evaluations for future research.

      Reviewer #2 (Public review):

      We thank Reviewer #2 for their detailed and constructive feedback. We appreciate the recognition of the DGA as a significant conceptual advancement for stochastic biochemical network analysis and design.

      Weaknesses:

      (1) Validation of DGA robustness in complex systems:

      Our primary goal is to introduce the DGA framework and demonstrate its feasibility. While validation on high-dimensional and non-steady-state systems is important, it is beyond the scope of this initial work. Future studies may improve scalability by employing techniques such as dynamically adjusting the smoothness of the DGA's approximations during simulation or using surrogate models that remain differentiable but more accurately capture discrete behaviors in critical regions, thus preserving gradient computation while improving accuracy.

      (2) Inference accuracy and optimization:

      We acknowledge that the non-convex loss landscape in the DGA can hinder parameter inference and convergence to global minima, as seen in Figure 5A. While techniques like multi-start optimization or second-order methods (e.g., L-BFGS) could improve performance, our focus here is on establishing the DGA framework. We plan to explore better optimization methods in future work to improve the accuracy of parameter inference in complex systems.
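
      The multi-start idea can be sketched on a toy problem. Everything below is an assumption made for illustration: the one-dimensional loss, the learning rate, and the number of starts are hypothetical and stand in for a DGA-based loss. Plain gradient descent is run from several initial points and the run with the lowest final loss is kept.

```python
import random

def loss(x):
    """Toy non-convex loss with one local and one global minimum."""
    return x ** 4 - 3.0 * x ** 2 + x

def grad(x):
    """Analytical derivative of the toy loss."""
    return 4.0 * x ** 3 - 6.0 * x + 1.0

def gradient_descent(x0, lr=0.01, steps=2000):
    """Plain fixed-step gradient descent from the start point x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def multi_start(starts):
    """Run gradient descent from every start and keep the lowest loss."""
    finals = [gradient_descent(x0) for x0 in starts]
    best = min(finals, key=loss)
    return best, loss(best)

# Seeded random initialisations so the demonstration is repeatable.
rng = random.Random(0)
starts = [rng.uniform(-2.0, 2.0) for _ in range(10)]
best_x, best_loss = multi_start(starts)
print(f"best x = {best_x:.4f}, loss = {best_loss:.4f}")
```

      A single run started at x = 1.5 settles in the shallower local minimum, while a search that includes a start on the other side of the barrier reaches the deeper one. The cost is one full optimisation per start.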

      (3) Use of simple models for demonstration:

      We selected well-understood systems to clearly illustrate the capabilities of the DGA. These examples were intended to demonstrate how the DGA can be applied, rather than to solve problems better addressed by analytical methods. Applying DGA to more complex, analytically intractable systems is an exciting avenue for future work, but introducing the method was our main objective in this study.

      Reviewer #3 (Public review):

      We thank the reviewer for their detailed and insightful feedback. We appreciate the recognition of the DGA as a significant advancement for enabling gradient-based optimization in stochastic systems.

      Weaknesses:

      (1) Application beyond steady-state analysis

      We acknowledge the limitation of focusing solely on steady-state properties. To extend the DGA for analyzing transient dynamics, time-dependent loss functions can be incorporated to capture system evolution over time. This could involve aligning simulated trajectories with experimental time-series data or using moment-matching across multiple time points. 

      (2) Numerical instability in gradient computation

      The reviewer correctly highlights that large sharpness parameters (a and b) in the sigmoid and Gaussian approximations can induce numerical instability due to vanishing or exploding gradients. To address this, adaptive tuning of a and b during optimization could balance smoothness and accuracy. Additionally, alternative smoothing functions (e.g., softmax-based reaction selection) and gradient regularization techniques (such as gradient clipping and trust-region methods) could improve stability and convergence.
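
      The instability described above can be seen directly from the derivative of a sigmoid-smoothed step, d/dx sigmoid(a*x) = a * s * (1 - s) with s = sigmoid(a*x). The short sketch below is illustrative only, with arbitrary values of the sharpness parameter a. It evaluates this gradient exactly at the threshold, where it grows as a/4, and at a fixed distance from it, where it decays towards zero.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def smoothed_step_gradient(x, a):
    """Derivative with respect to x of sigmoid(a * x)."""
    s = sigmoid(a * x)
    return a * s * (1.0 - s)

for a in (1.0, 10.0, 100.0, 1000.0):
    at_threshold = smoothed_step_gradient(0.0, a)   # grows like a / 4
    away = smoothed_step_gradient(0.5, a)           # shrinks towards 0
    print(f"a = {a:7.1f}  gradient at 0: {at_threshold:10.3f}  at 0.5: {away:.3e}")
```

      Sharper approximations therefore concentrate all of the gradient signal in a narrow band around each threshold. This is the motivation for tuning a and b adaptively or for clipping gradients.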

      Reviewer #1 (recommendations):

      We thank the reviewer for their thoughtful and constructive feedback on our manuscript. Below, we address each of the comments and suggestions raised.

      Main points:

      (1) It would have been useful to have a brief discussion, based on a concrete example, of what can be achieved with the DGA and is totally beyond the reach of the Gillespie algorithm and the numerous existing stochastic simulation methods.

      Thank you for your comment. We would like to clarify that the primary aim of this work is to introduce the DGA and demonstrate its feasibility for tasks such as parameter estimation and network design. Unlike traditional stochastic simulation methods, the DGA’s differentiable nature enables gradient-based optimization, which is not possible with the classical Gillespie algorithm or its variants.

      (2) As often with machine learning techniques, there is a sense of black box, with a lack of mathematical details of the proposed method: as opposed to the exact Gillespie algorithm, whose foundations rest on solid mathematical results (exponentially-distributed waiting times of continuous-time Markov processes), the DGA involves uncontrolled approximations that are only briefly mentioned in the paper. For instance, it is currently simply noted that "the approximations introduced by the DGA may be pronounced in more complex settings such as the calculation of rare events", without specifying how limiting these errors are. It would be useful to include a clearer and more comprehensive discussion of the limitations of the DGA: When does it work accurately? What are the approximations/errors and can they be controlled? When is it worth paying the price for those approximations/errors, and when is it better to stick to the Gillespie algorithm? Is this notably the case for problems involving rare events? Clearly, these are difficult questions, and the answers are problem specific. However, it would be important to draw the readers' attention to these issues, especially if the DGA is presented as a potentially significant tool in computational and synthetic biology.

      We acknowledge the importance of discussing the limitations of the DGA in more detail. While we have noted that the approximations introduced by the DGA may impact its accuracy in certain scenarios, such as rare-event problems, a deeper exploration of these trade-offs is outside the scope of this work. Instead, we provide sufficient context in the manuscript to guide readers on when the DGA is appropriate.

      (3) The DGA is here introduced and discussed in the context of non-spatial problems (simple gene regulatory networks). However, numerous problems in the life sciences and computational/synthetic biology, involve stochasticity and spatial degrees of freedom (e.g. for problems involving diffusion, migration, etc). It is notoriously challenging to use the Gillespie algorithm to efficiently simulate stochastic spatial systems, especially in the context of rare events (e.g., extinction or fixation problems). It would be useful to comment on whether, and possibly how, the DGA can be used to efficiently simulate stochastic spatial systems, and if it would be better suited than the Gillespie algorithm for this purpose.

      Thank you for pointing this out. Although our current work centers on non-spatial systems, we agree that many biological contexts incorporate both stochasticity and spatial degrees of freedom. Extending the DGA to efficiently simulate such systems would indeed require substantial modifications—for instance, coupling it with reaction-diffusion frameworks or spatial master equations. We believe this is an exciting direction for future research and mention it briefly in the discussion as a potential extension.

      Minor suggestions:

      (1) After Eq.(10): it would be useful to explain and motivate the choice of the ratio JSD/H.

      Done.

      (2) On page 6, just below the caption of Fig.4: it would be useful to clarify what is actually meant by "... convergence towards the steady-state distribution of the exact Gillespie simulation, which is obtained at a simulation time of 10^4".

      Done.

      (3) At the end of Section B on page 7: please clarify what is meant here by "soft directions".

      Done.

      Reviewer #2 (recommendations):

      We thank the reviewer for their thoughtful comments and constructive feedback. Below, we address each of the comments/suggestions.

      Main points:

      (1) Enumerate the conditions under which DGA assumptions hold (and when they do not). There is currently not enough information for the interested reader to know whether DGA would work for their system of interest. Without this information, it is difficult to assess what the true scope of DGA's impact will be. One simple idea would be to test DGA performance along two axes: (i) increasing number of model states and (ii) presence/absence of non-steady state dynamics. I acknowledge that these are very open-ended directions, but looking at even a single instance of each would greatly strengthen this work. Alternatively, if this is not feasible, then the authors should provide more discussion of the attendant difficulties in the main text.

      We agree that a detailed exploration of the conditions under which the DGA assumptions hold would be a valuable addition to the field. However, this paper primarily aims to introduce the DGA methodology and demonstrate its proof-of-concept applications. A comprehensive analysis along axes such as increasing model states or non-steady-state dynamics, while important, would require significant additional simulations and is beyond the scope of this work. In Appendix A, we have discussed the trade-off between accuracy and numerical stability. Additionally, we encourage future users to tune the hyperparameters a and b for their specific systems.

      (2) Demonstrate DGA performance in a more complex biochemical system. Clearly the authors were aware that analytic solutions exist for the 2-state system in Figure 7, but this is actually also the case (I think) for the mean mRNA production rate of the non-equilibrium system in Figure 8. To really demonstrate that DGA is practically viable, I encourage the authors to seek out an interesting application that is not analytically tractable.

      We appreciate the suggestion to validate DGA on a more complex biochemical system. However, the goal of this study is not to provide an exhaustive demonstration of all possible applications but to introduce the DGA and validate it in systems where ground-truth comparisons are available. While the non-equilibrium system in Figure 8 might be analytically tractable, its complexity already provides a meaningful demonstration of DGA’s ability to optimize parameters and design systems. Extending this work to analytically intractable systems is an exciting direction for future studies, and we hope this paper will inspire others to explore these applications.

      (3) Take steps to improve the robustness of parameter optimization and error bar calculations. (3a) When the loss landscape is degenerate, shallow, or otherwise "difficult," a common solution is to perform multiple (e.g. 25-100) inference runs starting from different random positions in parameter space. Doing this, and then taking the parameter set that minimizes the loss should, in theory, lead to a more robust recovery of the optimal parameter set.

      (3b) It seems clear that the Hessian approximation is underestimating the true error in your inference results. One alternative is to use a "brute force" approach like bootstrap resampling to get a better estimate for the statistical dispersion in parameter estimates. But I recognize that this is only viable if the inference is relatively fast. Simply recovering the true minimum will, of course, also help.

      (3a) We acknowledge the challenge posed by degenerate or shallow loss landscapes during parameter optimization. While performing multiple inference runs from different initializations is a common strategy, this approach is computationally intensive. Instead, we rely on standard optimization techniques (e.g., ADAM) to find a robust local minimum. 
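
      For reference, the multi-start strategy suggested by the reviewer can be sketched as follows. This is a minimal sketch under stated assumptions: plain gradient descent stands in for ADAM, and the toy double-well loss, the initialization range, and all function names are hypothetical and not taken from the manuscript.

```python
import random


def multi_start_minimize(loss, grad, dim, n_starts=25, n_steps=500, lr=0.01, seed=0):
    """Run gradient descent from several random starts and keep the best result."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_starts):
        # Random initial position in a (hypothetical) box of parameter space.
        params = [rng.uniform(-2.0, 2.0) for _ in range(dim)]
        for _ in range(n_steps):
            g = grad(params)
            params = [p - lr * gi for p, gi in zip(params, g)]
        value = loss(params)
        if value < best_loss:
            best_params, best_loss = params, value
    return best_params, best_loss


def toy_loss(params):
    """Tilted double well: two local minima, the one at negative x is lower."""
    x = params[0]
    return (x * x - 1.0) ** 2 + 0.3 * x


def toy_grad(params):
    """Analytic derivative of toy_loss."""
    x = params[0]
    return [4.0 * x * (x * x - 1.0) + 0.3]
```

      The cost scales linearly with `n_starts`, which is the computational burden the response refers to; a single run started in the wrong basin would return the higher of the two minima.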

      (3b) Thank you for your comment. We agree that Hessian-based error bars can underestimate uncertainty, particularly in degenerate or poorly conditioned loss landscapes. While methods like bootstrap and Monte Carlo can provide more robust estimates, they can be computationally prohibitive for larger-scale simulations. A simpler reason for not using them is the high resource demand from repeated simulations, which quickly becomes infeasible for complex or high-dimensional models. We note these trade-offs between robust estimation and practicality as an important area for further exploration.
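
      The bootstrap procedure discussed here can be sketched as below. This is an illustrative sketch only: `estimator` is a hypothetical stand-in for a full inference run, and in practice each call would involve repeated simulations, which is where the cost arises.

```python
import random
import statistics


def bootstrap_std(data, estimator, n_resamples=500, seed=0):
    """Estimate the spread of an estimator by resampling the data with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)
```

      With `statistics.mean` as the estimator, the result should be close to the usual standard error of the mean, which gives a quick sanity check before substituting a more expensive estimator.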

      Moderate comments:

      (1) Figure 7: is it possible to also show the inferred kon values? Specifically, it would be of interest to see how kon varies with repressor concentration.

      Thank you for the suggestion. We have updated Figure 7 to include the inferred kon values, showing their variation with the mean mRNA copy number. However, we could not plot them against repressor concentration due to the lack of available data.

      (2) Figure 8B & D: the authors claim that the sharper system dissipates more energy, but doesn't 8D show the opposite of this? More importantly, it does not look like either network drives sharpness levels that exceed the upper equilibrium limit cited in [36]. So it is not clear that it is appropriate to look at energy dissipation here. In fact, it is likely that equilibrium networks could produce the curves in 8B, and might be worth checking.

      Thank you for pointing this out. We realized that the plotted values in Figure 8D were incorrect, as we had mistakenly plotted noise instead of energy dissipation. The plot has now been corrected. 

      (3) Figure 8: I really like this idea of using DGA to "design" networks with desired input-output properties, but I wonder if you could explore a more biologically compelling use case. Specifically, what about some kind of switch-like logic where, as the activator concentration increases, you have first 0 genes on, then 1 promoter on, then 2 promoters on. This would achieve interesting regulatory logic, and having DGA try to produce step functions would ensure that you force the networks to be maximally sharp (i.e. about double what you're currently achieving).

      Thank you for this intriguing suggestion. While the proposed switch-like logic use case is indeed compelling, implementing such a system would require significant work. This goes beyond the scope of the current study, which focuses on demonstrating the feasibility of DGA for network design with simple input-output properties.

      Minor comments:

      (1) Figure 4B & C: the bar plots do not do a good job conveying the points made by the authors. Consider alternatives, such as scatter plots or box plots that could convey inference uncertainty.

      Done.

      (2) Figure 4B: consider using a log y-axis.

      The y-axis in Figure 4B is already plotted on a log scale.

      (3) Figure 4D is mentioned prior to 4C in the text. Consider reordering.

      Done. 

      (4) Figure 5B: it is difficult to assess from this plot whether or not the landscape is truly "flat," as the authors claim. Flat relative to what? Consider alternative ways to convey your point.

      Thank you for highlighting this ambiguity. By describing the loss landscape as “flat,” we intend to convey its relative insensitivity to parameter variations in certain regions, rather than implying a completely level surface. While we believe Figure 5B still provides a useful qualitative depiction of this behavior, we acknowledge that it does not quantitatively establish “flatness.” In future work, we plan to incorporate more rigorous measures—such as gradient magnitudes or Hessian eigenvalues—to more accurately characterize and communicate the geometry of the loss landscape.

      Reviewer #3 (recommendations):

      We sincerely thank the reviewer for their thoughtful feedback and constructive suggestions, which have helped us improve the clarity and rigor of our manuscript. Below, we address each of the comments.

      (1) Precision is lacking in the introduction section. Do the authors mean the Direct SSA, sorted SSA, which is usually faster, and how about rejection sampling methods?

      Thank you for pointing this out. We have updated the introduction to explicitly mention the Direct SSA.

      (2) When mentioning PyTorch and Jax, would be good to also talk about Julia, as they have fast stochastic simulators.

      We have now mentioned Julia alongside PyTorch and Jax.

      (3) Mentioned references 22-27. Reference 26 is an odd choice; a better reference from the same author is Automatic Differentiation of Programs with Discrete Randomness, G Arya, M Schauer, F Schäfer, C Rackauckas, Advances in Neural Information Processing Systems, NeurIPS 2022.

      We have now cited the suggested reference.

      (4) Page 1, Section: 'To circumnavigate these difficulties, the DGA modifies....' Have you thought about how you would deal with the bias that will be introduced by doing this?

      Thank you for your insightful comment. We acknowledge the potential for bias due to the differentiable approximations in the DGA; however, our analysis has not revealed any systematic bias compared to the exact Gillespie algorithm. Instead, we observe irregular deviations from the exact results as the smoothness of the approximations increases.

      (5) Page 2, first sentence '... traditional Gillespie...' be more precise here - the direct algorithm.

      Thank you for your comment. We believe that the context of the paper, particularly the schematic in Figure 1, makes it clear that we are focusing on the Direct SSA. 

      (6) Page 2, second paragraph: 'In order to simulate such a system...' This doesn't fit here as this section is about tau-leaping. As this approach approximates discrete operations, it is unclear if it would work for large models or snapshot data of larger scale, and if it would be possible to extend it to time-lapse data.

      Thank you for your comment. We respectfully disagree that this paragraph is misplaced. The purpose of this paragraph is to explain why the standard Gillespie algorithm does not use fixed time intervals for simulating stochastic processes. By highlighting the inefficiency of discretizing time into small intervals where reactions rarely occur, the paragraph provides necessary context for the Gillespie algorithm’s event-driven approach, which avoids this inefficiency.

      Regarding the applicability of the DGA to larger models, snapshot data, or time-lapse data, we acknowledge these are important directions and have noted them as potential extensions in the discussion section.

      (7) Page 2 Section B: 'In order to make use of modern deep-learning techniques...' It doesn't appear from the paper that any modern deep learning is used.

      Thank you for your comment. Although the DGA does not utilize deep learning architectures such as neural networks, it employs automatic differentiation techniques provided by frameworks like PyTorch and Jax. These tools allow efficient gradient computations, making the DGA compatible with modern optimization workflows.

      (8) Page 3, Fig 1(a). S matrix last row, B and C should swap places: B should be 1 and C is -1.

      Corrected the typo.

      (9) Fig1 needs a more detailed caption.

      Expanded the caption slightly for clarity.

      (10) Page 3 last paragraph: 'The hyperparameter b...' Consequences of this are relevant, for example can we now go below zero. Also, we lose more efficient algorithms here. It would be good to discuss in more detail that this is an approximate algorithm that is good for our case study, but for others to use it more tests are needed.

      Thank you for the comment. Appendix A discusses the trade-offs related to a and b, but we agree that more detailed analysis is needed. The hyperparameters are tailored to our case study and must be tuned for specific systems.

      (11) Page 4, Section C, first paragraph, 'The goal of making...' This is snapshot data. Would the framework also translate to time-lapse data? Also, it would be better to make it clearer earlier which type of data are the target of this study.

      Thank you for your suggestion. While the current study focuses on snapshot data and steady-state properties, we believe the DGA could be extended to handle time-lapse data by incorporating multiple recorded time points into its inference objective. Specifically, one could modify the loss function to penalize discrepancies across observed transitions between these time points, effectively capturing dynamic trajectories. We consider this an exciting area for future development, but it lies beyond our present scope.
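
      One possible form of such an objective is written out below as a sketch. All symbols are assumptions introduced for illustration: t_i are the T recorded time points, m_k^obs(t_i) is the measured k-th moment at time t_i, and m_k^sim(t_i; θ) is the corresponding simulated moment for parameters θ.

```latex
\mathcal{L}(\theta) = \sum_{i=1}^{T} \sum_{k}
  \left[ m_k^{\mathrm{sim}}(t_i; \theta) - m_k^{\mathrm{obs}}(t_i) \right]^2
```

      This version matches moments at each time point independently. Penalizing discrepancies in the transitions between time points would additionally require joint statistics across time points, such as two-time correlations.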

      (12) Page 4 Section C, sentence '...experimentally measured moments'. Should later be mentioned as error, as moments are imperfect

      Thank you for your comment. We agree that experimentally measured moments are inherently noisy and may not perfectly represent the true system. However, within the context of the DGA, these moments serve as target quantities, and the discrepancy between simulated and measured moments is already accounted for in the loss function. 

      (13) Page 4 Section C, last sentence '...second-order...such as ADAM'. Another formulation would be better as second order can be confusing, especially in the context of parameter estimation

      We have revised the language to avoid confusion regarding “second-order” methods.

      (14) Fig 4(a) a density plot would fit better here

      Fig. 4(a) has been updated to a scatter density plot as suggested.

      (15) Fig 4(c) Would be interesting to see closer analysis of the trade-off between gradient and accuracy when changing the a and b parameters

      Thank you for this suggestion. We acknowledge that an in-depth exploration of these trade-offs could provide deeper insights into the method’s performance. However, for now, we believe the current analysis suffices to highlight the utility of the DGA in the contexts examined.

      (16) Page 6 Section III, first sentence: This fits more to intro. Further the reference list is severely lacking here, with no comparison to other methods for actually fitting stochastic models.

      Thank you for the suggestion. We have added a few references there.

      (17) Page 6, Section A, sentence: '....experimental measured mean...' Why is it a good measure here (moment matching is not perfect), also do you have distribution data, would that not be better? How about accounting for measurement error?

      Thank you for the comment. While we do not have full distribution data, we acknowledge that incorporating experimental measurement error could enhance the framework. A weighted loss function could model uncertainty explicitly, but this is beyond the scope of the current study. 
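
      A weighted loss of the kind mentioned here could, for example, take the following form. This is a sketch under the assumption that each measured moment m_k^obs comes with an estimated standard error σ_k; neither symbol is taken from the manuscript.

```latex
\mathcal{L}_{w}(\theta) = \sum_{k}
  \frac{\left[ m_k^{\mathrm{sim}}(\theta) - m_k^{\mathrm{obs}} \right]^2}{\sigma_k^{2}}
```

      Moments measured with larger uncertainty then contribute less to the fit.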

      (18) Page 7, section B, first paragraph: 'Motivated by this, we defined the...' Why use Fisher information when profile likelihoods have proven to be better, especially for systems with few parameters like this?

      Thank you for the suggestion. While profile-likelihood is indeed a powerful tool for parameter uncertainty analysis, we chose Fisher Information due to its computational efficiency and compatibility with the differentiable nature of the DGA framework.

      (19) Page 7, section C, sentence '...set kR/off=1..'. In this case, we cannot infer this parameter.

      Thank you for the comment. You are correct that setting kR/off = 1 effectively normalizes the rates, making this parameter unidentifiable. In steady-state analyses, not all parameters can be independently inferred because observable quantities depend on relative—rather than absolute—rate values (as evident when setting the time derivative to zero in the master equation). To infer all parameters, one would need additional information, such as time-series data or moments at finite time.
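
      A generic two-state switch illustrates the point. The symbols below are assumptions for illustration and need not match the manuscript's model: p_on is the probability of the active state, and k_on and k_off are the switching rates.

```latex
\frac{dp_{\mathrm{on}}}{dt}
  = k_{\mathrm{on}}\,(1 - p_{\mathrm{on}}) - k_{\mathrm{off}}\,p_{\mathrm{on}} = 0
\quad\Longrightarrow\quad
p_{\mathrm{on}}^{*}
  = \frac{k_{\mathrm{on}}}{k_{\mathrm{on}} + k_{\mathrm{off}}}
  = \frac{k_{\mathrm{on}}/k_{\mathrm{off}}}{1 + k_{\mathrm{on}}/k_{\mathrm{off}}}
```

      The steady state depends only on the ratio of the two rates, so rescaling both by the same positive constant leaves it unchanged. Steady-state data alone therefore cannot fix the absolute time scale.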

      (20) Page 7 Section 2. Estimating parameters .... Sentence: '....as can be seen, there is very good agreement..' How many times does the true value fall within the CI (because corr 0.68 is not great)?

      Thank you for your comment. While a correlation coefficient of 0.68 indicates moderate agreement, the primary goal was to demonstrate the feasibility of parameter estimation using the DGA rather than achieving perfect accuracy. The coverage of the CI was not explicitly calculated, as the focus was on the overall trends and relative agreement.

      (21) Page 7 Section 2. Estimating parameters .... Sentence: 'Fig5(c) shows....' Is this when using exact simulator?

      Thank you for your question. Yes, the exact values on the x-axis of Fig. 5(c) are obtained using the exact Gillespie simulation.

      (22) Page 7 Section 3 Estimating parameters for the... Sentence: 'Fig6(a) shows...' Why are CIs not shown?

      Thank you for your comment. CIs are not shown in Fig. 6(a) because this particular case is degenerate, making the calculation and meaningful representation of CIs challenging. 

      (23) Page 10, Sentence: 'As can be seen in Fig 7(b)...' Can you show uncertainty in measured value? It would be good to see something of a comparison against an exact method, at least on simulated synthetic data

      Thank you for the comment. Fig. 7(a) already includes error bars for the experimental data, which account for measurement uncertainty. However, in Fig. 7(b), we do not include error bars for the experimental values due to limitations in the available data.

      (24) Page 12, Section B Loss function '...n=600...' This is on a lower range. Have you tested with n=1000?

      Yes, we have tested with n=1000 and observed no significant difference in the results. This indicates that n=600 is sufficient for the purposes of this study. 

      (25) Fig 8(c) Why are no CIs shown?

      Thank you for your comment. CIs were not included in Fig. 8(c) due to degeneracy, which makes meaningful confidence intervals difficult to compute.

      (26) Page 12 Conclusion, sentence: '..gradients via backpropagation...' Actually, by making the function continuous, both forward and reverse mode might be used. And in this case, forward-mode would actually be the fastest by quite a margin

      Thank you for your insightful comment. You are correct that by making the function continuous, both forward-mode and reverse-mode automatic differentiation can be used. We have now mentioned this point in the discussion.
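
      The reviewer's point about forward mode can be illustrated with a minimal dual-number sketch. This is a generic textbook construction, not the manuscript's code: the `Dual` class, `smooth_step`, and the `steepness` parameter are hypothetical names, and the example assumes a sigmoid-smoothed threshold as the differentiable operation.

```python
import math


class Dual:
    """Number carrying a value and its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule.
        return Dual(
            self.value * other.value,
            self.value * other.deriv + self.deriv * other.value,
        )

    __rmul__ = __mul__


def _lift(x):
    """Wrap a plain number as a constant Dual (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def dual_sigmoid(x):
    """Logistic function with the chain rule applied to the derivative part."""
    s = 1.0 / (1.0 + math.exp(-x.value))
    return Dual(s, s * (1.0 - s) * x.deriv)


def smooth_step(threshold, u, steepness):
    """Smoothed version of the indicator 1[u > threshold]."""
    return dual_sigmoid(steepness * (u - threshold))
```

      Seeding the input with `Dual(0.3, 1.0)` propagates the derivative with respect to the threshold alongside the value in a single forward pass. Forward mode of this kind tends to be cheap when the number of parameters is small, whereas reverse mode is usually preferred when there are many parameters and a scalar loss.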

      (27) Overall comment for the Conclusion section: It would be good to discuss how this framework compares to other model-fitting frameworks for models with stochastic dynamics. The authors mention dynamic data and more discussion on this would be very welcomed. Why use ADAM and not something established like BFGS for model fitting? It would be interesting to discuss how this can fit with other SSA algorithms (e.g. in practice sorting SSA is used when models get larger). Also, inference comparison against exact approaches would be very nice. As it is now, the authors truly only check the accuracy of the SSA on 1 model -it would be interesting to see for other models.

      Thank you for your detailed comments. While this study focuses on introducing the DGA and demonstrating its feasibility, we agree that comparisons with other model-fitting frameworks, testing on additional models, and integrating with other SSA variants like sorted SSA are important directions for future work. Similarly, extending the DGA to handle transient dynamics and exploring alternatives to ADAM, such as BFGS, are promising areas to investigate further.

    1. eLife Assessment

      This study presents important insights into the regulation of left-right organ formation. By combining genetic perturbation of all three Meteorin genes in zebrafish and timelapse imaging, the authors identify an essential role for this protein family in the establishment of left-right patterning. They provide convincing evidence that Meteorins are required for the morphogenesis of dorsal forerunner cells, the precursors of the left-right organizer (also named Kupffer's vesicle) in zebrafish. In line with this, Meteorins were shown to genetically interact with integrins ItgaV and Itgb1b to regulate dorsal forerunner cell clustering.

    2. Reviewer #1 (Public review):

      Summary:

      Meteorin proteins were initially described as secreted neurotrophic factors. In this manuscript, Eggeler et al. demonstrate a novel role for Meteorins in establishing left-right axis formation in the zebrafish embryo. The authors generated null mutations in each of the three zebrafish meteorin genes - metrn, metrnla, and metrnlab. Triple mutant embryos displayed phenotypes strongly associated with left-right defects such as heart looping and visceral organ placement, and disrupted expression of Nodal-responsive genes, as did single mutants for metrn and metrnla. The authors then go on to demonstrate that these defects in left-right asymmetry are likely due to defects in Kupffer's vesicle and its progenitors, the dorsal forerunner cells (DFCs), including impaired lumen formation and reduced fluid flow, reduced clustering among DFCs, impaired DFC migration, mislocalization of apical proteins ZO-1 and aPKC, and detachment of DFCs from the EVL. Notably, the authors found that expression of the marker genes sox32 and sox17 was not affected, suggesting Meteorins are required for DFC/KV morphogenesis but not necessarily fate specification. Finally, the authors show genetic interaction between Meteorins and integrin receptors, which were previously implicated in left-right patterning. In a supplemental figure, the manuscript also presents data showing expression of meteorin genes around the chick Hensen's node, suggesting that the left-right patterning functions may be conserved among vertebrates.

      Strengths:

      Strengths of this study include the generation of a triple mutant line that targets all known zebrafish meteorin family members. The experiments presented in this study were rigorous, especially with respect to quantification and statistical analysis.

      Weaknesses:

      Although the authors convincingly demonstrate a role for Meteorins in zebrafish left-right patterning, data supporting a conserved role in other vertebrates is compelling but limited to one supplemental figure.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript the authors describe their study on the role of meteorins in establishing the left-right organizer. The left-right organizer is a transient organ in vertebrate embryos in which rotating cilia cause a fluid flow that breaks the left-right symmetry and coordinates lateralization of internal organs such as gut and heart. In zebrafish, the left-right organizer (also named Kupffer's vesicle) is formed by dorsal forerunner cells, but very little is known about how dorsal forerunner cells coalesce and form this ciliated vesicle in the embryo. The authors mutated the three meteorin-coding genes in zebrafish and observed that mutations in each one of these cause laterality defects, with the strongest defects observed in the triple mutant. Loss of meteorins affects the expression of nodal genes, which play essential roles in establishing organ laterality. Meteorins are widely expressed in developing embryos, and expression in lateral plate mesoderm and dorsal forerunner cells was observed. The meteorin triple mutant embryos display defects in the migration and clustering of the dorsal forerunner cells, impairing Kupffer's vesicle formation and cilia rotation. Finally, the authors show that meteorins genetically interact with integrins.

      Strengths:

      - These authors went through the lengthy process of generating triple mutants affecting all three meteorin genes. This provides robust genetic evidence on the role of meteorins in establishing organ laterality and circumvents the difficulty of interpreting results that are confounded by redundant functions of meteorins.
      - The use of live imaging on triple mutants is appreciated.
      - High-quality imaging of dorsal forerunner cells to quantify cell migration and its relation to Kupffer's vesicle formation.

      Weaknesses:

      - Lack of a model for how meteorins regulate dorsal forerunner cell migration.
      - Only genetic data to suggest a link between meteorins and integrins.
      - Besides their role in DFC migration, meteorins may also play a more direct role in regulating Nodal signaling, which is not addressed here.

    1. eLife Assessment

      This study maps the genotype-phenotype landscapes of three E. coli transcription factors and the topographical features of these landscapes. It shows that ruggedness and epistasis do not hinder the evolution of strong transcription factor binding sites. These convincing findings contribute valuable insights into fitness landscape theories and highlight the role of chance, contingency, and evolutionary biases in gene regulation. The authors then study the topographical features of these landscapes, especially the number and distribution of local maxima, as well as the statistical properties of evolutionary paths on these landscapes.

    2. Reviewer #1 (Public review):

      Summary:

      For each of the three key transcription factor (TF) proteins in E. coli, the authors generate a large library of TF binding site (TFBS) sequences on plasmids, such that each TFBS is coupled to the expression of a fluorescence reporter. By sorting the fluorescence of individual cells and sequencing their plasmids to identify each cell's TFBS sequence (sort-seq), they are able to map the landscape of these TFBSs to the gene expression level they regulate. The authors then study the topographical features of these landscapes, especially the number and distribution of local maxima, as well as the statistical properties of evolutionary paths on these landscapes. They find the landscapes to be highly rugged, with about as many local peaks as a random landscape would have, and with those peaks distributed approximately randomly in sequence space. The authors find that there are a number of peaks that produce regulation stronger than that of the wild-type sequence for each TF and that it is not too unlikely to reach one of those "high peaks" from a random starting sequence. Nevertheless, the basins of attractions for different peaks have significant overlap, which means that chance plays a major role in determining which peak a population will evolve to.
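
      The notion of a local peak used in this summary can be made concrete with a short sketch. It assumes that neighbors are sequences differing by a single nucleotide, that unsampled neighbors are simply ignored, and that ties do not count as peaks; the manuscript's exact conventions are not described here, and the function names and toy values are hypothetical.

```python
def single_mutant_neighbors(seq, alphabet="ACGT"):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, base in enumerate(seq):
        for alt in alphabet:
            if alt != base:
                yield seq[:i] + alt + seq[i + 1:]


def find_local_peaks(landscape):
    """Return sequences whose strength exceeds that of all measured neighbors.

    landscape maps each measured sequence to its regulation strength.
    Neighbors absent from the mapping (unsampled sequences) are skipped.
    """
    peaks = []
    for seq, strength in landscape.items():
        measured = [
            landscape[n] for n in single_mutant_neighbors(seq) if n in landscape
        ]
        if measured and all(strength > s for s in measured):
            peaks.append(seq)
    return sorted(peaks)


# Toy two-nucleotide landscape with made-up strengths.
toy_landscape = {
    "AA": 1.0, "AC": 2.0, "CA": 0.5, "CC": 1.5,
    "GG": 3.0, "GC": 2.5, "TT": 1.2, "TA": 0.4,
}
toy_peaks = find_local_peaks(toy_landscape)
```

      Because peak status depends on strict comparisons with neighbors, small measurement differences can change which sequences qualify, which is one reason the treatment of uncertainty matters for peak counts.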

      Strengths:

      (1) The experiments and analysis of this paper are very well-executed and, by and large, very thorough (with an important exception identified below). I appreciated the systematic nature of the project, both the large-scale experiments done on three TFs with replicates and the systematic analysis of the resulting landscapes. This not only makes the paper easy to follow but also inspires confidence in their results since there is so much data and so many different ways of analyzing it. It's a great recipe for other studies of genotype-phenotype landscapes to follow.

      (2) Considering how technical the project was, I am really impressed at how easy to read I found the paper, and the authors deserve a lot of credit for making it so. They do a great job of building up the experiments and analyses step-by-step and explaining enough of the basics of the experimental design and the essence of each analysis in the main text without getting too complicated with details that can be left to the Methods or SI. Compared to other big data papers, this one was refreshingly not overwhelming.

      Weaknesses:

      (1) The main weakness of this paper, in my view, is that it felt disconnected from the larger body of work on fitness and genotype-phenotype landscapes, including previous data on TFBSs in E. coli, genotype-phenotype maps of TFBSs in other systems, protein sequence landscapes (e.g., from mutational scans or combinatorially-complete libraries), and fitness landscapes of genomic mutations (e.g., combinatorially-complete landscapes of antibiotic resistance alleles). I have no doubt the authors are experts in this literature, and they probably cite most of it already given the enormous number of references. But they don't systematically introduce and summarize what was already known from all that work, and how their present study builds on it, in the Abstract and Introduction, which left me wondering for most of the paper why this project was necessary. Eventually, the authors do address most of these points, but not until the end, in the Discussion. Readers who have no familiarity with this literature might read this paper thinking that it's the first paper ever to study topography and evolutionary paths on genotype-phenotype landscapes, which is not true.

      There were two points that made this especially confusing for me. First, in order to choose which nucleotides in the binding sites to vary, the authors invoke existing data on the diversity of these sequences (position-weight matrices from RegulonDB). But since those PWMs can imply a genotype-phenotype map themselves, an obvious question I think the authors needed to have answered right away in the Introduction is why it is insufficient for their question. They only make a brief remark much later in the Results that the PWM data is just observed sequence diversity and doesn't directly reflect the regulation strength of every possible TFBS sequence. But that is too subtle in my opinion, and such a critical motivation for their study that it should be a major point in the Introduction.

      The second point where the lack of motivation in the Introduction created confusion for me was that they report enormous levels of sign epistasis in their data, to the point where these landscapes look like random uncorrelated landscapes. That was really surprising to me since it contrasts with other empirical landscape data I'm familiar with. It was only in the Discussion that I found some significant explanation of this - namely that this could be a difference between prokaryotic TFBSs, as this paper studies, and the eukaryotic TFBSs that have been the focus of much (almost all?) previous work. If that is in fact the case - that almost all previous studies have focused on eukaryotic TFBSs or other kinds of landscapes, and this is the first to do a systematic test of prokaryotic TFBSs - then that should be a clear point made in the Abstract and Introduction. (I find a comparable statement only in the very last paragraph of the Discussion.) If that's the case, then I would also find that point to be a much stronger, more specific conclusion of this paper to emphasize than the more general result of observing epistasis and contingency (as is currently emphasized in the Abstract), which has been discussed in tons of other papers. This raises all sorts of exciting questions for future studies - why do the landscapes of prokaryotic TFBSs differ so dramatically from almost all the other landscapes we've observed in biology? What does that mean for the evolutionary dynamics of these different systems?

      (2) I am a bit concerned about the lack of uncertainties incorporated into the results. The authors acknowledge several key limitations of their approach, including the discreteness of the sort-seq bins in determining possible values of regulation strength, the existence of a large number of unsampled sequences in their genotype space, as well as measurement noise in the fluorescence readouts and sequencing. While the authors acknowledge the existence of these factors, I do not see much attempt to actually incorporate the effect of these uncertainties into their conclusions, which I suspect may be important. For example, given the bin size for the fluorescence in sort-seq, how confident are they that every sequence that appears to be a peak is actually a peak? Is it possible that many of the peak sequences have regulation strengths above all their neighbors but within the uncertainty of the fluorescence, making it possible that it's not really a peak? Perhaps such issues would average out and not change the statistical nature of their results, which are not about claiming that specific sequences are peaks, just how many peaks there are. Nevertheless, I think the lack of this robustness analysis makes the results less convincing than they otherwise would be.
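
      The kind of robustness check requested here could be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the function names, the `tolerance` parameter (a stand-in for the measurement uncertainty implied by the sort-seq bin width), and the toy two-site landscape are all assumptions made for the example. An apparent peak is counted as robust only if it exceeds every measured neighbor by more than the tolerance.

```python
from itertools import product

ALPHABET = "ACGT"

def neighbors(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, base in enumerate(seq):
        for alt in ALPHABET:
            if alt != base:
                yield seq[:i] + alt + seq[i + 1:]

def classify_peaks(landscape, tolerance):
    """Split apparent peaks into robust and ambiguous ones.

    landscape: dict mapping sequence -> measured regulation strength.
    tolerance: hypothetical measurement uncertainty, in the same units.
    Unmeasured neighbors are skipped, which is itself an assumption.
    """
    robust, ambiguous = [], []
    for seq, strength in landscape.items():
        measured = [landscape[n] for n in neighbors(seq) if n in landscape]
        if not measured or strength <= max(measured):
            continue  # not an apparent peak
        if strength - max(measured) > tolerance:
            robust.append(seq)      # peak survives the uncertainty margin
        else:
            ambiguous.append(seq)   # peak could be a measurement artifact
    return robust, ambiguous

# Toy two-site landscape with made-up values.
toy = {"".join(p): 1.0 for p in product(ALPHABET, repeat=2)}
toy["AA"] = 3.0   # clearly above all of its neighbors
toy["CC"] = 1.1   # above its neighbors, but only barely
robust, ambiguous = classify_peaks(toy, tolerance=0.5)
```

      Reporting the number of robust peaks alongside the total number of apparent peaks, for a range of tolerances, would indicate how sensitive the ruggedness estimates are to measurement noise.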

    3. Reviewer #2 (Public review):

      The authors aim to investigate the ability of evolution to create strong transcription factor binding sites (TFBSs) de novo in E. coli. They focus on three global transcriptional regulators: CRP, Fis, and IHF, using a massively parallel reporter assay to evaluate the regulatory effects of over 30,000 TFBS variants. By analyzing the resulting genotype-phenotype landscapes, they explore the ruggedness, accessibility, and evolutionary dynamics of regulatory landscapes, providing insights into the evolutionary feasibility of strong gene regulation. Their experiments show that de novo adaptive evolution of new gene regulation is feasible. It is also subject to a blend of chance, historical contingency, and evolutionary biases that favor some peaks and evolutionary paths.

      (1) Strengths of the methods and results:

      The authors successfully employed a well-designed sort-seq assay combined with high-throughput sequencing to map regulatory landscapes. The experimental design ensures reliable measurement of regulation strengths. Their system accounts for gene expression noise and normalizes measurements using appropriate controls.

      Comprehensive Landscape Mapping:
      The study examines ~30,000 TFBS variants per transcription factor, providing statistically robust and thorough maps of the regulatory landscapes for CRP, Fis, and IHF. The landscapes are rigorously analyzed for ruggedness (e.g., number of peaks) and epistasis, revealing parallels with theoretical uncorrelated random landscapes.

      Evolutionary Dynamics Simulations:
      Through simulations of adaptive walks under varying population dynamics, the authors demonstrate that high peaks in regulatory landscapes are accessible despite ruggedness. They identify key evolutionary phenomena, such as contingency (multiple paths to peaks) and biases toward specific evolutionary outcomes.
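
      For readers unfamiliar with adaptive walks, a deliberately simplified sketch is given below. It uses a greedy, steepest-ascent rule, which is an assumption made for illustration and is not necessarily one of the population-dynamic regimes the authors simulated. The toy landscape values and the function names are made up.

```python
ALPHABET = "ACGT"

def neighbors(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, base in enumerate(seq):
        for alt in ALPHABET:
            if alt != base:
                yield seq[:i] + alt + seq[i + 1:]

def greedy_walk(landscape, start):
    """Take the steepest uphill single-mutation step until none exists."""
    path = [start]
    current = start
    while True:
        uphill = [n for n in neighbors(current)
                  if n in landscape and landscape[n] > landscape[current]]
        if not uphill:
            return path  # current is a local peak
        current = max(uphill, key=landscape.get)
        path.append(current)

# Made-up regulation strengths for all 16 two-site genotypes.
toy = {
    "AA": 1.0, "AC": 2.0, "AG": 0.5, "AT": 0.4,
    "CA": 1.5, "CC": 3.0, "CG": 0.2, "CT": 0.1,
    "GA": 0.3, "GC": 0.6, "GG": 2.5, "GT": 0.7,
    "TA": 0.2, "TC": 0.3, "TG": 0.8, "TT": 0.9,
}
path_from_aa = greedy_walk(toy, "AA")  # ends on the global peak CC
path_from_tg = greedy_walk(toy, "TG")  # ends on the lower peak GG
```

      In this toy landscape the two starting genotypes end on different peaks, which is the kind of dependence on starting point that contingency refers to.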

      Biological Relevance and Novelty:
      The authors' work is novel in focusing on global regulators, which differ from previously studied local regulators (e.g., TetR). They provide compelling evidence that rugged landscapes are navigable, facilitating de novo evolution of regulatory interactions. The comparison of landscapes for CRP, Fis, and IHF underscores shared topographical features, suggesting general principles of global transcriptional regulation in bacteria.

      (2) Weaknesses of the methods and results:

      Undersampling of Genotype Space:
      While the quality filtering of the data ensures robustness, ~40% of the TFBS space remains uncharacterized. The authors acknowledge this limitation but could improve the analysis by employing subsampling or predictive modeling.

      Simplified Regulatory Architecture:
      The study considers a minimal system of a single TFBS upstream of a reporter gene. While this may have been necessary for clarity, this simplification may not reflect the combinatorial complexity of transcriptional regulation in vivo.

      Lack of Experimental Validation of Simulations:
      The adaptive walks are based on simulated dynamics rather than experimental evolution. Incorporating in vivo experimental evolution studies would strengthen the conclusions. This is a large request, however, and its absence should not prevent publication.

      Impact on the Field:
      This study advances our understanding of adaptive landscapes in gene regulation and offers a critical step toward deciphering how global regulators evolve de novo binding sites. The findings provide foundational insights for synthetic biology, evolutionary genetics, and systems biology by highlighting the evolutionary accessibility of strong regulation in bacteria.

      Utility of Methods and Data:
      The sort-seq approach, combined with landscape analysis, provides a robust framework that can be extended to other transcription factors and systems. If made publicly available, the study's data and code would be valuable for researchers modeling transcriptional regulation or studying evolutionary dynamics.

      Additional Context:
      The study builds on a growing body of work exploring regulatory evolution. For instance, recent studies on local regulators like TetR and AraC have revealed high ruggedness and epistasis in TFBS landscapes. This study distinguishes itself by focusing on global regulators, which are more biologically complex and influential in bacterial gene networks. The observed evolutionary contingency aligns with findings in other biological systems, such as protein evolution and RNA folding landscapes, underscoring the generality of these evolutionary principles.

      Conclusion:
      The authors successfully mapped the genotype-phenotype landscapes for three global regulators and simulated evolutionary dynamics to assess the feasibility of strong TFBS evolution. They convincingly demonstrate that ruggedness and epistasis, while prominent, do not preclude the evolution of strong regulation. Their results support the notion that gene regulation evolves through a blend of chance, contingency, and evolutionary biases.

      This paper makes a significant contribution to the understanding of regulatory evolution in bacteria. While minor limitations exist, the authors' methods are robust, and their findings are well-supported. The work will likely be of broad interest to researchers in molecular evolution, synthetic biology, and gene regulation.

    1. eLife Assessment

      This important study characterizes and validates a new activity marker - fast labelling of engram neurons (FLEN) - which is transiently active and driven by cFos, allowing the monitoring of intrinsic and synaptic properties of engram neurons shortly after the learning experience. The results convincingly demonstrate the utility of this novel viral tool for studying early changes in the properties of engram cells. However, the study would benefit from exploring how accurately FLEN reflects endogenous cFos activity, how this labelling technique compares to previous versions, and from careful consideration of alternative explanations such as changes in release probability.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Cupollilo et al. describes the development, characterization, and application of a novel activity labeling system: fast labelling of engram neurons (FLEN). Several such systems already exist, but this study adds capability by leveraging an activity marker that is destabilized (and thus transiently active) and driven by the full-length promoter of cFos. The authors demonstrate the activity-dependent induction and time course of expression, first in cultured neurons and then in vivo in hippocampal CA3 neurons after one trial of contextual fear conditioning (CFC). In a series of ex vivo experiments, the authors perform patch clamp analysis of labeled neurons to determine whether these putative engram neurons differ from non-labelled neurons, using both the FLEN system and the previously characterized RAM system. Interestingly, the early labelled neurons at 3 h post CFC (FLEN+) demonstrated no differences in excitability, whereas the RAM-labelled neurons at 24 h after CFC had increased excitability. Examination of synaptic properties demonstrated an increase in sEPSC and mEPSC frequencies, as well as in sIPSC and mIPSC frequencies, which was not due to a change in the mossy fiber input to these neurons.

      Strengths:

      Overall the data is of high quality and the study introduces a new tool while also reassessing some principles of circuit plasticity in the CA3 that have been the focus of prior studies.

      Weaknesses:

      No major weaknesses were noted.

    3. Reviewer #2 (Public review):

      Summary:

      Cupollilo et al. investigate the properties of hippocampal CA3 neurons that express the immediate early gene cFos in response to a single foot shock. They compare ex vivo the electrophysiological properties of these "engram neurons" labeled with two different cFos promoter-driven green markers: their new tool FLEN labels neurons 2-6 h after activity, while RAM contains additional enhancers and peaks considerably later (>24 h). Since the fraction of labeled CA3 cells is comparable with both constructs, it is assumed (but not tested) that they label the same population of activated neurons at different time points. Both FLEN+ and RAM+ neurons in CA3 receive more synaptic inputs compared to non-expressing control neurons, which could be a causal factor for cFos activation, or a very early consequence thereof. Frequency facilitation and E/I ratio of mossy fiber inputs were also tested, but are not different in both cFos+ groups of neurons. One day after foot shock, RAM+ neurons are more excitable than RAM- neurons, suggesting a slow increase in excitability as a major consequence of cFos activation.

      Strengths:

      The study is conducted to high standards and contributes significantly to our understanding of memory formation and consolidation in the hippocampus. Modifications of intrinsic neuronal properties seem to be more salient than overall changes in the total number of (excitatory and inhibitory) inputs, although a switch in the source of the synaptic inputs would not have been detected by the methods employed in this study.

      Weaknesses:

      With regard to the new viral tool, a direct comparison between the new tool FLEN and existing cFos reporters is missing.

    1. eLife Assessment

      This study provides important evidence that the postmating behavioral switch in male mice is mediated by distinct stages of synaptic plasticity within the medial amygdala-MPOA-BSTrh pathway. The findings are convincing, supported by rigorous behavioral characterization and electrophysiological approaches that disentangle the contributions of mating, cohabitation, and parental experience to neural circuit changes. While some methodological details and statistical reporting require clarification, the study significantly advances our understanding of the neural mechanisms underlying paternal behavior.

    2. Reviewer #1 (Public review):

      Summary:

      After mating, male mice undergo a behavioral switch from infanticide to parental behavior (postmating switch). The neural mechanisms underlying this switch are still largely unknown. Studies performed in different mouse strains have also resulted in mixed evidence for whether mating (specifically: ejaculation) itself is sufficient for this switch, or whether subsequent cohabitation with the pregnant female and parental experience with pups are required. Recent work found that while lesions to the central part of the medial preoptic area (cMPOA) promote infanticidal behavior, lesions to the rhomboid nucleus of the bed nucleus of the stria terminalis (BSTrh) inhibit infanticide. The current work convincingly adds to this evidence by showing that mating and cohabitation lead to reduced inhibition from Cart-positive medial amygdala neurons onto cMPOA neurons, and that this synaptic change is in fact critical for the postmating switch. Further, the authors demonstrate that parental experience increases inhibitory synaptic transmission onto BSTrh neurons. The male postmating switch thus appears to rely on two sequential stages of synaptic plasticity.

      Strengths:

      (1) The behavioral characterization is thorough and the authors nicely manage to disentangle the relative contributions of mating, cohabitation, and parental experience to the postmating switch. Their finding of dissociable plasticity mechanisms underlying mating/cohabitation vs pup experience is intriguing.

      (2) Most conclusions are based on complementary evidence from different experimental approaches and are compelling.

      Weaknesses:

      (1) The authors do not provide an explicit synthesis/model of the circuit-level changes underlying this switch. For instance, how does cMPOA-to-BSTrh connectivity change in fathers, and how does the necessity of the cMPOA for the exposure/sensitisation effect square with the effect being postsynaptic in the BSTrh?

      (2) The presentation of the manuscript (clarity of language, grammar, reporting of stats in figures etc.) needs to be improved.

    3. Reviewer #2 (Public review):

      Summary:

      The present study identifies how mating and pup experience are correlated with differences in inhibitory neurotransmission underlying the promotion of paternal behavior toward pups. The study builds on existing knowledge about the circuit between the medial amygdala, medial preoptic area, and the bed nucleus of stria terminalis to uncover synaptic changes correlated with behavior. The authors find that inhibition from the medial amygdala is decreased in the medial preoptic area and increased in the bed nucleus of stria terminalis to promote paternal behavior in mated males.

      Strengths:

      The authors use a combination of in vivo activity manipulation and slice electrophysiology to study the role of inhibition in this circuit in dynamic infant-directed behavior induced by mating.

      Weaknesses:

      (1) Some technical and methodological details are incomplete or missing for interpretation of the significance of the findings. Statistical details are also left out.

      (2) The rationale for using Cartpt as a marker is not fully explained. This marker has activity-dependent expression and this possibility is not explored experimentally--for example, could exposure to objects or pups alone change Cartpt expression (or the number of cells expressing it)?

      (3) The cFos experiment is quantified by exposing a male to a pup inside a tea ball. Therefore, it is unclear how the male was classified as infanticidal or parental based on the criteria provided in the methods section.

      (4) There is no information about inclusion/exclusion criteria for chemical and viral experiments. Specifically, there is no information provided about the validation of the lesion experiment--how large were the lesions? Is there concern about leakage of the chemical into the recorded region (MPOA and BNST are adjacent)?

      (5) The authors do not provide information about how long rAAV is allowed to express before quantifying retrograde transport.

      (6) For statistics, the authors do not provide information distinguishing the main effects from multiple comparisons post hoc testing for the ANOVA analyses.

    4. Reviewer #3 (Public review):

      Ito et al. investigate the role of synaptic plasticity in the medial preoptic area (MPOA) pathway of male mice and its involvement in transitions from infanticidal aggression to parental behavior. Using optogenetics, whole-cell patch-clamp recordings, and behavioral assays, they demonstrate that inhibitory synaptic transmission from the posterior-dorsal medial amygdala (MePD) to the central MPOA (cMPOA) decreases following mating and cohabitation with pregnant females. This synaptic disinhibition is correlated with a reduction in aggressive behavior toward pups. They further show that paternal experience induces enhanced inhibitory transmission in the rhomboid nucleus of the bed nucleus of the stria terminalis (BSTrh), downstream of the MPOA, through postsynaptic mechanisms. These findings suggest a circuit-based model where social experiences and mating induce synaptic changes in the Me-cMPOA-BSTrh pathway, mediating the transition to parental behavior.

      The conclusions of this paper are largely supported by the data, but several methodological and conceptual aspects require clarification or additional experiments.

      (1) When evaluating the Me Cartpt-expressing neuron projection to the cMPOA, the authors compared excitatory postsynaptic currents (EPSCs) and inhibitory postsynaptic currents (IPSCs). However, the standard procedure for isolating these currents is to hold the membrane potential at the reversal potential for inhibitory or excitatory currents, respectively. The authors appear not to have followed this procedure, making it unclear how EPSCs and IPSCs were calculated. This requires clarification to ensure the validity of their reported E/I balance changes.

      (2) The authors chose to assess parental behavior over four consecutive days. It is unclear why this specific timeframe was selected. A justification for this choice would strengthen the interpretation of the behavioral data.

      (3) The experimental design in Figure 5, where the authors lesioned the entire cMPOA to assess its role in BSTrh inhibition, presents several limitations: First, the effects on BSTrh activity could result from indirect circuit alterations rather than direct cMPOA projections. The current lesion approach cannot disentangle these possibilities. Second, the cMPOA is a heterogeneous region containing diverse neuronal subtypes. Full lesions prevent the differentiation of the roles played by distinct populations within this region. Third, lesion specificity is questionable, as some lesions extended beyond the cMPOA boundaries (Figure S5). This overextension complicates the interpretation of the results and requires tighter control.

      (4) In Figure 3, the authors show that optogenetic inhibition of Me projections to the cMPOA modifies the frequency of spontaneous inhibitory postsynaptic currents (sIPSCs). However, the proposed mechanism that this modulation reflects inter-neuronal network activity within the cMPOA lacks sufficient experimental validation. Additional experiments assessing circuit-level interactions could substantiate these claims.

      (5) While the paper highlights synaptic changes in the cMPOA, it does not establish a direct relationship between these changes and the social experience. How do mating and cohabitation with females impact this pathway and modulate synaptic strength? The discussion could benefit from integrating these factors into their proposed model.

      Overall, the paper offers valuable insights into the neural circuitry underlying male parental behavior, particularly the synaptic dynamics of the Me-cMPOA-BSTrh pathway. However, addressing these methodological and conceptual limitations would significantly enhance the clarity and impact of the work.

    1. eLife Assessment

      This study provides valuable observations indicating that human pyramidal neurons propagate information as fast as rat pyramidal neurons despite their larger size. Convincing evidence demonstrates that this property is due to several biophysical properties of human neurons. This study will be of interest to neurophysiologists.

    2. Reviewer #1 (Public review):

      The propagation of electrical signals within neuronal circuits is tightly regulated by the physical and molecular properties of neurons. Since neurons vary in size across species, the question arises whether propagation speed also varies to compensate for it. The present article compares numerous speed-related properties in human and rat neurons. The authors found that the larger size of human neurons seems to be compensated by faster propagation within the dendrites, but not the axons, of these neurons. The faster dendritic signal propagation was found to arise from wider dendritic diameters and greater conductance load in human neurons. In addition, the article provides a careful characterization of human dendrites and axons, as the field has only recently begun to characterize post-operative human cells. There are only a few studies reporting dendritic properties and these are not all consistent, hence there is added value in reporting these findings, particularly given that the characterization is condensed in a compartmental model.

      Strengths

      The study was performed with great care using standard techniques in slice electrophysiology (pharmacological manipulation with somatic patch-clamp) as well as some challenging ones (axonal and dendritic patch-clamp). Modeling was used to parse out the role of different features in regulating dendritic propagation speed. The finding that propagation speed varies across species is novel as previous studies did not find a large change in membrane time constant nor axonal diameters (a significant parameter affecting speed). A number of possible, yet less likely factors were carefully tested (Ih, membrane capacitance). The main features outlined here are well known to regulate speed in neuronal processes. The modeling was also carefully done to verify that the magnitude of the effects is consistent with the difference in biophysical properties. Hence, the findings appear very solid to me.

      Weaknesses

      The role of diameter in regulating propagation speed is well known in the axon literature.

      Comment on the revised version: the authors have now made clearer in the manuscript that the role of diameter was already well known.

    3. Reviewer #2 (Public review):

      Summary:

      In this paper, Oláh and colleagues introduce new research data on the cellular and biophysical elements involved in transmission within the pyramidal circuits of the human neocortex. They gathered a comprehensive set of patch-clamp recordings from human and rat pyramidal neurons to compare how the temporal aspect of neuronal processing is maintained in the larger human neocortex. A range of experimental techniques have been used, including two-photon-guided dual whole-cell recordings and electron microscopy, complemented by theoretical and computational methods.

      The authors find that synaptically connected pyramidal neurons within the human neocortex have longer intercellular path lengths. They go on to show that the short soma-to-soma latencies are not due to propagation velocity along the axon but instead reflect a higher propagation speed of synaptic potentials from dendrite to soma. Next, in a series of extensive computational modeling studies focusing on the synaptic potentials, the authors show that the shorter latency may be explained by larger diameters, affecting the cable properties and resulting in relatively faster propagation of EPSPs in the human neuron. The manuscript is well-written, and the physiological experiments and in-depth theoretical steps for the simulations are clear. Whether passive cable properties of the dendrites alone are responsible for higher velocities remains to be further investigated. Based on the present data the contribution of active membrane properties cannot be excluded.
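
      As a reminder of how diameter enters the passive cable properties, the textbook length constant of a uniform cylindrical cable is λ = sqrt((Rm/Ri)·(d/4)), so λ grows with the square root of the diameter. The sketch below only evaluates that formula; the membrane and axial resistivity values are illustrative assumptions, not measurements from the manuscript, and the function name is hypothetical.

```python
import math

def length_constant_um(diameter_um, rm_ohm_cm2, ri_ohm_cm):
    """Passive cable length constant, lambda = sqrt((Rm / Ri) * d / 4).

    diameter_um: cable diameter in micrometers.
    rm_ohm_cm2:  specific membrane resistance in ohm * cm^2.
    ri_ohm_cm:   axial (intracellular) resistivity in ohm * cm.
    Returns lambda in micrometers.
    """
    d_cm = diameter_um * 1e-4                      # micrometers -> cm
    lam_cm = math.sqrt((rm_ohm_cm2 / ri_ohm_cm) * (d_cm / 4.0))
    return lam_cm * 1e4                            # cm -> micrometers

# Illustrative values only (assumed, not taken from the paper).
RM = 20000.0   # ohm * cm^2
RI = 150.0     # ohm * cm
thin = length_constant_um(1.0, RM, RI)
thick = length_constant_um(2.0, RM, RI)
ratio = thick / thin   # doubling the diameter scales lambda by sqrt(2)
```

      This covers only the passive, uniform-cable case; it says nothing about conductance load or active membrane properties, which the review notes cannot be excluded.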

      Strengths:

      The authors used complex 2P-guided dual whole-cell recordings in human neurons. In combination with detailed reconstructions, these approaches represent the next steps in unravelling the information processing in human circuits.

      The computational modelling and cable theory application to the experimentally constrained simulations provides an integrated view of the passive membrane properties of human neurons.

      Weaknesses:

      Whether the cable properties alone are the main explanation for speeding the electrical signaling in human pyramidal neurons deserves further studies.

    4. Reviewer #3 (Public review):

      Summary:

      This study indicates that connections across human cortical pyramidal cells have identical latencies despite a larger mean dendritic and axonal length between somas in human cortex. A precise demonstration combining detailed electrophysiology and modeling indicates that this property is due to faster propagation of signals in proximal human dendrites. This faster propagation is itself due to a slightly thicker dendrite, to a larger capacitive load, and to stronger hyperpolarizing currents. Hence, the biophysical properties of human pyramidal cells are adapted such that they do not compromise information transfer speed.

      Strengths:

      The manuscript is clear and very detailed. The authors have experimentally verified a large number of aspects that could affect propagation speed and have pinpointed the most important one. This paper provides an excellent comparison of biophysical properties between rat and human pyramidal cells. Thanks to this approach, a comprehensive description of the mechanisms underlying the acceleration of propagation in human dendrites is provided.

      Weaknesses:

      The weaknesses I had identified have been addressed by the authors.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      We are grateful for the positive evaluation of the work and the critical points raised by the reviewers. We thank all reviewers for their excellent comments. We believe that these revisions have significantly improved the quality of our study.

      In response to the 2nd reviewer, we apologise for the missing data: we failed to provide a P-value for the RM ANOVA post-hoc test, and we are very grateful that this was brought to our attention. We have revised the RM ANOVA by using the Tukey HSD post-hoc test, which is generally recommended for pairwise comparisons as it is more robust to unequal sample sizes. The controversial statistical analysis of the overall comparison of speed differences was deleted, as were three supplementary figures (Fig. S4, Fig. S9 and Fig. S10), which are less informative in support of the manuscript.

    1. eLife Assessment

      This study is valuable as it provides information about the genes regulated by sex hormone treatment in song nuclei and other brain regions and suggests candidate genes that might induce sexual dimorphism in the zebra finch brain. The analysis presented is thorough and detailed. Whereas the evidence for gene regulation by hormone treatment is well supported, the evidence for an association of those genes with song learning (as written in the title) is incompletely supported as no manipulation of song learning or song analysis was conducted.

    2. Reviewer #3 (Public review):

      Summary:

      Davenport et al. have investigated how a masculinizing dose of estrogen changes the transcriptomes of several key song nuclei and adjacent brain areas in juvenile zebra finches of both sexes. Only male zebra finches sing, learn song, and normally have a fully developed song control circuitry, so the study was aimed at further understanding how genetic and hormonal factors contribute to the dimorphism in song behavior and related brain circuitry in this species. Using WGCNA and follow-up correlations to re-analyze published transcriptome datasets, the authors provide evidence that the main variance of several identified gene co-expression modules significantly correlates with one or some of the factors examined, including sex, estrogen treatment, regional neuroanatomy, chromosomal placement, or vocal learning, noting that the latter is largely based on inference due to expression in song control nuclei.

      Strengths:

      Among the main strengths are the thorough gene co-expression module and correlation analyses, and the inclusion of both song nuclei and adjacent areas, the latter serving as sort of controls for areas that are not dimorphic and likely broadly present in birds in general. In situ hybridization data discussed in a previous publication (Choe et al., Hormones and Behavior, 2021) provides some support for the neuroanatomical specializations of gene expression. It is also significant that the transcriptome re-analysis was performed with an improved genome assembly that also includes the sex chromosomes, thus expanding the Z/W chromosome gene analyses in Friedrich et al, Cell Reports, 2022. The most relevant finding is arguably the identification of some modules where gene expression variation within song nuclei correlates with hormonal effects and/or gene location on sex chromosomes, which are present at different dosages between sexes. Sex differences in gene expression in areas that are not song nuclei may also bring insights into functions other than song behavior or vocal learning. The study also shows how a published RNA-seq dataset can be reanalyzed in novel and informative ways.

      Weaknesses:

      The validation of the inferred direction of regulation in the identified co-expression modules is limited to the in situ data mentioned above. Further evidence that representative genes in the main modules differ in expression when comparing sexes or E2- vs VEH-treated tissues using independent samples and/or methods would provide further validation and enhance rigor. Most importantly, E2 is known to exert various actions on brain physiology and neuronal function. Because there was no manipulation of candidate genes, nor assessment/manipulation of vocal behavior or vocal learning, an involvement of the identified candidate genes in setting up the sexual dimorphism of the song system or song behavior was not directly tested in this study. For the latter reason, the implication of the Title (..."gene expression associated with vocal learning...") is not well supported. While novel insights were gained into brain expression of Z chromosome genes, it cannot be excluded that the higher male expression of some Z genes may not affect brain cell function and thus may not require active compensation (as discussed for nucleus RA in Friedrich et al, Cell Reports, 2022).

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This study is useful as it provides further analysis of previously published data to address which specific genes are part of the masculinizing actions of E2 on female zebra finches, and where these key genes are expressed in the brain. However, the data supporting the conclusion of masculinizing the song system are incomplete as the current manuscript is a re-analysis of differential gene expression modulated by E2 treatment between male/female zebra finches without manipulation of gene expression. The conclusions (and title) regarding song learning are also incompletely supported with no gene manipulation or song analysis. Importantly, the use of WGCNA for a question of sex-chromosome expression in species without dosage compensation is considered inadequate. As the experimental design did not include groups to directly test for song learning, and there was also no analysis of song performance, these data were also considered inadequate in that regard.

      We are sorry the editor found the manuscript so incomplete and inadequate. Though the tone of this assessment seems more severe than the reviewer comments below, we are also happy to see that the editor has considered our paper further for a revised publication, based on the reviewers’ comments. We address the editor’s comments as follows:

      While we agree that manipulating some of the genes we discovered, whose expression levels are E2-sensitive in the song system, would take the study further by validating some of the hypotheses proposed in the discussion of the paper, we don’t think the outcome of gene manipulations would change the major conclusions from the results of the paper. In this study we performed estrogen hormone manipulations, with causal consequences on gene expression in song nuclei and associated song behavior. In a way this is analogous to gene manipulations, but manipulating directly the action of estrogen. The categories of genes impacted, and the differences among the sex chromosomes, wouldn’t change.

      For the comment on WGCNA being inadequate for addressing questions on sex chromosome expression in species without dosage compensation, we think the evidence in our data does not bear that out. One main result of this paper is the separation of Z chromosome transcripts whose expression is most strongly regulated by chromosomal dosage (WGCNA module E) across regions from those subject to additional sources of regulation in song nuclei (other modules). It seems to us that rather than being confounded by the lack of dosage compensation, WGCNA allowed us to better resolve the effects of dosage on different genes within the sex chromosomes. We have added a new figure more directly examining sex chromosome transcript abundance within different modules. Briefly, we found that Z chromosome genes assigned to module E exhibited almost exactly the male-biased expression ratio expected from no dosage compensation, while the Z chromosome genes in song nuclei assigned to other modules were expressed below the dosage-predicted value, consistent with module E containing those genes whose expression is most strongly regulated by dose across all brain regions sampled.

      At its core, WGCNA finds sets of correlated genes. The biological reality of the zebra finch transcriptome is that Z chromosome expression is largely anti-correlated with W chromosome due to dosage. However, this dosage effect is not felt equally by all genes and WGCNA provides an unbiased computational framework which can be used to separate dose from other potential sources of gene regulation. This is why roughly ⅓ of Z chromosome genes are not assigned to module E; for example the growth hormone receptor is assigned to module G based on its correlation with genes upregulated within HVC.

      “As the experimental design did not include groups to directly test for song learning, and there was also no analysis of song performance, these data were also considered inadequate in that regard.”

      Concerning the comment on the lack of song performance analysis in the paper, all such analyses were conducted in our previous study on the same animals (Choe et al. 2021, Hormones & Behavior). The birds considered here were sacrificed at PHD30, prior to the onset of learned song behavior. However, females treated with E2 in the same way at the same time and allowed to mature into adulthood went on to develop rudimentary song. Further, induction of rudimentary song learning in females following E2 treatment has been well established since the early ‘80s. We have added the following text toward the end of the intro to make this more clear:

      “While the birds for this study were sacrificed prior to the developmental presentation of song behavior, we have previously shown that female finches treated in exactly the same way with E2 go on to produce rudimentary imitative songs as adults (Choe et al 2021), consistent with the known induction of vocal learning in females by E2 (REF).”

      Reviewer #1 (Recommendations For The Authors):

      Overall, this is a wonderfully designed and executed study that takes full advantage of new resources, such as the most complete zebra finch genome assembly yet, as well as the latest methods. I have very few suggestions as to the improvement of the manuscript. They are as follows:

      Results Section:

      In the paragraph "Identification of gene expression modules in song nuclei":

      "The E2-treated females in this study had similarly sized song system nuclei as males, indicating that E2 treatment prevented atrophy."

      Clarify if this comparison is to treated and/or untreated males.

      We thank the reviewer for their comment. The relative differences in the song nuclei sizes between the E2-treated females and the other groups are more complex than our original sentence implied. We have revised the main text as follows:

      “In our previous study, we found that estradiol treatment in PHD30 females caused HVC to enlarge and Area X to appear when it normally does not develop in females, but both at sizes less than in untreated or treated males. The sizes of PHD30 female LMAN and RA were already the same as those seen in males, as the latter has not yet atrophied at this age (25).”

      In the paragraph "Sex- and micro-chromosome gene expression across the telencephalon": "These animal and chromosome specific shifts in the transcriptomes could represent the systemic effects of allelic chromosomal structural variation..."

      The authors should clarify the meaning of "allelic chromosomal structural variation" in this context, as it is an unusual phrase. Major chromosomal structural variation seems unlikely to produce these effects. Is it also possible that animal-specific modules with brain-wide higher expression could also result from laboratory contamination between all samples from one animal? This is not too likely but perhaps should be acknowledged or ruled out.

      We have removed the word “allelic”, which was unnecessary. We can’t envision how laboratory contamination could occur such that all of one animal’s samples would be affected to produce the observed result, which is module and chromosome specific. An animal-wide effect could emerge during sacrifice, but we can think of no reason that would affect these modules and not others. Rather, the most likely explanation is natural biological differences between animals. We have added this consideration of alternative explanations.

      In the section "Candidate gene drivers of HVC specialization in E2-treated females":

      When discussing GHR's role in cell growth and proliferation, the authors' argument could be expanded by including the documented role of GH signaling in anti-apoptotic protection of neurons from rounds of neural pruning during development as documented in the chicken, e.g. • Harvey S, Baudet M-L, Sanders EJ. 2009. Growth Hormone-induced Neuroprotection in the Neural Retina during Chick Embryogenesis. Annals of the New York Academy of Sciences, 1163: 414-416. https://doi.org/10.1111/j.1749-6632.2008.03641.x

      We thank the reviewer for sharing this publication with us. We have added the following sentence to our discussion with the above citation. “Further, our results are consistent with growth hormone’s known role in avian anti-apoptotic protection, with elevated signaling associated with the survival of chicken neurons during rounds of pruning in the developing retina.”

      The authors' argument of the relevance of the passerine GH duplication would be strengthened by citing:

      • Rasband SA, Bolton PE, Fang Q, Johnson PLF, Braun MJ. 2023. Evolution of the Growth Hormone Gene Duplication in Passerine Birds, Genome Biol Evol, 15(3) https://doi.org/10.1093/gbe/evad033. Greatly expands on the Yuri et al. paper cited by characterizing of the molecular evolution of these genes across hundreds of avian species, supporting positive selection on multiple amino acid sites identified in both ancestral and duplicate (passerine) growth hormone.

      • Xie F, London SE, Southey BR et al. 2010. The zebra finch neuropeptidome: prediction, detection and expression. BMC Biol 8, 28. https://doi.org/10.1186/1741-7007-8-28 The authors report significantly different expression of the ancestral GH gene in the adult male zebra finch auditory forebrain after different song exposure experiences.

      We have amended the results section sentence and added all suggested citations. The sentence now reads: “The gene which encodes growth hormone receptor’s ligand, growth hormone, is interestingly duplicated and undergoing accelerated evolution in the genomes of songbirds (Rasband et al 2023); the GH ligand has been found to be upregulated in the zebra finch auditory forebrain following the presentation of familiar song (Xie et al 2010).”

      Figures:

      - Figure 1B. "Duration of sex typing" being a shorter bar compared to the others is not fully explained in the experimental design. Presumably at the end of this time period, the sex is non-invasively, phenotypically evident. I suggest an arrow pointing to the PHD/PHD range when sex is apparent in plumage/anatomy.

      - Figure 4. Caption appears to be truncated; "across all... genes"?

      Fixed

      - Figure 5. For 5E, 5F, 5G, 5H, consider enlarging the plots so overlapping gene symbols are readable. Alternately, smaller numbers or symbols could be used with a key in areas where overlapping symbols are hard to prevent.

      We agree that these are not the easiest to read; we originally offset the symbols in R to minimize overlaps, but it can only do so much for the more crammed panels. We have now added a supplemental .xlsx file with the underlying data from each of the 4 tests for readers that want to examine the data in more detail.

      Reviewer #2 (Recommendations For The Authors):

      Since WGCNA methods will inherently draw together sex-chromosome genes into the same module in systems without dosage compensation, I suggest the authors rerun the WGCNA using only female samples and only male samples. Then identify the composition of modules that differ between E2 and vehicle-treated females and compare these genes to males. Then from male WGCNA identify the composition of modules that differ between E2 and vehicle-treated males and compare to female modules.

      We thank the reviewer for their suggestions. However, we believe it is not as strong as the approach we used, which is grouping data from both sexes in the WGCNA analyses in a study that is looking for sex differences. The reviewer's proposed approach amounts to computing modules twice (once per sex), determining song system specialized modules and E2 responsive modules in both settings, then intersecting the two sets to find corresponding modules, all done to prevent the non-dose compensated sex chromosome genes from being drawn into the same module.

      While WGCNA does group the majority of sex chromosome genes into module E, it does not categorize them all this way (Fig 3). The module classification instead differentiates those sex chromosome genes whose expression is most explained by chromosome dosage / sex across regions (modE) from those whose expression is controlled by other sources of regulation; for an example of the latter, the growth hormone receptor (GHR) is one of several Z chromosome genes classified into modG, as its expression better correlates with the genes specialized to HVC than it does with the majority of dosage-dependent Z chromosome genes found in modE. Further, to remove biological sex as a variable in a WGCNA analysis that is focused on sex differences seems counterintuitive.

      Instead, to quantitatively address the reviewer’s concern, we conducted additional analyses, resulting in a new figure, associated text, and tables, which better describe sex/chromosome dosage effects on the abundance (FPKM) and expression ratios of sex chromosome transcripts by module, irrespective of brain region (Fig. 5). We find that the Z chromosome genes in modE were expressed at the expected chromosome dosage in the non-vocal surrounding regions (65.06% observed vs 66.6% expected) while in other modules, other Z chromosome genes were expressed at intermediate levels between equal expression and the expected chromosomal dosage. For example, the Z chromosome content of modules D and H exhibited near equal expression between sexes. Within the song system, Z chromosome gene content of modG was highly expressed in males beyond what is expected from chromosome dosage, consistent with modG’s male-specific upregulation in song nuclei relative to surrounds in the absence of E2. These results better demonstrate that in our WGCNA on the combined dataset we are able to separate those Z chromosome genes whose expression is predominantly dosage controlled from those subject to additional regulation such as song system specialization.
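
The dosage expectation quoted in this reply can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the FPKM values are hypothetical, and it assumes that "expected" refers to the male share of summed male and female expression when males carry two Z copies and females one, that is 2 / (2 + 1), or roughly two thirds.

```python
def male_share(male_fpkm, female_fpkm):
    """Fraction of summed male + female expression contributed by males."""
    total = male_fpkm + female_fpkm
    if total == 0:
        raise ValueError("gene is not expressed in either sex")
    return male_fpkm / total


def expected_male_share(male_copies=2, female_copies=1):
    """Male share expected if expression scales only with gene copy number."""
    return male_copies / (male_copies + female_copies)


# Hypothetical FPKM values for a single Z chromosome gene (not study data).
observed = male_share(male_fpkm=13.0, female_fpkm=7.0)
expected = expected_male_share()
print(f"observed {observed:.1%}, expected {expected:.1%}")
```

Under this reading, a male share near one half indicates equal expression between the sexes, and a share above two thirds indicates male-biased expression beyond what copy number alone would predict.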

      Fig. S3 Legend: 'Black arrow' -> 'Red arrow'

      Change made.

      Fig. S5 - What part of the figure shows the 'human convergent signature'? Also, simply listing the number of genes mapped to a chromosome is misleading to readers unfamiliar with the zebra finch genome, you should either provide the number of genes on each chromosome or present as corrected by that number.

      Fig. S5 was the same type of analyses in Fig. 3 but with an older zebra finch genome assembly, where we had not included the panel a for enrichments with genes convergent in expression between songbird song regions and humans speech brain regions. However, we see that Fig. S5 was not adding any new important information to the paper, so we removed it.

      For the chromosome analyses in Fig. 3b, we provide both the total raw number of module assigned genes broken down by chromosome (The black bar plots on the right) as well as a statistical fold-enrichment value of modules per chromosome. Given the number of genes per chromosome and genes per module in our data, we computed the fold-enrichment for each intersection (observed intersection size / expected intersection size). To test for the significance of these enrichments, we bootstrapped FDR corrected p values for the enrichment of each chromosome-module pairing by randomizing the mapping of genes to modules to construct a null distribution of fold enrichments for each intersection. Our intent was not to describe the size of the chromosomes themselves, information readily available elsewhere, but to show the disproportionate chromosomal origins of the gene sets considered by this study. Performing this enrichment test using all annotated genes per chromosome would artificially increase enrichment values and make the analysis less conservative by confounding the results with the inherent enrichment for “brain function” in the assigned genes relative to all genes.
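
One way to read the enrichment procedure described in this reply is sketched below. It is a minimal illustration, not the authors' code: the function names and toy gene sets are hypothetical, the null distribution is built by shuffling module labels across the assigned genes (a label-permutation test), and Benjamini-Hochberg is assumed for the FDR step because the response does not name the correction used.

```python
import random


def fold_enrichment(gene_chrom, gene_module, chrom, module):
    """Observed / expected overlap between one chromosome and one module."""
    genes = list(gene_module)
    n = len(genes)
    n_chrom = sum(1 for g in genes if gene_chrom[g] == chrom)
    n_module = sum(1 for g in genes if gene_module[g] == module)
    observed = sum(
        1 for g in genes if gene_chrom[g] == chrom and gene_module[g] == module
    )
    expected = n_chrom * n_module / n
    return observed / expected if expected > 0 else float("nan")


def permutation_p_value(gene_chrom, gene_module, chrom, module,
                        n_perm=1000, seed=0):
    """One-sided p value from shuffling the gene-to-module mapping."""
    rng = random.Random(seed)
    genes = list(gene_module)
    labels = [gene_module[g] for g in genes]
    observed = fold_enrichment(gene_chrom, gene_module, chrom, module)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        shuffled = dict(zip(genes, labels))
        if fold_enrichment(gene_chrom, shuffled, chrom, module) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data: eight module-assigned genes on chromosome "Z" or "1".
toy_chrom = {"g1": "Z", "g2": "Z", "g3": "Z", "g4": "Z",
             "g5": "1", "g6": "1", "g7": "1", "g8": "1"}
toy_module = {"g1": "E", "g2": "E", "g3": "E", "g4": "G",
              "g5": "E", "g6": "G", "g7": "G", "g8": "G"}
toy_fold = fold_enrichment(toy_chrom, toy_module, "Z", "E")
toy_p = permutation_p_value(toy_chrom, toy_module, "Z", "E")
```

Because only module-assigned genes enter both the observed and the shuffled counts, the background is the assigned gene set rather than all annotated genes, which matches the design choice argued for in this reply.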

      At several places you say "we correlated expression of each sex chromosome transcript with sexual dimorphism within each region, such that expressed W genes would be positively correlated and depleted Z chromosome genes would be anticorrelated." What was the sexual dimorphism that was being correlated with? Is this the eigengene?

      We thank you for this comment. Our language was less clear than it could be. We tested for correlations of both the eigengene and the individual gene expression profiles with the biological sex of the animals. We have changed the text to:

      “To do this, we tested for a correlation between the expression of each sex chromosome transcript and the animals’ sex within each brain region. We found that female-enriched transcripts were positively correlated with sex and male-enriched transcripts were anticorrelated (Fig. 4f,g).”
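
A correlation of this kind can be sketched as a Pearson correlation between expression and a binary sex variable (a point-biserial correlation). The statistic, the coding of sex (female = 1, male = 0) and the expression values below are assumptions made for illustration; the response does not specify them.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Assumed coding: female = 1, male = 0, for six animals in one brain region.
sex = [1, 1, 1, 0, 0, 0]
# Hypothetical expression values (not study data).
w_gene = [5.0, 6.0, 5.5, 0.0, 0.1, 0.0]   # W-linked, female-enriched
z_gene = [4.0, 4.2, 3.9, 8.1, 7.8, 8.0]   # Z-linked, male-enriched

r_w = pearson_r(sex, w_gene)   # positive under this coding
r_z = pearson_r(sex, z_gene)   # negative under this coding
```

With this coding the sign convention matches the quoted text: female-enriched transcripts correlate positively with sex and male-enriched transcripts correlate negatively.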

      Fig. 4A: The 'true/false' boxes and animal A-L labels are confusing and unnecessary. I'd suggest just using M and F (or sex symbols) with a horizontal line below each set of 3 for respective E2 and Veh.

      Change made.

      Reviewer #3 (Recommendations For The Authors):

      General comments:

      After the initial characterization of the datasets and module identification, it is quite hard to follow the logic of the data presentation in the various other Results sections or to clearly understand how they relate to the main stated goal to identify factors related to sex differences in vocal learning. The most relevant findings relate to the presumed actions of hormone treatment and sex chromosome gene dosage in song nuclei, whereas analyses of other brain areas, other chromosomes, or speech-related genes serve more as controls and/or appear as distractions from the main theme. A suggestion to increase the clarity of the presentation and potential impact of the study is to change the order of the presentation, focusing first on the specific analyses and comparisons that most directly speak to the main goals of the study, and then secondarily and more briefly presenting the controls or less related comparisons.

      The reviewer’s suggestion for the results section organization is exactly what we had tried to do. We opened with the first paragraph on identification of modules, then presented the song nuclei specific modules, followed by E2-changes to those modules, and then followed by other specific results for the remainder of the paper, including module enrichments to specific chromosomes. The reviewer mentioned that our analyses of “other brain areas” (which we assume to mean the non-vocal surround regions), other chromosomes (which we assume means autosomes) and speech-related genes as controls were a distraction in the paper; but within our analysis, these other brain regions are essential controls needed to assess the song-system specificity of any sex differences observed from the very first paragraphs of the results; the autosomes were not controls for sex chromosome results, but primary results in and of themselves; the overlap with speech-related genes was also not a control, but a novel discovery. We have revised these points in the paper to make them clearer, and revised some of the section titles and transitions between sections to help increase clarity of the main storyline of the paper.

      A related comment is that many of the inferences drawn from the WGCNA analysis were quite complex, thus independent verification of some predictions would be quite valuable. For example, consider the passage: "In non-vocal learning juvenile females, interestingly LMAN was specialized relative to the AN by the same gene modules as in males (B, F, and I) as well as an additional module G (Fig. 2b); RA was specialized by module A as in males, but not module L and by additional modules A and G. In contrast, neither juvenile female HVC nor Area X exhibited significant gene module expression specializations relative to their surrounds." Providing in situ hybridization verification of these regional gene expression predictions with a few representative genes seems quite feasible given the group's expertise and would considerably strengthen confidence in the module-based inferences.

      We performed in-situ independent validation of 36 candidate genes in our first study with this dataset (Choe et al 2021). We now mention this validation in the revised paper. The reviewer’s selection of one of our sentences though made us realize that our grammar used to explain the results was not as clear as it needs to be. We thus cleaned up the grammar of our module descriptions so that it should be communicated with less complexity, the main issue noted by the reviewer.

      Because this is a re-analysis of a previously published dataset, the authors should more explicitly describe somewhere in the Discussion how the present analysis advances the understanding of sex differences in songbird neuroanatomy and behavior beyond the previous analysis.

      We have added an additional sentence into the discussion more clearly separating the results of the current study from our previous work.

      Specific comments:

      Abstract:

      There is evidence (from Frank Johnson's lab) that RA does not completely atrophy in female zebra finches, but is still present with more preserved connectivity than previously thought, possibly related to non-singing function(s). A term like 'marked reduction' of female RA may more accurately reflect the current state of knowledge.

      We have changed the text to “partial atrophy”.

      The term "driver" is undefined and unclear at this point of the paper; a clear definition for "driver" is also lacking in the Intro.

      We now define “driver” or “genetic driver” as understood to mean “a genetic locus whose expression and/or inheritance strongly regulates the trait of interest”.

      When citing the literature on studies that identified "specific genes with specialized up- or down-regulated expression in song and speech circuits relative to the surrounding motor control circuits", the authors should also cite studies from other labs (e.g. Li et al., PNAS, 2007; Lovell et al, Plos One 2008; Lovell et al, BMC Genomics 2018; Nevue et al, Sci Rep. 2020), to be accurate and fair.

      Citations added

      For clarity, the authors should explicitly formulate the hypothesis they are proposing at the end of the Summary.

      We thank the reviewer for this comment. We have replaced the final sentence of the summary with: “We present a hypothesis where reduced dosage and expression of these Z chromosome genes changes the developmental trajectory of female HVC, partially preventable by estrogen treatment, contributing to the loss of song learning behavior.”

      Introduction:

      Vocal learning is arguably the ability to imitate 'vocal' sounds, this could be clarified here.

      We have amended the sentence to “Vocal learning is the ability to imitate heard sounds using a vocal organ…”

      Given they are currently considered sister taxa, can the author briefly explain what is the basis for assuming that songbirds and parrots independently evolved vocal learning?

      Although songbirds and parrots belong to a monophyletic clade, they are not sister taxa. There are two clades separating them that are vocal non-learners. We have cited the reference that demonstrated this (e.g. Jarvis et al 2014 Science).

      Why use Taeniopygia castanotis rather than the more broadly used Taeniopygia guttata?

      Zebra finches were recently reclassified and T. castanotis is now more accurate. The Indonesian Timor zebra finch retained T. guttata while the Australian finch, used here, was classified as T. castanotis.

      The authors state: "...vocal learning is strongly sexually dimorphic in zebra finches and many other vocal learning species" and cite Nottebohm and Arnold, Science, 1978. That landmark paper only shows dimorphism in song nuclei (not learning) in two songbird species. The authors should provide citations for other species and behavior, or modify the statement.

      We have added an additional citation (Odom et al.) to this sentence which covers the phylogeny more broadly.

      The authors refer to the nucleus RA as being located in the lateral intermediate arcopallium (LAI). Other labs have described this domain as the dorsal part of the intermediate arcopallium, thus AId or AID (Mello et al., JCN, 2019; Yuan and Bottjer, J Neurophys 2019; Yuan and Bottjer, eNeuro, 2020; Nevue et al., BMC Genomics, 2020). The authors should acknowledge this discrepancy in nomenclature so that data and conclusions can be more readily compared across studies.

      We thank the reviewer and agree that this is helpful. We have added a note at the first mention of LAI.

      The authors state that data from the gynandromorph bird described by Agate et al implicates "sex chromosome gene expression within the song system" as involved in the song system sexual dimorphism. That study, however, only rules out circulating gonadal steroids, and while suggesting a cell-autonomous mechanism like sex chromosome genes, it does not necessarily exclude other brain-autonomous factors like sex differences in local production of sex steroids.

      We say that this study “implicated” sex chromosome gene expression, which is accurate per the results and discussion of that study. We are unsure what “brain autonomous factors like sex differences in local production of sex steroids” means. “Brain autonomous” and “local production” in the brain seem contradictory in this context.

      Results:

      The authors state that "the E2-treated females in this study had similarly sized song system nuclei as males, indicating that E2 treatment prevented atrophy". Can they clarify whether the VEH-treated females actually had smaller RAs than E2-treated females or VEH-treated males at this age? This is still quite early in development and it is unclear to what extent RA's marked sexual dimorphism in adults or later developmental ages has already taken place in untreated (or VEH-treated) birds. A related comment is that the authors state later on: "We interpret these findings to indicate that: LMAN and RA atrophy later in juvenile female development..." Does this mean these nuclei actually did not show the marked decreases predicted earlier in the text? Clarifying this point would be helpful.

      We thank the reviewer for pointing out this discrepancy, on which reviewer #1 also asked for clarification. RA size at this age is similar in males and females. However, HVC is smaller and Area X is absent in females, and E2 treatment partially prevents this atrophy. The text now reads:

      “In our previous study, we found that estradiol treatment in PHD30 females caused HVC to enlarge and Area X to appear when it normally does not develop in females, but both at sizes less than in untreated or treated males. The sizes of PHD30 female LMAN and RA were already the same as those seen in males, as the latter has not yet atrophied at this age (25).”

      The authors acknowledge that area X is absent in untreated and VEH-treated females. Could they please clarify how area X and the surrounding striatal tissue that excludes area X were identified for laser capture dissections in juvenile females?

      We have added the following statement to the main text portion discussing the dissections.

      “In the case of vehicle-treated females, which lack Area X, a piece of striatum from the same location where Area X is found in males was taken.”

      Some passages in Results discussing the authors' interpretation of the modules seem quite speculative and possibly belong instead in the Discussion. For example: "... that module A and G genes could be associated with the start of this atrophy; HVC and Area X are likely the first to atrophy or not develop; and lack of any gene module specialization in them at this age could mean that they would be more sensitive to estrogen prevention of vocal learning loss."

      As suggested, we have removed this text from the results; these ideas were already presented in the Discussion. We have merged the resulting small paragraph with the preceding paragraph.

      The authors state: "To assess the effects of chronic exogenous estrogen on the developing song system, we first performed a control analysis of modules in the E2-treated juvenile males." How can an assessment of estrogen effects be a "control" analysis? Does this refer to a contrast with females? Please clarify the language here.

      The reviewer is correct, that E2 treatment in males should not be considered a control experiment. We removed the word “control”.

      When discussing the GO-enriched terms for module G, it is unclear how the authors reached the conclusion about "proliferative", as the enriched terms do not refer to processes more directly indicative of proliferation like "cell division" or "cell cycle regulation". Rather, these terms seem more related to differentiation and growth, which do not necessarily imply proliferation. The authors also refer to "HVC proliferation" later on in the Discussion. However, there is conclusive evidence from several labs that proliferative events associated with postnatal neuronal addition and/or replacement in song nuclei occur in the subventricular zone, not in song nuclei like HVC itself, and that the growth of song nuclei largely reflects cell survival, as well as growth in size and complexity under the regulation of sex steroids.

      We agree that “proliferative” may have been a poor word choice here. We did not mean to indicate that cell division was occurring in HVC itself. Instead we meant to indicate that HVC is able to accommodate the newborn neurons from the SVZ. We have replaced the word “proliferative” throughout. In the instance the reviewer mentions specifically, we replaced it with “...potentially act to integrate and differentiate late born neurons.”

      With regard to module E, referring to a telencephalon-wide sexually dimorphic gene expression program seems quite a stretch, given that only a few regions were sampled and compared between sexes. These related statements should be toned down.

      We have replaced “telencephalon-wide” with “more distributed across the finch telencephalon” and other similar language in each instance.

      The following passage is very speculative and should be shortened and/or moved to the Discussion: "Based on the findings in these gene sets, we hypothesize that without excess estrogen in females, HVC expansion is prevented by not specializing the growth and neuronal migration promoting genes in module G to the HVC lineage by late development. This is potentially enacted by depleting necessary gene products from the Z sex chromosome, such as GHR, which are already present in only one copy."

      We have deleted this portion of the text, as the idea is already present in the discussion.

      Figure 5: To this reviewer, the comparisons of sex differences and of female response to E2 are the most relevant and informative ones, whereas the regional differences between song nuclei and surrounds refer to different cell populations and cell types where other processes may be occurring, independently of what occurs in song nuclei. It thus seems like the intersection analysis in panel 5i may be subtracting out important "core genes" in terms of E2 effects and/or sex differences in the most relevant cell populations, i.e. in this case within song nucleus HVC.

      Song learning is a specialized behavior, and the associated vocal learning brain regions have a set of hundreds of specialized genes compared to the surrounds. Our previous findings show that E2 drives the appearance of these specializations in female zebra finches. Thus, we considered this the most interesting question to focus on, which we have further highlighted. Nevertheless, in response to the reviewer’s suggestion, we have added a .xlsx supplemental file containing the results from each of the individual tests so readers may examine any single comparison, or set of comparisons, in more detail.

      Discussion:

      It is unclear what the term "critical period" refers to in: "during the critical period of atrophy for the female vocal circuit"; please clarify.

      We agree that our language was nebulous. We have replaced it with “as several male song control nuclei begin to expand and female nuclei partially atrophy”

      In: "HVC appeared unspecialized at the level of gene module expression in control females", does "unspecialized" refer to a lack of difference in gene expression when compared to surroundings? Please clarify. The same comment applies to other uses of "unspecialized" in this paragraph.

      Yes, unspecialized means lack of difference in gene expression in the song nucleus. To clarify this point, we have reworked that and the following sentence as follows:

      “HVC appeared unspecialized compared to the surrounding nidopallium at the level of gene module expression in control females, with no significantly differentially expressed MEGs. However, in E2-treated females, HVC exhibited a subset of the observed male HVC gene expression specializations. Similarly, the vehicle-treated female striatum located where Area X would be also lacked any specialized gene module expression, but the E2-treated female Area X exhibited a subset of the male Area X specializations, consistent with the known absence of Area X in vehicle-treated females and presence in E2-treated females.”

      The authors state: "...we surprisingly found that the most specialized genes were disproportionately from the Z chromosome", when discussing module G in HVC. Why is this so surprising? In a sense, this could be taken as consistent with the findings of Friedrich et al, 2022, where sex differences in the RA transcriptome were predominantly Z related on 20 dph. Arguably 20 dph is still quite close to 30 dph in the present study, when compared to 50 dph in Friedrich et al, when autosomes predominate.

      Our bioRxiv preprint was originally posted in July 2021, prior to the publication of Friedrich et al., 2022; however, we had previously added to our discussion that several of our results are consistent with the observations of Friedrich et al.

      We have a different interpretation of the Z chromosome gene results in Friedrich et al. While the percentage of specialized genes from the Z chromosome decreased, the absolute number of specialized Z chromosome genes actually increased over this interval. In Fig. 3a from Friedrich et al. it appears that ~28% of Z chromosome genes were sexually dimorphic in their expression in RA at PHD20 but that ~39% of Z chromosome genes were similarly dimorphic at PHD50. We interpret this result as the Z chromosome genes being among the earliest genes differentially expressed between the sexes, not that their differential expression or role ever subsequently decreased. We have reworked this portion of the discussion to make our point clearer:

      “This model of sex chromosome influenced song system development is consistent with recent observations comparing male and female zebra finch transcriptomes from RA at young juvenile (PHD20) and young adult (PHD50) ages in un-manipulated birds (Friedrich et al. 2022)57. While that study proposes that the role of the sex chromosome in maintaining transcriptomic sex differences diminishes across development, as the proportion of specialized genes that originate on the sex chromosomes diminishes, this effect was driven by large increases in differentially expressed autosomal genes rather than by any reduction in sex chromosome dimorphism; the percentage of differentially expressed Z chromosome genes increased from PHD20 (28%) to PHD50 (39%) (Friedrich et al). This leads us to conclude that sexually dimorphic Z chromosome expression at juvenile ages precedes the sexually dimorphic expression of the autosomes seen in adults. This is consistent with our hypothesis that sufficient expression of select Z chromosome gene products (GHR, etc..) is necessary for subsequent autosomal song system specializations (modG).”

      Further, when we write ”When examining the module G HVC specialization induced by E2-treatment in female HVC, we surprisingly found that the most specialized genes were disproportionately from the Z chromosome” we are referring to the upregulation of module G by E2 in female HVC, not the sex difference described in RA by Friedrich et al. which only utilized un-treated RA samples and thus is more likely related to our observations of module E.
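
      The distinction drawn above between the Z chromosome's share of specialized genes and its absolute count can be illustrated with a small worked example. Only the 28% and 39% figures come from our reading of Friedrich et al.; the total number of Z chromosome genes and the autosomal counts below are hypothetical placeholders chosen purely to show the arithmetic.

```python
def z_count_and_share(z_total, z_fraction_dimorphic, autosomal_dimorphic):
    """Return (number of dimorphic Z genes, Z share of all dimorphic genes)."""
    z_dimorphic = round(z_total * z_fraction_dimorphic)
    share = z_dimorphic / (z_dimorphic + autosomal_dimorphic)
    return z_dimorphic, share


Z_TOTAL = 1000  # hypothetical number of Z chromosome genes

# Fractions (0.28, 0.39) are the approximate values read from Friedrich et al.;
# the autosomal counts are hypothetical and only illustrate a large increase.
z_phd20, share_phd20 = z_count_and_share(Z_TOTAL, 0.28, autosomal_dimorphic=100)
z_phd50, share_phd50 = z_count_and_share(Z_TOTAL, 0.39, autosomal_dimorphic=2000)

print(f"PHD20: {z_phd20} Z genes, {share_phd20:.0%} of all dimorphic genes")
print(f"PHD50: {z_phd50} Z genes, {share_phd50:.0%} of all dimorphic genes")
```

      Under these assumed counts the number of dimorphic Z genes rises while their share of all dimorphic genes falls, which is the pattern we argue for.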

      The term "sexual dimorphism" has been more traditionally used for sex differences that are very marked, like features that are highly regressed or absent in one sex, most often in females. Quantitative differences in gene expression, including dosage differences like those related to module E, are more appropriately described as sex differences rather than dimorphisms. That usage would be more consistent with most of the literature, and thus preferable.

      We did a Google search for common definitions and found closer to the opposite: sexual dimorphism is more often used for differences of degree (with the zebra finch example as one of the top hits), and sex differences are often used for more absolute differences (like presence vs absence of the Y chromosome). Further, as in the reviewer’s first sentence, the definition of sexual dimorphism is a sex difference. That is, the two phrases can be interchangeable. Thus, we prefer to keep sexual dimorphism.

      Several references are incomplete or seem truncated, like 9 and 10.

      Fixed

      Table S2: Please examine and take into account the W gene curation presented in Table S3 of Friedrich et al., 2022.

      We have added additional supplementals (supplemetal_w_chrom_express.csv and supplemetal_z_chrom_express.csv) of the data provided in new Fig 5 incorporating the curation information from Table S3 from Friedrich et al.

      Data availability:

      Genes for all the main modules identified should be presented in a Supplemental Table, or through a link to a stable data repository.

      We have added an additional Supplemental Table supplemental_gene_module_assignment.csv with this information.

    1. eLife Assessment

      This valuable paper introduces Heron, lightweight scientific software that is designed to streamline the implementation of complex experimental pipelines. The software is tailored for workflows that require coordinating many logical steps across interconnected hardware components with heterogeneous computing environments. The authors convincingly demonstrate Heron's utility and effectiveness in the context of behavioral experiments, addressing a growing need among experimentalists for flexible and scalable solutions that accommodate diverse and evolving hardware requirements.

    2. Reviewer #2 (Public review):

      Summary:

      The authors provide an open-source graphic user interface (GUI) called Heron, implemented in Python, that is designed to help experimentalists to:

      (1) Design experimental pipelines and implement them in a way that is closely aligned with their mental schemata of the experiments

      (2) Execute and control the experimental pipelines with numerous interconnected hardware and software on a network.

      The former is achieved by representing an experimental pipeline using a Knowledge Graph and visually representing this graph in the GUI. The latter is accomplished by using an actor model to govern the interaction among interconnected nodes through messaging, implemented using ZeroMQ. The nodes themselves execute user-supplied code in, but not limited to, Python.

      Using three showcases of behavioral experiments on rats, the authors highlighted four benefits of their software design:

      (1) The knowledge graph serves as a self-documentation of the logic of the experiment, enhancing the readability and reproducibility of the experiment,

      (2) The experiment can be executed in a distributed fashion across multiple machines that each has different operating system or computing environment, such that the experiment can take advantage of hardware that sometimes can only work on a specific computer/OS, a commonly seen issue nowadays,

      (3) The users supply their own Python code for node execution that is supposed to be more friendly to those who do not have a strong programming background,

      (4) The GUI can also be used as an experiment control panel for users to control/update parameters on the fly.

      Strengths:

      (1) The software is light-weight and open-source, provides a clean and easy-to-use GUI,

      (2) The software answers the need of experimentalists, particularly in the field of behavioral science, to deal with the diversity of hardware that becomes restricted to run on dedicated systems. It can also be widely adopted in many other experimental settings.

      (3) The software has a solid design that seems to be functionally reliable and useful under many conditions, demonstrated by a number of sophisticated experimental setups.

      (4) The software is well documented. The authors pay special attention to documenting the usage of the software and setting up experiments using this software.

      Comments on revisions: The authors have addressed my concerns from the initial review.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews

      Reviewer #1 (Public Review):

      Summary:

      The authors have created a system for designing and running experimental pipelines to control and coordinate different programs and devices during an experiment, called Heron. Heron is based around a graphical tool for creating a Knowledge Graph made up of nodes connected by edges, with each node representing a separate Python script, and each edge being a communication pathway connecting a specific output from one node to an input on another. Each node also has parameters that can be set by the user during setup and runtime, and all of this behavior is concisely specified in the code that defines each node. This tool tries to marry the ease of use, clarity, and self-documentation of a purely graphical system like Bonsai with the flexibility and power of a purely code-based system like Robot Operating System (ROS).
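
      The structure described in this summary (nodes with parameters, edges joining a specific output of one node to an input of another) can be pictured with a minimal sketch. This is not Heron's internal representation: the class names, method names and port names below are hypothetical, and the three nodes mirror the basic key press, random number and display test mentioned later in this review.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    source: str        # name of the upstream node
    output_port: str   # output on the upstream node
    target: str        # name of the downstream node
    input_port: str    # input on the downstream node


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, name, inputs=(), outputs=(), **parameters):
        """Register a node with its named ports and user-set parameters."""
        self.nodes[name] = {
            "inputs": set(inputs),
            "outputs": set(outputs),
            "parameters": parameters,
        }

    def connect(self, source, output_port, target, input_port):
        """Add an edge, refusing ports that the nodes do not declare."""
        if output_port not in self.nodes[source]["outputs"]:
            raise ValueError(f"{source!r} has no output {output_port!r}")
        if input_port not in self.nodes[target]["inputs"]:
            raise ValueError(f"{target!r} has no input {input_port!r}")
        self.edges.append(Edge(source, output_port, target, input_port))

    def downstream_of(self, name):
        """Names of the nodes that receive messages directly from `name`."""
        return sorted({e.target for e in self.edges if e.source == name})


graph = KnowledgeGraph()
graph.add_node("Key Press", outputs=["key"], key="a")
graph.add_node("Random Number", inputs=["trigger"], outputs=["number"], low=0, high=10)
graph.add_node("Display", inputs=["value"])
graph.connect("Key Press", "key", "Random Number", "trigger")
graph.connect("Random Number", "number", "Display", "value")
```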

      Strengths:

      The underlying idea behind Heron, of combining a graphical design and execution tool with nodes that are made as straightforward Python scripts, seems like a great way to get the relative strengths of each approach. The graphical design side is clear, self-explanatory, and self-documenting, as described in the paper. The underlying code for each node tends to also be relatively simple and straightforward, with a lot of the complex communication architecture successfully abstracted away from the user. This makes it easy to develop new nodes, without needing to understand the underlying communications between them. The authors also provide useful and well-documented templates for each type of node to further facilitate this process. Overall this seems like it could be a great tool for designing and running a wide variety of experiments, without requiring too much advanced technical knowledge from the users.

      The system was relatively easy to download and get running, following the directions and already has a significant amount of documentation available to explain how to use it and expand its capabilities. Heron has also been built from the ground up to easily incorporate nodes stored in separate Git repositories and to thus become a large community-driven platform, with different nodes written and shared by different groups. This gives Heron a wide scope for future utility and usefulness, as more groups use it, write new nodes, and share them with the community. With any system of this sort, the overall strength of the system is thus somewhat dependent on how widely it is used and contributed to, but the authors did a good job of making this easy and accessible for people who are interested. I could certainly see Heron growing into a versatile and popular system for designing and running many types of experiments.

      Weaknesses:

      (1) The number one thing that was missing from the paper was any kind of quantification of the performance of Heron in different circumstances. Several useful and illustrative examples were discussed in depth to show the strengths and flexibility of Heron, but there was no discussion or quantification of performance, timing, or latency for any of these examples. These seem like very important metrics to measure and discuss when creating a new experimental system.

      Heron is practically a thin layer of obfuscation of signal passing across processes. Given its design approach it is up to the code of each Node to deal with issues of timing, synching and latency and thus up to each user to make sure the Nodes they author fulfil their experimental requirements. Having said that, Heron provides a large number of tools to allow users to optimise the generated Knowledge Graphs for their use cases. To showcase these tools, we have expanded on the third experimental example in the paper with three extra sections, two of which relate to Heron’s performance and synching capabilities. One is focusing on Heron’s CPU load requirements (and existing Heron tools to keep those at acceptable limits) and another focusing on post experiment synchronisation of all the different data sets a multi Node experiment generates.   

      (2) After downloading and running Heron with some basic test Nodes, I noticed that many of the nodes were each using a full CPU core on their own. Given that this basic test experiment was just waiting for a keypress, triggering a random number generator, and displaying the result, I was quite surprised to see over 50% of my 8-core CPU fully utilized. I don’t think that Heron needs to be perfectly efficient to accomplish its intended purpose, but I do think that some level of efficiency is required. Some optimization of the codebase should be done so that basic tests like this can run with minimal CPU utilization. This would then inspire confidence that Heron could deal with a real experiment that was significantly more complex without running out of CPU power and thus slowing down.

      The original Heron allowed the OS to choose how to manage resources across the required processes. We were aware that this could lead to significant use of CPU time, as well as an occasionally significant drop of packets (which was dependent on the OS and its configuration). This drop happened mainly when the Node was running a secondary process (like the Unity game process in the 3rd example). To mitigate these problems, we have now implemented a feature allowing the user to choose the CPU that each Node’s worker function runs on, as well as any extra processes the worker process initialises. This is accessible from the Saving secondary window of the node. This stops the OS from swapping processes between CPUs and eliminates the dropping of packages due to the OS behaviour. It also significantly reduces the utilised CPU time. To showcase this, we initially ran the simple example mentioned by the reviewer. The computer running only background services was using 8% of CPU (8 cores). With the Heron GUI running but with no active Graph, the CPU usage went to 15%. With the Graph running and Heron’s processes running on OS-attributed CPU cores, the total CPU was at 65% (in line with the reviewer’s observation of over 50%). By choosing a different CPU core for each of the three worker processes the CPU went down to 47%, and finally, when all processes were forced to run on the same CPU core, the CPU load dropped to 30%. So, Heron in its current implementation running its GUI and 3 Nodes takes 22% of CPU load (30% minus the 8% background load). This is still not ideal but is a consequence of the overhead of running multiple processes vs multiple threads. We believe that, given Heron’s latest optimisation, offering more control of system management to the user, the benefits of multi-process applications outweigh this hit in system resources.

      We have also increased the scope of the third example we provide in the paper and there we describe in detail how a full-scale experiment with 15 Nodes (which is the upper limit of number of Nodes usually required in most experiments) impacts CPU load. 

      Finally, we have added to Heron’s roadmap extra tasks focusing only on optimisation (profiling and using Numba for the time-critical parts of the Heron code).
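
      For readers unfamiliar with CPU pinning, the idea behind the feature described above can be sketched with the Python standard library. This is a generic illustration and not Heron's implementation: the function name is hypothetical, and `os.sched_setaffinity` is available on Linux but not on Windows or macOS, so the sketch reports failure on platforms that lack it.

```python
import os


def pin_current_process(core):
    """Restrict the calling process to a single CPU core.

    Returns True on success and False where the OS does not expose
    os.sched_setaffinity (for example Windows and macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core})  # pid 0 means "the calling process"
    return True
```

      A worker process would call such a function once at start-up, before entering its main loop, so that the OS scheduler no longer migrates it between cores.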

      (3) I was also surprised to see that, despite being meant specifically to run on and connect diverse types of computer operating systems and being written purely in Python, the Heron Editor and GUI must be run on Windows. This seems like an unfortunate and unnecessary restriction, and it would be great to see the codebase adjusted to make it fully cross-platform-compatible.

      This point was also mentioned by reviewer 2. This was a mistake on our part and has now been corrected in the paper. Heron (GUI and underlying communication functionality) can run on any machine on which the underlying Python libraries run, that is, Windows, Linux (both x86 and Arm architectures) and MacOS. We have tested it on Windows (10 and 11, both x64), Linux PC (Ubuntu 20.04.6, x64) and Raspberry Pi 4 (Debian GNU/Linux 12 (bookworm), aarch64). The Windows and Linux versions of Heron have undergone extensive debugging and all of the available Nodes (that are not OS specific) run on those two systems. We are in the process of debugging the Nodes’ functionality for RasPi. The MacOS version, although functional, requires further work to make sure all of the basic Nodes are functional (which is not the case at the moment). We have also updated our manuscript (Multiple machines, operating systems and environments) to include the above information.

      (4) Lastly, when I was running test experiments, sometimes one of the nodes, or part of the Heron editor itself would throw an exception or otherwise crash. Sometimes this left the Heron editor in a zombie state where some aspects of the GUI were responsive and others were not. It would be good to see a more graceful full shutdown of the program when part of it crashes or throws an exception, especially as this is likely to be common as people learn to use it. More problematically, in some of these cases, after closing or force quitting Heron, the TCP ports were not properly relinquished, and thus restarting Heron would run into an "address in use" error. Finding and killing the processes that were still using the ports is not something that is obvious, especially to a beginner, and it would be great to see Heron deal with this better. Ideally, code would be introduced to carefully avoid leaving ports occupied during a hard shutdown, and furthermore, when the address in use error comes up, it would be great to give the user some idea of what to do about it.

      A lot of effort has been put into Heron to achieve graceful shutdown of processes, especially when these run on different machines that do not know when the GUI process has closed. The code that is being suggested to avoid leaving ports open has been implemented, and this works properly when processes do not crash (Heron is terminated by the user) and almost always when there is a bug in a process that forces it to crash. In the version of Heron available during the reviewing process there were bugs that caused the above behaviour (Node code hanging and leaving zombie processes) on MacOS systems. These have now been fixed. There are still rare instances, though, especially during Node development, in which crashing processes will hang and need to be terminated manually. We have taken on board the reviewer’s comments that users should be made more aware of these issues and have also described this situation in the Debugging part of Heron’s documentation. There we explain the logging and other tools Heron provides to help users debug their own Nodes and how to deal with hanging processes.

      Heron is still in alpha (usable but with bugs) and the best way to debug it and iron out all the bugs in all use cases is through usage from multiple users and error reporting (we would be grateful if the errors the reviewer mentions could be reported in Heron’s GitHub Issues page). We are always addressing and closing any reported errors, since this is the only way for Heron to transition from alpha to beta and eventually to production code quality.
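
      The "address in use" situation raised by the reviewer can be detected before start-up with a short check. The sketch below is generic and uses only the Python standard library; it is not code from Heron, and both function names are hypothetical. The message points the user to `lsof` and `netstat`, which are the usual tools for finding the process that still holds a port.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def explain_port_conflict(port):
    """Build a message telling the user how to find the process holding `port`."""
    return (
        f"Port {port} is still in use, probably by a process left over from a "
        f"previous run. Find it with `lsof -i :{port}` (Linux/MacOS) or "
        f"`netstat -ano | findstr :{port}` (Windows) and stop it before restarting."
    )
```

      Running such a check on each required port at launch, and printing the message instead of a raw traceback, would give beginners a concrete next step.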

      Overall I think that, with these improvements, this could be the beginning of a powerful and versatile new system that would enable flexible experiment design with a relatively low technical barrier to entry. I could see this system being useful to many different labs and fields. 

      We thank the reviewer for the positive and supportive words and for the constructive feedback. We believe we have now addressed all the raised concerns.

      Reviewer #2 (Public Review):

      Summary:

      The authors provide an open-source graphic user interface (GUI) called Heron, implemented in Python, that is designed to help experimentalists to

      (1) design experimental pipelines and implement them in a way that is closely aligned with their mental schemata of the experiments,

      (2) execute and control the experimental pipelines with numerous interconnected hardware and software on a network.

      The former is achieved by representing an experimental pipeline using a Knowledge Graph and visually representing this graph in the GUI. The latter is accomplished by using an actor model to govern the interaction among interconnected nodes through messaging, implemented using ZeroMQ. The nodes themselves execute user-supplied code in, but not limited to, Python.

      Using three showcases of behavioral experiments on rats, the authors highlighted three benefits of their software design:

      (1) the knowledge graph serves as a self-documentation of the logic of the experiment, enhancing the readability and reproducibility of the experiment,

      (2) the experiment can be executed in a distributed fashion across multiple machines that each has a different operating system or computing environment, such that the experiment can take advantage of hardware that sometimes can only work on a specific computer/OS, a commonly seen issue nowadays,

      (3) the users supply their own Python code for node execution that is supposed to be more friendly to those who do not have a strong programming background.

      Strengths:

      (1) The software is light-weight and open-source, provides a clean and easy-to-use GUI,

      (2) The software answers the need of experimentalists, particularly in the field of behavioral science, to deal with the diversity of hardware that becomes restricted to run on dedicated systems.

      (3) The software has a solid design that seems to be functionally reliable and useful under many conditions, demonstrated by a number of sophisticated experimental setups.

      (4) The software is well documented. The authors pay special attention to documenting the usage of the software and setting up experiments using this software.

      Weaknesses:

      (1) While the software implementation is solid and has proven effective in designing the experiment showcased in the paper, the novelty of the design is not made clear in the manuscript. Conceptually, both the use of graphs and visual experimental flow design have been key features in many widely used software packages, as suggested in the background section of the manuscript. In particular, contrary to the authors’ claim that only pre-defined elements can be used in Simulink or LabView, Simulink introduced MATLAB Function Block back in 2011, and Python code can be used in LabView since 2018. Such customization of nodes is akin to what the authors presented.

      In the Heron manuscript we have provided an extensive literature review of existing systems from which Heron has borrowed ideas. We never wished to say that graphs and visual code are what set Heron apart, since these are technologies predating Heron by many years and implemented by a large number of software packages. We also do not believe that we have mentioned that LabView or Simulink can utilise only predefined nodes. What we have said is that in such systems (like LabView, Simulink and Bonsai) the focus of the architecture is on prespecified low-level elements, while the ability for users to author their own is there but only as an afterthought. The difference with Heron is that in the latter the focus is on the users developing their own elements. One could think of LabView-style software as node-based languages (with low-level visual elements like loops and variables) that also allow extra scripting, while Heron is a graphical wrapper around Python where nodes are graphical representations of whole processes. To our knowledge there is no other software that allows the very fast generation of graphical elements representing whole processes whose communication can also be defined graphically. Apart from this distinction, Heron also allows a graphical approach to writing code for processes that span different machines, which again, to our knowledge, is a novelty of our approach and one of its strongest points towards ease of experimental pipeline creation (without sacrificing expressivity).

      (2) The authors claim that the knowledge graph can be considered as a self-documentation of an experiment. I found it to be true to some extent. Conceptually it’s a welcoming feature and the fact that the same visualization of the knowledge graph can be used to run and control experiments is highly desirable (but see point 1 about novelty). However, I found it largely inadequate for a person to understand an experiment from the knowledge graph as visualized in the GUI alone. While the information flow is clear, and it seems easier to navigate a codebase for an experiment using this method, the design of the GUI does not make it a one-stop place to understand the experiment. Take the Knowledge Graph in Supplementary Figure 2B as an example, it is associated with the first showcase in the result section highlighting this self-documentation capability. I can see what the basic flow is through the disjoint graph where 1) one needs to press a key to start a trial, and 2) camera frames are saved into an avi file presumably using FFMPEG. Unfortunately, it is not clear what the parameters are and what each block is trying to accomplish without the explanation from the authors in the main text. Neither is it clear about what the experiment protocol is without the help of Supplementary Figure 2A.

      In my opinion, text/figures are still key to documenting an experiment, including its goals and protocols, but the authors could take advantage of the fact that they are designing a GUI where this information, with properly designed API, could be easily displayed, perhaps through user interaction. For example, in Local Network -> Edit IPs/ports in the GUI configuration, there is a good tooltip displaying additional information for the "password" entry. The GUI for the knowledge graph nodes can very well utilize these tooltips to show additional information about the meaning of the parameters, what a node does, etc, if the API also enforces users to provide this information in the form of, e.g., Python docstrings in their node template. Similarly, this can be applied to edges to make it clear what messages/data are communicated between the nodes. This could greatly enhance the representation of the experiment from the Knowledge graph.

      In the first showcase example in the paper, “Probabilistic reversal learning. Implementation as self-documentation”, we go through the steps that one would follow in order to understand the functionality of an experiment through Heron’s Knowledge Graph. The Graph is not just the visual representation of the Nodes in the GUI but also their corresponding code bases. We mention that the way Heron’s API limits the way a Node’s code is constructed (through an Actor-based paradigm) allows experimenters to easily go to the code base of a specific Node and understand its two functions (initialisation and worker) without getting bogged down in the code base of the whole Graph (since these two functions never call code from any other Nodes). Newer versions of Heron facilitate this easy access to the appropriate code by also allowing users to attach to Heron their favourite IDE and open in it any Node’s two scripts (worker and com) when they double click on the Node in Heron’s GUI. On top of this, Heron now (in the versions developed as answers to the reviewers’ comments) allows Node creators to add extensive comments on a Node but also separate comments on the Node’s parameters and input and output ports. Those can be seen as tooltips when one hovers over the Node (a feature that can be turned off or on by the Info button on every Node).

      As Heron stands at the moment, we have not made the claim that the Heron GUI is the full picture in the self-documentation of a Graph. We take note, though, of the reviewer’s desire to have the GUI be the only tool a user would need in order to understand an experimental implementation. The solution to this is the same as the one described by the reviewer: using the GUI to show the user the parts of the code relevant to a specific Node without the user having to go to a separate IDE or code editor. The reason this has not been implemented yet is the lack of a text editor widget in the underlying GUI library (DearPyGUI). This is in their roadmap for their next large release, and when this exists we will use it to implement exactly the idea the reviewer is suggesting, but also with the capability to not only read comments and code but also directly edit a Node’s code (see Heron’s roadmap). Heron’s API at the moment is ideal for providing such a text editor straight from the GUI.

      (3) The design of Heron was primarily with behavioral experiments in mind, in which highly accurate timing is not a strong requirement. Experiments in some other areas that this software is also hoping to expand to, for example, electrophysiology, may need very strong synchronization between apparatus, for example, the record timing and stimulus delivery should be synced. The communication mechanism implemented in Heron is asynchronous, as I understand it, and the code for each node is executed once upon receiving an event at one or more of its inputs. The paper, however, does not include a discussion, or example, about how Heron could be used to address issues that could arise in this type of communication. There is also a lack of information about, for example, how nodes handle inputs when their ability to execute their work function cannot keep up with the frequency of input events. Does the publication/subscription handle the queue intrinsically? Will it create problems in real-time experiments that make multiple nodes run out of sync? The reader could benefit from a discussion about this if they already exist, and if not, the software could benefit from implementing additional mechanisms such that it can meet the requirements from more types of experiments.

      In order to address the above lack of explanation (which the first reviewer also pointed out) we expanded the third experimental example in the paper with three more sections. One focuses solely on explaining how in this example (which acquires and saves large amounts of data from separate Nodes running on different machines) one would be able to time align the different data packets generated in different Nodes to each other. The techniques described there are directly implementable in experiments where the requirements of synching are more stringent than in the behavioural experiment we showcase (like in ephys experiments).
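
      As a generic illustration of post-experiment time alignment (not the specific procedure described in the manuscript), the sketch below matches each packet from one Node to the packet from another Node that is nearest in time. It assumes, hypothetically, that both Nodes logged a timestamp per packet on a common clock and in ascending order; the function name and the tolerance parameter are also assumptions.

```python
from bisect import bisect_left


def align_nearest(reference_times, other_times, tolerance):
    """Match each reference timestamp to the nearest entry of `other_times`.

    `other_times` must be sorted ascending. The result holds, for every
    reference timestamp, the index of its match in `other_times`, or None
    when the closest candidate is further away than `tolerance`.
    """
    matches = []
    for t in reference_times:
        pos = bisect_left(other_times, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(other_times)]
        best = min(candidates, key=lambda i: abs(other_times[i] - t), default=None)
        if best is not None and abs(other_times[best] - t) > tolerance:
            best = None
        matches.append(best)
    return matches


# Timestamps in milliseconds; the third camera frame has no partner within 5 ms.
camera_ms = [0, 33, 66, 100]
ephys_ms = [1, 34, 99]
print(align_nearest(camera_ms, ephys_ms, tolerance=5))
```

      Entries that come back as None mark packets with no counterpart, which is one simple way to spot dropped packets after the experiment.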

      Regarding what happens to packages when the worker function of a Node is too slow to handle its traffic, this is mentioned in the paper (Code architecture paragraph): “Heron is designed to have no message buffering, thus automatically dropping any messages that come into a Node’s inputs while the Node’s worker function is still running.” This is also explained in more detail in Heron’s documentation. The reasoning for a no-buffer system (as described in the documentation) is that, for the use cases Heron is designed to handle, we believe there is no situation where a Node would receive large amounts of data in bursts and very little data during the rest of the time (in which case a buffer would make sense). Nodes in most experiments will either be data intensive but with a constant or near-constant data receiving speed (e.g. input from a camera or ephys system) or will have variable data load reception but always with small data loads (e.g. buttons). The second case is not an issue, and the first case cannot be dealt with by a buffer but only by appropriate code design, since buffering data coming into a Node that is too slow for its input will just postpone the inevitable crash. Heron’s architecture principle in this case is to allow these ‘mistakes’ (i.e. package dropping) to happen so that the pipeline continues to run, and to transfer the responsibility of making Nodes fast enough to the author of each Node. At the same time Heron provides tools (see the Debugging section of the documentation and the time alignment paragraph of the “Rats playing computer games” example in the manuscript) that make it easy to detect package drops and either correct them or allow them, and that also allow time alignment between incoming and outgoing packets. In the very rare case where a buffer is required, Heron’s do-it-yourself logic makes it easy for a Node developer to implement their own Node-specific buffer.
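
      The no-buffering behaviour quoted above can be pictured with a small stand-alone sketch. It is not Heron's code and does not use ZeroMQ: the class and method names are hypothetical, and a non-blocking lock stands in for "the worker function is still running". Any message offered while the worker is busy is counted and discarded rather than queued.

```python
import threading


class DroppingNode:
    """Hypothetical node that drops inputs arriving while its worker is busy."""

    def __init__(self, worker_function):
        self._worker_function = worker_function
        self._busy = threading.Lock()         # held while the worker runs
        self._stats_lock = threading.Lock()   # protects the two counters
        self.processed = 0
        self.dropped = 0

    def offer(self, message):
        """Run the worker on `message`, or drop it if the worker is still running."""
        if not self._busy.acquire(blocking=False):
            with self._stats_lock:
                self.dropped += 1
            return False
        try:
            self._worker_function(message)
            with self._stats_lock:
                self.processed += 1
            return True
        finally:
            self._busy.release()
```

      Comparing `dropped` with `processed` after a run gives a quick indication of whether a worker function is fast enough for its input rate.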

      (4) The authors mentioned in "Heron GUI’s multiple uses" that the GUI can be used as an experimental control panel where the user can update the parameters of the different Nodes on the fly. This is a very useful feature, but it was not demonstrated in the three showcases. A demonstration could greatly help to support this claim.

      As the reviewer mentions, we have found the double role of Heron’s GUI as an on-line experimental controller a very useful capability during our experiments. We have expanded the last experimental example to showcase this, showing how in the “Rats playing computer games” experiment we used the parameters of two Nodes to change the arena’s behaviour while the experiment was running, depending on how the subject was behaving at the time (thus exploring a much larger set of parameter combinations, faster, during the exploratory periods in which we constructed our shaping protocols).

      (5) The API for node scripts can benefit from having a better structure as well as having additional utilities to help users navigate the requirements, and provide more guidance to users in creating new nodes. A more standard practice in the field is to create three abstract Python classes, Source, Sink, and Transform that dictate the requirements for initialisation, work_function, and on_end_of_life, and provide additional utility methods to help users connect between their code and the communication mechanism. They can be properly docstringed, along with templates. In this way, the com and worker scripts can be merged into a single unified API. A simple example that can cause confusion in the worker script is the "worker_object", which is passed into the initialise function. It is unclear what object this variable should be, and what attributes are available without looking into the source code. As the software is also targeting those who are less experienced in programming, setting up more guidance in the API can be really helpful. In addition, the self-documentation aspect of the GUI can also benefit from a better structured API as discussed in point 2 above.

      The reviewer is right that using abstract classes to expose the required API to users would be a more standard practice. The reason we did not choose to do this was to keep Heron easily accessible to entry-level Python programmers who are not yet familiar with object oriented programming ideas. So instead of providing abstract classes we expose only the implementation of three functions which are part of the worker classes, but the classes themselves are not seen by the users of the API. The point about giving users more information about a few objects used in the API (the worker object for example) has been taken on board, and we have now addressed this by type hinting all these objects both in the templates and, more importantly, in the automatically generated code that Heron now creates when a user chooses to create a Node graphically (a feature of Heron not present in the version available in the initial submission of this manuscript).
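
      For readers unfamiliar with the pattern the reviewer refers to, the following minimal sketch shows what an abstract-class API of that kind could look like. It is not Heron's API (Heron deliberately exposes plain functions instead): only the names `Transform`, `initialise`, `work_function` and `on_end_of_life` come from the reviewer's comment, while the method signatures and the `Doubler` example are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class Transform(ABC):
    """Hypothetical base class: receives data on an input, emits data on outputs."""

    @abstractmethod
    def initialise(self) -> bool:
        """Prepare resources; return True once the node is ready to work."""

    @abstractmethod
    def work_function(self, data: Any) -> List[Any]:
        """Process one incoming message and return one item per output."""

    def on_end_of_life(self) -> None:
        """Release resources; optional, so the default does nothing."""


class Doubler(Transform):
    """Toy subclass that doubles every number it receives."""

    def initialise(self) -> bool:
        self.count = 0  # number of messages processed so far
        return True

    def work_function(self, data: Any) -> List[Any]:
        self.count += 1
        return [data * 2]
```

      With such a base class, forgetting to implement a required method is reported by Python when the subclass is instantiated, at the cost of requiring Node authors to be comfortable with classes and inheritance.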

      (6) The authors should provide more pre-defined elements. Even though the ability for users to run arbitrary code is the main feature, the initial adoption of a codebase by a community, in which many members are not so experienced with programming, is the ability for them to use off-the-shelf components as much as possible. I believe the software could benefit from a suite of commonly used Nodes.

      There are currently 12 Node repositories in the Heron-repositories project on GitHub with more than 30 Nodes, 20 of which are general use (not implementing a specific experiment’s logic). This list will continue to grow, but we fully appreciate the truth of the reviewer’s comment that adoption will depend on the existence of a large number of commonly used Nodes (for example NumPy and OpenCV Nodes), and we are working towards this goal.

      (7) It is not clear to me if there is any capability or utilities for testing individual nodes without invoking a full system execution. This would be critical when designing new experiments and testing out each component.

      There is no capability to run the code of an individual Node outside Heron’s GUI. A user could potentially design and test parts of the code before they get added into a Node, but we have found this to be a highly inefficient way of developing new Nodes. In our hands, the best approach to Node development was to quickly generate test inputs and/or outputs using the “User Defined Function 1I 1O” Node, where one can quickly write a function and make it accessible from a Node. Those test outputs can then be pushed into the Node under development, or its outputs can be pushed into the test function, to allow for incremental development without having to connect it to the Nodes it would be connected to in an actual pipeline. For example, one can easily create a small function that, when a user presses a key, generates the same output (if run from a “User Defined Function 1I 1O” Node) as an Arduino Node reading some buttons. This output can then be passed into an experiment logic Node under development that needs to do something with this input. In this way, during Node development Heron allows the generation of simulated hardware inputs and outputs without running the actual hardware. We have also added this way of developing Nodes to our manuscript (Creating a new Node).

      Reviewer #3 (Public Review):

      Summary:

      The authors present a Python tool, Heron, that provides a framework for defining and running experiments in a lab setting (e.g. in behavioural neuroscience). It consists of a graphical editor for defining the pipeline (interconnected nodes with parameters that can pass data between them), an API for defining the nodes of these pipelines, and a framework based on ZeroMQ, responsible for the overall control and data exchange between nodes. Since nodes run independently and only communicate via network messages, an experiment can make use of nodes running on several machines and in separate environments, including on different operating systems.

      Strengths:

      As the authors correctly identify, lab experiments often require a hodgepodge of separate hardware and software tools working together. A single, unified interface for defining these connections and running/supervising the experiment, together with flexibility in defining the individual subtasks (nodes) is therefore a very welcome approach. The GUI editor seems fairly intuitive, and Python as an accessible programming environment is a very sensible choice. By basing the communication on the widely used ZeroMQ framework, they have a solid base for the required non-trivial coordination and communication. Potential users reading the paper will have a good idea of how to use the software and whether it would be helpful for their own work. The presented experiments convincingly demonstrate the usefulness of the tool for realistic scientific applications.

      Weaknesses:

      (1) In my opinion, the authors somewhat oversell the reproducibility and "self-documentation" aspect of their solution. While it is certainly true that the graph representation gives a useful high-level overview of an experiment, it can also suffer from the same shortcomings as a "pure code" description of a model - if a user gives their nodes and parameters generic/unhelpful names, reading the graph will not help much.

      To our understanding, this is a problem that no software solution can possibly address. Yet we argue that having a visual representation of how different inputs and outputs connect to each other is a substantial benefit over “pure code”, especially when the developer of the experiment has used poorly chosen variable names.

      (2) Making the link between the nodes and the actual code is also not straightforward, since the code for the nodes is spread out over several directories (or potentially even machines), and not directly accessible from within the GUI. 

      This is not accurate. The obligatory code of a Node always exists within a single folder and Heron’s API makes it rather cumbersome to spread scripts relating to a Node across separate folders. The Node folder structure can potentially be copied over different machines but this is why Heron is tightly integrated with git practices (and even politely asks the user with popup windows to create git repositories of any Nodes they create whilst using Heron’s automatic Node generator system). Heron’s documentation is also very clear on the folder structure of a Node which keeps the required code always in the same place across machines and more importantly across experiments and labs. Regarding the direct accessibility of the code from the GUI, we took on board the reviewers’ comments and have taken the first step towards correcting this. Now one can attach to Heron their favourite IDE and then they can double click on any Node to open its two main scripts (com and worker) in that IDE embedded in whatever code project they choose (also set in Heron’s settings windows). On top of this, Heron now allows the addition of notes both for a Node and for all its parameters, inputs and outputs which can be viewed by hovering the mouse over them on the Nodes’ GUIs. The final step towards GUI-code integration will be to have a Heron GUI code editor but this is something that has to wait for further development from Heron’s underlying GUI library DearPyGUI.

      (3) The authors state that "[Heron’s approach] confers obvious benefits to the exchange and reproducibility of experiments", but the paper does not discuss how one would actually exchange an experiment and its parameters, given that the graph (and its json representation) contains user-specific absolute filenames, machine IP addresses, etc, and the parameter values that were used are stored in general data frames, potentially separate from the results. Neither does it address how a user could keep track of which versions of files were used (including Heron itself).

      Heron’s Graphs, like any experimental implementation, must contain machine specific strings. These are accessible either from Heron’s GUI when a Graph json file is opened or from the json file itself. Heron in this regard does not do anything different to any other software, other than saving the graphs into human readable json files that users can easily manipulate directly.

      Heron provides a method for users to save every change of the Node parameters that might happen during an experiment so that it can be fully reproduced. The dataframes are generated in the folders specified by the user in each of the Nodes (and all those paths are saved in the json file of the Graph). We understand that Heron offers the user a certain degree of freedom (Heron’s main reason to exist is exactly this versatility) to generate data files wherever they want, but it makes sure every file path gets recorded for subsequent reproduction. So, Heron behaves pretty much exactly like any other open source software. What we wanted to focus on as the benefit of Heron for exchange and reproducibility was the ability of experimenters to take a Graph from another lab (with its machine specific file paths and IP addresses) and, by examining its graphical interface, to quickly tweak it to make it run on their own systems. That is achievable because a Heron experiment will be constructed from a small number of Nodes (5 to 15 usually) whose file paths can be trivially changed in the GUI or directly in the json file, while the LAN setup of the machines used can be easily reconstructed from the information saved in the secondary GUIs.

      Where Heron needs to improve (and this is a major point in Heron’s roadmap) is in better integrating the different saved experiments with the git versions of Heron and of the Nodes that were used for that specific save. This, we appreciate, is very important for full reproducibility of the experiment, and it is a feature we will soon implement. More specifically, users will save together with a graph the versions of all the used repositories, and during load the code base utilised will come from the recorded versions and not from the current head of the different repositories. We are currently working on this feature and, as our roadmap suggests, it will be implemented by the release of Heron 1.0.
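
      The general idea of saving repository versions together with a graph can be sketched as follows. This is only an illustration of the concept, not the planned Heron feature: the `code_versions` key, the helper names and the shape of the graph dictionary are hypothetical, and the only external tool assumed is the standard `git rev-parse HEAD` command.

```python
import subprocess


def current_commit(repo_dir):
    """Return the commit hash currently checked out in repo_dir (requires git)."""
    result = subprocess.run(
        ["git", "-C", str(repo_dir), "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def attach_versions(graph, commits):
    """Return a copy of the graph dict with the commit of every repository used.

    'code_versions' is a hypothetical key, not part of Heron's json format.
    """
    stamped = dict(graph)
    stamped["code_versions"] = dict(commits)
    return stamped


def stale_repositories(saved_graph, current_commits):
    """List repositories whose current commit differs from the saved one."""
    saved = saved_graph.get("code_versions", {})
    return sorted(
        name for name, commit in saved.items()
        if current_commits.get(name) != commit
    )
```

      On load, a non-empty result from `stale_repositories` would indicate that the code on disk is not the code the experiment was saved with, which is the situation the recorded versions are meant to catch.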

      (4) Another limitation that in my opinion is not sufficiently addressed is the communication between the nodes, and the effect of passing all communications via the host machine and SSH. What does this mean for the resulting throughput and latency - in particular in comparison to software such as Bonsai or Autopilot? The paper also states that "Heron is designed to have no message buffering, thus automatically dropping any messages that come into a Node’s inputs while the Node’s worker function is still running."- it seems to be up to the user to debug and handle this manually?

      There are a few points raised here that require addressing. The first is Heron’s requirement to pass all communication through the main (GUI) machine. We understand (and also state in the manuscript) that this is a limitation that needs to be addressed. We plan to do this by adding to Heron the feature of running headless (see our roadmap). This will allow us to run whole Heron pipelines on a second machine, which will communicate with the main pipeline (run on the GUI machine) through special Nodes. That will allow experimenters to define whole pipelines on secondary machines, where the data exchanged between their Nodes stay on the machine running the pipeline. This is an important feature for Heron and it will be one of the first features to be implemented next (after the integration of the saving system with git).

      The second point concerns Heron’s throughput and latency. In our original manuscript we did not have any description of Heron’s capabilities in this respect, and both other reviewers mentioned this as a limitation. As mentioned above, we have now addressed this by adding a section to our third experimental example that fully describes how much CPU is required to run a full experimental pipeline running on two machines and also utilising non-Python executables (a Unity game). This gives an overview of how heavy pipelines can run on normal computers given adequate optimisation and utilising Heron’s feature of forcing some Nodes to run their Worker processes on a specific core. At the same time, Heron’s use of the 0MQ protocol makes sure there are no other delays or speed limitations to message passing. So, message passing within the same machine is just an exchange of memory pointers, while messages passing between different machines face the standard speed limitations of the Local Area Network’s ethernet card speeds.

      Finally, regarding the message dropping feature of Heron, as mentioned above this is an architectural decision given the use cases of message passing we expect Heron to come in contact with. For a full explanation of the logic here please see our answer to the 3rd comment by Reviewer 2.

      (5) As a final comment, I have to admit that I was a bit confused by the use of the term "Knowledge Graph" in the title and elsewhere. In my opinion, the Heron software describes "pipelines" or "data workflows", not knowledge graphs - I’d understand a knowledge graph to be about entities and their relationships. As the authors state, it is usually meant to make it possible to "test propositions against the knowledge and also create novel propositions" - how would this apply here?

      We have described Heron as a Knowledge Graph instead of a pipeline, data workflow or computation graph in order to emphasise Heron’s distinct operation in contrast to what one would consider a standard pipeline and data workflow generated by other visual based software (like LabView and Bonsai). This difference lies in what a user should think of as the base element of a graph, i.e. the Node. In all other visual programming paradigms, the Node is defined as a low-level computation, usually a language keyword, language flow control or some simple function. The logic in this case is generated by composing together the visual elements (Nodes). In Heron the Node is to be thought of as a process which can be of arbitrary complexity, and the logic of the graph is composed by the user both within each Node and by the way the Nodes are combined together. This is an important distinction in Heron’s basic operation logic and it is, we argue, the main way Heron allows flexibility in what can be achieved while retaining ease of graph composition (by users defining their own level of complexity and functionality encompassed within each Node). We have found that calling this approach a computation graph (which it is) or a pipeline or data workflow would not accentuate this difference. The term Knowledge Graph was the most appropriate as it captures the essence of variable information complexity (even in terms of length of shortest string required) defined by a Node.

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):

      -  No buffering implies dropped messages when a node is busy. It seems like this could be very problematic for some use cases... 

      This is a design principle of Heron. We have now provided a detailed explanation of the reasoning behind it in our answer to Reviewer 2 (Paragraph 3) as well as in the manuscript. 

      -  How are ssh passwords stored, and is it secure in some way or just in plain text?  

      For now they are plain text in an unencrypted file that is not part of the repo (if one gets Heron from the repo). Eventually, we would like to go to private/public key pairs but this is not a priority due to the local nature of Heron’s use cases (all machines in an experiment are expected to connect in a LAN).  

      Minor notes / copyedits:

      -  Figure 2A: right and left seem to be reversed in the caption. 

      They were. This is now fixed. 

      -  Figure 2B: the text says that proof of life messages are sent to each worker process but in the figure, it looks like they are published by the workers? Also true in the online documentation.  

      The Figure caption was wrong. This is now fixed.

      -  psutil package is not included in the requirements for GitHub

      We have now included psutil in the requirements.

      -  GitHub readme says Python >=3.7 but Heron will not run as written without python >= 3.9 (which is alluded to in the paper)

      The new Heron updates require Python 3.11. We have now updated GitHub and the documentation to reflect this.

      -  The paper mentions that the Heron editor must be run on Windows, but this is not mentioned in the Github readme.  

      This was an error in the manuscript that we have now corrected.

      -  It’s unclear from the readme/manual how to remove a node from the editor once it’s been added.  

      We have now added an X button on each Node to complement the Del button on the keyboard (for MacOS users, who most of the time do not have this button).

      -  The first example experiment is called the Probabilistic Reversal Learning experiment in text, but the uncertainty experiment in the supplemental and on GitHub.  

      We have now used the correct name (Probabilistic Reversal Learning) in both the supplemental material and on GitHub.

      -  Since Python >=3.9 is required, consider using fstrings instead of str.format for clarity in the codebase  

      Thank you for the suggestion. The latest Heron development has been using f-strings and we will do a refactoring in the near future.

      -  Grasshopper cameras can run on linux as well through the spinnaker SDK, not just Windows.  

      Fixed in the manuscript. 

      -  Figure 4: Square and star indicators are unclear.

      Increased the size of the indicators to make them clear.

      -  End of page 9: "an of the self" presumably a typo for "off the shelf"?  

      Corrected.

      -  Page 10 first paragraph. "second root" should be "second route"

      Corrected.

      -  When running Heron, the terminal constantly spams Blowfish encryption deprecation warnings, making it difficult to see the useful messages.  

      The solution to this problem is to either update paramiko or install Heron through pip. This possible issue is mentioned in the documentation.

      -  Node input /output hitboxes in the GUI are pretty small. If they could be bigger it would make it easier to connect nodes reliably without mis-clicks.

      We have redone the Node GUI, also increasing the size of the In/Out points.

      Reviewer #2 (Recommendations For The Authors):

      (1) There are quite a few typos in the manuscript, for example: "one can accessess the code", "an of the self", etc.  

      Thanks for the comment. We have now screened the manuscript for possible typos.

      (2) Heron’s GUI can only run on Windows! This seems to be the opposite of the key argument about the portability of the experimental setup.  

      As explained in the answers to Reviewer 1, Heron can run on most machines on which the underlying Python libraries run, i.e. Windows and Linux (both for x86 and Arm architectures). We have tested it on Windows (10 and 11, both x64), Linux PC (Ubuntu 20.04.6, x64) and Raspberry Pi 4 (Debian GNU/Linux 12 (bookworm), aarch64). We have now revised the manuscript and the GitHub repo to reflect this.

      (3) Currently, the output is displayed along the left edge of the node, but the yellow dot connector is on the right. It would make more sense to have the text displayed next to the connectors.  

      We have redesigned the Node GUI and have now placed the Out connectors on the right side of the Node.

      (4) The edges are often occluded by the nodes in the GUI. Sometimes it leads to some confusion, particularly when the number of nodes is large, e.g., Fig 4.

      This is something that is dependent on the capabilities of the DearPyGUI module. At the moment there is no way to control the way the edges are drawn.

      Reviewer #3 (Recommendations For The Authors):

      A few comments on the software and the documentation itself:

      - From a software engineering point of view, the implementation seems to be rather immature. While I get the general appeal of "no installation necessary", I do not think that installing dependencies by hand and cloning a GitHub repository is easier than installing a standard package.

      We have now added a pip install capability which also creates a Heron command line command to start Heron with. 

      -  The generous use of global variables to store state (minor point, given that all nodes run in different processes), boilerplate code that each node needs to repeat, and the absence of any kind of automatic testing do not give the impression of a very mature software (case in point: I had to delete a line from editor.py to be able to start it on a non-Windows system).

      As mentioned, the use of global variables in the worker scripts is fine, partly due to the multi-process nature of the development, and we have found it to be a friendly approach for Matlab users who are just starting with Python (a serious consideration for Heron). Also, the parts of the code that would require a singleton (the Editor for example) are treated as scripts with global variables, while the parts that require the construction of objects are fully embedded in classes (the Node for example). A future refactoring might also make all the parts of the code not seen by the user fully object oriented, but this is a decision with pros and cons that need to be weighed first.

      Absence of testing is an important issue we recognise, but Heron is a GUI app and nontrivial unit tests would require some keystroke/mouse movement emulator (like QTest or pytest-qt for Qt based GUIs). This will be dealt with in the near future (using more general solutions like PyAutoGUI), but it is something that needs a serious amount of effort (quite a bit more than writing unit tests for non-GUI based software) and, more importantly, it is nowhere near as robust as standard unit tests (due to the variable nature of the GUI through development), making automatic test authoring almost as laborious a process as the one it is supposed to automate.

      -  From looking at the examples, I did not quite see why it is necessary to write the ..._com.py scripts as Python files, since they only seem to consist of boilerplate code and variable definitions. Wouldn’t it be more convenient to represent this information in configuration files (e.g. yaml or toml)?  

      The com is not a configuration file, it is a script that launches the communication process of the Node. We could remove the variable definitions to a separate toml file (which then the com script would have to read). The pros and cons of such a set up should be considered in a future refactoring.

      Minor comments for the paper:

      -  p.7 (top left): "through its return statement" - the worker loop is an infinite loop that forwards data with a return statement?  

      This is now corrected. The worker loop is an infinite loop and does not return anything, but at each iteration it pushes data to the Node’s output.

      -  p.9 (bottom right): "of the self" → "off-the-shelf"  

      Corrected.

      -  p.10 (bottom left): "second root" → "second route"  

      Corrected.

      -  Supplementary Figure 3: Green star and square seem to be swapped (the green star on top is a camera image and the green star on the bottom is value visualization - inversely for the green square).

      The star and square have been swapped around.

      -  Caption Supplementary Figure 4 (end): "rashes to receive" → "rushes to receive"  

      Corrected.

    1. eLife Assessment

      This important study advances our understanding of the role of dopamine in modulating pair bonding in mandarin voles by examining dopamine signaling within the nucleus accumbens across various social stimuli using state-of-the-art causal perturbations. The evidence supporting the findings is compelling, particularly cutting-edge approaches for measuring dopamine release as well as the activity of dopamine receptor populations during social bonding. Some concerns remain about the statistical analyses.

    2. Reviewer #2 (Public review):

      Summary:

      Using in vivo fiber-photometry, the authors first establish that DA release when contacting their partner mouse increases with days of cohabitation while this increase is not observed when contacting a stranger mouse. Similar effects are found in D1-MSNs and D2-MSNs with the D1-MSN responses increasing and D2-MSN responses decreasing with days of cohabitation. They then use slice physiology to identify underlying plasticity/adaptation mechanisms that could contribute to the changes in D1/D2-MSN responses. Last, to address causality the authors use chemogenetic tools to selectively inhibit or activate NAc shell D1 or D2 neurons that project to the ventral pallidum. They found that D2 inhibition facilitates bond formation while D2 excitation inhibits bond formation. In contrast, both D1-MSN activation and inhibition inhibit bond formation.

      Strengths:

      The strength of the manuscript lies in combining in vivo physiology to demonstrate circuit engagement and chemogenetic manipulation studies to address circuit involvement in pair bond formation in a monogamous vole.

      Weaknesses:

      Weaknesses include that a large set of experiments within the manuscript are dependent on using short promoters for D1 and D2 receptors in viral vectors. As the authors acknowledge this approach can lead to ectopic expression and the presented immunohistochemistry supports this notion. It seems to me that the presented quantification underestimates the degree of ectopic expression that is observed by eye when looking at the presented immunohistochemistry. However, given that Cre transgenic animals are not available for Microtus mandarinus and given the distinct physiological and behavioral outcomes when imaging and manipulating both viral-targeted populations this concern is minor.

      The slice physiology experiments provide some interesting outcomes but it is unclear how they can be linked to the in vivo physiological outcomes and some of the outcomes don't match intuitively (e.g. cohabitation enhances excitatory/inhibitory balance in D2-MSNs but the degree of contact-induced inhibition is enhanced in D2-MSN).

      One interesting finding is that the relationship between D2-MSN and pair bond formation is quite clear (inhibition facilitates while excitation inhibits pair bond formation). In contrast, the role of D1-MSNs is more complicated since both excitation and inhibition disrupt pair bond formation. This is not convincingly discussed.

      It seemed a missed opportunity that physiological read out is limited to males. I understand though that adding females may be beyond the scope of this manuscript.

      Comments on revised version:

      The authors addressed most of my comments, some would still need to be addressed.

      (1) Previous comment: "The authors do not use an isosbestic control wavelength in photometry experiments, although they do use EGFP control mice which show no effects of these interventions, a within-subject control such as an isosbestic excitation wavelength could give more confidence in these data and rule out motion artefacts within subjects."

      The authors should include a paragraph in the discussion addressing the limitations of not using an internal control for the fiberphotometric measurements.

      (2) Previous Comment: The slice physiology experiments provide some interesting outcomes but it is unclear how they can be linked to the in vivo physiological outcomes and some of the outcomes don't match intuitively (e.g. cohabitation enhances excitatory/inhibitory balance in D2-MSNs but the degree of contact-induced inhibition is enhanced in D2-MSN).

      My comment may not have been clear and the response didn't address my comment. What is missing in the discussion is an explanation of why a relative increase in excitation of D2-MSNs in the slice (Fig. 4J) is associated with an increased inhibition in vivo (Fig. 2H)?

      (3) Previous Comment: One interesting finding is that the relationship between D2-MSN and pair bond formation is quite clear (inhibition facilitates while excitation inhibits pair bond formation). In contrast, the role of D1-MSNs is more complicated since both excitation and inhibition disrupt pair bond formation. This is not convincingly discussed.

      Similarly, here the response provided does not address my question. Please focus on discussing why both excitation and inhibition of D1-MSNs can disrupt pair bond formation (Figure 7).

    3. Reviewer #3 (Public review):

      Summary:

      The manuscript is evaluating changes in dopamine signaling in the nucleus accumbens following pair bonding and exposure to various stimuli in mandarin voles. In addition, the authors present chemogenetic data which demonstrates excitation and inhibition of D1 and D2 MSN affect pair bond formation.

      Strengths:

      The experimental designs are strong. The approaches are innovative and use cutting-edge methods. The manuscript is well written.

      Comments on revised version:

      I appreciate the efforts by the authors to address many of my previous comments. The issues that remain are those associated with the statistics. It seems that not all statistical analyses were performed with the correct test. For example, the photometry data comparing emissions during partner vs stranger investigation over time would be best analyzed with a two-way ANOVA with odor type and time being separate variables. Also, there are paired t-tests being performed by calculating an average deltaF/F during the 4 second window following the beginning of a behavioral event. I think an area-under-the-curve calculation of these events would better capture the fluorescent emissions of these events as an index. Details in the Results describing which data are being analyzed via ANOVA vs t-tests when reporting the results would be useful for the reviewer to understand each analysis.
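
      The two indices contrasted in this comment, the mean deltaF/F and the area under the curve over the window following an event, can be computed as in the minimal sketch below. It is not the authors' analysis code: the function names, the trapezoidal rule and the convention that the window includes both of its end points are assumptions made for illustration; only the 4 second window comes from the review.

```python
def _window(times, dff, start, duration):
    """Keep the (time, deltaF/F) samples inside [start, start + duration]."""
    points = [(t, y) for t, y in zip(times, dff) if start <= t <= start + duration]
    if not points:
        raise ValueError("no samples fall inside the requested window")
    return points


def window_mean(times, dff, start, duration=4.0):
    """Average deltaF/F over the window following an event at `start`."""
    points = _window(times, dff, start, duration)
    return sum(y for _, y in points) / len(points)


def window_auc(times, dff, start, duration=4.0):
    """Area under the deltaF/F trace over the same window (trapezoidal rule)."""
    points = _window(times, dff, start, duration)
    return sum(
        (t1 - t0) * (y0 + y1) / 2.0
        for (t0, y0), (t1, y1) in zip(points, points[1:])
    )
```

      Either index yields one value per event and animal, which is the form needed for the paired comparisons or the ANOVA discussed above.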

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      These experiments are some of the first to assess the role of dopamine release and the activity of D1 and D2 MSNs in pair bond formation in Mandarin voles. This is a novel and comprehensive study that presents exciting data about how the dopamine system is involved in pair bonding. The authors provide very detailed methods and clearly presented results. Here they show dopamine release in the NAc shell is enhanced when male voles encounter their pair bonded partner 7 days after cohabitation. In addition, D2 MSN activity decreases whereas D1 MSN activity increases when sniffing the pair-bonded partner.

      The authors do not provide justification for why they only use males in the current study, without discussing sex as a biological variable these data can only inform readers about one sex (which in pair-bonded animals by definition have 2 sexes). In addition, the authors do not use an isosbestic control wavelength in photometry experiments, although they do use EGFP control mice which show no effects of these interventions, a within-subject control such as an isosbestic excitation wavelength could give more confidence in these data and rule out motion artefacts within subjects.

      We agree with your suggestion that the mechanism underlying pair bonding in females should also be investigated. In general, natal philopatry among mammals is female biased in the wild (Greenwood, 1983; Brody and Armitage, 1985; Ims, 1990; Solomon and Jacquot, 2002); social mammals are rarely characterized by exclusively male natal philopatry (Solomon and Jacquot, 2002). Males often disperse from the natal area to a new place. Thus, male rodents may play a dominant role in the formation and maintenance of mating relationships. This is one reason we investigated pair bonding in males first. Certainly, female mate selection, and sexual receptivity or refusal in response to olfactory cues from males, also affect the formation and maintenance of pair bonding (Hoglen and Manoli, 2022). This is why future research should focus on the mechanisms underlying pair bond formation in females. This has been added to the limitations in the discussion.

      In photometry experiments, rAAV-D1/D2-GCaMP6m, a D1/D2 genetically encoded fluorescent calcium sensor, was injected into the NAc shell. The changes in fluorescence signals during these social interactions were collected and digitized. To assess whether the fluorescence response was specific to social stimuli, changes in fluorescence signals during non-social behavioral bouts (such as freezing, exploration of the environment, grooming, and rearing) were also recorded and analyzed. The results showed that dopamine release and D1/D2 MSN activity displayed no significant changes after 3 or 7 days of cohabitation during non-social behaviors such as freezing, exploring, grooming and rearing. In addition, GCaMP6m is a genetically encoded calcium indicator. Changes in its fluorescence signal reflect changes in intracellular calcium ion concentration. Using an EGFP virus as a control, it can be determined whether the fluorescence signal observed in the experiment is generated by the specific response of GCaMP6m to calcium or whether other non-specific factors lead to fluorescence changes. If there is no similar fluorescence change in the EGFP control group, this more strongly supports that the signal detected by GCaMP6m is a calcium-related specific signal. Other research articles have also used an EGFP control group in photometry experiments (Yamaguchi et al., 2020; Qu et al., 2024; Zhan et al., 2024). Therefore, the changes in fluorescence signals observed in the present study reflect neuronal activity during specific social behaviors and were not affected by motion artefacts.

      There is an existing literature (cited in this manuscript) from Aragona et al., (particularly Aragona et al., 2006) which has highlighted key differences in the roles of rostral versus caudal NAc shell dopamine in pair bond formation and maintenance. Specifically, they report that dopamine transmission promoting pair bonding only occurs in the rostral shell and not the caudal shell or core regions. Given that the authors have targeted more caudally a discussion of how these results fit with previous work and why there may be differences in these areas is warranted.

      Thanks for your professional consideration. In the report from Aragona et al. (2006), the brain coordinates of the bilateral 26-gauge guide cannulae in the NAc were 1.6 mm rostral, ± 1 mm bilateral, 4.5 mm ventral (for shell) and 3.5 mm ventral (for core) from bregma. In the present study, the brain coordinates of virus injection were AP: +1.5, ML: ±0.99, DV: −4.2 (for NAc shell). Thus, the virus injection sites were close to the rostral shell in our study. However, because of the diffuse expression of the virus, some neurons in the rostrocaudal border and caudal shell were also infected, so we did not distinguish different subregions of the NAc shell. In the future, we will use AAV13, a viral strategy that could target and manipulate precise local neural populations, to address this issue. The NAc is a complex brain structure with distinct regions that have different functions. A previous study suggested that GABAergic substrates of positive and negative types of motivated behavior in the nucleus accumbens shell are segregated along a rostrocaudal gradient (Reynolds and Berridge, 2001). However, another study found that food intake is significantly enhanced by administering μ-selective opioid agonists into the NAc, especially its shell region (Znamensky et al., 2001). Also, μ-opioid stimulation increases the motivation to eat (“wanting”) both in the NAc shell and throughout the entire NAc, as well as in several limbic or striatal structures beyond. For DAMGO stimulation of eating, the “wanting” substrates extend anatomically beyond the rostrodorsal shell and throughout the entire shell (the caudal shell). Furthermore, DAMGO stimulates eating in the NAc shell and core, as well as the neostriatum, amygdala… (Gosnell et al., 1986; Gosnell and Majchrzak, 1989; Peciña and Berridge, 2000; Zhang and Kelley, 2000; Echo et al., 2002; Peciña and Berridge, 2005, 2013; Castro and Berridge, 2014).
In pair bond formation and maintenance, the rostral shell is the specific subregion of the NAc important for DA regulation of partner preference (Aragona et al., 2006). In conclusion, it appears that the changes in real-time dopamine release and in the activity and electrophysiological properties of D1R and D2R MSNs in the NAc shell after pair bond formation may have occurred primarily in the rostral shell in our study, which is consistent with the report from Aragona et al.

      The authors could discuss the differences between pair bond formation and pair bond maintenance more deeply.

      Thanks for your suggestion. I have discussed the differences between pair bond formation and pair bond maintenance more deeply.

      Dopamine and the different types of dopamine receptors in the NAc may play different roles in the regulation of pair bond formation and maintenance. The chemogenetic manipulation revealed that VP-projecting D2 MSNs are necessary and more important in pair bond formation compared to VP-projecting D1 MSNs. This is consistent with previous pharmacological experiments showing that blocking D2R with its specific antagonist, while D1R was not blocked, can prevent the formation of a pair bond in prairie voles (Gingrich et al., 2000). This indicates that D2R is crucial for the initial formation of the pair bond. D2R is involved in the reward aspects related to mating. In female prairie voles, D2R in the NAc is important for partner preference formation. The activation of D2R may help to condition the brain to assign a positive valence to the partner's cues during mating, facilitating the development of a preference for a particular mate. In addition, cohabitation caused DA release, and the high-affinity Gi-coupled D2R was activated first, which inhibited D2 MSN activity and promoted pair bond formation. Then, after 7 days of cohabitation, when the pair bond was already established, the significantly increased release of dopamine activated the Gs-coupled D1R, which has a low affinity for dopamine; this increased D1 MSN activity and maintained the partner preference. While D1R is also present and involved in the overall process, its role in the initial formation of the pair bond is not as dominant as that of D2R (Aragona et al., 2006). However, it still participates in the neurobiological processes related to pair bond formation. For example, in male mandarin voles, after 7 days of cohabitation with females, D1R activity in the NAc shell was affected during pair bond formation. The extracellular DA concentration was higher when sniffing their partner compared to a stranger, and this increase in DA release led to an increase in D1R activity in the NAc shell. 
In prairie voles, dopamine D1 receptors seem to be essential for pair bond maintenance. Neonatal treatment with D1 agonists can impair partner preference formation later in life, suggesting an organizational role for D1 in maintaining the bond (Aragona et al., 2006). In pair-bonded male prairie voles, D1R is involved in inducing aggressive behavior toward strangers, which helps to maintain the pair bond by protecting it from potential rivals. In the NAc shell, D1 agonist decreases the latency to attack same-sex conspecifics, while D1 antagonism increases it (Aragona et al., 2006). In summary, D2R is more crucial for pair bond formation, being involved in reward association and necessary for the initial development of the pair bond. D1R, on the other hand, is more important for pair bond maintenance, being involved in aggression and mate guarding behaviors and having an organizational role in maintaining the pair bond over time. We therefore suggest that D2 MSNs are more predominantly involved in the formation of a pair bond compared with D1 MSNs.

      The authors have successfully characterised the involvement of dopamine release, changes in D1 and D2 MSNs, and projections to the VP in pair bonding voles. Their conclusions are supported by their data and they make a number of very reasonable discussion points acknowledging various limitations

      Reviewer #2 (Public review):

      Summary:

      Using in vivo fiber-photometry the authors first establish that DA release when contacting their partner mouse increases with days of cohabitation while this increase is not observed when contacting a stranger mouse. Similar effects are found in D1-MSNs and D2-MSNs with the D1-MSN responses increasing and D2-MSN responses decreasing with days of cohabitation. They then use slice physiology to identify underlying plasticity/adaptation mechanisms that could contribute to the changes in D1/D2-MSN responses. Last, to address causality the authors use chemogenetic tools to selectively inhibit or activate NAc shell D1 or D2 neurons that project to the ventral pallidum. They found that D2 inhibition facilitates bond formation while D2 excitation inhibits bond formation. In contrast, both D1-MSN activation and inhibition inhibit bond formation.

      Strengths:

      The strength of the manuscript lies in combining in vivo physiology to demonstrate circuit engagement and chemogenetic manipulation studies to address circuit involvement in pair bond formation in a monogamous vole.

      Weaknesses:

      Comment: Weaknesses include that a large set of experiments within the manuscript are dependent on using short promoters for D1 and D2 receptors in viral vectors. As the authors acknowledge this approach can lead to ectopic expression and the presented immunohistochemistry supports this notion. It seems to me that the presented quantification underestimates the degree of ectopic expression that is observed by eye when looking at the presented immunohistochemistry. However, given that Cre transgenic animals are not available for Microtus mandarinus and given the distinct physiological and behavioral outcomes when imaging and manipulating both viral-targeted populations this concern is minor.

      Thanks for your professional comment. The viruses used in the present study were purchased from brainVTA company. D1/D2 receptor promoter genes were predicted and amplified for validation by the company. The promoter gene was constructed and packaged in an AAV vector (taking the rAAV-D2-mCherry-WPRE-bGH_polyA virus as an example, Author response image 1A). The D1/D2 promoter sequences are shown in Author response image 1B-C. In addition, the D1 receptor gene promoter and D2 receptor gene promoter viruses used in this paper have been used in several published papers with high specificity (Zhao et al., 2019; Ying et al., 2022). In our paper, FISH verification showed a high proportion of co-localization between the virus and the mRNA, which also indicates high specificity of the virus (Figure S15, S16).

      Author response image 1.

      (A) Gene carrier of rAAV-D2-mCherry-WPRE-bGH_polyA. (B-C) Gene sequence of D1 promoter and D2 promoter.

      The slice physiology experiments provide some interesting outcomes but it is unclear how they can be linked to the in vivo physiological outcomes and some of the outcomes don't match intuitively (e.g. cohabitation enhances excitatory/inhibitory balance in D2-MSNs but the degree of contact-induced inhibition is enhanced in D2-MSN).

      Thanks for your comment. The present study found that the frequencies of sEPSC and sIPSC were significantly enhanced after the formation of a pair bond in NAc shell D2 MSNs. The excitatory/inhibitory balance of D2 MSNs was enhanced after cohabitation. These results are not consistent with the findings from fiber photometry of calcium signals. One study showed that NAc D2 MSNs were linked to both ‘liking’ (food consumption) and ‘wanting’ (food approach) but with opposing actions; high D2 MSN activity signaled ‘wanting’, and low D2 MSN activity enhanced ‘liking’. D2 MSNs are faced with a tradeoff between increasing ‘wanting’ by being more active or allowing ‘liking’ by remaining silent (Guillaumin et al., 2023). Therefore, the increase in frequencies of sEPSC and sIPSC in D2 MSNs may reflect two processes, liking and wanting, respectively. We think that hedonia and motivation might influence D2 MSN activity differently during cohabitation and contribute to the process of pair bond formation in a more dynamic and complex way than previously expected.

      Moreover, the frequencies of sEPSC and sIPSC were significantly reduced in the NAc shell D1 MSNs after pair bonding, whereas the intrinsic excitability increased after cohabitation with females.

      The bidirectional modifications (reduced synaptic inputs vs. increased excitability) observed in D1 MSNs might result from homeostatic regulation. The overall synaptic transmission may produce no net changes, given that reductions in both excitatory and inhibitory synaptic transmission of D1 MSNs were observed. Also, increases in the intrinsic excitability of D1 MSNs would result in an overall excitation gain on D1 MSNs.

      One interesting finding is that the relationship between D2-MSN and pair bond formation is quite clear (inhibition facilitates while excitation inhibits pair bond formation). In contrast, the role of D1-MSNs is more complicated since both excitation and inhibition disrupt pair bond formation. This is not convincingly discussed.

      Considering the reviewer’s suggestion, the discussion has been added in the revised manuscript.

      In the present study, DREADDs approaches were used to inhibit or excite NAc MSNs to VP projection and it was found that D1 and D2 NAc MSNs projecting to VP play different roles in the formation of a pair bond. Chemogenetic inhibition of VP-projecting D2 MSNs promoted partner preference formation, while activation of VP-projecting D2 MSNs inhibited it (Figure 6). Chemogenetic activation of D2 MSNs produced the opposite effect of DA on the D2 MSNs on partner preference, while inhibition of these neurons produced the same effects of DA on D2 MSNs. DA binding with D2R is coupled with Gi and produces an inhibitory effect (Lobo and Nestler, 2011). It is generally assumed that activation of D2R produces aversive and negative reinforcement. These results were consistent with the reduced D2 MSNs activity upon sniffing their partner in the fiber photometry test and the increased frequency and amplitude of sIPSC in the present study. Our results also agree with other previous studies that chemogenetic inhibition of NAc D2 MSNs is sufficient to enhance reward-oriented motivation in a motivational task (Carvalho Poyraz et al., 2016; Gallo et al., 2018). Inhibition of D2 MSNs during self-administration enhanced response and motivation to obtain cocaine (Bock et al., 2013). This also suggests that the mechanism underlying attachment to a partner and drug addiction is similar.

      Besides, in the present study, the formation of partner preference was inhibited after activation or inhibition of VP-projecting D1 MSNs, which is not consistent with conventional understanding of prairie vole behavior. Alternatively, DA binding with D1R is coupled with Gs and produces an excitatory effect (Lobo and Nestler, 2011), while activation of D1R produces reward and positive reinforcement (Hikida et al., 2010; Tai et al., 2012; Kwak and Jung, 2019). For example, activation of D1 MSNs enhances the cocaine-induced conditioned place preference (Lobo et al., 2010). In addition, D1R activation by DA promotes D1 MSNs activation, which promotes reinforcement. However, a recent study found that NAc-ventral mesencephalon D1 MSNs promote reward and positive reinforcement learning; in contrast, NAc-VP D1 MSNs led to aversion and negative reinforcement learning (Liu et al., 2022). It is consistent with our results that activation of NAc-VP D1 MSNs pathway reduced time spent side-by-side and impaired partner preference after 7 days of cohabitation. In contrast to inhibition of D2 MSNs, we found that inhibition of the D1 MSNs did not elicit corresponding increases in partner preference. One possible explanation is that almost all D1 MSNs projecting to the VTA/ substantia nigra (SN) send collaterals to the VP (Pardo-Garcia et al., 2019). For example, optogenetically stimulating VP axons may inadvertently cause effects in the VTA/SN through the antidromic activation of axon collaterals (Yizhar et al., 2011). Therefore, chemogenetic inhibition of D1 MSNs may also inhibit DA neurons in VTA, subsequently inhibiting the formation of a pair bond.

      Dopamine and the different types of dopamine receptors in the NAc may play different roles in the regulation of pair bond formation and maintenance. The chemogenetic manipulation revealed that VP-projecting D2 MSNs are necessary and more important in pair bond formation compared to VP-projecting D1 MSNs. This is consistent with previous pharmacological experiments showing that blocking D2R with its specific antagonist, while D1R was not blocked, can prevent the formation of a pair bond in prairie voles (Gingrich et al., 2000). This indicates that D2R is crucial for the initial formation of the pair bond. D2R is involved in the reward aspects related to mating. In female prairie voles, D2R in the NAc is important for partner preference formation. The activation of D2R may help to condition the brain to assign a positive valence to the partner's cues during mating, facilitating the development of a preference for a particular mate. In addition, cohabitation caused DA release, and the high-affinity Gi-coupled D2R was activated first, which inhibited D2 MSN activity and promoted pair bond formation. Then, after 7 days of cohabitation, when the pair bond was already established, the significantly increased release of dopamine activated the Gs-coupled D1R, which has a low affinity for dopamine; this increased D1 MSN activity and maintained the partner preference. While D1R is also present and involved in the overall process, its role in the initial formation of the pair bond is not as dominant as that of D2R (Aragona et al., 2006). However, it still participates in the neurobiological processes related to pair bond formation. For example, in male mandarin voles, after 7 days of cohabitation with females, D1R activity in the NAc shell was affected during pair bond formation. The extracellular DA concentration was higher when sniffing their partner compared to a stranger, and this increase in DA release led to an increase in D1R activity in the NAc shell. 
In prairie voles, dopamine D1 receptors seem to be essential for pair bond maintenance. Neonatal treatment with D1 agonists can impair partner preference formation later in life, suggesting an organizational role for D1 in maintaining the bond (Aragona et al., 2006). In pair-bonded male prairie voles, D1R is involved in inducing aggressive behavior toward strangers, which helps to maintain the pair bond by protecting it from potential rivals. In the NAc shell, D1 agonist decreases the latency to attack same-sex conspecifics, while D1 antagonism increases it (Aragona et al., 2006). In summary, D2R is more crucial for pair bond formation, being involved in reward association and necessary for the initial development of the bond. D1R, on the other hand, is more important for pair bond maintenance, being involved in aggression and mate guarding behaviors and having an organizational role in maintaining the bond over time. We therefore suggest that D2 MSNs are more predominantly involved in the formation of a pair bond compared with D1 MSNs.

      It seemed a missed opportunity that physiological readout is limited to males. I understand though that adding females may be beyond the scope of this manuscript.

      We greatly appreciate your valuable comment. Reviewer 1 also raised this issue. Our response is as follows.

      In general, natal philopatry among mammals is female biased in the wild (Greenwood, 1983; Brody and Armitage, 1985; Ims, 1990; Solomon and Jacquot, 2002); social mammals are rarely characterized by exclusively male natal philopatry (Solomon and Jacquot, 2002). Males often disperse from the natal area to a new place. Thus, male rodents may play a dominant role in the formation and maintenance of mating relationships. This is one reason we investigated pair bonding in males first. Certainly, female mate selection, and sexual receptivity or refusal in response to olfactory cues from males, also affect the formation and maintenance of pair bonding (Hoglen and Manoli, 2022). This is why future research should focus on the mechanisms underlying pair bond formation in females. This has been added to the limitations in the discussion.

      Reviewer #3 (Public review):

      Summary:

      The manuscript is evaluating changes in dopamine signaling in the nucleus accumbens following pair bonding and exposure to various stimuli in mandarin voles. In addition, the authors present chemogenetic data that demonstrate excitation and inhibition of D1 and D2 MSN affect pair bond formation.

      Strengths:

      The experimental designs are strong. The approaches are innovative and use cutting-edge methods.

      The manuscript is well written.

      Weaknesses:

      The statistical results are not presented, and not all statistical analyses are appropriate.

      Additionally, some details of methods are absent.

      As you suggested, we added the detailed information in the revised manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Remove references to 'extreme significance' - p is set as a threshold and the test is either significant or not.

      Thanks for your suggestion. We have removed 'extreme significance' in the revised manuscript.

      (2) The second half of the abstract is a little confusing the use of activation/inhibition makes it difficult to read and follow, this could be re-worded for clarity.

      Sorry for the confusion. We have reorganized the sentence as follows.

      In addition, chemogenetic inhibition of ventral pallidum-projecting D2 MSNs in the NAc shell enhanced pair bond formation, while chemogenetic activation of VP-projecting D2 MSNs in the NAc shell inhibited pair bond formation.

      Reviewer #2 (Recommendations for the authors):

      (1) In many instances repeated measures are presented from the same mice (e.g. Figures 1F, I; S1BC). Repeated measures for each mouse should be connected with a line in the figures. This will allow the reader to visually compare the repeated measures for each animal.

      Thanks for your careful consideration. As the reviewer suggested, the figures have been changed.

      (2) It is unclear to me how the time point 0 for sniffing was determined. How is the time point 0 for side-by-side contact determined?

      Sniffing is an olfactory investigation behavior, defined as the animal using its nose to inspect any portion of the stimulus mouse’s body, including the tail. Time point 0 for sniffing was the onset of the sniffing behavior. Side-by-side behavior is defined as significant physical contact with a social object while huddling in a quiescent state. Time point 0 for side-by-side behavior was the onset of the side-by-side behavior.
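
      As an illustration of this definition of time point 0, the sketch below shows one way bout onsets could be extracted from frame-by-frame behavioral annotations and used to align a fluorescence trace. It is a hypothetical example, not the pipeline used in the study; the function names, the boolean annotation format, and the pre/post window lengths are assumptions.

```python
def bout_onsets(times, in_bout):
    """Times at which a behavioral bout begins (annotation flips from False to True)."""
    onsets = []
    previous = False
    for t, flag in zip(times, in_bout):
        if flag and not previous:
            onsets.append(t)
        previous = flag
    return onsets


def align_to_onset(times, signal, onset, pre=2.0, post=4.0):
    """(time relative to onset, value) pairs within [onset - pre, onset + post]."""
    return [
        (t - onset, v)
        for t, v in zip(times, signal)
        if onset - pre <= t <= onset + post
    ]


# Hypothetical annotation: one sample per second, two sniffing bouts.
times = [0, 1, 2, 3, 4, 5, 6]
sniffing = [False, False, True, True, False, True, True]
signal = [10, 11, 12, 13, 14, 15, 16]

onsets = bout_onsets(times, sniffing)
print(onsets)
print(align_to_onset(times, signal, onsets[0], pre=1, post=2))
```

      Each aligned segment has its time point 0 at the first annotated frame of the bout, so averaging segments across bouts or animals compares responses at the same moment relative to behavior onset.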

      (3) Figure 1-3: For the fiber photometry data 7 events (sniffs) are shown in the heat maps. Are these the first 7 sniffs? What went into the quantification? It seems that DA and D1/D2 responses are habituating. This could be analyzed and would need to be discussed.

      In the heat maps (Figure 1-3), we show the mean fluorescence signal changes of every subject (n = 7 voles) upon sniffing the partner, a stranger or an object in the experiment, not the fluorescence signal changes of individual sniffing events in one vole. The quantification of changes in the mean fluorescence signal of all subjects is shown in Figure 1F, 1I, Figure 2F, 2I, Figure 3F and 3I.

      (4) Generally, it is very difficult to obtain cell type selectivity using short promoters in viruses (the authors acknowledge this). Which D1 and D2 promoter sequences were used for obtaining specificity? The degree of ectopic expression looks much higher than the quantification (e.g. in Fig. 3b, 6C, 7C, S14A, C). Is this due to thresholding?

      The viruses used in the present study were purchased from brainVTA company. D1/D2 receptor promoter genes were predicted and amplified for validation. The promoter gene was constructed and packaged in an AAV vector (taking the rAAV-D2-mCherry-WPRE-bGH_polyA virus as an example, Author response image 1A). The D1/D2 promoter sequences are shown in Author response image 1B-C. In addition, the D1 receptor gene promoter and D2 receptor gene promoter viruses used in this paper have been used in several published papers with high specificity (Zhao et al., 2019; Ying et al., 2022). In Figure 6C, the first image is the merged fluorescence image taken under different fluorescence channels with the 20X objective. The second and third images were taken under the 40X objective from the field marked by the white box in the first image. The second and third images were merged into the fourth one. Due to the different exposure time and intensity, the fluorescence images taken at 40X are clearer than the image taken at 20X. For example, the labeled cells in Figure 6C are presented in Author response image 2. In our paper, FISH verification showed a high proportion of co-localization between virus infection and mRNA, indicating high specificity of the virus (Figure S15, S16). Certainly, the number of positive neurons may depend on visibility (thresholding). Only visible cells were counted. The cell counting results in Author response image 2B and 2C are similar to the quantification in Figure 6C.

      Author response image 2.

      (A) Immunohistological image showing co-localization of hM3Dq- mCherry-anti expression (green), D2R-mRNA (red), and DAPI (blue) in the NAc shell. Scale bar: 100 μm. (B) The cell counts and the determination of colocalization of the 20× immunohistochemistry images. The marked neurons were counted with white dots. (C) The cell counts and the determination of colocalization of the 40× immunohistochemistry images. The marked neurons were counted with white dots.

      (5) Figure 6D/7D: the time scale seems to be off for both traces (40 seconds). For the hM3D Gq experiment, only one trace is shown. It would be more convincing to provide an input-output curve from several mice and to statistically compare the curves.

      Response: Thanks for your careful consideration. As the reviewer suggested, a figure showing resting membrane potentials before and after CNO exposure from several voles was added to the revised manuscript.

      (6) The presence of GIRK channels in MSNs has been a long debate and hM4D Gi activation may mostly act at the level of terminals by inhibiting neurotransmitter release. For demonstrating hyperpolarization of the soma showing the resting membrane potential before and after drug CNO exposure would be more convincing.

      Thanks for your careful consideration. As the reviewer suggested, a figure showing the resting membrane potential before and after CNO exposure was added to the revised manuscript.

      (7) It is unclear to me how far the slice physiology informs the in vivo physiology (e.g. cohabitation enhances excitatory/inhibitory balance in D2-MSNs but the degree of contact-induced inhibition is enhanced in D2-MSN; D2-MSNs become less responsive to DA in the slice yet but at the time of enhanced DA release D2-MSN activity is also strongly reduced).

      The present study found that the frequencies of sEPSC and sIPSC were significantly enhanced after the formation of a pair bond in NAc shell D2 MSNs. The excitatory/inhibitory balance of D2 MSNs was enhanced after cohabitation. These results are not consistent with the findings from fiber photometry of calcium signals. One study showed that NAc D2 MSNs were linked to both ‘liking’ (food consumption) and ‘wanting’ (food approach) but with opposing actions; high D2 MSN activity signaled ‘wanting’, and low D2 MSN activity enhanced ‘liking’. D2 MSNs are faced with a tradeoff between increasing ‘wanting’ by being more active or allowing ‘liking’ by remaining silent (Guillaumin et al., 2023). Therefore, the increase in frequencies of sEPSC and sIPSC in D2 MSNs may reflect two processes, liking and wanting, respectively. We think that hedonia and motivation might influence D2 MSN activity differently during cohabitation and contribute to the process of pair bond formation in a more dynamic and complex way than previously expected.

      Moreover, the frequencies of sEPSC and sIPSC were significantly reduced in the NAc shell D1 MSNs after pair bonding, whereas the intrinsic excitability increased after cohabitation with females.

      The bidirectional modifications (reduced synaptic inputs vs. increased excitability) observed in D1 MSNs might result from homeostatic regulation. The overall synaptic transmission may produce no net changes, given that reductions in both excitatory and inhibitory synaptic transmission of D1 MSNs were observed. Also, increases in the intrinsic excitability of D1 MSNs would result in an overall excitation gain on D1 MSNs.

      (8) One interesting finding is that the relationship between D2-MSN and pair bond formation is quite clear (inhibition facilitates while excitation inhibits pair bond formation). In contrast, the role of D1-MSNs is more complicated since both excitation and inhibition disrupt pair bond formation.

      The discussion of this would benefit from another attempt.

      As the reviewer suggested, this discussion was added to the revised manuscript.

      In the present study, DREADD approaches were used to inhibit or excite NAc MSN projections to the VP, and it was found that D1 and D2 NAc MSNs projecting to the VP play different roles in the formation of a pair bond. Chemogenetic inhibition of VP-projecting D2 MSNs promoted partner preference formation, while activation of VP-projecting D2 MSNs inhibited it (Figure 6). Chemogenetic activation of D2 MSNs produced an effect on partner preference opposite to that of DA acting on D2 MSNs, while inhibition of these neurons produced the same effect as DA acting on D2 MSNs. D2R is coupled to Gi, so DA binding to D2R produces an inhibitory effect (Lobo and Nestler, 2011). It is generally assumed that activation of D2R produces aversion and negative reinforcement. These results were consistent with the reduced D2 MSN activity upon sniffing their partner in the fiber photometry test and the increased frequency and amplitude of sIPSC in the present study. Our results also agree with previous studies showing that chemogenetic inhibition of NAc D2 MSNs is sufficient to enhance reward-oriented motivation in a motivational task (Carvalho Poyraz et al., 2016; Gallo et al., 2018). Inhibition of D2 MSNs during self-administration enhanced responding and motivation to obtain cocaine (Bock et al., 2013). This also suggests that the mechanisms underlying attachment to a partner and drug addiction are similar.

      Moreover, in the present study, the formation of partner preference was inhibited after either activation or inhibition of VP-projecting D1 MSNs, which is not consistent with the conventional understanding of prairie vole behavior. D1R is coupled to Gs, so DA binding to D1R produces an excitatory effect (Lobo and Nestler, 2011), and activation of D1R produces reward and positive reinforcement (Hikida et al., 2010; Tai et al., 2012; Kwak and Jung, 2019). For example, activation of D1 MSNs enhances cocaine-induced conditioned place preference (Lobo et al., 2010). In addition, D1R activation by DA promotes D1 MSN activation, which promotes reinforcement. However, a recent study found that NAc-ventral mesencephalon D1 MSNs promote reward and positive reinforcement learning; in contrast, NAc-VP D1 MSNs led to aversion and negative reinforcement learning (Liu et al., 2022). This is consistent with our result that activation of the NAc-VP D1 MSN pathway reduced time spent side-by-side and impaired partner preference after 7 days of cohabitation. In contrast to inhibition of D2 MSNs, we found that inhibition of D1 MSNs did not elicit a corresponding increase in partner preference. One possible explanation is that almost all D1 MSNs projecting to the VTA/substantia nigra (SN) send collaterals to the VP (Pardo-Garcia et al., 2019). For example, optogenetically stimulating VP axons may inadvertently cause effects in the VTA/SN through the antidromic activation of axon collaterals (Yizhar et al., 2011). Therefore, chemogenetic inhibition of D1 MSNs may also inhibit DA neurons in the VTA, subsequently inhibiting the formation of a pair bond.

      Dopamine and the different types of dopamine receptors in the NAc may play different roles in the regulation of pair bond formation and maintenance. The chemogenetic manipulation revealed that VP-projecting D2 MSNs are necessary and more important in pair bond formation compared to VP-projecting D1 MSNs. This is consistent with previous pharmacological experiments showing that blocking D2R with its specific antagonist, while D1R is left unblocked, can prevent the formation of a pair bond in prairie voles (Gingrich et al., 2000). This indicates that D2R is crucial for the initial formation of the pair bond. D2R is involved in the reward aspects related to mating. In female prairie voles, D2R in the NAc is important for partner preference formation. The activation of D2R may help to condition the brain to assign a positive valence to the partner's cues during mating, facilitating the development of a preference for a particular mate. In addition, cohabitation caused DA release; the high-affinity Gi-coupled D2R was activated first, which inhibited D2 MSN activity and promoted pair bond formation. Then, after 7 days of cohabitation, when the pair bond was already established, the significantly increased release of dopamine activated the Gs-coupled D1R, which has a low affinity for dopamine; this increased D1 MSN activity and maintained the partner preference. While D1R is also present and involved in the overall process, its role in the initial formation of the pair bond is not as dominant as that of D2R (Aragona et al., 2006). However, it still participates in the neurobiological processes related to pair bond formation. For example, in male mandarin voles, after 7 days of cohabitation with females, D1R activity in the NAc shell was affected during pair bond formation. The extracellular DA concentration was higher when sniffing their partner compared to a stranger, and this increase in DA release led to an increase in D1R activity in the NAc shell.

      In prairie voles, dopamine D1 receptors seem to be essential for pair bond maintenance. Neonatal treatment with D1 agonists can impair partner preference formation later in life, suggesting an organizational role for D1 in maintaining the bond (Aragona et al., 2006). In pair-bonded male prairie voles, D1R is involved in inducing aggressive behavior toward strangers, which helps to maintain the pair bond by protecting it from potential rivals. In the NAc shell, D1 agonist decreases the latency to attack same-sex conspecifics, while D1 antagonism increases it (Aragona et al., 2006). In summary, D2R is more crucial for pair bond formation, being involved in reward association and necessary for the initial development of the bond. D1R, on the other hand, is more important for pair bond maintenance, being involved in aggression and mate guarding behaviors and having an organizational role in maintaining the bond over time. We therefore suggest that D2 MSNs are more predominantly involved in the formation of a pair bond compared with D1 MSNs.

      (9) For the chemogenetic inhibition/excitation experiment please specify the temporal relationship between CNO injection and the behavioral testing. Are the DREADDs activated during the preference testing or are we only looking at the consequences of DREADD activation during cohabitation? This would impact the interpretation of the results.

      Considering the reviewer’s suggestion, we have clarified the timing of CNO injection and behavioral testing. In chemogenetic experiments, male voles were injected with CNO (1 mg/kg, i.p. injection) or saline once per day during the 7-day cohabitation period. On day 3 and day 7 of cohabitation, the partner preference tests (3 h) were conducted 3 h after injection. Jendryka et al. (2019) found that, in mice, free CNO levels in CSF and cortex tissue dropped sharply within 60 min of i.p. CNO injection, after which CNO could no longer be detected. However, associated biological effects are reported to endure 6 - 24 h after CNO treatment (Farzi et al., 2018; Desloovere et al., 2019; Paretkar and Dimitrov, 2019). For example, Anacker et al. (2018) showed that chemogenetic inhibition of adult-born neurons in the vDG using DREADDs for 10 days promotes susceptibility to social defeat stress, whereas increasing neurogenesis confers resilience to chronic stress. Moreover, Zhang et al. (2022) revealed that selective activation or inhibition of the IC-BLA projection pathway strengthens or weakens the intensity of observational pain when CNO (1 mg/kg) was i.p. injected into the infected mice on days 1, 3, 5, and 7 after virus expression. Furthermore, in the study of Nawreen et al. (2020), chronic inhibition of IL PV INs reduced passive and increased active coping behavior in the FST. Therefore, we believe that 7-day CNO injections can produce chronic effects on MSNs and alter the formation of partner preferences.

      (10) Discussion: "The observed increase in DA release resulted in suppression of D2 neurons in the NAc shell". "In contrast, the rise in DA release increases D1 activity selectively in response to their partner after extended cohabitation." These statements would need to be weakened as causality is not shown here.

      Thanks for your rigorous consideration. We have reorganized the discussion in the revised manuscript.

      “The observed increase in DA release resulted in alterations in activities of D2 and D1 neurons in the NAc shell selectively in response to their partner after extended cohabitation.”

      (11) It would help if the order of the supplementary figures matched their order of appearance in the Results section.

      Thanks for your suggestion. We reorganized the order of appearance in the revised manuscript.

      (12) This may be beyond the focus of the study but it would be very interesting to know whether the physiological responses to partner contact are similarly observed in females.

      Thanks for your concern. Regrettably, we did not examine the physiological responses of females to partner contact. We predict that females may show similar response patterns to their partner. In the future, we will extend this research to the mechanisms of partner preference in female voles.

      Reviewer #3 (Recommendations for the authors):

      The manuscript is evaluating changes in dopamine signaling in the nucleus accumbens following pair bonding and exposure to various stimuli in mandarin voles. The manuscript is generally well-written. The experiment designs seem strong, although there are missing details to fully evaluate them. The statistics are not completed correctly, and the statistical values are not reported making them even harder to evaluate. There are a lot of potential strengths in this research. However, my review is limited because I am limited in how to evaluate data interpretation when statistical analyses are not clear. I provide details below.

      Major

      (1) Statistics should be provided in the Results section. It is not clear how to evaluate the authors' interpretations without presenting the statistical data. What stats are being reported about viral expression in cells on lines 192-194? What posthocs? There is only one condition, so I assume the statistic was a one-sample t-test. The authors should report the t-value, df, and p-value. No post-hoc is needed. There are many issues like this, which makes reviewing this manuscript very difficult. If the statistics were not conducted properly and reported clearly, I do not have confidence that I can evaluate the authors' interpretation of the results.

      Thanks for your suggestion. We report the t-value, df, and p-value in the Results section.

      (2) Statistical tests should be labeled correctly. ANOVAs (found in figure caption) for Figure 1 data are not repeated measures. Rather, they are one-way ANOVA (with stimulus as a within-subject variable).

      We used one-way ANOVA to analyze the changes in fluorescence signals in Figures 1-3. In the experiment, the changes in fluorescence signals of every subject were collected upon sniffing the partner, an unknown female, and an object. Because each subject was measured under all three stimuli, we used one-way repeated measures ANOVA to analyze the data.
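
      As an illustration only (this is not the authors' analysis code), a one-way repeated measures ANOVA with stimulus as the within-subject factor can be sketched as below. The function name `rm_anova_oneway` and the numeric values are hypothetical and chosen purely for demonstration; the sketch returns the F statistic and its degrees of freedom and does not compute a p-value or a sphericity correction.

```python
def rm_anova_oneway(data):
    """One-way repeated measures ANOVA.

    data: list of subjects, each a list with one measurement per condition
          (e.g. partner, stranger, object). Every subject must have the
          same number of conditions.
    Returns (F, df_condition, df_error).
    """
    n = len(data)       # number of subjects
    k = len(data[0])    # number of within-subject conditions
    grand_mean = sum(sum(row) for row in data) / (n * k)

    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total variability into condition, subject, and residual.
    ss_cond = n * sum((m - grand_mean) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand_mean) ** 2 for m in subj_means)
    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Hypothetical values: 4 subjects x 3 stimuli (partner, stranger, object).
example = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 5.0],
    [3.0, 5.0, 6.0],
    [4.0, 5.0, 8.0],
]
f_value, df_cond, df_error = rm_anova_oneway(example)
print(f"F({df_cond}, {df_error}) = {f_value:.2f}")
```

      Removing the between-subject sum of squares from the error term is what distinguishes this from an ordinary one-way ANOVA on independent groups; the F statistic and both degrees of freedom are the quantities a Results section would typically report alongside the p-value.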

      (3) The protocol for behavioral assessment and stimulus presentation during fiber photometry recording is not clear. For example, the authors mention on line 662 that voles ate carrots during some of the recording sessions, but nothing else is described about the recording session. What was the order of stimulus presentation? What was the object provided? Why is eating carrots analyzed separately from object, partner, and stranger exposure?

      Response: Sorry for the confusion. A detailed description has been added. After 3 and 7 days of cohabitation, males were exposed to their partner or an unfamiliar female (each exposure lasted for 30 min) in random order in a clean social interaction cage. The changes in fluorescence signals during these social interactions with their partner, an unfamiliar vole of the opposite sex, or an object (Rubik's Cube) were collected and digitized by CamFiberPhotometry software (ThinkerTech). To rule out that the difference in fluorescence signals was caused by a difference in virus expression at different time points, we used the same experimental strategy in new male mandarin voles and measured the fluorescence signal changes upon eating carrot after 3 and 7 days of cohabitation (the male mandarin voles were fasted for four hours before the test). Since sniffing (object, partner, and stranger) and eating carrot were not tested in the same males, we analyzed sniffing and eating carrot separately.

      (4) Supplement figures would be better as figures instead of tables. Many effects are hard to interpret.

      As you suggested, we added the information from Supplementary Table 1 to the Results.

      (5) Citations should be included to note when pair bonding occurs in mandarin voles.

      As you suggested, we added the citation in the revised manuscript.

      Minor

      (1) Add a citation for the statement that married people live longer than unmarried people (Lines 51-52).

      As you suggested, we added the citation in the revised manuscript.

      (2) There is a table labeling viral vectors, but the table is not titled properly or referenced in the methods section.

      Thanks for your careful checking. We revised the table title, and the table is now cited in the revised manuscript.

      (3) Sentences on lines 608-610 and 610-612 seem redundant.

      This sentence was corrected.

      (4) This is a rather subjective statement "Carrots are voles' favorite food."

      We reorganized the sentence in the revised manuscript.

      "Carrots are voles' daily food."

      Anacker C, Luna VM, Stevens GS, Millette A, Shores R, Jimenez JC, Chen B, Hen R (2018) Hippocampal neurogenesis confers stress resilience by inhibiting the ventral dentate gyrus. Nature 559:98-102.

      Aragona BJ, Liu Y, Yu YJ, Curtis JT, Detwiler JM, Insel TR, Wang Z (2006) Nucleus accumbens dopamine differentially mediates the formation and maintenance of monogamous pair bonds. Nature neuroscience 9:133-139.

      Bock R, Shin JH, Kaplan AR, Dobi A, Markey E, Kramer PF, Gremel CM, Christensen CH, Adrover MF, Alvarez VA (2013) Strengthening the accumbal indirect pathway promotes resilience to compulsive cocaine use. Nature neuroscience 16:632-638.

      Brody AK, Armitage KB (1985) The effects of adult removal on dispersal of yearling yellow-bellied marmots. Canadian Journal of Zoology 63:2560-2564.

      Carvalho Poyraz F, Holzner E, Bailey MR, Meszaros J, Kenney L, Kheirbek MA, Balsam PD, Kellendonk C (2016) Decreasing Striatopallidal Pathway Function Enhances Motivation by Energizing the Initiation of Goal-Directed Action. The Journal of neuroscience : the official journal of the Society for Neuroscience 36:5988-6001.

      Castro DC, Berridge KC (2014) Opioid hedonic hotspot in nucleus accumbens shell: mu, delta, and kappa maps for enhancement of sweetness "liking" and "wanting". The Journal of neuroscience : the official journal of the Society for Neuroscience 34:4239-4250.

      Desloovere J, Boon P, Larsen LE, Merckx C, Goossens MG, Van den Haute C, Baekelandt V, De Bundel D, Carrette E, Delbeke J, Meurs A, Vonck K, Wadman W, Raedt R (2019) Long-term chemogenetic suppression of spontaneous seizures in a mouse model for temporal lobe epilepsy. Epilepsia 60:2314-2324.

      Echo JA, Lamonte N, Ackerman TF, Bodnar RJ (2002) Alterations in food intake elicited by GABA and opioid agonists and antagonists administered into the ventral tegmental area region of rats. Physiology & behavior 76:107-116.

      Farzi A, Lau J, Ip CK, Qi Y, Shi YC, Zhang L, Tasan R, Sperk G, Herzog H (2018) Arcuate nucleus and lateral hypothalamic CART neurons in the mouse brain exert opposing effects on energy expenditure. eLife 7.

      Gallo EF, Meszaros J, Sherman JD, Chohan MO, Teboul E, Choi CS, Moore H, Javitch JA, Kellendonk C (2018) Accumbens dopamine D2 receptors increase motivation by decreasing inhibitory transmission to the ventral pallidum. Nature communications 9:1086.

      Gingrich B, Liu Y, Cascio C, Wang Z, Insel TR (2000) Dopamine D2 receptors in the nucleus accumbens are important for social attachment in female prairie voles (Microtus ochrogaster). Behavioral neuroscience 114:173-183.

      Gosnell BA, Majchrzak MJ (1989) Centrally administered opioid peptides stimulate saccharin intake in nondeprived rats. Pharmacology, biochemistry, and behavior 33:805-810.

      Gosnell BA, Levine AS, Morley JE (1986) The stimulation of food intake by selective agonists of mu, kappa and delta opioid receptors. Life sciences 38:1081-1088.

      Greenwood PJ (1983) Mating systems and the evolutionary consequences of dispersal. The ecology of animal movement:116-131.

      Guillaumin MCC, Viskaitis P, Bracey E, Burdakov D, Peleg-Raibstein D (2023) Disentangling the role of NAc D1 and D2 cells in hedonic eating. Molecular psychiatry 28:3531-3547.

      Hikida T, Kimura K, Wada N, Funabiki K, Nakanishi S (2010) Distinct roles of synaptic transmission in direct and indirect striatal pathways to reward and aversive behavior. Neuron 66:896-907.

      Hoglen NEG, Manoli DS (2022) Cupid's quiver: Integrating sensory cues in rodent mating systems. Frontiers in neural circuits 16:944895.

      Ims RA (1990) Determinants of natal dispersal and space use in grey-sided voles, Clethrionomys rufocanus : a combined field and laboratory experiment. Oikos 57:106-113.

      Jendryka M, Palchaudhuri M, Ursu D, van der Veen B, Liss B, Kätzel D, Nissen W, Pekcec A (2019) Pharmacokinetic and pharmacodynamic actions of clozapine-N-oxide, clozapine, and compound 21 in DREADD-based chemogenetics in mice. Scientific reports 9:4522.

      Kwak S, Jung MW (2019) Distinct roles of striatal direct and indirect pathways in value-based decision making. eLife 8.

      Liu Z, Le Q, Lv Y, Chen X, Cui J, Zhou Y, Cheng D, Ma C, Su X, Xiao L, Yang R, Zhang J, Ma L, Liu X (2022) A distinct D1-MSN subpopulation down-regulates dopamine to promote negative emotional state. Cell Res 32:139-156.

      Lobo MK, Nestler EJ (2011) The striatal balancing act in drug addiction: distinct roles of direct and indirect pathway medium spiny neurons. Front Neuroanat 5:41.

      Lobo MK, Covington HE, 3rd, Chaudhury D, Friedman AK, Sun H, Damez-Werno D, Dietz DM, Zaman S, Koo JW, Kennedy PJ, Mouzon E, Mogri M, Neve RL, Deisseroth K, Han MH, Nestler EJ (2010) Cell type-specific loss of BDNF signaling mimics optogenetic control of cocaine reward. Science (New York, NY) 330:385-390.

      Nawreen N, Cotella EM, Morano R, Mahbod P, Dalal KS, Fitzgerald M, Martelle S, Packard BA, Franco-Villanueva A, Moloney RD, Herman JP (2020) Chemogenetic Inhibition of Infralimbic Prefrontal Cortex GABAergic Parvalbumin Interneurons Attenuates the Impact of Chronic Stress in Male Mice. eNeuro 7.

      Pardo-Garcia TR, Garcia-Keller C, Penaloza T, Richie CT, Pickel J, Hope BT, Harvey BK, Kalivas PW, Heinsbroek JA (2019) Ventral Pallidum Is the Primary Target for Accumbens D1 Projections Driving Cocaine Seeking. The Journal of neuroscience : the official journal of the Society for Neuroscience 39:2041-2051.

      Paretkar T, Dimitrov E (2019) Activation of enkephalinergic (Enk) interneurons in the central amygdala (CeA) buffers the behavioral effects of persistent pain. Neurobiology of disease 124:364-372.

      Peciña S, Berridge KC (2000) Opioid site in nucleus accumbens shell mediates eating and hedonic 'liking' for food: map based on microinjection Fos plumes. Brain research 863:71-86.

      Peciña S, Berridge KC (2005) Hedonic hot spot in nucleus accumbens shell: where do mu-opioids cause increased hedonic impact of sweetness? The Journal of neuroscience : the official journal of the Society for Neuroscience 25:11777-11786.

      Peciña S, Berridge KC (2013) Dopamine or opioid stimulation of nucleus accumbens similarly amplify cue-triggered 'wanting' for reward: entire core and medial shell mapped as substrates for PIT enhancement. The European journal of neuroscience 37:1529-1540.

      Qu Y, Zhang L, Hou W, Liu L, Liu J, Li L, Guo X, Li Y, Huang C, He Z, Tai F (2024) Distinct medial amygdala oxytocin receptor neurons projections respectively control consolation or aggression in male mandarin voles. Nature communications 15:8139.

      Reynolds SM, Berridge KC (2001) Fear and feeding in the nucleus accumbens shell: rostrocaudal segregation of GABA-elicited defensive behavior versus eating behavior. The Journal of neuroscience : the official journal of the Society for Neuroscience 21:3261-3270.

      Solomon NG, Jacquot JJ (2002) Characteristics of resident and wandering prairie voles, Microtus ochrogaster. Canadian Journal of Zoology 80:951-955.

      Tai LH, Lee AM, Benavidez N, Bonci A, Wilbrecht L (2012) Transient stimulation of distinct subpopulations of striatal neurons mimics changes in action value. Nature neuroscience 15:1281-1289.

      Yamaguchi T, Wei D, Song SC, Lim B, Tritsch NX, Lin D (2020) Posterior amygdala regulates sexual and aggressive behaviors in male mice. Nature neuroscience 23:1111-1124.

      Ying L, Zhao J, Ye Y, Liu Y, Xiao B, Xue T, Zhu H, Wu Y, He J, Qin S, Jiang Y, Guo F, Zhang L, Liu N, Zhang L (2022) Regulation of Cdc42 signaling by the dopamine D2 receptor in a mouse model of Parkinson's disease. Aging cell 21:e13588.

      Yizhar O, Fenno LE, Davidson TJ, Mogri M, Deisseroth K (2011) Optogenetics in neural systems. Neuron 71:9-34.

      Zhan S, Qi Z, Cai F, Gao Z, Xie J, Hu J (2024) Oxytocin neurons mediate stress-induced social memory impairment. Current biology : CB 34:36-45.e34.

      Zhang M, Kelley AE (2000) Enhanced intake of high-fat food following striatal mu-opioid stimulation: microinjection mapping and fos expression. Neuroscience 99:267-277.

      Zhang MM et al. (2022) Glutamatergic synapses from the insular cortex to the basolateral amygdala encode observational pain. Neuron 110:1993-2008.e1996.

      Zhao J, Ying L, Liu Y, Liu N, Tu G, Zhu M, Wu Y, Xiao B, Ye L, Li J, Guo F, Zhang L, Wang H, Zhang L (2019) Different roles of Rac1 in the acquisition and extinction of methamphetamine-associated contextual memory in the nucleus accumbens. Theranostics 9:7051-7071.

      Znamensky V, Echo JA, Lamonte N, Christian G, Ragnauth A, Bodnar RJ (2001) gamma-Aminobutyric acid receptor subtype antagonists differentially alter opioid-induced feeding in the shell region of the nucleus accumbens in rats. Brain research 906:84-91.

    1. eLife Assessment

      The authors aim to elucidate the mechanism by which pyroptosis (through the formation of Gasdermin D (GSDMD) pores in the plasma membrane) contributes to increased release of procoagulant Tissue Factor-containing microvesicles. The data offer solid mechanistic insights into the interplay between pyroptosis and microvesicle release with NINJ1. The study provides useful insights into the potential of targeting Ninj1 as a therapeutic strategy.

    2. Reviewer #1 (Public review):

      The authors demonstrated that NINJ1 promotes TF-positive MV release during pyroptosis and thereby triggers coagulation. Coagulation is one of the risk factors that can cause secondary complications in various inflammatory diseases, making it a highly important therapeutic target in clinical treatment. This paper effectively explains the connection between pyroptosis and MV release with Ninj1, which is a significant strength. It provides valuable insight into the potential of targeting Ninj1 as a therapeutic strategy.

      Although the advances in this paper are valuable, several aspects need to be clarified. Some comments are discussed below.

      (1) Since it is not Ninj1 directly regulating coagulation but rather the MV released by Ninj1 playing a role, the title should include that. The current title makes it seem like Ninj1 directly regulates inflammation and coagulation. It would be better to revise the title.

      (2) Ninj1 is known to be an induced protein that is barely expressed in normal conditions. As you showed in "Fig1G" data, control samples showed no detection of Ninj1. However, in "Figure S1", all tissues (liver, lung, kidney and spleen) expressed Ninj1 protein. If the authors stimulated the mice with fla injection, it should be mentioned in the figure legend.

      (3) In "Fig3A", the Ninj1 protein expression was increased in the control of BMDM +/- cell lysate rather than fla stimulation. However, in MV, Ninj1 was not detected at all in +/- control but was only observed with Fla injection. The authors need to provide an explanation for this observation. Additionally, looking at the MV β-actin lane, the band thicknesses appear to be very different between groups. It seems necessary to equalize the protein amounts. If that is difficult, at least between the +/+ and +/- controls.

      (4) Since the authors focused on Ninj1-dependent microvesicle (MV) release, they need to show MV characterizations (EM, NTA, Western for MV markers, etc...).

      (5) To clarify whether Ninj1-dependent MV induces coagulation, the authors need to determine whether platelet aggregation is reduced with isolated +/- MVs compared to +/+ MVs.

      (6) Even with the authors' well-established experiments with haploid mice, it is a critical limitation of this paper. To improve the quality of this paper, the authors should consider confirming the findings using mouse macrophage cell lines, such as generating Ninj1-/- Raw264.7 cell lines, to examine the homozygous effect.

      (7) There was a paper reported in 2023 (Zhou, X. et al., NINJ1 Regulates Platelet Activation and PANoptosis in Septic Disseminated Intravascular Coagulation. Int. J. Mol. Sci. 2023) that revealed the relationship between Ninj1 and coagulation. According to this paper, inhibition of Ninj1 in platelets prevents pyroptosis, leading to reduced platelet activation and, consequently, the suppression of thrombosis. How about the activation of platelets in Ninj1 +/- mice? The author should add this paper in the reference section and discuss the platelet functions in their mice.

    3. Reviewer #2 (Public review):

      Summary:

      The authors' main goal is to understand the mechanism by which pyroptosis (through the formation of Gasdermin D (GSDMD) pores in the plasma membrane) contributes to increased release of procoagulant Tissue Factor-containing microvesicles (MV). Their previous data demonstrate that GSDMD is critical for the release of MV that contains Tissue Factor (TF), thus making a link between pyroptosis and hypercoagulation. Given the recent identification of NINJ1 as being responsible for plasma membrane rupture (Kayagaki et al. Nature 2021), the authors wanted to determine if NINJ1 is responsible for TF-containing MV release. Given that the constitutive ninj1 KO leads to partial embryonic lethality, the authors decided to use a heterozygous ninj1 KO mouse (ninj1+/-), and demonstrate that Ninj1 plays a role in release of TF-containing MV.

    4. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors demonstrated that NINJ1 promotes TF-positive MV release during pyroptosis and thereby triggers coagulation. Coagulation is one of the risk factors that can cause secondary complications in various inflammatory diseases, making it a highly important therapeutic target in clinical treatment. This paper effectively explains the connection between pyroptosis and MV release with Ninj1, which is a significant strength. It provides valuable insight into the potential of targeting Ninj1 as a therapeutic strategy.

      Although the advances in this paper are valuable, several aspects need to be clarified. Some comments are discussed below. 

      (1) Since it is not Ninj1 directly regulating coagulation but rather the MV released by Ninj1 playing a role, the title should include that. The current title makes it seem like Ninj1 directly regulates inflammation and coagulation. It would be better to revise the title.

      Thanks for the thoughtful comments. We show that the release of procoagulant MVs by plasma membrane rupture (PMR) is a critical step in the activation of coagulation. In addition, the release of cytokines and danger molecules by PMR may also contribute to coagulation. In choosing the title, we are trying to emphasize NINJ1-dependent PMR as a common trigger for these biological processes.

      (2) Ninj1 is known to be an induced protein that is barely expressed in normal conditions. As you showed in "Fig1G" data, control samples showed no detection of Ninj1. However, in "Figure S1", all tissues (liver, lung, kidney and spleen) expressed Ninj1 protein. If the authors stimulated the mice with fla injection, it should be mentioned in the figure legend. 

      We respectfully disagree with the comment that “Ninj1 is known to be an induced protein that is barely expressed in normal conditions”. NINJ1 protein is abundantly expressed (without induction) in tissues including liver, lung, kidney, and spleen, as shown in Fig S1. Consistently, other groups have shown abundant NINJ1 expression at baseline in tissues and cells such as liver (Kayagaki et al. Nature 2023) and BMDM (Kayagaki et al. Nature 2021; Borges et al. eLife 2023). Fig 1G shows fibrin deposition as an indicator of coagulation, not NINJ1 protein.

      (3) In "Fig3A", the Ninj1 protein expression was increased in the control of BMDM +/- cell lysate rather than fla stimulation. However, in MV, Ninj1 was not detected at all in +/- control but was only observed with Fla injection. The authors need to provide an explanation for this observation. Additionally, looking at the MV β-actin lane, the band thicknesses appear to be very different between groups. It seems necessary to equalize the protein amounts. If that is difficult, at least between the +/+ and +/- controls. 

      Thanks for the valuable comments. In Fla-stimulated Ninj1+/- BMDMs, most of the NINJ1 is released in MVs and is therefore not in the cell lysate, as shown in Fig 3A. The difference in beta-actin band intensity correlates with the MV numbers shown in Fig 3B. We ensured consistency by using the same number of cells.

      (4) Since the authors focused on Ninj1-dependent microvesicle (MV) release, they need to show MV characterizations (EM, NTA, Western for MV markers, etc...).

      Thanks for the suggestion. We have now added NTA analysis of MVs from BMDMs in Fig S4C.

      (5) To clarify whether Ninj1-dependent MV induces coagulation, the authors need to determine whether platelet aggregation is reduced with isolated +/- MVs compared to +/+ MVs. 

      Thanks for the suggestion. We agree that platelet aggregation is closely linked to blood coagulation but would argue that one does not directly cause the other. While it would be interesting to examine whether MVs induce platelet aggregation, we hope the reviewer would agree that the outcome of this experiment would neither significantly support nor challenge our statement that NINJ1-dependent PMR promotes coagulation.

      (6) Even with the authors' well-established experiments with haploid mice, it is a critical limitation of this paper. To improve the quality of this paper, the authors should consider confirming the findings using mouse macrophage cell lines, such as generating Ninj1-/- Raw264.7 cell lines, to examine the homozygous effect.

      Thanks for the valuable comments. We acknowledge the limitation of using haploid mice in this study. However, our data provides strong evidence supporting the role of NINJ1-dependent plasma membrane rupture in blood coagulation using primary macrophages.

      (7) There was a paper reported in 2023 (Zhou, X. et al., NINJ1 Regulates Platelet Activation and PANoptosis in Septic Disseminated Intravascular Coagulation. Int. J. Mol. Sci. 2023) that revealed the relationship between Ninj1 and coagulation. According to this paper, inhibition of Ninj1 in platelets prevents pyroptosis, leading to reduced platelet activation and, consequently, the suppression of thrombosis. How about the activation of platelets in Ninj1 +/- mice? The author should add this paper in the reference section and discuss the platelet functions in their mice.

      Thanks for the valuable comments. We examine PT time, plasma TAT, and tissue fibrin deposition as direct evidence of blood coagulation in this manuscript. We acknowledge that platelets play a key role in thrombosis; however, we hope the reviewer would agree that tissue factor-induced blood coagulation and platelet aggregation are linked yet distinct processes. Therefore, the role of NINJ1 in platelet aggregation falls beyond the scope of this manuscript.


      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Referring to previous research findings, the authors explain the connection between NINJ1 and MVs. Additional experiments and clarifications will strengthen the conclusions of this study.

      Below are some comments I feel could strengthen the manuscript: 

      (1) The authors mentioned their choice of using heterozygous NINJ1+/- mice on page 4, because of lethality and hydrocephalus. Nonetheless, there is a substantial number of references that use homozygous NINJ1-/- mice. Could there be any other specific reasons for using heterozygous mice in this study? 

      Thanks for the thoughtful comments. We are aware that a few homozygous NINJ1-/- mouse strains were used in several publications by different groups, including Drs. Kayagaki and Dixit (Genentech), from whom we obtained the heterozygous NINJ1+/- breeders. We do not have experience with the homozygous NINJ1-/- mice used by other groups. It is reasonable to assume that homozygous NINJ1-/- mice, if healthy, would have even stronger protection against coagulopathy than heterozygous NINJ1+/- mice. The only reason for not using homozygous mice in this study is that a majority of our homozygous NINJ1-/- mice develop hydrocephalus around weaning, and the rules of our DLAR facility require that these mice be euthanized. Although our homozygous NINJ1-/- mice develop hydrocephalus (as also reported by Drs. Kayagaki and Dixit, PMID: 37196676, PMCID: PMC10307625), heterozygous NINJ1+/- mice remain healthy.

      (2) Figure S2 clearly shows the method of pyroptosis induction by flagellin. It is also necessary as a prerequisite for this paper to show the changes in flagellin-induced pyroptosis in heterozygous NINJ1+/- mice.

      Thanks for the valuable suggestions. We agree that a plasma LDH measurement as an indicator of pyroptosis in vivo would add to the manuscript. We therefore made several attempts to measure plasma LDH in flagellin-challenged WT and NINJ1+/- mice using the CytoTox96 Non-Radioactive Cytotoxicity Assay (a Promega kit commonly used for LDH, Promega#G1780). Flagellin-challenged WT and NINJ1+/- mice develop hemolysis, which renders the plasma red. Because plasma coloring interferes with the assay, we could not get a meaningful reading to make an accurate comparison. We also tried the LDH-Glo Cytotoxicity Assay (luciferase based, Promega#J2380), without success on either plasma or serum. We hope the reviewer would agree that the reduced plasma MV count (Fig 3C) can serve as an alternative indicator of reduced pyroptosis.

      (3) IL-1ß levels controlled by GSDMD were not affected by NINJ1 expression according to previous studies (Ref 37, 29, Nature volume 618, pages 1065-1071 (2023)). GSDMD also plays an important role in TF release in pyroptosis. Are GSDMD levels not altered in heterozygous NINJ1 +/- mice?  

      Thanks for raising these great points. It has been reported that IL-1β secretion into the cell culture supernatant was not affected by NINJ1 deficiency or inhibition when BMDMs were stimulated by LPS (Ref 29, 37, now Ref 29, 35) or nigericin (Ref 29). As the GSDMD pore has been shown to facilitate the release of mature IL-1β, these in vitro observations are reasonable given that NINJ1-mediated PMR is a later event than GSDMD pore formation. However, we observed that plasma IL-1β (and also TNFα and IL-6) levels in Ninj1+/- mice were significantly lower. A few differences in the experimental conditions might contribute to the discrepancy: (1) there was no priming in our in vivo experiment, whereas BMDMs were primed in both in vitro studies before stimulation with LPS or nigericin; (2) the flagellin in our study engages a different inflammasome than either LPS or nigericin. Priming might change the expression and dynamics of IL-1β. More importantly, there might be unrecognized mechanisms of IL-1β secretion in vivo. We have now added a discussion of this in the main text.

      We examined GSDMD protein levels in liver, lung, kidney, and spleen from WT and NINJ1+/- mice by Western blotting. The data are now presented in the updated Fig S1; we did not observe an apparent difference in GSDMD expression between the two genotypes.

      (4) In Fig 1 F, the authors used a fibrin-specific monoclonal antibody for staining fibrin, but it's not clearly defined. There may be some problem with the quality of antibody or technical issues. Considering this, exploring alternative methods to visualize fibrin might be beneficial. Fibrin is an acidophil material, so attempting H&E staining or Movat's pentachrome staining might help for identify fibrin areas.

      Thanks for the valuable suggestions. The fibrin-specific monoclonal antibody in our study is mouse anti-fibrin monoclonal antibody (59D8). This antibody has been shown to bind to fibrin even in the presence of human fibrinogen at the concentration found in plasma [Hui et al. (1983). Science. 222 (4628); 1129-1132]. We apologize that we did not cite the reference in our initial submission. We obtained this antibody from Dr. Hartmut Weiler at Medical College of Wisconsin and Dr. Rodney M. Camire at the University of Pennsylvania, who were acknowledged in our initial submission.

      We performed H&E staining on serial sections of the same tissues for Figure 1F. The data is now presented as Fig S3.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors' main goal is to understand the mechanism by which pyroptosis (through the formation of Gasdermin D (GSDMD) pores in the plasma membrane) contributes to increased release of procoagulant Tissue Factor-containing microvesicles (MV). Their previous data demonstrate that GSDMD is critical for the release of MV that contains Tissue Factor (TF), thus making a link between pyroptosis and hypercoagulation. Given the recent identification of NINJ1 being responsible for plasma membrane rupture (Kayagaki et al. Nature 2011), the authors wanted to determine if NINJ1 is responsible for TF-containing MV release. Given the constitutive ninj1 KO mouse leads to partial embryonic lethality, the authors decided to use a heterozygous ninj1 KO mouse (ninj1+/-). While the data are well controlled, there is limited understanding of the mechanism of action. Also, given that the GSDMD pores have an ~18 nm inner diameter enough to release IL-1β, while larger molecules like LDH (140 kDa) and other DAMPs require plasma membrane rupture (likely mediated by NINJ1), it is not unexpected that large MVs require NINJ1-mediated plasma cell rupture. 

      Strengths: 

      The authors convincingly demonstrate that ninj1 haploinsufficiency leads to decreased prothrombin time, plasma TAT and plasma cytokines 90 minutes post-treatment in mice, which leads to partial protection from lethality. 

      Weaknesses: 

      - In the abstract, the authors say "...cytokines and protected against blood coagulation and lethality triggered by bacterial flagellin". This conclusion is not substantiated by the data, as you still see 70% mortality at 24 hours in the ninj1+/- mice. 

      Thanks for the thoughtful comments. We corrected the text to “partially protected against blood coagulation and lethality triggered by bacterial flagellin”.

      - The previous publication by the authors (Wu et al. Immunity 2019) clearly shows that GSDMD-dependent pyroptosis is required for inflammasome-induced coagulation and mouse lethality. However, as it is not possible for the authors to use the homozygous ninj1 KO mouse due to partial embryonic lethality, it becomes challenging to compare these two studies and the contributions of GSDMD vs. NINJ1. Comparing the contributions of GSDMD and NINJ1 in human blood-derived monocytes/macrophages where you can delete both genes and assess their relevant contributions to TF-containing MV release within the same background would be crucial in comparing how much contribution NINJ1 has versus what has been published for GSDMD? This would help support the in vivo findings and further corroborate the proposed conclusions made in this manuscript.  

      Thanks for the valuable question. We have shown that plasma MV TF activity was reduced in both GSDMD-deficient mice (Ref 23) and Ninj1+/- mice (present manuscript). Given that TF is a plasma membrane protein, MV TF most likely comes from ruptured plasma membrane. In flagellin-induced pyroptosis, both GSDMD and NINJ1 deficiency equally blocked LDH release (plasma membrane rupture) in BMDMs (Ref 29). Further, in pyroptosis glycine acts downstream of GSDMD pore formation for its effect against NINJ1 activation (Ref 35). Therefore, GSDMD pore formation should be upstream of NINJ1 activation in pyroptosis (which may not be the case in other forms of cell death), and there are likely equal effects of GSDMD and NINJ1 on MV release in flagellin-induced pyroptosis. As the reviewer suggested, experiments using human blood-derived monocytes/macrophages would enable a direct comparison to determine the relative contribution. However, this approach presents a few technical difficulties: in our experience, it is not easy to manipulate gene expression in primary human monocytes/macrophages, and variable efficiency in gene manipulation of GSDMD and NINJ1 would complicate the comparison. We hope the reviewer would agree that a direct comparison between GSDMD and NINJ1 is not required to support our conclusion that NINJ1-dependent membrane rupture is involved in inflammasome-pyroptosis induced coagulation and inflammation.

      - What are the levels of plasma TAT, PT, and inflammatory cytokines if you collect plasma after 90 minutes? Given the majority (~70%) of the ninj+/- mice are dead by 24 hours, it is imperative to determine whether the 90-minute timeframe data (in Fig 1A-G) is also representative of later time points. The question is whether ninj1+/- just delays the increases in prothrombin time, plasma TAT, and plasma cytokines. 

      Thanks for the valuable question. The time point (90 min) was chosen based on our in vitro observation that flagellin-induced pyroptosis in BMDMs largely occurs within 60-90 min. 

      Because our focus is on the primary effect of flagellin in vivo, potential secondary effects at later time points may complicate the results and be hard to interpret. As the reviewer suggested, we measured plasma PT and TAT at 6 hours post-flagellin challenge. The significant difference in PT between Ninj1+/+ and Ninj1+/- mice was sustained (Fig A), suggesting that coagulation proteins remained more depleted in Ninj1+/+ mice than in Ninj1+/- mice. However, plasma TAT levels had diminished to baseline (refer to Fig 1B in main text) in both groups and showed no significant difference between groups (Fig B), which could be explained by the short half-life of TAT (less than 30 min) in the blood. Since flagellin challenge is a one-time hit, there might not be a second episode of coagulation after the 90-minute time point, at least not one triggered by flagellin, as supported by the plasma TAT levels at 6 hours. We now comment on this limitation at the end of the main text.

      Based on our previous studies, plasma IL-1β and TNFα peak at an early time point and diminish over time, whereas plasma IL-6 levels are maintained. As shown in Author response image 1, plasma IL-6 appeared higher in Ninj1+/+ than in Ninj1+/- mice, but the difference was not statistically significant (partly because one missing sample in the Ninj1+/+ group, n = 4 rather than 5, decreased the statistical power to detect a difference).

      Author response image 1.

      Mice were injected with Fla (500 ng LFn-Fla plus 3 µg PA). Blood was collected 6 hours after Fla injection. Prothrombin time (A), plasma TAT (B), and plasma IL-6 (C) were measured. Mann-Whitney tests were performed.
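
      The effect of a missing sample on a rank-based test can be illustrated with a short sketch. It assumes an exact two-sided Mann-Whitney test with no tied values and computes only the smallest p-value that given group sizes can produce; the function name is illustrative and no measured data are used.

```python
from math import comb


def min_two_sided_p(n1: int, n2: int) -> float:
    """Smallest two-sided p-value an exact Mann-Whitney test can return
    for group sizes n1 and n2, assuming no ties.

    Under the null hypothesis every assignment of ranks to group 1 is
    equally likely, so complete separation of the two groups has
    probability 1 / C(n1 + n2, n1) in each tail.
    """
    return min(1.0, 2.0 / comb(n1 + n2, n1))


# Compare the group sizes mentioned in the response (4 vs 5) with 5 vs 5.
for n1, n2 in [(4, 5), (5, 5)]:
    print(f"n = {n1} vs {n2}: minimum two-sided p = {min_two_sided_p(n1, n2):.4f}")
```

      With 4 versus 5 animals, the smallest attainable p-value is 2/126 (about 0.016), compared with 2/252 (about 0.008) for 5 versus 5, so losing one sample leaves less room to detect a difference.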

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors): 

      - Fig 1F: are there lower magnification images that capture the fibrin deposition? The IHC data seems at odds with the WB data in Fig. 1G where there is still significant fibrin detected in the heterozygous lungs and liver. Quantitating the Fig. 1G Western blot would also be helpful.

      IHC surveys a thin tissue section, whereas WB surveys a piece of tissue; fibrin deposition may therefore be missed by IHC but detected by WB. That is why we used two methods. Below we provide lower-magnification images of fibrin deposition (about 2 x 1.6 mm area).

      Author response image 2.

      - Fig1H - lethality study uses 5x dose of Fla used in earlier studies. In the lethality data where there is a delay in ninj1+/- mortality, are the parameters (prothrombin time, plasma TAT, and plasma cytokines) measured at 90 minutes different between WT and ninj+/- mice? This would be critical to confirm that this is not merely due to a delayed release of TF-containing MVs.

      We used a 5x lower dose of Fla in the coagulation study than in the lethality study because it is not as easy to draw blood from septic mice given the higher dose of flagellin. We need to terminate the mice to collect blood for plasma measurements, and therefore these parameters were not measured for mice in the lethality study.

      - What is the effect of ninj+/- on E. coli-induced lethality in mice? How do these data compare to E. coli infection of GSDMD-/- mice? 

      We did not examine the effect of Ninj1+/- on E. coli-induced lethality. After the initial submission of our manuscript, we have focused on Ninj1 flox/flox mice instead of Ninj1+/- for NINJ1 deficiency. We are using induced global Ninj1 deficient mice for polymicrobial infection-induced lethality in our new studies.

      - Fig 2 - in the E. coli model, the prothrombin time, plasma TAT, and plasma cytokines are measured 6 hours post-infection. How were these time points chosen? Did the authors measure prothrombin time, plasma TAT, and plasma cytokines at different time points?  

      The in vivo time points for flagellin and E. coli were chosen based on our in vitro observation of the timelines of BMDM pyroptosis induced by flagellin and bacteria. This disparity probably arises from distinct dynamics between purified protein and bacterial infections. Purified proteins can swiftly translocate into cells and take effect immediately after injection. Conversely, during bacterial infection, macrophages engulf and digest the bacteria to expose their antigens. Subsequently, these antigens initiate further effects, a process that takes some time to unfold. 

      Our focus is on the primary effect of flagellin in vivo; potential secondary effects at later time points may complicate the results and be hard to interpret. As the reviewer suggested, we measured plasma PT and TAT at 6 hours post-flagellin challenge. The significant difference in PT between Ninj1+/+ and Ninj1+/- mice was sustained (Fig A), suggesting that coagulation proteins remained more depleted in Ninj1+/+ mice than in Ninj1+/- mice. However, plasma TAT levels had diminished to baseline (refer to Fig 1B in main text) in both groups and showed no significant difference between groups (Fig B), which could be explained by the short half-life of TAT (less than 30 min) in the blood. Since flagellin challenge is a one-time hit, there might not be a second episode of coagulation after the 90-minute time point, at least not one triggered by flagellin, as supported by the plasma TAT levels at 6 hours. We now comment on this limitation at the end of the main text.

      Based on our previous studies, plasma IL-1β and TNFα peak at an early time point and diminish over time, whereas plasma IL-6 levels are maintained. As shown in Author response image 1, plasma IL-6 appeared higher in Ninj1+/+ than in Ninj1+/- mice, but the difference was not statistically significant (partly because one missing sample in the Ninj1+/+ group, n = 4 rather than 5, decreased the statistical power to detect a difference).

      - Fig 3 - the sequence of figure panels listed in the legend needs to be corrected. Fig 3A requires quantitation of NINJ1 levels compared to beta-actin. Fig 3C - needs a control for equal MV loading. 

      Thanks for the recommendations. The figure sequence has been corrected. There remain no common markers or loading controls for MV, so we use equal plasma volume for loading control.

      Additional comments: 

      (1) In Fig 3A, the size of NINJ1 appears to be increased in the NINJ+/- group.  

      This discrepancy is likely attributed to a technical issue when running the protein gel and protein transfer, which makes the image tilt to one side.

      (2) Describe the method of BMDM isolation.

      Thanks for the recommendations. We now include the method of BMDM isolation. In brief, mouse femur and tibia from one leg are harvested and rinsed in ice-cold PBS, followed by a brief rinse in 70% ethanol for 10-15 seconds. Both ends of the bones are then cut open, and the bone marrow is flushed out using a 10 ml syringe with a 26-gauge needle. The marrow is passed through a 19-gauge needle once to disperse the cells. After filtering through a 70-μm cell strainer, the cells are collected by centrifugation at 250 g for 5 minutes at 4 °C, then suspended in two 150 mm petri dishes, each containing 25 ml of L-cell conditioned medium (RPMI-1640 supplemented with 10% FBS, 2 mM L-Glutamine, 10 mM HEPES, 15% LCM, and penicillin/streptomycin). After 3 days, 15 ml of LCM medium is added to each dish. The cells typically reach full confluency by days 5-7.

      (3) According to this method, BMDMs are seeded without any M-CSF or L929-cell conditioned medium. How many macrophages survive under this condition? 

      BMDMs are cultured and differentiated in medium supplemented with 15% L929-cell conditioned medium. For the experiment, the cells were seeded in Opti-MEM medium (Thermo Fisher Scientific, Cat# 51985034) without M-CSF or L929-cell conditioned medium. BMDMs can survive under this condition, as evidenced by low LDH and high ATP measurement (Fig S5).

      Reviewer #2 (Recommendations For The Authors): 

      - There is significant information missing in the methods and this makes it unclear how to interpret how some of the experiments were performed. For example, there is no detailed description or references in the methods on how the in vivo experiments were performed. The methods section needs significantly more details so that any reader is able to follow the protocols in this manuscript. References to previous work should also be included as needed.

      Thanks for the recommendations. We had some of the details in the figure legend. We now add details in the methods for better interpretation of our data. 

      - Line numbers in the manuscript would be helpful when resubmitting the manuscript so that the reviewer can easily point to the main text when making comments. 

      Thanks for the recommendations. We now add line numbers in the manuscript.

    1. eLife Assessment

      In this valuable study, the authors integrate several datasets to describe how the genome interacts with nuclear bodies across distinct cell types and in Lamin A and LBR knockout cells. They provide convincing evidence to support their claims and particularly find that specific genomic regions segregate relative to the equatorial plane of the cell when considering their interaction with various nuclear bodies. The authors are encouraged to consider citing the relevant work of other labs who have shown the presence of different types of Lamin Associated Domains (LADs).

    2. Reviewer #2 (Public review):

      Summary:

      Golamalamdari, van Schaik, Wang, Kumar, Zhang, Zhang and colleagues study interactions between the speckle, nucleolus and lamina in multiple cell types (K562, H1, HCT116 and HFF). Their datasets define how interactions between the genome and the different nuclear landmarks relate to each other and change across cell types. They also identify how these relationships change in K562 cells in which LBR and LMNA are knocked out.

      Strengths:

      Overall, there are a number of datasets that are provided, and several "integrative" analyses performed. This is a major strength of the paper, and I imagine the datasets will be of use to the community to be further probed, and the relationships elucidated here to be further studied. An especially interesting result was that specific genomic regions (relative to their association with the speckle, lamina, and other molecular characteristics) segregate relative to the equatorial plane of the cell.

      Weaknesses:

      The experiments are primarily descriptive, and the cause-and-effect relationships are limited (though the authors do study the role of LMNA/LBR knockdown with their technologies).

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      (1) This is a valuable manuscript that successfully integrates several data sets to determine genomic interactions with nuclear bodies.

      In this paper we both challenge and/or revise multiple long-standing “textbook” models of nuclear genome organization while also revealing new features of nuclear genome organization. Therefore, we argue that the contributions of this paper extend well beyond “valuable”. Specifically, these contributions include:

      a. We challenge a several decades focus on the correlation of gene positioning relative to the nuclear lamina. Instead, through comparison of cell lines, we show a strong correlation of differences in gene activity with differences in relative distance to nuclear speckles in contrast to a very weak correlation with differences in relative distance to the nuclear lamina. This inference of little correlation of gene expression with nuclear lamina association was supported by direct experimental manipulation of genome positioning relative to the nuclear lamina. Despite pronounced changes in relative distances to the nuclear lamina there was little change relative to nuclear speckles and little change in gene expression.

      b. We similarly challenge the long-standing proposed functional correlation between the radial positioning of genes and gene expression. Here, and in a now published companion paper (doi.org/10.1038/s42003-024-06838-7), we demonstrate how nuclear speckle positioning relative to nucleoli and the nuclear lamina varies among cell types, as does the inverse relationship between genome positioning relative to nuclear speckles and the nuclear lamina. Again, this is consistent with the primary correlation of gene activity being the positioning of genes relative to nuclear speckles and also explains previous observations showing a strong relationship between radial position and gene expression only in some cell types.

      c. We identified a new partially repressed, middle to late DNA replicating type of chromosome domain- “p-w-v fILADs”- by their weak interaction with the nuclear lamina, which, based on our LMNA/LBR KO experimental results, compete with LADs for nuclear lamina association. Moreover, we show that when fLADs convert to iLADs, most conversions are to this p-w-v fiLAD state, although ~ one third are to a normal, active, early replicating iLAD state. Thus, fLADs can convert between repressed, partially repressed, and active states, challenging the prevailing assumption of the division of the genome into two states – active, early replicating A compartment/iLAD regions versus inactive, late replicating, B compartment/LAD regions.

      d. We identified nuclear speckle associated domains as DNA replication initiation zones, with the domains showing strongest nuclear speckle attachment initiating DNA replication earliest in S-phase.

      e. We describe for the first time an overall polarization of nuclear genome organization in adherent cells with the most active, earliest replicating genomic regions located towards the equatorial plane and less expressed genomic regions towards the nuclear top or bottom surfaces. This includes polarization of some LAD regions to the nuclear lamina at the equatorial plane and other LAD regions to the top or bottom nuclear surfaces.

      We have now rewritten the text to make the significance of these new findings clearer.

      (2) Strength of evidence: The evidence supporting the central claims is varied in its strength ranging from solid to incomplete. Orthogonal evidence validating the novel methodologies with alternative approaches would better support the central claims.

      We argue that our work exploited methods, data, and analyses equal to or more rigorous than the current state-of-the-art. This indeed includes orthogonal evidence using alternative methods which both supported our novel methodologies as well as demonstrating their robustness relative to more conventional approaches. This explains how we were able to challenge/revise long-standing models and discover new features of nuclear genome organization. More specifically:

      a. Unlike most previous analyses, we have integrated both genomic and imaging approaches to examine the nuclear genome organization relative to not one, but several different nuclear locales and we have done this across several cell types. To our knowledge, this is the first such integrated approach and has been key to our success in appreciating new features of nuclear genome organization.

      b. The 16-fraction DNA replication Repli-seq data we developed and applied to this project represents the highest temporal mapping of DNA replication timing to date.

      c. The TSA-seq approach that we used remains the most accurate sequence-based method for estimating microscopic distance of chromosome regions to different nuclear locales. As implemented, this method is unusually robust and direct as it exploits the exponential micron-scale gradient established by the diffusion of the free-radicals generated by peroxidase labeling to measure relative distances of chromosome regions to labeled nuclear locales. We had previously demonstrated that TSA-seq was able to estimate the average distances of genomic regions to nuclear speckles with an accuracy of ~50 nm, as validated by light microscopy. The TSA-seq 2.0 protocol we developed and applied to this project maintained the original resolution of TSA-seq to estimate to an accuracy of ~50 nm the average distances of genomic regions from nuclear speckles, as validated by light microscopy, while achieving more than a 10-fold reduction in the required number of cells.
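
      The idea of reading a distance from an exponential labeling gradient can be sketched as follows. This is a minimal illustration, assuming a single-exponential decay with no background term; the function name, the decay length, and the signal values are hypothetical and are not the calibration used in the TSA-seq papers.

```python
import math


def distance_from_signal(signal: float, signal_at_target: float,
                         decay_length_um: float) -> float:
    """Invert signal = signal_at_target * exp(-distance / decay_length_um).

    All three arguments are hypothetical illustration parameters; a real
    calibration would be fitted against microscopy measurements.
    """
    if signal <= 0 or signal_at_target <= 0 or decay_length_um <= 0:
        raise ValueError("all inputs must be positive")
    return -decay_length_um * math.log(signal / signal_at_target)


# Hypothetical example: a locus with half the labeling seen at the target,
# with an assumed decay length of 0.5 micron.
d = distance_from_signal(signal=4.0, signal_at_target=8.0, decay_length_um=0.5)
print(f"estimated distance: {d:.3f} um")
```

      Because the mapping is logarithmic, a fixed fold-change in signal corresponds to a fixed step in distance under this assumed model, which is what would let a sequencing readout serve as a ruler.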

      We have rewritten the text to address the reviewer concerns that led them to their initial characterization of the TSA-seq as novel and not yet validated.

      First, we have added a discussion of how the use of nuclear speckle TSA-seq as a “cytological ruler” was based on an extensive initial characterization of TSA-seq as described in previous published literature. In that previous literature we showed how the conventional molecular proximity method, ChIP-seq, instead showed local accumulation of the same marker proteins over short DNA regions unrelated to speckle distances. Second, we reference our companion paper, now published, and describe how the extension of TSA-seq to measure relative distances to nucleoli was further validated and shown to be robust by comparison to NAD-seq and extensive multiplexed immuno-FISH data. We further discuss how in the same companion paper we show how nucleolar DamID instead was inconsistent with both the NAD-seq and multiplexed immuno-FISH data as well as the nucleolar TSA-seq.

      Third, we have added scatterplots showing exactly how highly the estimated microscopic distances to all three nuclear locales, measured in IMR90 fibroblasts, correlate with the TSA-seq measurements in HFF fibroblasts. This addresses the concern that we were not using the exact same fibroblast cell line for the TSA-seq versus microscopic measurements. The strong correlation already observed would only be expected to become even stronger with use of the exact same fibroblast cell lines for both measurements.

      Fourth, we have addressed the reviewer concern that the nuclear lamin TSA-seq was not properly validated because it did not match nuclear lamin Dam-ID. We have now added to the text a more complete explanation of how microscopic proximity assays such as TSA-seq measure something different from molecular proximity assays such as DamID or NAD-seq. We have added further explanation of how TSA-seq complements molecular proximity assays such as DamID and NAD-seq, allowing us to extract further information than either measurement alone. We also briefly discuss why TSA-seq succeeds for certain nuclear locales using multiple independent markers whereas molecular proximity assays may fail against the same nuclear locales using the same markers. This includes brief discussion from our own experience attempting unsuccessfully to use DamID against nucleoli and nuclear speckles.

      Reviewer #1 (Public Review):

      (1) The weakness of this study lies in the fact that many of the genomic datasets originated from novel methods that were not validated with orthogonal approaches, such as DNA-FISH. Therefore, the detailed correlations described in this work are based on methodologies whose efficacy is not clearly established. Specifically, the authors utilized two modified protocols of TSA-seq for the detection of NADs (MKI67IP TSA-seq) and LADs (LMNB1-TSA-seq).

      We disagree with the statement that the TSA-seq approach and data has not been validated by orthogonal approaches. We have now addressed this point in the revised manuscript text:

      a) We added text to describe how previously FISH was used to validate speckle TSA-seq by demonstrating a residual of ~50 nm between the TSA-seq predicted distance to speckles and the distance measured by light microscopy using FISH:

      "In contrast, TSA-seq measures relative distances to targets on a microscopic scale corresponding to 100s of nm to ~ 1 micron based on the measured diffusion radius of tyramide-biotin free-radicals (Chen et al., 2018). Exploiting the measured exponential decay of the tyramide-biotin free-radical concentration, we showed how the mean distance of chromosomes to nuclear speckles could be estimated from the TSA-seq data to an accuracy of ~50 nm, as validated by FISH (Chen et al., 2018)."

      b) We note that we also previously have validated lamina (Chen et al, JCB 2018) and nucleolar (Kumar et al, 2024) TSA-seq and further validated speckle TSA-seq (Zhang et al, Genome Research 2021) by traditional immuno-FISH and/or immunostaining. The overall high correlation between lamina TSA-seq and the orthogonal lamina DamID method was also extensively discussed in the first TSA-seq paper (Chen et al, JCB 2018). Included in this discussion was description of how the differences between lamina TSA-seq and DamID were expected, given that DamID produces a signal more proportional to contact frequency, and independent of distance from the nuclear lamina, whereas TSA-seq produces a signal that is a function of microscopic distance from the lamina, as validated by traditional FISH.

      c) We added text to describe how the nucleolar TSA-seq previously was validated by two orthogonal methods- NAD-seq and multiplexed DNA immuno-FISH:

      "We successfully developed nucleolar TSA-seq, which we extensively validated using comparisons with two different orthogonal genome-wide approaches (Kumar et al., 2024)- NAD-seq, based on the biochemical isolation of nucleoli, and previously published direct microscopic measurements using highly multiplexed immuno-FISH (Su et al., 2020)."

      d) We have now added panels A&B to Fig. 7 and a new Supplementary Fig. 7 demonstrating further validation of TSA-seq by showing the high correlation between TSA-seq scores and the microscopically measured distances of many hundreds of genomic sites across the genome from different nuclear locales. As discussed in response #2 below, we have compared distances measured in IMR90 fibroblasts with TSA-seq scores measured in HFF fibroblasts. We would therefore argue that these correlations are a lower estimate, and that the correlation between microscopic distances and TSA-seq scores would likely have been still higher if we had performed both assays in the exact same cell line.

      (2) Although these methods have been described in a bioRxiv manuscript by Kumar et al., they have not yet been published. Moreover, and surprisingly, Kumar et al., work is not cited in the current manuscript, despite its use of all TSA-seq data for NADs and LADs across the four cell lines.

      The Kumar et al, Communications Biology, 2024 paper is now published and is cited properly in our revision. We apologize for this oversight and for any confusion that our initial omission of this citation may have created. We had been writing this manuscript and the Kumar et al manuscript in parallel and had intended to co-submit. We planned to cross-reference the two at the time of co-submission, adding the Kumar et al reference to the first version of this manuscript once we obtained a doi from bioRxiv. However, we then submitted the Kumar et al manuscript several months earlier and forgot that we had not added the reference to the first version of this manuscript.

      (3) Moreover, Kumar et al. did not provide any DNA-FISH validation for their methods.

      As we described in our response to Reviewer 1's comment #1, we had previously provided traditional FISH validation of lamina TSA-seq in our first TSA-seq paper as well as validation by comparison with lamina DamID (Chen et al, 2018).

      We also described how the nucleolar TSA-seq was extensively cross-validated in the Kumar et al, 2024 paper by both NAD-seq and the highly multiplexed immuno-FISH data from Su et al, 2020.

      We note additionally that in the Kumar et al, 2024 paper the nucleolar TSA-seq was further validated by correlating the variations in centromeric association with nucleoli across the four cell lines predicted by nucleolar TSA-seq with the variations observed by traditional immunofluorescence microscopy.

      (4) Therefore, the interesting correlations described in this work are not based on robust technologies.

      This comment was made in reference to the Kumar et al paper not having been published, and, as noted in responses to points #2 and #3, the paper is now published.

      We do want to note specifically, however, that in our experience TSA-seq has proven remarkably robust in comparison to molecular proximity assays. We've described in our responses to the previous points how TSA-seq has been cross-validated by both microscopy and by comparison with lamina DamID and nucleolar NAD-seq. We note also that in every application of TSA-seq to date, all antibodies that produced good immunostaining showed good TSA-seq results. Moreover, we obtained nearly identical results in every case in which we performed TSA-seq with different antibodies against the same target. Thus anti-SON and anti-SC35 staining produced very similar speckle TSA-seq data (Chen et al, 2018), anti-lamin A and anti-lamin B staining produced very similar lamina TSA-seq data (Chen et al, 2018), anti-nucleolin and anti-POL1RE staining produced very similar DFC/FC nucleolar TSA-seq data (Kumar et al, 2024), and anti-MKI67IP and anti-DDX18 staining produced very similar GC nucleolar TSA-seq data (Kumar et al, 2024).

      This independence of results with TSA-seq to the particular antibody chosen to label a target differs from experience with methods such as ChIP, DamID, and Cut and Run/Tag in which results can differ or be skewed based on variable distance and therefore reactivity of target proteins from the DNA or due to other factors such as non-specific binding during pulldown (ChIP) or differential extraction by salt washes (Cut and Tag).

      Our experience in every case to date is that antibodies that produce similar immunofluorescence staining produce similar TSA-seq results. We attribute this robustness to the fact that TSA-seq is based only on the original immunostaining specificity provided by the primary and secondary antibodies plus the diffusion properties of the tyramide-free radical.

      We've now added the following text to our revised manuscript:

      "As previously demonstrated for both SON and lamin TSA-seq (Chen et al., 2018), nucleolar TSA-seq was also robust in the sense that multiple target proteins showing similar nucleolar staining showed similar TSA-seq results (Kumar et al., 2024); this robustness is intrinsic to TSA-seq being a microscopic rather than molecular proximity assay, and therefore not sensitive to the exact molecular binding partners and molecular distance of the target proteins to the DNA."

      (5) An attempt to validate the data was made for SON-TSA-seq of human foreskin fibroblasts (HFF) using multiplexed FISH data from IMR90 fibroblasts (from the lung) by the Zhuang lab (Su et al., 2020). However, the comparability of these datasets is questionable. It might have been more reasonable for the authors to conduct their analyses in IMR90 cells, thereby allowing them to utilize MERFISH data for validating the TSA-seq method and also for mapping NADs and LADs.

      We disagree with the reviewer's overall assessment that the use of the IMR90 data to further validate the TSA-seq is questionable because the TSA-seq data from HFF fibroblasts are not necessarily comparable with multiplexed immuno-FISH microscopic distances measured in IMR90 fibroblasts.

      In response we have now added panels to Fig. 7 and Supplementary Fig. 7, showing:

      a) There is very little difference in correlation between speckle TSA-seq and measured distances from speckles in IMR90 cells whether we use IMR90 or HFF cells SON TSA-seq data (R² = 0.81 versus 0.76) (new Fig. 7A).

      b) There is also a high correlation between lamina (R² = 0.62) and nucleolar (R² = 0.73) HFF TSA-seq and measured distances in IMR90 cells. Thus, we conclude that this high correlation shows that the multiplexed data from ~1000 genomic locations does validate the TSA-seq. These correlations should be considered lower bounds on what we would have measured using IMR90 TSA-seq data. Thus, the true correlation between distances of loci from nuclear locales and TSA-seq would be expected to be either comparable or even stronger than what we are seeing with the IMR90 versus HFF fibroblast comparisons.

      c) This correlation is cell-type specific (Fig. 7B, new SFig. 7). Thus, even for speckle TSA-seq, highly conserved between cell types, the highest correlation of IMR90 distances with speckle TSA-seq is with IMR90 and HFF fibroblast data. For lamina and nucleolar TSA-seq, which show much lower conservation between cell types, the correlation of IMR90 distances is high for HFF data but much lower for data from the other cell types. This further justifies the use of IMR90 fibroblast distance measurements as a proxy for HFF fibroblast measurements.
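
The lower-bound argument in points a) to c) can be illustrated with a minimal sketch. It is not the analysis performed for Fig. 7: the data are synthetic, and the variable names, the linear score-versus-distance relation, and the treatment of cell-line mismatch as independent added noise are all assumptions made only for illustration.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

rng = np.random.default_rng(0)
n_loci = 1000  # roughly the number of multiplexed FISH loci mentioned above

# Hypothetical microscopic distances (in microns) of loci from a nuclear locale.
distance_um = rng.uniform(0.0, 1.5, size=n_loci)

# Hypothetical TSA-seq scores from the SAME cell line: scores fall with distance.
tsa_same_line = -2.0 * distance_um + rng.normal(0.0, 0.3, size=n_loci)

# Hypothetical scores from a PROXY cell line: the same relation plus extra,
# independent variation standing in for cell-line differences (an assumption).
tsa_proxy_line = tsa_same_line + rng.normal(0.0, 0.3, size=n_loci)

r_same = pearson_r(distance_um, tsa_same_line)
r_proxy = pearson_r(distance_um, tsa_proxy_line)
print(f"same line:  R = {r_same:.2f}, R^2 = {r_same**2:.2f}")
print(f"proxy line: R = {r_proxy:.2f}, R^2 = {r_proxy**2:.2f}")
```

Under these assumptions the proxy comparison gives the weaker inverse correlation, which is the sense in which a cross-cell-line correlation can be read as a lower bound.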

      Thus, we have added the following text to the revised manuscript:

      "We reasoned that the nuclear genome organization in the two human fibroblast cell lines would be sufficiently similar to justify using IMR90 multiplexed FISH data [43] as a proxy for our analysis of HFF TSA-seq data. Indeed, the high inverse correlation (R= -0.86) of distances to speckles measured by MERFISH in IMR90 cells with HFF SON TSA-seq scores is nearly identical to the inverse correlation (R= -0.89) measured instead using IMR90 SON TSA-seq scores (Fig. 7A). Similarly, distances to the nuclear lamina and nucleoli show high inverse correlations with lamina and nucleolar TSA-seq, respectively (Fig. 7A). These correlations were cell type specific, particularly for the lamina and nucleolar distance correlations, as these correlations were reduced if we used TSA-seq data from other cell types (SFig. 7A). Therefore, the high correlation between IMR90 microscopic distances and HFF TSA-seq scores can be considered a lower bound on the likely true correlation, justifying the use of IMR90 as a proxy for HFF for testing our predictions."

      Reviewer #2 (Public Review):

      Weaknesses:

      (1) The experiments are largely descriptive, and it is difficult to draw many cause-and-effect relationships...The study would benefit from a clear and specific hypothesis.

      This study was hypothesis-generating rather than hypothesis-testing in its goal. Our research was funded through the NIH 4D-Nucleome Consortium, which had as its initial goal the development, benchmarking, and validation of new genomic technologies. Our Center focused on the mapping of the genome relative to different nuclear locales and the correlation of this intranuclear positioning of the genome with functions, specifically gene expression and DNA replication timing. By its very nature, this project took a discovery-driven versus hypothesis-driven scientific approach. Our question fundamentally was whether we could gain new insights into nuclear genome organization through the integration of genomic and microscopic measurements of chromosome positioning relative to multiple different nuclear compartments/bodies and their correlation with functional assays such as RNA-seq and Repli-seq.

      Indeed, this study resulted in multiple new insights into nuclear genome organization as summarized in our last main figure. We believe our work and conclusions will be of general interest to scientists working in the fields of 3D genome organization and nuclear cell biology. We anticipate that each of these new insights will prompt future hypothesis-driven science focused on specific questions and the testing of cause-and-effect relationships.

      However, we do want to point out that our comparison of wild-type K562 cells with the LMNA/LBR double knockout was designed to test the long-standing model that nuclear lamina association of genomic loci contributes to gene silencing. This experiment was motivated by our surprising result that gene expression differences between cell lines correlated strongly with differences in positioning relative to nuclear speckles rather than the nuclear lamina. Despite documenting in these double knockout cells a decreased nuclear lamina association of most LADs, and an increased nuclear lamina association of the “p-w-v” fiLADs identified in this manuscript, we saw no significant change in gene expression in any of these regions as compared to wild-type K562 cells. Meanwhile, distances to nuclear speckles as measured by TSA-seq remained nearly constant.

      We would argue that this represents a specific example in which new insights generated by our genomics comparison of cell lines led to a clear and specific hypothesis and the experimental testing of this hypothesis.

      (2) Similarly, the paper would be very much strengthened if the authors provided additional summary statements and interpretation of their results (especially for those not as familiar with 3D genome organization).

      We appreciate this feedback and agree with the reviewer that this would be useful, especially for those not familiar with previous work in the field of 3D genome organization. In an earlier draft, we had included additional summary and interpretation statements in both the Introduction and Results sections. At the start of each Results section, we had also previously included brief discussion of what was known before and the context for the subsequent analysis contained in that section. However, we had thought we might be submitting to a journal with specific word limits and had significantly cut out that text.

      We have now restored this text and, in certain cases, added additional explanations and context.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Figures 1C and D. Please add the units at the values of each y-axis.

      We have done that.

      The representation of Figure 2C lacks clarity and is difficult to understand. The x-axis labeling regarding the gene fraction number needs clarification.

      We've modified the text to the Fig. 2C legend: "Fraction of genes showing significant difference in relative positioning to nuclear speckles (gene fraction, x-axis) versus log2 (HFF FPKM / H1 FPKM) (y-axis);"

      "We next used live-cell imaging to corroborate that chromosome regions close to nuclear speckles, primarily Type I peaks, would show the earliest DNA replication timing." This sentence requires modification as Supplementary Figure 3F does not demonstrate that Type I peaks exhibit the earliest DNA replication timing; it only indicates that the first PCNA foci in S-phase are in proximity to nuclear speckles.

      We've modified the text to: "We next used live-cell imaging to show that chromosome regions close to nuclear speckles show the earliest DNA replication timing; this is consistent with the earliest firing DNA replication IZs, as determined by Repli-seq, aligning with Type I peaks that are closely associated with nuclear speckles."

      In Figure 5, the authors employed LaminB1-DamID to quantify LADs in LBR-KO and LMNA/LBR-DKO K562 cells. These are interesting results. However, for these experiments, it is crucial to assess LMNB1 signal at the nuclear periphery via immunofluorescence (IF) to confirm the absence of changes, ensuring that the DamID signal solely reflects contacts with the nuclear lamina. Furthermore, in this instance, their findings should be validated through DNA-FISH.

      Immunostaining of LMNB1 was performed and showed a normal staining pattern as a ring adjacent to the nuclear periphery. Images of this staining were included in the metadata tied to the sequencing data sets deposited on the 4D Nucleome Data portal. We thank the reviewer for bringing up this point, and have added a sentence mentioning this result in the Results Section:

      "Immunostaining against LMNB1 revealed the normal ring of staining around the nuclear periphery seen in wt cells (images deposited as metadata in the deposited sequencing data sets)."

      Because both TSA-seq and DamID have been extensively validated by FISH, as detailed in our previous responses to the public reviewer comments, we feel it is unnecessary to validate these findings by FISH.

      p-w-v-fiLADs should be labelled in Figure 5B.

      We've added labeling as suggested.

      "The consistent trend of slightly later DNA replication timing for regions (primarily p-w-v fiLADs) moving closer to the lamina" is not visible in the representation of the data of Figure 5G.

      We did not make a change as we believed this trend was apparent in the Figure.

      To reduce the descriptive nature of the data, it would be pertinent to conduct H3K9me3 and H3K27me3 ChIP-seq analyses in both the parental and DKO mutant cells. This would elucidate whether p-w-v-fiLADs and NADs anchoring to the nuclear lamina undergo changes in their histone modification profile.

      We believe further analysis of the reasons underlying these shifts in positioning, including such ChIP-seq or equivalent analysis, is of interest but beyond the scope of this publication. We see such measurements as the beginning of a new story but insufficient alone to determine mechanism. Therefore, we believe such experiments should be part of that future study.

      The description of Figure 7 lacks clarity. Additionally, it appears that TSA-seq for NADs and LADs may not be universally applicable across all cell types, particularly in flat cells, whereas DamID scores demonstrate less variation across cell lines, as also stated by the authors.

      TSA-seq is a complement to rather than a replacement for either DamID or NAD-seq. TSA-seq reports on microscopic distances whereas both DamID and NAD-seq instead are more proportional to contact frequency with the nuclear lamina or nucleoli, respectively, and insensitive to distances of loci away from the lamina or nucleoli. Thus, TSA-seq provides additional information based on the intrinsic differences in what TSA-seq measures relative to molecular proximity methods such as DamID or NAD-seq. The entire point is that the convolution of the exponential point-spread-function of the TSA-seq with the shape of the nuclear periphery allows us to distinguish genomic regions in the equatorial plane versus the top and bottom of the nuclei. The TSA-seq is therefore highly "applicable" when properly interpreted in discerning new features of genome organization. As we stated in the revised manuscript, the lamina DamID and TSA-seq are complementary and provide more information together than either method alone. The same is true for the NAD-seq and nucleolar TSA-seq comparison, as described in more detail in the Kumar, et al, 2024 paper.

      Introduction:

      The list of methodologies for mapping genomic contacts with nucleoli (NADs) should also include recent technologies, such as Nucleolar-DamID (Bersaglieri et al., PMID: 35304483), which has been validated through DNA-FISH.

      We did not include nucleolar DamID among the methods mentioned in the Introduction for identifying differential lamina versus nucleolar interactions of heterochromatin, either from our own collaborative group or from the cited reference, because we did not have confidence in the accuracy of this method in identifying NADs. In the case of the nucleolar DamID from our collaborative group, published in Wang et al, 2021, we later discovered that despite apparent agreement of the nucleolar DamID with a small number of published FISH localizations, the overall correlation of the nucleolar DamID with nucleolar localization was poor. As described in detail in the Kumar et al, 2024 publication, this poor correlation of the nucleolar DamID was established using three orthogonal methods: nucleolar TSA-seq, NAD-seq, and multiplexed immuno-FISH measurements from ~1000 genomic locations. Instead, we found that this nucleolar DamID showed high correlation with lamina DamID. We note that many strong NADs are also LADs, which we think is why validation with only several FISH probes is inadequate to demonstrate overall validation of the approach.

      We could not compare our nucleolar-DamID data in human cells with the alternative nucleolar-DamID results cited by the reviewer, which were performed in mouse cells. We note that in this paper the nucleolar DamID FISH validation only included several putative NAD chromosome regions and, we believe, one LAD region. However, our initial comparison of the nucleolar DamID cited by the reviewer with unpublished TSA-seq data from mouse ESCs produced by the Belmont laboratory and with NAD-seq data from the Kaufman laboratory shows a similar lack of correlation of the nucleolar DamID signal with nucleolar TSA-seq and NAD-seq, as well as multiplexed immuno-FISH data from the Long Cai laboratory, as we saw in our analysis of our own nucleolar DamID data in human cells.

      We have added an explanation concerning the lack of correlation of our nucleolar DamID with orthogonal measurements of nucleolar proximity in the added text (below) to our revised manuscript:

      "Nucleolar DamID instead showed broad positive peaks over large chromatin domains, largely overlapping with LADs mapped by LMNB1 DamID (Wang et al., 2021). However, this nucleolar DamID signal, while strongly correlated with lamin DamID, showed poor correlation with either NAD-seq or nucleolar distances mapped by multiplexed immuno-FISH (Kumar et al., 2024). We suspect the problem is that with molecular proximity assays the output signals are disproportionally dominated by the small fraction of target proteins juxtaposed in sufficient proximity to the DNA to produce a signal rather than the amount of protein concentrated in the target nuclear body."

      Our mention of nucleolar TSA-seq was in the context of why we focused on nucleolar TSA-seq and excluded our own nucleolar DamID. We chose not to discuss the second nucleolar DamID method cited above 1) because it was not appropriate to our discussion of our own experimental approach and 2) also because we cannot yet make a definitive statement of its accuracy for nucleolar mapping.

      Reviewer #2 (Recommendations For The Authors):

      (1) The authors start the manuscript by describing the 'radial genome organization' model and contrast it with the 'binary model' of genome organization. It would be helpful for the authors to contextualize their results a bit more with regard to these two different models in the discussion.

      We have added several sentences in the first paragraph of the Discussion to accomplish this contextualization. The new paragraph reads:

      "Here we integrated imaging with both spatial (DamID, TSA-seq) and functional (Repli-seq, RNA-seq) genomic readouts across four human cell lines. Overall, our results significantly extend previous nuclear genome organization models, while also demonstrating a cell-type dependent complexity of nuclear genome organization. Briefly, in contrast to the previous radial model of genome organization, we reveal a primary correlation of gene expression with relative distances to nuclear speckles rather than radial position. Additionally, beyond a correlation of nuclear genome organization with radial position, in cells with flat nuclei we show a pronounced correlation of nuclear genome organization with distance from the equatorial plane. In contrast to previous binary models of genome organization, we describe how both iLAD / A compartment and LAD / B compartment contain within them smaller chromosome regions with distinct biochemical and/or functional properties that segregate differentially with respect to relative distances to nuclear locales and geometry."

      (2) Data should be provided demonstrating KO of LBR and LMNA - immunoblotting for both proteins would be one approach. In addition, it would be helpful to provide additional nuclear morphology measurements of the DKO cells (volume, surface area, volume of speckles/nucleoli, number of speckles/nucleoli).

      We've added text describing the generation and validation of the KO lines:

      "To create LMNA and LBR knockout (KO) lines and the LMNA/LBR double knockout (DKO) line, we started with a parental "wt" K562 cell line, clone #17, expressing an inducible form of Cas9 (Brinkman et al., 2018). The single KO and DKO were generated by CRISPR-mediated frameshift mutation according to the procedure described previously (Schep et al., 2021). The "wt" K562 clone #17 was used for comparison with the KO clones.

      The LBR KO clone, K562 LBR-KO #19, was generated, using the LBR2 oligonucleotide GCCGATGGTGAAGTGGTAAG to produce the gRNA, and validated previously, using TIDE (Brinkman et al., 2014) to check for frameshifts in all alleles as described elsewhere (Schep et al., 2021). The LMNA/LBR DKO, K562 LBR-LMNA DKO #14, was made similarly, starting with the LBR KO line and using the combination of two oligonucleotides to produce gRNAs:

      LMNA-KO1: ACTGAGAGCAGTGCTCAGTG, LMNA-KO2: TCTCAGTGAGAAGCGCACGC.

      Additionally, the LMNA KO line, K562 LMNA-KO #14, was made the same way but starting with the "wt" K562 cell line. Validation was as described above; additionally, for the new LMNA KO and LMNA/LBR DKO lines, immunostaining showed the absence of anti-LMNA antibody signal under confocal imaging conditions used to visualize the wt LMNA staining while the RNA-seq from these clones revealed an ~20-fold reduction in LMNA RNA reads relative to the wt K562 clone."

      As suggested, we also added morphological data for the DKO line in a modified SFig.5.

      (3) The rationale for using LMNB1 TSA-seq and LMNB1 DamID is not immediately clear. The LMNB1 TSA-seq is more variable across cell types and replicates than the DamID. Could the authors please compare the datasets a bit more to understand the differences? For example, the authors demonstrate that "40-70% of the genome shows statistically significant differences in Lamina TSA-seq over regions 100 kb or larger, with most of these regions showing little or no differences in speckle TSA-seq scores." If the LMNB1 DamID data is used for this analysis or Figure 2D, is the same conclusion reached? Also, in Figure 6, the authors conclude that C1 and C3 LAD regions are enriched for constitutive LADs, while C2 and C4 LAD regions are fLADs. This is a bit surprising because the authors and others have previously shown that constitutive LADs have higher LMNB1 contact frequency than facultative LADs (Kind, et al Cell 2015, Figure 3C).

      Indeed, in the first TSA-seq paper (Chen et al, 2018) we did observe that cLADs had the highest LMNB TSA-seq scores; this was for K562 cells with round nuclei in which there is therefore no difference in lamina TSA-seq scores produced by nuclear shape over the entire nucleus.

      However, there are differences between TSA-seq and DamID in terms of what they measure and we refer the reviewer to the first TSA-seq paper (Chen et al, 2018) that explains in greater depth these differences. This first paper explains how DamID is indeed related to contact frequency but how the TSA-seq instead estimates mean distances from the target, in this case the nuclear lamina. This is because the diffusion of tyramide free radicals from the site of their constant HRP production produces an exponential decay gradient of tyramide free radical concentration at steady state.

      We have summarized these differences in text we have added to introduce both DamID and TSA-seq in the second Results section:

      "DamID is a well-established molecular proximity assay; DamID applied to the nuclear lamina divides the genome into lamina-associated domains (LADs) versus nonassociated “inter-LADs” or “iLADs” (Guelen et al., 2008; van Steensel and Belmont, 2017). In contrast, TSA-seq measures relative distances to targets on a microscopic scale corresponding to 100s of nm to ~ 1 micron based on the measured diffusion radius of tyramide-biotin free-radicals (Chen et al., 2018)... While LMNB1 DamID segments LADs most accurately, lamin TSA-seq provides distance information not provided by DamID- for example, variations in relative distances to the nuclear lamina of different iLADs and iLAD regions. These differences between the LMNB1 DamID and LMNB TSA-seq signals are also crucial to a computational approach, SPIN, that segments the genome into multiple states based on their varying nuclear localization, including biochemically and functionally distinct lamina-associated versus near-lamina states (Consortium et al., 2024; Wang et al., 2021).

      Thus, lamin DamID and TSA-seq complement each other, providing more information together than either one separately."

      We note that these differences in lamina DamID and TSA-seq are crucial to being able to gain additional information by comparing variations in the lamina TSA-seq for LADs in Figs. 6&7. See our response to point (4) below, for further explanation.

      (4) In 7B/C, the authors show that the highest LMNB1 regions in HFF are at the equator of IMR90s. However, in Figure 7G, their cLAD score indicates that constitutive LADs are not at the equator. This is a bit surprising given the point above and raises the possibility that SON signals (as opposed to LMNB1 signals) might be more responsible for correlation to localization relative to the equator. Hence, it might be helpful if the authors repeat the analyses in Figures 7B/C in regions with differing LMNB1 signals but similar SON signals (and vice versa).

      Again, this is based on the apparent assumption by the reviewer that DamID and TSA-seq work the same way and measure the same thing. But as explained above in the previous point, this is not true.

      In our first TSA-seq paper (Chen et al, 2018) we showed how we could use the exponential decay point-spread-function produced by TSA, measured directly by light microscopy, to convert sequencing reads from the TSA-seq into a predicted mean distance from nuclear speckles, approximated as point sources. These mean distances predicted from the SON TSA-seq data agreed with measured FISH distances to nuclear speckles to within ~50 nm for a set of DNA probes from different chromosome regions. Moreover, varying TSA staining conditions changed the decay constants of this exponential decay, thus producing differences in the SON TSA-seq signals. By using these different exponential decay functions to convert the TSA-seq scores from these independent data sets to estimated distances from nuclear speckles, we again observed a distance residual of ~50 nm; in this case though this distance residual of ~50 nm represented the mean residual observed genome-wide. This gives us great confidence that the TSA-seq is working as we have modeled it.
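
The conversion from score to distance described above can be sketched as follows. This is only a minimal illustration that assumes a pure exponential decay of enrichment with distance and no background term; the function names, parameter names, and numerical values are hypothetical and are not the calibration used in Chen et al, 2018.

```python
import math

def distance_to_score(distance_um, log2_score_at_target, decay_length_um):
    """Forward model: log2 enrichment expected at a given distance.

    Assumes enrichment E(d) = E0 * exp(-d / decay_length), so that
    log2 E(d) = log2 E0 - d / (decay_length * ln 2).
    """
    return log2_score_at_target - distance_um / (decay_length_um * math.log(2))

def score_to_distance(log2_score, log2_score_at_target, decay_length_um):
    """Inverse model: estimated mean distance from the target, in microns."""
    return decay_length_um * math.log(2) * (log2_score_at_target - log2_score)

# Hypothetical calibration values, chosen only for illustration.
LOG2_SCORE_AT_TARGET = 2.0
DECAY_LENGTH_UM = 0.3

score = distance_to_score(0.4, LOG2_SCORE_AT_TARGET, DECAY_LENGTH_UM)
estimate_um = score_to_distance(score, LOG2_SCORE_AT_TARGET, DECAY_LENGTH_UM)
print(f"log2 score {score:.3f} -> estimated distance {estimate_um:.3f} um")
```

In this toy model a different staining condition corresponds to a different `decay_length_um`, so the same locus yields a different score but the same estimated distance once the matching decay length is used for the inversion.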

      As we mentioned in our response to point 3 above, we did see the highest LMNB TSA-seq signal for cLADs in K562 cells with round nuclei (Chen et al, 2018).

      But as we now show in our simulation performed in this paper for Fig. 7, the observed tyramide free radical exponential decay gradient convolved with the flat nuclear lamina shape produces a higher equatorial LMNB1 TSA-seq signal for LADs at the equatorial plane. We confirmed that LADs with this higher TSA-seq signal were enriched at the equatorial plane by mining the multiplexed IMR90 imaging data. Similar mining of the multiplexed FISH IMR90 data showed localization of cLADs away from the equatorial plane.
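
The geometric effect described above can be illustrated with a toy calculation. This is not the simulation performed for Fig. 7: it assumes the lamina is a uniformly labeled oblate spheroid surface, that the signal at a locus is the surface integral of a simple exponential kernel, and that the nuclear dimensions, decay length, and locus depth (all hypothetical values) are as given in the code.

```python
import numpy as np

def lamina_signal(locus, a_um, c_um, decay_um, n_theta=400, n_phi=800):
    """Integrate exp(-r / decay) over a spheroidal lamina surface.

    The surface is x^2/a^2 + y^2/a^2 + z^2/c^2 = 1 (flat nucleus when c < a);
    r is the distance from each surface element to the locus.
    """
    # Midpoint grid in the polar (theta) and azimuthal (phi) angles.
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
    th, ph = np.meshgrid(theta, phi, indexing="ij")
    # Surface points of the spheroid.
    x = a_um * np.sin(th) * np.cos(ph)
    y = a_um * np.sin(th) * np.sin(ph)
    z = c_um * np.cos(th)
    # Area element of the parameterized surface.
    d_area = (a_um * np.sin(th)
              * np.sqrt(c_um**2 * np.sin(th)**2 + a_um**2 * np.cos(th)**2)
              * (np.pi / n_theta) * (2.0 * np.pi / n_phi))
    r = np.sqrt((x - locus[0])**2 + (y - locus[1])**2 + (z - locus[2])**2)
    return float(np.sum(np.exp(-r / decay_um) * d_area))

# Hypothetical flat nucleus: 20 um wide, 4 um tall; 0.5 um decay length.
A_UM, C_UM, DECAY_UM, DEPTH_UM = 10.0, 2.0, 0.5, 0.2

# Two loci at the same nearest distance (0.2 um) from the lamina.
signal_top = lamina_signal((0.0, 0.0, C_UM - DEPTH_UM), A_UM, C_UM, DECAY_UM)
signal_equator = lamina_signal((A_UM - DEPTH_UM, 0.0, 0.0), A_UM, C_UM, DECAY_UM)
print(f"top: {signal_top:.2f}  equator: {signal_equator:.2f}")
```

Under these assumptions the equatorial locus receives the larger signal even though both loci sit equally close to the lamina, because at the rim the lamina wraps around the locus from above, below, and the side.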

      We are not clear about the rationale for what the reviewer is suggesting about SON signals "being more responsible for correlation to localization to the equator". We have provided an explanation for the higher lamina TSA-seq scores for LADs near the equator based on the measured spreading of the tyramide free radicals convolved with the effect of the nuclear shape. This makes a prediction that the observed variation in lamina TSA-seq scores for LADs with similar DamID scores is related to their positioning relative to the equatorial plane as we then validated through our mining of the IMR90 multiplexed FISH data.

      (5) FISH of individual LADs, v-fiLADs, and p-w-v-fiLADs relative to the lamina and speckle would be helpful to understand their relative positioning in control and LBR/LMNA double KO cells. This would significantly bolster the claim that "histone mark enrichments..more precisely revealed the differential spatial distribution of LAD regions...".

      Adequately testing these predictions made from the lamina/SON TSA-seq scatterplots by direct FISH measurements would require measurements from large numbers of different chromosome regions through a highly multiplexed immuno-FISH approach. We are not equipped currently in any of our laboratories to do such measurements and we leave this therefore for future studies.

      Rather our statement is based on our use of TSA-seq analyzed through these 2D scatterplots and should be valid to the degree that our TSA-seq measurements do indeed correlate with microscopy derived distances.

      However, we do now include demonstration of a high correlation of speckle, lamina, and nucleolar TSA-seq with highly multiplexed immuno-FISH measurement of distances to these locales in a revised Fig. 7. The high correlation shown between the TSA-seq scores and measured distances does therefore add additional support to our claim that the reviewer is discussing, even without our own multiplexed FISH validation.

      (6) "In contrast, genes within genomic regions which in pair-wise comparisons of cell lines show a statistically significant difference in lamina TSA-seq show no obvious trend in their expression differences (Figure 2C).". This appears to be an overstatement based on the left panel of 2D.

      We do not follow the reviewer's point. In Fig. 2C we show little bias in the differences in gene expression between the two cell types for regions that showed differences in lamina TSA-seq. The reviewer is suggesting something otherwise based on their impression, not explicitly stated, of the left panel of Fig. 2D. But we see similar shades of blue extending vertically at low SON values and similar shades of red extending vertically at high SON values, suggesting a correlation of gene expression only with the SON TSA-seq score but not with the LMNB1 TSA-seq score displayed on the y-axis. This is also consistent with the very small and/or insignificant correlation coefficients measured in our linear model relating differences in LMNB1 TSA-seq to differences in expression but the large correlation coefficient observed for SON TSA-seq (Fig. 2E). Thus, we see Fig. 2C-E as self-consistent.

      (7) In the section on "Polarity of Nuclear Genome Organization" - "....Using the IMR90 multiplexed FISH data set [43]...." - The references are not numbered.

      We thank the reviewer for this correction.

      (8) I believe there is an error in the Figure 7B legend. The descriptions of Cluster 1 and 2 do not match those indicated in the figure.

      We again thank the reviewer for this correction.

    1. eLife Assessment

      This important study allows for a better understanding of anthelmintic drug resistance in nematodes. The authors provide a detailed analysis of the role of UBR-1 and its underlying mechanism in ivermectin resistance using convincing behavioural and genetic experiments with C. elegans. Although the authors have addressed the concerns of the reviewers, it would be prudent for the authors to disclose the Dyf phenotype in ubr-1 mutants. The authors should at the very least report the Dyf phenotype and the experiment on which they base the argument that the Dyf phenotype does not affect their conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      The drug Ivermectin is used to effectively treat a variety of worm parasites in the world; however, resistance to Ivermectin poses a rising challenge for this treatment strategy. In this study, the authors found that loss of the E3 ubiquitin ligase UBR-1 in the worm C. elegans results in resistance to Ivermectin. In particular, the authors found that ubr-1 mutants are resistant to the effects of Ivermectin on worm viability, body size, pharyngeal pumping and locomotion. The authors previously showed that loss of UBR-1 disrupts homeostasis of the amino acid and neurotransmitter glutamate resulting in increased levels of glutamate in C. elegans. Here, the authors found that the sensitivity of ubr-1 mutants to Ivermectin can be restored if glutamate levels are reduced using a variety of different methods. Conversely, treating worms with exogenous glutamate to increase glutamate levels also results in resistance to Ivermectin supporting the idea that increased glutamate promotes resistance to Ivermectin. The authors found that the primary known targets of Ivermectin, glutamate-gated chloride channels (GluCls), are downregulated in ubr-1 mutants providing a plausible mechanism for why ubr-1 mutants are resistant to Ivermectin. Although it is clear that loss of GluCls can lead to resistance to Ivermectin, this study suggests that one potential mechanism to decrease GluCl expression is via disruption of glutamate homeostasis that leads to increased glutamate. This study suggests that if parasitic worms become resistant to Ivermectin due to increased glutamate, their sensitivity to Ivermectin could be restored by reducing glutamate levels using drugs such as Ceftriaxone in a combination drug treatment strategy.

      Strengths:

      - The use of multiple independent assays (i.e., viability, body size, pharyngeal pumping, locomotion and serotonin-stimulated pharyngeal muscle activity) to monitor the effects of Ivermectin
      - The use of multiple independent approaches (got-1, eat-4, ceftriaxone drug, exogenous glutamate treatment) to alter glutamate levels to support the conclusion that increased glutamate in ubr-1 mutants contributes to Ivermectin resistance

      Weaknesses:

      - The primary target of Ivermectin is GluCls so it is not surprising that alteration of GluCl expression or function would lead to Ivermectin resistance
      - It remains to be seen what percent of Ivermectin resistant parasites in the wild have disrupted glutamate homeostasis as opposed to mutations that more directly decrease GluCl expression or function.

      Comments on revisions: All my concerns have been addressed by the authors.

    3. Reviewer #2 (Public review):

      Summary:

      The authors provide a very thorough investigation on the role of UBR-1 in anthelmintic resistance using the non-parasitic nematode, C. elegans. Anthelmintic resistance to macrocyclic lactones is a major problem in veterinary medicine and likely just a matter of time until resistance emerges in human parasites too. Therefore, this study providing novel insight into the mechanisms of ivermectin resistance is particularly important and significant.

      Strengths:

      The authors use very diverse technologies (behavior, genetics, pharmacology, genetically encoded reporters) to dissect the role of UBR-1 in ivermectin resistance. Deploying such a comprehensive suite of tools and approaches provides exceptional insight into the mechanism of how UBR-1 functions in terms of ivermectin resistance.

      Weaknesses:

      I do not see any major weaknesses in this study. My only concern is whether the observations made by the authors would translate to any of the important parasitic helminths in which resistance has naturally emerged in the field. This is always a concern when leveraging a non-parasitic nematode to shed light on a potential mechanism of resistance of parasitic nematodes, and I understand that it is likely beyond the scope of this paper to test some of their results in parasitic nematodes.

      Comments on revisions: The authors have now addressed all my concerns.

    4. Reviewer #3 (Public review):

      Summary:

      Li et al propose to better understand the mechanisms of drug resistance in nematode parasites by studying mutants of the model roundworm C. elegans that are resistant to the deworming drug ivermectin. They provide compelling evidence that loss-of-function mutations in the E3 ubiquitin ligase encoded by the UBR-1 gene make worms resistant to the effects of ivermectin (and related compounds) on viability, body size, pharyngeal pumping rate, and locomotion and that these mutant phenotypes are rescued by a UBR-1 transgene. They propose that the mechanism of resistance is indirect, via the effects of UBR-1 on glutamate production. They show mutations (vesicular glutamate transporter eat-4, glutamate synthase got-1) and drugs (glutamate, glutamate uptake enhancer ceftriaxone) affecting glutamate metabolism/transport modulate sensitivity to ivermectin in wild type and ubr-1 mutants. The data are generally consistent with greater glutamate tone equating to ivermectin resistance. Finally, they show that manipulations that are expected to increase glutamate tone appear to reduce expression of the targets of ivermectin, the glutamate-gated chloride channels, which is known to increase resistance.

      There is a need for genetic markers of ivermectin resistance in livestock parasites that can be used to better track resistance and to tailor drug treatment. The discovery of UBR-1 as a resistance gene in C. elegans will provide a candidate marker that can be followed up in parasites. The data suggest Ceftriaxone would be a candidate compound to reverse resistance.

      Strengths:

      The strength of the study is the thoroughness of the analysis and the quality of the data. There can be little doubt that ubr-1 mutations do indeed confer ivermectin resistance. The use of both rescue constructs and RNAi to validate mutant phenotypes is notable. Further, the variety of manipulations they use to affect glutamate metabolism/transport makes a compelling argument for some kind of role for glutamate in resistance.

      Weaknesses:

      The use of single ivermectin dose assays can be misleading. A response change at a single dose shows that the dose-response curve has shifted, but the response is not linear with dose, so the degree of that shift may be difficult to discern and may result from a change in slope but not EC50.
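      The point about single-dose assays can be illustrated with a minimal sketch, assuming a Hill-type dose-response curve. The function names, the test dose, and every parameter value below are hypothetical and are not taken from the study; the sketch only shows that a slope change with no EC50 shift and an EC50 shift with no slope change can give the same response at one dose.

```python
# Toy illustration only: all parameter values are hypothetical,
# not measurements from the study under review.

def hill_survival(dose, ec50, slope):
    """Fraction of animals unaffected at `dose` for a Hill-type inhibition curve."""
    return 1.0 / (1.0 + (dose / ec50) ** slope)


def ec50_for_target(dose, slope, target):
    """EC50 that gives `target` survival at `dose` for a fixed Hill slope."""
    return dose / ((1.0 / target - 1.0) ** (1.0 / slope))


TEST_DOSE = 2.0  # the single assay dose, in arbitrary units

# Hypothetical wild-type curve.
WILD_TYPE = {"ec50": 1.0, "slope": 4.0}

# Hypothetical mutant A: same EC50 as wild type, shallower slope.
SLOPE_MUTANT = {"ec50": 1.0, "slope": 1.0}

# Hypothetical mutant B: wild-type slope, with the EC50 shifted just enough
# to match mutant A at TEST_DOSE.
SHIFT_MUTANT = {
    "ec50": ec50_for_target(
        TEST_DOSE, WILD_TYPE["slope"], hill_survival(TEST_DOSE, **SLOPE_MUTANT)
    ),
    "slope": WILD_TYPE["slope"],
}

if __name__ == "__main__":
    doses = (0.5, 1.0, 2.0, 4.0)
    curves = [
        ("wild type", WILD_TYPE),
        ("slope change only", SLOPE_MUTANT),
        ("EC50 shift only", SHIFT_MUTANT),
    ]
    for name, params in curves:
        row = ", ".join(f"{hill_survival(d, **params):.3f}" for d in doses)
        print(f"{name:>18}: {row}")
```

      At the test dose the two hypothetical mutants are indistinguishable from each other, although only one of them has a shifted EC50; they separate only when additional doses are measured, which is why a full dose-response curve is more informative than a single dose.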

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The drug Ivermectin is used to effectively treat a variety of worm parasites in the world; however, resistance to Ivermectin poses a rising challenge for this treatment strategy. In this study, the authors found that loss of the E3 ubiquitin ligase UBR-1 in the worm C. elegans results in resistance to Ivermectin. In particular, the authors found that ubr-1 mutants are resistant to the effects of Ivermectin on worm viability, body size, pharyngeal pumping, and locomotion. The authors previously showed that loss of UBR-1 disrupts homeostasis of the amino acid and neurotransmitter glutamate resulting in increased levels of glutamate in C. elegans. Here, the authors found that the sensitivity of ubr-1 mutants to Ivermectin can be restored if glutamate levels are reduced using a variety of different methods. Conversely, treating worms with exogenous glutamate to increase glutamate levels also results in resistance to Ivermectin supporting the idea that increased glutamate promotes resistance to Ivermectin. The authors found that the primary known targets of Ivermectin, glutamate-gated chloride channels (GluCls), are downregulated in ubr-1 mutants providing a plausible mechanism for why ubr-1 mutants are resistant to Ivermectin. Although it is clear that loss of GluCls can lead to resistance to Ivermectin, this study suggests that one potential mechanism to decrease GluCl expression is via disruption of glutamate homeostasis that leads to increased glutamate. This study suggests that if parasitic worms become resistant to Ivermectin due to increased glutamate, their sensitivity to Ivermectin could be restored by reducing glutamate levels using drugs such as Ceftriaxone in a combination drug treatment strategy.

      Strengths:

      (1) The use of multiple independent assays (i.e., viability, body size, pharyngeal pumping, locomotion, and serotonin-stimulated pharyngeal muscle activity) to monitor the effects of Ivermectin

      (2) The use of multiple independent approaches (got-1, eat-4, ceftriaxone drug, exogenous glutamate treatment) to alter glutamate levels to support the conclusion that increased glutamate in ubr-1 mutants contributes to Ivermectin resistance.

      Weaknesses:

      (1) The primary target of Ivermectin is GluCls so it is not surprising that alteration of GluCl expression or function would lead to Ivermectin resistance.

      (2) It remains to be seen what percent of Ivermectin-resistant parasites in the wild have disrupted glutamate homeostasis as opposed to mutations that more directly decrease GluCl expression or function.

      Thank you for your thoughtful and constructive comments. We completely agree with your observation that alterations in GluCl expression or function can lead to Ivermectin resistance. However, we would like to emphasize that our study highlights an additional mechanism: disruptions in glutamate homeostasis can also lead to decreased GluCl expression, thereby contributing to Ivermectin resistance. This mechanism, which has not been fully explored previously, offers new insights into the complexity of drug resistance and could have important implications for understanding the development of Ivermectin resistance in parasitic nematodes.

      As you pointed out, the role of disrupted glutamate homeostasis in wild parasitic populations and the proportion of resistant parasites with this mechanism remain unknown. We believe this uncertainty underlines the significance of our findings, as they suggest a novel avenue for studying Ivermectin resistance and for developing potential strategies to counteract it.

      We have incorporated this discussion into the revised manuscript to further enrich the context of our findings.

      Reviewer #2 (Public review):

      Summary:

      The authors provide a very thorough investigation of the role of UBR-1 in anthelmintic resistance using the non-parasitic nematode, C. elegans. Anthelmintic resistance to macrocyclic lactones is a major problem in veterinary medicine and likely just a matter of time until resistance emerges in human parasites too. Therefore, this study providing novel insight into the mechanisms of ivermectin resistance is particularly important and significant.

      Strengths:

      The authors use very diverse technologies (behavior, genetics, pharmacology, genetically encoded reporters) to dissect the role of UBR-1 in ivermectin resistance. Deploying such a comprehensive suite of tools and approaches provides exceptional insight into the mechanism of how UBR-1 functions in terms of ivermectin resistance.

      Weaknesses:

      I do not see any major weaknesses in this study. My only concern is whether the observations made by the authors would translate to any of the important parasitic helminths in which resistance has naturally emerged in the field. This is always a concern when leveraging a non-parasitic nematode to shed light on a potential mechanism of resistance of parasitic nematodes, and I understand that it is likely beyond the scope of this paper to test some of their results in parasitic nematodes.

      Thank you for your kind words and positive feedback on our work. We greatly appreciate your acknowledgment of the diverse technologies and comprehensive approaches we utilized to uncover the role of UBR-1 in ivermectin resistance.

      Your concern about whether our findings in C. elegans translate to parasitic helminths in which ivermectin resistance has naturally emerged is both valid and critical. This is indeed a key question we expect to address in future studies. Collaborating with parasitologists to investigate whether naturally occurring mutations in ubr-1 exist in parasitic and non-parasitic nematodes is a priority for us. We hope that these efforts will lead to meaningful discoveries that have a significant impact on both livestock management and medicine.

      Reviewer #3 (Public review):

      Summary:

      Li et al propose to better understand the mechanisms of drug resistance in nematode parasites by studying mutants of the model roundworm C. elegans that are resistant to the deworming drug ivermectin. They provide compelling evidence that loss-of-function mutations in the E3 ubiquitin ligase encoded by the UBR-1 gene make worms resistant to the effects of ivermectin (and related compounds) on viability, body size, pharyngeal pumping rate, and locomotion and that these mutant phenotypes are rescued by a UBR-1 transgene. They propose that the mechanism of resistance is indirect, via the effects of UBR-1 on glutamate production. They show mutations (vesicular glutamate transporter eat-4, glutamate synthase got-1) and drugs (glutamate, glutamate uptake enhancer ceftriaxone) affecting glutamate metabolism/transport modulate sensitivity to ivermectin in wild-type and ubr-1 mutants. The data are generally consistent with greater glutamate tone equating to ivermectin resistance. Finally, they show that manipulations that are expected to increase glutamate tone appear to reduce expression of the targets of ivermectin, the glutamate-gated chloride channels, which is known to increase resistance.

      There is a need for genetic markers of ivermectin resistance in livestock parasites that can be used to better track resistance and to tailor drug treatment. The discovery of UBR-1 as a resistance gene in C. elegans will provide a candidate marker that can be followed up in parasites. The data suggest Ceftriaxone would be a candidate compound to reverse resistance.

      Strengths:

      The strength of the study is the thoroughness of the analysis and the quality of the data. There can be little doubt that ubr-1 mutations do indeed confer ivermectin resistance. The use of both rescue constructs and RNAi to validate mutant phenotypes is notable. Further, the variety of manipulations they use to affect glutamate metabolism/transport makes a compelling argument for some kind of role for glutamate in resistance.

      Weaknesses:

      The proposed mechanism of ubr-1 resistance i.e.: UBR-1 E3 ligase regulates glutamate tone which regulates ivermectin receptor expression, is broadly consistent with the data but somewhat difficult to reconcile with the specific functions of the genes regulating glutamatergic tone. Ceftriaxone and eat-4 mutants reduce extracellular/synaptic glutamate concentrations by sequestering available glutamate in neurons, suggesting that it is extracellular glutamate that is important. But then why does rescuing ubr-1 specifically in the pharyngeal muscle have such a strong effect on ivermectin sensitivity? Is glutamate leaking out of the pharyngeal muscle into the extracellular space/synapse? Is it possible that UBR-1 acts directly on the avr-15 subunit, both of which are expressed in the muscle, perhaps as part of a glutamate sensing/homeostasis mechanism?

      Thank you for your insightful feedback and thought-provoking questions. These are excellent points that have prompted us to critically reconsider our findings and the proposed mechanism.

      Several potential explanations could be considered, although we currently lack direct evidence to support any of these hypotheses: (1) The pharynx likely plays a dominant role in ivermectin resistance, as previously reported (Dent et al., 1997; Dent et al., 2000), and overexpression of UBR-1 in the pharyngeal muscle may exhibit a strong effect on ivermectin sensitivity. (2) It is also possible that pharyngeal muscle cells have the capacity to release glutamate into the extracellular space, which could contribute to the observed effect. (3) Alternatively, UBR-1 expression in the pharyngeal muscle may regulate other indirect pathways affecting extracellular or synaptic glutamate concentrations.

      We also appreciate your suggestion that UBR-1 may act directly on AVR-15 in the pharynx. While this is an interesting possibility, UBR-1 is an E3 ubiquitin ligase, and if AVR-15 were a direct target, we would expect UBR-1 to ubiquitinate AVR-15 and promote its degradation. In this case, loss of UBR-1 should inhibit AVR-15 ubiquitination, reduce its degradation, and lead to increased AVR-15 protein levels in the pharynx. However, our experimental data show a reduction, rather than an increase, in AVR-15::GFP levels in ubr-1 mutants (Figure 4A). This observation suggests that AVR-15 is less likely to be a direct target of UBR-1. To definitively address this hypothesis, a direct assessment of AVR-15 ubiquitination levels in wild-type and ubr-1 mutant backgrounds would be needed. We agree that this is an important avenue for future investigation.

      The use of single ivermectin dose assays can be misleading. A response change at a single dose shows that the dose-response curve has shifted, but the response is not linear with dose, so the degree of that shift may be difficult to discern and may result from a change in slope but not EC50. Similarly, in Figure 3C, the reader is meant to understand that eat-4 mutant is epistatic to ubr-1 because the double mutant has a wild-type response to ivermectin. But eat-4 alone is more sensitive, so (eyeballing it and interpolating) the shift in EC50 caused by the ubr-1 mutant in a wild type background appears to be the same as in an eat-4 background, so arguably you are seeing an additive effect, not epistasis. For the above reasons, it would be desirable to have results for rescuing constructs in a wild-type background, in addition to the mutant background.

      Thank you for your detailed feedback and observations.

      The potential additive effect you noted in Figure 3C appears to be specific to the body length analysis. In our other three ivermectin resistance assays (viability, pumping rate, and locomotion velocity), this additive effect was not observed. A possible explanation for this is that eat-4 and got-1 single mutants inherently exhibit reduced body length compared to wild-type worms (Mörck and Pilon 2006; Greer et al. 2008; Chitturi et al. 2018), which may give the appearance of an additive effect in this particular assay.

      Regarding the use of rescuing constructs, we performed these experiments in the ubr-1;got-1 and ubr-1;eat-4 double mutant backgrounds. This was designed to test whether the suppression of ubr-1-mediated ivermectin resistance by got-1 or eat-4 mutations is indeed due to the functional activity of GOT-1 and EAT-4, respectively. The choice of this setup was to ensure that the double mutant phenotype was fully addressed. In contrast, rescuing constructs of GOT-1 and/or EAT-4 in a wild-type background might not sufficiently reveal the relationship between GOT-1, EAT-4, and UBR-1. However, we are open to further testing your suggestion in the future.

      To aid interpretation and clarify the apparent effects, we have revised the Figure 3 annotation to clearly represent the data and the comparisons being made. We hope this adjustment makes the results more straightforward and easier for readers to understand.

      The added value of the pumping data in Figure 5 (using calcium imaging) over the pump counts (from video) in Figure 1G, Figure 2E, F, K, & Figure 3D, H is not clearly explained. It may have to do with the use of "dissected" pharynxes, the nature/advantage of which is not sufficiently documented in the Methods/Results.

      Thank you for pointing this out. The behavioral pumping data in Figure 1G, Figure 2E, F, K, & Figure 3D and calcium imaging data in Figure 5 were obtained under different experimental conditions. Specifically, the behavioral assays (pumping rate) were conducted on standard culture plates with freely moving worms, whereas the calcium imaging experiments were performed in a liquid environment with immobilized worms. In the calcium imaging setup, the dissection refers to gently puncturing the epidermis behind the head of the worm with a glass electrode to relieve internal pressure, which aids in stabilizing the calcium imaging process and ensures better visualization of pharyngeal muscle activity.

      We compared the pharyngeal muscle activity of worms that were not subjected to puncturing the epidermis and found no significant difference when activated by 20 mM serotonin. Therefore, we speculate that there is no direct interaction between the bath solution and the pharynx or head neurons. To avoid confusion, we have removed the term "dissected" from the manuscript and added additional experimental details in the Methods section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The authors propose that ubr-1 mutants are resistant to ivermectin due to persistent elevation of glutamate that leads to a compensatory reduction in GluCl levels and thus resistance to Ivermectin. This model would be strengthened by experiments more directly connecting glutamate, GluCls and Ivermectin sensitivity. For example, does overexpression of a relevant GluCl such as AVR-15 restore Ivermectin sensitivity to ubr-1 mutants? Does Ceftriaxone treatment affect the Ivermectin resistance of worms lacking the relevant GluCls (i.e., avr-15, avr-14 and glc-1)? - The model suggests that Ceftriaxone treatment would have no effect in the latter case.

      Thank you for your valuable suggestion. Based on your recommendation, we have performed two additional experiments to strengthen our model. First, we conducted an overexpression experiment of AVR-15 and found that it significantly, though partially, restored ivermectin sensitivity in ubr-1 mutants (p < 0.01, Supplemental Figure S5D). Second, we tested the effect of Ceftriaxone treatment on the IVM resistance of avr-15; avr-14; glc-1 triple mutants, which encode the most critical glutamate receptors involved in IVM sensitivity. As expected, we found that Ceftriaxone treatment did not alter the IVM resistance in these triple mutants (Supplemental Figure S5E), supporting the idea that these specific GluCls are key to the observed resistance.

      These two experiments provide further support for our proposed model. We have integrated the results into the manuscript, updating the Results section and Supplemental Figure S5D, E, as well as the corresponding Figure Legends.

      (2) Line 211 - Ceftriaxone is known to upregulate EAAT2 expression in mammals. Do the authors know if the drug also increases EAAT expression in C. elegans?

      Thank you for raising this point. To our knowledge, this is the first study to demonstrate the antagonistic effect of ceftriaxone on ivermectin resistance in C. elegans, particularly in the context of ubr-1-mediated resistance. Ceftriaxone enhances glutamate uptake by increasing the expression of excitatory amino acid transporter-2 (EAAT2) in mammals (Rothstein et al., 2005, Lee et al., 2008). C. elegans has six glutamate transporters encoded by glt-1 and glt-3–7 (Mano et al. 2007).

      Compared to testing whether ceftriaxone increases the expression of these EAATs in C. elegans, identifying which specific glt gene is targeted by ceftriaxone may better reveal its mechanism of action. To investigate this, we performed a genetic analysis. In the ubr-1 mutant, we individually deleted the six glt genes and found that ceftriaxone’s ability to restore ivermectin sensitivity was specifically suppressed in the ubr-1; glt-1 and ubr-1; glt-5 double mutants (Author response image 1A). This suggests that glt-1 and glt-5 may be the targets of ceftriaxone in C. elegans. In contrast, ivermectin sensitivity was unaffected in the individual glt mutants (Author response image 1B), indicating that a single glt deletion may not be sufficient to alter glutamate levels or induce downregulation of GluRs. Further studies are needed to determine whether ceftriaxone directly increases GLT-1 and GLT-5 expression in C. elegans and to explore the underlying mechanisms.

      Author response image 1.

      Glutamate transporter removal inhibits ceftriaxone-mediated restoration of ivermectin sensitivity in ubr-1. (A) Compared to the ubr-1 mutants, the ubr-1; glt-1 and ubr-1; glt-5 double mutants show enhanced ivermectin resistance under ceftriaxone treatment. (B) The glt mutants do not show resistance to ivermectin. ****p < 0.0001; one-way ANOVA test.

      (3) Line 64 - as part of the rationale for the study, the authors state that "...increasing reports of unknown causes of IVM resistance continue to emerge...suggesting that additional unknown mechanisms are awaiting investigation." While this may be true, the ultimate conclusion from this study is that decreasing expression of Ivermectin-targeted GluCls causes Ivermectin resistance, which is a known mechanism. The field already knows that Ivermectin targets GluCls and thus decreasing GluCl expression or function would lead to Ivermectin resistance. The authors may want to edit the sentence mentioned above for clarity.

      Thanks for the suggestion. We have revised the sentence for clarity: “…, suggesting that previously unrecognized or additional mechanisms regulating GluCls expression may await further investigation.” This revision better reflects the focus on GluCl regulation and clarifies the potential for additional mechanisms to be explored.

      (4) The introduction to the serotonin-stimulated pharyngeal Calcium imaging section is a little confusing. The role of the various GluCls in pharyngeal pumping should be defined/clarified in the introduction to the last section (lines 337-341).

      Thanks. We have revised and clarified the introduction as follows: “GluCls downregulation was functionally validated by the diminished IVM-mediated inhibition of serotonin-activated pharyngeal Ca2+ activity observed in ubr-1 mutants.”

      Additionally, the role of the various GluCls in pharyngeal pumping has been clarified:

      “Using translational reporters, we found that IVM resistance in ubr-1 mutants is caused by the functional downregulation of IVM-targeted GluCls, including AVR-15, AVR-14, and GLC-1. These receptors are activated by glutamate to facilitate chloride ion influx into pharyngeal muscle cells, resulting in the inhibition of muscle contractions and the suppression of food intake in C. elegans.”

      We hope these revisions address the concerns raised and improve the clarity of this section.

      (5) The color code key on the right-hand side of the Raster Plots in Figure 1H should be made larger for clarity.

      Revised.

      (6) In Figure S3, a legend should be included to define the black and blue box plots.

      Thank you for your comment. We have added the following clarification to the figure legend: “Black plots: wild-type, blue plots: ubr-1 mutants.” This should now make the distinction between the two groups clear.

      (7) Figure S4, the brackets above the graphs are misleading. It is not clear which comparisons are being made.

      Thank you for your feedback. We have clarified the figure by updating the legend to include the statement: “All statistical analyses were performed against the ubr-1 mutant.” This clarification is now also included in Figure 3F-I to ensure consistency and avoid any confusion regarding the comparisons being made.

      Reviewer #2 (Recommendations for the authors):

      (1) In Figure 1A: the "trails" table needs more clarification to orient the reader.

      To improve clarity and better orient the reader, we have updated Figure 1A by explicitly adding the number of trials and including a statistical analysis of the viability of wild-type and ubr-1 mutants under different ML conditions. In the Figure 1A legend, we have added: “we used shades of red to represent worm viability on each experimental plate (n = 50 animals per plate), with darker shades indicating lower survival rates. The viability test was repeated at least 5 times (5 trials).” These modifications aim to provide a clearer understanding of the data presentation and its significance.

      (2) In Figure S2: it would benefit the reader to include the major human parasitic nematodes in the phylogeny and include a discussion of the conservation.

      Thank you for your insightful comment. In Figure S2A, we have included the human parasitic nematodes Onchocerca volvulus, Brugia malayi, and Toxocara canis. Unfortunately, other major human parasitic nematodes, such as Ascaris lumbricoides (roundworm), Ancylostoma duodenale (hookworm), and Trichuris trichiura (whipworm), currently lack reported homologs of the ubr-1 gene.

      To provide some context, Onchocerca volvulus is a leading cause of infectious blindness globally, affecting millions of people, while Brugia malayi causes lymphatic filariasis, a significant tropical disease. Toxocara canis is a zoonotic parasite responsible for serious human syndromes such as visceral and ocular larval migration. Ivermectin remains a primary treatment for these parasitic infections.

      Interestingly, while we have identified relevant sequences in Onchocerca volvulus, Brugia malayi, and Toxocara canis, potential mutations in ubr-1-like genes in these parasitic nematodes may lead to ivermectin resistance. Sequence comparison analysis could shed light on the risks of such mutations and their relevance to ivermectin treatment failure, warranting further attention. We have added a discussion of this potential risk in the manuscript.

      Reviewer #3 (Recommendations for the authors):

      Minor corrections/suggestions:

      (1) The level of resistance in ubr-1 is similar to dyf genes. Should double-check ubr-1 mutant is not dyf.

      Thank you for your insightful suggestion. We were also interested in this point and designed the following experiments. We first tested the Dyf phenotype of ubr-1 directly using standard DIO dye staining (Author response image 2A) and found that ubr-1 clearly shows a "dye filling defective" phenotype (Author response image 2B). This raises an interesting question: could the IVM resistance observed in ubr-1 be due to its Dyf defect? To address this, we used Ceftriaxone, which fully rescues the sensitivity of ubr-1 to IVM (Figure 2). If the IVM resistance observed in ubr-1 were due to its Dyf defect, Ceftriaxone should rescue the Dyf defect as well. After treating ubr-1 mutants with Ceftriaxone (50 μg/mL) until the L4 stage, we again performed DIO dye staining and found that while Ceftriaxone fully rescued IVM resistance in ubr-1, it did not rescue the Dyf defect (Author response image 2C). These results suggest that although ubr-1 has a Dyf defect, this defect is unlikely to be the primary cause of the IVM resistance in the ubr-1 mutant.

      Author response image 2.

      ubr-1 mutant is not dyf. (A) Depiction of the DIO dye-staining assays. Diagram is adapted from (Power et al. 2020). (B) ubr-1 mutant exhibits obvious Dyf phenotype. (C) Cef treatment (50 μg/mL) does not alter the ubr-1 Dyf defect phenotype. Scale bar, 20 µm.

      (2) 367 "in IVM" superscript.

      (3) 429 ubr-1 italics.

      Thanks, revised.

      (4) Methods: Need more info on dissection: if there is direct interaction of bath with pharynx, as suggested by bath solution, then 5HT concentrations are too high. Direct exposure to 20mM 5HT will kill a pharynx. 20uM 5HT?

      Thank you for your comment. We have reviewed our experimental records and confirmed that the concentration mentioned in the manuscript is correct. In our experiment, the dissection refers to gently puncturing the epidermis behind the head of the worm with a glass electrode to relieve internal pressure, which helps stabilize the calcium imaging process. There is no direct interaction between the bath solution and the pharynx or head neurons. We have revised the Methods section to clarify this point.

      (5) Figure 2: Meaning of "Trials" arrow on grid y-axis is not immediately obvious to me. Would prefer you just label/number individual trials.

      Sure, we have labeled the trials accordingly in revised Figures 1 and 2 and Figure S1.

      (6) Figure 3: Legend should include [IVM]. Meaning of +EAT-4, +GOT-1 should be described in the legend.

      Thank you for your suggestion. We have updated the figure legend to include the IVM concentration (5 ng/mL). Additionally, we have clarified the meaning of +EAT-4 and +GOT-1 in the legend with the description: “…whereas the re-expression of GOT-1 (+GOT-1) and EAT-4 (+EAT-4) partially reinstated IVM resistance in the respective double mutants.” This ensures the figure is more informative and accessible to the reader.

      (7) 784 signalling pathway should just be pathway.

      Thanks, revised.

      (8) Line 811 " Both types of motor neurons are innervated by serotonin (5 -HT)." Innervated by serotonergic "neurons"? However, even that is misleading because serotonin is not necessarily synaptic.

      Thank you for your comment. We have revised the sentence to: “Both types of motor neurons could be activated by serotonin (5-HT).” This clarification better reflects the role of serotonin in modulating motor neuron activity.

      (9) Line 814 puffing or perfusion. Perfusion seems more accurate. Make the figure consistent.

      Thanks, revised.

      (10) Figure S1 requires an x axis label with better explanation.

      Thank you for your feedback. We have revised Figure S1 and added an x-axis label to clarify that it represents the trial number. Additionally, we have updated the figure legend to include the experimental conditions: “The shades of red represent worm viability, with darker shades indicating lower survival rates, based on 100 animals per plate and at least 5 trials.”

      (11) Figure S2 C-F needs ivermectin concentration.

      (12) Line 865 plants -> plates?

      Thanks, revised.

      (13) Figure S4. 875 "Rescue of IVM sensitivity of the ubr-1 mutant by the UBR-1 genomic fragment." Wrong title? Describes GFP expression and RNAi experiments.

      Thank you for pointing out the mistake in the title. We have revised the title to: “Knockdown of UBR-1 induces IVM resistance phenotypes.” Additionally, we have updated the figure description to include details about GFP expression and RNAi experiments. The GFP expression is now described as: “Expression of functional UBR-1::GFP, driven by its endogenous promoter, was observed predominantly in the pharynx, head neurons, and body wall muscles with weaker expression detected in vulval muscles and the intestine.” The RNAi experiments are described as: “Double-stranded RNA (dsRNA) interference was employed to suppress gene expression in specific tissues (Methods).”

    1. eLife Assessment

      This manuscript describes a resource detailing the reconstitution of Holothuria glaberrima gut following self-evisceration in response to a potassium chloride injection, using scRNAseq and fluorescent RNA localization in situ. It provides some new findings about organ regeneration, as well as the origins of pluripotent cells, and places these findings in the context of regeneration across species. The paper's schematic model and HCR images are a valuable foundation for future work. The authors provide convincing RNA localization images to validate their data and to provide spatial context. These validation experiments are of good quality but remain challenging to connect to the complex spatial organization of complex tissues. This resource will be of interest to the field of regeneration, particularly in invertebrates, but also in comparative studies in other species, including evolutionary studies.

    2. Reviewer #1 (Public review):

      Summary:

      Joshua G. Medina-Feliciano et al. investigated the single-cell transcriptomic profile of the holothurian regenerating intestine following evisceration, a process used to expel their viscera in response to predation. Using single-cell RNA-Sequencing and standard analyses such as "Find cluster markers", "Enrichment analysis of Gene Ontology" and "RNA velocity", they identify 13 cell clusters and their potential cell identity. Based on bioinformatic analysis they identified potentially proliferating clusters and potential trajectories of cell differentiation. This manuscript represents a useful dataset that can provide candidate cell types and cell markers for more in-depth functional analysis of holothurian intestine regeneration.

      The conclusions of this paper are supported only by bioinformatic analyses since the in vivo validation through HCR is not sufficient to support them.

      Strengths:

      - The Authors are providing a single-cell dataset obtained from sea cucumbers regenerating their intestines. This represents the first fundamental step to an unbiased approach to better understand this regeneration process and the cellular dynamics taking part in it.

      - The Authors run all the standard analyses providing the reader with a well digested set of information about cell clusters, potential cell types, potential functions and potential cell differentiation trajectories.

      Weaknesses:

      - The Authors frequently report the percentage of cells with a specific feature (either labelled or expressing a certain gene or belonging to a certain cluster). This number can be misleading since that is calculated after cell dissociation and additional procedures (such as staining or sequencing and dataset cleanup) that can heavily bias the ratio between cell types. Similarly, the Authors cannot compare cell percentage between anlage and mesentery samples since that can be affected by technical aspects related to cell dissociation, tissue composition and sequencing depth.

      - The Authors did not validate all the clusters.

      - There is no validation of the trajectory analysis and there is no validation of the proliferating cluster with H3P or EdU co-labeling.
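
      The first weakness above, that cell percentages measured after dissociation can misrepresent tissue composition, can be illustrated with a small arithmetic sketch. All numbers and cell-type names below are hypothetical and are not taken from the manuscript; `observed_fractions` is an illustrative helper, not part of any analysis pipeline.

```python
def observed_fractions(true_counts, recovery):
    """Return the fraction of each cell type seen after dissociation.

    true_counts: cells of each type present in the tissue (hypothetical).
    recovery: probability that a cell of that type survives dissociation,
              staining, sequencing and dataset cleanup (hypothetical).
    """
    recovered = {t: true_counts[t] * recovery[t] for t in true_counts}
    total = sum(recovered.values())
    return {t: n / total for t, n in recovered.items()}


# Two cell types that are equally abundant in the tissue (50% each) ...
true_counts = {"epithelial": 500, "muscle": 500}
# ... but that survive the protocol at very different rates.
recovery = {"epithelial": 0.8, "muscle": 0.2}

fractions = observed_fractions(true_counts, recovery)
for cell_type, fraction in fractions.items():
    print(f"{cell_type}: {fraction:.0%} of recovered cells")
```

      Under these assumed recovery rates, a tissue that is half muscle would appear to be only one-fifth muscle in the dataset, which is why percentages from dissociated samples are hard to compare across tissues.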

    3. Reviewer #2 (Public review):

      Summary:

      This research offers a comprehensive analysis of the regenerative process in sea cucumbers and builds upon decades of previous research. The approach involves a detailed examination using single-cell sequencing, making it a crucial reference paper while shedding new light on regeneration in this organism.

      Strengths:

      Detailed analysis of single-cell sequencing data and high-quality RNA localization images provide significant new insights into regeneration in sea cucumbers and, more broadly, in animals. Identifying a proliferating cluster of cells is very interesting and may open avenues to identify the cell lineage history and deeper molecular properties of the cells that regenerate the intestine.

      Weaknesses:

      The spatial context of the RNA localization images is challenging to interpret in this spatially complex tissue organization. Although the authors have taken care to perform RNA localization staining, it is still challenging to relate these data to their schematic model. This is only a minor weakness that will almost certainly be clarified by future work from the authors as they follow up on findings.

    4. Reviewer #3 (Public review):

      Summary:

      The authors have done a good job at creating a "resource" paper for the study of gut regeneration in sea cucumbers. They present a single-cell RNAseq atlas for the reconstitution of Holothuria glaberrima gut following self-evisceration in response to a potassium chloride injection. The authors provide data characterizing cellular populations and precursors of the regenerating anlage at 9 days post evisceration. As a "Tools and Resources" contribution to eLife, this work, with some revisions, could be appropriate. It will be impactful in the fields of regeneration, particularly in invertebrates, but also in comparative studies in other species, including evolutionary studies. Some of these comparative studies could extend to vertebrates and could therefore impact regenerative medicine in the future.

      Strengths:

      • Novel and useful information for a model organism and question for which this type of data has not yet been reported

      • Single-cell gene expression data will be valuable for developing testable hypotheses in the future

      • Marker genes for cell types provided to the field

      • Interesting predictions about possible lineage relationships between cells during sea cucumber gut regeneration

      • Authors have done a good job in the revision of making sure not to overstate the lineage claims in absence of definitive lineage-tracing experiments

      • Authors have improved the figures and the overall readability of the figures and text

      Specific questions:

      - Is there any way to systematically compare these cells to evolutionarily-diverged cells in distant relatives to sea cucumbers? Or even on a case-by-case basis? For example, is there evidence for any of these transitory cell types to have correlate(s) in vertebrate gut regeneration?

      • Authors acknowledged this would be interesting and important, but they say in the response document this is outside the scope of the current manuscript and more data would be needed to do this well.

      - Line 808: The authors may make a more accurate conclusion by saying that the characteristics are similar to blastemas, or that it behaves like a blastema, rather than that it is a blastema. There is ambiguity about the meaning of this term in the field, but most researchers seem to currently have in mind that the "blastema" definition includes a discrete spatial organization of cells, and here these cells are much more spread out. This could be a good opportunity for the authors to engage in this dialogue, perhaps parsing out the nuances of what a "blastema" is, what the term has traditionally referred to, and how we might consider updating this term or at least re-framing the terminology to be inclusive of functions that "blastemas" have traditionally had in the literature and how they may be dispersed over geographical space in an organism more so than the more rigid, geographically-restricted definition many researchers have in mind. However, if the authors choose to elaborate on these issues, those elaborations do belong in the discussion, and the more provisional terminology we mention here could be used throughout the paper until that element of the revised discussion is presented. We would welcome the authors to do this as a way to point the field in this direction as this is also how we view the matter. For example, some of the genes whose expression has been observed to be enriched following removal of brain tissue in axolotls (such as kazald2, Lust et al.), are also upregulated in traditional blastemas, for instance, in the limb, but we appreciate that the expression domain may not be as localized as in a limb blastema.

      Additionally, since there is now evidence that some aspects of progenitor cell activation even in limb regeneration extend far beyond the local site of amputation injury (Johnson et al., Payzin-Dogru et al.), there is an opportunity to connect the dots and make the claim that there could be more dispersion of "blastema function" than previously appreciated in the field. Diving a bit more into these nuances may also enable a better conceptual framework of how blastema function may evolve across vast evolutionary time and between different injury contexts in super-regenerative organisms.

      • Authors addressed this comment and agree it is interesting, but given how much territory they had to cover and space limitations, they will save this type of discussion and comparative theoretical work for the future.

      Overall, the manuscript is much improved.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      The entire study is based on only 2 adult animals, that were used for both the single cell dataset and the HCR. Additionally, the animals were caught from the ocean preventing information about their age or their life history. This makes the n extremely small and reduces the confidence of the conclusions. 

      This statement is incorrect.  While the scRNAseq was indeed performed in two animals (n=2), the HCR-FISH was performed in 3-5 animals (depending on the probe used).  These were different animals from those used for the scRNAseq.  The number of animals used has now been included in the manuscript.

      All the fluorescent pictures in this manuscript show red nuclei and green signals, which is not color-blind friendly. Additionally, many of the images lack sufficient quality to determine if the signal is real. Additional images of a control animal (not eviscerated) and of a negative control would help data interpretation. Finally, on many occasions a zoomed-out image would provide context and give the reader a better understanding of where the signal is localized.

      Fluorescent photos have been changed to color-blind friendly colors. Diagrams, arrows and new photos have been included to guide readers to the signal or labeling in cells. Controls for HCR-FISH and labeling in normal intestines have been included.

      Reviewer #2:

      The spatial context of the RNA localization images is not well represented, making it difficult to understand how the schematic model was generated from the data. In addition, multiple strong statements in the conclusion should be better justified and connected to the data provided.

      As explained above we have made an effort to provide a better understanding of the cellular/tissue localization of the labeled cells. Similarly, we have revised the conclusions so that the statements made are well justified.

      Reviewer #3:

      Possible theoretical advances regarding lineage trajectories of cells during sea cucumber gut regeneration, but the claims that can be made with this data alone are still predictive.

      We are conscious that the results from these lineage trajectories are still predictive and have emphasized this in the text. Nonetheless, they are an important part of our analyses that provides the theoretical basis for future experiments.

      Better microscopy is needed for many figures to be convincing. Some minor additions to the figures will help readers understand the data more clearly.

      As explained above we have made an effort to provide a better understanding of the cellular/tissue localization of the labeled cells.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      -  Page 4, line 70-81: if the reader is not familiar with holothurian anatomy and regeneration process, this section can be complicated to fully understand. An illustration, together with clear definitions of mesothelium, coelomic epithelium, celothelium and luminal cells would help the reader. 

      A figure (now Figure 1) detailing the holothurian anatomy of normal and regenerating animals has been added. A figure detailing the intestinal regeneration process has also been included (S1).

      -  Page 5 line 92-104: this paragraph could be shortened. It would be more important to explain what the main question is the Authors would like to answer and why single cell would be the best technique to answer it, than listing previous studies that used scRNA-Seq. 

      The paragraph has been shortened and the focus has been shifted to the question of cellular components of regenerative tissues in holothurians.

      -  Page 6, line 125-127 and line 129-132: this belongs to the method section. 

      This information is now provided in the Materials and Methods section.

      -  Page 11, line 210-217: this belongs to the discussion. 

      This section has now been included in the Discussion.

      -  How many mesenteries are present in one animal? 

      This has now been included as part of Figure S1.

      -  In the methods there is no information about the quality of the dataset and the sequencing, or about the difference between the 2 samples coming from the 2 animals. How many cells came from each sample, and what is the coverage? The Authors provided this information only for mesentery versus anlage but not between animals.

      We have added additional information about the sequencing statistics in S4 Fig and S15 Table. Description has also been added in the methods in lines 922-926 under Single Cell RNA Sequencing and Data Analysis section.

      -  The result section "An in-depth analysis of the various cluster..." is particularly long and very repetitive. I would encourage the Authors to remove many of the details (lists of genes and GO terms) that can be found in the figures and to stress only the most important elements needed to support their conclusions. Having full and abbreviated gene names and the long list of references makes the text difficult to read, and it is challenging to identify the main point that the Authors are trying to highlight.

      This section has been abbreviated.

      -  Figure 1: I would suggest adding a graph of holothurian anatomy before and after the evisceration to provide more context of the process we are looking at and remove 1C. 

      Information on the holothurian anatomy has been included in a new Fig 1 and in supplementary Fig S1.

      -  Figure 2: I would suggest removing this figure that is redundant with Figure 3 and several genes are not cluster specific. Figure 3 is doing a better job in showing similar concepts. 

      Figure 2 was removed and placed in the Supplement section. 

      - In figure 3 how were the 3 cell types defined? Was this done manually or through a bioinformatic analysis? 

      The cell definition was done following the analysis of the highly expressed transcripts and comparisons to what has been shown in the scientific literature.

      -  Figure 2O shows that one of the supra-cluster is made of C2, C7, C6 and C10. This contradicts the text page 9, line 195. 

      The transcript chosen for this figure gives the wrong idea that these 4 clusters are similar. We have now addressed this in the manuscript.

      -  Figure 4A and 4C: if these are representing a subset of Figure 3, they should be removed in one or the other. The same comment is valid also for Figures 5, 6 and 7. In general the manuscript is very redundant both in terms of Figures and text. 

      These are indeed subsets of Fig 3 that were added with the purpose of clarifying the findings, however, in view of the reviewer’s comment we have deleted the redundant information from all figures.

      -  Figure 9: since the panels are not in order, it is difficult to follow the flow of the figure.

      -  All UMAP should have the number of the cluster on the UMAP itself instead of counting only on the color code in order to be color-blind friendly.

      The figure has been modified and clusters are now identified in the UMAP by their number.

      -  Figure S1F seems acquired in very different conditions compared to the other images in the same figure. 

      Fig S1F (now S2 Fig) is an overlay of fluorescent immunohistochemistry (UV light detected) with “classical” toluidine blue labeling (visible light detected). This has now been explained in the figure legend.

      -  Table S7 is lacking some product numbers. 

      The toluidine blue product number has now been added to the table. The antibodies that lack a product number were generated in our lab and are described in the references provided.

      -  The discussion is pretty long and partially redundant with the result section. I would encourage the Authors to shorten the text and shorten paragraphs that have repeating information.

      -  It might be out of the scope of the Authors but the readers would benefit from having a manuscript that focuses more on the novel aspects discovered with the single-cell RNA-Seq and then have a review that will bring together all the literature published on this topic and integrating the single-cell data with everything that is known so far.

      We have tried to shorten the discussion by eliminating redundant text.

      Reviewer #2 (Recommendations For The Authors): 

      -  An intriguing finding is the lack of significant difference in the cell clusters between the anlage and mesentery during regeneration. This discovery raises important questions about the regenerative process. The authors should provide a more detailed explanation of the implications of this finding. For example, does it suggest that both organs contribute equally to the regenerated tissues? 

      The lack of significant differences in the cell clusters between the anlage and the mesentery is somewhat surprising but can be explained by two facts. First, we have previously shown that many of the cellular processes that take place in the anlage, including cell proliferation, apoptosis, dedifferentiation and ECM remodeling, occur in a gradient that begins at the tip of the mesentery where the anlage forms and extends to various degrees into the mesentery. Similarly, migrating cells move along the connective tissue of the mesentery to the anlage. Thus, there is no clear partition of the two regions that would account for distinct cell populations associated with the regenerative stage. Second, the two cell populations that would have been found in the mesentery but not in the regenerating anlage, mature muscle and neurons, were not dissociated by our experimental protocol in a way that would allow for their sequencing. Our current experiments are being done using single nuclei RNA sequencing to overcome this hurdle. This has now been included in the discussion.

      -  Proliferating cells are obviously important to the study of regeneration as it is assumed these form the regenerating tissue. The authors describe cluster 8 as the proliferative cells. Is there evidence of proliferation in other cell types or are these truly the only dividing cells? Is c8 of multiple cell types but the clustering algorithm picks up on the markers of cell division i.e. what happens if you mask cell division markers - does this cluster collapse into other cluster types? This is important as if there is only one truly proliferating cell type then this may be the origin of the regenerative tissues and is important for this study to know this. 

      As the reviewer highlights, we also believe this to be an important aspect to discuss. We have addressed this in the manuscript discussion with the following: “Our data suggest that there appears to be a specific population of only proliferative cells (C8) characterized by a large number of cell proliferation genes, which can be visualized by the top genes shown in Fig 3. These cell proliferation genes are specific to C8, with minimum representation in other populations. Interestingly, as mentioned before, C8 expresses at lower levels many of the genes of other coelomic epithelium populations. Nevertheless, even if we mask the top 38 proliferation genes (not shown), this cluster is maintained as an independent cluster, suggesting that its identity is conferred by a complex transcriptomic profile rather than only a few proliferation-related genes. Therefore, the identity and potential role of C8 could be further described by two distinct alternatives: (1) cells of C8 could be an intermediate state between the anlage precursor cells (discussed below) and the specialized cell populations or (2) cells of C8 are the source of the anlage precursor populations from which all other populations arise. The pseudotime data is certainly complex and challenging to interpret with our current dataset, yet the RNA velocity analysis shown in Fig 11B would suggest that cells of C8 transition into the anlage precursor populations, rather than being an intermediate state. This is also supported by the Slingshot pseudotime analysis that incorporates C8 (S13 Fig).

      Nevertheless, additional experiments are needed to confirm this hypothesis.”
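
      The gene-masking check described in this response can be sketched on toy data. This is a simplified illustration, not the authors' procedure: the response does not say how the masking was carried out, so here the distance between cluster centroids stands in for re-running a clustering algorithm, and every gene name and expression value is invented.

```python
import math


def mask_genes(cells, masked):
    """Drop the masked genes from every cell's expression profile."""
    return [{g: v for g, v in cell.items() if g not in masked} for cell in cells]


def centroid(cells):
    """Mean expression per gene across a group of cells."""
    genes = cells[0].keys()
    return {g: sum(c[g] for c in cells) / len(cells) for g in genes}


def centroid_distance(group_a, group_b):
    """Euclidean distance between the centroids of two groups of cells."""
    ca, cb = centroid(group_a), centroid(group_b)
    return math.sqrt(sum((ca[g] - cb[g]) ** 2 for g in ca))


# Hypothetical cells: "prolif_*" stand in for proliferation genes,
# "other_*" for the rest of the transcriptome.
c8 = [
    {"prolif_1": 9.0, "prolif_2": 8.0, "other_1": 5.0, "other_2": 1.0},
    {"prolif_1": 7.0, "prolif_2": 8.0, "other_1": 5.0, "other_2": 1.0},
]
c1 = [
    {"prolif_1": 1.0, "prolif_2": 0.0, "other_1": 1.0, "other_2": 4.0},
    {"prolif_1": 1.0, "prolif_2": 0.0, "other_1": 1.0, "other_2": 4.0},
]

masked = {"prolif_1", "prolif_2"}
before = centroid_distance(c8, c1)
after = centroid_distance(mask_genes(c8, masked), mask_genes(c1, masked))
print(f"separation before masking: {before:.2f}, after masking: {after:.2f}")
```

      In this toy case the two groups stay separated after the proliferation genes are removed, because the remaining genes also differ. That mirrors the kind of outcome the authors report for C8; a separation that collapsed toward zero would instead point to a cluster defined only by proliferation markers.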

      -  The schematic model presented in Fig 10 is essential for clarifying the paper's findings and will provide a crucial baseline model for future research. However, the comparison of the data shown in the HCR figures with the schematic is challenging due to the lack of spatial context in the HCR figures. The authors should find a way to provide better context in the figures, such as providing two-color in situ images to compare spatial relationships of cell types and/or including lower resolution and side-by-side fluorescent and bright field images if possible. 

      The figure has been modified to explain the spatial arrangement of the tissues.

      The authors make several strong statements in the discussion that weren't well connected to the findings in the data. Specifically: 

      “Regardless of which cell population is responsible for giving rise to the cells of the regenerating intestine, our study reveals that the coelomic epithelium, as a tissue layer, is pluripotent.” 

      This has now been expanded to better explain the statement.

      738 “…we postulate that cells from C1 stand as the precursor cell population from which the rest of the cells in the coelomic epithelium arise”. 

      This has now been expanded to better explain the statement.

      748 “differentiation: muscle, neuroepithelium, and coelomic epithelium cells. We also propose the presence of undifferentiated and proliferating cell populations in the coelomic epithelia, which give rise to the cells in this layer…”

      This has now been expanded to better explain the statement.

      777 “amphibians, the cells of the holothurian anlage coelomic epithelium are proliferative undifferentiated cells and originated via a dedifferentiation process…”

      This has now been expanded to better explain the statement.

      Reviewer #3 (Recommendations For The Authors): 

      Specific questions: 

      - Is there any way to systematically compare these cells to evolutionarily-diverged cells in distant relatives to sea cucumbers? Or even on a case-by-case basis? For example, is there evidence for any of these transitory cell types to have correlate(s) in vertebrate gut regeneration? 

      This is a most interesting question but one that is perhaps a bit premature to answer due to multiple reasons.  First, most of the studies in vertebrates focus on the regeneration of the luminal epithelium, a layer that we are not studying in our system since it appears later in the regeneration process.  Second, there is still too little data from adult echinoderms to fully comprehend which cells are cell orthologues to vertebrates. Third, we are only analyzing one regenerative stage.  It is our hope that this is just the start of a full description of what cell types/stages are found and how they function in regeneration and that this will lead us to identify the cellular orthologues among animal species.

      Major revisions: 

      - If lineage tracing is within the scope of this paper, it would provide more definitive evidence to the conclusions made about the precursor populations of the regenerating anlage. 

      Response:  This is certainly one of the next steps, however at present, it is not possible due to technical limitations.

      Minor revisions: 

      - Line 47: "for decades" even longer! Could the authors also cite some other amphibians, such as other salamanders (newts) and larval frogs? 

      References have been added.

      - Line 85: "specially"-could authors potentially change to "specifically" 

      Corrected

      - Line 122: Authors should add the full words of what these abbreviations stand for in the caption for Figure 1 or in Figure 1A itself. 

      Corrected

      - Lines 153: What conclusions are the authors trying to make from one type of tubulin presence compared to the others? It's unclear from the text. 

      The authors are not trying to reach any particular conclusion.  They are just stating what was found using several markers, and the possibility that what might be viewed first hand as a single cell population might be more heterogenous.  Although the tubulin-type information might not be relevant for the conclusions in the present manuscript, it might be important for future work on the cell types involved in the regeneration process.

      - Line 226: Could the authors clarify if "WNT9" is "WNT9a". Figure 3 lists WNT9a but authors refer to WNT9 in the text. 

      The gene names in Fig 3 are based on the human identifiers. H. glaberrima only has one sequence of Wnt9 (Auger et al. 2023) and this sequence shares the highest similarity to human Wnt9a, thus the name in the list. We have now identified the gene as Wnt9 to avoid confusion.

      - Lines 236-237: Can authors rule out that some immune cells might infiltrate the mesenchymal population? 

      No, this cannot be ruled out.  In fact, we believe that most of the immune cells found in our scRNA-seq are indeed cells that have infiltrated the anlage and are part of the mesenchyma.  This has been reported by us previously (see Garcia-Arraras et al. 2006). We have now included this in the text.

      - Line 452-453: The over-representation of ribosomal genes not shown. Would it be possible to show this information in the supplementary figures? 

      The sentence has been modified; the data are being prepared as part of a separate publication that focuses on the ribosomal genes.

      - Line 480: Could authors clarify if it's WNT9a or just WNT9?

      It is indeed Wnt9. See previous response above.

      - Line 500: In future experiments, it would be interesting to compare to populations at different timepoints in order see how the populations are changing or if certain precursors are activated at different times. 

      We fully agree with the reviewer. These are ongoing experiments or are part of new grant proposals.

      - Line 567-568: Choosing 9-dpe allowed for 13 clusters, but do authors expect a different number of clusters at different timepoints as things become more terminally differentiated? 

      Definitely, we believe that clusters related to the different regenerative stages of cells can be found by looking at earlier or later regeneration stages of the organ.  A clear example is that if the experiment is done at 14-dpe, when the lumen is forming, cells related to luminal epithelium populations will appear. It is also possible that different immune cells will be associated with the different regeneration stages.

      - Line 653: References Figure 10D (not in this manuscript). Are authors referring to only 1D or 9D or an old draft figure number? 

      As the reviewer correctly points out, this was a mistake where the reference is to a previous draft. It has now been corrected.

      - Line 701: "our study reveals that the coelomic epithelium, as a tissue layer, is pluripotent." Phrasing may be better as referring to the cell population making up the tissue layer as pluripotent/multipotent or that the cells it contains would likely be pluripotent or multipotent. Additionally, lineage tracing may be needed to definitively demonstrate this. 

      This has been modified.

      - Line 808: The authors may make a more accurate conclusion by saying that the characteristics are similar to blastemas or behave like a blastema rather than it is blastema. There is ambiguity about the meaning of this term in the field, but most researchers seem to currently have in mind that the "blastema" definition includes a discrete spatial organization of cells, and here these cells are much more spread out. This could be a good opportunity for the authors to engage in this dialogue, perhaps parsing out the nuances of what a "blastema" is, what the term has traditionally referred to, and how we might consider updating this term or at least re-framing the terminology to be inclusive of functions that "blastemas" have traditionally had in the literature and how they may be dispersed over geographical space in an organism more so than the more rigid, geographically-restricted definition many researchers have in mind. However, if the authors choose to elaborate on these issues, those elaborations do belong in the discussion, and the more provisional terminology we mention here could be used throughout the paper until that element of the revised discussion is presented. We would welcome the authors to do this as a way to point the field in this direction as this is also how we view the matter. For example, some of the genes whose expression has been observed to be enriched following removal of brain tissue in axolotls (such as kazald2, Lust et al.), are also upregulated in traditional blastemas, for instance, in the limb, but we appreciate that the expression domain may not be as localized as in a limb blastema. 
      Additionally, since there is now evidence that some aspects of progenitor cell activation even in limb regeneration extend far beyond the local site of amputation injury (Johnson et al., Payzin-Dogru et al.), there is an opportunity to connect the dots and make the claim that there could be more dispersion of "blastema function" than previously appreciated in the field. Diving a bit more into these nuances may also enable better conceptual framework of how blastema function may evolve across vast evolutionary time and between different injury contexts in super-regenerative organisms. 

      We have followed the reviewer’s suggestion and stated that the holothurian anlage behaves as a blastema. Though we would love to elaborate on the blastema topic, as suggested by the reviewer, we believe that it would extend the discussion too much and that the topic might be better served in a different publication.

      - In the discussion, it would be important not to leave the reader with the impression that all amphibian blastema cells originate via dedifferentiation. This is not the case. For example, in axolotls (Sandoval-Guzman et al.) and in larval/juvenile newts, muscle progenitors within the blastema structure have been shown to originate from muscle satellite cells, a kind of stem cell, in stump tissues (while adult newts use dedifferentiation of myofibers to generate muscle progenitors in the blastema). Most cell lineages simply have not been evaluated in the level of detail that would be required to definitively conclude one way or the other, and the door is open for a more substantial contribution from stem cell populations than previously appreciated especially because new tools exist to detect and study them. Providing the reader with a more nuanced view of this situation will not negatively impact the findings in this paper, but it will show that there is biological complexity still waiting to be discovered and that we don't have all the answers at this point. 

      This has now been corrected. 

      Figures: Overall, the figures need minor work. 

      - Figure 1A: Can the authors draw a smaller, full-body cartoon and feature the current high-mag cartoon as an inset to that? Can they label the axes and make it clear how the geometry works here?

      Fig 1 has been re-done and now is split into Fig 1 and Fig 2.

      - Figure 1B: Can the authors label the UMAP with cluster identities on the map itself? This will make it easier to identify each cluster (especially to make sure cluster 11 is easier to find). 

      This has been corrected.

      - Figure 2: Could the authors put boxes/clearly distinguish panel labels around each cluster (A-O), so that there are clear boundaries? 

      Fig 2 has been moved to the Supplement, following another reviewer's recommendation.

      - "Gene identifiers starting with "g" correspond to uncharacterized gene models of H. glaberrima." - The sentence is from another figure caption but this figure would benefit from having this sentence in the figure caption as well. 

      This has been added to other figures as suggested.

      - Figure 3A: Can the authors potentially bold, highlight, or underline genes you discuss in text, so it's easier for the reader to reference? 

      This has been added as suggested.

      - Figure 3C: Can the authors please label the cell types directly on the UMAP here as well? 

      The changes were made following the reviewer’s recommendation.

      - Figure 4D-E: There's not much context here to determine if this HCR-FISH validation can tell us anything about these cells besides some of them appear to be there. Do authors expect the coelomocyte morphology to look different in regenerating/injured tissue versus normal animals? Can the authors provide some double in situs, as well as some lower-magnification views showing where the higher-magnification insets are located? Is there any spatial pattern to where these cells are found? Counter stains would be helpful. 

      - Figure 6C: If clusters C5, C8, C9 are part of the coelomic epithelium, then authors could show a smaller diagram above with blue and grey to show types and then show clusters separately to help get their point across better. 

      - Figure 6G: This image appears to have high background- would it be possible for authors to repeat phalloidin stain or reimage with a lower exposure/gain. Additionally, imaging with Zstacks would help to obtain maximum intensity projections. It would greatly aid the reader if each image was labeled with HCR probes/antibodies that have been applied to the sample. 

      - Figure 7E: The cells appear to be out of focus and have high background. Additionally, they are lacking the speckled appearance expected to be seen with HCR-FISH. Would it be possible for authors to collect another image utilizing z-stacks? 

      HCR-FISH figures identifying the gene expression characteristic of cell clusters have been modified following the reviewer’s concerns.  The changes include:

      (1) Additional clusters have been verified with probes to gene identifiers. These include clusters 8, 9 and 12.

      (2) Redundant information has been removed.

      (3) Colors have been changed to make figures friendlier to color-impaired readers.

      (4) Spatial context has been added or identified.

      (5) In some cases, improved photos have been added.

      (6) Better labels have been included.

      (7) When necessary, individual photos used for the overlay have been included.

      - Figure 9A: Could authors add cluster labels onto UMAP directly? 

      This change was made to Fig 2A. The UMAP in Fig 9A is the same and is used only as a reference for the subset.

      - Figure 10: It could be useful if authors put a small map of the sea cucumber like in other images so that readers know where in the anlage this zoomed in model represents. 

      Added as suggested by the reviewer.

      - Supplementary figure 1F: Could authors add an arrow to the dark cell that's being pointed out? 

      Change made as suggested by the reviewer.

      - Supplementary figure 1: Could authors label clearly what color is labeled with what marker? 

      Change made as suggested by the reviewer.

    1. eLife Assessment

      The authors present convincing findings on trends in hind limb morphology through the evolution of titanosaurian sauropod dinosaurs, the land animals that reached the most remarkable gigantic sizes. The important results include the use of 3D geometric morphometrics to examine the femur, tibia, and fibula to provide new information on the evolution of this clade and on evolutionary trends between morphology and allometry.

    2. Joint Public Review:

      Páramo et al. used 3D geometric morphometric analyses of the articulated femur, tibia, and fibula of 17 macronarian taxa (known to preserve these three skeletal elements) to investigate morphological changes that occurred in the hind limb through the evolutionary history of this sauropod clade. A principal components analysis was completed to understand the distribution of the morphological variation. A supertree was constructed to place evolutionary trends in morphological variation into phylogenetic context, and hind limb centroid size was used to investigate potential relationships between skeletal anatomy and gigantism. The majority of the results did not yield statistically significant differences, but they did identify interesting shape-change trends, especially within subclades of Titanosauria. Many previous studies have attempted to elucidate a link between wide-gauge posture and gigantism, which in this study Páramo et al. investigate among several titanosaurian subclades. They propose that morphologies associated with wide-gauge posture arose in parallel with increasing body size among basal members of Macronaria and that this connection became less significant once wide-gauge posture was acquired within Titanosauria. The authors also suggest that other biomechanical factors influenced the independent evolution of subclades within Titanosauria and that these influences resulted in instances of convergent evolution. Therefore, they infer that, overall, wide-gauge posture was not significantly correlated with gigantism, though some morphological aspects of hind limb skeletal anatomy appear to have been associated with gigantism. Their work also supports previous findings of a decrease in body size within Titanosauriformes (which they found to be not significant with shape variables but significant with Pagel's lambda). Collectively, their results support and build on previous work to elucidate more specifics on the evolution of this enigmatic clade. 
      Further study will show if their hypotheses stand or if the inclusion of additional specimens and taxa yields alternative results.

      [Editors' note: One of the original reviewers, Reviewer 2, reviewed this revised version of the manuscript; they reported satisfaction with the changes made by the authors in response to the original reviewer comments.]

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      The authors present valuable findings on trends in hind limb morphology throughout the evolution of titanosaurian sauropod dinosaurs, the land animals that reached the most remarkable gigantic sizes. The solid results include the use of 3D geometric morphometrics to examine the femur, tibia, and fibula to provide new information on the evolution of this clade and understand the evolutionary trends between morphology and allometry. Further justification of the ontogenetic stages of the sampled individuals would help strengthen the manuscript's conclusions, and the inclusion of additional large-body mass taxa could provide expanded insights into the proposed trends.

      Most of the analyzed specimens, especially from the smaller taxa, come from adult or subadult individuals. None exhibit features that may indicate juvenile status. However, we lack paleohistological information, which may be a stronger indicator of the ontogenetic status of the individual, and some of the operational taxonomic units used in the study are the mean shape of all the sampled specimens.

      Current information on morphological differences between adult and subadult or juvenile specimens indicates that even early juvenile specimens may share the same morphological features and overall morphology as the adult (e.g., see Curry-Rogers et al., 2016; Appendix S3). We included a comprehensive analysis of the impact of juvenile specimens, as one of the aspects of intraspecific variability that may alter our results, in Appendix S3.

      Public Reviews:

      Reviewer #1:

      Weaknesses:

      Several sentences throughout the manuscript could benefit from citations. For example, the discussion of using hind limb centroid size as a proxy for body mass has no citations attributed. This should be cited or described as a new method for estimating body mass with data from extant taxa presented in support of this relationship. This particular instance is a very important point to include supporting documentation because the authors' conclusions about evolutionary trends in body size are predicated on this relationship.

      We address this issue in the text (Line 32 & 64). Centroid size seems a good indicator as it reflects the overall size of the entire hind limb, and the lengths of the femur and tibia are each well correlated with body size/mass. Also, as we use few landmarks and only those that are purely type I or II landmarks, with curves of semilandmarks bounded or limited by them, centroid size is not sensitive to differences in landmark number across the sample in our study (centroid size depends on the number of landmarks used as well as on the physical dimensions of the specimens).

      We have repeated all the analyses using other proxies, such as femoral length and the body mass estimated with the Campione & Evans (2020) and Mazzetta et al. (2004) methods. The comprehensive description of the method is in Appendix S2, the alternative analyses can be accessed in Appendix S3 and S4, and the code for the alternative analyses can be accessed in the modified Appendix S5. All offer results similar to those obtained in our analyses with body size proxied by the centroid size of the hind limb landmark configuration.
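
      As an illustration of the point about landmark number, the standard geometric morphometric definition of centroid size (the square root of the summed squared distances of all landmarks from their centroid) can be sketched as follows. This is a minimal sketch, not the authors' code: the function name and the toy coordinates are hypothetical and are not taken from the study's dataset.

```python
import math

def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks to their centroid."""
    n = len(landmarks)
    dims = len(landmarks[0])
    # Centroid: mean coordinate along each axis.
    centroid = [sum(p[d] for p in landmarks) / n for d in range(dims)]
    return math.sqrt(
        sum((p[d] - centroid[d]) ** 2 for p in landmarks for d in range(dims))
    )

# Hypothetical toy configuration: four landmarks on the corners of a square.
corners = [(0, 0, 0), (2, 0, 0), (2, 2, 0), (0, 2, 0)]

# Same physical object, but with four extra landmarks at the edge midpoints.
with_midpoints = corners + [(1, 0, 0), (2, 1, 0), (1, 2, 0), (0, 1, 0)]

# Same four landmarks on an object twice as large.
doubled = [(2 * x, 2 * y, 2 * z) for x, y, z in corners]

cs_corners = centroid_size(corners)
cs_midpoints = centroid_size(with_midpoints)
cs_doubled = centroid_size(doubled)
```

      In this toy case the value grows both when the object is scaled up and when more landmarks are placed on an object of the same size, which is why centroid sizes are only comparable across specimens digitized with the same landmark configuration.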

      An additional area of concern is the lack of any discussion of taphonomic deformation in Section 3.3 Caveats of This Study, the results, or the methods. The authors provide a long and detailed discussion of taphonomic loss and how this study does a good job of addressing it; however, taphonomic deformation to specimens and its potential effects on the ensuing results were not addressed at all. Hedrick and Dodson (2013) highlight that, with fossils, a PCA typically includes the effects of taphonomic deformation in addition to differences in morphology, which results in morphometric graphs representing taphomorphospaces. For example, in this study, the extreme negative positioning of Dreadnoughtus on PC 2 (which the authors highlight as "remarkable") is almost certainly the result of taphonomic deformation to the distal end of the holotype femur, as noted by Ullmann and Lacovara (2016).

      We included a brief commentary in the Caveats of This Study (Line 467) and greatly expanded this issue in the Appendix S3. We followed the methodology proposed by Lefebvre et al. (2020) to discuss the effects of taphonomic deformation in the shape analyses.

      Our shape variables (PCs obtained from the shape PCA) should be viewed as taphomorphospaces, as Hedrick and Dodson, as well as the reviewer, point out in such cases.

      The analysis of the effects of taphonomy, or of errors induced by the landmark estimation method, indicates that Dreadnoughtus schrani is one of the few sampled taxa that may have a noticeable impact on our analyses due to lithostatic deformation. Other taxa like Mendozasaurus neguyelap or Ampelosaurus atacis may also induce some alterations to the PCs. In general, the PCs slightly altered by taphonomy, where D. schrani is the only sauropod that may alter an entire PC such as PC2, did not exhibit phylogenetic signal and account for a small proportion of the sample variance.

      The authors investigated 17 taxa and divided them into 9 clades, with only Titanosauria and Lithostrotia including more than two taxa (and four clades are only represented by one taxon). While some of these clades represent the average of multiple individuals, the small number of plotted taxa can only weakly support trends within Titanosauria. If similar general trends could be found when the taxa are parsed into fewer, more inclusive clades, it would support and strengthen their claims. Of course, the authors can only study what is preserved in the fossil record, and titanosaurian remains are often highly fragmentary; these deficiencies should therefore not be held against the authors. They clearly put effort and thought into their choices of taxa to include in this study, but there are limitations arising from this low sample size that inherently limit the confidence that can be placed on their conclusions, and this caveat should be more clearly discussed. Specifically, the authors note that their dataset contains many lithostrotians, but they do not discuss unevenness in body size sampling. As neither their size-category boundaries nor the taxa which fall into each of them are clearly stated, the reader must parse the discussion to glean which taxa are in each size category. It should be noted that the authors include both Jainosaurus and Dreadnoughtus as 'large' taxa even though the latter is estimated to have been roughly five times the body mass of the former, making Dreadnoughtus the only taxon included in this extreme size category. The effects that this may have on body size trends are not discussed. Additionally, few taxa between the body masses of Jainosaurus and Dreadnoughtus have been included even though the hind limbs of several such macronarians have been digitized in prior studies (such as Diamantinasaurus and Giraffititan; Klinkhamer et al. 2018). 
      Also, several members of Colossosauria are more similar in general body size to Dreadnoughtus than Jainosaurus, but unfortunately, they do not preserve a known femur, tibia, and fibula, so the authors could not include them in this study. Exclusion of these taxa may bias inferences about body size evolution, and this is a sampling caveat that could have been discussed more clearly. Future studies including these and other taxa will be important for further evaluating the hypotheses about macronarian evolution advanced by Páramo et al. in this study.

      Sadly, we could not include some larger-sized titanosaurian sauropods. As the reviewer points out, the lack of larger sauropods among the sampled taxa may hinder our results, as the “large-bodied” category is filled with some mid-sized taxa and Dreadnoughtus schrani, which is five times larger than some of them. We tried to include Elaltitan lilloi, digitized for this study and included in preliminary analyses, but its fragmentary status greatly increased the error of the estimation method, as only a proximal third or mid femur is preserved for this taxon. Therefore we opted to exclude it from our database.

      Other taxa suggested by the reviewer were not readily available to the authors at the time this study was conducted, and including them now might increase the possible bias of our study. Giraffatitan brancai is a Late Jurassic brachiosaurid, which may again increase the number of early-branching titanosauriforms with large body masses, while most of the smaller taxa sampled are recovered as deeply-branching macronarians (including Diamantinasaurus matildae, had we also included it). Future analyses may include a wider sample of the mid to large-bodied titanosaurians, especially lithostrotians, as well as some colossosaurs like Patagotitan mayorum.

      Reviewer #1 (Recommendations For The Authors):

      These are all minor comments that would improve the manuscript.

      - There are a few typos throughout the manuscript such as: line 70 should be 2016 and line 242 should be forelimb.

      Corrected.

      - To me, the most interesting aspect of your study is the diversity and trends recovered in titanosaurian subclades and I would highlight this, not gigantism, in the title if you choose to revise the title.

      It has been addressed. The specificity of some of the tests and the implications for the acquisition of the spread limb posture and gigantism in early-branching taxa are important nonetheless, so we think that gigantism may remain in the title.

      - The abstract should provide more details on the results such as none of the listed trends were statistically significant.

      Many of the trends exhibit phylogenetic signal, but not the allometric components. We have briefly addressed them.

      - Several sentences in the manuscript need citations such as: line 48 the reference to other megaherbivores, line 66 the discussion of poor understanding of the relationship of wide gauge posture and gigantism, and the use of centroid size as an estimate of body mass (see Public Review).

      We changed line 66 to improve the focus on the current state of the art regarding the hypothesized relationship between arched limbs and the increase in body size. We included a section on centroid size as a proxy (due to the good correlation between femur and tibia length and body mass) and the caveats of using it. We also expanded in Appendix S2 the use of centroid size and the alternative models.

      - With titanosaur evolution, you mention that they are adapting to new niches and topography (line 64). What support is there for this versus they are adapting to be more successful in their current environment?

      Noted. We have changed the phrase to refer to improved efficiency in exploiting inland environments, as they may have been either opening new inland niches or adapting better to inland niches that were already exploited by less deeply branching sauropods. However, testing this is beyond the scope of the current work.

      - Line 384-385: the discussion of Rapetosaurus should mention that it is a juvenile and some studies have suggested that titanosaur limbs grow allometrically.

      We have included a short sentence. Whether Rapetosaurus krausei exhibits allometric growth or not may not greatly change the discussion, perhaps only excluding it as morphologically convergent with Lirainosaurus and Muyelensaurus. If so, it would be further proof that small-sized titanosaurs exhibit the robust skeleton expected in the giant titanosaurs.

      - I would consider addressing the question of if we are certain enough in our understanding of titanosaurian phylogeny to rule out homology, especially when you discuss the uncertainty of the placement of specific taxa. Also, Diamantinasaurus is not the only titanosaur that has been proposed as a member of both basal and more derived subclades (e.g., Dreadnoughtus).

      We tried to take a more conservative approach. We cannot fully rule out that some of the features observed in the sampled deeply branching lithostrotians, especially saltasauroids, are present in the entire somphospondylan lineage. However, none of the less deeply-branching or early-branching titanosaurs exhibit this kind of morphology. Recent studies propose that entire groups included in this study, like Colossosauria, may change position in the phylogeny. However, even considering the debated phylogenetic position of Diamantinasaurus or Dreadnoughtus, the inclusion of Colossosauria within the saltasauroids, and the inclusion of the Ibero-Armorican lithostrotians as putative saltasaurids (Mocho et al. 2024), we did not notice any relevant differences in our conclusions about hind limb arched morphology or about size. Distal hind limb overall robustness should indeed be addressed in the light of shifts in phylogenetic position, and future work should include some interesting sauropods like Diamantinasaurus or expand the sample of large-sized Colossosauria or early-branching somphospondyls, as this may have profound implications for the morphofunctional adaptations to specific feeding niches, e.g., see current hypotheses about rearing as mentioned in Bates et al. (2016), Ullmann et al. (2017) or Vidal et al. (2020). We did not have enough information to conclude the presence of any plesiomorphic condition or analogous feature with our current sample and the debated titanosaurian phylogeny.

      - I understand this is not standard in the field, but your study provides the opportunity to conduct sensitivity testing of the effects of cartilage thickness and user articulation of the bones on PCA results. This would be an insightful addition to the field of GMM.

      We are currently developing such a comprehensive analysis and several other implications on our past results. However, we feel that it is beyond the scope of the current study. We appreciate the suggestion nonetheless, as it would be a sensitivity test of the impact of several of our assumptions in the final results that is often not considered.

      - In Figure 1, if all the limbs were arranged the same way it would be easier to interpret. Consider flipping panels B and D to match A and C.

      Accepted.

      - In Figures 2-4, the views in C should be labeled in the figure or caption. Oceanotitan is also in the PCA plot but not included in the figure caption. Also, consider changing the names to represent the paraphyletic groupings you are using instead of formal clade names. For example, change 'Titanosauria' to 'Basal Titanosaurs' to reflect that it is not including all titanosaurs in the sample.

      Changes accepted for the shape PCA results. The informal (i.e., paraphyletic) terms such as “Basal Titanosaurs” were only used in the shape analyses; in the RMA, Titanosauria (and other more inclusive groups) were used as natural groups. Each partial RMA model is based on a sample of all the taxa that are included within that particular clade (e.g., Titanosauria includes both Dreadnoughtus and Saltasaurus; Lithostrotia excludes the former).

      - I am concerned that centroid size does not scale evenly across the wide-ranging body mass of titanosaurs. I do not know if this affects your size trends or their significance, but as I mentioned above Dreadnoughtus is much bigger than most of the taxa included and that isn't as drastically apparent in centroid size (in Figure 5) as it is when taxa are plotted by body mass.

      The main problem with hind limb centroid size is the shift in the body plan of deeply-branching titanosaurs, as the center of mass is displaced toward the anterior portion of the body, which has been proposed to result from a large development of the forelimb region (e.g., Bates et al. 2016). However, this would only increase the effects of the phyletic body size reduction, as smaller taxa tend to have a 1:1 forelimb to hind limb ratio, e.g., from our past analyses as in Páramo et al. (2019), and the sacrum is not as beveled as in earlier somphospondyls, e.g., Vidal et al. (2020). The role of the low-browsing feeding habits of deeply-branching lithostrotians should be explored elsewhere, as it may be the main driving force of this effect. Our point is that the proxy used may have a slight offset due to some high-browsing giant early-branching titanosaurs, which have a greater development of the cranial region that increases their body size and mass beyond our bare-minimum estimation based on the hind limb region. Overall, however, this offset is assumed to be low. We repeated the analyses with femoral length as a proxy of body size and with a mass estimation, including the quadratic equation based on both humeral and femoral lengths, and the results remain similar. Another problem that arises with the use of centroid size is the way it is calculated, but as we used an even number of landmarks and curve semilandmarks, all of them bounded to anatomical features, it remains comparable at least within our sample (but cannot be extrapolated to other geometric morphometric studies that do not use the same configurations).

      We appreciate the reviewer's concerns nonetheless, as they were among our own when designing this study. In the future we will try to expand the analyses, or advise anyone expanding on this study to do so, using total body size/volume estimations following Bates et al. (2016), which also include tests of the effects of the different whole-body estimation models.

      Cites:

      Bates KT, Mannion PD, Falkingham PL, Brusatte SL, Hutchinson JR, Otero A, Sellers WI, Sullivan C, Stevens KA, Allen V. 2016. Temporal and phylogenetic evolution of the sauropod dinosaur body plan. Royal Society Open Science 3:150636. doi:10.1098/rsos.150636

      Mocho P, Escaso F, Marcos-Fernández F, Páramo A, Sanz JL, Vidal D, Ortega F. 2024. A Spanish saltasauroid titanosaur reveals Europe as a melting pot of endemic and immigrant sauropods in the Late Cretaceous. Commun Biol 7:1016. doi:10.1038/s42003-024-06653-0

      Páramo A, Ortega F, Sanz JL. 2019. A Niche Partitioning Scenario for the Titanosaurs of Lo Hueco (Upper Cretaceous, Spain). International Congress of Vertebrate Morphology (ICVM) - Abstract Volume, Journal of Morphology. Prague. p. S197.

      Ullmann PV, Bonnan MF, Lacovara KJ. 2017. Characterizing the Evolution of Wide-Gauge Features in Stylopodial Limb Elements of Titanosauriform Sauropods via Geometric Morphometrics. The Anatomical Record 300:1618–1635. doi:10.1002/ar.23607

      Vidal D, Mocho P, Aberasturi A, Sanz JL, Ortega F. 2020. High browsing skeletal adaptations in Spinophorosaurus reveal an evolutionary innovation in sauropod dinosaurs. Sci Rep 10:6638. doi:10.1038/s41598-020-63439-0

      Reviewer #2:

      The authors report a quantitative comparative study regarding hind limb evolution among titanosaurs. I find the conclusions and findings of the manuscript interesting and relevant. The strength of the paper would be increased if the authors were to improve their reporting of taxon sampling and their discussion of age estimation and the potential implications that uncertainty in these estimates would have for their conclusions regarding gigantism (vs. ontogenetic patterns).

      Considering the observations made by Reviewer #1, we included data on the impact of ontogenetic patterns and other intraspecific variability in Appendix S3. We considered increasing the sample, but this was not possible at the time the study was carried out.

      Reviewer #2 (Recommendations For The Authors):

      I have a few concerns/requests for the authors, that I hope can be easily resolved.

      Comments:

      - What drove taxon sampling?

      Taxon sampling was a random sampling of somphospondylan sauropods, focused on the clade Lithostrotia, for the thesis project of one of the authors (APB). Logistics were also one of the biases in our sample. Based on the available titanosaurian material, we left out several macronarians that had already been sampled but would further induce an early-branching large sauropod versus deeply-branching small sauropod pattern that may alter our results.

      - Which phylogenies were used to create the supertree applied to the analyses? What references were used to time-calibrate the tips and deeper nodes? I couldn't find any reference to this. Additionally, more information regarding the R packages and analytical pipeline would be appreciated: e.g. were measurements used in the analyses log-transformed?

      A comprehensive description of the methodology is provided in Appendix S2.

      - Age estimate: can the author confirm the skeletal maturity of the sampled individuals? If this is not the case, how can the author be sure that the patterns towards gigantism are not reflecting different ontogenetic stages? I believe this should be part of both methods and discussion.

      As commented before, we excluded small, probable juvenile specimens from our sample. We have no paleohistological sample backing the claims about the ontogenetic status of some of the specimens that were included or excluded when calculating the mean shape for the operational taxonomic units. However, we followed a set of criteria to identify the relative ontogenetic status, and these have been included in Appendix S3.

      - The authors used the centroid size for regressions in Figure 6. Although I believe that this is a good variable, would the author be willing to use body mass and log-transformed femur length in addition to what was done? These would be very useful considering that these variables are (relatively) independent from shape/morphology.

      Accepted. We tested our hypotheses with three alternative models based on femoral length and on combined femoral and humeral lengths for body mass estimation. The methodology can be found in Appendix S2, the results in Appendix S4, and the code for the alternative methods in Appendix S5.

      - Data access: will .stl files of the limb elements be shared and freely available? If so, where will the files be deposited?

      At the time of the current study, some of the sampled specimens cannot be made available (material under study), but the mean shapes can be generated from the landmarks, the semilandmark curves, and the “atlas” mesh.

      - Additionally, outstanding references regarding limb evolution, GMM, role of ontogeny, and evolution of columnar gait are missing. The authors should reinforce the literature review with the following (alphabetical order):

      Bonnan, M. F. (2003). The evolution of manus shape in sauropod dinosaurs: implications for functional morphology, forelimb orientation, and phylogeny. Journal of Vertebrate Paleontology, 23(3), 595-613.

      Botha, J., Choiniere, J. N., & Benson, R. B. (2022). Rapid growth preceded gigantism in sauropodomorph evolution. Current Biology, 32(20), 4501-4507.

      Curry Rogers, K., Whitney, M., D'Emic, M., & Bagley, B. (2016). Precocity in a tiny titanosaur from the Cretaceous of Madagascar. Science, 352(6284), 450-453.

      Day, J. J., Upchurch, P., Norman, D. B., Gale, A. S., & Powell, H. P. (2002). Sauropod trackways, evolution, and behavior. Science, 296(5573), 1659-1659.

      Fabbri, M., Navalón, G., Benson, R. B., Pol, D., O'Connor, J., Bhullar, B. A. S., ... & Ibrahim, N. (2022). Subaqueous foraging among carnivorous dinosaurs. Nature, 603(7903), 852-857.

      Fabbri, M., Navalón, G., Mongiardino Koch, N., Hanson, M., Petermann, H., & Bhullar, B. A. (2021). A shift in ontogenetic timing produced the unique sauropod skull. Evolution, 75(4), 819-831.

      González Riga, B. J., Lamanna, M. C., Ortiz David, L. D., Calvo, J. O., & Coria, J. P. (2016). A gigantic new dinosaur from Argentina and the evolution of the sauropod hind foot. Scientific Reports, 6(1), 19165.

      Lefebvre, R., Allain, R., & Houssaye, A. (2023). What's inside a sauropod limb? First three‐dimensional investigation of the limb long bone microanatomy of a sauropod dinosaur, Nigersaurus taqueti (Neosauropoda, Rebbachisauridae), and implications for the weight‐bearing function. Palaeontology, 66(4), e12670.

      McPhee, B. W., Benson, R. B., Botha-Brink, J., Bordy, E. M., & Choiniere, J. N. (2018). A giant dinosaur from the earliest Jurassic of South Africa and the transition to quadrupedality in early sauropodomorphs. Current Biology, 28(19), 3143-3151.

      Martin Sander, P., Mateus, O., Laven, T., & Knötschke, N. (2006). Bone histology indicates insular dwarfism in a new Late Jurassic sauropod dinosaur. Nature, 441(7094), 739-741.

      Remes, K. (2008). Evolution of the pectoral girdle and forelimb in Sauropodomorpha (Dinosauria, Saurischia): osteology, myology and function (Doctoral dissertation, München, Univ., Diss., 2008).

      Sander, P. M., & Clauss, M. (2008). Sauropod gigantism. Science, 322(5899), 200-201.

      Yates, A. M., & Kitching, J. W. (2003). The earliest known sauropod dinosaur and the first steps towards sauropod locomotion. Proceedings of the Royal Society of London. Series B: Biological Sciences, 270(1525), 1753-1758.

      We appreciate this suggestion. We already used some of these articles in our study, but the selection of citations was also constrained by the manuscript space available under the editorial guidelines. We would have liked to include several of these works, but we opted to cite works that summarize some of them, while excluding others.

    1. eLife Assessment

      This is a valuable study that tests the functional role of food-washing behavior in removing tooth-damaging sand and grit in long-tailed macaques and whether dominance rank predicts level of investment in the behavior. The evidence that food-washing is deliberate is compelling, but the evidence for variable and adaptive investment depending on rank, including the fitness-relevance and ultimate evolutionary implications of the findings, is incomplete given limitations of the experimental design. Overall, the paper should be of interest to researchers interested in foraging behavior, cognition, and primate evolution.

    2. Reviewer #1 (Public review):

      In this paper, the authors had 2 aims:

      (1) Measure macaques' aversion to sand and see if its removal is intentional, as it is likely an unpleasant sensation that causes tooth damage.

      (2) Show that or see if monkeys engage in suboptimal behavior by cleaning foods beyond the point of diminishing returns, and see if this was related to individual traits such as sex and rank, and behavioral technique.

      They attempted to achieve these aims through a combination of geochemical analysis of sand, field experiments, and comparing predictions to an analytical model.

      The authors' conclusions were that they verified a long-standing assumption that monkeys have an aversion to sand as it contains many potentially damaging fine grained silicates, and that removing it via brushing or washing is intentional.

      They also concluded that monkeys will clean food for longer than is necessary, i.e. beyond the point of diminishing returns, and that this is rank-dependent.

      High and low-ranking monkeys tended not to wash their food, but instead over-brushed it, potentially to minimize handling time and maximize caloric intake, despite the long-term cumulative costs of sand.

      This was interpreted through the *disposable soma hypothesis*, where dominants maximize immediate needs to maintain rank and increase reproductive success at the potential expense of long-term health and survival.

      # Strengths

      The field experiment seemed well designed, and their quantification of the physical and mineral properties of quartz particles (relative to human detection thresholds) seemed good relative to their feret diameter and particle circularity (to a reviewer that is not an expert in sand). The *Rank Determination* and *Measuring Sand* sections were clear.

      In achieving Aim 1, the authors validated a commonly interpreted, but unmeasured function, of macaque and primate behavior-- a key study/finding in primate food processing and cultural transmission research.

      I commend their approach in trying to develop a quantitative model to generate predictions to compare to empirical data for their second aim. This is something others should strive for.

      I really appreciated the historical context of this paper in the introduction and found it very enjoyable and easy to read.

      I do think that interpreting these results in the context of the *disposable soma hypothesis* and the potential implications in the *paleolithic matters* section about interpreting dental wear in the fossil record are worthwhile.

      # Weaknesses

      Several of my concerns in an earlier review were addressed in revision, which I appreciate. One thing I think could strengthen this paper is a clearer link to social foraging theory to explore heterogeneity in handling times (as the currency they are trying to maximize).

      I am satisfied with the improvements in statistics and that I can access the code and data.

      I am still struck that there was an analysis of only trials where <3 individuals were present. If rank was important, I would imagine that behavior might be different in social contexts when theft, scrounging, policing, aggression, or other distractions might occur-- where rank would have effects on foraging behavior. Maybe lower rankers prioritize rapid food intake then. If rank should be related to investment in this behavior, we might expect this to be magnified (or different) in social contexts where it would affect foraging. It might just be that the data was too hard to score or process in those settings, or the analysis was limited. Additionally, I think that more robust metrics of rank from more densely sampled focal follow data would be a better measure, but I acknowledge the limitations in getting the ideal. Since rank is central to the interpretation of these results, I think that the reduced social contexts in which rank was analyzed and the robustness of the data from which rank was calculated and analyzed are the main weaknesses of the evidence presented in this paper.

      While some of the boxes about raccoons and Concorde Fallacy were interesting, they did feel like a bit of a distraction from the main message in the paper.

    3. Reviewer #3 (Public review):

      This revised paper provides evidence that food washing and brushing in wild long-tailed macaques are deliberate behaviors to remove sand that can damage tooth enamel. The demonstration of the immediate functional importance of these behaviors is nicely done, and there is some interesting initial evidence that macaques differ systematically in their investment in food cleaning based on dominance rank.

      The authors interpret this evidence as support for "disposable soma" effects: that reduced time and effort invested in food washing in high-ranking individuals is attributable to prioritizing reproductive effort. Given that the analysis is on a single group with no longitudinal data, there are no fitness measures or fitness proxies, the energetic constraints faced by this population are not clear, and both sexes are combined into a single dominance hierarchy (trade-offs between different forms of investment are typically thought to differ between sexes), this conclusion is premature, although an interesting foundation for future studies.

      More generally, the results directly supported by the data collection and analysis (grit on Koshima likely damages macaque teeth; processing food helps mitigate the damage; there is some interesting interindividual variation in food processing time, and that time is not always in line with what appears to be optimal) tend to be combined with interpretation that is much more speculative (e.g., the effect sizes observed are consequential for fitness; high-ranking animals are making choices that optimize their long-term fitness at the expense of their soma). This is in part a stylistic choice but can have the effect of drawing attention away from the stronger empirical findings and/or be misleading. Similarly, although I appreciate that the authors were trying to interpret and respond to previous feedback from reviewers, I found the addition of the box text on the raccoon nomenclature and on irrational behavior and the Concorde effect distracting (more intro-textbook style than journal article style).

    4. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their constructive criticism. It is rare and gratifying to receive such thoughtful feedback, and the result is a much stronger paper. We made significant changes to our statistical analyses and figures to better differentiate the effects of sex and dominance rank on food-cleaning behaviors. These revisions uphold our original conclusion––that rank-related variation overwhelms any sex difference in cleaning behavior. We hope that these edits, together with the rest of our responses, provide a convincing demonstration of the tradeoffs of eliminating quartz from food surfaces.

      Reviewer #1 (Public Review):

      Summary

      We have no objections to Reviewer 1’s summary of our manuscript.

      Strengths

      Reviewer 1 is extremely gracious, and we are grateful for the kind words.

      Weaknesses

      Reviewer 1 identified several weaknesses, enumerating three types: (1) statistics, (2) insufficient links to foraging theory, and (3) interpretation and validity of the model. The present response is organized around these same categories.

      (1) Statistics

      We put all of our data and code into the Zenodo repository prior to submission. This content should have been accessible to Reviewer 1 from the outset. But in any event, we are very sorry for the mix-up. To ensure access to our data and code during the present stage of review, we included the URL in the main manuscript and here: https://doi.org/10.5281/zenodo.14002737

      (a) AIC and outcome distributions

      Reviewer 1 criticized our use of AIC for determining model selection. We agree and this aspect of our manuscript is now removed. In lieu of AIC, we produced two data sets consisting of whole number counts (seconds) with means <5. The data were right-skewed due to high concentrations of biologically-meaningful zeros (i.e., bouts of food handling without any cleaning effort). Following the recommendations of Bolker et al. (2008) and others (Brooks et al. 2017, 2019), we chose an outcome distribution (zero-inflated Poisson, see response below) that best matched this data distribution. In addition, we evaluated the post-hoc performance of each of our models using the standardized residual diagnostic tools for hierarchical regression models available in the DHARMa package (Hartig, 2022). To further evaluate our choice of outcome distribution, we generated QQ-plots and residual vs. predicted plots for each model and included them in our revision as Figures S3-S5.

      (b) zeros

      Reviewer 1 expressed concern over our treatment of biologically-meaningful zeros, and recommended use of a zero-inflated GLMM with either a Poisson or negative binomial outcome distribution. We agree that such models are best for our two data sets. Accordingly, we fit a series of zero-inflated generalized linear mixed models (ZIGLMM) using the glmmTMB package in R, each with a logit-link function, a single zero-inflation parameter applying to all observations, and a Poisson error distribution. For the food-brushing model, we fit a zero-inflated Poisson (ZIP), which produced favorable standardized residual diagnostic plots with no major patterns of deviation (Figure S3) and minor, but non-significant underdispersion (DHARMa dispersion statistic = 0.99, p = 0.80). For our two food-washing models, we used zero-inflated models with Conway-Maxwell Poisson (ZICMP) distributions, an error distribution chosen for its ability to handle data that are more underdispersed (DHARMa dispersion statistic = 8.2E-09, p = 0.74) than the standard zero-inflated Poisson (Brooks et al. 2019). Using this error distribution improved residual diagnostic plots over a standard ZIP model and we view any deviations in the standardized residuals as minor and attributable to the smaller sample size of our food-washing data set (see Figures S4 and S5) (Hartig, 2022). We reported the summarized fixed effects tests for each GLMM in Tables S1-S3 as Analysis of Deviance Tables (Type II Wald chi square tests, one-sided) along with 𝜒2 values, degrees of freedom, and p-values (one-sided tests). Full model summaries with standard errors and confidence intervals are also included in Tables S4-S6. For all statistical analyses, we set 𝛼 = 0.05.
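      To make the role of the zero-inflation parameter concrete, the sketch below writes out the probability mass function of a zero-inflated Poisson distribution. It is only an illustration of the distribution named above, not the glmmTMB fit itself; the function names and the parameter values (`lam`, `pi`) are hypothetical.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson.

    pi  : probability of a structural zero (e.g., a handling bout with no cleaning)
    lam : mean of the Poisson component (cleaning seconds when cleaning can occur)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise either structurally or from the Poisson component.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

def zip_mean(lam, pi):
    """Expected value of the zero-inflated Poisson."""
    return (1.0 - pi) * lam

# Hypothetical parameters, chosen only for illustration.
lam, pi = 2.0, 0.3
p_zero = zip_pmf(0, lam, pi)
total = sum(zip_pmf(k, lam, pi) for k in range(60))
```

      With these illustrative values the probability of a zero is well above what a plain Poisson with the same `lam` would give, which is the pattern a zero-inflated model is meant to absorb.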

      (2) Absence of Links to Foraging Theory

      This critique has three components. The first revisits the absence of code for the optimal cleaning time model. This omission was an unfortunate error at the moment of submission, but our code is available now as a Mathematica notebook in Zenodo (https://doi.org/10.5281/zenodo.14002737). The second pivots around our scholarship, admonishing us for failing to acknowledge the marginal value theorem of Charnov (1976). It is a fair point and we have corrected the oversight with a citation to this classic paper. The third criticism is also rooted in scholarship, with Reviewer 1 asking for greater connection to the existing literature on optimal foraging theory, a point echoed in the summary assessment of the editors at eLife. This comment and the weight given to it by eLife’s editors put us in a difficult spot, as our paper is focused on the optimization of delayed gratification, not food acquisition per se. So, we are in the awkward position of gently resisting this recommendation while simultaneously agreeing with Reviewer 1 that we need to better situate our findings in the landscape of existing literature. To thread this needle, we produced Box 2 with a photograph and 410 words. This display box puts our findings into direct conversation with recent research focused on the sunk cost fallacy.

      (3) Interpretation and validity of model relative to data

      This critique is focused on the simulated brushing and washing results reported in Figure S1, along with its captioning, which was inadequate. We edited the caption to identify the author (JER) who simulated the brushing and washing behaviors of the monkeys. In addition, we clarified the number of brushing replicates (3) and washing replicates (3) for each of three treatments, for a total of 18 simulations.

      We followed Reviewer 1’s suggestion, incorporating the experimental uncertainty of grit removal into our optimal cleaning time model. We drew % grit removed values from a distribution, discounting the rare event when values ≥ 100% were drawn. As the % grit removed is used to estimate the cleaning inefficiency parameter 𝑐 for brushing and washing, the included uncertainty now allows us to evaluate these parameters as distributions; and, in turn, obtain a distribution for our predicted brushing and washing optimal cleaning times. As we now describe in the main text, the optimal cleaning times for brushing and washing are 𝑡* = 0.98 ± 0.19 s and 𝑡* = 2.40 ± 0.74 s, respectively. We are grateful for Reviewer 1’s suggestion, for it added valuable context to our model predictions. Notably, the inclusion of experimental uncertainty did not change the qualitative nature of our results, or the interpretations of our model predictions compared to observed cleaning behaviors.
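      The draw-and-discard step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the response does not name the distribution, so a normal distribution is assumed here, and the mean, standard deviation, and function name are hypothetical. The mapping from % grit removed to the parameter 𝑐 is defined in the authors' Mathematica notebook and is not reproduced.

```python
import random
import statistics

def draw_percent_removed(mean, sd, n, rng):
    """Draw n values of % grit removed, discarding any draw of 100% or more.

    Assumption: draws come from Normal(mean, sd); the source does not
    specify which distribution was used.
    """
    draws = []
    while len(draws) < n:
        x = rng.gauss(mean, sd)
        if x < 100.0:  # reject physically impossible removal values
            draws.append(x)
    return draws

rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = draw_percent_removed(mean=90.0, sd=5.0, n=10_000, rng=rng)

# Summarize the retained draws as mean and standard deviation.
summary_mean = statistics.mean(draws)
summary_sd = statistics.stdev(draws)
```

      Each retained draw would then be passed through the cleaning time model to build up a distribution of predicted optima, summarized as a mean ± standard deviation.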

      We chose to exclude variability in handling time h to generate predicted cleaning time optima, at least in the main text. Our reasoning stems from the observation that handling time variability is long-tailed, with the longer handling times associated with behaviors that we do not account for in our analysis. For example, individuals carrying multiple cucumber slices to the ocean were apt to drop them, struggling at times to re-grasp so many at once. Such moments increased handling times substantially. Still, we acted on Reviewer 1’s suggestion, accounting for the tandem effects of handling time variability and uncertainty in % grit removed (see Figure S6). Drawing handling time estimates from a log-normal distribution fitted to the handling time data, we found that these dual sources of uncertainty did not qualitatively change our results. They added further uncertainty to the predicted washing time, but the mean remains roughly equivalent. (We note that brushing is assumed to have a constant handling time––composed of only assessment time and no travel––such that the results for brushing do not change.) Both analyses are included in the Mathematica notebook (https://doi.org/10.5281/zenodo.14002737).
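      A log-normal fit of the kind mentioned above can be sketched by estimating the mean and standard deviation of the log-transformed handling times and then sampling from the fitted distribution. The handling times below are hypothetical placeholders, not the study's data, and the estimator (sample moments on the log scale) is an assumption about how such a fit might be done.

```python
import math
import random
import statistics

def fit_lognormal(times):
    """Return (mu, sigma) of log(times), using sample moments on the log scale."""
    logs = [math.log(t) for t in times]
    return statistics.mean(logs), statistics.stdev(logs)

# Hypothetical handling times in seconds; note the long right tail.
handling_times = [3.1, 4.0, 4.4, 5.2, 6.0, 7.5, 9.8, 21.0]
mu, sigma = fit_lognormal(handling_times)

rng = random.Random(1)  # fixed seed for reproducibility
simulated = [rng.lognormvariate(mu, sigma) for _ in range(5_000)]

# The median of a log-normal is exp(mu); the mean is pulled upward by the tail.
fitted_median = math.exp(mu)
fitted_mean = math.exp(mu + sigma ** 2 / 2.0)
```

      Because the fitted mean exceeds the fitted median, occasional very long handling times widen the predicted distribution without necessarily moving its center much.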

      Reviewer #2 (Public Review):

      Summary

      We have no objections to Reviewer 2’s summary of our manuscript.

      Strengths

      Reviewer 2 is extremely gracious, and we are grateful for the kind words.

      Weaknesses

      Reviewer 2 noted that our manuscript failed to provide “sufficient background on [our study] population of animals and their prior demonstrations of food-cleaning behavior or other object-handling behaviors (e.g., stone handling).” To address this comment, we edited the introduction (lines 56-58) to alert readers to the onset of regular food-cleaning behaviors sometime after December 26, 2004. In addition, we edited our methods text (lines 155-160) to highlight the onset and limited scope of prior research with this study population:

      “The animals are well habituated to human observers due to regular tourism and sustained study since 2013 (Tan et al., 2018). Most of this research has revolved around stone tool-mediated foraging on mollusks, the only activity known to elicit stone handling (Malaivijitnond et al., 2007; Gumert and Malaivijitnond, 2012, 2013; Tan et al., 2015), although infants and juveniles will sometimes use stones during object play (Tan, 2017). There has been no prior examination of food-cleaning behaviors.”

      Reviewer #3 (Public Review):

      Reviewer 3 identified three weaknesses, which we address in three paragraphs.

      Reviewer 3 questioned our methods for determining rank-dependent differences in cleaning behavior, arguing that our conclusions were unsupported. It is a fair point, and it compelled us to combine males and females into a single standardized ordinal rank of 24 individuals. This unified ranking is now reflected in the x-axes of Figure 2 and Figure S2. Plotting the data this way––see Figure S2––underscores Reviewer 3’s concern that sex and dominance rank are confounding variables. To address this problem, our GLMM included rank and sex as predictor variables, which controls for the effect of sex when assessing the relationship between rank and cleaning time across the three treatments. Reported in Tables S1-S3, these findings show that the effect of sex on either brushing or washing time was not significant. This result bolsters our original contention that rank-related variation in cleaning time overwhelms any sex differences.

      Relatedly, Reviewer 3 questioned our conclusions on the effects of rank because our study was focused on a single social group. In other words, it is plausible that our results were heavily influenced by the idiosyncrasies of select individuals, not dominance rank per se. It is a fair point, and it compelled us to include individual ID as a random effect in each of our GLMMs. Including individual ID as a random intercept allowed us to control for inter-individual variation in cleaning duration while assessing the effects of rank. An analysis based on additional social groups or longitudinal data is certainly desirable, but also well beyond the scope of a Short Report for eLife.

      Finally, Reviewer 3 objected to fragments of sentences in our abstract, introduction, and discussion, combining them into a criticism of claims that we did not and do not make. It probably wasn’t intentional, but it puts us in the awkward position of deconstructing a strawman:

      ● Reviewer 3 begins, “there is no evidence presented on the actual fitness-related costs of tooth wear or the benefits of slightly faster food consumption”. This statement is true while insinuating that collecting such evidence was our intent. To be clear, our experiment was never designed to measure tooth wear or reproductive fitness, nor do we make any claims of having done so.

      ● Reviewer 3 adds, “Support for these arguments is provided based on other papers, some of which come from highly resource-limited populations (and different species). But this is a population that is supplemented by tourists with melons, cucumbers, and pineapples!” We were puzzled over these sentences. The first fails to mention that the citations exist in our discussion. Citing relevant work in a discussion is a basic convention of scientific writing. But it seems the underlying intent of these words is to denigrate the value of our study population because two dozen tourists visit Koram Island once a day. Exclamations to the contrary, the amount of tourist-provisioned food in the diet of any one monkey is negligible.

      ● Last, Reviewer 3 commented on matters of style, objecting to “overly strong claims.” We puzzled over this criticism because the claims in question are broader points of introduction or discussion, not results. The root problem appears to be the final sentence of our abstract:

      “Dominant monkeys abstained from washing, balancing the long-term benefits of mitigating tooth wear against immediate energetic requirements, an essential predictor of reproductive fitness.”

      This sentence has three clauses. The first is a statement of results, whereas the second and third are meant to mirror our discussion on the importance of our findings. We combined the concepts into a single concluding sentence for the sake of concision, but we can appreciate how a reader could feel deceived, expecting to see data on tooth wear and fitness. So, our impression is that we are dealing with a simple misunderstanding of our own making, and that this single sentence explains Reviewer 3’s criticism and tone––it cast a long shadow over the substance of our paper. To resolve this problem, we edited the sentence:

      “Dominant monkeys abstained from washing, a choice consistent with the impulses of dominant monkeys elsewhere: to prioritize rapid food intake and greater reproductive fitness over the long-term benefits of prolonging tooth function.”

    1. eLife Assessment

      This important study characterizes the molecular signatures and function of a type of enteric neuron (IPAN) in the mouse colon, identifying molecular markers (Cdh6 and Cdh8) for these cells. A battery of compelling and comprehensive experimental findings suggests data from other species are likely translatable to mice, bridging the abundant literature from humans and other mammals into this experimentally tractable animal model. This work will be of interest to scientists studying the motor control of the colon and more generally the enteric neuromuscular system.

    2. Reviewer #1 (Public review):

      Summary:

      In their manuscript, Gomez-Frittelli and colleagues characterize the expression of cadherin6 (and -8) in colonic IPANs of mice. Moreover, they found that these cdh6-expressing IPANs are capable of initiating colonic motor complexes in the distal colon, but not proximal and midcolon. They support their claim by morphological, electrophysiological and optogenetic, and pharmacological experiments.

      Strengths:

      The work is very impressive and involves several genetic models and state-of-the-art physiological setups including respective controls. It is a very well-written manuscript that truly contributes to our understanding of GI-motility and its anatomical and physiological basis. The authors were able to convincingly answer their research questions with a wide range of methods without overselling their results.

      Weaknesses:

      The authors put quite some emphasis on stating that cdh6 is a synaptic protein (in the title and throughout the text), which interacts in a homophilic fashion. They deduce that cdh6 might be involved in IPAN-IPAN synapses (line 247ff.). However, Cdh6 does not only interact in synapses and is expressed by non-neuronal cells as well (see e.g., expression in the proximal tubuli of the kidney). Moreover, cdh6 does not only build homodimers, but also heterodimers with Cdh9 as well as Cdh7, -10, and -14 (see e.g., Shimoyama et al. 2000, DOI: 10.1042/0264-6021:3490159). It would therefore be interesting to assess the expression pattern of cdh6-proteins using immunostainings in combination with synaptic markers to substantiate the authors' claim or at least add the possibility of cell-cell-interactions other than synapses to the discussion. Additionally, an immunostaining of cdh6 would confirm if the expression of tdTomato in smooth muscle cells of the cdh6-creERT model is valid or a leaky expression (false positive).

      Comments on revisions:

      The authors have updated their manuscript and have provided insights and discussions to my remarks.

    3. Reviewer #2 (Public review):

      Summary:

      Intrinsic primary afferent neurons are an interesting population of enteric neurons that transduce stimuli from the mucosa, initiate reflexive neurocircuitry involved in motor and secretory functions, and modulate gut immune responses. The morphology, neurochemical coding, and electrophysiological properties of these cells have been relatively well described in a long literature dating back to the late 1800's but questions remain regarding their roles in enteric neurocircuitry, potential subsets with unique functions, and contributions to disease. Here, the authors provide RNAscope, immunolabeling, electrophysiological, and organ function data characterizing IPANs in mice and suggest that Cdh6 is an additional marker of these cells.

      Strengths:

      This paper would likely be of interest to the enteric neuroscience community and increases information regarding the properties of IPANs in mice. These data are useful and suggest that prior data from studies of IPANs in other species are likely translatable to mice.

      Weaknesses:

      Major weaknesses:

      (1) The novelty of this study is relatively limited. The main point of novelty suggests an additional marker of IPANs (Cdh6) that would add to the known list of markers for these cells. How useful this would be is unclear. Other main findings basically confirm that IPANs in mice display the same classical characteristics that have been known for many years from studies in guinea pigs, rats, mice and humans.

      (2) Critical controls are needed to support the optogenetic experiments. Control experiments are needed to show that ChR2 expression 1) does not change the baseline properties of the neurons, 2) that stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons, and 3) that stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions focused on here. These essential controls remain absent in the study and limit confidence in the data derived from this model.

      (3) The motor effects observed in optogenetic experiments are difficult to understand in the absence of good controls for optogenetic control of the proposed neuron population (discussed above). It remains unclear how stimulating IPANs in the distal colon would generate retrograde CMCs while stimulating IPANs in the proximal colon did nothing. Key controls confirming that the optogenetic stimulus was adequate, specific, and relevant are needed. In addition, better characterization of the Cdh6+ population of cells in both regions would be needed to understand the mechanisms underlying these effects.

      (4) From the data shown, it is clear that expression driven by the Cdh6CreERT2 driver is not confined to IPANs. There is obviously expression of GFP and ChR2 in smooth muscle cells. This is a major limitation for the physiological experiments that attempt to use this model to specifically stimulate IPANs and assess changes in gut motor function. Better characterization of this model is needed and control experiments are necessary to assess whether functional ChR2 is expressed in cells beyond the proposed subtype of enteric IPANs.

      (5) Some of the main conclusions of this study are overstated and claims of priority are made that are not true. For example, the authors state on lines 27-28 of the abstract that their findings provide the "first demonstration of selective activation of a single neurochemical and functional class of enteric neurons". This is certainly not true since Gould et al (AJP-GIL 2019) expressed ChR2 in nitrergic enteric neurons and showed that activating those cells disrupted CMC activity. In fact, prior work by the authors themselves (Hibberd et al Gastro 2018) showed that activating calretinin neurons with ChR2 evoked motor responses. Work by other groups has used chemogenetics and optogenetics to show effects of activating multiple other classes of neurons in the gut.

      (6) The electrophysiological characterization of mouse IPANs is useful but is limited to a small subset of Cdh6+ neurons in the distal colon myenteric plexus. Therefore, it remains unclear how well the properties reported here might reflect those of other Cdh6+ IPANs in the same or different regions. Similarly, blocking IH with ZD7288 affects all IPANs and does not add specific information regarding the role of the proposed Cdh6+ subtype.

      (7) The submucosal plexus (SMP) also contains enteric IPANs and these were not included in the analysis of Cdh6 expression. Whether or not the proposed IPAN marker Cdh6 would be useful for identifying or targeting those cells remains unclear.

      [Editor's note: The Reviewing Editor considers that further controls requested from the reviewers have largely been provided already in prior publications by other groups, as they concern specifically tools published years ago but in a different tissue context. Hence the methodology used to deliver the results reported here falls within the standard practices in the field. The comprehensive, multi-technique approach to the results is compelling in and of itself, and ought to suffice, rendering this work reproducible and therefore a basis for further research.]

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      In their manuscript, Gomez-Frittelli and colleagues characterize the expression of cadherin6 (and -8) in colonic IPANs of mice. Moreover, they found that these cdh6-expressing IPANs are capable of initiating colonic motor complexes in the distal colon, but not proximal and midcolon. They support their claim by morphological, electrophysiological, optogenetic, and pharmacological experiments.

      Strengths:

      The work is very impressive and involves several genetic models and state-of-the-art physiological setups including respective controls. It is a very well-written manuscript that truly contributes to our understanding of GI-motility and its anatomical and physiological basis. The authors were able to convincingly answer their research questions with a wide range of methods without overselling their results.

      We greatly appreciate the reviewer’s time, careful reading and support of our study.

      Weaknesses:

      The authors put quite some emphasis on stating that cdh6 is a synaptic protein (in the title and throughout the text), which interacts in a homophilic fashion. They deduce that cdh6 might be involved in IPAN-IPAN synapses (line 247ff.). However, Cdh6 does not only interact in synapses and is expressed by non-neuronal cells as well (see e.g., expression in the proximal tubuli of the kidney). Moreover, cdh6 does not only build homodimers, but also heterodimers with Cdh9 as well as Cdh7, -10, and -14 (see e.g., Shimoyama et al. 2000, DOI: 10.1042/0264-6021:3490159). It would therefore be interesting to assess the expression pattern of cdh6-proteins using immunostainings in combination with synaptic markers to substantiate the authors' claim or at least add the possibility of cell-cell-interactions other than synapses to the discussion. Additionally, an immunostaining of cdh6 would confirm if the expression of tdTomato in smooth muscle cells of the cdh6-creERT model is valid or a leaky expression (false positive).

      We agree with the reviewer that Cdh6 could be mediating some other cell-cell interaction besides synapses between IPANs, and we noted it in the discussion. Cdh6 primarily forms homodimers but, as the reviewer points out, has been known to also form heterodimers with some other cadherins. We performed RNAscope in the colonic myenteric plexus with Cdh7 and found no expression (data not shown). Cdh10 is suggested to have very low expression (Drokhlyansky et al., 2020), possibly in putative secretomotor vasodilator neurons, and Cdh14 has not been assayed in any RNAseq screens. We attempted to visualize Cdh6 protein via antibody staining (Duan et al., 2018) but our efforts did not result in sufficient signal or resolution to identify synapses in the ENS, which remain broadly challenging to assay. Similarly, neither immunostaining with the Cdh6 antibody nor RNAscope was able to confirm Cdh6 expression in tdT-expressing muscle cells. We have addressed these caveats in the discussion section.

      (1) E. Drokhlyansky, C. S. Smillie, N. V. Wittenberghe, M. Ericsson, G. K. Griffin, G. Eraslan, D. Dionne, M. S. Cuoco, M. N. Goder-Reiser, T. Sharova, O. Kuksenko, A. J. Aguirre, G. M. Boland, D. Graham, O. Rozenblatt-Rosen, R. J. Xavier, A. Regev, The Human and Mouse Enteric Nervous System at Single-Cell Resolution. Cell 182, 1606-1622.e23 (2020).

      (2) X. Duan, A. Krishnaswamy, M. A. Laboulaye, J. Liu, Y.-R. Peng, M. Yamagata, K. Toma, J. R. Sanes, Cadherin Combinations Recruit Dendrites of Distinct Retinal Neurons to a Shared Interneuronal Scaffold. Neuron 99, 1145-1154.e6 (2018).

      Reviewer #2 (Public review):

      Summary:

      Intrinsic primary afferent neurons are an interesting population of enteric neurons that transduce stimuli from the mucosa, initiate reflexive neurocircuitry involved in motor and secretory functions, and modulate gut immune responses. The morphology, neurochemical coding, and electrophysiological properties of these cells have been relatively well described in a long literature dating back to the late 1800's but questions remain regarding their roles in enteric neurocircuitry, potential subsets with unique functions, and contributions to disease. Here, the authors provide RNAscope, immunolabeling, electrophysiological, and organ function data characterizing IPANs in mice and suggest that Cdh6 is an additional marker of these cells.

      Strengths:

      This paper would likely be of interest to a focused enteric neuroscience audience and increase information regarding the properties of IPANs in mice. These data are useful and suggest that prior data from studies of IPANs in other species are likely translatable to mice.

      We appreciate the reviewer’s support of our study and insightful critiques for its improvement.

      Weaknesses:

      The advance presented here beyond what is already known is minimal. Some of the core conclusions are overstated and there are multiple other major issues that limit enthusiasm. Key control experiments are lacking and data do not specifically address the properties of the proposed Cdh6+ population.

      Major weaknesses:

      (1) The novelty of this study is relatively low. The main point of novelty suggests an additional marker of IPANs (Cdh6) that would add to the known list of markers for these cells. How useful this would be is unclear. Other main findings basically confirm that IPANs in mice display the same classical characteristics that have been known for many years from studies in guinea pigs, rats, mice and humans.

      We appreciate the already existing markers for IPANs in the ENS and the existing literature characterizing these neurons. The primary intent of this study was to use these well-established characteristics of IPANs in both mice and other species to characterize Cdh6-expressing neurons in the mouse myenteric plexus and confirm their classification as IPANs.

      (2) Some of the main conclusions of this study are overstated and claims of priority are made that are not true. For example, the authors state in lines 27-28 of the abstract that their findings provide the "first demonstration of selective activation of a single neurochemical and functional class of enteric neurons". This is certainly not true since Gould et al (AJP-GIL 2019) expressed ChR2 in nitrergic enteric neurons and showed that activating those cells disrupted CMC activity. In fact, prior work by the authors themselves (Hibberd et al., Gastro 2018) showed that activating calretinin neurons with ChR2 evoked motor responses. Work by other groups has used chemogenetics and optogenetics to show the effects of activating multiple other classes of neurons in the gut.

      We thank the reviewer for bringing up this important point and apologize if our wording was not clear. Whilst single neurochemical classes of enteric neurons have been manipulated to alter gut functions, all such instances to date do not represent manipulation of a single functional class of enteric neurons. In the given examples, multiple functional classes are activated utilizing the same neurotransmitter, as NOS and calretinin are each expressed to varying degrees across putative motor neurons, interneurons and IPANs. In contrast, Cdh6 is restricted to IPANs and therefore this study is the first optogenetic investigation of enteric neurons from a single putative functional class. Our abstract and discussion emphasize this point and differentiate this study from previous ones.

      (3) Critical controls are needed to support the optogenetic experiments. Control experiments are needed to show that ChR2 expression a) does not change the baseline properties of the neurons, b) that stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons, and c) that stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions focused on here.

      We completely agree controls are essential. However, our paper is not the first to express ChR2 in enteric neurons. Authors of our paper have shown in Hibberd et al. 2018 that expression of ChR2 in a heterogeneous population of myenteric neurons did not change network properties of the myenteric plexus. This was demonstrated by the lack of change in control CMC characteristics in mice expressing ChR2 under basal conditions (without blue light exposure). Regarding question (b), whether stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons, we show the restricted expression of ChR2 in IPANs and that motor responses (to blue light) are blocked by selective nerve conduction blockade.

      Regarding question (c), whether stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions, we would not expect each region of the gut to behave comparably. This is because the different gut regions (i.e. proximal, mid, distal) are very different anatomically, as is the anatomy of the myenteric plexus and myenteric ganglia between each region, including the density of IPANs within each ganglion, in addition to the presence of different patterns of electrical and mechanical activity [Spencer et al., 2020]. Hence, it is difficult to expect that stimulation of ChR2 should induce similar physiological responses between regions. The motor output we record in our study (CMCs) is a unified motor program that involves the temporal coordination of hundreds of thousands of enteric neurons and a complex neural circuit that we have previously characterized [Spencer et al., 2018]. However, no study until now has been able to selectively stimulate a single functional class of enteric neurons (with light) to avoid indiscriminate activation of other classes of neurons.

      (1) T. J. Hibberd, J. Feng, J. Luo, P. Yang, V. K. Samineni, R. W. Gereau, N. Kelley, H. Hu, N. J. Spencer, Optogenetic Induction of Colonic Motility in Mice. Gastroenterology 155, 514-528.e6 (2018).

      (2) N. J. Spencer, L. Travis, L. Wiklendt, T. J. Hibberd, M. Costa, P. Dinning, H. Hu, Diversity of neurogenic smooth muscle electrical rhythmicity in mouse proximal colon. American Journal of Physiology-Gastrointestinal and Liver Physiology 318, G244–G253 (2020).

      (3) N. J. Spencer, T. J. Hibberd, L. Travis, L. Wiklendt, M. Costa, H. Hu, S. J. Brookes, D. A. Wattchow, P. G. Dinning, D. J. Keating, J. Sorensen, Identification of a Rhythmic Firing Pattern in the Enteric Nervous System That Generates Rhythmic Electrical Activity in Smooth Muscle. The Journal of Neuroscience 38, 5507–5522 (2018).

      (4) The electrophysiological characterization of mouse IPANs is useful but this is a basic characterization of any IPAN and really says nothing specifically about Cdh6+ neurons. The electrophysiological characterization was also only done in a small fraction of colonic IPANs, and it is not clear if these represent cell properties in the distal colon or proximal colon, and whether these properties might be extrapolated to IPANs in the different regions. Similarly, blocking IH with ZD7288 affects all IPANs and does not add specific information regarding the role of the proposed Cdh6+ subtype.

      Our electrophysiological characterization was guided to be within a subset of Cdh6+ neurons by Hb9:GFP expression. As in the prior comment (1) above, we used these experiments to confirm classification of Cdh6+ (Hb9:GFP+) neurons in the distal colon as IPANs. We have clarified in the results and methods that these experiments were performed in the distal colon and agree that we cannot extrapolate that these properties are also representative of IPANs in the proximal colon. We apologize that this was confusing. Finally, we agree with the reviewer that ZD7288 affects all IPANs in the ENS and have clarified this in the text.

      (5) Why SMP IPANs were not included in the analysis of Cdh6 expression is a little puzzling. IPANs are present in the SMP of the small intestine and colon, and it would be useful to know if this proposed marker is also present in these cells.

      We agree with the reviewer. In addition to characterizing Cdh6 in the myenteric plexus, it would be interesting to query if sensory neurons located within the SMP also express Cdh6. Our preliminary data (n=2) show ~6-12% tdT/Hu neurons in Cdh6-tdT ileum and colon (data not shown). We have added a sentence to the discussion.

      (6) The emphasis on IH being a rhythmicity indicator seems a bit premature. There is no evidence to suggest that IH and IT are rhythm-generating currents in the ENS.

      Regarding the statement that there is no evidence to suggest that IH and IT are rhythm-generating currents in the ENS, we agree with the reviewer that rhythm generation by IH and IT in the ENS has not been explicitly confirmed. We are confident the reviewer agrees that an absence of evidence is not evidence of absence, although the presence of IH has been well described in enteric neurons. We have modified the text in the results to indicate more clearly that IH and IT are known to participate in rhythm generation in thalamocortical circuits, though their roles in the ENS remain unknown. Our discussion of the potential role of IH or IT in rhythm generation or oscillatory firing of the ENS is constrained to speculation in the discussion section of the text.

      (7) As the authors point out in the introduction and discuss later on, Type II Cadherins such as Cdh6 bind homophilically to the same cadherin at both pre- and post-synapse. The apparent enrichment of Cdh6 in IPANs would suggest extensive expression in synaptic terminals that would also suggest extensive IPAN-IPAN connections unless other subtypes of neurons express this protein. Such synaptic connections are not typical of IPANs and raise the question of whether or not IPANs actually express the functional protein and if so, what might be its role. Not having this information limits the usefulness of this as a proposed marker.

      We agree with the reviewer that the proposed IPAN-IPAN connection is novel although it has been proposed before (Kunze et al., 1993). As detailed in our response to Reviewer #1, we attempted to confirm Cdh6 protein expression, but were unsuccessful, due to insufficient signal and resolution. We therefore discuss potential IPAN interconnectivity in the discussion, in the context of contrasting literature.

      (1) W. A. A. Kunze, J. B. Furness, J. C. Bornstein, Simultaneous intracellular recordings from enteric neurons reveal that myenteric ah neurons transmit via slow excitatory postsynaptic potentials. Neuroscience 55, 685–694 (1993).

      (8) Experiments shown in Figures 6J and K use a tethered pellet to drive motor responses. By definition, these are not CMCs as stated by the authors.

      The reviewer makes a valid criticism as to the terminology, since tethered pellet experiments do not record propagation. We believe the periodic bouts of propulsive force on the pellet are triggered by the same activity underlying the CMC. In our experience, these activities have similar periodicity, force and identical pharmacological properties. Consistent with this, we also tested full colons (n = 2) set up for typical CMC recordings by multiple force transducers, finding that CMCs were abolished by ZD7288, similar to fixed pellet recordings (data not shown).

      (9) The data from the optogenetic experiments are difficult to understand. How would stimulating IPANs in the distal colon generate retrograde CMCs and stimulating IPANs in the proximal colon do nothing? Additional characterization of the Cdh6+ population of cells is needed to understand the mechanisms underlying these effects.

      We agree that the different optogenetic responses in the proximal and distal colon are challenging to interpret, but perhaps not surprising in the wider context. It is possible that the different optogenetic responses in this study reflect not only regional differences in the Cdh6+ neuronal populations, but also differences in neural circuits within these gut regions. A study some time ago by the authors showed that electrical stimulation of the proximal mouse colon was unable to evoke a retrograde (aborally) propagating CMC (Spencer, Bywater, 2002), but stimulation of the distal colon was readily able to. We concluded that at the oral lesion site there is a preferential bias of descending inhibitory nerve projections, since the ascending excitatory pathways have been cut off. In contrast, stimulation of the distal colon was readily able to activate an ascending excitatory neural pathway, and hence induce the complex CMC circuits required to generate an orally propagating CMC. Indeed, other recent studies have added to a growing body of evidence for significant differences in the behaviors and neural circuits of the two regions (Li et al., 2019, Costa et al., 2021a, Costa et al., 2021b, Nestor-Kalinoski et al., 2022). We have expanded this discussion.

      (1) N. J. Spencer, R. A. Bywater, Enteric nerve stimulation evokes a premature colonic migrating motor complex in mouse. Neurogastroenterology & Motility 14, 657–665 (2002).

      (2) Li Z, Hao MM, Van den Haute C, Baekelandt V, Boesmans W, Vanden Berghe P, Regional complexity in enteric neuron wiring reflects diversity of motility patterns in the mouse large intestine. Elife 8:e42914 (2019).

      (3) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Dinning PG, Brookes SJ, Spencer NJ, Motor patterns in the proximal and distal mouse colon which underlie formation and propulsion of feces. Neurogastroenterology & Motility e14098 (2021a).

      (4) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Smolilo DJ, Dinning PG, Brookes SJ, Spencer NJ, Characterization of alternating neurogenic motor patterns in mouse colon. Neurogastroenterology & Motility 33:e14047 (2021b).

      (5) Nestor-Kalinoski A, Smith-Edwards KM, Meerschaert K, Margiotta JF, Rajwa B, Davis BM, Howard MJ, Unique Neural Circuit Connectivity of Mouse Proximal, Middle, and Distal Colon Defines Regional Colonic Motor Patterns. Cellular and Molecular Gastroenterology and Hepatology 13:309-337.e303 (2022).

      Recommendations for the Authors:

      Reviewer #1 (Recommendations for the authors):

      As mentioned above, immunolocalization of cdh6 would be helpful to substantiate the claims regarding IPAN-IPAN synapses.

      As mentioned in our response to both reviewers’ public reviews, we attempted to visualize Cdh6 protein via antibody staining (Duan et al., 2018), but our efforts did not result in sufficient signal or resolution to identify Cdh6+ synapses.

      (1) X. Duan, A. Krishnaswamy, M. A. Laboulaye, J. Liu, Y.-R. Peng, M. Yamagata, K. Toma, J. R. Sanes, Cadherin Combinations Recruit Dendrites of Distinct Retinal Neurons to a Shared Interneuronal Scaffold. Neuron 99, 1145-1154.e6 (2018).

      Reviewer #2 (Recommendations for the authors):

      (1) The authors repeatedly refer to IPANs as "sensory" neurons (e.g. in title, abstract, and introduction) but there is some debate regarding whether these cells are truly "sensory" because the information they convey never reaches sensory perception. This is why they have classically been referred to as intrinsic primary afferent (IPAN) neurons. It would be more appropriate to stick with this terminology unless the authors have compelling data showing that information detected by IPANs reaches the sensory cortex.

      We thank the reviewer for their comment, but respectfully disagree. The term “sensory neuron” is well established in the ENS. The first definitive proof that “sensory neurons” exist in the ENS was published in Kunze et al., 1995. We note that this paper did not use the word “IPAN” but used the term “sensory neuron”. Furthermore, mechanosensory neurons were published in Spencer and Smith (2004).

      Regarding the reviewer’s comment that the authors would need compelling data showing that information detected by IPANs reaches the sensory cortex before the term “sensory neuron” should be valid, it is important to note that many sensory neurons do not provide direct information to the cortex.

      (1) W. A. A. Kunze, J. C. Bornstein, J. B. Furness, Identification of sensory nerve cells in a peripheral organ (the intestine) of a mammal. Neuroscience 66, 1–4 (1995).

      (2) N. J. Spencer, T. K. Smith, Mechanosensory S-neurons rather than AH-neurons appear to generate a rhythmic motor pattern in guinea-pig distal colon. The Journal of Physiology 558, 577–596 (2004).

      (2) Important information regarding the gut region shown and other details are absent from many figure legends.

      We apologize for this omission. We have updated the figure legends to include information on gut regions.

    1. eLife Assessment

      This valuable study reports on the critical role of ANKRD5 (ANKEF1) in sperm motility and male fertility. However, the supporting data remain incomplete. This work will be of interest to biomedical researchers working in sperm biology and andrologists.

    2. Reviewer #1 (Public review):

      Summary:

      Asthenospermia, characterized by reduced sperm motility, is one of the major causes of male infertility. The "9 + 2" arranged MTs and over 200 associated proteins constitute the axoneme, the molecular machine for flagellar and ciliary motility. Understanding the physiological functions of axonemal proteins, particularly their links to male infertility, could help uncover the genetic causes of asthenospermia and improve its clinical diagnosis and management. In this study, the authors generated Ankrd5 null mice and found that ANKRD5-/- males exhibited reduced sperm motility and infertility. Using FLAG-tagged ANKRD5 mice, mass spectrometry, and immunoprecipitation (IP) analyses, they confirmed that ANKRD5 is localized within the N-DRC, a critical protein complex for normal flagellar motility. However, transmission electron microscopy (TEM) and cryo-electron tomography (cryo-ET) of sperm from Ankrd5 null mice did not reveal any structural abnormalities.

      Strengths:

      The phenotypes observed in ANKRD5-/- mice, including reduced sperm motility and male infertility, are convincing. The authors demonstrated that ANKRD5 is an N-DRC protein that interacts with TCTE1 and DRC4. Most of the experiments are thoughtfully designed and well executed.

      Weaknesses:

      The cryo-FIB and cryo-ET analyses require further investigation, as detailed below. The molecular mechanism by which the loss of ANKRD5 affects sperm flagellar motility remains unclear. The current conclusion that Ankrd5 knockout reduces axoneme stability is not well-supported. Specifically, are other axonemal proteins diminished in Ankrd5 knockout sperm? Conducting immunofluorescence analyses and revisiting the quantitative proteomics data may help address these questions.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript investigates the role of ANKRD5 (ANKEF1) as a component of the N-DRC complex in sperm motility and male fertility. Using Ankrd5 knockout mice, the study demonstrates that ANKRD5 is essential for sperm motility and identifies its interaction with N-DRC components through IP-mass spectrometry and cryo-ET. The results provide insights into ANKRD5's function, highlighting its potential involvement in axoneme stability and sperm energy metabolism.

      Strengths:

      The authors employ a wide range of techniques, including gene knockout models, proteomics, cryo-ET, and immunoprecipitation, to explore ANKRD5's role in sperm biology.

      Weaknesses:

      (1) Limited Citations in Introduction: Key references on the role of N-DRC components (e.g., DRC1, DRC2, DRC3, DRC5) in male infertility are missing, which weakens the contextual background.

      (2) Lack of Functional Insights: While interacting proteins outside the N-DRC complex were identified, their potential roles and interactions with ANKRD5 are not adequately explored or discussed.

      (3) Mitochondrial Function Uncertainty: Immunofluorescence suggests possible mitochondrial localization for ANKRD5, but experiments on its role in energy metabolism (e.g., ATP production, ROS) are insufficient, especially given the observed sperm motility defects.

      (4) Glycolysis Pathway Impact: Proteomic analysis indicates glycolysis pathway disruptions in Ankrd5-deficient sperm, but the link between these changes and impaired motility is not well explained.

      (5) Cryo-ET Data Limitations: The structural analysis of the DMT lacks clarity on how ANKRD5 influences N-DRC or RS3. The low quality of RS3 data hinders the interpretation of ANKRD5's impact on axoneme structure.

      (6) Discussion of Findings: The manuscript could benefit from a deeper discussion on the broader implications of ANKRD5's interactions and its role in sperm energy metabolism and motility mechanisms.

    4. Author response:

      Thank you for the constructive feedback from the reviewers. We are grateful for their insights and are committed to addressing the key concerns raised in the public reviews through the following revisions:

      (1) Validating Axoneme Stability Claims

      We have procured new antibodies for DRC11, as well as marker proteins for ODA, IDA, and RS. We will conduct quantitative immunofluorescence staining to validate our claims regarding axoneme stability.

      (2) Investigating ANKRD5 Expression in Other Ciliated Cells

      We plan to examine the expression of ANKRD5 in mouse respiratory cilia to determine whether it is also expressed in these cells.

      (3) Supplementing Key Citations for N-DRC Components

      We will add references to published studies on N-DRC components (e.g., DRC1, DRC2, DRC3, DRC5) associated with male infertility in the Introduction to strengthen the background context.

      (4) Further Analysis and Validation of ANKRD5 Interactome

      We will conduct additional analyses and validation of the interactome of ANKRD5 detected by LC-MS.

      (5) Elucidating the Function of ANKRD5 in Mitochondria

      We will further investigate the role of ANKRD5 in mitochondrial function.

      (6) Investigating Mitochondrial Function and Energy Metabolism

      We will further explore the role of ANKRD5 in mitochondrial function and energy metabolism.

      (7) Improving Cryo-ET Data Quality and Interpretation

      We will attempt to further improve the quality of the STA results and try to calculate the DMT structure with a period of 96 nm. We will also use the WT density map with the same period to generate a difference map.

      (8) Expanding Discussion and Correcting Terminology

      The Discussion section will be revised to elaborate on the implications of ANKRD5 for male contraceptive research, particularly in targeting sperm motility. We will also correct terminology inaccuracies (e.g., changing "9+2 microtubule doublet" to "9+2 structure") and address formatting issues (e.g., capitalizing "Control").

      Response to Reviewer #2 Comment 4:

      We appreciate the reviewer's careful consideration of our proteomic data. However, our Gene Set Enrichment Analysis (GSEA) of glycolysis/gluconeogenesis pathways showed no significant enrichment (p-value=0.089, NES=0.708; Fig.6D), which does not meet the statistical thresholds for biological significance (|NES|>1, p-value<0.05). This observation is further corroborated by our direct ATP measurements showing no difference between genotypes (Fig.6E). We agree that further studies on metabolic regulation could be valuable, but current evidence does not support glycolysis disruption as a primary mechanism for the motility defects observed in Ankrd5-null sperm. This misinterpretation likely arose from the reviewer's overinterpretation of non-significant proteomic trends. We request that this specific claim be excluded from the assessment to avoid misleading readers.

      We will provide a comprehensive point-by-point response, along with detailed experimental data and revised figures, in the resubmitted manuscript. Thank you once again for the opportunity to address the reviewers' concerns. We are confident that these revisions will strengthen our manuscript and contribute to the scientific community.

    1. eLife Assessment

      This study demonstrates the critical role of Afadin in the generation and maintenance of complex cellular layers in the mouse retina. The data are solid and provide important insights into how cell-adhesion molecules contribute to retinal organization. However, further investigations are needed to clarify the mechanisms underlying the cellular disorganization phenotype in the retina and axonal projection to the brain.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors examined the role of Afadin, a key adaptor protein associated with cell-adhesion molecules, in retinal development. Using a conditional knockout mouse line (Six3-Cre; AfadinF/F), the authors successfully characterized a disorganized pattern of various neuron types in the mutant retinae. Despite these altered distributions, the retinal neurons maintained normal cell numbers and seemingly preserved some synaptic connections. Notably, tracing results indicated mistargeting of retinal ganglion cell (RGC) axon projections to the superior colliculus, and electroretinography (ERG) analyses suggested deficits in visual functions.

      Strengths:

      This compelling study provides solid evidence addressing the important question of how cell-adhesion molecules influence neuronal development. Compared to previous research conducted in other parts of the central nervous system (CNS), the clearly defined lamination of cell types in the retina serves as a unique model for studying the aberrant neuronal localizations caused by Afadin knockout. The data suggest that cell-cell interactions are critical for retinal cellular organization and proper axon pathfinding, while aspects of cell fate determination and synaptogenesis remain less understood. This work has broad implications not only for retinal studies but also for developmental biology and regenerative medicine.

      Weaknesses:

      While the phenotypes observed in the Afadin knockout (cKO) mice are intriguing, I would expect to see evidence confirming that Afadin is indeed knocked out in the retina through immunostaining. Specifically, is Afadin knocked out only in certain retinal regions and not others, as suggested by Figures 4A-B? Are Afadin levels different among distinct neuron types, which could mean that its knockout may have a more pronounced impact on certain cell types, such as rods compared to others?

      The authors suggest that synapses may form between canonical synaptic partners, based on the proximity of their processes (Figure 2). However, more solid evidence is needed to verify these synapses through the use of synaptic marker staining or transsynaptic labeling before drawing further conclusions.

      Although the Afadin cKO mice displayed dramatic phenotypes, additional experiments are necessary to clarify the details of this process. By manipulating Afadin levels in specific cell types or at different developmental time points, we could gain a better understanding of how Afadin regulates accurate retinal lamination and axonal projection.

    3. Reviewer #2 (Public review):

      Summary:

      This study by Lum and colleagues reports on the role of Afadin, a cytosolic adapter protein that organizes multiple cell adhesion molecule families, in the generation and maintenance of complex cellular layers in the mouse retina. They used a conditional deletion approach, removing Afadin in retinal progenitors, which allowed them to analyze broad effects on retinal neuron development.

      The study presents high-quality and extensive characterization of the cellular phenotypes, supporting the main conclusions of the paper. They show that Afadin loss results in significant disorganization of the retinal cellular layers and the neuropil, producing rosettes and displacement of cells away from their resident layers. The major classes of neurons in the inner retina are affected, and some neurons are, remarkably, displaced to the other side of the inner plexiform layer. Nevertheless, they mostly target their synaptic partners, including the RGCs to distant retinorecipient targets in the brain. The main conclusions are as follows. Afadin is necessary for establishing and maintaining the retinal architecture. It is not necessary for the generation of the correct numbers/densities of retinal neuron subtypes. Moreover, Afadin loss preserves associations between known synaptic partners and preserves axonal targeting to retinorecipient layers. The consequences on photoreceptor viability and visual processing are also interesting, underscoring the essential function for maintaining retinal structure and function. Overall, the main conclusions describing the consequences are supported by the results.

      Strengths:

      The study provides new knowledge on the requirement of Afadin in retinal development. The introduction and discussion effectively set up the rationale for this work, and place it in the context of previous studies of Afadin in other regions of the CNS.

      The study presents high-quality and extensive characterizations of the cellular phenotypes resulting from Afadin loss. By analyzing various aspects of retinal organization - from cellular densities to axon targeting to the brain - the study narrows down the role of a structure for promoting the establishment of the layers, or maintenance. The data are straightforward and convincing, and the interpretations are bounded by the data shown (though minor weakness re. survival). Another important finding is that the targeting of retinal neuron processes to synaptic partners, including retinorecipient targets in the brain, is intact.

      The study is important as it establishes a focused requirement for Afadin to set up and preserve the overall cellular organizations within the retinal tissue. The demonstration that Afadin is needed for photoreceptor viability and overall visual function enhances impact by establishing its functional importance.

      The manuscript is well written and presented. The images are attractive and compelling, and the figures are well organized.

      Weaknesses:

      (1) Expanding on the developmental mechanism is beyond the scope of the study, and would not add to the main conclusions. However, the manuscript would be improved by providing more clarity on the developmental emergence of the defects. The study left me questioning whether the rosettes and cell displacements occur during earlier stages of retina development, or are progressive. For instance, do the RGCs migrate and establish within the GCL correctly at first, and then are displaced with the progressive disorganization? Or are they disorganized and delaminate en route? Images of RGC staining at P0, or earlier during their migration, would be informative. Data in Figure 1 is limited to DAPI staining at P7. Figure 4 shows an image of rod photoreceptors at P7, with their displacement in the GCL layer (and not contained within a rosette). Are the progenitors mislocalized due to delamination?

      A few additional thoughts on how these defects compare to other mutants with rosettes might give us more context for understanding the results.

      (2) The manuscript reports that the densities of major inner retinal classes are unaffected. There are a few details missing for this point. How were the cell densities quantified (in terms of ROI size), and normalized? This information is lacking in the methods. There is a striking thickening of the GCL in the DAPI-labeled images shown in Figure 1. What are these cells?

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this study, the authors examined the role of Afadin, a key adaptor protein associated with cell-adhesion molecules, in retinal development. Using a conditional knockout mouse line (Six3-Cre; AfadinF/F), the authors successfully characterized a disorganized pattern of various neuron types in the mutant retinae. Despite these altered distributions, the retinal neurons maintained normal cell numbers and seemingly preserved some synaptic connections. Notably, tracing results indicated mistargeting of retinal ganglion cell (RGC) axon projections to the superior colliculus, and electroretinography (ERG) analyses suggested deficits in visual functions.

      Thank you for the summary and highlights of our study. We appreciate the input from Reviewer 1 and the Editor on this study, with focus on laminar choices, synaptic choices and axonal projections.

      Strengths:

      This compelling study provides solid evidence addressing the important question of how cell-adhesion molecules influence neuronal development. Compared to previous research conducted in other parts of the central nervous system (CNS), the clearly defined lamination of cell types in the retina serves as a unique model for studying the aberrant neuronal localizations caused by Afadin knockout. The data suggest that cell-cell interactions are critical for retinal cellular organization and proper axon pathfinding, while aspects of cell fate determination and synaptogenesis remain less understood. This work has broad implications not only for retinal studies but also for developmental biology and regenerative medicine.

      Weaknesses:

      While the phenotypes observed in the Afadin knockout (cKO) mice are intriguing, I would expect to see evidence confirming that Afadin is indeed knocked out in the retina through immunostaining. Specifically, is Afadin knocked out only in certain retinal regions and not others, as suggested by Figures 4A-B? Are Afadin levels different among distinct neuron types, which could mean that its knockout may have a more pronounced impact on certain cell types, such as rods compared to others?

      The authors suggest that synapses may form between canonical synaptic partners, based on the proximity of their processes (Figure 2). However, more solid evidence is needed to verify these synapses through the use of synaptic marker staining or transsynaptic labeling before drawing further conclusions.

      Although the Afadin cKO mice displayed dramatic phenotypes, additional experiments are necessary to clarify the details of this process. By manipulating Afadin levels in specific cell types or at different developmental time points, we could gain a better understanding of how Afadin regulates accurate retinal lamination and axonal projection.

      Regarding the antibody confirming the knockout, we tested the commercially available antibody from Sigma but weren’t able to confirm its specificity. There was a homemade antibody from another Japan-based laboratory, but it was not available to share at the time the study was conducted. Nonetheless, the original allele was derived for hippocampal and cortical studies by Louis Reichardt’s Lab (UCSF), with verified efficacies of the KO allele.

      Regarding phenotypical penetrance, this likely comes from the mosaicism of the clone and the symmetric cell division, leading to a rosette-like structure. At this moment, we reason that Afadin KO does NOT lead to direct neuronal loss, and the selective rod loss may derive from other issues, but we lack direct evidence to validate this point.

      With regard to the specific neuronal types and synaptic pairs, we acknowledge the limitations of the current Figure 2 in linking the mutant phenotypes to circuit changes. However, the current genetic reagents (Six3Cre) are not compatible with neuron-type-specific or synaptic labeling; that is, cell-type-specific Cre lines and additional Cre-dependent AAV tools would be needed. To do so, we will need to initiate cell-type-specific breeding of transgenic markers such as Hb9GFP for ooDSGCs, or Chat-Cre and VGlut3-Cre for starburst amacrine cells and vG3 amacrine cells, respectively, followed by retinal physiology. These experiments require multi-allelic genetic crosses with a very low breeding yield (1/16 or 1/32 Mendelian ratio). These extensive genetic tests are beyond the scope of the current manuscript.
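
      The breeding yields quoted above follow from multiplying per-locus Mendelian probabilities. The sketch below is illustrative only: the specific cross scheme (which parent carries which allele) is an assumption and is not stated in the response, the helper name `expected_yield` is hypothetical, and the loci are assumed to segregate independently.

```python
from fractions import Fraction


def expected_yield(per_locus_probabilities):
    """Multiply per-locus probabilities, assuming unlinked (independent) loci."""
    total = Fraction(1)
    for p in per_locus_probabilities:
        total *= p
    return total


# Hypothetical cross (an assumption, not taken from the manuscript):
#   Cre driver, hemizygous parent x non-carrier  -> 1/2 of pups carry Cre
#   Afadin F/+ x F/+                             -> 1/4 of pups are F/F
#   reporter, hemizygous parent x non-carrier    -> 1/2 of pups carry it
three_allele = expected_yield([Fraction(1, 2), Fraction(1, 4), Fraction(1, 2)])

# Adding one more hemizygous allele (e.g. a second Cre or marker) halves it.
four_allele = three_allele * Fraction(1, 2)

print(three_allele, four_allele)
```

      With these assumed per-locus probabilities, roughly one pup in 16 or one in 32 would carry the full genotype, so even large litters would yield very few experimental animals.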

      Reviewer #2 (Public review):

      Summary:

      This study by Lum and colleagues reports on the role of Afadin, a cytosolic adapter protein that organizes multiple cell adhesion molecule families, in the generation and maintenance of complex cellular layers in the mouse retina. They used a conditional deletion approach, removing Afadin in retinal progenitors, which allowed them to analyze broad effects on retinal neuron development.

      The study presents high-quality and extensive characterization of the cellular phenotypes, supporting the main conclusions of the paper. They show that Afadin loss results in significant disorganization of the retinal cellular layers and the neuropil, producing rosettes and displacement of cells away from their resident layers. The major classes of neurons in the inner retina are affected, and some neurons are, remarkably, displaced to the other side of the inner plexiform layer. Nevertheless, they mostly target their synaptic partners, including the RGCs to distant retinorecipient targets in the brain. The main conclusions are as follows. Afadin is necessary for establishing and maintaining the retinal architecture. It is not necessary for the generation of the correct numbers/densities of retinal neuron subtypes. Moreover, Afadin loss preserves associations between known synaptic partners and preserves axonal targeting to retinorecipient layers. The consequences on photoreceptor viability and visual processing are also interesting, underscoring the essential function for maintaining retinal structure and function. Overall, the main conclusions describing the consequences are supported by the results.

      Strengths:

      The study provides new knowledge on the requirement of Afadin in retinal development. The introduction and discussion effectively set up the rationale for this work, and place it in the context of previous studies of Afadin in other regions of the CNS.

      The study presents high-quality and extensive characterizations of the cellular phenotypes resulting from Afadin loss. By analyzing various aspects of retinal organization - from cellular densities to axon targeting to the brain - the study narrows down the role of a structure for promoting the establishment of the layers, or maintenance. The data are straightforward and convincing, and the interpretations are bounded by the data shown (though minor weakness re. survival). Another important finding is that the targeting of retinal neuron processes to synaptic partners, including retinorecipient targets in the brain, is intact.

      The study is important as it establishes a focused requirement for Afadin to set up and preserve the overall cellular organizations within the retinal tissue. The demonstration that Afadin is needed for photoreceptor viability and overall visual function enhances impact by establishing its functional importance.

      The manuscript is well written and presented. The images are attractive and compelling, and the figures are well organized.

      Thank you for your high praise on the logic, data presentation, and significance of the current manuscript. We appreciate your comments on the novelty and impact of our study using retinal circuits as a model.

      Weaknesses:

      (1) Expanding on the developmental mechanism is beyond the scope of the study, and would not add to the main conclusions. However, the manuscript would be improved by providing more clarity on the developmental emergence of the defects. The study left me questioning whether the rosettes and cell displacements occur during earlier stages of retina development, or are progressive. For instance, do the RGCs migrate and establish within the GCL correctly at first, and then are displaced with the progressive disorganization? Or are they disorganized and delaminate en route? Images of RGC staining at P0, or earlier during their migration, would be informative. Data in Figure 1 is limited to DAPI staining at P7. Figure 4 shows an image of rod photoreceptors at P7, with their displacement in the GCL layer (and not contained within a rosette). Are the progenitors mislocalized due to delamination?  A few additional thoughts on how these defects compare to other mutants with rosettes might give us more context for understanding the results.

      We chose P7 as our focus due to the lamination in controls. In the revised manuscript, we plan to include earlier time points, as suggested by the reviewer. The data in Figure 1 at P7 utilizes well-established cell type markers (RBPMS, Chx10, Ap2α) and is not limited only to DAPI. Additionally, we will revise the discussion section and place our mutant analyses in the context of other mutants with rosettes (beta-catenin, etc.) in the retina. Finally, we will address the comment on progenitor lamination by exploring earlier developmental time points.

      (2) The manuscript reports that the densities of major inner retinal classes are unaffected. There are a few details missing for this point. How were the cell densities quantified (in terms of ROI size), and normalized? This information is lacking in the methods. There is a striking thickening of the GCL in the DAPI-labeled images shown in Figure 1. What are these cells?

      We will revise the manuscript, particularly the methods section, to address these comments. Additionally, we will tackle ROI units and normalization. The cells in the thickened GCL were identified as displaced amacrine cells and bipolar cells.

    1. eLife Assessment

      Centromeres are specific sites on chromosomes that are essential for mitosis and genome fidelity. This valuable work extends previous studies to convincingly show that the centromere-histone core contributes to force transduction through the kinetochore. The centromere mainly strengthens one of the two paths of force transduction, influenced by the centromeric DNA sequence, the mechanism for which remains to be determined. This work will be of interest to those studying cell division and chromosome segregation.

    2. Reviewer #1 (Public review):

      Summary:

      The authors address the role of the centromere histone core in force transduction by the kinetochore.

      Strengths:

      They use a hybrid DNA sequence that combines CDEII and CDEIII as well as Widom 601 so they can make stable histones for biophysical studies (provided by the Widom sequence) and maintain features of the centromere (CDE II and III).

      Weaknesses:

      The main results are shown in one figure (Figure 2). Indeed, the centromere core of Widom and CDE II and III contributes to strengthening the binding force for the OA-beads. The data are very nicely done and convincingly demonstrate the point. The weakness is that this is the entire paper. It is certainly of interest to investigators in kinetochore biology, but beyond that, the impact is fairly limited in scope.

    3. Reviewer #2 (Public review):

      Summary:

      This paper provides a valuable addendum to the findings described in Hamilton et al. 2020 (https://doi.org/10.7554/eLife.56582). In the earlier paper, the authors reconstituted the budding yeast centromeric nucleosome together with parts of the budding yeast kinetochore and tested which elements are required and sufficient for force transmission from microtubules to the nucleosome. Although budding yeast centromeres are defined by specific DNA sequences, this earlier paper did not use centromeric DNA but instead the generic Widom 601 DNA. The reason is that it has so far been impossible to stably reconstitute a budding yeast centromeric nucleosome using centromeric DNA.

      In this new study, the authors now report that they were able to replace part of the Widom 601 DNA with centromeric DNA from chromosome 3. This makes the assay more closely resemble the in vivo situation. Interestingly, the presence of the centromeric DNA fragment makes one type of minimal kinetochore assembly, but not the other, withstand stronger forces.

      Which kinetochore assembly turned out to be affected was somewhat unexpected, and can currently not be reconciled with structural knowledge of the budding yeast centromere/kinetochore. This highlights that, despite recent advances (e.g. Guan et al., 2021; Dendooven et al., 2023), aspects of budding yeast kinetochore architecture and function remain to be understood and that it will be important to dissect the contributions of the centromeric DNA sequence.

      Given the unexpected result, the study would become yet more informative if the authors were able to pinpoint which interactions contribute to the enhanced force resistance in the presence of centromeric DNA.

      Strength:

      The paper demonstrates that centromeric DNA can increase the attachment strength between budding yeast microtubules and centromeric nucleosomes.

      Weakness:

      How centromeric DNA exerts this effect remains unclear.

    4. Author response:

      Reviewer #1:

      Summary:

      The authors address the role of the centromere histone core in force transduction by the kinetochore.

      Strengths:

      They use a hybrid DNA sequence that combines CDEII and CDEIII as well as Widom 601 so they can make stable histones for biophysical studies (provided by the Widom sequence) and maintain features of the centromere (CDE II and III).

      Weaknesses:

      The main results are shown in one figure (Figure 2). Indeed, the centromere core of Widom and CDE II and III contributes to strengthening the binding force for the OA-beads. The data are very nicely done and convincingly demonstrate the point. The weakness is that this is the entire paper. It is certainly of interest to investigators in kinetochore biology, but beyond that, the impact is fairly limited in scope.

      This reviewer might have missed that this is a Research Advance, not an article. Research Advances are limited in scope by definition and provide a new development that builds on research reported in a prior paper. They can be of any length. Our Research Advance builds on our prior work, Hamilton et al., 2020, and provides the new result that native centromere sequences strengthen the attachment of the kinetochore to the nucleosome.

      Reviewer #2:

      Summary:

      This paper provides a valuable addendum to the findings described in Hamilton et al. 2020 (https://doi.org/10.7554/eLife.56582). In the earlier paper, the authors reconstituted the budding yeast centromeric nucleosome together with parts of the budding yeast kinetochore and tested which elements are required and sufficient for force transmission from microtubules to the nucleosome. Although budding yeast centromeres are defined by specific DNA sequences, this earlier paper did not use centromeric DNA but instead the generic Widom 601 DNA. The reason is that it has so far been impossible to stably reconstitute a budding yeast centromeric nucleosome using centromeric DNA.

      In this new study, the authors now report that they were able to replace part of the Widom 601 DNA with centromeric DNA from chromosome 3. This makes the assay more closely resemble the in vivo situation. Interestingly, the presence of the centromeric DNA fragment makes one type of minimal kinetochore assembly, but not the other, withstand stronger forces.

      We thank the reviewer for their careful and positive assessment of our work.

      Which kinetochore assembly turned out to be affected was somewhat unexpected, and can currently not be reconciled with structural knowledge of the budding yeast centromere/kinetochore. This highlights that, despite recent advances (e.g. Guan et al., 2021; Dendooven et al., 2023), aspects of budding yeast kinetochore architecture and function remain to be understood and that it will be important to dissect the contributions of the centromeric DNA sequence.

      We couldn’t agree more.

      Given the unexpected result, the study would become yet more informative if the authors were able to pinpoint which interactions contribute to the enhanced force resistance in the presence of centromeric DNA.

      Strength:

      The paper demonstrates that centromeric DNA can increase the attachment strength between budding yeast microtubules and centromeric nucleosomes.

      Weakness:

      How centromeric DNA exerts this effect remains unclear.

    1. eLife Assessment

      In this work, the authors use a Drosophila melanogaster adult ventral nerve cord injury model extending and confirming previous observations. This important study reveals key aspects of adult neural plasticity. Taking advantage of several genetic reporter and fate tracing tools, the authors provide solid evidence for different forms of glial plasticity that are increased upon injury. The significance of the generated cell types under homeostatic conditions and in response to injury remains to be further explored and opens up new avenues of research.

    2. Reviewer #2 (Public review):

      Summary:

      Casas-Tinto et al. provide new insight into glial plasticity using a crush injury paradigm in the ventral nerve cord (VNC) of adult Drosophila. The authors find that both astrocyte-like glia (ALG) and ensheathing glia (EG) divide under homeostatic conditions in the adult VNC and identify ALG as the glial population that specifically ramps up proliferation in response to injury, whereas the number of EGs decreases following the insult. Using lineage-tracing tools, the authors interestingly observe interconversion of glial subtypes, especially of EGs into ALGs, which occurs independent of injury and is dependent on the availability of the transcription factor Prospero in EGs, adding to the plasticity observed in the system. Finally, when tracing the progeny of glia, Casas-Tinto and colleagues detect cells of neuronal identity and provide evidence that such glia-derived neurogenesis is favored following ventral nerve cord injury, which puts forward a remarkable way in which glia can respond to neuronal damage.

      Strengths:

      This study highlights a new facet of adult nervous system plasticity at the level of the ventral nerve cord, supporting the view that proliferative capacity is maintained in the mature CNS and stimulated upon injury.

      The injury paradigm is well chosen, as the organization of the neuromeres allows specific targeting of one segment while the remaining segments stay intact, with the potential to later link the observed plasticity to behavior such as locomotion.

      Numerous experiments have been carried out in 7-day-old flies, showing that the observed plasticity is not due to residual developmental remodeling or a still immature VNC.

      Different techniques are used to observe proliferation in the VNC.

      By elegantly combining different methods, the authors show glial divisions, including by mitosis-dependent tracing, and find that the number of generated glia is refined by apoptosis later on.

      The work identifies prospero in glia as an important coordinator of glial cell fate, from development to the adult context, which draws further attention to the upstream regulatory mechanisms.

      Weaknesses:

      The authors do not discuss their results on gliogenesis or neurogenesis in the adult VNC in relation to previous findings made in the context of the injured adult brain.

      The authors speculate about the role of glial inter-conversion for tissue homeostasis or regeneration, but no supportive evidence is cited or provided. Further experiments will be required to test the function of the described glial plasticity.

      Elav+ cells originating from glia do not express markers for mature neurons at the analysed time-point. Whether they will eventually differentiate, or what type of structure is formed by them, will have to be followed up in future studies.

      Context/Discussion

      Highlighting some differences in the reactiveness of glia in the VNC compared to the brain could reveal important differences in repair strategies in different areas of the CNS.

    3. Reviewer #3 (Public review):

      In this manuscript, Casas-Tintó et al. explore the role of glial cells in the response to a neurodegenerative injury in the adult brain. They used Drosophila melanogaster as a model organism, and found that glial cells are able to generate new neurons through the mechanism of transdifferentiation in response to injury. This paper provides a new mechanism in regeneration, and provides insight into the role of glial cells in the process.

      The authors have now addressed all my concerns.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      eLife Assessment

      In this work, the authors use a Drosophila adult ventral nerve cord injury model extending and confirming previous observations; this important study reveals key aspects of adult neural plasticity. Taking advantage of several genetic reporter and fate tracing tools, the authors provide solid evidence for different forms of glial plasticity, that are increased upon injury. The data on detected plasticity under physiologic conditions and especially the extent of cell divisions and cell fate changes upon injury would benefit from validation by additional markers. The experimental part would improve if strengthened and accompanied by a more comprehensive integration of results regarding glial reactivity in the adult CNS.

      Thank you very much for your thoughtful comments and constructive feedback regarding our manuscript. We appreciate all the positive remarks on the significance of our findings on neural plasticity in this Drosophila adult ventral nerve cord injury model.

      In response to your suggestion, we fully agree that the continuation of this project should address cell fate changes in detail with additional markers, if available, or with an “omic” approach such as scRNAseq. Unfortunately, these further experiments are beyond the scope of this paper, which describes the in vivo phenomenon of cell reprogramming and the cellular events that lead glial cells to convert into neurons or neuronal precursors.

      Additionally, we agree that the experimental part can be further improved by providing a more comprehensive integration of our results with current knowledge on glial reactivity in the adult CNS. We will revise the manuscript accordingly to include a deeper discussion of the broader implications of our findings and their alignment with existing literature.

      Thank you again for your valuable input, which will undoubtedly enhance the quality of our work. We look forward to submitting the revised manuscript for your consideration.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Casas-Tinto et al. present convincing data that injury of the adult Drosophila CNS triggers transdifferentiation of glial cells and even the generation of neurons from glial cells. This observation opens up the possibility to get a handle on the molecular basis of neuronal and glial generation in the vertebrate CNS after traumatic injury caused by stroke or crush injury. The authors use an array of sophisticated tools to follow the development of glial cells at the injury site in very young and mature adults. The results in mature adults reveal a remarkable plasticity in the fly CNS and dispel the notion that repair after injury may be only possible in nerve cords which are still developing. The observation of so-called VC cells, which do not express the glial marker repo, could point to the generation of neurons by former glial cells.

      Conclusion:

      The authors present an interesting story which is technically sound and could form the basis for an in-depth analysis of the molecular mechanism driving repair after brain injury in Drosophila and vertebrates.

      Strengths:

      The evidence for transdifferentiation of glial cells is convincing. In addition, the injury to the adult CNS shows an inherent plasticity of the mature ventral nerve cord which is unexpected.

      Weaknesses:

      Traumatic brain injury in Drosophila has been previously reported to trigger mitosis of glial cells and generation of neural stem cells in the larval CNS and the adult brain hemispheres. Therefore, this report adds to but does not significantly change our current understanding. The origin and identity of VC cells are still unclear. The authors show that VC cells are not GABAergic or glutamatergic. Yet, there are many other neurotransmitters and neuropeptides. It would have been nice to see a staining with another general neuronal marker such as anti-Syt1 to confirm the neuronal identity of the VC cells.

      We thank the reviewer for the constructive comments and positive feedback. We concur that previous studies have demonstrated glial cell proliferation in response to CNS injury. In contrast, our study focuses on glial transdifferentiation, which emerges as a novel phenomenon, particularly in response to injury. We found that neuropile glia lose their glial identity and express the pan-neuronal marker Elav. To investigate the functional identity of these newly observed Elav-positive cells, we employed anti-ChAT, anti-GABA and anti-GluRIIA antibodies. In addition, we stained them with other neuronal markers such as Enabled, Gigas or Dac (not shown); however, our attempts yielded limited success. To address this, we have now included a discussion section exploring the potential identity of these cells, considering the possibility that they may represent immature neurons.

      Reviewer #2 (Public review):

      Summary:

      Casas-Tinto et al. provide new insight into glial plasticity using a crush injury paradigm in the ventral nerve cord (VNC) of adult Drosophila. The authors find that both astrocyte-like glia (ALG) and ensheathing glia (EG) divide under homeostatic conditions in the adult VNC and identify ALG as the glial population that specifically ramps up proliferation in response to injury, whereas the number of EGs decreases following the insult. Using lineage-tracing tools, the authors interestingly observe interconversion of glial subtypes, especially of EGs into ALGs, which occurs independent of injury and is dependent on the availability of the transcription factor Prospero in EGs, adding to the plasticity observed in the system. Finally, when tracing the progeny of glia, Casas-Tinto and colleagues detect cells of neuronal identity and provide evidence that such glia-derived neurogenesis is specifically favoured following ventral nerve cord injury, which puts forward a remarkable way in which glia can respond to neuronal damage.

      Strengths:

      This study highlights a new facet of adult nervous system plasticity at the level of the ventral nerve cord, supporting the view that proliferative capacity is maintained in the mature CNS and stimulated upon injury.

      The injury paradigm is well chosen, as the organization of the neuromeres allows specific targeting of one segment while the remaining segments stay intact, with the potential to later link the observed plasticity to behaviour such as locomotion.

      Numerous experiments have been carried out in 7-day old flies, showing that the observed plasticity is not due to residual developmental remodelling or a still immature VNC.

      By elegantly combining different methods, including mitosis-dependent tracing, the authors show glial divisions and find that the number of generated glia is later refined by apoptosis.

      The work identifies prospero in glia as an important coordinator of glial cell fate, from development to the adult context, which draws further attention to the upstream regulatory mechanisms.

      We would like to thank the reviewer for his/her comments and the positive analysis of this work.

      Weaknesses:

      The authors observe consistent inter-conversion of EG to ALG glial subtypes that is further stimulated upon injury. The authors conclude that these findings have important consequences for CNS regeneration and potentially for memory and learning. However, it remains somewhat unclear how glial transformation could contribute to regeneration and functional recovery.

      This is an ongoing question in the laboratory and in the field. We know that glial cells contribute to the regenerative program in the nervous system, and that molecular signalling in glial cells is a determinant of functional recovery (Losada-Perez et al 2021). Therefore, we include this concept in the discussion, as the evidence indicates that glial cells participate in these programs. However, further investigation is required to clarify and determine the mechanisms underlying this glial contribution. To determine whether glia-to-neuron transformation contributes to functional recovery, we would need to compare the recovery of animals with new VC cells to that of animals without them. However, the molecular mechanism that produces this change of identity is still unknown, and therefore we are not able to generate injured flies with no new VC cells.

      The signal of the Fucci cell cycle reporter seems more complex to interpret based on the panels provided compared to the other methods employed by the authors to assess cell divisions.

      We agree that Fly Fucci is a genetic reporter that might be more complex to interpret than EdU staining or other markers. However, glial cell proliferation is a central finding of this manuscript, and we used different available tools to confirm our results. We have revised this specific section to ensure that the text is clear and straightforward.

      Elav+ cells originating from glia do not express markers for mature neurons at the analysed time-point. Whether they will eventually differentiate, and what type of structure they form, will have to be followed up in future studies.

      We fully agree with the reviewer, and we will analyze later time points to study neuronal fate and contribution to VNC function.

      Context/Discussion

      The observed forms of glial plasticity in the VNC are not sufficiently connected to, or compared with, the plasticity described in the fly brain.

      Highlighting some differences in the reactiveness of glia in the VNC compared to the brain could point to relevant differences in repair capacity in different areas of the CNS.

      Based on the assays employed, the study points to a significant amount of glial "identity" changes or interconversions under homeostatic conditions. The potential significance of this rather unexpected "baseline" plasticity in adult tissues is not explicitly pointed out and could improve the understanding of the findings.

      Some speculation on whether "interconversion" of glia is driven by the needs of the tissue could enrich the discussion.

      We would like to thank the reviewer for these suggestions. We have changed the discussion to introduce these concepts.

      Reviewer #3 (Public review):

      In this manuscript, Casas-Tintó et al. explore the role of glial cells in the response to a neurodegenerative injury in the adult brain. They used Drosophila melanogaster as a model organism, and found that glial cells are able to generate new neurons through the mechanism of transdifferentiation in response to injury. This paper provides a new mechanism in regeneration, and gives an understanding of the role of glial cells in the process.

      Comments on revisions:

      In the previous version of the manuscript, I had suggested several recommendations for the authors. Unfortunately, none of these were addressed in the authors' revision.

      We are sorry for this error. We apologize but we never received these comments. We have now found them, and we have incorporated these comments in the new version of the manuscript.

      (1) Have you tried screening for other markers for the EdU+ Repo+ Pros- cells?

      We have identified these cells as glial cells (Repo+) that are not astrocyte-like glia (Pros-), but we have not further characterized their identity. Our aim was to determine whether these proliferating glial cells are neuropile glia (NPG), that is, Astrocyte-Like Glia (ALG), as previous work in larvae suggests (Kato et al., 2020; Losada-Perez et al., 2016), or Ensheathing Glia (EG). To rule out ALG identity, we used prospero as the best marker. The results indicate that there are ALG among the proliferating population, but in addition, we also found Pros- glial cells that were EdU positive. These cells are located at the interface between cortex and neuropile, where the neuropile glia position is described. The anti-Pros staining indicated that they were not ALG, which suggests that they are EG.

      There is no specific nuclear marker for EG cells; therefore, we used FLY_FUCCI under the control of an EG-specific promoter (R56F03-Gal4) to determine whether the other dividing cells were EG. These results indicate that EG divide, although their proliferation does not increase upon injury.

      The R56F03 Gal4 construct is described as ensheathing glia specific by previous publications, including:

      (1) Kremer M. C., Jung C., Batelli S., Rubin G. M. and Gaul U. (2017). The glia of the adult Drosophila nervous system. Glia 65, 606-638. 10.1002/glia.23115

      (2) Ren Q., Awasaki T., Wang Y.-C., Huang Y.-F. and Lee T. (2018). Lineage-guided Notch-dependent gliogenesis by Drosophila multi-potent progenitors. Development 145, dev160127. 10.1242/dev.160127

      To summarize, our results suggest that these proliferating glial cells include both ALG and EG. Our results cannot rule out that a residual fraction of these proliferating cells is neither ALG nor EG.

      (2) You mentioned that ALG are heterogenous in size and shape, does that mean that you may have different subpopulations of ALG? Would that also mean that only a portion of them responds to injury?

      Yes, as with astrocytes in vertebrates, this population is highly heterogeneous. Currently there are no molecular tools to specifically identify these subpopulations and characterize their distinct roles. However, emerging research suggests that differences in size, shape, and potentially molecular markers could correlate with functional diversity. This implies that certain subpopulations of ALG may be more specialized or primed to respond to injury, while others may play roles in homeostasis or other processes. Understanding this heterogeneity will require advanced techniques such as single-cell RNA sequencing, spatial transcriptomics, or live imaging to unravel how these subpopulations contribute to injury responses and overall tissue dynamics.

      (3) You mentioned that NP-like cells have similar nuclear shape and size to ALG and EG, while Ventral cortex cells have larger nuclei. Can you please show a quantification of the NP-like cells and Ventral cortex cells size, and show a direct comparison with ALG and EG cells to support those claims (images, quantification and analysis)?

      We added a new supplementary figure with a graph showing nuclei size differences between VC and NP-like cells, and a diagram showing VC cell localization. Images in figure 2A-A’ and 2B-B’ show both types of cells with the same scale, additionally, NPG cells are shown in red (current expression of the specific Gal4 line). A direct comparison between EG and NP-like glia can be observed in Figure 3 as well.

      Besides size and localization, we conclude that VC and NP-like cells present different molecular markers, as VC cells are elav-positive and repo-negative whereas NP-like cells are repo-positive and elav-negative.

      (4) In Figure 2B, the repo expression is not very clear. I suggest using a different example to support the claim that NP cells are Repo+.

      We have changed the color of the anti-elav staining to facilitate visualisation.

      (5) Again, in Figure 2C, you need quantification and analysis to support the claim that you used nuclear shape and size to identify VC vs. NP like cells.

      The quantification is described in our response to point 3, and the criteria are shown in Figure S1.

      (6) What is the identity of the newly formed neurons? Other than Elav, have you tried using other markers of neurons that are typically found in this area?

      This question is of great interest and relevance. We have made great efforts to answer this open question and, so far, our data suggest that these neurons might be in an immature state. In this latest version of the manuscript, we included the results (Figure S1) obtained with several different markers.

      The molecular identity of these cell populations, glia and neurons, is currently under investigation.

      Minor comments:

      (1) In the abstract, EG and ALG abbreviations are not introduced properly.

      Thank you very much for noticing this missing information, we have now included it in the abstract.

      (2) Please include a representation of the NPG somata location in Figure 1A.

      We have included this information in the figure.

      (3) A schematic showing the differences between ALG and EG cells would be helpful as well.

      We have included in the introduction references and reviews where other authors describe in detail the differences.

      (4) In Figure 1 E, G, H, please indicate the genotype of the fly used in the panel as well as the cell type studied.

      The complete genotype is included in the corresponding figure legend. We have added a simplified genotype in the figure for clarity.

      (5) Please show the genotype used for images in Figure 2: ALG or EG specific drivers.

      This information is included in the corresponding figure legend. We believe that it is better to keep the figure clean so we decided to keep the complete genotype, which is considerably long, only in the figure legend.

    1. eLife Assessment

      This study presents valuable findings by using Fmr1 knockout mice as a model to investigate the role of Fmr1 in sleep regulation. These mice exhibited clear evidence of sleep and circadian disturbances, including abnormal retinal innervation of the SCN, which may provide a potential mechanistic explanation for the observed behavioral deficits. Interestingly, the results suggest that a scheduled feeding approach could improve sleep and circadian rhythms while enhancing social interactions and reducing repetitive behaviors in a mouse model of Fragile X syndrome. The topic is both intriguing and highly significant; however, while the evidence supporting the authors' claims is solid, several issues hinder the manuscript's clarity and impact.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigated sleep and circadian rhythm disturbances in Fmr1 KO mice. Initially, they monitored daily home cage behaviors to assess sleep and circadian disruptions. Next, they examined the adaptability of circadian rhythms in response to photic suppression and skeleton photic periods. To explore the underlying mechanisms, they traced retino-suprachiasmatic connectivity. The authors further analyzed the social behaviors of Fmr1 KO mice and tested whether a scheduled feeding strategy could mitigate sleep, circadian, and social behavior deficits. Finally, they demonstrated that scheduled feeding corrected cytokine levels in the plasma of mutant mice.

      Strengths:

      (1) The manuscript addresses an important topic: investigating sleep deficits in an FXS mouse model and proposing a potential therapeutic strategy.

      (2) The study includes a comprehensive experimental design with multiple methodologies, which adds depth to the investigation.

      Weaknesses:

      (1) The first serious issue in the manuscript is the lack of a clear description of how they performed the experiments and the missing definitions of various parameters in the results. Given that monitoring and analyzing sleep behaviors are the key experiments of this manuscript, I use the "Immobility-Based Sleep Behavior" section of Methods as an example to elaborate:

      Incomplete or Incorrect Description of Tracking Threshold:
      - The phrase "tracked the (40 sec or greater as previously described" is incomplete and does not clarify what is being tracked. This appears to be an error in writing or editing.

      Unclear Relationship Between Threshold and EEG Validation:
      - The threshold "40 sec or greater" is mentioned without context or explanation of what it represents (e.g., sleep bout duration, inactivity, or another parameter). The reference to Fisher et al. (2016) and "99% correlation with EEG-defined sleep" seems misaligned with the paragraph's content.

      Confusing Definition of Sleep Bout:
      - The definition of a sleep bout is unclear. Sleep bouts should logically be based on periods of inactivity, not activity. The sentence suggesting sleep is measured by "activity staying above the threshold" is confusing. The phrase "3 counts of sleep per minute for longer than one minute" requires clarification.

      Unclear Data Selection for Analysis:
      - The phrase "2 days with the best recording quality" is vague and does not specify how "best" was determined or why only two days out of five were analyzed.

      Awkward Grammar and Structure:
      - Phrases like "Acquiring data were exported in 1-min bins" are grammatically awkward. "Acquiring" should be "Acquired." Some sentences are overly long and lack clarity, making the text harder to follow.

      In addition to this section, the authors should review all paragraphs in the Methods section to improve readability.

      (2) Although the manuscript has a relatively long Methods section, some essential information is missing. For instance, the definition of sleep bout, as described above, is unclear. Additional missing information includes:

      - Figure 2: "Rhythmic strength (%)" and "Cycle-to-cycle variability (min)."
      - Figure 3: "Activity suppression."
      - Figure 4: "Rhythmic power (V%)" (is this different from rhythmic strength (%)?) and "Subjective day activity (%)."
      - Figure 5: Clear labeling of the SCN's anatomical features and an explanation for quantifying only the ventral part instead of the entire SCN. Alternatively, the authors should consider quantifying the whole SCN.
      - Figure 6: Inconsistencies in terms like "Sleep frag. (bout #)" and "Sleep bouts (#)." Consistent terminology throughout the manuscript is essential.

      (3) Figure 1A shows higher mouse activity during ZT13-16. It is unclear why the authors scheduled feeding during ZT15-21, as this seems to disturb the rhythm. Consistent with this, the body weights of WT and Fmr1 KO mice decreased after scheduled feeding. The authors should explain the rationale for this design clearly.

      (4) The interpretation of social behavior results in Figure 6 is questionable. The authors claim that Fmr1 KO mice cannot remember the first stranger in a three-chamber test, writing, "The reduced time in exploring and staying in the novel-mouse chamber suggested that the Fmr1 KO mutants were not able to distinguish the second novel mouse from the first now-familiar mouse." However, an alternative explanation is that Fmr1 KO mice do remember the first stranger but prefer to interact with it due to autistic-like tendencies. Data in Table 5 show that Fmr1 KO mice spent more time interacting with the first stranger in the 3-chamber social recognition test, which supports this possibility. Similarly, in the five-trial social test, Fmr1 KO mice's preference for familiar mice might explain the reduced interaction with the second stranger.

      In Figure 6C (five-trial social test results), only the fifth trial results are shown. Data for trials 1-4 should be provided and compared with the fifth trial, so that the behavioral features of mice in the 5-trial test can be shown completely. In addition, the total interaction times for trials 1-4 (154 ± 15.3 for WT and 150 ± 20.9 for Fmr1 KO) suggest normal sociability in Fmr1 KO mice, unlike the results of the 3-chamber test. Thus, individual data for trials 1-4 are required to draw reliable conclusions.

      In Table 6 and Figure 6G-6J, the authors claim that "Sleep duration (Figures 6G, H) and fragmentation (Figures 6I, J) exhibited a moderate-strong correlation with both social recognition and grooming." However, Figure 6I shows a p-value of 0.077, which is not significant. Moreover, Table 6 shows no significant correlation between SNPI of the three-chamber social test and any sleep parameters. These data do not support the authors' conclusions.

      (5) Figure 7 demonstrates the effect of scheduled feeding on circadian activity and sleep behaviors, representing another critical set of results in the manuscript. Notably, the WT+ALF and Fmr1 KO+ALF groups in Figure 7 underwent the same handling as the WT and Fmr1 KO groups in Figures 1 and 2, as no special treatments were applied to these mice. However, the daily patterns observed in Figures 7A, 7B, 7F, and 7G differ substantially from those shown in Figures 2B and 1A, respectively. Additionally, it is unclear why the WT+ALF and Fmr1 KO+ALF groups did not exhibit differences in Figures 7I and 7J, especially considering that Fmr1 KO mice displayed more sleep bouts but shorter bout lengths in Figures 1C and 1D.

      Furthermore, it is not specified whether the results in Figure 7 were collected after two weeks of scheduled feeding (for how many days?) or if they represent the average data from the two-week treatment period.

      The rationale behind analyzing "ZT 0-3 activity" in Figure 7D instead of the parameters shown in Figures 2C and 2D is also unclear.

      In Figure 7F, some data points appear to be incorrectly plotted. For instance, the dark blue circle at ZT13 connects to the light blue circle at ZT14 and the dark blue circle at ZT17. This is inconsistent, as the dark blue circle at ZT13 should link to the dark blue circle at ZT14. Similarly, it is perplexing that the dark blue circle at ZT16 connects to both the light blue and dark blue circles at ZT17. Such errors undermine confidence in the data. The authors need to provide a clear explanation of how these data were processed.

      Lastly, in the Figure 7 legend, Table 6 is cited; however, this appears to be incorrect. It seems the authors intended to refer to Table 7.

      (6) Similar to the issue in Figure 7F, the data for day 12 in Supplemental Figure 2 includes two yellow triangles but lacks a green triangle. It is unclear how the authors constructed this chart, and clarification is needed.

      (7) In Figure 8, a 5-trial test was used to assess the effect of scheduled feeding on social behaviors. It is essential to present the results for all trials (1 to 4). Additionally, it is unclear whether the results for familiar mice in Figure 8A correspond to trials 1, 2, 3, or 4.

      The legend for Figure 8 also appears to be incorrect: "The left panels show the time spent in social interactions when the second novel stranger mouse was introduced to the testing mouse in the 5-trial social interaction test. The significant differences were analyzed by two-way ANOVA followed by Holm-Sidak's multiple comparisons test with feeding treatment and genotype as factors." This description does not align with the content of the left panels. Moreover, two-way ANOVA is not the appropriate statistical analysis for Figure 8A. The authors need to provide accurate details about the analysis and revise the figure legend accordingly.

      (8) The circadian activity and sleep behaviors of Fmr1 KO mice have been reported previously, with some findings consistent with the current manuscript, while others contradict it. Although the authors acknowledge this discrepancy, it seems insufficiently thorough to simply state that the reasons for the conflicts are unknown. Did the studies use the same equipment for behavior recording? Were the same parameters used to define locomotor activity and sleep behaviors? The authors are encouraged to investigate these details further, as doing so may uncover something interesting or significant.

      (9) Some subtitles in the Results section and the figure legends do not align well with the presented data. For example, in the section titled "Reduced rhythmic strength and nocturnality in the Fmr1 KOs," it is unclear how the authors justify the claim of altered nocturnality in Fmr1 KO mice. How do the authors define changes in nocturnality? Additionally, the tense used in the subtitles and figure legends is incorrect. The authors are encouraged to carefully review all subtitles and figure legends to correct these errors and enhance readability.
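
      To make the ambiguity raised in weakness (1) concrete, the following is a minimal sketch of one possible reading of the "40 sec or greater" criterion: a second is scored as sleep only if it falls inside a run of continuous immobility lasting at least 40 seconds. This reading, the per-second input format, and the names `score_sleep` and `min_run` are assumptions made for illustration; they are not taken from the manuscript's Methods.

```python
def score_sleep(immobile, min_run=40):
    """Score sleep from a per-second immobility trace.

    immobile: sequence of booleans, one per second (True = no movement detected).
    min_run:  assumed threshold; immobility runs shorter than this are not sleep.
    Returns (sleep, bouts): a per-second sleep mask and a list of
    (start, end) index pairs, end exclusive, one per sleep bout.
    """
    sleep = [False] * len(immobile)
    bouts = []
    start = None
    # A trailing False acts as a sentinel so a run reaching the end is closed.
    for i, still in enumerate(list(immobile) + [False]):
        if still and start is None:
            start = i                      # an immobility run begins
        elif not still and start is not None:
            if i - start >= min_run:       # long enough to count as a sleep bout
                bouts.append((start, i))
                for j in range(start, i):
                    sleep[j] = True
            start = None                   # run ended, whether scored or not
    return sleep, bouts


# Hypothetical trace: 50 s still, 10 s active, 30 s still, 5 s active, 40 s still.
trace = [True] * 50 + [False] * 10 + [True] * 30 + [False] * 5 + [True] * 40
sleep_mask, sleep_bouts = score_sleep(trace)
print(sleep_bouts)       # [(0, 50), (95, 135)]
print(sum(sleep_mask))   # 90 seconds scored as sleep
```

      Under this reading, the 30-second immobility run is not counted, so bout number and bout length both depend directly on the threshold. A Methods section that states the rule this explicitly would resolve most of the points listed under weakness (1).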

    3. Reviewer #2 (Public review):

      Summary:

      In the present study, the authors, using a mouse model of Fragile X syndrome, explore the very interesting hypothesis that restricting food access over a daily schedule will improve sleep patterns and, subsequently, behavioral capacities. By restricting food access from 12h to 6h over the nocturnal period (active period for mice), they show, in these KO mice, an improvement of the sleep pattern accompanied by reduced systemic levels of inflammatory markers and improved behavior. Using a classical mouse model of neurodevelopmental disorder (NDD), these data suggest that eating patterns might improve sleep quality, reduce inflammation and improve cognitive/behavioral capacities in children with NDD.

      Strengths:

      Overall, the paper is very well-written and easy to follow. The rationale of the study is generally well-introduced. The data are globally sound. The provided data support the interpretation overall.

      Weaknesses:

      (1) The introduction part is quite long in the Abstract, leaving limited space for the data provided by the present study.

      (2) A couple of points are not totally clear for a non-expert reader:
      - The Fmr1/Fxr2 double KO mice are not well described.
      - What is the rationale for performing both LD and DD measures?

      (3) The data on cytokines and chemokines are interesting. However, the rationale for the selection of these molecules is not given. In addition, these measures have been performed in the systemic blood. Measures in the brain could be very informative.

      (4) An important question is the potential impact of fasting vs the impact of the food availability restriction. Indeed, fasting has several effects on brain functioning, including cognitive functions.

      (5) How do the authors envision the potential translation of the present study to human patients? How to translate the 12 to 6 hours of food access in mice to children with Fragile X syndrome?

    1. eLife Assessment

      This study presents an important discovery regarding the diversity and evolution of gall-forming microbial effectors. Supported by convincing computational structural predictions and analyses, the research provides insights into the unique mechanisms by which gall-forming microbes exert their pathogenicity in plants. This study also offers guidance that is of value for future studies on pathogen effector function and co-evolution with host plants.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript presents a comprehensive structure-guided secretome analysis of gall-forming microbes, providing valuable insights into effector diversity and evolution. The authors have employed AlphaFold2 to predict the 3D structures of the secretome from selected pathogens and conducted a thorough comparative analysis to elucidate commonalities and unique features of effectors among these phytopathogens.

      Strengths:

      The discovery of conserved motifs such as 'CCG' and 'RAYH' and their central role in maintaining the overall fold is an insightful finding. Additionally, the discovery of a nucleoside hydrolase-like fold conserved among various gall-forming microbes is interesting.

      Weaknesses:

      Important conclusions are not verified by experiments.

    3. Reviewer #2 (Public review):

      Summary:

      Soham Mukhopadhyay et al. investigated the protein folding of the secretome from gall-forming microbes using the AI-based structure modeling tool AlphaFold2. Their study analyzed six gall-forming species, including two Plasmodiophorid species and four others spanning different kingdoms, along with one non-gall-forming Plasmodiophorid species, Polymyxa betae. The authors found no effector fold specifically conserved among gall-forming pathogens, leading to the conclusion that their virulence strategies are likely achieved through diverse mechanisms. However, they identified an expansion of the Ankyrin repeat family in two gall-forming Plasmodiophorid species, with a less pronounced presence in the non-gall-forming Polymyxa betae. Additionally, the study revealed that known effectors such as CCG and AvrSen1 belong to sequence-unrelated but structurally similar (SUSS) effector clusters.

      Strengths:

      (1) The bioinformatics analyses presented in this study are robust, and the AlphaFold2-derived resources deposited in Zenodo provide valuable resources for researchers studying plant-microbe interactions. The manuscript is also logically organized and easy to follow.

      (2) The inclusion of the non-gall-forming Polymyxa betae strengthens the conclusion that no effector fold is specifically conserved in gall-forming pathogens and highlights the specific expansion of the Ankyrin repeat family in gall-forming Plasmodiophorids.

      (3) Figure 4a and 4b effectively illustrate the SUSS effector clusters, providing a clear visual representation of this finding.

      (4) Figure 1 is a well-designed, comprehensive summary of the number and functional annotations of putative secretomes in gall-forming pathogens. Notably, it reveals that more than half of the analyzed effectors lack known protein domains in some pathogens, yet some were annotated based on their predicted structures, despite the absence of domain annotations.

      Weaknesses:

      (1) The effector families discussed in this paper remain hypothetical in terms of their functional roles, which is understandable given the challenges of demonstrating their functions experimentally. However, this highlights the need for experimental validation as a next step.

      (2) Some analyses, such as those in Figure 4e, emphasize motifs derived from sequence alignments of SUSS effector clusters. Since these effectors are sequence-unrelated, sequence alignments might be unreliable. It would be more rigorous to perform structure-based alignments in addition to sequence-based ones for motif confirmation. For instance, methods described in Figure 3E of de Guillen et al. (2015, https://doi.org/10.1371/journal.ppat.1005228) or tools like Foldseek (https://search.foldseek.com/foldmason) could be useful for aligning structures of multiple sequences.

      (3) When presenting AlphaFold-generated structures, it is essential to include confidence scores such as pLDDT and PAE. For example, in Figure 1D of Derbyshire and Raffaele (2023, https://doi.org/10.1038/s41467-023-40949-9), the structural representations were colored red due to their high pLDDT scores, emphasizing their reliability.
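
      As a minimal sketch of how the confidence reporting requested in point (3) could be produced: AlphaFold writes the per-residue pLDDT into the B-factor column of its PDB output, so a summary can be read directly from the model file. The function names, the 70-point cutoff, and the three-residue demo model below are illustrative assumptions, not part of the authors' pipeline.

```python
from statistics import mean


def ca_plddt(pdb_text):
    """Return per-residue pLDDT values, read from the B-factor field of CA atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB fixed-width layout: atom name in columns 13-16, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def summarize(scores, confident=70.0):
    """Mean pLDDT and the fraction of residues at or above an assumed cutoff."""
    return {
        "mean_plddt": mean(scores),
        "fraction_confident": sum(s >= confident for s in scores) / len(scores),
    }


def atom_line(serial, name, res, chain, resseq, bfactor):
    """Build a minimal fixed-width ATOM record (hypothetical helper for the demo)."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.00:6.2f}{bfactor:6.2f}")


# Hypothetical three-residue model; the N atom is ignored by ca_plddt.
demo = "\n".join([
    atom_line(1, "N", "MET", "A", 1, 90.0),
    atom_line(2, "CA", "MET", "A", 1, 90.0),
    atom_line(3, "CA", "ALA", "A", 2, 80.0),
    atom_line(4, "CA", "GLY", "A", 3, 50.0),
])
print(ca_plddt(demo))             # [90.0, 80.0, 50.0]
print(summarize(ca_plddt(demo)))
```

      A per-model table of mean pLDDT, alongside PAE plots for predicted complexes, would let readers judge which structural comparisons rest on confident predictions.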

    4. Author response:

      We appreciate the constructive feedback provided by the reviewers and the editorial board. We are delighted by the positive reception of our work and the thoughtful insights shared.

      Regarding the validation of our predicted interactions, we are currently conducting yeast two-hybrid (Y2H) assays using a commercially available Arabidopsis thaliana cDNA library to screen for interacting partners of the ANK putative effector PBTT_00818 from Plasmodiophora brassicae. Following this initial screening, we will validate positive interactions through targeted 1-to-1 Y2H assays. In particular, we aim to confirm the AlphaFold Multimer-predicted interaction between PBTT_00818 and MPK3, a key immunity-related kinase in Arabidopsis.

      We are grateful for the reviewers’ thoughtful suggestions regarding clustering visualization, sequence vs. structure-based motif alignments, and structural confidence assessments. We will carefully incorporate these improvements in our planned revisions.

      Once again, we thank the editors and reviewers for their rigorous and constructive assessment. We look forward to implementing these refinements and submitting an updated version that further enhances the impact of our study.

    1. eLife Assessment

      This important study reports a detailed computational analysis of the CFTR ion channel's permeation mechanism, advancing our understanding of its structure-function relationship. The conclusions are based on extensive molecular dynamics simulations and thorough analysis, but the use of an approximate chloride ion model, known to underestimate key ion-protein interactions, leaves them incomplete without experimental or alternative computational validation. The work will be of interest to biophysicists working on CFTR and cystic fibrosis.

    2. Reviewer #1 (Public review):

      Summary:

      The goal of this study was to overcome the apparent difficulty in constructing structural models of the open state of the CFTR chloride channel. While several CFTR structural models at near-atomic resolution have been published under a variety of conditions, none of them have demonstrated a pore open across the full dimension of the plasma membrane. Instead, these have routinely been referred to as "near-open" models. In the present study, the authors extended the findings of a prior paper from their group that investigated a series of brief MD simulations, a small number of which exhibited events where chloride ions permeated the pore. This study included massively repeated simulations initiated from these Cl-permeable conformations. Extensive analysis of the data identified a novel penta-helical structure that comprises the channel pore. This comprehensive study attempted to explain several features of conducting CFTR channels, including single-channel conductance, selectivity, and the mechanisms linking the ATP-induced dimerization of the cytosolic nucleotide-binding domains (NBDs) to the opening of the channel pore (a.k.a. "pore-gating").

      Strengths:

      The major strength of this study is its comprehensive nature. The approaches applied are cutting-edge and beyond, and are used to explain many different aspects of channel function in CFTR. The strength of evidence is very strong. The paper is extremely well-written, and the arguments are well-supported.

      Weaknesses:

      The major weakness is that none of the novel conclusions (i.e., those arising solely from this study and not previously published) have been supported by experimental confirmation. That is typical of computational studies such as this.

    3. Reviewer #2 (Public review):

      Although recent cryo-EM structures of the CFTR ion channel were reported in a putative open state (ATP-bound, NBD-dimerized), it remains unclear whether these structures explain the conductive properties of the open channel observed in functional experiments. To investigate this, the authors conducted extensive molecular dynamics simulations at different voltages. The simulations are started from snapshots of their prior work, based on the experimental putative open state and including conditions with high negative voltage. Their analysis reveals that the cryo-EM structure represents a near-open metastable state, with most trajectories transitioning to either more closed or more open conformations, leading to the identification of a potential new open state. Permeation rate analysis shows that, unlike the other states, the proposed open state exhibits functional conductive properties of the open channel, although a strong inward rectification, inconsistent with experimental data, is also noted. Further structural analysis and simulations of ATP-unbound closed states offer additional mechanistic insights.

      Overall, this work tackles key questions about CFTR: What is the true open conductive state? Does the ATP-bound cryo-EM structure reflect an actual open state? What is the ion permeation mechanism, and what structural changes occur during the closed-to-open transition? Which residues are critical, particularly those linked to diseases like CF? The study, based on a comprehensive set of all-atom molecular dynamics simulations, including a range of physiologically relevant voltages, provides important insights in this regard. It identifies key structural states, permeation pathways, critical residues, and conductance properties that can be directly compared to functional data. Notably, the analysis identifies a new open state of the channel, which systematic analysis convincingly demonstrates to be a conductive conformation, in line with experimental data at negative voltages. The authors carefully address some of the limitations of their results, exploring and discussing discrepancies with functional experiments, such as inward rectification. The work is also very well written, with a clear and logical presentation of key findings.

      The main weakness of this study is that the simulation data rely on the conventional CHARMM36 force field for Cl− ions, which has been shown to significantly underestimate the interaction between Cl− and proteins (J. Chem. Theory Comput. 2021, 17, 6240-6261). For example, the conventional CHARMM36 force field destabilizes the Cl-binding site in CLC-ec1: the bound ion unbinds irreversibly during microsecond-long simulations, which is at odds with the experimental binding affinity.

      This imbalance in Cl−/protein/water interactions could significantly impact the CFTR simulations, potentially altering state populations and Cl− permeability. Notably, recent work by Levring and Chen (Proc Natl Acad Sci U S A. 2024) identifies a likely Cl− binding site in the bottleneck region of the channel, which contradicts the simulation results showing low occupancy Cl− ions in this region (Fig. 1B and Fig. 6A). This discrepancy may be due to the underestimation of Cl−/protein interactions. Indeed, Orabi et al. have proposed corrections that specifically tune these interactions, including those with aromatic residues, in line with the binding site geometry suggested by Levring and Chen. This imbalance in interactions may also lead to an underestimation of the conductance in the experimental near-open state.

      Balanced Cl−/protein interactions could also influence voltage/current relationships, potentially affecting the degree of inward rectification. For example, higher Cl− occupancy in the bottleneck region may stabilize the down state of R334, along with other measured interactions, thereby increasing conductance as the authors have shown.

      The experimental evidence reported and discussed by the authors in support of the proposed open state is largely qualitative. For instance, in Figure 4 Supplement 2 there is a significant overlap in the distances and SASA distributions of open and near-open states for the reported residues (are those residues water accessible in the simulations?).

      Given the known limitations of the standard CHARMM36 Cl− force field and in the absence of robust experimental validation of the proposed open state, I recommend validating at least part of the results using an independent set of simulations (not started from the previous ones) with an updated Cl− force field. It would be especially important to reassess whether the experimental near-open state is truly metastable and less probable than the new open state, and confirm that the near-open state exhibits negligible conductance.

      A minor point worth discussing is whether the observed inward rectification may be influenced by hysteresis or incomplete equilibration, as many simulations were started from prior trajectories at large negative voltages and may not have fully relaxed. For instance, it is not uncommon for small structural changes in backbone and sidechains to take several microseconds (Shaw et al., Science, 2010). That said, discrepancies in current-voltage relationships are not unexpected due to challenges in simulation sampling and force field accuracy (J Gen Physiol 2013 May;141(5):619-32), as the authors stated.

      Another minor point to address is the preparation of the simulation setup for the ATP-free structure of the protein. It would be helpful to specify whether any particular controls or steps were taken, given that the structure is based on a relatively low resolution (3.87 Å) model.

    4. Reviewer #3 (Public review):

      Background:

      Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) is a chloride channel whose dysfunction underlies cystic fibrosis, a life-limiting condition caused by thick, sticky mucus buildup in the lungs and other organs. Despite multiple high-resolution structures of CFTR, these snapshots have all captured the channel in a non-conducting or "closed" conformation - even when the protein was prepared under conditions that should favor channel opening. This discrepancy has posed a key challenge: how can a channel be experimentally observed as closed while physiological tests demonstrate it conducts chloride ions?

      Key Findings:

      (1) Stable Open Conformation

      Through repeated molecular dynamics (MD) simulations of human CFTR in lipid bilayers, researchers observed a reproducible, stable open state. Unlike previous transient openings seen in single-run or short simulations, this conformation remains consistently permeable over extended timescales.

      (2) Penta-Helical Arrangement

      The authors highlight a "penta-helical" pore-lining arrangement in which five transmembrane helices symmetrically organize to create a clear ion-conduction pathway. This novel configuration resolves the previously puzzling hydrophobic bottleneck found in cryo-EM structures.

      (3) Conductance Close to Experimental Values

      By analyzing chloride ion flow under near-physiological voltages, they calculate a channel conductance aligning well with electrophysiological measurements. This alignment provides strong support that the observed structure is functionally relevant.

      (4) Roles of Key Residues

      Several positively charged (cationic) residues in the pore appear crucial for guiding and stabilizing chloride ions. Simultaneously, small kinks in certain helices may act as structural "hinges," allowing or blocking chloride passage.
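      The conductance comparison in point (3) rests on simple arithmetic: the net number of ions crossing the pore per unit time gives a current, and dividing by the applied voltage gives a conductance. A minimal sketch of that conversion follows. The function names and the event count, duration, and voltage in the demo are hypothetical values chosen for illustration; they are not taken from the manuscript, and the authors' actual analysis may differ.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs, exact by SI definition


def current_amperes(net_events, duration_s, charges_per_ion=1):
    """Magnitude of the current carried by `net_events` net ion crossings.

    net_events: crossings in one direction minus crossings in the other.
    duration_s: total simulated time over which the events were counted.
    """
    return abs(net_events) * charges_per_ion * ELEMENTARY_CHARGE_C / duration_s


def conductance_siemens(current_a, voltage_v):
    """Chord conductance g = I / |V| (assumes a reversal potential of 0 V)."""
    return current_a / abs(voltage_v)


# Hypothetical example: 50 net Cl- crossings in 10 microseconds at -150 mV.
i_demo = current_amperes(net_events=50, duration_s=10e-6)
g_demo = conductance_siemens(i_demo, voltage_v=-0.150)
print(f"current     = {i_demo * 1e12:.3f} pA")
print(f"conductance = {g_demo * 1e12:.2f} pS")
```

      Because event counts in microsecond-scale simulations are small, an estimate of this kind carries substantial counting uncertainty, which is one reason to pool many replicas.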

      How to Interpret These Results:

      (1) Bridging a Major Gap: The study tackles the mismatch between static "closed" CFTR structures and their known open-channel function. Successfully capturing a stable open state in MD simulations is a significant step toward reconciling what cryo-EM data shows versus what physiological experiments have long told us.

      (2) Strength in Multiple Replicas: Running many simulation repeats (rather than relying on a single trajectory) lends credibility. Only if a phenomenon is reproducible across multiple runs can it be considered robust.

      (3) Consistency with Mutational Data: Observing that known functional hotspots (e.g., specific charged residues) play a key role in the new pore model further validates these findings.

      Important Caveats and Limitations:

      (1) Simulation Timescales vs. Biology

      Even extended MD (on the microsecond scale) is still much faster, simpler, and more controlled than real cellular processes.

      (2) Physiological existence of the penta-helical pore

      Although the simulations and results are highly compelling, several factors leave open the possibility of a physiological open conformation differing from the observed penta-helical pore. These factors include ATP hydrolysis, interactions with physiological binding partners, the native membrane environment, and regions not modeled in the CFTR structures, such as the R domain. Most importantly, the transmembrane voltage is very high (500 mV).

      Bottom Line:

      This work delivers a long-awaited, near-physiological view of CFTR's open conformation. It provides a foundational structure against which future experimental and computational studies can be compared. By demonstrating reliable chloride conduction and matching established biophysical data, these simulations bring us closer to understanding - and potentially targeting - CFTR's gating mechanism in health and disease. Readers should applaud the breakthroughs while recognizing that further exploration (including more complex in vitro and in vivo experiments) will still be necessary to capture the full dynamism of CFTR in the living cell environment.

    5. Reviewer #4 (Public review):

      Summary:

      The structural mechanism of anion permeation through the open CFTR pore has remained unresolved and is subject to ongoing debate. That is because even in CFTR structures obtained under conditions that normally maximally activate the channel (phosphorylation + ATP + non-hydrolytic mutations + potentiator drugs) a bottleneck region in the pore, too narrow to allow passage of hydrated chloride ions, is observed.

      The present study uses molecular dynamics (MD) simulations initiated from such "quasi-open" states to address local conformational dynamics of the pore. The authors conclude that the quasi-open structure stably relaxes to a fully open conformation on the sub-microsecond time scale. They provide a detailed analysis of this fully open structure and of the mechanism of chloride permeation. They conclude that two major exit pathways (a central and a peripheral) exist for chloride ions, and that the ions remain near-fully hydrated throughout the pore: chloride-protein interactions displace only 1-2 waters from the first solvation shell. Furthermore, the simulations provide some hints for conformational changes involved in gating.

      Strengths:

      The findings are interpreted in the context of the large body of published functional studies on CFTR permeation properties, and caveats are adequately discussed.

      Weaknesses:

      The conclusions on gating would benefit from further discussions. In particular, a fair comparison of the timescale at which channel gating happens, and that of the MD simulations would strengthen the manuscript.

    1. eLife Assessment

      Rennert et al. developed a valuable thermodynamic framework to study the force response of branched actin networks from the crucial and unexplored perspective of energetic cost. They used the fact that the entropy production rate must be positive to derive inequalities that set limits on the maximum force produced by branched actin networks, and speculate that the dissipative cost beyond that required to move the load may be necessary to maintain an adaptive steady state. This work is highly innovative, but remains incomplete until the hypotheses of the model are better justified and the conclusions about the dissipative cost of the system are better established.

    2. Reviewer #1 (Public review):

      Summary:

      This paper investigated the dynamic self-assembly of branched actin networks and the relation between the nonequilibrium features of the dynamics and the thermodynamic cost. The authors constructed a chain model to describe the self-assembly process of a branched actin network, including events like nucleation, polymerization, and capping. The forward and backward transition rates associated with the events allowed them to investigate the entropy production rate of the dynamics. They then used the fact that the entropy production rate has to be greater than zero to derive inequalities that set bounds for the maximum force produced by the branched actin network. The idea is similar to estimating the polymerization force of an actin filament via the equation F_{max} = dG/delta, which bounds the maximum force, where dG is the thermodynamic potential (the chemical energy associated with ATP hydrolysis) and delta is the length increment upon monomer insertion. Furthermore, they speculated that the dissipative cost beyond what is necessary to move the load may be necessary to maintain an adaptive steady state.
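      To make the single-filament bound F_{max} = dG/delta concrete, the sketch below evaluates it numerically. The free energy per added monomer is supplied in units of k_B T and is a free input here; the demo values (10 k_B T and a 2.7 nm length increment, a commonly quoted figure for actin) are illustrative assumptions and not numbers taken from the manuscript. The function names are likewise hypothetical.

```python
BOLTZMANN_J_PER_K = 1.380649e-23  # J/K, exact by SI definition


def kbt_pn_nm(temperature_k=298.0):
    """Thermal energy k_B*T expressed in pN*nm (1 pN*nm = 1e-21 J)."""
    return BOLTZMANN_J_PER_K * temperature_k * 1e21


def max_force_pn(delta_g_kt, delta_nm, temperature_k=298.0):
    """Thermodynamic bound F_max = dG / delta, in piconewtons.

    delta_g_kt: free energy released per monomer insertion, in units of k_B*T.
    delta_nm:   filament length increment per inserted monomer, in nm.
    """
    return delta_g_kt * kbt_pn_nm(temperature_k) / delta_nm


# Illustrative inputs only: 10 k_B*T per monomer, 2.7 nm per monomer.
f_demo = max_force_pn(delta_g_kt=10.0, delta_nm=2.7)
print(f"k_B*T = {kbt_pn_nm():.3f} pN*nm")
print(f"F_max = {f_demo:.2f} pN")
```

      The bound scales linearly with dG and inversely with delta, so any uncertainty in the assumed free energy carries over directly to the force estimate.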

      Strengths:

      The authors developed a simple model that is capable of qualitatively reproducing some mechanical phenomena for a branched actin network. The model has captured the essential dynamic elements in the branched actin network and built connections between the maximum load and the adaptation behavior with the energetic cost. It is an interesting study that provides a new perspective to look at the mechanical response of the branched actin network.

      Weaknesses:

      The text needs to be improved, particularly in the model introduction part. It is unclear to me what happens to the state when the reverse reaction in Figure 2 occurs.

      Furthermore, what the authors have done is similar to estimating the polymerization force of actin filaments, but in a more complicated scenario. Their conclusion that "dissipative cost in the system beyond what is necessary to move the load may be necessary to maintain an adaptive steady state" is questionable. The branched actin network is a nonequilibrium system driven by active processes like ATP hydrolysis that convert chemical energy into mechanical work. There has to be a gap between the actual E-C_f curve and the curve obtained when the dissipation rate \dot{S} = 0. If the authors want to make this claim, they have to decompose the dissipation into different parts and show that a particular part is associated with adaptation. Otherwise, the conclusion about the gap is baseless.

    3. Reviewer #2 (Public review):

      Summary:

      Rennert et al. developed a thermodynamic framework for the assembly of branched networks to calculate the entropy dissipation associated with this process. They base their model on the simplest possible experimental system consisting of four proteins: actin, Arp2/3, capping protein, and NPF. They decompose the network assembly into a linear model where the order of events (polymerization, capping, and nucleation) is recorded sequentially. Polymerization and capping are sensitive to load and affected by Brownian ratchet effects, while nucleation is not. This simplified model provides an analytical solution that describes the load sensitivity of actin networks and agrees well with experimental data for a given set of transition rates.

      Strengths:

      (1) These thermodynamic approaches are original and fundamental to our understanding of these non-equilibrium systems.

      (2) The fact that the model fits experimental data is encouraging.

      Weaknesses:

      (1) The possibility of describing branched actin assembly as a Markov process is not well justified.

      (2) The choice of parameters controlling the system is open to question. Some parameters are probably completely negligible, while other ignored effects are potentially significant.

      (3) The main conclusion of the manuscript, linked to the existence of a dissipation gap, is quite expected. The manuscript would have been more valuable if the authors had been able to decompose dissipation into different components in order to prove that a particular fraction is associated with adaptation.

    1. eLife Assessment

      This study presents a potentially fundamental analysis of the original color of a fossil feather from the crest of a 125-million-year-old enantiornithine bird, using sophisticated 3D microscopic and numerical methods to conclude that the feather was iridescent and brightly colored, possibly indicating that this was a male bird that used its crest in sexual displays. At present, the strength of evidence supporting the authors' conclusions is considered incomplete based on methodological incompleteness and questions about taphonomy.

    2. Reviewer #1 (Public review):

      Summary:

      Li et al. describe a novel form of melanosome-based iridescence in the crest of an Early Cretaceous enantiornithine avialan bird from the Jehol Group.

      Strengths:

      Novel set of methods applied to the study of fossil melanosomes.

      Weaknesses:

      (1) Several studies have argued that these structures are in fact not a crest, but rather the result of compression. Otherwise, it would seem that a large number of Jehol birds have crests that extend not only along the head but also along the neck and hindlimb. It is more parsimonious to interpret this as compression, as has been demonstrated using actuopaleontology (Foth 2011).

      (2) The primitive morphology of the feathers, with their long and possibly non-interlocking barbs, also calls into question the ability of such feathers to be erected without geologic compression.

      (3) The feather is not in situ, and therefore there is no way to demonstrate unequivocally that it is indeed from the head (it could just as easily be a neck feather).

      (4) Melanosome density may be taphonomic; in fact, in an important paper that is notably not cited here (Pan et al. 2019), the authors note dense melanosome packing and attribute it to taphonomy. This paper describes densely packed (taphonomic) melanosomes in non-avian avialans, specifically stating, "Notably, we propose that the very dense arrangement of melanosomes in the fossil feathers (Fig. 2 B, C, and G-I, yellow arrows) does not reflect in-life distribution, but is, rather, a taphonomic response to postmortem or postburial compression". If this paper were taken into account, it seems the conclusions would have to change drastically. If in this case the density is not taphonomic, this needs to be justified explicitly (although clearly these Jehol and Yanliao fossils are heavily compressed).

      (5) Color in modern birds is affected by the thickness of the outer keratin cortex, which is not preserved, but the authors note that the barbs are much thicker (10 um) than those of extant birds. This surely would have affected color, so how can the authors be sure about the color of this feather?

      (6) The authors describe very strange shapes that are not present in extant birds: "...different from all other known feather melanosomes from both extant and extinct taxa in having some extra hooks and an oblique ellipse shape in cross and longitudinal sections of individual melanosome". Again, how can it be determined that this is not the result of taphonomic distortion?

      (7) The authors describe the melanosomes as hexagonally packed, but this does not in fact appear to be the case; the packing appears quasi-periodic at best, or random. Could the authors provide some figures to justify this hexagonal interpretation?

      (8) One way to address these concerns would be to sample some additional fossil feathers to see whether this arrangement is unique or rather due to taphonomy.

      (9) As an aside, why are the feet absent in the CT scan image?

    3. Reviewer #2 (Public review):

      Summary:

      The authors reconstructed the three-dimensional organization of melanosomes in fossilized feathers belonging to a spectacular specimen of a stem avialan from China. The authors then proceed to infer the original coloration and related ecological implications.

      Strengths:

      I believe the study is well executed and well explained. The methods are appropriate to support the main conclusions. I particularly appreciate how the authors went beyond the simple morphological inference and interrogated the structural implications of melanosome organization in three dimensions. I also appreciate how the authors were upfront with the reliability of their methods, results, and limitations of their study. I believe this will be a landmark study for the inference of coloration in extinct species and how to interrogate its significance in the future.

      Weaknesses:

      I have a few minor comments.

      Introduction: I would suggest the authors move the paragraph on coloration in modern birds (lines 75-97) before line 64, as this is part of the reasoning behind the study. I believe this change would improve the flow of the introduction for the general reader.

      Melanosome organization: I was surprised to find little information in the main text regarding this topic. As this is one of the major findings of the study, I would suggest the authors include more information regarding the general geometry/morphology of the single melanosomes and their arrangement in three dimensions.

      Keratin: the authors use this term often in the text, but how is this inference justified in the fossil? Can the authors expand on this? Previous studies suggested the presence of degradation products deriving from keratin, rather than unaltered keratin per se.

      Ontogenetic assessment: the authors infer a sub-adult stage for the specimen, but no evidence or discussion is reported in the SI. Can the authors describe and discuss their interpretations?

      CT scan data: these data should be made freely available upon publication of the study.

    4. Reviewer #3 (Public review):

      Summary:

      The paper presents an in-depth analysis of the original colour of a fossil feather from the crest of a 125-million-year-old enantiornithine bird. From its shape and location, it would be predicted that such a feather might well have shown some striking colour and pattern. The authors apply sophisticated microscopic and numerical methods to determine that the feather was iridescent and brightly coloured and possibly indicates this was a male bird that used its crest in sexual displays.

      Strengths:

      The 3D micro-thin-sectioning techniques and the numerical analyses of light transmission are novel and state-of-the-art. The example chosen is a good one, as a crest feather is likely to have carried complex and vivid colours as a warning or for use in sexual display. The authors correctly warn that without such 3D study feather colours might be given simply as black from regular 2D analysis, and the alignment evidence for iridescence could be missed.

      Weaknesses: Trivial.

    1. eLife Assessment

      This fundamental manuscript comprehensively examines the roles of nine structural proteins in herpes simplex virus 1 (HSV-1) assembly and nuclear egress. By integrating cryo-light microscopy and soft X-ray tomography, the study presents an innovative approach to investigating viral assembly within cells. The research is thoroughly executed, yielding compelling data that explain previously unknown functions of these structural proteins. This work is of broad interest to virologists, cellular biologists, and structural biologists, offering a robust, contextually rich methodology for studying large protein complex assembly within the cellular environment, serving as an excellent starting point for high-resolution techniques.

    2. Reviewer #1 (Public review):

      Summary:

      Nahas et al. investigated the roles of herpes simplex virus 1 (HSV-1) structural proteins using correlative cryo-light microscopy and soft X-ray tomography. The authors generated nine viral variants with deletions or mutations in genes encoding structural proteins. They employed a chemical fixation-free approach to study native-like events during viral assembly, enabling observation of a wider field of view compared to cryo-ET. The study effectively combined virology, cell biology, and structural biology to investigate the roles of viral proteins in virus assembly and budding.

      Strengths:

      (1) The study presented a novel approach to studying viral assembly in cellulo.

      (2) The authors generated nine mutant viruses to investigate the roles of essential proteins in nuclear egress and cytoplasmic envelopment.

      (3) The use of correlative imaging with cryoSIM and cryoSXT allowed for the study of viral assembly in a near-native state and in 3D.

      (4) The study identified the roles of VP16, pUL16, pUL21, pUL34, and pUS3 in nuclear egress.

      (5) The authors demonstrated that deletion of VP16, pUL11, gE, pUL51, or gK inhibits cytoplasmic envelopment.

      (6) The manuscript is well-written, clearly describing findings, methods, and experimental design.

      (7) The figures and data presentation are of good quality.

      (8) The study effectively correlated light microscopy and X-ray tomography to follow virus assembly, providing a valuable approach for studying other viruses and cellular events.

      (9) The research is a valuable starting point for investigating viral assembly using more sophisticated methods like cryo-ET with FIB-milling.

      (10) The study proposes a detailed assembly mechanism and tracks the contributions of studied proteins to the assembly process.

      (11) The study includes all necessary controls and tests for the influence of fluorescent proteins.

      Weaknesses:

      Overall, the manuscript does not have any major weaknesses, just a few minor comments:

      (1) The gel quality in Figure 1 is inconsistent for different samples, with some bands not well resolved (e.g., for pUL11, GAPDH, or pUL20).

      (2) The manuscript would benefit from a summary figure or table to concisely present the findings for each protein. The manuscript covers a large body of work, and a summary figure showing the function discovered for each protein would be very helpful.

      (3) Figure 2 lacks clarity on the type of error bars used (range, standard error, or standard deviation). The legend says range; please confirm that this is what the authors meant.

      (4) The manuscript could be improved by including details on how the plasma membrane boundary was estimated from the saturated gM-mCherry signal. An additional supplementary figure with the data showing the saturation used for the boundary definition would be helpful.

      (5) Additional information or supplementary figures on the mask used to filter the YFP signal for Figure 4 would be helpful.

      (6) The figure legends could include information about which samples are used for comparison for significance calculations. As the color of the brackets is different from the compared values (dUL34), it would be great to have this information in the figure legend.

      (7) In Figure 5B, the association between YFP and mCherry signals is difficult to assess due to the abundance of mCherry signal; single-channel and combined images might improve visualization.

      (8) In Figure 6D, staining for tubulin could help identify the cytoskeleton structures involved in the observed virus arrays.

      (9) It is unclear in Figure 6D if the microtubule-associated capsids are with the gM envelope or not, as the signal from mCherry is quite weak. It could be made clearer with the split signals to assess the presence of both viral components.

      (10) The representation of voxel intensity in Figure 8 is somewhat confusing. Inverting the voxel intensity representation so that brighter values correspond to higher absorption would simplify interpretation.

      (11) The visualization in panel I of Figure 8 might benefit from a more divergent colormap to better show the variation in X-ray absorbance.

      (12) Figure 9 would be enhanced by images showing the different virus sizes measured for the comparative study, which would help assess the size differences between different assembly stages.

      Overall, this is an excellent manuscript and an enjoyable read. It would be interesting to see this approach applied to the study of other viruses, providing valuable insights before progressing to high-resolution methods.

    3. Reviewer #2 (Public review):

      Summary:

      For centuries, humans have been developing methods to see ever smaller objects, such as cells and their contents. This has included studies of viruses and their interactions with host cells during processes extending from virion structure to the complex interactions between viruses and their host cells: virion entry, virus replication and virion assembly, and release of newly constructed virions. Recent developments have enabled simultaneous application of fluorescence-based detection and intracellular localization of molecules of interest in the context of sub-micron resolution imaging of cellular structures by electron microscopy.

      The submission by Nahas et al., extends the state-of-the-art for visualization of important aspects of herpesvirus (HSV-1 in this instance) virion morphogenesis, a complex process that involves virus genome replication, and capsid assembly and filling in the nucleus, transport of the nascent nucleocapsid and some associated tegument proteins through the inner and outer nuclear membranes to the cytoplasm, orderly association of several thousand mostly viral proteins with the capsid to form the virion's tegument, envelopment of the tegumented capsid at a virus-tweaked secretory vesicle or at the plasma membrane, and release of mature virions at the plasma membrane.

      In this groundbreaking study, cells infected with HSV-1 mutants that express fluorescently tagged versions of capsid (eYFP-VP26) and tegument (gM-mCherry) proteins were visualized with 3D correlative structured illumination microscopy and X-ray tomography. The maturation and egress pathways thus illuminated were studied further in infections with fluorescently tagged viruses lacking one of nine viral proteins.

      Strengths:

      This outstanding paper meets the journal's definitions of Landmark, Fundamental, Important, Valuable, and Useful. The work is also Exceptional, Compelling, Convincing, and Solid. The work is a tour de force of classical and state-of-the-art molecular and cellular virology. Beautiful images accompanied by appropriate statistical analyses and excellent figures. The numerous complex issues addressed are explained in a clear and coordinated manner; the sum of what was learned is greater than the sum of the parts. Impacts go well beyond cytomegalovirus and the rest of the herpesviruses, to other viruses and cell biology in general.

      Weaknesses:

      I have a few suggestions for minor adjustments in the text.

    4. Reviewer #3 (Public review):

      Summary:

      Kamal L. Nahas et al. demonstrated that pUL16, pUL21, pUL34, VP16, and pUS3 are involved in the egress of the capsids from the nucleus, since mutant viruses ΔpUL16, ΔpUL21, ΔUL34, ΔVP16, and ΔUS3 HSV-1 show nuclear egress attenuation determined by measuring the nuclear:cytoplasmic ratio of the capsids, the dfParental, or the mutants. Then, they showed that gM-mCherry+ endomembrane association and capsid clustering were different in pUL11, pUL51, gE, gK, and VP16 mutants. Furthermore, the 3D view of cytoplasmic budding events suggests an envelopment mechanism where capsid budding into spherical/ellipsoidal vesicles drives the envelopment.

      Strengths:

      The authors employed both structured illumination microscopy and cellular ultrastructure analysis to examine the same infected cells, using cryo-soft-X-ray tomography to capture images. This combination, applied here for the first time, enabled the authors to obtain holistic data on a biological process such as viral assembly. Using this approach, the researchers studied various stages of HSV-1 assembly. For this, they constructed a dual-fluorescently labelled recombinant virus, consisting of eYFP-tagged capsids and mCherry-tagged envelopes, allowing for the independent identification of both unenveloped and enveloped particles. They then constructed nine mutants, each targeting a single viral protein known to be involved in nuclear egress and envelopment in the cytoplasm, using this dual-fluorescent virus as the parental one. The experimental setting, both the microscopic and the virological, is robust and well-controlled. The manuscript is well-written, and the data generated is robust and consistent with previous observations made in the field.

      Weaknesses:

      It would be helpful to find out what role the targeted proteins play in nuclear egress or envelopment acquisition in a different orthoherpesvirus, like HSV-2. This would confirm the suitability of the technical approach set and would also act as a way to validate their mechanism at least in one additional herpesvirus beyond HSV-1. So, using the current manuscript as a starting point and for future studies, it would be advisable to focus on the protein functions of other viruses and compare them.

    1. eLife Assessment

      This study provides important insights into the regulation of type-I interferon signaling and anti-tumor immunity, demonstrating that ORMDL3 promotes RIG-I degradation to suppress immune responses. The evidence is convincing, with well-executed mechanistic experiments and in vivo validation in syngeneic tumor models. These findings have significant implications for cancer immunotherapy, highlighting ORMDL3 as a potential therapeutic target.

    2. Reviewer #2 (Public review):

      Summary:

      The authors identified ORMDL3 as a negative regulator of the RLR pathway and anti-tumor immunity. Mechanistically, ORMDL3 interacts with MAVS and promotes the proteasomal degradation of RIG-I. In addition, the deubiquitinating enzyme USP10 stabilizes RIG-I, and ORMDL3 disturbs this process. Moreover, in subcutaneous syngeneic tumor models in C57BL/6 mice, they showed that inhibition of ORMDL3 enhances anti-tumor efficacy by augmenting the proportion of cytotoxic CD8-positive T cells and IFN production in the tumor microenvironment (TME).

      Strengths:

      The paper has a clearly arranged structure and the English is easy to understand. It is well written. The results clearly support the conclusion.

      Comments on revisions:

      All questions have been answered.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations for the authors):

      Minor Points:

      • HEK293T cells are not typically Type 1 IFN-producing cells; it is recommended to use other immune cell lines to validate results obtained with ORMDL3 overexpression in 293T cells. The same applies to A549 alveolar basal epithelial cells.

      Thanks for the reviewer’s insightful comment. In Figure 1C, we overexpressed ORMDL3 in mouse primary BMDMs and stimulated them with poly(I:C) or poly(dG:dC); the results suggest that ORMDL3 inhibits IFN expression in primary BMDMs.

      • Clarify whether TLR3 is expressed in the cell lines used in Figure 1 and whether TLR3 is present in mouse BMDMs.

      Thanks for your suggestions. We tested whether TLR3 is expressed in HEK293T, A549 and BMDM. We designed primers for human TLR3 and murine Tlr3, and the results showed that Tlr3 is expressed in BMDM but TLR3 is not expressed in HEK293T and A549, as shown in Author response image 1.

      Author response image 1.

      PCR amplification of human TLR3 was conducted on cDNA derived from HEK293T and A549 cells (lanes 1 and 2, respectively), and PCR amplification of murine Tlr3 was performed on cDNA from BMDM (lane 3). Human spleen cDNA (lane 4, TAKARA Human MTC™ Panel I, Cat# 636742) served as a positive control, and 18s rRNA was used as an internal control.

      primer sequences:

      human TLR3: forward TTGCCTTGTATCTACTTTTGGGG   reverse TCAACACTGTTATGTTTGTGGGT

      murine Tlr3: forward GTGAGATACAACGTAGCTGACTG   reverse TCCTGCATCCAAGATAGCAAGT

      18s (human/mice): forward GTAACCCGTTGAACCCCATT   reverse CCATCCAATCGGTAGTAGCG

      • Specify the type of luciferase reporter assay used in Figure 1E.

      Thanks for the reviewer’s insightful comment. The Dual-Luciferase® Reporter (DLR™) Assay System efficiently measures two luciferase signals. In brief, the IFN-reporter luciferase is derived from firefly (Photinus pyralis), while the internal control luciferase is from Renilla (Renilla reniformis or sea pansy). These dual luciferases are measured sequentially from a single sample. In Figure 1E, we measured the luciferase activity of IFN (firefly) and internal control gene TK (Renilla), and their ratio is shown in Figure 1E.
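
      The normalization described above can be sketched in a few lines. This is a minimal illustration rather than the authors' analysis script: the function names and all readings below are hypothetical placeholders, and it assumes one firefly and one Renilla reading per well.

```python
from statistics import mean


def normalized_ratios(firefly, renilla):
    """Return the firefly/Renilla ratio for each well (reporter over internal control)."""
    if len(firefly) != len(renilla):
        raise ValueError("each well needs one firefly and one Renilla reading")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_change(sample_ratios, control_ratios):
    """Mean normalized ratio of the sample relative to the control condition."""
    return mean(sample_ratios) / mean(control_ratios)


# Hypothetical luminometer readings (arbitrary units), NOT data from the study.
control_firefly = [1000.0, 1200.0, 1100.0]
control_renilla = [500.0, 600.0, 550.0]
treated_firefly = [400.0, 480.0, 440.0]
treated_renilla = [500.0, 600.0, 550.0]

control = normalized_ratios(control_firefly, control_renilla)
treated = normalized_ratios(treated_firefly, treated_renilla)
relative_activity = fold_change(treated, control)
print(f"relative reporter activity: {relative_activity:.2f}")
```

      Dividing by the Renilla signal corrects each well for differences in transfection efficiency and cell number before conditions are compared.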

      • Clarify what was knocked down in the A549 stable KD cell line and whether HSV-1 infects and replicates in A549 cells.

      We sincerely appreciate the reviewer’s concern and apologize for any ambiguous descriptions. In Figure 1H, we knocked down ORMDL3 and infected the cell with HSV-1, which shows that ORMDL3 does not affect the infection and replication of HSV-1 in A549.

      • In Figure 2E, provide the rationale for using the same tag (Flag) in overexpression experiments with different molecules such as Flag-ORMDL3 and Flag-RIG-I.

      We thank the reviewer for raising this concern. We co-expressed ORMDL3 and innate immunity proteins carrying different tags, and we obtained the same results as before. ORMDL3-Myc overexpression can only promote the degradation of Flag-RIG-I-N, as shown in current Figure 2E.

      • Address the low knockdown efficiency shown in Figure 2D and consider whether it is sufficient for drawing conclusions.

      Thanks for the reviewer’s concern. Because the ORMDL3 antibody (Abcam 107639) can recognize all ORMDL family members (ORMDL1, 2 and 3), this may explain why the knockdown efficiency of ORMDL3 is not apparent in Figure 2D. We also measured the knockdown efficiency of ORMDL3 at the mRNA level, which showed that ORMDL3 was silenced efficiently and specifically (Figure S2C).

      • Replace the Tubulin/β-Actin WB control with a more distinguishable band.

      Thanks for the suggestion. Owing to different gel concentration, sometimes the protein bands appear fused, but it is distinguishable that the internal controls are consistent.

      • In Figures 3D/E, the expression level of the Lysine mutant of RIG-I-N is too low. Please provide an explanation or repeat the experiment to achieve comparable expression levels and update the figure accordingly.

      Thanks for the question. The expression of the lysine mutant of RIG-I-N is low. We increased the amount of plasmid used in transfection, but this did not increase its expression level. Although its abundance is low, we provided evidence that it is not degraded by ORMDL3. Previous reports (for example: RNF122 suppresses antiviral type I interferon production by targeting RIG-I CARDs to mediate RIG-I degradation. Proc Natl Acad Sci U S A. 2016 Aug 23;113(34):9581-6; TRIM4 modulates type I interferon induction and cellular antiviral response by targeting RIG-I for K63-linked ubiquitination. J Mol Cell Biol. 2014 Apr;6(2):154-63.) have also shown that lysine mutations can affect RIG-I stability. In addition, we speculate that the 4KR mutant (K146R, K154R, K164R, K172R) may change RIG-I conformation, so its expression is lower.

      • Explain why there is no difference in MAVS expression levels despite binding with MAVS.

      Thanks for the question. In our experiment, ORMDL3 has no effect on MAVS expression. Our results showed that ORMDL3 interacts with MAVS and promotes the degradation of RIG-I, so only RIG-I level has a significant difference.

      • Verify if Flag-tagged ORMDL3 is present in the IP sample in Figure 3G.

      Thanks for the comment. We reloaded the samples and blotted for Flag, and we found that ORMDL3 cannot be pulled down by RIG-I. We have added the results in Figure 3G.

      • Reload the samples in Figure 4C to clearly identify the correct band for GFP-tagged ORMDL3.

      Thanks for the question. As ORMDL3 is a small protein, we fused it and its fragments to GFP to increase its molecular weight. With our GFP vector, for some unknown reason, the 26 kDa band is always present. This is a technical difficulty. Although the GFP-fused protein and GFP bands are very close, they can still be distinguished as two bands.

      • Rerun the Western blot for Actin IB in Figure 4E, as the ORMDL3-GFP (1-153) full-length appears abnormal.

      Thanks for the question. Because we first blotted for GFP and then for actin on the same membrane, the actin blot appears abnormal. We reloaded the previous sample and blotted for actin again.

      • Clarify in which figure RIG-I ubiquitination is shown and whether ORMDL3 has E3 ubiquitin ligase activity. Explain how ORMDL3 facilitates USP10 transfer to RIG-I despite no direct interaction.

      Thank you for your question. In Figure 3B we show the ubiquitination of RIG-I; ORMDL3 does not have E3 ubiquitin ligase activity. Our results showed that although ORMDL3 does not directly interact with RIG-I, it forms a complex with USP10 (Figure 5B, 5C) and disrupts USP10-induced RIG-I stabilization by decreasing the interaction between USP10 and RIG-I (Figure 6A). The detailed mechanism needs further investigation.

      • Provide quantification for Figure 5D. Explain why the bands are not degraded by RIG-I and USP10.

      Thanks for the concern. We quantified the bands and found that overexpression of USP10 increased RIG-I protein abundance. The quantitative gray values are added into the image. USP10 functions to stabilize RIG-I rather than promoting its degradation.

      • Explain the decrease in RIG-I levels in Figure 5E when USP10 levels decrease.

      Thanks for the concern. As shown in the working model (Supplementary Figure 8), USP10 is a deubiquitinase that stabilizes RIG-I by decreasing its K48-linked ubiquitination. So, in Figure 5E, we knocked down USP10 and found a decrease in RIG-I levels, which is consistent with Figure 5D.

      • Clarify whether K48 ubiquitination on RIG-I has decreased in Figure 5F, as this is not clear from the image.

      Thanks for the question. In Figure 5F it is shown that the K48 ubiquitination level of RIG-I significantly decreased (please see the density of the bands in the IP samples).

      • Address whether ORMDL3 reduces RIG-I-N degradation in Figure 5H, as the results do not clearly support this claim.

      Thanks for the concern. We quantified the bands and the results showed that ORMDL3 promotes the degradation of RIG-I-N. The quantitative gray values are added into the image.

      • Reload Flag-ORMDL3 in Figure 6C to determine whether RIG-I-N is restored in the MG132-treated samples.

      Thank you for your question. We quantified the bands and the results showed that RIG-I-N is restored in the MG132-treated samples. The quantitative gray values are added into the image.

      • Correct numerous typos and errors, especially in the Discussion section, to improve readability

      Thanks for the suggestion. We have revised the manuscript carefully to correct these errors.

      Reviewer #2 (Recommendations for the authors):

      (1) In Figure 1G and H, The number of virus-infected cells was observed using a fluorescence microscope. In addition, can the author use other techniques to detect the impact of ORMDL3 on virus replication?

      Thanks for the question. In addition to fluorescence microscopy, we also used RT-PCR to quantify the amount of viral mRNA, and the results were added in Figure 1G and H.

      (2) In Figure 3C, ORMDL3 overexpression promotes the degradation of RIG-I-N. ORMDL3 is one of three ORMDL proteins with similar amino acid sequences, does ORMDL1/2 also have this function?

      Thanks for the suggestion. We compared the function between ORMDLs and found that only ORMDL3 overexpression facilitated RIG-I-N degradation. The results were shown in Figure S2D.

      (3) In Figure 5A, USP10 is not the top protein in the mass spec assay. Did the authors verify the interaction between ORMDL3 and other proteins (for example, CAND1)?

      Thanks for your suggestion. We verified that ORMDL3 has no interaction with CAND1 and UFL1 but only interacts with USP10, as Figure S5 shows.

      (4) A scale bar to be added to the images in Figure 1 G, H and Figure 7K.

      Thanks for the suggestion. We have added the scale bars.

      (5) The annotations in Figure 4B, C and E should be aligned.

      Thanks for the suggestion. We have aligned the annotations.

      (6) Provide Statistical methods

      Thanks for the suggestion. We have provided the statistical methods in the materials and methods part.

    1. eLife Assessment

      This study addresses an important and longstanding question regarding the molecular mechanism of protein misfolding in Ig light chain (LC) amyloidosis (AL), a life-threatening condition. By combining advanced techniques, including small-angle X-ray scattering, molecular dynamics simulations, and hydrogen-deuterium exchange mass spectrometry, the authors provide convincing evidence that the "H state" distinguishes amyloidogenic from non-amyloidogenic LCs. These findings not only offer novel insights into LC structural dynamics but also hold promise for guiding therapeutic strategies in amyloidosis and will be of particular interest to structural biologists, biophysicists, and many others working on amyloid diseases.

    2. Reviewer #1 (Public review):

      The study investigates light chains (LCs) using three distinct approaches, with a focus on identifying a conformational fingerprint to differentiate amyloidogenic light chains from multiple myeloma light chains. The study's major contribution is the identification of a low-populated "H state," which the authors propose as a unique marker for AL-LCs. While this finding is promising, the review highlights several strengths and weaknesses. Strengths include the valuable contribution of identifying the H state and the use of multiple approaches, which provide a comprehensive understanding of LC structural dynamics. Weaknesses include a lack of physical insights explaining the changes.

    3. Reviewer #2 (Public review):

      Summary:

      This well-written manuscript addresses an important but recalcitrant problem - molecular mechanism of protein misfolding in Ig light chain (LC) amyloidosis (AL), a major life-threatening form of systemic human amyloidosis. The authors use expertly recorded and analyzed small-angle X-ray scattering (SAXS) data as a restraint for molecular dynamics simulations (called M&M). Six patient-based LC proteins are explored, including four AL and two non-AL. The authors report a partially populated "H-state" determined computationally, wherein the two domains in an LC molecule acquire a straight rather than bent conformation, with an extended interdomain linker; this H-state distinguishes AL from non-AL LCs. H-D exchange mass spectrometry is used to support this conclusion. This is a novel and interesting finding with potentially important translational implications.

      Strengths:

      Expertly recorded and analyzed SAXS data combined with clever M&M simulations lead to a novel and interesting conclusion, which is supported by limited H-D exchange data. Stabilization of the CL-CL interface is a good idea that may help protect a subset of AL LCs from misfolding in amyloid.

      Computational M&M evidence is convincing and is supported by SAXS data, which are used as restraints for simulations. Although Kratky plots reported in the main MS Fig. 1 show significant differences between the data and the structural model for only one AL protein, AL-55, H-state is also inferred for other AL proteins.

      Apparent limitations:

      HDX MS results show that residues 35-50 from VL-VL and VL-CL dimerization interface are less protected in AL vs. non-AL proteins, which is consistent with the H-state. However, the small number of proteins yielding useful HDX data (three AL and one non-AL) suggests that this conclusion should be treated with caution. It is unclear whether the conformational heterogeneity depicted in M&M simulations is consistent with HDX results, and whether prior HDX studies of AL and MM LCs are consistent with the conclusions that a particular domain-domain interface is weakened in AL vs. non-AL LCs. The butterfly plots in Fig. 5 could benefit from the X-axis labeling with the peptide fragments.

    4. Reviewer #3 (Public review):

      Summary:

      This study identifies conformational fingerprints of amyloidogenic light chains that set them apart from the non-amyloidogenic ones.

      Strengths:

      The research employs a comprehensive combination of structural and dynamic analysis techniques, providing evidence that conformational dynamics at the VL-CL interface and structural expansion are distinguishing features of amyloidogenic LCs.

      Weaknesses:

      The sample size is limited, which may affect the generalizability of the findings. Additionally, the study could benefit from deeper analysis of specific mutations driving this unique conformation to further strengthen therapeutic relevance.

      Furthermore, the p-value (statistical significance) of the Rg difference should be computed. Finally, the significance of mutations (SHM?) at the interface, such as A40G, should be compared with previous observations (Garofalo et al., 2021).
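
      One way such a significance test could be set up for a small sample is an exact permutation test on the difference in mean Rg between the two groups. The sketch below is only an illustration of that idea: the Rg values are made-up placeholders, not measurements from the manuscript, and the function name is hypothetical.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Placeholder Rg values in Angstrom, NOT data from the study.
rg_al = [26.0, 26.5, 27.0, 27.5]   # four hypothetical AL proteins
rg_non_al = [24.0, 24.5]           # two hypothetical non-AL proteins

p_value = exact_permutation_p(rg_al, rg_non_al)
print(f"exact permutation p = {p_value:.4f}")
```

      With four proteins in one group and two in the other there are only 15 possible relabelings, so even perfectly separated groups cannot give a p-value below 1/15 with this test, which is consistent with the concern about sample size raised above.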

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This important study identifies the "H-state" as a potential conformational marker distinguishing amyloidogenic from non-amyloidogenic light chains, addressing a critical problem in protein misfolding and amyloidosis. By combining advanced techniques such as small-angle X-ray scattering, molecular dynamics simulations, and H-D exchange mass spectrometry, the authors provide convincing evidence for their novel findings. However, incomplete experimental descriptions, limitations in SAXS data interpretation, and the way HDX MS data is presented affect the strength and generalizability of the conclusions. Strengthening these aspects would enhance the impact of this work for researchers in amyloidosis and protein misfolding.

      We thank eLife editors and reviewers for their constructive feedback. The manuscript has been improved to provide a more complete description of the experiments and to strengthen the interpretation and presentation of all data. Updated Figures (Figure 2 and Figure 5) and a new Table (Table 2) in the main text provide a more complete and clearer comparison of the SAXS data with MD simulations as well as a clearer representation of the HDX MS data. Additional figures have been added in SI. The text has been extended accordingly and complete materials and methods are now included in the main text. Abstract, introduction and discussion have been revised to improve the overall readability of the manuscript.

      Public Reviews:

      Reviewer #1 (Public review):

      The study investigates light chains (LCs) using three distinct approaches, with a focus on identifying a conformational fingerprint to differentiate amyloidogenic light chains from multiple myeloma light chains. The study's major contribution is identifying a low-populated "H state," which the authors propose as a unique marker for AL-LCs. While this finding is promising, the review highlights several strengths and weaknesses. Strengths include the valuable contribution of identifying the H state and using multiple approaches, which provide a comprehensive understanding of LC structural dynamics. However, the study suffers from weaknesses, particularly in interpreting SAXS data, lack of clarity in presentation, and methodological inconsistencies. Critical concerns include high error margins between SAXS profiles and MD fits, unclear validation of oligomeric species in SAXS measurements, and insufficient quantitative cross-validation between experimental (HDX) and computational data (MD). This reviewer calls for major revisions including clearer definitions, improved methodology, and additional validation, to strengthen the conclusions.

      We thank the reviewer for the supportive comments. In the revised version of the manuscript we have focused on improving the clarity and completeness of our work. For example, we are sorry that we had not made clear enough that the comparison of SAXS with MD simulations was not the one shown in the main text in Figure 1 and Table 1 (which is the comparison with single structures) but the one reported in the SI (previously Figure S1 and Table S2, showing very good fits). These data have been moved to the main text in the reworked Figure 2 and new Table 2. We have also improved the presentation of the HDX MS data in Figure 5 and in the text, and added further analysis in the SI. Materials and methods are now entirely in the main text. We generally revised the manuscript for clarity.

      Reviewer #2 (Public review):

      Summary:

      This well-written manuscript addresses an important but recalcitrant problem - the molecular mechanism of protein misfolding in Ig light chain (LC) amyloidosis (AL), a major life-threatening form of systemic human amyloidosis. The authors use expertly recorded and analyzed small-angle X-ray scattering (SAXS) data as a restraint for molecular dynamics simulations (called M&M) and to explore six patient-based LC proteins. The authors report that a highly populated "H-state" determined computationally, wherein the two domains in an LC molecule acquire a straight rather than bent conformation, is what distinguishes AL from non-AL LCs. They then use H-D exchange mass spectrometry to verify this conclusion. If confirmed, this is a novel and interesting finding with potentially important translational implications.

      We thank the reviewer for the supportive comments.

      Strengths:

      Expertly recorded and analyzed SAXS data combined with clever M&M simulations lead to a novel and interesting conclusion. Regardless of whether or not the CL-CL domain interface is destabilized in AL LCs explored in this (Figure 6) and other studies, stabilization of this interface is an excellent idea that may help protect at least a subset of AL LCs from misfolding in amyloid. This idea increases the potential impact of this interesting study.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      The HDX analysis could be strengthened.

      We have extended the analysis and improved the presentation of the HDX data. Figure 5 has been reworked, text has been improved accordingly and additional analysis have been reported in SI.

      Reviewer #3 (Public review):

      Summary:

      This study identifies conformational fingerprints of amyloidogenic light chains, that set them apart from the non-amyloidogenic ones.

      We thank the reviewer for the supportive comments.

      Strengths:

      The research employs a comprehensive combination of structural and dynamic analysis techniques, providing evidence that conformational dynamics at the VL-CL interface and structural expansion are distinguished features of amyloidogenic LCs.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      The sample size is limited, which may affect the generalizability of the findings. Additionally, the study could benefit from deeper analysis of specific mutations driving this unique conformation to further strengthen therapeutic relevance.

      We agree. We tried to maximise the size of the sample, and this was the best we could do. With respect to the analysis of the mutations, we tried to discuss some of them in view of previous works, but because our set covers multiple germlines rather than focusing on a single one, our ability to discuss single point mutations systematically is limited. At the same time, single point mutations have been the focus of many recent works, while our approach provides a different point of view.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      This study provides an investigation of light chains (LCs) using three distinct approaches, focusing primarily on identifying a conformational fingerprint to distinguish amyloidogenic light chains (AL-LCs) from multiple myeloma light chains (MM-LCs). The authors propose that the presence of a low-populated "H state," characterized by an extended quaternary structure and a perturbed CL-CL interface, is unique to AL-LCs. This finding is validated through hydrogen-deuterium exchange mass spectrometry (HDX-MS). The study makes a valuable contribution to understanding the structural dynamics of light chains, particularly with the identification of the H state in AL-LCs. However, significant concerns regarding the interpretation of the SAXS data, clarity in presentation, and methodological rigor must be addressed. I recommend major revisions and resubmission of the work.

      Major concerns:

      (1) A critical concern is how the authors ensure that the SAXS profiles represent only dimeric species, given the high propensity of LCs to aggregate. If higher-order aggregates or monomers were present, this would significantly impact the SAXS data and SAXS-MD integration. Some measurements are bulk SAXS, while others are SEC-SAXS, making the study questionable. The authors need to clarify how only dimeric species were measured for the SEC-SAXS analysis, and all assessments of the dimeric state should be shown in the SI. Additionally, complementary techniques such as DLS or SEC-MALS should be used to verify the oligomeric state of the samples. Without this validation, the SAXS profiles may not be reliable.

      We added SEC-MALS and SEC-SAXS data in the SI (Figures S20 and S21), as well as the SAXS curves shown in log-log plots (Figure S1), which display a flat trend at low q that excludes aggregation. SAXS is very sensitive to oligomers and aggregates, and our data do not indicate the presence of those species. When we had indications of possible aggregation in a sample, we used SEC-SAXS.

      (2) A major problem with the paper is that the claim of the "H state," which is the novelty of the study and serves as a marker of aggregation, is derived from samples where the error between the SAXS profiles and MD fits is extremely high. This casts doubt on whether the structure is indeed resolved by MD. The main conclusion of the paper is derived from weak consistency between experiment and simulation. In AL55, the error between experiment and simulation is greater than 5; for H7, it is higher than 2.8. The residuals show significant error at mid-q values, suggesting that long-range distance correlations (20-10 Å, CL, VL positioning) are not consistent between simulation and experiment. Furthermore, the FES plots of two independent replicas show deviation in the existence of the H state. One shows a minimum in that region, while the other does not. So, how robust is this conclusion? What is the chi-squared value if each replica is used independently? A separate experimental cross-validation is necessary to claim the existence of the H state.

      We apologise for the misunderstanding underlying this reviewer comment. The poor agreement mentioned is not between the SAXS and MD simulations, but with the individual structures, and this disagreement led us to perform MD simulations that are in much better agreement with the data (previously Fig. S1 and Table S2). To avoid this misunderstanding, which would indeed weaken our work, we have now moved both the figure and the table in the main text to the updated Figure 2 and the new Table 2.

      Regarding the robustness of the sampling, we believe that Table 3 (previously Table 2) clearly shows the statistical convergence of the data; differences in the presentation of the free energy are purely interpolation issues. The chi-squares of each replicate are reported in Table 2 (previously Table S2).
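
      For readers less familiar with how agreement between a measured SAXS profile and a model curve is usually scored, a reduced chi-squared with a fitted scale factor can be sketched as follows. This is a generic illustration, not the procedure implemented in crysol or PLUMED: the function names and all intensities are hypothetical placeholders, and it assumes both curves are sampled at the same q values.

```python
def fit_scale(i_exp, i_calc, sigma):
    """Least-squares scale factor c that minimizes sum(((I_exp - c * I_calc) / sigma)^2)."""
    num = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    den = sum(c * c / s ** 2 for c, s in zip(i_calc, sigma))
    return num / den


def reduced_chi2(i_exp, i_calc, sigma):
    """Reduced chi-squared between experimental and scaled calculated intensities."""
    scale = fit_scale(i_exp, i_calc, sigma)
    total = sum(((e - scale * c) / s) ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    # One fitted parameter (the scale), hence N - 1 degrees of freedom.
    return total / (len(i_exp) - 1)


# Placeholder intensities and errors at matching q values, NOT data from the study.
i_exp = [20.5, 15.8, 10.1, 3.9]
sigma = [0.5, 0.5, 0.5, 0.5]
replica_curves = {
    "replica_1": [10.0, 8.0, 5.0, 2.0],
    "replica_2": [10.0, 7.0, 5.5, 2.5],
}

# Score each replica separately, as the reviewer requested.
for name, i_calc in replica_curves.items():
    print(name, round(reduced_chi2(i_exp, i_calc, sigma), 2))
```

      Scoring each replica on its own, as well as the pooled ensemble, shows whether a good overall fit depends on only one of the trajectories.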

      (3) There is insufficient discussion about SAXS computations from MD trajectories. The accuracy of these calculations is crucial to deriving the existing conclusions, and the study's reliance on the PLUMED plugin, which is known to give inaccurate results for SAXS computations, raises concerns. How the solvent is treated in the SAXS computations needs to be explained. Alternative methods like WAXSiS or Crysol should be explored to check whether the SAXS profiles derived from the MD trajectory are consistent across other SAXS computation methods for the major conformers of the proteins.

      We have now clarified that, while the SAXS calculations used to perform Metainference MD were done using PLUMED (which, to our knowledge, is as accurate as crysol), the SAXS curves used for analysis were calculated using crysol.

      (4) The HDX and MD results do not seem to correlate well, and there is a disconnect between Figure 2 (SAXS profiles) and Figure 5 (HDX structural interpretation). The authors should quantitatively assess residue-level dynamics by comparing HDX signals with MD-derived HDX signals for each protein. This would provide a cross-validation between the experimental and computational data.

      In our opinion, our SAXS, MD and HDX-MS data provide a consistent picture. Our HDX-MS data do not provide per-residue information, making a quantitative comparison out of scope. RMSF data do not necessarily need to correlate with the deuterium uptake.

      (5) MD simulations are only used to refine the structure of AlphaFold predictions, but the trajectories could help explain why these structures differ, what stabilizes the dimer, or what leads to the conformational transition of the H state. A lack of analysis regarding the physical mechanism behind these structural changes is a weakness of the study. The authors should dedicate more effort to analyzing their data and provide physical insights into why these changes are observed.

      Our aim was to identify a property that could discriminate between AL and MM LCs. We used MD simulations, not to refine structures, but to explore the conformational dynamics of LCs (starting from either X-ray structures, homology or AlphaFold models), because SAXS data suggested that conformational dynamics could discriminate between AL- and MM-LCs. Simulations allowed us to propose a hypothesis, which we tested by HDX MS. While more insight is always welcome, we believe that we have achieved our goal for now. In the discussion, we present additional analysis of the simulations to connect with previous literature. We agree that more analysis can be done, and for this reason all our data are publicly available.

      Minor concerns

      (6) The abstract leans heavily on describing the problem and methods but lacks a clear presentation of key results. Providing a concise summary of the main findings (e.g., the identification of the H state) would better balance the abstract.

      We agree with the reviewer and we rewrote the abstract.

      (7) In the abstract, the term "experimental structure" is used ambiguously. Since SAXS also provides an experimental structure, it is unclear what the authors are referring to. This should be clarified.

      We agree with the reviewer and we rewrote the abstract.

      (8) Abbreviations such as VL (variable domain) and CL (constant domain) are not defined, making it harder for readers unfamiliar with the field to follow. Abbreviations should be defined when first mentioned.

      We agree with the reviewer and we rewrote the abstract.

      (9) The introduction provides a good general context but fails to explicitly define the knowledge gap. Specifically, the structural and dynamic determinants of LC amyloidogenicity are not well established, and this study could be framed as addressing that gap.

      We thank the reviewer and agree that this could be better framed; we have improved the introduction accordingly.

      (10) The introduction does not present the novel discovery of the H state early enough. The unique contribution of identifying this state as a marker for AL-LCs should be mentioned upfront to guide the reader through the significance of the study.

      We thank the reviewer and we have now made more explicit what we found.

      (11) The therapeutic implications of this research should be highlighted more clearly in the discussion. Examples of how these findings could be utilized in drug design or therapeutic approaches would enhance the study's impact.

      We thank the reviewer, but while we think that the H-state could be targeted for drug design, since we do not have data yet we do not want to stress this point more than what we are already doing.

      (12) There is an overwhelming use of abbreviations such as H3, H7, H18, M7, and M10 without proper introduction. This makes it difficult for readers to follow the results, and the average reader may become lost in the details. An introductory figure summarizing the sequences under study, along with a schematic of the dimeric structure defining VL and CL domains, would significantly aid comprehension.

      We agree and we tried to better introduce the systems and simplify the language without adding a figure that we think would be redundant.

      (13) In Figure 1, add labels to each SAXS curve to indicate which protein they correspond to. Also, what does online SEC-SAXS mean?

      Done

      (14) The caption of Figure 3 is unclear, particularly with abbreviations like Lb, Ls, G, and H, which are not mentioned in the captions. The authors should define these terms for clarity.

      Done

      (15) The study claims that the dominant structure of the dimer changes between different LCs. However, Figure 5 shows identical structures for all proteins, raising questions about the consistency between the SAXS and HDX data. This inconsistency is a general problem between the MD and HDX sections, where cross-communication and comparisons are not properly addressed.

      We do not claim that the dominant structure of the dimer changes between different LCs; this would also contradict the current literature. We claim a difference in a low-populated state. From this point of view, always using the same structure is consistent and should simplify the representation of the results. We agree that the manuscript may not always be easy to follow, and we thank the reviewer for helping us improve it.

      (16) The authors show I(q) vs q and residuals for each protein. The Kratky plots are not sufficient to compare the SAXS computations with the measured profile.

      Showing Kratky and residuals is a standard and complementary way to present and compare SAXS data to structures. Chi-square values are also reported. Log-log plots have been added to SI in response to previous comments.

      (17) The authors need to explain how they estimate the Rg values (from simulation or SAXS profiles). If they are using simulations, they should compute the Rg values from the simulations for comparison.

      Rg values reported in Table 1 are derived from SAXS. Rg from simulations have been added in Table 2.

      (18) The evolution of the sampling is unclear. The authors need to show the initial starting conformation in each case and the most likely conformation after M&M in the SI, to demonstrate that their approach indeed caused changes in the initial predictions.

      Our approach is not structure refinement, and as such the proposed analysis would be misleading. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that, as a whole, reproduce the data. Differences (or not) between initial and selected configurations will not be particularly informative in this context.

      (19) The authors should also provide a running average of chi-squared values over time to demonstrate that the conformational ensemble converged toward the SAXS profile.

      Our simulations are not driven to improve the agreement with SAXS over time; this is not structure refinement. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that, as a whole, reproduce the data. The suggested analysis would be a misinterpretation of our simulations. The comparison with SAXS is provided in Figure 2 and Table 2, as mentioned above.

      (20) The aggregate simulation time of 120 microseconds is misleading, as each replica was only run for 2-3 microseconds. This should be clarified.

      The number reported in the text is accurate and represents the aggregated sampling. The number of replicas for each metainference simulation and their length are reported in Table 2, which has been moved from the SI to the main text for clarity.

      (21) It is not clear how the replicas were weighted to compute the SAXS profiles and FES. There are two independent runs in each case, and each run has about 30 replicas. How these replicas are weighted needs to be discussed in the SI.

      Done

      (22) The methods section is unevenly distributed, with detailed explanations of LC production and purification, while other key methodologies like SAXS+MD integration and HDX are not even mentioned in the main text (they are in the Supporting Information). The authors should provide a brief overview of all methodologies in the main text or move everything to the SI for consistency.

      We agree with the reviewer; all methods are now in the main text.

      Reviewer #2 (Recommendations for the authors):

      (1) Computational M&M evidence is strong (Figure 3) and is supported by SAXS (used as restraints). However, Kratky plots reported in the main MS Figure 1 show significant differences between the data and the structural model only for one protein, AL-55. It is hard for the general reader to see how these SAXS data support a clear difference between AL and non-AL proteins. If possible, please strengthen the evidence; if not, soften the conclusions.

      We thank the reviewer for the comments. The chi-square (Table 1) and the residuals (Figure 1) are a strong indication of the difference. To strengthen the evidence, and also following the comment from reviewer 3, we calculated the p-value (<10⁻⁵) for the significance of the radius of gyration in discriminating AL and MM LCs. We agree that SAXS alone was not enough, and this is indeed what prompted us to perform MD simulations.

      (2) HDX MS results are cursory and not very convincing as presented. The butterfly plots in Figure 5 are too small to read and are unlabeled so it is unclear which protein is which.  

      Figure 5 has been reworked for readability. More data have been added in SI. 

      (3) What labeling time was selected to construct these plots and why?

      The deuterium uptakes at 30 min HDX time showed the most pronounced differences between different proteins and were therefore chosen to illustrate the key structural features in the main figure panel (Figure 5).

      How different are the results at other labeling times? Showing uptake curves (with errors) for more than just two peptides in the supplement Figure S12 might be helpful.

      We found a continuous increase in deuterium uptake as we increased the exchange time from 0.5 to 240 min, which reached saturation at 120 min. Therefore, the exchange follows the same pattern at all time points. Butterfly plots at different HDX times from 0.5 to 240 min are shown in a gradient from light blue to dark blue, which clearly shows the pattern of deuterium uptake at increasing incubation times (Figure 5). The HDX uptake kinetics of selected peptides with corresponding error bars are shown in Figure S12.

      How redundant are the data, i.e. how good is the peptide coverage/resolution in key regions at the domain-domain interface that the authors deem important? Mapping the maximal deuterium uptake on the structures in Figure 5 is not very helpful. Perhaps mapping the whole range of uptake using a gradient color scheme would be more informative.

      Overall coverage and redundancy for all four proteins are > 90% and > 4.0, respectively, and the average error margin in fractional uptake among all peptides is 0.04-0.05 Da, which suggests that our data are reliable (Table S3). We modified the main panel figures to show the gradient of deuterium uptake in blue-white-red for 0 to 30% deuterium uptake on chain A of the dimeric LCs.
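      For readers unfamiliar with these two metrics, the sketch below shows one common way to compute sequence coverage and redundancy from a peptide map. The definitions are assumptions made for illustration (coverage as the fraction of residues covered by at least one peptide, redundancy as the mean number of peptides per covered residue); HDX software packages differ in detail, and the peptide list is hypothetical rather than taken from Table S3.

```python
def coverage_and_redundancy(peptides, sequence_length):
    """Return (coverage, redundancy) for a peptide map.

    peptides: list of (start, end) residue numbers, 1-based and inclusive.
    Assumed definitions: coverage is the fraction of residues covered by at
    least one peptide; redundancy is the average number of peptides that
    cover each covered residue.
    """
    counts = [0] * sequence_length
    for start, end in peptides:
        for position in range(start, end + 1):
            counts[position - 1] += 1
    covered = sum(1 for c in counts if c > 0)
    coverage = covered / sequence_length
    redundancy = sum(counts) / covered if covered else 0.0
    return coverage, redundancy


# Hypothetical peptide map on a 10-residue sequence
coverage, redundancy = coverage_and_redundancy([(1, 5), (3, 8), (6, 10)], 10)
```

      In this toy case every residue is covered and each residue is seen by 1.6 peptides on average.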

      (3) Is the conformational heterogeneity depicted in M&M simulations consistent with HDX results? The authors may want to address this by looking at the EX1/EX2 exchange kinetics for AL vs. non-AL proteins. Do AL proteins show more EX1?

      No, we do not see any EX1 exchange kinetics in our analysis. This is compatible with the prediction that the H-state is a native-like state and not an unfolded or partially folded state.

      (4) Perhaps the main conclusion could be softened given the small number of proteins (six), esp. since only four (3 AL and 1 non-AL) could be explored by HDX. Are other HDX MS data of AL LCs from the same Lambda6 family (e.g. PMID: 34678302) consistent with the conclusions that a particular domain-domain interface is weakened in AL vs. non-AL LCs?

      We thank the reviewer for this suggestion. A difference in HDX MS data is indeed visible between AL and MM proteins for peptide 33-47 in the suggested paper (Figures 4, S5 and S8). The difference is reduced by the mutation identified in the paper as driving the aggregation in that specific case. We now mention this in the discussion.

      (5) Please clarify if the H* state is the same for a covalent vs. non-covalent LC dimer.

      We do not know, because our data are only for covalent dimers. Interestingly, however, the state is very similar to what was observed for a model kappa light chain in Weber et al.; we have better highlighted this point in the discussion.

      (6) Please try and better explain why a smaller distance between CL domains in H7 protein and a larger distance in other AL proteins both promote protein misfolding.

      We do not have elements to discuss this point in more detail.

      (7) Please comment on the Kratky plots data vs. model agreement (see comments above).

      Done.

      (8) Please find a better way to display, describe, and interpret the HD exchange MS data.

      We have generated new main text (new Figure 5) and SI figures that we think allow the reader to better appreciate our observations. The corresponding results sections have also been improved.

      Minor points:

      (9) Is the population of the H-state with perturbed CL-CL domain interface, which was obtained in M&M simulations, sufficient to be observable by HDX MS?

      While populations alone are not enough to determine what is observable by HDX MS, a 10% population corresponds roughly to 6 kJ/mol of ΔG and is compatible with EX2 kinetics. Previous works suggested that HDX-MS data should be sensitive to subpopulations of the order of 10% (https://doi.org/10.1016/j.bpj.2020.02.005, https://doi.org/10.1021/jacs.2c06148).
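      The population-to-free-energy conversion quoted above can be checked with a short calculation. This is a minimal sketch that assumes a two-state equilibrium between the H-state and the remaining ensemble and a temperature of 298.15 K; neither assumption is stated in the response, and the function name is illustrative.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def delta_g_from_population(p_minor, temperature_k=298.15):
    """Free-energy difference (kJ/mol) of a minor state with population p_minor.

    Assumes a two-state equilibrium, so that
    delta_G = -R * T * ln(p_minor / (1 - p_minor)).
    A positive value means the minor state is less stable.
    """
    if not 0.0 < p_minor < 1.0:
        raise ValueError("p_minor must lie strictly between 0 and 1")
    return -R_KJ_PER_MOL_K * temperature_k * math.log(p_minor / (1.0 - p_minor))


dg = delta_g_from_population(0.10)  # about 5.4 kJ/mol at 298.15 K
```

      Under these assumptions a 10% population gives about 5.4 kJ/mol, the same order as the rounded figure of roughly 6 kJ/mol given in the response.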

      (10) Typically, an excited intermediate in protein unfolding is a monomer, while here it is an LC dimer. Is this unusual?

      This is a good point. We think that intermediates have mostly been studied on monomeric proteins because these are more commonly used as model systems, but we prefer not to discuss this point further.

      (11) Low deuterium uptake is consistent with a rigid structure but may also reflect buried structure and/or structure that moves on a time scale greater than the labeling time.

      We agree.

      Reviewer #3 (Recommendations for the authors):

      (1) The p-value (statistical significance) of Rg difference should be computed.

      We thank the reviewer for the suggestion; we calculated the p-value, which turned out to be quite significant.

      (2) The significance of mutations (SHM?) at the interface, such as A40G should be compared with previous observations. (Garrofalo et al., 2021).

      We thank the reviewer for the suggestion; a sentence has been added in the discussion.

    1. eLife Assessment

      The authors present three transgenic models carrying three representative exon deletions of the dystrophin gene. The findings presented are valuable to the field of muscle diseases, particularly muscular dystrophies. The evidence provided in the manuscript is convincing, with rigorous biochemical assays and state-of-the-art microscopy methods.

    2. Reviewer #2 (Public review):

      Miyazaki et al. established three distinct BMD mouse models by deleting different exon regions of the dystrophin gene, observed in human BMD. The authors demonstrated that these models exhibit pathophysiological changes, including variations in body weight, muscle force, muscle degeneration, and levels of fibrosis, alongside underlying molecular alterations such as changes in dystrophin and nNOS levels. Notably, these molecular and pathological changes progress at different rates depending on the specific exon deletions in the dystrophin gene. Additionally, the authors conducted extensive fiber typing, revealing a site-specific decline in type IIa fibers in BMD mice, which they suggest may be due to muscle degeneration and reduced capillary formation around these fibers.

      Strengths:

      The manuscript introduces three novel BMD mouse models with different dystrophin exon deletions, each demonstrating varying rates of disease progression similar to the human BMD phenotype. The authors also conducted extensive fiber typing across different muscles and regions within the muscles, effectively highlighting a site-specific decline in type IIa muscle fibers in BMD mice.

      Comments on revisions:

      The authors did an excellent job addressing all or most of the concerns I raised in my previous review and have incorporated the necessary changes into the manuscript.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this article the authors described mouse models of Becker muscular dystrophy. They created three transgenic models carrying three representative exon deletions: ex45-48 del., ex45-47 del., and ex45-49 del. This article is well written but needs improvement in some points.

      Strengths:

      This article is well written. The evidence supporting the authors' claims is robust, though further implementation is necessary. The experiments conducted align with the current state-of-the-art methodologies.

      Weaknesses:

      This article does not analyze atrophy in the various mouse models. Implementing this point would improve the impact of the work.

      We thank the reviewer for their constructive suggestions and comments on this work. Muscle hypertrophy is shown with growth in dystrophin-deficient skeletal muscle in mdx mice; thus, we did not pay attention to the factors associated with muscle atrophy in BMD mice. As the reviewer suggested, the examination of the association between type IIa fiber reduction and muscle atrophy is important, and the result is considered to be helpful in resolving the cause of type IIa fiber reduction in BMD mice.

      In response, we reviewed the following.

      (1) The cross-sectional areas (CSAs) of muscles. We confirmed that the CSAs in BMD and mdx mice were rather high at 3 months, in accordance with muscle hypertrophy, compared with those of WT mice. The data are presented in Fig. 4–figure supplement 1B.

      (2) The mRNA expression levels of Murf1 and atrogin-1. We confirmed that these muscle atrophy-inducing factors did not differ among WT, BMD, and mdx mice. The data are presented in Fig. 4–figure supplements 1C and 1D.

      Reviewer #2 (Public review):

      Summary:

      Miyazaki et al. established three distinct BMD mouse models by deleting different exon regions of the dystrophin gene, observed in human BMD. The authors demonstrated that these models exhibit pathophysiological changes, including variations in body weight, muscle force, muscle degeneration, and levels of fibrosis, alongside underlying molecular alterations such as changes in dystrophin and nNOS levels. Notably, these molecular and pathological changes progress at different rates depending on the specific exon deletions in the dystrophin gene. Additionally, the authors conducted extensive fiber typing, revealing a site-specific decline in type IIa fibers in BMD mice, which they suggest may be due to muscle degeneration and reduced capillary formation around these fibers.

      Strengths:

      The manuscript introduces three novel BMD mouse models with different dystrophin exon deletions, each demonstrating varying rates of disease progression similar to the human BMD phenotype. The authors also conducted extensive fiber typing across different muscles and regions within the muscles, effectively highlighting a site-specific decline in type IIa muscle fibers in BMD mice.

      Weaknesses:

      The authors have inadequate experiments to support their hypothesis that the decay of type IIa muscle fibers is likely due to muscle degeneration and reduced capillary formation. Further investigation into capillary density and histopathological changes across different muscle fibers is needed, which could clarify the mechanisms behind these observations.

      We thank the reviewer for these positive comments and the very important suggestion about type IIa fiber reduction and capillary change around muscle fibers in BMD mice. From the results of the cardiotoxin-induced muscle degeneration and regeneration model, type IIa and IIx fibers showed delayed recovery compared with that of type-IIb fibers. However, this delayed recovery of type IIa and IIx could not explain the cause of the selective muscle fiber reduction limited to type IIa fibers in BMD mice. Therefore, we considered vascular dysfunction as the reason for the selective type IIa fiber reduction, and we found morphological capillary changes from a “ring pattern” to a “dot pattern” around type IIa fibers in BMD mice. However, the association between selective type IIa fiber reduction and the capillary change around muscle fibers in BMD mice remains unclear due to the lack of information about capillaries around type IIx and IIb fibers. The reviewer pointed out this insufficient evaluation of capillaries around other muscle fibers (except for type IIa fibers), and this suggestion is very helpful for explaining the association between selective type IIa fiber reduction and vascular dysfunction in BMD mice.

      In response, we reviewed the following.

      (1) The capillary formation around type IIx, IIb, and I fibers, in addition to that around type IIa fibers. We found that capillaries contacting type IIx, IIb, and I fibers were poor in WT mice compared with those around type IIa fibers, with ‘incomplete ring-patterns’ around type IIx fibers and ‘dot-patterns’ around type IIb and I fibers in WT mice. Morphological capillary changes around muscle fibers from WT to d45-49 and mdx mice were ‘incomplete ring-pattern’ to ‘dot-pattern’ around type IIx fibers, and ‘dot-pattern’ to ‘dot-pattern’ around type IIb and I fibers. This was in contrast to those around type IIa fibers: a remarkable ‘ring-pattern’ to ‘dot-pattern’ change. These data are presented in Fig. 6B.

      (2) The endothelial area in contact with type IIx, IIb, and I fibers, and additionally that in contact with type IIa fibers. The endothelial area in contact with both type IIa and IIx fibers was less in d45-49 and mdx mice than in WT mice, but the reduction was larger around type IIa fibers than around type IIx fibers, reflecting the difference between the ‘ring-pattern’ around the former and the ‘incomplete ring-pattern’ around the latter in WT mice. These data are presented in Fig. 6C.

      (3) Transversely interconnected branches and capillary loops, using longitudinal muscle sections. We confirmed that there were fewer interconnected capillaries in BMD and mdx mice than in WT mice. These data are presented in Fig. 6E.

      (4) The mRNA expression levels of neuronal nitric oxide synthase (nNOS). We confirmed that nNOS protein expression levels were decreased in BMD and mdx mice in spite of adequate levels of nNOS mRNA expression. The data on nNOS mRNA expression levels is presented in Fig. 3–figure supplement 1C.

      (5) We added a sentence in the Abstract about the potential utility of BMD mice in developing vascular targeted therapies.

      Recommendation for the authors:

      Reviewer #1 (Recommendation for the authors):

      Abstract:

      More emphasis should be placed on the pathological implications of Becker muscular dystrophy (BMD). Furthermore, the findings made in this article and the conclusions should be emphasized. Abbreviations such as DMD and MDX should be written in full at first use and only then abbreviated.

      We appreciate the reviewers’ comments, and we apologize for the confusion over abbreviations. DMD is the gene name encoding dystrophin, and mdx is the strain name of mouse lacking dystrophin.

      In the Abstract and the Figure legends we changed:

      (1) DMD to DMD (italicized);

      (2) mdx mice to mdx mice (italicized).

      Results:

      Line 95: in this line, the authors evaluated serum creatine kinase (CK) levels at 1, 3, 6 and 12 months in WT mice and mdx mice. Why did you decide to study it? This part should be described in more detail. Serum CK is one of the main markers of muscle necrosis; therefore, I would report this data alongside the description of the muscle histology and necrotic fibers.

      We thank the reviewers for the important remarks. In this study, serum creatine kinase (CK) levels were two-fold to four-fold higher in BMD mice than in WT mice, but its rate of increase was less than that of mdx mice. We consider that the lesser changes in serum CK levels in BMD mice may be due to the smaller area of muscle degeneration because of focal and uneven muscle degeneration compared with that in mdx mice, which showed diffuse muscle degeneration.

      In response, we have moved the description of serum CK levels in the Results, from the section about the establishment of BMD mice to the section about site-specific muscle degeneration in BMD mice.

      In addition, we added a description in the Discussion about the possible association between the lesser changes in serum CK levels in BMD mice and its uneven distribution of muscle degeneration.

      Line 192-202: In these lines, the authors observed a decrease in type IIa fibers after 3 months in BMD mice. I suggest also evaluating atrophy by measuring cross-sectional areas (CSA) and the expression of Murf1 and Atrogin1.

      We thank the reviewer for the point about the association between type IIa fiber reduction and muscle atrophy. We evaluated the CSAs and the mRNA expression levels of Murf1 and atrogin-1. We confirmed that the CSAs in BMD and mdx mice were rather high at 3 months, in accordance with muscle hypertrophy, compared with those of WT mice, and that Murf1 and atrogin-1 mRNA expression levels did not differ among WT, BMD, and mdx mice. These data are presented in Fig. 4–figure supplements 1B, 1C, and 1D. We added a sentence about the changes in CSA and muscle atrophy inducing factors in the Discussion.

      Methods and material

      Line 342-348: authors have described animals, but not specified sex and number of mice in each group. This part should be improved.

      We apologize for our insufficient information about the sex and number of mice in the Materials and methods.

      We added a sentence specifying the sex, number, and evaluation period of each mouse group in the section on the generation of BMD mice.

      Line 426-433: authors described qPCR. It is necessary that the authors also describe primer sequences.

      We apologize for any lack of information about the primer sequences used in qPCR analysis. Supplemental Table 1 lists the primer sequences.

      We also added a sentence about the information in the primer list in the section on RNA isolation and RT-PCR in the Materials and methods.

      Reviewer #2 (Recommendation for the authors):

      Miyazaki et al. established three distinct BMD mouse models by removing different exon regions of the dystrophin gene. The authors demonstrated that the pathophysiological and molecular changes in these models progress at varying rates. Additionally, they observed a site-specific decline in type IIa fibers in BMD mice, while the proportions of other fiber types, such as type I and type IIx, remained consistent with those in wild-type mice. They proposed that the selective decay of type IIa fibers in BMD mice could be due to two primary factors: 1) muscle degeneration and regeneration, supported by their findings in cardiotoxin-treated mouse models, and 2) reduced capillary formation around type IIa fibers. However, the authors also presented evidence that type IIx fibers exhibited delayed recovery, similar to type IIa fibers, as demonstrated in cardiotoxin-induced regeneration models. Additionally, dot-patterned capillary formations were observed around both type IIa and type IIx fibers. Despite these findings, BMD mice did not show any changes in the proportion of type IIx fibers in inner BMD muscles. The authors should consider adding further analysis to strengthen their hypothesis and to disclose any possible mechanisms that led to these discrepancies.

      If the authors hypothesize that reduced capillary density around type IIa fibers contribute to their site-specific decay in BMD mice, they should consider measuring and statistically analyzing the endothelial area around all fiber types. By plotting and comparing these measurements across different fiber types between wild-type, BMD, and mdx mice, the authors could provide more robust evidence to support their hypothesis. This approach would help clarify whether reduced capillary density is a contributing factor to the site-specific decay of type IIa fibers in BMD mice and the more diffuse, non-specific muscle changes observed in mdx mice.

      The authors reported in the first part of the manuscript that histopathological changes, including muscle degeneration in BMD mice, are predominantly restricted to the inner part of the muscles. In the second part, they noted a decline in type IIa fibers specifically in the inner muscle region. To strengthen the hypothesis that the decay of type IIa fibers in the inner muscle is linked to muscle degeneration, the authors should consider performing histopathological measurements across different fiber types within the inner muscle. Reporting the correlations between these measurements would provide more compelling evidence to support their hypothesis.

      We thank the reviewer for these important suggestions about the association between type IIa fiber reduction and capillary change around muscle fibers in BMD mice. We prepared an additional evaluation of the capillary formation (in Fig. 6B) and endothelial area (in Fig. 6C) around type IIx, IIb, and I fibers. We found that capillaries contacting type IIx, IIb, and I fibers were poor in WT mice compared with those around type IIa fibers, and showed an ‘incomplete ring-pattern’ around type IIx fibers and a ‘dot-pattern’ around type IIb and I fibers in WT mice, in contrast with type IIa fibers, which showed remarkable ‘ring-pattern’ capillaries. Reflecting this, the changes in endothelial area around type IIx, IIb, and I fibers between WT and BMD mice were less than those around type IIa fibers. These results suggest that type IIa fibers may require more capillaries and better-maintained blood flow than type IIx, IIb, and I fibers, and this high requirement for blood flow might be associated with the type IIa fiber-specific decay in BMD mice.

      We added the following.

      (1) Sentences in the Results about the capillary changes around type IIx, IIb, and I fibers in WT, d45-49, and mdx mice.

      (2) Sentences in the Results about the changes in endothelial area around type IIx, IIb, and I fibers in WT, d45-49, and mdx mice.

      (3) Sentences in the Discussion about the association between the type IIa fiber-specific decay in BMD mice and the differences in capillary changes of each muscle fiber from WT to BMD mice.

      We changed a sentence in the Discussion about the delayed recovery of type IIa and IIx fibers after CTX injection, to make it clear that the recovery of type IIx fibers was slower than that of type IIa fibers after CTX injection, and that therefore the type IIa fiber-specific decay in BMD mice might not be explained by this vulnerability and delayed recovery during muscle degeneration and regeneration.

      Minor Issues:

      Line 103: The word "mice" is duplicated and should be corrected.

      We apologize that “mice” was duplicated. We have corrected it.

      Line 120: Revise for clarity: "The proportion of opaque fibers is significantly different between d45-48 mice and WT at 3 months, with an increased tendency observed only in 1-month-old mice."

      We apologize for the confusion about the proportion of opaque fibers. We revised this sentence as follows.

      “Opaque fibers, which are thought to be precursors of necrotic fibers, increased at an earlier age of 1 month in d45–49 mice compared with WT mice; in contrast, the proportion of opaque fibers differs significantly between d45–47 and WT mice at 3 months, with an increased tendency only in 1-month-old mice (Fig. 2C).”

      Line 152: Clarify the statement regarding utrophin levels, as it currently contradicts the Western blot data. The sentence reads: "The increased levels of utrophin are 8-fold higher at 1 month and 30-fold higher at 3 months." This should be verified against the data, as the band densities in the Western blots suggest otherwise.

      We apologize for the confusion about utrophin expression levels. We revised this sentence as follows.

      “By western blot analysis, the utrophin expression levels showed only an increased tendency in all BMD mice at 3 months, whereas there was a significant increase in mdx mice (8-fold at 1 month, and 30-fold at 3 months) compared to WT mice (Figs. 3C and F).”

      Line 235: Correct the sentence to accurately reflect the findings: "BMD mice showed reduced muscle weakness."

      We apologize for our incorrect wording. We have removed the word “reduced” in this sentence.

    1. eLife Assessment

      This valuable work provides solid evidence that a neuronal metallothionein, GIF/MT-3, incorporates metal-persulfide clusters. A variety of well-designed assays support the authors' hypothesis, revealing that sulfane sulfur is released from MT-3. The biological role of the persulfidated form is not yet clearly defined. There are caveats to the findings that limit the study, but the work will nevertheless prompt major follow-up work.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors reveal that GIF/MT-3 regulates zinc homeostasis depending on the cellular redox status. The manuscript is technically sound, and the data concretely suggest that recombinant MTs, not only GIF/MT-3 but also canonical MTs such as MT-1 and MT-2, contain sulfane sulfur atoms for Zn binding. The scenario proposed by the authors seems reasonable to explain Zn homeostasis by the cellular redox balance.

      Strengths:

      The data presented in the manuscript solidly reveal that recombinant GIF/MT-3 contains sulfane sulfur.

      Weaknesses:

      It remains unclear whether native MTs, in particular induced MTs in vivo contain sulfane sulfur or not.

      Comments on revisions:

      Although the authors have revealed the sulfane sulfur content in native MT-3, my question, namely whether canonical MT-1 and MT-2 contain sulfane sulfur after induction, remains unanswered. The authors argue that the biological significance of sulfane sulfur in MTs lies in its ability to contribute to metal binding affinity, provide a sensing mechanism against oxidative stress, and aid in the regulation of the protein. Due to their biological roles, induced MT-1 and MT-2 could contain sulfane sulfur in their molecules. Thus, I expect the authors to evaluate or explain the sulfane sulfur content in induced MT-1 and MT-2.

    3. Reviewer #3 (Public review):

      Summary:

      The authors were trying to show that a novel neuronal metallothionein of poorly defined function, GIF/MT3, is actually heavily persulfidated in both the Zn-bound and apo (metal-free) forms of the molecule as purified from a heterologous (bacterial) or native host. Evidence in support of this conclusion is strong, with both spectroscopic and mass spectrometry evidence strongly consistent with this general conclusion. The authors would appear to have achieved their aims.

      Strengths:

      The analytical data in support of the authors' primary conclusions are strong. The authors also provide some modeling evidence that supports the contention that MT3 (and other MTs) can readily accommodate a sulfane sulfur on each of the 20 cysteines in the Zn-bound structure, with little perturbation of the overall structure. This is not the case with Cys trisulfides, which suggests that the persulfide-metallated state is clearly positioned at lower energy relative to the immediately adjacent thiolate- or trisulfidated metal coordination complexes.

      Weaknesses:

      The biological significance of the findings is not entirely clear. On the one hand, the analytical data are solid (albeit using a protein derived from a bacterial over-expression experiment), and yes, it's true that sulfane S can protect Cys from overoxidation, but everything shown in the summary figure (Fig. 9D) can be done with Zn release from a thiol by ROS, and subsequent reduction by the Trx/TR system. In addition, it's long been known that Zn itself can protect Cys from oxidation. I view this as a minor shortcoming that will motivate follow-up studies.

      Impact:

      The impact will be high since the finding is potentially disruptive to the MT field for sure. The sulfane sulfur counting experiment (the HPE-IAM electrophile trapping experiment) may well be widely adopted by the field. Those in the metals field always knew that this was a possibility, and it will be interesting to see the extent to which metal-binding thiolates broadly incorporate sulfane sulfur into their first coordination shells.

      Comments on revisions:

      The revised manuscript is only slightly changed from the original, with the inclusion of a supplementary figure (Fig. S2) and minor changes in the text. The authors did not choose to carry out the quantitative Zn binding experiment (which I really wanted to see), but given the complexities of the experiment, I'll let it go.

      Fig. 9: the authors imply in the mechanistic "redox-switch" figure that Trx/TR cannot reduce persulfide linkages. A number of groups have shown this to be the case. I recommend modifying the figure legend or text to make this clear to the reader.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The manuscript by Dr. Shinkai and colleagues is about the posttranslational modification of a highly important protein, MT3, also known as the growth inhibitory factor. Authors postulate that MT3, or generally all MT isoforms, are sulfane sulfur binding proteins. The presence of sulfane sulfur at each Cys residue has, according to the authors, a critical impact on redox protein properties and almost does not affect zinc binding. They show a model in which 20 Cys residues with sulfane sulfur atoms can still bind seven zinc ions in the same clusters as unmodified protein. They also show that recombinant MT3 (but also MT1 and MT2) protein can react with HPE-IAM, an efficient trapping reagent of persulfides/polysulfides. This reaction performed in a new approach (high temperature and high reagent concentration) resulted in the formation of bis-S-HPE-AM product, which was quantitatively analyzed using LC-MS/MS. This analysis indicated that all Cys residues of MT proteins are modified by sulfane sulfur atoms. The authors performed a series of experiments showing that such protein can bind zinc, which dissociates in the reaction with hydrogen peroxide or SNAP. They also show that oxidized MT3 is reduced by thioredoxin. It gives a story about a new redox-dependent switching mechanism of zinc/persulfide cluster involving the formation of cystine tetrasulfide bridge.

      The whole story is hard to follow due to the lack of many essential explanations or full discussion. What needs to be clarified is the conclusion (or its lack) about MT3 modification proven by mass spectrometry. Figure 1B shows the FT-ICR-MALDI-TOF/MS spectrum of recombinant MT3. It clearly shows the presence of unmodified MT3 protein without zinc ions. Ions dissociate in acidic conditions used for MALDI sample preparation. If the protein contained all Cys residues modified, its molecular weight would be significantly higher. Then, they show the MS spectrum (low quality) of oxidized protein (Fig. 1C), in which new signals (besides reduced apo-MT3) are observed. They conclude that new signals come from protein oxidation and modification with one or two sulfur atoms. If the conclusion on Cys residue oxidation is reasonable, how this protein contains sulfur is unclear. What is the origin of the sulfur if apo-MT does not contain it? Oxidized protein was obtained by acidification of the protein, leading to zinc dissociation and subsequent neutralization and air oxidation. Authors should perform a detailed isotope analysis of the isotopic envelope to prove that sulfur is bound to the protein. They say that the +32 mass increase is not due to the appearance of two oxygen donors. They do not provide evidence. This protein is not a sulfane sulfur binding protein, or its minority is modified. Moreover, it is unacceptable to write that during MT3 oxidation are "released nine molecules of H2". How is hydrogen molecule produced? Moreover, zinc is not "released", it dissociates from protein in a chemical process.

      Thank you for your comment. According to your suggestion, we have rewritten the corresponding sentences below, together with addition of new Fig.1D.

      First, the sentence “which corresponded to the mass of zinc-free apo-GIF/MT3 and indicated that zinc was removed during MS analysis.” was changed to “which corresponded to the mass of zinc-free apo-GIF/MT3 and indicated that zinc dissociates from protein in acidic conditions used for MALDI sample preparation.” in the introduction section. Second, we have added the following sentence “However, FT-ICR-MALDI-TOF/MS analysis failed to detect sulfur modifications in GIF/MT-3 (Fig. 1B), suggesting that sulfur modifications in the protein were dissociated during laser desorption/ionization. Therefore, we postulate that the small amount of sulfur detected in oxidized apo-GIF/MT-3 is derived from the effect of laser desorption/ionization rather than any actual modification of the minority component.” in the discussion section. Third, we have added new Fig. 1D and the corresponding citation in the introduction. Fourth, the sentence “An increase in mass of 32 Da can also result from addition of two oxygen atoms, but we attributed it to one sulfur atom for reasons described later.” was changed to “Note that an increase in mass of 32 Da can also result from addition of two oxygen atoms.”.
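
      The ambiguity in the +32 Da shift discussed here can be made concrete with a short calculation. The sketch below uses standard monoisotopic masses of 32S and 16O; the 7000 Da ion mass is an illustrative placeholder for a small protein, not a value reported in the manuscript, and the function names are hypothetical.

```python
# Monoisotopic masses in daltons (standard reference values).
MASS_32S = 31.97207117
MASS_16O = 15.99491462


def mass_shift_gap() -> float:
    """Mass difference (Da) between adding two 16O atoms and adding one 32S atom."""
    return 2 * MASS_16O - MASS_32S


def required_resolving_power(ion_mass: float, gap: float) -> float:
    """Resolving power m/dm needed to separate two species that differ by `gap` Da."""
    return ion_mass / gap


if __name__ == "__main__":
    gap = mass_shift_gap()
    # 7000 Da is an assumed, illustrative ion mass (not taken from the manuscript).
    power = required_resolving_power(7000.0, gap)
    print(f"O2 minus S gap: {gap:.5f} Da")
    print(f"Resolving power needed at 7000 Da: {power:,.0f}")
```

      The two assignments coincide at nominal mass and differ by roughly 0.018 Da, so separating them by exact mass alone on an intact ion of this size would demand very high resolving power. The isotopic-envelope analysis requested by the reviewer is a complementary way to test the assignment.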

      Another important point is a new approach to the HPE-IAM application. Zinc-binding MT3 was incubated with 5 mM reagent at 60°C for 36 h. Authors claim that high concentration was required because apoMT3 has stable conformation. Figure 2B shows that product concentration increases with higher temperature, but it is unclear why such a high temperature was used. Figure 1D shows that at 37°C, there is almost no reaction at 5 mM reagent. Changing parameters sounds reasonable only when the reaction is monitored by mass spectrometry. In conclusion, about 20 sulfane sulfur atoms present in MT3 would be clearly visible. Such evidence was not provided. Increased temperature and reagent concentration could cause modification of cysteinyl thiol/thiolates as well, not only persulfides/polysulfides. Therefore, it is highly possible that non-modified MT3 protein could react with HPE-IAM, giving false results. Besides mass spectrometry, which would clearly prove modifications of 20 Cys, authors should use very important control, which could be chemically synthesized beta- or alfa-domain of MT3 reconstituted with zinc (many protocols are present in the literature). Such models are commonly used to test any kind of chemistry of MTs. If a non-modified chemically obtained domain would undergo a reaction with HPE-IAM under such rigorous conditions, then my expectation would be right.

      Thank you for your comments. Although we have already confirmed that no false-positive results were observed using this method in Fig. 5 (previously Fig. 4), we have conducted additional experiments by preparing chemically synthesized α- and β-domains of GIF/MT-3, as well as recombinant α- and β-domains of GIF/MT-3. As shown in the new Fig. S2A, almost no sulfane sulfur was detected in the chemically synthesized α- and β-domains of GIF/MT-3 (less than 1 molecule per protein), whereas several molecules of sulfane sulfur were detected in the recombinant α- and β-domains (more than 5 molecules per protein) (Fig. S2A). Therefore, we would like to emphasize here that the cysteine residue itself cannot be the source of the bis-S-HPE-AM product (sulfane sulfur derivative).

      Accordingly, we have added the following sentence in the results section: “Because this assay was performed at relatively high temperatures (60°C), we also examined the sulfane sulfur levels of several mutant proteins using chemically synthesized α- and β-domains of GIF/MT-3 to eliminate false-positive results. As shown in Fig. S2A, sulfane sulfur (less than 1 molecule per protein) was undetectable in chemically synthesized α- and β-domains of GIF/MT-3, whereas several molecules of sulfane sulfur per protein were detected in recombinant α- and β-domains (Fig. S2B, left panel). These findings indicated that the sulfane sulfur detected in our assay was derived from biological processes executed during the production of GIF/MT-3 protein. We further analyzed mutant proteins with β-Cys-to-Ala and α-Cys-to-Ala substitutions and found that their sulfane sulfur levels were comparable with those of the α- and β-domains of GIF/MT-3, respectively (Fig. S2B, left panel). Additionally, Ser-to-Ala mutation did not affect the sulfane sulfur levels of GIF/MT-3. The zinc content of each mutant protein was also determined under these conditions (Fig. S2B, right panel).”

      - The remaining experiments provided in the manuscript can also be applied for non-modified protein (without sulfane sulfur modification) and do not provide worthwhile evidence. For instance, hydrogen peroxide or SNAP may interact with non-modified MTs. Zinc ions dissociate due to cysteine residue modification, and TCEP may reduce oxidized residue to rescue zinc binding. Again, mass spectrometry would provide nice evidence.

      Thank you for your comment. We understand that such experiments can also be applied to non-modified proteins (without sulfane sulfur modification). However, the experiments shown in Fig. 4 and Fig. 6 were conducted to investigate the role of sulfane sulfur under oxidative stress conditions, rather than to examine sulfur modification in the protein itself. As mentioned previously, it is difficult to detect sulfur modifications directly in the protein using MALDI-TOF/MS (Fig. 1), as sulfur modifications appear to dissociate during the laser desorption/ionization process.

      - The same is thioredoxin (Fig. 7) and its reaction with oxidized MT3. Nonmodified and oxidized MT3 would react as well.

      Thank you for your comment. We understand that such experiments can also be applied to non-modified MT-3 protein. However, to the best of our knowledge, this is the first report demonstrating that apo-MT-3 can serve as a good substrate for the Trx system. In fact, this experiment is not intended to prove that MT-3 is a sulfane sulfur-binding protein. Rather, it demonstrates the novel finding that apo-MT-3 serves as an excellent substrate for Trx and that the sulfane sulfur (persulfide structure) remains intact throughout the reduction process.

      - If HPE-IAM reacts with Cys residues with unmodified MT3, which is more likely the case under used conditions, the protein product of such reaction will not bind zinc. It could be an explanation of the cyanolysis experiment (Fig. 6).

      Thank you for your comment. As you pointed out, HPE-IAM reacts with cysteine residues in unmodified MT-3, thereby preventing zinc from binding to the protein. However, we did not use HPE-IAM prior to measuring zinc binding. Instead, HPE-IAM was used solely for determining the sulfane sulfur content in the protein, and thus it cannot explain the results of the cyanolysis experiment.

      - Figure 4 shows the reactivity of (poly)sulfides with TCEP and HPE-IAM. What are redox potentials? Do they correlate with the obtained results?

      Thank you for your comment. However, we must apologize as we do not fully understand the rationale behind determining redox potentials in this experiment. We believe the data themselves are clear and the results convincing.

      - Raman spectroscopy experiments would illustrate the presence of sulfane sulfur in MT3 only if all Cys were modified.

      Yes, that is correct. Since approximately 20 sulfane sulfur atoms are detected in the protein with 20 cysteine residues, we believe that nearly all cysteine residues are modified by sulfane sulfur. Therefore, Raman spectroscopy is considered applicable to our current study.

      - The modeling presented in this study is very interesting and confirms the flexibility of metallothioneins. MT domains are known to bind various metal ions of different diameters, adapting in this way to ions of larger size. The same mechanism could operate from the protein side. The presence of 9 or 11 sulfur atoms in the beta or alpha domain would increase the size of the domains without changing the cluster structure.

      We truly appreciate your positive evaluation of this work.

      - Comment to authors. Apo-MT is not present in the cell. It exists as a partially metallated species. The term "apo-MT" was introduced to explain that MTs are not fully saturated by metals and function as a metal buffer system. Apo-MT comes from old ages when MT was considered to be present only in two forms: apo-form and fully saturated forms.

      Thank you for your insightful comments. We find it reasonable to understand that apo-MT exists as a partially metallated species within the cell.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors reveal that GIF/MT-3 regulates zinc homeostasis depending on the cellular redox status. The manuscript is technically sound, and the data concretely suggest that recombinant MTs, not only GIF/MT-3 but also canonical MTs such as MT-1 and MT-2, contain sulfane sulfur atoms for Zn binding. The scenario proposed by the authors seems reasonable to explain Zn homeostasis by the cellular redox balance.

      Strengths:

      The data presented in the manuscript solidly reveal that recombinant GIF/MT-3 contains sulfane sulfur.

      Weaknesses:

      It is still unclear whether native MTs, in particular, induced MTs in vivo contain sulfane sulfur or not.

      Thank you for pointing out the strengths and weaknesses of this manuscript. Based on your suggestions, we have determined the sulfane sulfur content in the native GIF/MT-3 protein, as explained in our response to "Recommendations for the Authors #2."

      Reviewer #3 (Public Review):

      Summary:

      The authors were trying to show that a novel neuronal metallothionein of poorly defined function, GIF/MT3, is actually heavily persulfidated in both the Zn-bound and apo (metal-free) forms of the molecule as purified from a heterologous or native host. Evidence in support of this conclusion is compelling, with both spectroscopic and mass spectrometry evidence strongly consistent with this general conclusion. The authors would appear to have achieved their aims.

      Strengths:

      The analytical data in support of the authors' primary conclusions are compelling. The authors also provide some modeling evidence that strongly supports the contention that MT3 (and other MTs) can readily accommodate sulfane sulfur on each of the 20 cysteines in the Zn-bound structure, with little perturbation of the structure. This is not the case with Cys trisulfides, which suggests that the persulfide-metallated state is clearly positioned at lower energy relative to the immediately adjacent thiolate- or trisulfidated metal coordination complexes.

      Weaknesses:

      The biological significance of the findings is not entirely clear. On the one hand, the analytical data are clearly solid (albeit using a protein derived from a bacterial over-expression experiment), and yes, it's true that sulfane S can protect Cys from overoxidation, but everything shown in the summary figure (Fig. 8D) can be done with Zn release from a thiol by ROS, and subsequent reduction by the Trx/TR system. In addition, it's long been known that Zn itself can protect Cys from oxidation. I view this as a minor weakness that will motivate follow-up studies. Fig. 1 was incomplete in its discussion and only suggests that a few S atoms may be covalently bound to MT3 as isolated. This is in contrast to the sulfane S "release" experiment, which I find quite compelling.

      Impact:

      The impact will be high since the finding is potentially disruptive to the metals in the biology field in general and the MT field for sure. The sulfane sulfur counting experiment (the HPE-IAM electrophile trapping experiment) may well be widely adopted by the field. Those of us in the metals field always knew that this was a possibility, and it will be interesting to see the extent to which metal-binding thiolates broadly incorporate sulfane sulfur into their first coordination shells.

      Thank you for pointing out the strengths and weaknesses of this manuscript. As you noted, the explanations and discussions regarding Fig. 1 were missing. To address this, we have added the following sentences to the discussion section: “However, FT-ICR-MALDI-TOF/MS analysis failed to detect sulfur modifications in GIF/MT-3 (Fig. 1B), suggesting that sulfur modifications in the protein were dissociated during laser desorption/ionization. Therefore, we postulate that the small amount of sulfur detected in oxidized apo-GIF/MT-3 is derived from the effect of laser desorption/ionization rather than any actual modification of the minority component.”

      Reviewer #1 (Recommendations For The Authors):

      Overall, the topic of the study is interesting, but the provided evidence is insufficient to claim that MT3 is a sulfane sulfur-binding protein. Indeed, some recent studies showed that natural and recombinant MT proteins can be modified, but only one or a few cysteine residues were modified. Authors should follow my suggestion and apply mass spectrometry to all performed reactions and, first of all, to freshly obtained protein. I strongly suggest using chemically synthesized and reconstituted domains to test whether the home-developed approach is appropriate. Moreover, native MS and ICP-MS analysis of MT3 would support their claims.

      Thank you for your insightful comments. Following your suggestions, we have prepared chemically synthesized proteins of the α- and β-domains of GIF/MT-3 and conducted additional experiments, as explained in response comments to “Public Review #1”. Regarding the MS analysis, we have also added a discussion on the difficulty of detecting sulfur modifications in the protein.

      Reviewer #2 (Recommendations For The Authors):

      I have some minor points which should be considered by the authors.

      (1) Table 1: In the simulation by MOE, the authors speculated 7 atoms of metal bound to GIF/MT-3. Although a total of 7 atoms of Zn or Cd are actually bound to MTs as a divalent ion, the number of Cu and Hg bound to MTs as a monovalent ion is scientifically controversial. Several ideas have been proposed in the literature, however, "7 atoms of Cu or Hg" could be inappropriate as far as I know. The authors should simulate again using a more appropriate number of Cu or Hg in MTs.

      Thank you for providing this valuable information. We reviewed several papers by the Stillman group and found that the relative binding constants of Cu4-MT, Cu6-MT, and Cu10-MT were determined after the addition of Cu(I) to apo MT-1A, MT-2, and MT-3 (Melenbacher and Stillman, Metallomics, 2024). However, incorporating these copper numbers into our GIF/MT-3 simulation model proved challenging. Therefore, we decided to omit the score value for copper in Table 1.

      On the other hand, some researchers have reported that mercury binds to MT as a divalent ion, and the formation of Hg₇MT is possible (not just other forms). Therefore, we decided to continue using the score value for mercury shown in Table 1.

      (2) If possible, native MT samples isolated from an experimental animal should be evaluated for the sulfane sulfur content. Canonical MTs, MT-1 and MT-2, are highly inducible by not only heavy metals but also oxidative stress. Under the oxidative stress condition such as the exposure of hydrogen peroxide, it is questionable whether the induced Zn-MTs contain sulfane sulfur or not.

      According to your suggestion, we evaluated the sulfane sulfur content in native GIF/MT-3 samples isolated from mouse brain cytosol (Fig. 10). The measured amount was 3.3 per protein. This suggests that sulfane sulfur in GIF/MT-3 could be consumed under oxidative conditions, as you anticipated. Another possible explanation for the discrepancy between the native form and recombinant protein is likely related to metal binding in the protein. It is generally understood that both zinc and copper bind to GIF/MT-3 in approximately equal proportions in vivo. When we prepared recombinant copper-binding GIF/MT-3 protein, the sulfane sulfur content in the protein was significantly different (approximately 4.0 per protein) compared to the Zn₇GIF/MT-3 form. Further studies are needed to clarify the relationship between sulfane sulfur binding and the types of metals in the future.

      (3) The biological significance of sulfane sulfur in MTs is still unclear to me.

      Thank you for your comments. To address this question, we have added the following sentence to the discussion section: “The biological significance of sulfane sulfur in MTs lies in its ability to 1) contribute to metal binding affinity, 2) provide a sensing mechanism against oxidative stress, and 3) aid in the regeneration of the protein.”

      (4) According to the widely accepted nomenclature of MT, "MT3" should be amended to "MT-3".

      According to your suggestion, we have amended from MT3 to MT-3 throughout the manuscript.

      Reviewer #3 (Recommendations For The Authors):

      Most of my comments are editorial in nature, largely focused on what I perceive as overinterpretation or unnecessary speculation.

      The authors state in the abstract that the intersection of sulfane sulfur and Zn enzymes "has been overlooked." This is not actually true - please tone down to "under investigated" or something like this.

      Based on your suggestion, we have replaced the term “has been overlooked” with “has been under investigated” in the abstract.

      Line 228: The discussion of Fig. 6C involved too much speculation. I cannot see a quantitative experiment that supports this.

      Based on your suggestion, we have removed Fig. 6C (currently referred to as Fig. 7C). Additionally, we have revised the sentence from “implying that the sulfane sulfur is an essential zinc ligand in apo-GIF/MT3 and that an asymmetric SSH or SH ligand is insufficient for native zinc binding (Fig. 6C)” to “implying the contribution of sulfane sulfur to zinc binding in GIF/MT-3”.

      Line 247 "persulfide in apo-GIF/MT3 seems.." I think the authors mean that the Zn form of the protein is resistant to Trx or TCEP.

      Thank you for pointing this out. We realized that the term “persulfide in apo-GIF/MT3” might be confusing. Therefore, we have replaced it with “persulfide formation derived from apo-GIF/MT3” in the corresponding sentence.

      Molecular modeling: We need more details- were these structures energy-minimized in any way? Can the authors comment on the plethora of S-S dihedral angles in these structures, and whether they are consistent with expectations of covalent geometry? Please add text to explain or even a table that compiles these data.

      Thank you for your comment. Yes, energy minimization calculations for structural optimization were conducted during homology modeling in MOE. In fact, we have already stated in the Methods section that “Refinement of the model with the lowest generalized Born/volume integral (GBVI) score was achieved through energy minimization of outlier residues in Ramachandran plots generated within MOE.” In this model, covalent geometry, including the S-S dihedral angles, is also taken into consideration.

      What is a thermostability score? Perhaps a bit more discussion here and what relationship this has to an apparent (or macroscopic) metal affinity constant.

      The thermostability score is used to compare the thermal stability of the wild-type and mutant proteins. As shown in Equation (1) in the Methods section, it is calculated by subtracting the energy of the hypothetical unfolded state from the energy of the folded state. Since obtaining the structure of the unfolded state requires extensive computational effort, MOE employs an empirical formula based on two-dimensional structural features to estimate it. The $\Delta\Delta G$ values represent the difference between $\Delta G_{f}(\mathrm{WT})$ and $\Delta G_{f}(\mathrm{Mut})$. However, because it is difficult to directly determine $\Delta G_{f}(\mathrm{Mut})$ and $\Delta G_{f}(\mathrm{WT})$, MOE calculates $\Delta\Delta G$ using the thermodynamic cycle equivalence $\Delta\Delta G_{s} = \Delta G_{sf}(\mathrm{WT}\to\mathrm{Mut}) - \Delta G_{su}(\mathrm{WT}\to\mathrm{Mut})$, as expressed in Equation (1).

      On the other hand, the affinity score represents the interaction energy between the target ligand and the protein. In this study, we calculated the affinity score by selecting metal atoms as the ligands. The interaction energy ($E_{\mathrm{int}}$) is defined as:

      $E_{\mathrm{int}} = E_{\mathrm{complex}} - E_{\mathrm{receptor}} - E_{\mathrm{ligand}}$

      where each term is as follows:

      $E_{\mathrm{complex}}$: Potential energy of the complex.

      $E_{\mathrm{receptor}}$: Potential energy of the receptor alone.

      $E_{\mathrm{ligand}}$: Potential energy of the ligand alone.

      Each potential energy term includes contributions from bonded interactions such as bond lengths and bond angles. However, since the structures underlying $E_{\mathrm{receptor}}$ and $E_{\mathrm{ligand}}$ do not differ from those in the complex, the bonded energy components cancel out. Consequently, $E_{\mathrm{int}}$ is determined as:

      $E_{\mathrm{int}} = \Delta E_{\mathrm{ele}} + \Delta E_{\mathrm{vdW}} + \Delta E_{\mathrm{sol}}$

      Here, a negative $E_{\mathrm{int}}$ indicates that the complex is more stable, while a positive $E_{\mathrm{int}}$ implies that the receptor and ligand are more stable in their dissociated states.
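
      The definition of the interaction energy and its sign convention can be illustrated with a minimal sketch. The class and function names below are hypothetical, the numbers are made-up placeholders in arbitrary energy units, and nothing here reproduces MOE's actual implementation or API.

```python
from dataclasses import dataclass


@dataclass
class PotentialEnergies:
    """Potential energies (arbitrary units) of the three states being compared."""
    complex_: float   # metal-protein complex
    receptor: float   # protein alone
    ligand: float     # metal ion alone


def interaction_energy(e: PotentialEnergies) -> float:
    """Interaction energy: complex minus receptor minus ligand."""
    return e.complex_ - e.receptor - e.ligand


def favoured_state(e_int: float) -> str:
    """Apply the sign convention: negative favours the complex, positive the dissociated state."""
    if e_int < 0:
        return "complex"
    if e_int > 0:
        return "dissociated"
    return "neither"


if __name__ == "__main__":
    # Placeholder values chosen only to demonstrate the arithmetic.
    example = PotentialEnergies(complex_=-150.0, receptor=-100.0, ligand=-20.0)
    e_int = interaction_energy(example)
    print(e_int, favoured_state(e_int))
```

      With these placeholder values the interaction energy is negative, so the sketch reports the complex as the more stable state, matching the convention described above.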

      We have revised the sentence "The affinity score was also calculated using MOE software as the difference between the ΔΔGs values of the protein, free zinc, and metal–protein complex” to "The affinity score was also calculated using MOE software as the difference between the potential energy values of the protein, free zinc, and metal–protein complex” to correct the misdescription.

      Lines 278-280: The authors state that they observe a "marked enhancement of metal binding affinity, and rearrangement of zinc ions." I don't see support for this rather provocative conclusion. This is the expectation of course. I would love to see actual experimental data on this point, direct binding titrations with metals performed before and after the release of the sulfane sulfur atoms.

      Thank you for your comments. Although this statement is based on the 3D modeling simulation, we have also experimentally observed that the diminishment of sulfane sulfur in GIF/MT-3 resulted in a decrease in zinc binding levels, as shown in Fig. 7. However, conducting direct binding titration experiments was difficult for us due to the difficulty in preparing pure GIF/MT-3 protein with or without sulfane sulfur. Therefore, we have revised the sentence "marked enhancement of metal binding affinity, and rearrangement of zinc ions" to simply "enhancement of metal binding affinity" to avoid over-speculation.

      Table I- quantitatively lower stability for the Cu complex- the stoichiometry is clearly wrong in this simulation- please redo this simulation with the right stoichiometry of Cu to MT3- consult a Stillman paper.

      Thank you for providing this valuable information. We reviewed several papers by the Stillman group and found that the relative binding constants of Cu4-MT, Cu6-MT, and Cu10-MT were determined after the addition of Cu(I) to apo MT-1A, MT-2, and MT-3 (Melenbacher and Stillman, Metallomics, 2024). However, incorporating these copper numbers into our GIF/MT-3 simulation model proved challenging. Therefore, we decided to omit the score value for copper in Table 1.

      I like the model for reversible metal release mediated by the thioredoxin system (Fig. 8D)- but you can also do this with thiols- nothing really novel here. Has it been generally established that tetrasulfides are better substrates for the Trx/TR system? The data shown in Fig. 7B seems to suggest this, but is this broadly true, from the literature?

      There are reports describing that persulfides and polysulfides are reduced by the thioredoxin system. However, it is not well-established that tetrasulfides are better substrates for the Trx/TR system. To the best of our knowledge, this is the first report demonstrating that apo-MT-3 can serve as a good substrate for the Trx/TR system. Further research is required to compare the catalytic efficiency between proteins containing disulfide and those with tetrasulfide moieties.

      Line 380: Many groups have reported that many proteins are per- or polysulfidated in a whole host of cells using mass spectrometry workflows, and that terminal persulfides can be readily reduced by general or specific Trx/TR systems. This work could be better acknowledged in the context of the authors' demonstration of the reduction of the tetrasulfides, which itself would appear to be novel (and exciting!).

      We truly appreciate your positive evaluation of this work.

    1. eLife Assessment

      This fundamental article significantly advances our understanding of FGF signalling, and in particular, highlights the complex modifications affecting this pathway. The evidence for the authors' claims is convincing, combining state-of-the-art conditional gene deletion in the mouse lens with histological and molecular approaches. This work should be of great interest to molecular and developmental biologists beyond the lens community.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript uses the eye lens as a model to investigate basic mechanisms in the Fgf signaling pathway. Understanding Fgf signaling is of broad importance to biologists as it is involved in the regulation of various developmental processes in different tissues/organs and is often misregulated in disease states. The Fgf pathway has been studied in embryonic lens development, namely with regards to its involvement in controlling events such as tissue invagination, vesicle formation, epithelium proliferation and cellular differentiation, thus making the lens a good system to uncover the mechanistic basis of how the modulation of this pathway drives specific outcomes. Previous work has suggested that proteins, other than the ones currently known (e.g., the adaptor protein Frs2), are likely involved in Fgfr signaling. The present study focuses on the role of Shp2 and Shc1 proteins in the recruitment of Grb2 in the events downstream of Fgfr activation.

      Strengths:

      The findings reveal that the juxtamembrane region of the Fgf receptor is necessary for proper control of downstream events such as facilitating key changes in transcription and cytoskeleton during tissue morphogenesis. The authors conditionally deleted all four Fgfrs in the mouse lens that resulted in molecular and morphological lens defects, most importantly, preventing the upregulation of the lens induction markers Sox2 and Foxe3 and the apical localization of F-actin, thus demonstrating the importance of Fgfrs in early lens development, i.e. during lens induction. They also examined the impact of deleting Fgfr1 and 2, on the following stage, i.e. lens vesicle development, which could be rescued by expressing constitutively active KrasG12D. By using specific mutations (e.g. Fgfr1ΔFrs lacking the Frs2 binding domain and Fgfr2LR harboring mutations that prevent binding of Frs2), it is demonstrated that the Frs2 binding site on Fgfr is necessary for specific events such as morphogenesis of lens vesicle. Further, by studying Shp2 mutations and deletions, the authors present a case for Shp2 protein to function in a context-specific manner in the role of an adaptor protein and a phosphatase enzyme. Finally, the key surprising finding from this study is that downstream of Fgfr signaling, Shc1 is an important alternative pathway - in addition to Shp2 - involved in the recruitment of Grb2 and in the subsequent activation of Ras. The methodologies, namely, mouse genetics and state-of-the-art cell/molecular/biochemical assays are appropriately used to collect the data, which are soundly interpreted to reach these important conclusions. Overall, these findings reveal the flexibility of the Fgf signaling pathway and its downstream mediators in regulating cellular events. This work is expected to be of broad interest to molecular and developmental biologists.

      Weaknesses:

      A weakness that needs to be discussed is that Le-Cre depends on Pax6 activation, and hence its use in specific gene deletion will not allow evaluation of the requirement of Fgfrs in the expression of Pax6 itself. But since this is the earliest Cre available for deletion in the lens, mentioning this in the discussion would make the readers aware of this issue.

    3. Reviewer #2 (Public review):

      Summary

      I have reviewed the revised manuscript submitted by Wang et al., which is entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development". In this paper, the authors first examined lens phenotypes in mice with Le-Cre-mediated knockdown (KD) of all four FGFR (FGFR1-4), and found that pERK signals, Jag1 and foxe3 expression are absent or drastically reduced, indicating that FGF signaling is essential for lens induction. Next, the authors examined lens phenotypes of FGFR1/2-KD mice and found that lens fiber differentiation is compromised and that proliferative activity and cell survival are also compromised in lens epithelium. Interestingly, Kras activation rescues defects in lens growth and lens fiber differentiation in FGFR1/2-KD mice, indicating that Ras activation is a key step for lens development, downstream of FGF signaling. Next, the authors examined the role of Frs2, Shp2 and Grb2 in FGF signaling for lens development. They confirmed that lens fiber differentiation is compromised in FGFR1/3-KD mice combined with Frs2-dysfunctional FGFR2 mutants, which is similar to lens phenotypes of Grb2-KD mice. However, lens defects are milder in mice with Shp2YF/YF and Shp2CS mutant alleles, indicating that involvement of Shp2 is limited for the Grb2 recruitment for lens fiber differentiation. Lastly, the authors showed new evidence on the possibility that another adapter protein, Shc1, promotes Grb2 recruitment independent of Frs2/Shp2-mediated Grb2 recruitment.

      Strength

      Overall, the manuscript provides valuable data on how FGFR activation leads to Ras activation through the adapter platform of Frs2/Shp2/Grb2, which advances our understanding of complex modification of the FGF signaling pathway. The authors applied a genetic approach using mice, whose methods and results are valid to support the conclusion. The discussion also well summarizes the significance of their findings.

      Weakness

      The authors found that the new adaptor protein Shc1 is involved in Grb2 recruitment in response to FGF receptor activation. However, the main data on Shc1 are only histological sections and statistical evaluation of lens size. In the revised manuscript, the authors did not address my major concern that cellular-level data are missing; the current data are not sufficient to support their main conclusion on the involvement of Shc1 in Grb2 recruitment in FGF signaling for lens development. Since the title of this manuscript is that Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development, it is important to provide cellular-level evidence on Shc1.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development" by Wang et al., investigates the molecular mechanism used by FGFR signaling to support lens development. The lens has long been known to depend on FGFR-signaling for proper development. Previous investigations have demonstrated that FGFR signaling is required for embryonic lens cell survival and for lens fiber cell differentiation. The requirement of FGFR signaling for lens induction has remained more controversial as deletion of both Fgfr1 and Fgfr2 during lens placode formation does not prevent the induction of definitive lens markers such as FOXE3 or αA-crystallin. Here the authors have used the Le-Cre driver to delete all four FGFR genes from the developing lens placode demonstrating a definitive failure of lens induction in the absence of FGFR-signaling. The authors focused on FGFR1 and FGFR2, the two primary FGFRs present during early lens development and demonstrated that lens development could be significantly rescued in lenses lacking both FGFR1 and FGFR2 by expressing a constitutively active allele of KRAS. They also showed that the removal of pro-apoptotic genes Bax and Bak could also lead to a substantial rescue of lens development in lenses lacking both FGFR1 and FGFR2. In both cases, the lens rescue included both increased lens size and the expression of genes characteristic of lens cells.

      Significantly the authors concentrated on the juxtamembrane domain, a portion of the FGFRs associated with FRS2. Previous investigations have demonstrated the importance of FRS2 activation for mediating a sustained level of ERK activation. FRS2 is known to associate both with GRB2 and SHP2 to activate RAS. The authors utilized a mutant allele of Fgfr1, lacking the entire juxtamembrane domain (Fgfr1ΔFrs) and an allele of Fgfr2 containing two-point mutations essential for Frs2 binding (Fgfr2LR). When combining three floxed alleles and leaving only one functional allele (Fgfr1ΔFrs or Fgfr2LR) the authors got strikingly different phenotypes. When only the Fgfr1ΔFrs allele was retained, the lens phenotype matched that of deleting both Fgfr1 and Fgfr2. However, when only the Fgfr2LR allele was retained the phenotype was significantly milder, primarily affecting lens fiber cell differentiation, suggesting that something other than FRS2 might be interacting with the juxtamembrane domain to support FGFR signaling in the lens. The authors also deleted Grb2 in the lens and showed that the phenotype was similar to that of the lenses only retaining the Fgfr2LR allele, resulting in a failure of lens fiber cell differentiation and decreased lens cell survival. However, mutating the major tyrosine phosphorylation site of GRB2 did not affect lens development. The authors additionally investigated the role of SHP2 in lens development by either deleting SHP2 or by making mutations in the SHP2 catalytic domain. The deletion of the SHP2 phosphatase activity did not affect lens development as severely as total loss of SHP2 protein, suggesting a function for SHP2 outside of its catalytic activity. Although the loss of Shc1 alone has only a slight effect on lens size and pERK activation in the lens, the authors showed that the loss of Shc1 exacerbated the lens phenotype in lenses lacking both Frs2 and Shp2. The authors suggest that SHC1 binds to the FGFR juxtamembrane domain allowing for the recruitment of GRB2 independently of FRS2.

      Strengths:

      (1) The authors used a variety of genetic tools to carefully dissect the essential signals downstream of FGFR signaling during lens development.

      (2) The authors made a convincing case that something other than FRS2 binding mediates FGFR signaling in the juxtamembrane domain.

      (3) The authors demonstrated that although both the adaptor function and phosphatase activity of SHP2 are required for embryonic survival, neither of these activities is absolutely required for lens development.

      (4) The authors provide more information as to why FGFR loss has a phenotype much more severe than the loss of FRS2 alone during lens development.

      (5) The authors followed up their work analyzing various signaling molecules in the context of lens development with biochemical analyses of FGF-induced phosphorylation in murine embryonic fibroblasts (MEFs).

      (6) In general, this manuscript represents a Herculean effort to dissect FGFR signaling in vivo, backed by biochemical cell culture experiments in vitro.

      Weaknesses:

      (1) The authors demonstrate that the loss of FGFR1 and FGFR2 can be compensated by a constitutive active KRAS allele in the lens and suggest that FGFRs largely support lens development only by driving ERK activation. However, the authors also saw that lens development was substantially rescued by preventing apoptosis through the deletion of BAK and BAX. To my knowledge, the deletion of BAK and BAX should not independently activate ERK. The authors do not show whether ERK activation is restored in the BAK/BAX deficient lenses. Do the authors suggest the FGFR3 and/or FGFR4 provide sufficient RAS and ERK activation for lens development when apoptosis is suppressed? Alternatively, is it the survival function of FGFR-signaling as much as a direct effect on lens differentiation?

      (2) Do the authors suggest that GRB2 is required for RAS activation and ultimately ERK activation? If so, do the authors suggest that ERK activation is not required for FGFR-signaling to mediate lens induction? This would follow considering that the GRB2 deficient lenses lack a problem with lens induction.

      (3) The increase in p-Shc is only slightly higher in the Cre FGFR1f/f FGFR2r/LR than in the FGFR1f/Δfrs FGFR2f/f. Can the authors provide quantification?

      (4) The authors have not shown directly that Shc1 binds to the juxtamembrane region of either Fgfr1 or Fgfr2.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript uses the eye lens as a model to investigate basic mechanisms in the Fgf signaling pathway. Understanding Fgf signaling is of broad importance to biologists as it is involved in the regulation of various developmental processes in different tissues/organs and is often misregulated in disease states. The Fgf pathway has been studied in embryonic lens development, namely with regards to its involvement in controlling events such as tissue invagination, vesicle formation, epithelium proliferation, and cellular differentiation, thus making the lens a good system to uncover the mechanistic basis of how the modulation of this pathway drives specific outcomes. Previous work has suggested that proteins, other than the ones currently known (e.g., the adaptor protein Frs2), are likely involved in Fgfr signaling. The present study focuses on the role of Shp2 and Shc1 proteins in the recruitment of Grb2 in the events downstream of Fgfr activation.

      Strengths:

      The findings reveal that the juxtamembrane region of the Fgf receptor is necessary for proper control of downstream events such as facilitating key changes in transcription and cytoskeleton during tissue morphogenesis. The authors conditionally deleted all four Fgfrs in the mouse lens that resulted in molecular and morphological lens defects, most importantly, preventing the upregulation of the lens induction markers Sox2 and Foxe3 and the apical localization of F-actin, thus demonstrating the importance of Fgfrs in early lens development, i.e. during lens induction. They also examined the impact of deleting Fgfr1 and 2, on the following stage, i.e. lens vesicle development, which could be rescued by expressing constitutively active KrasG12D. By using specific mutations (e.g. Fgfr1ΔFrs lacking the Frs2 binding domain and Fgfr2LR harboring mutations that prevent binding of Frs2), it is demonstrated that the Frs2 binding site on Fgfr is necessary for specific events such as morphogenesis of lens vesicle. Further, by studying Shp2 mutations and deletions, the authors present a case for Shp2 protein to function in a context-specific manner in the role of an adaptor protein and a phosphatase enzyme. Finally, the key surprising finding from this study is that downstream of Fgfr signaling, Shc1 is an important alternative pathway - in addition to Shp2 - involved in the recruitment of Grb2 and in the subsequent activation of Ras. The methodologies, namely, mouse genetics and state-of-the-art cell/molecular/biochemical assays are appropriately used to collect the data, which are soundly interpreted to reach these important conclusions. Overall, these findings reveal the flexibility of the Fgf signaling pathway and its downstream mediators in regulating cellular events. This work is expected to be of broad interest to molecular and developmental biologists.

      Weaknesses:

      A weakness that needs to be discussed is that Le-Cre depends on Pax6 activation, and hence its use in specific gene deletion will not allow evaluation of the requirement of Fgfrs in the expression of Pax6 itself. But since this is the earliest Cre available for deletion in the lens, mentioning this in the discussion would make the readers aware of this issue. Referring to Jag1 among "lens-specific markers" (page 5) is debatable; I suggest changing it along the lines of "the expected upregulation of Jag1 in lens vesicle". The Abstract could be modified to clearly convey the existing knowledge gap and the key findings of the present study. As it stands now, it is a bit all over the place. Some typos in the manuscript need to be fixed, e.g. "...yet its molecular mechanism remains largely resolved" - unresolved? "...in the development lens" - in the developing lens? In Figure 4 legend, "(B) Grb2 mutants Grb2 mutants displayed...", etc.

      We thank the reviewer for the thoughtful and constructive feedback. We have added the caveat regarding the Le-Cre dependency on Pax6 expression to the discussion, removed the reference to Jag1 as a “lens-specific marker” and corrected the typographical errors noted by the reviewer.

      Reviewer #2 (Public review):

      Summary:

      I have reviewed a manuscript submitted by Wang et al., which is entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development". In this paper, the authors first examined lens phenotypes in mice with Le-Cre-mediated knockdown (KD) of all four FGFR (FGFR1-4), and found that pERK signals, Jag1, and foxe3 expression are absent or drastically reduced, indicating that FGF signaling is essential for lens induction. Next, the authors examined lens phenotypes of FGFR1/2-KD mice and found that lens fiber differentiation is compromised and that proliferative activity and cell survival are also compromised in lens epithelium. Interestingly, Kras activation rescues defects in lens growth and lens fiber differentiation in FGFR1/2-KD mice, indicating that Ras activation is a key step for lens development. Next, the authors examined the role of Frs2, Shp2, and Grb2 in FGF signaling for lens development. They confirmed that lens fiber differentiation is compromised in FGFR1/3-KD mice combined with Frs2-dysfunctional FGFR2 mutants, which is similar to lens phenotypes of Grb2-KD mice. However, lens defects are milder in mice with Shp2YF/YF and Shp2CS mutant alleles, indicating that the involvement of Shp2 is limited for the Grb2 recruitment for lens fiber differentiation. Lastly, the authors showed new evidence on the possibility that another adapter protein, Shc1, promotes Grb2 recruitment independent of Frs2/Shp2-mediated Grb2 recruitment.

      Strengths:

      Overall, the manuscript provides valuable data on how FGFR activation leads to Ras activation through the adapter platform of Frs2/Shp2/Grb2, which advances our understanding of complex modification of the FGF signaling pathway. The authors applied a genetic approach using mice, whose methods and results are valid to support the conclusion. The discussion also well summarizes the significance of their findings.

      Weaknesses:

      The authors eventually found that the new adaptor protein Shc1 is involved in Grb2 recruitment in response to FGF receptor activation. However, the main data for Shc1 are histological sections and statistical evaluation of lens size. So, my major concern is that the authors need to provide more detailed data to support the involvement of Shc1 in Grb2 recruitment in FGF signaling for lens development.

      We thank the reviewer for the positive comments and valuable suggestions. We have addressed the concerns in detail in the response to the recommendation outlined below.

      Reviewer #3 (Public review):

      Summary:

      The manuscript entitled "Shc1 cooperates with Frs2 and Shp2 to recruit Grb2 in FGF-induced lens development" by Wang et al., investigates the molecular mechanism used by FGFR signaling to support lens development. The lens has long been known to depend on FGFR signaling for proper development. Previous investigations have demonstrated that FGFR signaling is required for embryonic lens cell survival and for lens fiber cell differentiation. The requirement of FGFR signaling for lens induction has remained more controversial as deletion of both Fgfr1 and Fgfr2 during lens placode formation does not prevent the induction of definitive lens markers such as FOXE3 or αA-crystallin. Here the authors have used the Le-Cre driver to delete all four FGFR genes from the developing lens placode demonstrating a definitive failure of lens induction in the absence of FGFR signaling. The authors focused on FGFR1 and FGFR2, the two primary FGFRs present during early lens development, and demonstrated that lens development could be significantly rescued in lenses lacking both FGFR1 and FGFR2 by expressing a constitutively active allele of KRAS. They also showed that the removal of pro-apoptotic genes Bax and Bak could also lead to a substantial rescue of lens development in lenses lacking both FGFR1 and FGFR2. In both cases, the lens rescue included both increased lens size and the expression of genes characteristic of lens cells.

      Significantly the authors concentrated on the juxtamembrane domain, a portion of the FGFRs associated with FRS2. Previous investigations have demonstrated the importance of FRS2 activation for mediating a sustained level of ERK activation. FRS2 is known to associate both with GRB2 and SHP2 to activate RAS. The authors utilized a mutant allele of Fgfr1, lacking the entire juxtamembrane domain (Fgfr1ΔFrs), and an allele of Fgfr2 containing two-point mutations essential for Frs2 binding (Fgfr2LR). When combining three floxed alleles and leaving only one functional allele (Fgfr1ΔFrs or Fgfr2LR) the authors got strikingly different phenotypes. When only the Fgfr1ΔFrs allele was retained, the lens phenotype matched that of deleting both Fgfr1 and Fgfr2. However, when only the Fgfr2LR allele was retained the phenotype was significantly milder, primarily affecting lens fiber cell differentiation, suggesting that something other than FRS2 might be interacting with the juxtamembrane domain to support FGFR signaling in the lens. The authors also deleted Grb2 in the lens and showed that the phenotype was similar to that of the lenses only retaining the Fgfr2LR allele, resulting in a failure of lens fiber cell differentiation and decreased lens cell survival. However, mutating the major tyrosine phosphorylation site of GRB2 did not affect lens development. The authors additionally investigated the role of SHP2 in lens development by either deleting SHP2 or by making mutations in the SHP2 catalytic domain. The deletion of the SHP2 phosphatase activity did not affect lens development as severely as the total loss of SHP2 protein, suggesting a function for SHP2 outside of its catalytic activity. Although the loss of Shc1 alone has only a slight effect on lens size and pERK activation in the lens, the authors showed that the loss of Shc1 exacerbated the lens phenotype in lenses lacking both Frs2 and Shp2. The authors suggest that SHC1 binds to the FGFR juxtamembrane domain allowing for the recruitment of GRB2 independently of FRS2.

      Strengths:

      (1) The authors used a variety of genetic tools to carefully dissect the essential signals downstream of FGFR signaling during lens development.

      (2) The authors made a convincing case that something other than FRS2 binding mediates FGFR signaling in the juxtamembrane domain.

      (3) The authors demonstrated that although both the adaptor function and phosphatase activity of SHP2 are required for embryonic survival, neither of these activities is absolutely required for lens development.

      (4) The authors provide more information as to why FGFR loss has a phenotype much more severe than the loss of FRS2 alone during lens development.

      (5) The authors followed up their work analyzing various signaling molecules in the context of lens development with biochemical analyses of FGF-induced phosphorylation in murine embryonic fibroblasts (MEFs).

      (6) In general, this manuscript represents a Herculean effort to dissect FGFR signaling in vivo, backed by biochemical cell culture experiments in vitro.

      We thank the reviewer for the thorough review of our paper and positive comments.

      Weaknesses:

      (1) The authors demonstrate that the loss of FGFR1 and FGFR2 can be compensated by a constitutive active KRAS allele in the lens and suggest that FGFRs largely support lens development only by driving ERK activation. However, the authors also saw that lens development was substantially rescued by preventing apoptosis through the deletion of BAK and BAX. To my knowledge, the deletion of BAK and BAX should not independently activate ERK. The authors do not show whether ERK activation is restored in the BAK/BAX deficient lenses. Do the authors suggest the FGFR3 and/or FGFR4 provide sufficient RAS and ERK activation for lens development when apoptosis is suppressed? Alternatively, is it the survival function of FGFR-signaling as much as a direct effect on lens differentiation?

      Our interpretation is that at the lens induction stage, where FGFR1 and FGFR2 are crucial, their primary function operates through Ras signaling to promote cell survival. Thus, either constitutively active KRAS or the direct suppression of apoptosis by deleting Bak and Bax is sufficient to rescue lens induction. This rescue enables the subsequent differentiation of lens progenitor cells, a process that FGFR3 and FGFR4 are sufficient to support.

      (2) The authors make the argument that deleting all four FGFRs prevented lens induction but that the deletion of only FGFR1 and FGFR2 did not. Part of this argument is the retention of FOXE3 expression, αA-crystallin expression, and PROX1 expression in the FGFR1/2 double mutants. However, in Figure 1E, and Figure 1F, the staining of the double mutant lens tissue with FOXE3, αA-crystallin, and PROX1 is unconvincing. However, the retention of FOXE3 expression in the FGFR1/FGFR2 double mutants was previously demonstrated in Garcia et al 2011. Also, there needs to be an enlargement or inset to demonstrate the retention of pSMAD in the quadruple FGFR mutants in Figure 1D.

      We have updated Figure 1E with a clearer image of FOXE3 staining to better illustrate FOXE3 expression in the FGFR1/2 double mutants. It seems there may have been a misunderstanding regarding our claims about αA-crystallin and PROX1. To clarify, our observation is that both αA-crystallin and PROX1 are lost in the FGFR1/2 double mutants, which we believe is clearly demonstrated in Figure 1F. Additionally, we have added insets to Figure 1D to highlight the retention of pSMAD.

      (3) Do the authors suggest that GRB2 is required for RAS activation and ultimately ERK activation? If so, do the authors suggest that ERK activation is not required for FGFR-signaling to mediate lens induction? This would follow considering that the GRB2 deficient lenses lack a problem with lens induction.

      We do believe that GRB2 is required for RAS-ERK signaling activation; however, ERK activation is not absolutely required for lens induction. This conclusion is consistent with our previous study, which showed that deletion of ERK1/2 did not prevent lens induction (Garg et al. eLife 2020;9:e51915), as well as with our current findings demonstrating that the GRB2-deficient mutant is still capable of supporting lens induction.

      (4) The increase in p-Shc is only slightly higher in the Cre FGFR1f/f FGFR2r/LR than in the FGFR1f/Δfrs FGFR2f/f. Can the authors provide quantification?

      pShc quantification is now provided in Fig. 7B.

      (5) The authors have not shown directly that Shc1 binds to the juxtamembrane region of either Fgfr1 or Fgfr2.

      It is not yet clear whether Shc1 directly binds to the juxtamembrane region of FGFR1 or FGFR2, as it may also be recruited indirectly. We acknowledge this as an important question that warrants further investigation in future studies.

      (6) The authors have used the Le-Cre strain for all of their lens deletion experiments. Previous work has documented that the Le-Cre transgene can cause lens defects independent of any floxed alleles in both homozygous and hemizygous states on some genetic backgrounds (Dora et al., 2014 PLoS One 9:e109193 and Lam et al., Human Genomics 2019 13(1):10). Are the controls used in these experiments Le-Cre hemizygotes?

      As stated in the Methods section, Le-Cre only or Le-Cre and heterozygous flox mice were used as controls.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Weaknesses

      There are only a few minor weaknesses that need to be addressed.

      (1) The point could be made in the Discussion that since Le-Cre depends on Pax6 placodal expression, it is challenging to evaluate the impact of deletion of the four Fgfrs on the expression of Pax6 (since Pax6 needs to be activated prior to achieving Fgfr deletion). A different Cre line (e.g. a Cre which is expressed in the surface ectoderm prior to lens placode formation) could help partially address this question, although it may not be able to comment on the requirement of the Fgfrs specifically in the lens ectoderm. Thus, it will be prudent to mention this in the discussion.

      We have added the caveat regarding the Le-Cre dependency on Pax6 expression to the discussion.

      (2) Referring to Jag1 among "lens-specific markers" (page 5) is debatable, I suggest changing it along the lines of "the expected upregulation of Jag1 in lens vesicle".

      The wording has been changed as suggested.  

      (3) The Abstract could be modified to clearly convey the existing knowledge gap and the key findings of the present study. As it stands now, it is a bit all over the place.

      The abstract has been revised.  

      (4) Some typos in the manuscript need to be fixed.

      e.g. "...yet its molecular mechanism remains largely resolved" - unresolved?, "...in the development lens" - in the developing lens?, In Fig. 4 legend, "(B) Grb2 mutants Grb2 mutants displayed...", etc.

      These typos have been corrected.

      Reviewer #2 (Recommendations for the authors):

      My specific suggestions are shown below.

      (1) The authors need to describe the role of Shc1 in FGF signaling and vertebrate lens development, by citing previous publications in the introduction.

      We have detailed previous studies on the role of Shc in FGF signaling in the Introduction and discussed its function in the vertebrate lens in the Discussion section.

      (2) Figure 1B bottom panels: Inset images seem to be missing, although frames and arrowheads are there. Please check them.

      The inset images were correctly placed.

      (3) Results (page 5, line 13): The authors mentioned "Sox2 expression remained at basal levels". Since Figure 1B indicates that Sox2 expression fails to be upregulated in FGFR1/2 mutant lens placode in contrast to Pax6, it is better to clearly mention the failure in upregulation of Sox2 expression in the FGFR1/2 mutants.

      This sentence has been rewritten as suggested.  

      (4) Results (page 6, line 8): The authors mentioned "we observed .... expression of Foxe3 in ...mutant lens cells (Figure 1E, arrows). However, Foxe3-expressing lens cells are a very small population in Figure 1E. It is important to state the decreased number of Foxe3-expressing lens cells in FGFR1/2 mutants. In addition, I would like to request the authors to show histograms indicating sample size and statistical analysis for marker expression: Foxe3 (Figure 1E), Prox1 and aA-crystallin (Fig. 1F), cyclin D1 and TUNEL (Fig. 1G) and pmTOR and pS6 (Supplementary figure 1B).

      We added a statement indicating that the number of Foxe3-expressing cells is reduced in FGFR1/2 mutants, which is now quantified in Fig. 1H. Quantifications for Cyclin D1 and TUNEL are now shown in Fig. 1I and J, respectively. However, we chose not to quantify Prox1, αA-crystallin, pmTOR, and pS6, as the FGFR1/2 mutants showed no staining for these markers.

      (5) Results (page 6, line 19- page 7, line 6): The authors showed that inducible expression of constitutive active Kras, KrasG12D, using Le-Cre, recovered lens size to the half level of wild-type control. However, in the lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D, pERK was detected in the most posterior edge of the lens fiber core, whereas pERK was detected in the broader area of the lens in control. Furthermore, pMEK was detected in the whole lens of mice with Le-Cre; FGFR1/2f/f; and LSL-KrasG12D, whereas pMEK was detected only in the lens epithelial cells at the equator. So, the spatial profile of pERK and pMEK expression was different from those of wild-type, although the authors observed that Prox1 and Crystallin expression are normally induced in the lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D. I wonder whether the lens normally develops in mice with Le-Cre; LSL-KrasG12D? Is the lens growth enhanced in mice with Le-Cre; LSL-KrasG12D? Please add the panels of mice with Le-Cre; LSL-KrasG12D in Figure 2B and 2C. In addition, I wonder whether apoptosis is suppressed in the lens of mice with Le-Cre; FGFR1/2f/f; LSL-KrasG12D?

      As we previously reported (Developmental Biology 355, 2011, 12–20), Le-Cre; LSL-KrasG12D did not lead to enhanced lens growth. While we agree that including images of Le-Cre; LSL-KrasG12D as controls in Fig. 2B and C and evaluating apoptosis in Le-Cre; FGFR1/2f/f; LSL-KrasG12D mutants would be appropriate, we regretfully no longer have these animals available to conduct these experiments.

      (6) Results (page 11, line 15): the PCR genotyping image of Fig. 6C seems to be missing.

      The PCR genotyping image was correctly placed below Fig. 6B. 

      (7) Results (page 11, lines 15-20): there is no citation of Figure 6D in the results section.

      The citation for Fig. 6D is added in the results section.

      (8) Figures 5H, 6H, and 7A: Western blotting of some of the pERK, ERK lanes is missing.

      These western blots all have pERK/ERK overlay images.

      (9) Figure 7A, western blotting data on pShc levels are important to suggest the involvement of Shc1 in Frs2-independent Grb2 activation by FGF stimulation. Please provide the histogram for statistical analysis.

      pShc quantification is now provided in Fig. 7B.

      (10) There is no citation of Figure 7D, E, and F in the results section. Please add them.

      These citations have been added.

      (11) Figures 7E, and 7F: The authors showed that lens morphology and lens size evaluation in genetic combinations: control, Frs2/Shc1 KD, Frs2/Shp2 KD, and Frs2/Shp2/Shc1 KD. However, I would like to request the authors to show more detailed data in these genetic combinations, for example, pERK, foxe3, Maf, Prox1, Jag1, p57, cyclin D3, g-crystallin, and TUNEL.

      Unfortunately, we no longer have these mutant mice to perform this detailed staining.

      Reviewer #3 (Recommendations for the authors):

      (1) The figure legend for Figure 2 lists (G) twice. The second (G) should be (H). Also, in Figures 2G and H there is no indication as to what stage lenses were used for the TUNEL and size analyses. I assume that it was E13.5, but it should be explicitly stated.

      The figure labeling has been corrected and the stage added to the figure legend.

      (2) In Figure 4 A the label should be gamma-crystallin rather than r-crystallin.

      The figure labeling has been corrected.

      (3) In Figure 6 D, I believe that the immunolabeling for Maf and Foxe3 are reversed. The Maf should be red as it is in the fibers and the Foxe3 should be green as it is epithelial.

      The figure labeling has been corrected.

      (4) In Figure 6C I believe that the labels for the WT and YF alleles on the western blot are reversed.

      The YF PCR band was designed to be larger than WT, so the labeling was correct as is.

      (5) In Figure 6F I believe that the labels for WT and CS on the western blot are reversed.

      The figure labeling has been corrected.

      (6) In Supplemental figure 2 there are no genotype labels for the TUNEL bar graph.

      The figure labeling has been added.

    1. eLife Assessment

      In this valuable report, the authors investigated the effect of mitochondrial transplantation on post-cardiac arrest myocardial dysfunction (PAMD), which is associated with mitochondrial dysfunction. They convincingly demonstrated that mitochondrial transplantation enhanced cardiac function and increased survival rates after the return of spontaneous circulation (ROSC). They have also shown that myocardial tissues with transplanted mitochondria exhibited increased mitochondrial complex activity, higher ATP levels, reduced cardiomyocyte apoptosis, and lower myocardial oxidative stress post-ROSC.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors investigate the effect of mitochondrial transplantation on post-cardiac arrest myocardial dysfunction (PAMD), which is associated with mitochondrial dysfunction. The authors demonstrate that mitochondrial transplantation enhances cardiac function and increases survival rates after the return of spontaneous circulation (ROSC). Mechanistically, they found that myocardial tissues with transplanted mitochondria exhibit increased mitochondrial complex activity, higher ATP levels, reduced cardiomyocyte apoptosis, and lower myocardial oxidative stress post-ROSC.

      Strengths:

      Previous studies have reported that mitochondrial transplantation can improve myocardial recovery after regional ischemia, but its potential for treating myocardial injury following cardiac arrest has not yet been tested. Therefore, the findings are somewhat novel. Remarkably, the increased survival in the mitochondria-treated group post-ROSC is very promising and highlights its translational potential.

      Comments on revisions:

      My concerns are adequately addressed.

    3. Reviewer #3 (Public review):

      In this manuscript titled "Transplantation of exogenous mitochondria mitigates myocardial dysfunction after cardiac arrest", Zhen Wang et al. report that exogenous mitochondrial transplantation can enhance myocardial function and survival rates. It limits mitochondrial morphology impairment, boosts complexes II and IV activity, and increases ATP levels. Additionally, mitochondrial therapy reduces oxidative stress, lessens myocardial injury, and improves PAMD after cardiopulmonary resuscitation. The results of this manuscript clearly demonstrate that mitochondrial transplantation can effectively improve PAMD after cardiopulmonary resuscitation, highlighting its significant scientific and clinical value. The findings shown in this manuscript are interesting to the readers. However, further experiments are needed to confirm this conclusion. In addition, the results should be rewritten to describe and discuss the relevant data in detail.

      Major comments from the original round of review:

      (1) Can isolated mitochondria be transported to cultured cardiomyocytes, such as H9C2 cells, in vitro?

      (2) The description of results in the manuscript is too simple. It lacks detail on the rationale behind the experiments and the significance of the data.

      (3) The authors demonstrate that mitochondrial transplantation reduces cardiomyocyte apoptosis. Therefore, Western blot analysis of apoptosis-related caspases could be provided for further confirmation.

      (4) Do donor mitochondria fuse with recipient mitochondria? Relevant experiments and data should be provided to address this question.

      (5) In Figure 5A, the histograms are not labeled with the specific experimental groups.

      Comments on revisions:

      The revised manuscript quality has been improved, and most of my concerns were addressed and resolved.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 3 (Public review):

      Major comments:

      (1) Can isolated mitochondria be transported to cultured cardiomyocytes, such as H9C2 cells, in vitro?

      Thank you for this insightful question. Mitochondria are highly dynamic organelles that play a crucial role in cellular energy metabolism. When cells encounter various stressors and increased energy demands, they can benefit from the incorporation of exogenous mitochondria. In 2013, Masuzawa et al. (Masuzawa, et al.,2013) were the first to demonstrate that transplanted mitochondria are internalized by cardiomyocytes 2 to 8 hours after transplantation, significantly contributing to the preservation of myocardial energetics. Ali et al. (Ali, et al.,2020) discovered that exogenous mitochondria could be internalized by H9C2 cardiomyocytes as quickly as 5 minutes after co-incubation, resulting in an acute enhancement of normal cellular bioenergetics following mitochondrial transplantation. Pacak et al. (Pacak, et al.,2015) established that the internalization of mitochondria into cardiomyocytes is time-dependent and occurs through actin-dependent endocytosis.

      Collectively, this evidence illustrates that exogenous mitochondria can be effectively internalized by H9C2 cells and other cardiomyocytes. Our experiments further confirmed that transplanted mitochondria can be incorporated by the myocardium in vivo.

      (2) The description of results in the manuscript is too simple. It lacks detail on the rationale behind the experiments and the significance of the data.

      Thank you for this suggestion. We have realized that the results in the submitted manuscript have not been adequately interpreted. We have added necessary details on the rationale behind the experiments and the significance of the data to the results section (Lines 57~59, 69~73, 81~88, 91~98, 100~102, 103~104, 109~115, 124~129, 135~146, 149~157, 159~161, 168~169, 178~179). We would like to express our gratitude to the reviewers once again and hope that our modifications will meet their requirements.

      (3) The authors demonstrate that mitochondrial transplantation reduces cardiomyocyte apoptosis. Therefore, Western blot analysis of apoptosis-related caspases could be provided for further confirmation.

      Thank you for this constructive comment. We fully agree with the reviewer's perspective on the detection of apoptosis-related caspases and have conducted a Western blot assay to investigate the impact of mitochondria on myocardial tissue. Our new evidence indicates that rats receiving mitochondrial transplantation exhibited reduced expression of cleaved caspase-3 compared with those in the NS and Vehicle groups (Fig. 6G, 6H, Lines 168~169), suggesting that mitochondrial transplantation decreased the level of apoptosis in the myocardium.

      (4) Do donor mitochondria fuse with recipient mitochondria? Relevant experiments and data should be provided to address this question.

      This is a very helpful comment. Investigating the fate of transplanted mitochondria in myocardial cells after CA is of great significance. The internalization of exogenous mitochondria has been observed across various cell types (Liu, et al.,2021; Shanmughapriya, et al.,2020). Notably, a recent study indicated that after being incorporated into host cells, isolated mitochondria are transported to endosomes and lysosomes. Subsequently, most of these mitochondria escape from these compartments and fuse with the endogenous mitochondrial network (Cowan, et al.,2017). We have discussed this in the manuscript. (Lines 217~220)

      Oxidative stress, a pathophysiological phenomenon common to cells suffering from ischemia/reperfusion insults after CA/CPR, has been implicated in promoting the internalization and survival of exogenous mitochondria (Aharoni-Simon, et al.,2022). In our study, we confirmed that mitochondrial transplantation can enhance the metabolism of cardiomyocytes, increase ATP levels, and reduce reactive oxygen species (ROS). Our results indirectly confirm that isolated mitochondria can successfully fuse with myocardial mitochondria.

      (5) In Figure 5A, the histograms are not labeled with the specific experimental groups.

      We apologize for this oversight. We have labeled the specific experimental groups in the histograms presented in Figure 6B and 6C (originally Figure 5A).

      Reviewer #1 (Recommendations For The Authors):

      (1) The age, gender, and strain of the donor rats should be specified in the Methods section. Additionally, it is not obvious what doses of mitochondria were injected into the rats and how the dosage was initially determined.

      Thanks for your suggestion. We have included relevant information about the donor rats in the Methods section (Lines 361~362).

      In the Mito group, each animal received 0.5 mL of a 1 × 10<sup>9</sup>/mL mitochondrial suspension (Lines 342~345). A considerable body of data has demonstrated the efficacy of mitochondrial transplantation in cellular, animal, and human research (Alemany, et al.,2024; Kaza, et al.,2017; Liu, et al.,2023). However, there is currently no evidence to determine the optimal dosage for transplantation. In previous research, isolated mitochondria (1 × 10<sup>9</sup>) delivered to the left coronary ostium in pigs proved to be a viable treatment modality in cardiac ischemia-reperfusion injury (Blitzer, et al.,2020; Guariento, et al.,2020). Additionally, a dose of 1 × 10<sup>9</sup> mitochondria achieved the maximal hyperemic effect when administered via intracoronary injection (Shin, et al.,2019). Considering that Sprague-Dawley (SD) rats are smaller than pigs and that there is a loss of mitochondria during pulmonary circulation, we adopted a mitochondrial transplantation dose of 5 × 10<sup>8</sup>. We will explore the optimal dosage in our future research.
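
      As a quick consistency check on the figures quoted above, the total number of mitochondria delivered per animal is the injected volume multiplied by the suspension concentration. The sketch below is illustrative only: the function name `delivered_dose` is hypothetical and does not come from the manuscript, and it assumes the whole injected volume counts toward the dose.

```python
def delivered_dose(volume_ml: float, concentration_per_ml: float) -> float:
    """Total particles delivered = injected volume (mL) x concentration (particles/mL)."""
    return volume_ml * concentration_per_ml


# Values quoted in the response: 0.5 mL of a 1 x 10^9 /mL suspension.
mito_dose = delivered_dose(0.5, 1e9)
print(f"{mito_dose:.1e}")  # 5.0e+08, matching the stated dose of 5 x 10^8
```

      The product agrees with the 5 × 10<sup>8</sup> dose stated in the response, which is half of the 1 × 10<sup>9</sup> dose cited from the porcine studies.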

      (2) In Figure 4a, the number of transplanted mitochondria appears to be very low. Considering the high number of mitochondria present in cardiomyocytes, it is unclear whether this small amount of transplanted mitochondria can significantly impact complex II activity and ATP levels in myocardial tissues, as shown in Figures 4b-d, or improve survival post-ROSC, as shown in Figure 2d. Could the observed benefits of mitochondrial transplantation be due to the indirect effects of the injected mitochondria, such as the release of mitochondrial contents, rather than the mitochondria themselves, as discussed by Bertero et al. (2021, Circ. Research)? This issue should be addressed in the manuscript.

      Thanks for this wonderful comment. As presented in Fig. 4 (originally Figure 4A), our results indicated the internalization of mitochondria by the myocardium, shown by colocalization of Mito-tracker and a myocardium marker. We would like to make the following points regarding Fig. 4:

      (1) Significant left ventricular systolic and diastolic dysfunction that occurs in the myocardium shortly after ROSC is referred to as post-cardiac arrest myocardial dysfunction (PAMD) (Laurent, et al.,2002). The efficacy of mitochondrial transplantation for the heart following ischemia-reperfusion injury has been demonstrated in cellular, animal, and human studies, despite inadequate mitochondrial internalization (Liu, et al.,2023). Thus, even a low number of transplanted mitochondria may improve cardiac function.

      (2) Only biologically active mitochondria can be specifically labeled with Mito-tracker. Therefore, the mitochondria taken up by cardiomyocytes possess complete functionality. Previous results have demonstrated that mitochondrial contents, such as nonviable mitochondria, mitochondrial fractions, mitochondrial deoxyribonucleic acid, ribonucleic acid, exogenous adenosine diphosphate and ATP, do not provide protection to the ischemic heart (McCully, et al.,2017; McCully, et al.,2009).

      (3) The specific mechanism for mitochondrial internalization has yet to be fully elucidated. We fully agree with the reviewer's opinion that other mechanisms of mitochondrial transplantation may play a role in cardiac protection. Multiple mechanisms may be involved in the cardioprotective effect of mitochondrial transplantation, and we are actively seeking a reasonable approach to verify these hypotheses in an ongoing study (Lines 236~246).

      (3) In Figure 4g, the claims regarding sarcomere length, mitochondrial structure, the number of cristae, accumulated calcium etc. seem to rely on the visual interpretation of representative images. To ensure a reliable interpretation of the data, a blinded quantification of each image in each group should be conducted. The same applies to the claims made in Figure 5E.

      Thanks for this suggestion. We have quantitatively evaluated the electron microscope images and HE images of the myocardium to ensure reliable interpretation. Corresponding supplements have been added to the methods (Lines 433~441, 494~496), results sections (Lines 109~115, 178~179), and Figures 5C, 5D, 6K and 6H (originally Figures 4G and 5E).

      (4) In line 69, it is unclear why the authors claim that MAP and HR decrease at 1, 2, 3, and 4 hours after ROSC in all groups compared to the Sham group, despite stating in line 72 that "MAP and HR did not differ at any observational time points (P>0.05, Figure 2C)."

      We apologize for our inaccurate phrasing. In the present study, there was no statistically significant difference in MAP or HR at any observational time point (P>0.05, Figure 2C). In the NS, Vehicle and Mito groups, MAP and HR decreased at 1, 2, 3, and 4 hours after ROSC, reaching their nadir at 1 hour. Subsequently, MAP and HR increased gradually but did not show any statistically significant differences compared with the Sham group (Lines 69~73).

      (5) The absence of increased mitochondrial content in the mito-groups should be discussed further in the manuscript.

      Thank you for your suggestion. We discussed the reasons why the mass of isolated mitochondria did not increase in Lines 224~235.

      (6) The N in Figure 5d should be provided.

      Thanks for your suggestion. We have revised the figure legend to include the N for Figure 6F (originally Figure 5D).

      (7) Figure 6 demonstrates content beyond the findings in this manuscript. This reviewer recommends limiting the graphical abstract to the findings specifically in this paper.

      Thanks for your great advice. We have revised Figure 7 (originally Figure 6) and restricted the graphical abstract to the findings presented in this paper.

      Minor issues:

      (8) The order of data in Figure 4 should be consistent with the text in the manuscript. Figures 4E-F-G are described before Figures 4B-C-D in the text. Similarly, Figure 5F was described before Figure 5E in the text.

      Thanks for your great advice. We have rearranged the order of the pictures to align with the text. Thank you for your proposal.

      (9) In Figure 4A, the locations of the epicardium, muscle, and endocardium should be indicated for clarity. Also, it is not obvious where the close-up box refers to in the actual image.

      Thank you for your suggestion. We primarily sought evidence of mitochondrial internalization within the endocardium, as injury occurs there first during myocardial ischemia (Kuwada and Takenaka,2000). The close-up box in Fig. 4 refers to the endocardium.

      (10) In Figure 5A, the group annotations are missing from the MDA and SOD graphs. The standard deviation bars for the SOD vehicle and SOD mito groups (3rd and 4th columns) appear to overlap. Can the authors provide the actual p-values?

      We apologize for the omission of group annotations in the MDA and SOD graphs. The p-value between the Vehicle group and the Mito group was 0.004. The SOD activity levels of myocardial samples in the groups are presented in Table 1.

      Author response table 1.

      The SOD activity levels of myocardial samples in groups (U/mgprot)

      (11) In line 58, NS abbreviation is used without defining what NS is.

      We apologize for not including the full name of NS. NS is the abbreviation of normal saline. It has now been marked in the manuscript. (Line 58)

      (12) In line 118, what MDA stands for is not described until line 348. MDA should be defined in the text for the general audience.

      We apologize for this. We have defined it in the manuscript. (Lines 156~157)

      (13) In line 192, the authors state that "mitochondrial transplantation... increased the expression of antioxidant enzymes after four hours of ROSC," while only SOD activity levels were assessed in the manuscript. Increased activity levels do not necessarily imply an increase in expression levels. This discrepancy should be addressed in the Discussion section.

      We apologize for confusing ‘activity’ with ‘expression’. Although mitochondrial transplantation has been shown to be involved in the restoration of manganese superoxide dismutase levels after ischemic insults (Tashiro, et al.,2022), the changes in antioxidant enzyme expression were not evaluated at the protein level in this paper. To avoid misunderstandings, we have replaced the term ‘expression’ with ‘activity’ as appropriate. (Lines 268~271)

      (14) Mitochondria from non-ischemic gastrocnemius muscle of health donor animals were isolated and a manner that maximized their healing potential. This sentence is not clear.

      We apologize for the confusing sentence in the original manuscript. To improve clarity, we have revised that sentence. We isolated mitochondria from allogeneic gastrocnemius muscle tissue of healthy rats and maintained optimal mitochondrial activity and therapeutic effects. (Lines 199~201)

      Minor grammar issues:

      In line 153, mitochondrial should be mitochondria.

      Figure 2D: Percent servival should be percent survival.

      There should be a blank in complex IIactivity Figure 4B, and complex IV activity in Figure 4C.

      In line 134, Four hours of ROSC, Tissue samples from. Tissue is capital.

      In line 190, Similaerly should be similarly.

      Thank you for your valuable comments. We apologize for the grammatical issues caused by our oversight. We have made the necessary corrections in the manuscript and figures. (Lines 198, 179, and 268), Figure 2D, Figure 5E (originally Figure 4B); Figure 5F (originally Figure 4C).

      Reviewer #2 (Recommendations For The Authors):

      Some details are lacking clarity, such as the rationale behind choosing certain doses or time points for interventions.

      Thank you for this valuable suggestion. We have explained the rationale behind the selection of the dosage and the timing of the intervention. (Lines 201~212)

      I would suggest verifying mitochondrial function using the seahorse experiment oxygen consumption, and to check mitochondrial oxidative stress. I would also suggest checking the mitochondrial permeability transition pore opening, using for example calcein cobalt quenching or simply a kit to examine this further.

      Thank you for your valuable advice. In our manuscript, we added results regarding mitochondrial reactive oxygen species (ROS) and the mitochondrial permeability transition pore (mPTP) opening. As anticipated, mitochondrial transplantation reduced the increase in mitochondrial ROS and the mPTP opening in ischemic myocardium. (Lines 135~146, 149~157, 442~455, 460~476, Figure 5H, 5I, 6A)

      We agree that Seahorse oxygen consumption experiments would be beneficial for understanding the intricacies of their interactions and enhancements. Additionally, Ali et al. (Ali, et al.,2020) have demonstrated that introducing non-autologous mitochondria from healthy skeletal muscle cells into normal cardiomyocytes results in a short-term improvement in bioenergetics, as measured using a Seahorse Extracellular Flux Analyzer. We have not yet conducted cellular experiments. The process of isolating cells from the myocardial tissue of adult SD rats for Seahorse analysis can lead to secondary damage to the myocardial cells (Jacobson, et al.,1985). In this experiment, we measured ATP content and the activity of mitochondrial complexes to evaluate energy changes after mitochondrial transplantation. We will conduct cell experiments and utilize Seahorse measurements to further clarify the alterations in myocardial energy in the future.

      For Figure 3B, it would be beneficial to include the relative quantification of the mitochondrial marker COX-IV. Additionally, if feasible, I suggest verifying the representation of the mitochondria outer membrane TOM20 or VDAC.

      Thank you for your great suggestion. As suggested, we added TOM20 to assess the purity of the isolated mitochondria and reached the same conclusion: the isolated mitochondria exhibited high purity (Figure 3B). TOM20 was expressed in both muscle lysates and isolated mitochondria, whereas GAPDH was exclusively found in the muscle lysate. (We re-validated the purity of the mitochondria by using relative quantification of TOM20 and COX IV.)

      In Figure 2C, the clarity of the graphs depicting both arterial pressure (MAP) and heart rate (HR) is lacking and could potentially confuse the reader. I recommend incorporating color coding instead of relying solely on symbols, or by presenting the data in a more comprehensible format and that aligns with graph B as well.

      Thank you for your constructive comments. We have color-coded the diagrams in Figure 2B and 2C.

      In Figure 4A, please include high-magnification of the mitochondria to provide a more detailed examination.

      Thank you for this insightful comment. We have provided a high-magnification image of the mitochondria in Figure 4.

      Regarding lines 81-82, I recommend specifying the sentence more precisely for better clarity and understanding.

      Thank you for your comments. We have revised the sentences in lines 83~86 to enhance their clarity for readers.

      In the Materials and Methods section, it is crucial to provide precise details. For instance, when staining the exogenous mitochondria with MitoTracker Red, it is important to specify the duration of staining, such as the standard 20 minutes for example. Additionally, it is advisable to mention the number of times these mitochondria were washed with the respiratory solution to ensure thorough removal of excess MitoTracker, thus preventing unintended staining of endogenous mitochondria with MitoTracker red upon injection of pre-labeled mitochondria.

      Thank you for your suggestion. We have added the necessary details regarding Mito-Tracker Red staining. (Lines 373~376) In addition, we also added other details where necessary (Lines 373~376, 379~382, 395~396, 397~400, 487~488). We appreciate your suggestion once again.

      The sensitivity of JC-1 dye to temperature and pH fluctuations underscores the necessity for meticulous experimental conditions. It is crucial for the authors to elucidate why they chose to maintain the samples at 4 °C for 60 minutes, especially considering the dye's optimal operating temperature of 25 °C. Providing a rationale behind this deviation from standard protocol would enhance the scientific rigor and reproducibility of the study. Please add more information on the objectives used in the fluorescence microscope (BX53, OLYMPUS, Tokyo, Japan) and the software used.

      We sincerely apologize for the mistake in this sentence. The purified mitochondria, which are stained with JC-1, should be stored at 4°C and examined using a fluorescence microscope within 60 minutes. Purified mitochondria were incubated with JC-1 staining solution at 37°C for 20 minutes. The fluorescence microscope used in our experiment is equipped with a WHN 10/22 eyepiece, and the software version is OLYMPUS cellSens Standard 3.2. (Lines 379~382)

      Moreover, in the context of immunoblotting, it is imperative for the authors to furnish detailed information regarding the preparation of muscle tissue homogenates. Specifically, clarification is needed regarding the solution utilized for tissue grinding. Did the authors employ ice-cold RIPA lysis buffer or an alternative lysis buffer, supplemented with a protease inhibitor cocktail? Such details are pivotal for methodological transparency.

      Thanks for this wonderful comment. In the methods section, we added detailed information about protein extraction. (Lines 383~385)

      Furthermore, it would be beneficial for the authors to specify the instrument employed for scanning the immunoblots, as well as the software utilized for subsequent analysis of the immunoblot images. Providing this information would not only enhance the reproducibility of the findings but also facilitate the evaluation of the experimental results.

      Thank you for your suggestion. We have included the instrument used for scanning the Western blot, as well as the software used for image analysis in the manuscript. (Lines 397~400)

      Authors must exercise caution against copy-pasting. In line 282, there's a query regarding how the mitochondria were isolated. It is recommended to cite a specific reference and offer more comprehensive details. Despite the authors referencing a number within the text, the absence of numbered references makes it challenging to cross-reference.

      Thank you for pointing this out; we have updated the citation accordingly (Line 361).

      Figure 5C please double check some misspelling label errors (e.g: Vehicle and not Vehucle).

      We apologize for the misspelling in Figure 6E (originally Figure 5C) and have corrected it. Additionally, we have thoroughly reviewed the text for spelling errors and sincerely apologize once again for the previous mistakes. (Lines 249~252, 322)

      References:

      Aharoni-Simon M, Ben-Yaakov K, Sharvit-Bader M, Raz D, Haim Y, Ghannam W, Porat N, Leiba H, Marcovich A, Eisenberg-Lerner A, Rotfogel Z. 2022. Oxidative stress facilitates exogenous mitochondria internalization and survival in retinal ganglion precursor-like cells. SCI REP-UK 12:5122. doi:10.1038/s41598-022-08747-3

      Alemany VS, Nomoto R, Saeed MY, Celik A, Regan WL, Matte GS, Recco DP, Emani SM, Del NP, McCully JD. 2024. Mitochondrial transplantation preserves myocardial function and viability in pediatric and neonatal pig hearts donated after circulatory death. J THORAC CARDIOV SUR 167: e6-e21. doi: 10.1016/j.jtcvs.2023.05.010

      Ali PP, Kenney MC, Kheradvar A. 2020. Bioenergetics Consequences of Mitochondrial Transplantation in Cardiomyocytes. J AM HEART ASSOC 9: e14501. doi:10.1161/JAHA.119.014501

      Blitzer D, Guariento A, Doulamis IP, Shin B, Moskowitzova K, Barbieri GR, Orfany A, Del NP, McCully JD. 2020. Delayed Transplantation of Autologous Mitochondria for Cardioprotection in a Porcine Model. ANN THORAC SURG 109:711-719. doi:10.1016/j.athoracsur.2019.06.075

      Cowan DB, Yao R, Thedsanamoorthy JK, Zurakowski D, Del NP, McCully JD. 2017. Transit and integration of extracellular mitochondria in human heart cells. SCI REP-UK 7:17450. doi:10.1038/s41598-017-17813-0

      Guariento A, Blitzer D, Doulamis I, Shin B, Moskowitzova K, Orfany A, Ramirez-Barbieri G, Staffa SJ, Zurakowski D, Del NP, McCully JD. 2020. Preischemic autologous mitochondrial transplantation by intracoronary injection for myocardial protection. J THORAC CARDIOV SUR 160: e15-e29. doi: 10.1016/j.jtcvs.2019.06.111

      Jacobson SL, Banfalvi M, Schwarzfeld TA. 1985. Long-term primary cultures of adult human and rat cardiomyocytes. BASIC RES CARDIOL 80 Suppl 1:79-82. doi:10.1007/978-3-662-11041-6_15

      Kaza AK, Wamala I, Friehs I, Kuebler JD, Rathod RH, Berra I, Ericsson M, Yao R, Thedsanamoorthy JK, Zurakowski D, Levitsky S, Del NP, Cowan DB, McCully JD. 2017. Myocardial rescue with autologous mitochondrial transplantation in a porcine model of ischemia/reperfusion. J THORAC CARDIOV SUR 153:934-943. doi: 10.1016/j.jtcvs.2016.10.077

      Kuwada Y, Takenaka K. 2000. [Transmural heterogeneity of the left ventricular wall: subendocardial layer and subepicardial layer]. J CARDIOL 35:205-218.

      Laurent I, Monchi M, Chiche JD, Joly LM, Spaulding C, Bourgeois B, Cariou A, Rozenberg A, Carli P, Weber S, Dhainaut JF. 2002. Reversible myocardial dysfunction in survivors of out-of-hospital cardiac arrest. J AM COLL CARDIOL 40:2110-2116. doi:10.1016/s0735-1097(02)02594-9

      Liu D, Gao Y, Liu J, Huang Y, Yin J, Feng Y, Shi L, Meloni BP, Zhang C, Zheng M, Gao J. 2021. Intercellular mitochondrial transfer as a means of tissue revitalization. SIGNAL TRANSDUCT TAR 6:65. doi:10.1038/s41392-020-00440-z

      Liu Q, Liu M, Yang T, Wang X, Cheng P, Zhou H. 2023. What can we do to optimize mitochondrial transplantation therapy for myocardial ischemia-reperfusion injury? MITOCHONDRION 72:72-83. doi: 10.1016/j.mito.2023.08.001

      Masuzawa A, Black KM, Pacak CA, Ericsson M, Barnett RJ, Drumm C, Seth P, Bloch DB, Levitsky S, Cowan DB, McCully JD. 2013. Transplantation of autologously derived mitochondria protects the heart from ischemia-reperfusion injury. AM J PHYSIOL-HEART C 304:H966-H982. doi:10.1152/ajpheart.00883.2012

      McCully JD, Cowan DB, Emani SM, Del NP. 2017. Mitochondrial transplantation: From animal models to clinical use in humans. MITOCHONDRION 34:127-134. doi: 10.1016/j.mito.2017.03.004

      McCully JD, Cowan DB, Pacak CA, Toumpoulis IK, Dayalan H, Levitsky S. 2009. Injection of isolated mitochondria during early reperfusion for cardioprotection. AM J PHYSIOL-HEART C 296:H94-H105. doi:10.1152/ajpheart.00567.2008

      Pacak CA, Preble JM, Kondo H, Seibel P, Levitsky S, Del NP, Cowan DB, McCully JD. 2015. Actin-dependent mitochondrial internalization in cardiomyocytes: evidence for rescue of mitochondrial function. BIOL OPEN 4:622-626. doi:10.1242/bio.201511478

      Shanmughapriya S, Langford D, Natarajaseenivasan K. 2020. Inter and Intracellular mitochondrial trafficking in health and disease. AGEING RES REV 62:101128. doi: 10.1016/j.arr.2020.101128

      Shin B, Saeed MY, Esch JJ, Guariento A, Blitzer D, Moskowitzova K, Ramirez-Barbieri G, Orfany A, Thedsanamoorthy JK, Cowan DB, Inkster JA, Snay ER, Staffa SJ, Packard AB, Zurakowski D, Del NP, McCully JD. 2019. A Novel Biological Strategy for Myocardial Protection by Intracoronary Delivery of Mitochondria: Safety and Efficacy. JACC-BASIC TRANSL SC 4:871-888. doi: 10.1016/j.jacbts.2019.08.007

      Tashiro R, Bautista-Garrido J, Ozaki D, Sun G, Obertas L, Mobley AS, Kim GS, Aronowski J, Jung JE. 2022. Transplantation of Astrocytic Mitochondria Modulates Neuronal Antioxidant Defense and Neuroplasticity and Promotes Functional Recovery after Intracerebral Hemorrhage. J NEUROSCI 42:7001-7014. doi:10.1523/JNEUROSCI.2222-21.2022

    1. eLife Assessment

      These useful findings assign a novel functional role to crotonylation, a form of histone acylation. Although mechanistic insights into the role of the YEATS2-GCDH axis in modulating EMT in HNC are provided in great detail, the strength of evidence for the manuscript is incomplete. The patient cohort is very small, with just 10 patients; to establish a significant result, the cohort size should be increased. Furthermore, the functional role of p300 should also be examined.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript investigates a mechanism between the histone reader protein YEATS2 and the metabolic enzyme GCDH, particularly in regulating epithelial-to-mesenchymal transition (EMT) in head and neck cancer (HNC).

      Strengths:

      Great detailing of the mechanistic aspect of the above axis is the primary strength of the manuscript.

      Weaknesses:

      Several critical points require clarification, including the rationale behind EMT marker selection, the inclusion of metastasis data, the role of key metabolic enzymes like ECHS1, and the molecular mechanisms governing p300 and YEATS2 interactions.

      Major Comments:

      (1) The title, "Interplay of YEATS2 and GCDH mediates histone crotonylation and drives EMT in head and neck cancer," appears somewhat misleading, as it implies that YEATS2 directly drives histone crotonylation. However, YEATS2 functions as a reader of histone crotonylation rather than a writer or mediator of this modification. It cannot itself mediate the addition of crotonyl groups onto histones. Instead, the enzyme GCDH is the one responsible for generating crotonyl-CoA, which enables histone crotonylation. Therefore, while YEATS2 plays a role in recognizing crotonylation marks and may regulate gene expression through this mechanism, it does not directly catalyse or promote the crotonylation process.

      (2) The study suggests a link between YEATS2 and metastasis due to its role in EMT, but the lack of clinical or pre-clinical evidence of metastasis is concerning. Only primary tumor (PT) data is shown, but if the hypothesis is that YEATS2 promotes metastasis via EMT, then evidence from metastatic samples or in vivo models should be included to solidify this claim.

      (3) There seems to be some discrepancy in the invasion data with BICR10 control cells (Figure 2C). BICR10 control cells with mock plasmids, specifically shControl and pEGFP-C3 show an unclear distinction between invasion capacities. Normally, we would expect the control cells to invade somewhat similarly, in terms of area covered, within the same time interval (24 hours here). But we clearly see more control cells invading when the invasion is done with KD and fewer control cells invading when the invasion is done with OE. Are these just plasmid-specific significant effects on normal cell invasion? This needs to be addressed.

      (4) In Figure 3G, the Western blot shows an unclear band for YEATS2 in shSP1 cells with YEATS2 overexpression condition. The authors need to clearly identify which band corresponds to YEATS2 in this case.

      (5) In ChIP assays with SP1, YEATS2 and p300 which promoter regions were selected for the respective genes? Please provide data for all the different promoter regions that must have been analysed, highlighting the region where enrichment/depletion was observed. Including data from negative control regions would improve the validity of the results.

      (6) The authors establish a link between H3K27cr marks and GCDH expression, and this is an already well-known pathway. A critical missing piece is the level of ECHS1 in patient samples. This will clearly delineate if the balance shifted towards crotonylation.

      (7) The p300 ChIP data on the SPARC promoter is confusing. The authors report reduced p300 occupancy in YEATS2-silenced cells, on SPARC promoter. However, this is paradoxical, as p300 is a writer, a histone acetyltransferase (HAT). The absence of a reader (YEATS2) shouldn't affect the writer (p300) unless a complex relationship between p300 and YEATS2 is present. The role of p300 should be further clarified in this case. Additionally, transcriptional regulation of SPARC expression in YEATS2 silenced cells could be analysed via downstream events, like Pol-II recruitment. Assays such as Pol-II ChIP-qPCR could help explain this.

      (8) The role of GCDH in producing crotonyl-CoA is already well-established in the literature. The authors' hypothesis that GCDH is essential for crotonyl-CoA production has been proven, and it's unclear why this is presented as a novel finding. It has been shown that YEATS2 KD leads to reduced H3K27cr, however, it remains unclear how the reader is affecting crotonylation levels. Are GCDH levels also reduced in the YEATS2 KD condition? Are YEATS2 levels regulating GCDH expression? One possible mechanism is YEATS2 occupancy on GCDH promoter and therefore reduced GCDH levels upon YEATS2 KD. This aspect is crucial to the study's proposed mechanism but is not addressed thoroughly.

      (9) The authors should provide IHC analysis of YEATS2, SPARC alongside H3K27cr and GCDH staining in normal vs. tumor tissues from HNC patients.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript emphasises the increased invasive potential of histone reader YEATS2 in an SP1-dependent manner. They report that YEATS2 maintains high H3K27cr levels at the promoter of EMT-promoting gene SPARC. These findings assigned a novel functional implication of histone acylation, crotonylation.

      Concerns:

      (1) The patient cohort is very small with just 10 patients. To establish a significant result the cohort size should be increased.

      (2) Figure 4D compares H3K27Cr levels in tumor and normal tissue samples. Figure 1G shows overexpression of YEATS2 in a tumor as compared to normal samples. The loading control is missing in both. Loading control is essential to eliminate any disparity in protein concentration that is loaded.

      (3) Figure 4D only mentions 5 patient samples checked for the increased levels of crotonylation and hence forms the basis of their hypothesis (increased crotonylation in a tumor as compared to normal). The sample size should be more and patient details should be mentioned.

      (4) YEATS2 maintains H3K27Cr levels at the SPARC promoter. The p300 is reported to be hyper-activated (hyperautoacetylated) in oral cancer. Probably, the activated p300 causes hyper-crotonylation, and other protein factors cause the functional translation of this modification. The authors need to clarify this with a suitable experiment.

      (5) I do not entirely agree with using GAPDH as a control in the western blot experiment since GAPDH has been reported to be overexpressed in oral cancer.

      (6) The expression of EMT markers has been checked in shControl and shYEATS2 transfected cell lines (Figure 2A). However, their expression should first be checked directly in the patients' normal vs. tumor samples.

      (7) In Figure 3G, knockdown of SP1 led to the reduced expression of YEATS2 controlled gene Twist1. Ectopic expression of YEATS2 was able to rescue Twist1 partially. In order to establish that SP1 directly regulates YEATS2, SP1 should also be re-introduced upon the knockdown background along with YEATS2 for complete rescue of Twist1 expression.

      (8) In Figure 7G, the expression of EMT genes should also be checked upon rescue of SPARC expression.

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript investigates a mechanism between the histone reader protein YEATS2 and the metabolic enzyme GCDH, particularly in regulating epithelial-to-mesenchymal transition (EMT) in head and neck cancer (HNC).

      Strengths:

      Great detailing of the mechanistic aspect of the above axis is the primary strength of the manuscript.

      Weaknesses:

      Several critical points require clarification, including the rationale behind EMT marker selection, the inclusion of metastasis data, the role of key metabolic enzymes like ECHS1, and the molecular mechanisms governing p300 and YEATS2 interactions.

      We would like to sincerely thank the reviewer for the detailed, in-depth, and positive response. We are committed to implementing constructive revisions to the manuscript to address the reviewer’s concerns effectively.

      Major Comments:

      (1) The title, "Interplay of YEATS2 and GCDH mediates histone crotonylation and drives EMT in head and neck cancer," appears somewhat misleading, as it implies that YEATS2 directly drives histone crotonylation. However, YEATS2 functions as a reader of histone crotonylation rather than a writer or mediator of this modification. It cannot itself mediate the addition of crotonyl groups onto histones. Instead, the enzyme GCDH is the one responsible for generating crotonyl-CoA, which enables histone crotonylation. Therefore, while YEATS2 plays a role in recognizing crotonylation marks and may regulate gene expression through this mechanism, it does not directly catalyse or promote the crotonylation process.

      We thank the reviewer for raising this concern. As stated by the reviewer, YEATS2 functions as a reader protein: it recognizes histone crotonylation marks and may promote the addition of this mark to nearby histone residues, possibly by helping to recruit the writer protein for crotonylation. Our data indicate that YEATS2 is involved in the recruitment of the writer protein p300 to the promoter of the SPARC gene, making YEATS2 a regulatory factor that contributes to the addition of crotonyl marks indirectly. We have therefore decided to change the title by replacing the word “mediates” with “regulates”. The updated title reads: “Interplay of YEATS2 and GCDH regulates histone crotonylation and drives EMT in head and neck cancer”.

      (2) The study suggests a link between YEATS2 and metastasis due to its role in EMT, but the lack of clinical or pre-clinical evidence of metastasis is concerning. Only primary tumor (PT) data is shown, but if the hypothesis is that YEATS2 promotes metastasis via EMT, then evidence from metastatic samples or in vivo models should be included to solidify this claim.

      We appreciate the reviewer’s suggestion. Here, we would like to state that the primary aim of this study was to delineate the molecular mechanisms behind the role of YEATS2 in maintaining histone crotonylation at the promoter of genes that favour EMT in head and neck cancer. We have dissected the importance of histone crotonylation in the regulation of gene expression in head and neck cancer in great detail, having investigated the upstream and downstream molecular players involved in this process that promote EMT. Moreover, with the help of multiple phenotypic assays, such as Matrigel invasion, wound healing, and 3D invasion assays, we have shown the functional importance of YEATS2 in promoting EMT in head and neck cancer cells. Since EMT is known to be a prerequisite process for cancer cells undergoing metastasis(1), the evidence of YEATS2 being associated with EMT demonstrates a potential correlation of YEATS2 with metastasis. However, as part of the revision, we will use publicly available patient data to investigate the direct association of YEATS2 with metastasis by checking the expression of YEATS2 between different grades of head and neck cancer, as an increase in tumor grade is often correlated with the incidence of metastasis(2).

      (3) There seems to be some discrepancy in the invasion data with BICR10 control cells (Figure 2C). BICR10 control cells with mock plasmids, specifically shControl and pEGFP-C3 show an unclear distinction between invasion capacities. Normally, we would expect the control cells to invade somewhat similarly, in terms of area covered, within the same time interval (24 hours here). But we clearly see more control cells invading when the invasion is done with KD and fewer control cells invading when the invasion is done with OE. Are these just plasmid-specific significant effects on normal cell invasion? This needs to be addressed.

      We thank the reviewer for the thorough evaluation of the manuscript. The figure panels in question, Figure 2B and 2C, represent two experiments performed independently: the invasion assay after knockdown and after overexpression of YEATS2, respectively. We would like to clarify that the two panels show distinct, independent results and that the methods used to knock down and to overexpress YEATS2 also differ. As stated in the Materials and Methods section, the knockdown was performed using lentivirus-mediated transduction of cells, whereas the overexpression was done by standard transfection, in which the transfection reagent and the respective plasmids are mixed directly before the mix is added to the cells. The difference in experimental conditions between these two experiments might have contributed to the differences seen in the controls, as observed previously(3). Hence, we would like to state that the results in Figure 2B and Figure 2C should be evaluated independently of each other.

      (4) In Figure 3G, the Western blot shows an unclear band for YEATS2 in shSP1 cells with YEATS2 overexpression condition. The authors need to clearly identify which band corresponds to YEATS2 in this case.

      The two bands seen in the shSP1+pEGFP-C3-YEATS2 condition correspond to the endogenous YEATS2 band (lower band, indicated by * in the shControl lane) and YEATS2-GFP band (upper band, corresponding to overexpressed YEATS2-GFP fusion protein, which has a higher molecular weight). To avoid confusion, the endogenous band will be highlighted (marked by *) in the lane representing the shSP1+pEGFP-C3-YEATS2 condition in the revised version of the manuscript.

      (5) In ChIP assays with SP1, YEATS2 and p300 which promoter regions were selected for the respective genes? Please provide data for all the different promoter regions that must have been analysed, highlighting the region where enrichment/depletion was observed. Including data from negative control regions would improve the validity of the results.

      Throughout our study, we have performed ChIP-qPCR assays to check the binding of SP1 on YEATS2 and GCDH promoter, and to check YEATS2 and p300 binding on SPARC promoter. Using transcription factor binding prediction tools and luciferase assays, we selected multiple sites on the YEATS2 and GCDH promoter to check for SP1 binding. The results corresponding to the site that showed significant enrichment were provided in the manuscript. The region of SPARC promoter in YEATS2 and p300 ChIP assay was selected on the basis of YEATS2 enrichment found in the YEATS2 ChIP-seq data. We will provide data for all the promoter regions investigated (including negative controls) in the revised version of the manuscript.

      (6) The authors establish a link between H3K27cr marks and GCDH expression, and this is an already well-known pathway. A critical missing piece is the level of ECHS1 in patient samples. This will clearly delineate if the balance shifted towards crotonylation.

      We thank the reviewer for their valuable suggestion. To support our claim, we checked the expression of GCDH and ECHS1 in TCGA HNC RNA-seq data (provided in Figure 4—figure supplement 1A and B) and found that GCDH expression was increased while ECHS1 expression was decreased in tumor as compared to normal samples. We hypothesized that higher GCDH expression and lower ECHS1 expression might lead to an increase in the levels of crotonylation in HNC. To further substantiate our claim, we will check the abundance of ECHS1 in HNC patient samples as part of the revision.

      (7) The p300 ChIP data on the SPARC promoter is confusing. The authors report reduced p300 occupancy in YEATS2-silenced cells, on SPARC promoter. However, this is paradoxical, as p300 is a writer, a histone acetyltransferase (HAT). The absence of a reader (YEATS2) shouldn't affect the writer (p300) unless a complex relationship between p300 and YEATS2 is present. The role of p300 should be further clarified in this case. Additionally, transcriptional regulation of SPARC expression in YEATS2 silenced cells could be analysed via downstream events, like Pol-II recruitment. Assays such as Pol-II ChIP-qPCR could help explain this.

      Using RNA-seq and ChIP-seq analyses, we have shown that YEATS2 affects the expression of several genes by regulating the level of histone crotonylation at gene promoters globally. The histone writer p300 is a promiscuous acyltransferase that has been shown to be involved in the addition of several non-acetyl marks on histone residues, including crotonylation(4). Our data provide evidence that the writer p300 depends on YEATS2 in mediating histone crotonylation, as YEATS2 downregulation led to decreased occupancy of p300 on the SPARC promoter (Figure 5F). However, the exact mechanism of cooperativity between YEATS2 and p300 in maintaining histone crotonylation remains to be investigated. To address the reviewer’s concern, we will perform the following experiments to delineate the molecular mechanism by which YEATS2 associates with p300 in regulating histone crotonylation:

      (a) Co-immunoprecipitation experiments to check the physical interaction between YEATS2 and p300.

      (b) We will check H3K27cr levels on the SPARC promoter and SPARC expression in p300-depleted HNC cells.

      (c) Rescue experiments to check if the decrease in p300 occupancy on the SPARC promoter can be compensated by overexpressing YEATS2.

      (d) As suggested by the reviewer, Pol-II ChIP-qPCR at the promoter of SPARC will be performed in YEATS2-silenced cells to explain the mode of transcriptional regulation of SPARC expression by YEATS2.

      (8) The role of GCDH in producing crotonyl-CoA is already well-established in the literature. The authors' hypothesis that GCDH is essential for crotonyl-CoA production has been proven, and it's unclear why this is presented as a novel finding. It has been shown that YEATS2 KD leads to reduced H3K27cr, however, it remains unclear how the reader is affecting crotonylation levels. Are GCDH levels also reduced in the YEATS2 KD condition? Are YEATS2 levels regulating GCDH expression? One possible mechanism is YEATS2 occupancy on GCDH promoter and therefore reduced GCDH levels upon YEATS2 KD. This aspect is crucial to the study's proposed mechanism but is not addressed thoroughly.

      The source for histone crotonylation, crotonyl-CoA, can be produced by several enzymes in the cell, such as ACSS2, GCDH, ACOX3, etc(5). Since metabolic intermediates produced by several cellular pathways can act as substrates for epigenetic factors, we wanted to investigate whether such an epigenetic-metabolism crosstalk exists in the context of YEATS2. As described in the manuscript, we performed GSEA using publicly available TCGA RNA-seq data and found that higher YEATS2 expression in patients was strongly correlated with the expression levels of genes involved in the lysine degradation pathway, including GCDH. Since the preferential binding of YEATS2 to H3K27cr and the role of GCDH in producing crotonyl-CoA were known(6,7), we hypothesized that higher H3K27cr in HNC could be a result of both YEATS2 and GCDH. We found that the presence of GCDH in the nucleus of HNC cells is correlated with higher H3K27cr abundance, which could be a result of excess crotonyl-CoA produced via GCDH. We also found a correlation between H3K27cr levels and YEATS2 expression, which could arise from YEATS2-mediated preferential maintenance of crotonylation. This indicates that, although it is a reader protein, YEATS2 affects promoter H3K27cr levels, possibly by helping to recruit p300 (as shown in Figure 5F). Thus, YEATS2 and GCDH are both responsible for the regulation of histone crotonylation-mediated gene expression in HNC.

      We did not find any evidence of YEATS2 regulating the expression of GCDH in HNC cells. However, we found that YEATS2 downregulation reduced the nuclear pool of GCDH in head and neck cancer cells (Figure 7F). This suggests that YEATS2 regulates histone crotonylation not only by affecting promoter H3K27cr levels (with p300), but also by affecting the nuclear localization of crotonyl-CoA-producing GCDH. We also observed that the expression of YEATS2 and GCDH is regulated by the same transcription factor, SP1, in HNC: SP1 binds to the promoter of both genes, and its downregulation led to a decrease in their expression (Figure 3 and Figure 7).

      We would like to state that the relationship between YEATS2 and the nuclear localization of GCDH, as well as the underlying molecular mechanism, remains unexplored and presents an open question for future investigation.

      (9) The authors should provide IHC analysis of YEATS2, SPARC alongside H3K27cr and GCDH staining in normal vs. tumor tissues from HNC patients.

      We thank the reviewer for their suggestion. We are consulting our clinical collaborators to assess the feasibility of including this IHC analysis in our revision and will make every effort to incorporate it.

      Reviewer #2 (Public review):

      Summary:

      The manuscript emphasises the increased invasive potential of histone reader YEATS2 in an SP1-dependent manner. They report that YEATS2 maintains high H3K27cr levels at the promoter of EMT-promoting gene SPARC. These findings assigned a novel functional implication of histone acylation, crotonylation.

      We thank the reviewer for the constructive comments. We are committed to making beneficial changes to the manuscript in order to alleviate the reviewer’s concerns.

      Concerns:

      (1) The patient cohort is very small with just 10 patients. To establish a significant result the cohort size should be increased.

      We thank the reviewer for this suggestion. We will increase the number of patient samples to assess the levels of YEATS2 and H3K27cr in normal vs. tumor samples.

      (2) Figure 4D compares H3K27Cr levels in tumor and normal tissue samples. Figure 1G shows overexpression of YEATS2 in a tumor as compared to normal samples. The loading control is missing in both. Loading control is essential to eliminate any disparity in protein concentration that is loaded.

      In Figures 1G and 4D, we have used Ponceau S staining as a control for equal loading. Ponceau S staining is frequently used as an alternative to housekeeping proteins like GAPDH as a control for protein loading(8). It avoids the potential for variability in housekeeping gene expression. However, it may be less quantitative than using housekeeping proteins. To address the reviewer’s concern, we will probe with an antibody against a housekeeping protein as a loading control in the revised figures, provided its expression remains stable across the conditions tested.

      (3) Figure 4D only mentions 5 patient samples checked for the increased levels of crotonylation and hence forms the basis of their hypothesis (increased crotonylation in a tumor as compared to normal). The sample size should be more and patient details should be mentioned.

      A total of 9 samples were checked for H3K27cr levels (5 of them are included in Figure 4D and the rest are included in Figure 4—figure supplement 1D). However, as part of the revision, we will check the H3K27cr levels in more patient samples.

      (4) YEATS2 maintains H3K27Cr levels at the SPARC promoter. The p300 is reported to be hyper-activated (hyperautoacetylated) in oral cancer. Probably, the activated p300 causes hyper-crotonylation, and other protein factors cause the functional translation of this modification. The authors need to clarify this with a suitable experiment.

      In our study, we have shown that p300 is dependent on YEATS2 for its recruitment on the SPARC promoter. As a part of the revision, we propose the following experiments to further substantiate the role of p300 in YEATS2-mediated gene regulation:

      (a) Co-immunoprecipitation experiments to check the physical interaction between YEATS2 and p300.

      (b) We will check H3K27cr levels on the SPARC promoter and SPARC expression in p300-depleted HNC cells.

      (c) Rescue experiments to check if the decrease in p300 occupancy on the SPARC promoter can be compensated by overexpressing YEATS2.

      (d) Pol-II ChIP-qPCR at the promoter of SPARC will be performed in YEATS2-silenced cells to explain the mode of transcriptional regulation of SPARC expression by YEATS2.

      (5) I do not entirely agree with using GAPDH as a control in the western blot experiment since GAPDH has been reported to be overexpressed in oral cancer.

      We would like to clarify that GAPDH was not used as a loading control for protein expression comparisons between normal and tumor samples. GAPDH was used as a loading control only in experiments using head and neck cancer cell lines where shRNA-mediated knockdown or overexpression was employed. These manipulations specifically target the genes of interest and are not expected to alter GAPDH expression, making it a suitable loading control in these instances.

      (6) The expression of EMT markers has been checked in shControl and shYEATS2 transfected cell lines (Figure 2A). However, their expression should first be checked directly in the patients' normal vs. tumor samples.

      We thank the reviewer for the suggestion. To address this, we will check the expression of EMT markers alongside YEATS2 expression in normal vs. tumor samples.

      (7) In Figure 3G, knockdown of SP1 led to the reduced expression of YEATS2 controlled gene Twist1. Ectopic expression of YEATS2 was able to rescue Twist1 partially. In order to establish that SP1 directly regulates YEATS2, SP1 should also be re-introduced upon the knockdown background along with YEATS2 for complete rescue of Twist1 expression.

      To address the reviewer’s concern regarding the partial rescue of Twist1 in SP1-depleted, YEATS2-overexpressing cells, we will perform the experiment as suggested by the reviewer. In brief, we will overexpress both SP1 and YEATS2 in SP1-depleted cells and then assess the expression of Twist1.

      (8) In Figure 7G, the expression of EMT genes should also be checked upon rescue of SPARC expression.

      We thank the reviewer for the suggestion. We will check the expression of EMT markers upon YEATS2/GCDH rescue and update Figure 7G in the revised version of the manuscript.

      References

      (1) T. Brabletz, R. Kalluri, M. A. Nieto and R. A. Weinberg, Nat Rev Cancer, 2018, 18, 128–134.

      (2) P. Pisani, M. Airoldi, A. Allais, P. Aluffi Valletti, M. Battista, M. Benazzo, R. Briatore, S. Cacciola, S. Cocuzza, A. Colombo, B. Conti, A. Costanzo, L. Della Vecchia, N. Denaro, C. Fantozzi, D. Galizia, M. Garzaro, I. Genta, G. A. Iasi, M. Krengli, V. Landolfo, G. V. Lanza, M. Magnano, M. Mancuso, R. Maroldi, L. Masini, M. C. Merlano, M. Piemonte, S. Pisani, A. Prina-Mello, L. Prioglio, M. G. Rugiu, F. Scasso, A. Serra, G. Valente, M. Zannetti and A. Zigliani, Acta Otorhinolaryngol Ital, 2020, 40, S1–S86.

      (3) J. Lin, P. Zhang, W. Liu, G. Liu, J. Zhang, M. Yan, Y. Duan and N. Yang, Elife, 2023, 12, RP87510.

      (4) X. Liu, W. Wei, Y. Liu, X. Yang, J. Wu, Y. Zhang, Q. Zhang, T. Shi, J. X. Du, Y. Zhao, M. Lei, J.-Q. Zhou, J. Li and J. Wong, Cell Discov, 2017, 3, 17016.

      (5) G. Jiang, C. Li, M. Lu, K. Lu and H. Li, Cell Death Dis, 2021, 12, 703.

      (6) D. Zhao, H. Guan, S. Zhao, W. Mi, H. Wen, Y. Li, Y. Zhao, C. D. Allis, X. Shi and H. Li, Cell Res, 2016, 26, 629–632.

      (7) H. Yuan, X. Wu, Q. Wu, A. Chatoff, E. Megill, J. Gao, T. Huang, T. Duan, K. Yang, C. Jin, F. Yuan, S. Wang, L. Zhao, P. O. Zinn, K. G. Abdullah, Y. Zhao, N. W. Snyder and J. N. Rich, Nature, 2023, 617, 818–826.

      (8) I. Romero-Calvo, B. Ocón, P. Martínez-Moya, M. D. Suárez, A. Zarzuelo, O. Martínez-Augustin and F. S. de Medina, Anal Biochem, 2010, 401, 318–320.

    1. eLife Assessment

      In this important study, the authors advance our understanding of copper uptake by chalkophores and their targeted metalloproteins in Mycobacterium tuberculosis. These convincing data demonstrate that chalkophore-acquired copper is solely incorporated into the Mtb bcc:aa3 copper-iron respiratory oxidase under low copper conditions, and that chalkophore-mediated protection of the respiratory chain is critical to Mtb virulence. These findings may be leveraged for drug discovery and will be of broad interest to those studying bacterial pathogenesis.

    2. Reviewer #1 (Public review):

      Summary:

      It is essential for Mycobacterium tuberculosis (Mtb) to scavenge trace metals from its host to survive. In this study, the authors explore the effects of copper limitation on Mtb. Mtb synthesizes small-molecule diisonitrile lipopeptides, termed chalkophores, that chelate host copper for import, whereby the copper is incorporated into Mtb metalloproteins. However, the role of chalkophores in Mtb biology and their targeted metalloproteins are unknown. This study investigates Mtb proteins that require chalkophores for copper incorporation and their effect on Mtb virulence. It is known that the nrp operon is induced by copper deprivation and encodes the synthesis of chalkophores. A genetic analysis revealed transcriptional differences for WT and Mtb∆nrp when exposed to the copper chelator tetrathiomolybdate (TTM). The authors found that copper chelation results in upregulation of genes in the chalkophore cluster as well as genes involved in the respiratory chain: specifically, components of the heme-dependent oxidase CytBD and subunits of the bcc:aa3 heme-copper oxidase. Interestingly, treatment of Mtb∆nrp with an inhibitor of the QcrB subunit of the bcc:aa3 oxidase (Q203) resulted in similar transcriptional changes. The bcc:aa3 oxidase and CytBD are functionally redundant, and while both utilize heme as a cofactor, only the first utilizes heme and copper. Utilizing Mtb∆nrp, Mtb∆cydAB and MtbΔnrpΔcydAB along with single gene complementation, the authors showed that copper starvation survival requires diisonitrile chalkophore synthesis and that copper starvation results in dysfunctional bcc:aa3 oxidase. Further genetic analysis combined with inhibitor studies indicates that bcc:aa3 oxidase is the only target impacted by copper starvation. By monitoring oxygen consumption for mutants in combination with inhibitors, the authors show that copper deprivation inhibits respiration through the bcc:aa3 oxidase.
      Similarly, they show that TTM or Q203 treatment inhibits ATP production in MtbΔnrpΔcydAB, but not in WT, showing that chalkophores maintain oxidative phosphorylation. Lastly, the authors compare the virulence of WT Mtb, Mtb∆nrp and MtbΔnrpΔcydAB strains in mouse spleen and lung. The Mtb∆nrp strain showed mild attenuation, but virulence in MtbΔnrpΔcydAB was severely attenuated, and complementation with the chalkophore biosynthetic pathway restored Mtb virulence. These results suggest that chalkophore-mediated protection of the respiratory chain is critical to Mtb virulence, and that the redundant respiratory oxidases within Mtb provide respiratory chain flexibility that may promote host adaptation.

      Strengths:

      Overall, the paper is very clear and well-written, with thorough and well-thought-out experimentation.

      The methods are all quite standard, so there are no weaknesses identified with regard to methodology.

    3. Reviewer #2 (Public review):

      Summary:

      This is a well-written manuscript that clearly demonstrates that the nrp-encoded diisonitrile chalkophore is necessary for the function of the bcc:aa3 oxidase supercomplex under low copper conditions. In addition, the study demonstrates that the chalkophore is important early during infection when copper sequestration is employed by the host as a method of nutritional immunity.

      Strengths:

      The authors use genetic approaches including single and double mutants of chalkophore biosynthesis and of both Mtb oxidases. They use copper chelators to restrict copper in vitro. A strength of the work was the use of a synthesized Mtb chalkophore analogue to show chemical complementation of the mutant nrp locus. Oxphos metabolic activity was measured by oxygen consumption and ATP levels. Importantly, the study demonstrated that the chalkophore, especially in a strain lacking the secondary oxidase, was necessary for early infection, and ruled out a role for adaptive immunity in the attenuation of chalkophore-lacking Mtb by use of SCID mice. It is interesting that after two weeks of infection and onset of adaptive immunity, the chalkophore is not required, which is consistent with the host environment switching from copper restriction to copper overload in phagosomes.

      Weaknesses:

      Most claims in the manuscript are soundly justified. The one exception is the claim that "maintenance of respiration is the only cellular target of chalkophore mediated copper acquisition." While this does appear to be the case under the in vitro conditions tested, it can't be ruled out that the chalkophore is important in other situations, in particular for maintenance of the periplasmic superoxide dismutase, SodC, which is the other M. tuberculosis enzyme known to require copper.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the group of Glickman expands on their previous studies on the function of chalkophores during the growth of and infection by Mycobacterium tuberculosis. Previously, the group had shown that chalkophores, which are metallophores specific for the scavenging of copper, are induced by M. tuberculosis under copper deprivation conditions. Here, they show that chalkophores, under copper-limiting conditions, are essential for the uptake of copper and maturation of a terminal oxidase, the heme-copper oxidase cytochrome bcc:aa3. As M. tuberculosis has two redundant terminal oxidases, growth of and infection by M. tuberculosis are only moderated if both the chalkophores and the second terminal oxidase, cytochrome bd, are inhibited.

      Strengths:

      A strength of this work is that the lab-culture experiments are expanded upon with mice infection models, providing strong indications that host-inflicted copper deprivation is a condition that M. tuberculosis has adapted to for virulence.

      Weaknesses:

      Because the phenotype of M. tuberculosis lacking chalkophores is similar, if not identical, to using Q203, an inhibitor of cytochrome bcc:aa3, the authors propose that the copper-containing cytochrome bcc:aa3 is the only recipient of copper-uptake by chalkophores. A minor weakness of the work is that this latter conclusion is not verified under infection conditions and other copper-enzymes might still be functionally required during one or more stages of infection.

    5. Author response:

      We thank the reviewers for their careful evaluation of our manuscript and appreciate the suggestions for improvement. We will outline our planned revisions in response to these reviews.

      Reviewer 2:

      “The one exception is the claim that "maintenance of respiration is the only cellular target of chalkophore mediated copper acquisition." While under the in vitro conditions tested this does appear to be the case; however, it can't be ruled out that the chalkophore is important in other situations. In particular, for maintenance of the periplasmic superoxide dismutase, SodC, which is the other M. tuberculosis enzyme known to require copper.”

      And

      Reviewer 3:

      “Because the phenotype of M. tuberculosis lacking chalkophores is similar, if not identical, to using Q203, an inhibitor of cytochrome bcc:aa3, the authors propose that the copper-containing cytochrome bcc:aa3 is the only recipient of copper-uptake by chalkophores. A minor weakness of the work is that this latter conclusion is not verified under infection conditions and other copper-enzymes might still be functionally required during one or more stages of infection.”

      Both comments concern the question of whether the bcc:aa3 respiratory oxidase supercomplex is the only target of chalkophore-delivered copper. In culture, our experiments suggest that bcc:aa3 is the only target. The evidence for this claim is in Figure 2E and F. In 2E, we show that M. tuberculosis ΔctaD (a subunit of bcc:aa3) is growth impaired, that copper chelation with TTM does not exacerbate that growth defect, and that a ΔctaDΔnrp double mutant is no more sensitive to TTM than ΔctaD. These data indicate that the role of the chalkophore in protecting against copper deprivation is absent when the bcc:aa3 oxidase is missing. Similar results were obtained with Q203 (Figure 2F). Q203 or TTM arrest growth of M. tuberculosis Δnrp, but the combination has no additional effect, indicating that when Q203 is inhibiting the bcc:aa3 oxidase, the chalkophore has no additional role. However, we agree with the reviewers that we cannot exclude the possibility that during infection, there is an additional target of chalkophore-mediated Cu acquisition. We will add this caveat to the revised version of this manuscript.

    1. eLife Assessment

      This manuscript reports fundamental discoveries on how necrotic cells contribute to organ regeneration through apoptotic signalling to produce cells with non-lethal apoptotic caspase activity that contribute to the regenerated tissue. These findings will be of broad interest to those who study wound repair and tissue regeneration. The strength of the evidence is solid and has been improved in the revised version.

    2. Reviewer #2 (Public review):

      In this revised manuscript, Klemm et al. build on previously published findings (Klemm et al., 2021) to characterize caspase activation in distal cells following necrotic tissue damage within the Drosophila wing imaginal disc. In Klemm et al., 2021, the authors described necrosis-induced apoptosis (NiA) following the development of a genetic system to study necrosis caused by the expression of a constitutively active GluR1 (Glutamate/Ca2+ channel), and they discovered that the appearance of NiA cells was important for promoting regeneration.

      In this manuscript, the authors investigate how tissues regenerate following necrotic cell death. They find that:

      (1) the cells of the wing pouch are more likely to have non-autonomous caspase activation than other regions within the wing imaginal disc (hinge and notum),

      (2) two signaling pathways that are known to be upregulated during regeneration, Wnt (wingless) and JAK/Stat signaling, act to prevent additional NiA in pouch cells, and may partially explain the region specificity,

      (3) the presence of NiA (and/or NiCP) cells promotes regenerative proliferation in the late stages of regeneration,

      (4) not all caspase-positive cells are cleared from the epithelium (these cells are then referred to as Necrosis-induced Caspase Positive (NiCP) cells); these NiCP cells continue to live and promote proliferation in adjacent cells,

      (5) the initiator caspase Dronc is important for creating NiA/NiCP cells and for these cells to promote proliferation. Animals heterozygous for a Dronc null allele show a decrease in regeneration following necrotic tissue damage. In the revised manuscript, the authors provide improvements through additional data quantifications and text changes to better explain NiA/NiCP lineage tracing methods.

      The study has the potential to be broadly interesting due to the insights into how tissues differentially respond to necrosis as compared to apoptosis to promote regeneration. The paper raises many interesting questions for future investigation, including what is the nature of the signaling between the damaged tissue and the NiA/NiCP responsive areas (such as the identity of the DAMPs)? What determines if these cells at a distance undergo apoptosis or remain viable in the tissue as caspase-positive cells? And since the authors have data that indicates that the phenomenon is distinct from 'undead cells', what are the mechanisms by which these cells promote local proliferation?

    3. Reviewer #3 (Public review):

      The manuscript "Regeneration following tissue necrosis is mediated by non-apoptotic caspase activity" by Klemm et al. is an exploration of what happens to a group of cells that experience caspase activation after necrosis occurs some distance away from the cells of interest. These experiments have been conducted in the Drosophila wing imaginal disc, which has been used extensively to study the response of a developing epithelium to damage and stress. The authors revise and refine their earlier discovery of apoptosis initiated by necrosis, here showing that many of those presumed apoptotic cells do not complete apoptosis. Thus, the most interesting aspect of the paper is the characterization of a group of cells that experience mild caspase activation in response to an unknown signal, followed by some effector caspase activation and DNA damage, but that then recover from the DNA damage, avoid apoptosis, and proliferate instead.

      The authors have addressed the concerns raised, including those about drawing conclusions from RNAi knockdown without evaluating the efficacy of the knockdown, and in doing so they revised their conclusions after ascertaining that the Zfh2 RNAi was not effective.

      The authors have added quantification of the imaging data throughout, which strengthens their conclusions.

      In addition, the authors have revised some of the text describing the changes in EdU signal and added explanations of reagents such as the caspase sensors to clarify the experimental approaches, results, and interpretation of those results.

      The authors have also addressed the minor concerns and questions about the figures and text.

      A few questions remain, which the authors may choose to address.

      (1) The hh>Stat92ERNAi was assessed by the 10xSTAT-GFP reporter, as shown in Fig 2 Supp1 F. The authors point out the marked reduction in GFP in the ventral part of the hinge but do not comment on the lack of change in GFP in the dorsal part of the hinge. However, the open arrowhead in Figure 2H indicating the lack of cDcp-1 signal in the hinge in the same experiment points to the dorsal hinge, where the reporter suggests no difference in JAK-STAT signaling.

      (2) The data used to conclude that DRONC-DN and UAS-DIAP1 do not affect regenerative proliferation were normalized EdU intensities. As discussed in the prior review round, normalized EdU may not be a good comparison across experimental conditions given that the remainder of the disc may also have altered EdU incorporation, so this measurement may not be enough by itself to draw conclusions about regenerative proliferation. To strengthen the conclusion that regenerative proliferation is unaffected under these conditions, the authors may want to consider using a second measure such as adult wing size, PCNA, or quantification of mitoses via anti-phospho-histone H3 staining.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In previous work, the authors described necrosis-induced apoptosis (NiA) as a consequence of induced necrosis. Specifically, experimentally induced necrosis in the distal pouch of larval wing imaginal discs triggers NiA in the lateral pouch. In this manuscript, the authors confirmed this observation and found that while necrosis can kill all areas of the disc, NiA is limited to the pouch and to some extent to the notum, but is excluded from the hinge region. Interestingly and unexpectedly, signaling by the Jak/Stat and Wg pathways inhibits NiA. Further characterization of NiA by the authors reveals that NiA also triggers regenerative proliferation which can last up to 64 hours following necrosis induction. This regenerative response to necrosis is significantly stronger compared to discs ablated by apoptosis. Furthermore, the regenerative proliferation induced by necrosis is dependent on the apoptotic pathway because RNAi targeting the RHG genes is sufficient to block proliferation. However, NiA does not promote proliferation through the previously described apoptosis-induced proliferation (AiP) pathway, although cells at the wound edge undergo AiP. Further examination of the caspase levels in NiA cells allowed the authors to group these cells into two clusters: some cells (NiA) undergo apoptosis and are removed, while others referred to as Necrosis-induced Caspase Positive (NiCP) cells survive despite caspase activity. It is the NiCP cells that repair cellular damage including DNA damage and that promote regenerative proliferation. Caspase sensors demonstrate that both groups of cells have initiator caspase activity, while only the NiA cells contain effector caspase activity. Under certain conditions, the authors were also able to visualize effector caspase activity in NiCP cells, but the level was low, likely below the threshold for apoptosis. 
      Finally, the authors found that loss of the initiator caspase Dronc blocks regenerative proliferation, while inhibiting effector caspases by expression of p35 does not, suggesting that Dronc can induce regenerative proliferation following necrosis in a non-apoptotic manner. This last finding is very interesting as it implies that Dronc can induce proliferation in at least two ways in addition to its requirement in AiP.

      Strengths:

      This is a very interesting manuscript. The authors demonstrate that epithelial tissue that contains a significant number of necrotic cells is able to regenerate. This regenerative response is dependent on the apoptotic pathway which is induced at a distance from the necrotic cells. Although regenerative proliferation following necrosis requires the initiator caspase Dronc, Dronc does not induce a classical AiP response for this type of regenerative response. In future work, it will be very interesting to dissect this regenerative response pathway genetically.

      Weaknesses:

      No weaknesses were identified.

      We thank the reviewer for their positive evaluation and kind words.

      Reviewer #2 (Public Review):

      Summary / Strengths:

      In this manuscript, Klemm et al. build on previously published findings (Klemm et al., 2021) to characterize caspase activation in distal cells following necrotic tissue damage within the Drosophila wing imaginal disc. In Klemm et al., 2021, the authors described necrosis-induced apoptosis (NiA) following the development of a genetic system to study necrosis caused by the expression of a constitutively active GluR1 (Glutamate/Ca2+ channel), and they discovered that the appearance of NiA cells was important for promoting regeneration.

      In this manuscript, the authors aim to investigate how tissues regenerate following necrotic cell death. They find that the cells of the wing pouch are more likely to have non-autonomous caspase activation than other regions within the wing imaginal disc (hinge and notum); that two signaling pathways known to be upregulated during regeneration, Wnt (wingless) and JAK/Stat signaling, act to prevent additional NiA in pouch cells and may explain the region specificity; that the presence of NiA cells promotes regenerative proliferation in late stages of regeneration; that not all caspase-positive cells are cleared from the epithelium (these cells are then referred to as Necrosis-induced Caspase Positive (NiCP) cells); that these NiCP cells continue to live and promote proliferation in adjacent cells; and that the caspase Dronc is important for creating NiA/NiCP cells and for these cells to promote proliferation. Animals heterozygous for a Dronc null allele show a decrease in regeneration following necrotic tissue damage.

      The study has the potential to be broadly interesting due to the insights into how tissues differentially respond to necrosis as compared to apoptosis to promote regeneration.

      Weaknesses:

      However, here are some of my current concerns for the manuscript in its current version:

      The presence of cells with activated caspase that don't die (NiCP cells) is an interesting biological phenomenon but is not described until Figure 5. How does the existence of NiCP cells impact the earlier findings presented? Is late proliferation due to NiA, NiCP, or both? Do Wg and JAK/STAT signaling act to prevent the formation of both NiA and NiCP cells or only NiA cells? Moreover, the authors are able to specifically manipulate the wound edge (WE) and lateral pouch cells (LP), but don't show how these manipulations within these distinct populations impact regeneration. The authors provide evidence that driving UAS-mir(RHG) throughout the pouch, in the LP or the WE all decrease the amount of NiA/NiCP in Figure 3G-O, but no data on final regenerative outcomes for these manipulations are presented (such as those presented for Dronc-/+ in Fig 7M). The manuscript would be greatly enhanced by quantification of more of the findings, especially in describing whether the specific manipulations that impacted NiA/NiCP cells disrupt end-point regeneration phenotypes.

      We have added a line to the results to clarify that we believe the finding that some NiA likely persist as NiCP does not affect our conclusions up to this point.

      We have added a statement emphasizing the results from our first paper, which demonstrate that LP>miRHG expression reduces the overall capacity to regenerate.

      Quantification of the change in posterior NiA number has been added to Figure 2L to strengthen the evidence. Likewise, we have included quantification of the E2F time course presented in Figure 3A (Figure 3 – Figure supplement 1C), and quantification of the change in GC3Ai signal over time has been added (Figure 5 – Figure supplement 1D) to emphasize the perdurance of GC3Ai-positive NiA/NiCP.

      How long does apoptosis take within the wing disc epithelium? How many of the caspase(+) cells are present for the whole 48 hours of regeneration? Are new cells also induced to activate caspase during this time window? The authors presented a number of interesting experiments characterizing the NiCP cells. For the caspase sensor GC3Ai experiments in Figure 5, is there a way to differentiate between cells that have maintained fluorescent GC3Ai from cells that have newly activated caspase? What is the timeline for when NiA and NiCP are specified? In addition, what fraction of NiCP cells contribute to the regenerated epithelium? Additional information about the temporal dynamics of NiA and NiCP specification/commitment would be greatly appreciated.

      We have included more information concerning the kinetics of apoptotic cell removal, and how this compares to the observations we have made with NiA/NiCP in our GC3Ai experiments. Additionally, we have included a quantification of the percent of the whole wing pouch with GC3Ai signal over time (Figure 5F) as well as the distal wing pouch with GC3Ai signal over time (Figure 5 – Figure supplement 1D) to further support the idea that NiCP persist over time.

      We acknowledge that our GC3Ai time course unfortunately cannot confirm whether the increase in GC3Ai signal over time is due to cells with new caspase activity or proliferating NiCP and have included this point in the discussion.

      We attempted to track the lineage of NiA/NiCP into the pupal and adult wings with CasExpress and DBS; however, the results of these experiments were inconsistent, and therefore we did not feel confident to include these data or draw conclusions in either direction. We are currently designing variations of these lineage-tracing tools in order to better track the lineage of these cells, which we hope to include in a future paper.

      The notum also does not express developmental JAK/STAT, yet little NiA was observed within the notum. Do the authors have any additional insights into the differential response between the pouch and notum? What makes the pouch unique? Are NiA/NiCP cells created within other imaginal discs and other tissues? Are they similarly important for regenerative responses in other contexts?

      We have added a brief mention of these points to the appropriate results section to avoid further increasing the length of the discussion.

      Data on the necrosis of other imaginal discs through FLP/FRT clone formation in haltere and leg discs have been added to Figure 1 – Figure supplement 1J, and described in the text.

      Reviewer #3 (Public Review):

      The manuscript "Regeneration following tissue necrosis is mediated by non-apoptotic caspase activity" by Klemm et al. is an exploration of what happens to a group of cells that experience caspase activation after necrosis occurs some distance away from the cells of interest. These experiments have been conducted in the Drosophila wing imaginal disc, which has been used extensively to study the response of a developing epithelium to damage and stress. The authors revise and refine their earlier discovery of apoptosis initiated by necrosis, here showing that many of those presumed apoptotic cells do not complete apoptosis. Thus, the most interesting aspect of the paper is the characterization of a group of cells that experience mild caspase activation in response to an unknown signal, followed by some effector caspase activation and DNA damage, but that then recover from the DNA damage, avoid apoptosis, and proliferate instead. Many questions remain unanswered, including the signal that stimulates the mild caspase activation, and the mechanism through which this activation stimulates enhanced proliferation.

      The authors should consider answering additional questions, clarifying some points, and making some minor corrections:

      Major concerns affecting the interpretation of experimental results:

      Expression of STAT92E RNAi had no apparent effect on the ability of hinge cells to undergo NiA, leading the authors to conclude that other protective signals must exist. However, the authors have not shown that this STAT92E RNAi is capable of eliminating JAK/STAT signaling in the hinge under these experimental conditions. Using a reporter for JAK/STAT signaling, such as the STAT-GFP, as a readout would confirm the reduction or elimination of signaling. This confirmation would be necessary to support the negative result as presented.

      We have included data demonstrating our ability to knock down JAK/STAT activity in the hinge with UAS-Stat92E<sup>RNAi</sup> (Figure 2 – Figure supplement 1E and F). Additionally, we have included a quantification of posterior NiA/NiCP with the Stat92E<sup>RNAi</sup> (as well as wg<sup>RNAi</sup> and Zfh-2<sup>RNAi</sup>, Figure 2L) to strengthen our conclusion that JAK/STAT and WNT signaling act to regulate NiA formation within the pouch.

      Similarly, the authors should confirm that the Zfh2 RNAi is reducing or eliminating Zfh2 levels in the hinge under these experimental conditions, before concluding that Zfh2 does not play a role in stopping hinge cells from undergoing NiA.

      We have repeated this experiment with a longer knockdown using a GAL4 driver that expresses from early larval stages until our evaluation at L3, but were unable to demonstrate a loss of Zfh-2 with IF labeling. Additionally, we have quantified posterior NiA/NiCP with a Zfh-2RNAi (Figure 2L) and do find a slight increase in NiA/NiCP number; however, this change is not significant. We have altered our conclusions to reflect these new data.

      EdU incorporation was quantified by measuring the fluorescence intensity of the pouch and normalizing it to the fluorescence intensity of the whole disc. However, the images show that EdU fluorescence intensity of other regions of the disc, especially the notum, varied substantially when comparing the different genetic backgrounds (for example, note the substantially reduced EdU in the notum of Figure 3 B' and B'). Indeed, it has been shown that tissue damage can lead to suppression of proliferation in the notum and elsewhere in the disc, unless the signaling that induces the suppression is altered. Therefore, the normalization may be skewing the results because the notum EdU is not consistent across samples, possibly because the damage-induced suppression of proliferation in the notum is different across the different genetic backgrounds.

      To more accurately reflect the observations that we have made with the EdU assay, we have changed our terminology to indicate that the EdU signal is more localized to the damaged tissue in ablated discs, thus taking into account the relative changes across the disc, rather than referring to it as an increase in the pouch. To further strengthen our observation that damage results in a localized proliferation, we have included a quantification of the E2F time course presented in Figure 3A (Figure 3 – Figure supplement 1C), which underscores the trend observed in our EdU experiments.
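      The normalization concern discussed above can be illustrated with a small numerical sketch. It assumes the whole-disc intensity is an area-weighted average of the pouch and the rest of the disc; the function name, the intensity values, and the pouch area fraction are all hypothetical and chosen only for illustration, not taken from the manuscript.

```python
def normalized_pouch_signal(pouch_mean, other_mean, pouch_fraction=0.3):
    """Pouch intensity divided by whole-disc intensity.

    The whole-disc mean is modeled as an area-weighted average of the
    pouch and the remainder of the disc (hypothetical simplification).
    """
    if not 0.0 < pouch_fraction < 1.0:
        raise ValueError("pouch_fraction must be between 0 and 1")
    whole_disc_mean = (pouch_fraction * pouch_mean
                       + (1.0 - pouch_fraction) * other_mean)
    return pouch_mean / whole_disc_mean


# Hypothetical case 1: pouch and rest of disc incorporate EdU equally.
uniform = normalized_pouch_signal(pouch_mean=100.0, other_mean=100.0)

# Hypothetical case 2: identical pouch signal, but proliferation in the
# rest of the disc (e.g. the notum) is suppressed to half.
suppressed_elsewhere = normalized_pouch_signal(pouch_mean=100.0, other_mean=50.0)

print(f"uniform disc:         {uniform:.2f}")
print(f"suppressed elsewhere: {suppressed_elsewhere:.2f}")
```

      In this toy model the normalized value rises in the second case even though the pouch signal itself is unchanged, which is why a ratio of this kind is better read as a measure of how localized the signal is than as an absolute increase in the pouch.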

      The authors expressed p35 to attempt to generate "undead cells". They take an absence of mitogen secretion or increased proliferation as evidence that undead cells were not generated. However, there could be undead cells that do not stimulate proliferation non-autonomously, which could be detected by the persistence of caspase activity in cells that do not complete apoptosis. Indeed, expressing p35 and observing sustained effector caspase activation could help answer the later question of what percentage of this cell population would otherwise complete apoptosis (NiA, rescued by p35) vs reverse course and proliferate (NiCP, unaffected by p35).

      In our previous work, we showed that P35 expression impairs our ability to detect effector caspases with IF-based tools. This can also be seen in Figure 4 of this work (Figure 4C and F). Given that P35 expression precludes our ability to label and assay effector caspase activity visually, and thus address the concerns outlined above, we relied on other tools such as reporters of AiP mitogens (wg-lacZ & dpp-lacZ) to assay whether NiA participate in AiP. As a functional readout, we also paired P35 expression with the EdU assay to test whether proliferation was altered by the presence of undead cells. The results discussed in Figure 4 lead us to conclude that NiA likely do not participate in the canonical AiP feedforward loop, although it is possible that these experiments generate another type of undead cell – one that utilizes a different mechanism to promote proliferation.

      It is unclear if the authors' model is that the NiCP cells lead to autonomous or non-autonomous cell proliferation, or both. Could the lineage-tracing experiments and/or the experiments marking mitosis relative to caspase activity answer this question?

      We have added further details to the discussion on the potential for NiA/NiCP to induce cell autonomous/non-autonomous proliferation.

      Many of the conclusions rely on single images. Quantification of many samples should be included wherever possible.

      We have added quantification to strengthen the results of Figures 2, 3 and 5.

      Why does the reduction of Dronc appear to affect regenerative growth in females but not males?

      We have repeated these regeneration scoring experiments and increased the N for control versus droncI29 mutant males; however, the difference in male wing size remains non-significant, although the general trend is that droncI29 wings are slightly smaller. While there could be sex-specific differences in the capacity to regenerate that contribute to this observation, it is unclear what the underlying mechanism could be.

      Reviewer #1 (Recommendations for the authors):

      The work in this paper is already very complete and very well worked out. The conclusions are well supported by the data in this manuscript. I do not have any experimental requests, only a few minor and formal requests/questions.

      (1) Why does Diap1 overexpression not affect regenerative proliferation, whereas mir(RHG) and dronc[I29] do, given that Diap1 acts between RHG and Dronc?

      We speculate on this point in the discussion section but have adjusted some of the phrasing for clarity.

      (2) I assume that the authors used the cleaved Dcp-1 antibody from Cell Signaling Technologies. I recommend that the authors refer to this antibody as cDcp-1 in text and figures as this antibody specifically detects the cleaved, and thus activated form of Dcp-1, and not the uncleaved, inactive form of Dcp-1 which has a uniform expression in the discs.

      Changed to cDcp-1.

      (3) Line 299: Hay et al. 1994 did not show that p35 inhibits Drice and Dcp-1 (in fact, both genes were not even cloned yet). This was shown by Meier et al. 2000 and Hawkins et al. 2000. Please correct references.

      Corrected.

      (4) Line 574/575. Meier et al. 2000 did not show that Dronc is mono-ubiquitylated. This was shown by Kamber-Kaya et al., 2017. Please correct.

      Corrected.

      Reviewer #2 (Recommendations for the authors):

      (1) Does domeless knockdown cause apoptosis without tissue ablation (Figures 2C-E)? Currently, the non-ablation control is not shown.

      Domeless knockdown does not cause apoptosis in the absence of ablation (Added Figure 2 – Figure supplement 1A).

      (2) The supplemental experiment with zfh2-RNAi is hard to interpret because there is no evidence of RNAi knockdown based on the staining with the anti-Zfh2 antibody.

      As noted above, a longer zfh-2 knockdown does not appear to alter Zfh-2 protein levels. A quantification of posterior NiA/NiCP following knockdown shows a slight (non-significant) increase in posterior NiA/NiCP. Considering these new results, we have altered our interpretation within the appropriate results and discussion sections.

      (3) The authors should consider adding a diagram showing where mir(RHG) and DIAP1 are in the apoptotic/caspase activation pathway (Figure 7N).

      Completed, Figure 7N and 7O.

      Reviewer #3 (Recommendations for the authors):

      (1) Figure 2 I -The purported increase in NiA should be quantitated relative to the NiA in G across many discs.

      Completed (Figure 2L)

      (2) Figure 2 M - contrary to the conclusion drawn, the posterior Dcp1 does not appear different from that in the control (K). This conclusion that the NiA does not occur in the margin could be better supported with more images/quantification.

      We have exchanged the image for a representative one that more clearly shows the lack of margin NiA and highlighted with an arrowhead (Figure 2K)

      (3) Figure 2 supp 1 E - the "slight increase" in NiA in the pouch is relative to which control? Can this conclusion be supported by quantification?

      Figure 2L now quantifies this change.

      (4) Figure 2 Supp 1 D, E - these discs supposedly have Zfh2 RNAi expressed, but there appears to be no reduction in Zfh2.

      We were unable to demonstrate a reduction of Zfh2, even with a longer knockdown. Considering these new data, we have altered our conclusions from the Zfh2 experiments.

      (5) Figure 2 Supp 1 I - please quantitate the Dcp-1 across many discs to support the conclusion.

      This is the UAS-wg experiment, which we decided to remove from the quantification given the non-specific increase in cDcp-1 throughout the disc (likely a result of ectopic Wg expression).

      (6) Figure 4 legend M - The authors conclude that the experiment indicates that "NiA promote proliferation independent of AiP". It would be more precise to say that NiA cells do not secrete AiP mitogens and do not increase the proliferation of surrounding cells when prevented from completing apoptosis. To say that the NiA-induced proliferation does not require AiP would require eliminating AiP, perhaps through reaper hid grim knockdown or mitogen knockdown.

      Corrected.

      Minor concerns and clarification needed:

      (7) Line 61 - consider the distinction between a feed-forward loop and a positive feedback loop.

      Corrected.

      (8) Line 338 - it would be helpful to have a brief explanation of what the GC3Ai consists of and how it reports caspase activity.

      Corrected.

      (9) Line 343 - the authors should clarify what they mean when they state GC3Ai-positive cells are "associated with" mitotic cells. Are the GC3Ai cells undergoing mitosis? Or is the increase in mitosis non-autonomous?

      Adjusted to “associated with adjacent proliferative cells”.

      (10) Lines 392-394 - the authors should add brief descriptions of how the Drice-Based sensor and the CasExpress function, so the readers can better understand the distinctions between these sensors and the previously mentioned sensors (anti-Dcp1 and GC3Ai). In addition, please clarify how the Gal80ts modulates the sensitivity of the CasExpress.

      Descriptions of DBS and CasExpress and additional clarification provided.

      (11) Line 413: How does Gal80ts suppress the background developmental caspase signal, and how does this suppression lead to NiCP cells expressing GFP?

      This section has been reworded to clarify.

      (12) Line 417 - which GFP label is referred to here?

      This section has been reworded to clarify.

      (13) Line 445 is the first mention of the CARD domain - it could be introduced more fully and explained why the DroncDN's lack of effect on proliferation excludes the CARD domain as being important.

      Clarified. See also the discussion for the significance of the CARD domain as dispensable for regenerative proliferation following necrosis.

      (14) Line 452 - "As mentioned" - the manuscript has not previously mentioned DIAP1 modification of the CARD domain and what that modification does. Perhaps the previous explanatory text was inadvertently removed?

      Corrected.

      (15) The Discussion is a lengthy list of experiments that the authors did not do or observations they were unable to make. This section could benefit from a more in-depth discussion of necrosis and the possibility that NiCP cells contribute to repair after injury across contexts and species.

      We have made several changes to the discussion that elaborate on some of the points listed in the public reviews.

      (16) All figures: Consider making single-channel panels grayscale to aid visualization. Also consider using color combinations that can be distinguished by color-blind readers.

      We appreciate these suggestions and will consider them for future manuscripts.

      (17) All figure legends - are error bars SD or SEM?

      Standard deviation. Added to appropriate legends.

      (18) Figure 1A,C - it would be helpful in the diagrams to note when the necrosis occurs/completes.

      The endpoint of necrosis is not well defined, given the simultaneous changes that occur with regeneration. Thus, we opted to not include an indicator of when necrotic ablation ends.

      (19) Figure 1B - it would be helpful to name the GAL4 drivers whose expression domain is depicted to correlate with the terms used in the text.

      Completed.

      (20) Figure 1 legend- what do the different colors of the arrowheads denote? The dotted lines are in R' and S', not N' and O'.

      Completed.

      (21) Figure 2G - the yellow dashed line is not in the same place in the two images.

      Corrected.

      (22) Figure 2I - what is the open arrowhead?

      Completed (Figure 2I legend).

      (23) Figure 3 legend - please describe what the time course is observing (EdU).

      Completed.

      (24) Figure 4 - please include the yellow boxes in the Dcp-1 channels.

      Completed.

      (25) Figure 5 F' - add the arrowheads to all the panels. The yellow arrowhead appears to be pointing to nothing.

      Completed.

      (27) Figure 5 legend - what is a "cytoplasmic undisturbed cell"? What is the arrowhead in G? J and J' should show the same view at different time points or different views at the same time point.

      Figure legend has been corrected.

      (28) Figure 5 Supp 1 would be especially helped by having more single-channel panels in grayscale.

      For clarity and consistency, we chose to maintain the different color channels.

      (29) Figure 5 Supp 1 D and E - It would be helpful to have higher magnification and arrows pointing to the cells of interest. Why are there TUNEL+ cells that do not have caspase activation (green)?

      We have added arrowheads as suggested. We believe the disparity between the TUNEL and GC3Ai signals is a result of the different sensitivities of the IF staining and the TUNEL assay.

      (30) Figure 5 Supp 1 F - perhaps the arrowheads should be in all panels - they point to empty spaces with no H2Av staining in the final panel. Perhaps a higher magnification image would make the "strong overlap" of the two signals more apparent?

      We have added arrowheads where appropriate.

      (31) Figure 6 D-E - does the widespread GFP lineage tracing signal suggest that most cells in the repaired tissue originated from cells that once had caspase activity?

      Possibly; however, given that CasExpress leads to significant developmental labeling, we were unable to determine to what extent the signal in this experiment comes from NiA/NiCP activity versus developmental labeling. Note that tubGAL80ts is not present in this experiment.

      (32) Writing corrections:

      Line 343 "positive" is misspelled.

      Completed

      Line 429 - a word may be missing.

      Completed

      Line 639 - the word "day" may be missing.

      Completed

      Line 658 - what temperature was the recovery?

      Completed

      Lines 706-708 - were the discs incubated in 55 mL and 65 mL of liquid, or a smaller volume?

      Completed

    1. eLife Assessment

      This manuscript establishes a mathematical model to estimate the key parameters that control the repopulation of planarian stem cells after sublethal irradiation as they undergo fate-switching as part of their differentiation and self-renewal process. The findings are valuable for future investigation of stem cell division in planarians. The methods are solid, integrating modeling with perturbations of key transcription factors known to be critical for cell fate decisions, but the authors have only shown that this is the case for a small number of stem cell types.