  1. Sep 2024
    1. Reviewer #1 (Public Review):

      This is a technically sound paper focused on a useful resource around the DRGP phenotypes which the authors have curated, pooled, and provided a user-friendly website. This is aimed to be a crowd-sourced resource for this in the future. The authors should make sure they coordinate as well as possible with the NC datasets and community and broader fly community.

    2. Reviewer #2 (Public Review):

      In the present study, Gardeux et al provide a web-based tool for curated association mapping results from DRP studies. The tool lets users view association results for phenotypes and compare mean phenotype ~ phenotype correlations between studies. In the manuscript, the authors provide several example utilities associated with this new resource, including pan-study summary statistics for sex, traits, and loci. They highlight cross-trait correlations by comparing studies focused on longevity with phenotypes such as oxphos and activity.

      Strengths:

      -Considerable efforts were dedicated toward curating the many DRG studies provided.

      -Available tools to query large DRP studies are sparse and so new tools present appeal

      Weaknesses:

      The creation of a tool to query these studies for a more detailed understanding of physiologic outcomes seems underdeveloped. These could be improved by enabling usages such as more comprehensive queries of meta-analyses, molecular information to investigate given genes or pathways, and links to other information such as in mouse rat or human associations.

    3. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers for their positive and constructive comments on the manuscript.

      We committed in our original rebuttal letter to implement the following revisions to both DGRPool and the corresponding manuscript to address the reviewers’ comments:

      (1) We agree with reviewer #1 that normalizing the data could potentially improve the GWAS results. Thus, for computing the GWAS results, we are now using these two additional options in PLINK2: “--quantile-normalize --variance-standardize”. We assessed the impact of these options on the overall results, which revealed only minor improvements of the results, globally being a bit more stringent. In this direction, we also now filter the top results with a nominal p-value of 0.001 instead of 0.01, also because it provided better results for the new gene set enrichment step.

      (2) We added a Kruskal-Wallis test next to the ANOVA test for assessing the links between the phenotypes and the 6 known covariates, as well as a Shapiro-Wilk test of normality.

      (3) We agree with both reviewers that gene expression information is of interest. As mentioned before, adding gene expression data to the portal would have required extensive work, beyond the current scope of this paper, which primarily focuses on phenotypes and genotype-phenotype associations. Nonetheless, we included more gene-level outlinks to Flybase. Additionally, we now link variants and genes to Flybase's online genome browser, JBrowse. By following the reviewers' suggestions, we aim to guide DGRPool users to potentially informative genes.

      (4) Consistent with the latter point, and in agreement with reviewer #2, we acknowledge that additional tools could enhance DGRPool's functionality and facilitate meta-analyses for users. Therefore, we developed a gene-centric tool that now allows users to query the database based on gene names. Moreover, we integrated ortholog databases into the GWAS results. This feature will enable users to extend Drosophila gene associations to other species if necessary.

      (5) We amended the manuscript to describe all the new tools and features that were developed and implemented. In short, the new features include a new gene-centric page with diverse links (Phenotypes, Genome Browser JBrowse, Orthologs …), a variant-centric page (variant details, and PheWAS), an API for programmatic access to the database, and other statistical outputs and filtering options.

      We will detail these advances in the point-by-point response below and in the revised manuscript.

      Reviewer #1 (Public Review):

      This is a technically sound paper focused on a useful resource around the DRGP phenotypes which the authors have curated, pooled, and provided a user-friendly website. This is aimed to be a crowd-sourced resource for this in the future.

      The authors should make sure they coordinate as well as possible with the NC datasets and community and broader fly community. It looks reasonable to me but I am not from that community.

      We thank the reviewer for the positive comments. We will leverage our connections to the fly and DGRP communities to make the resource as valuable as possible. DGRPool in fact already reflects the input of many potential users and was also inspired by key tools on the DGRP2 website. This is also why we are bridging our results with other resources, such as linking out to Flybase, which is the main resource for the Drosophila community at large.

      I have only one major concern which in a more traditional review setting I would be flagging to the editor to insist the authors did on resubmission. I also have some scene setting and coordination suggestions and some minor textual / analysis considerations.

      The major concern is that the authors do not comment on the distribution of the phenotypes; it is assumed it is a continuous metric and well-behaved - broad gaussian. This is likely to be more true of means and medians per line than individual measurements, but not guaranteed, and there could easily be categorical data in the future. The application of ANOVA tests (of the "covariates") is for example fragile for this.

      The simplest recommendation is in the interface to ensure there is an inverse normalisation (rank and then project on a gaussian) function, and also to comment on this for the existing phenotypes in the analysis (presumably the authors are happy). An alternative is to offer a kruskal test (almost the same thing) on covariates, but note PLINK will also work most robustly on a normalised dataset.

      We thank the reviewer for raising this interesting point. Indeed, we did not comment on the distribution of individual phenotypes due to the underlying variability from one phenotype to another, as suggested by the reviewer. Some distributions appear normal, while others are clearly not normally distributed. This information is 'visible' to users by clicking on any phenotype; DGRPool automatically displays its global distribution if the values are continuous/quantitative. Now, we also provide a Shapiro-Wilk test to assess the normality of the distribution.

      We acknowledge the reviewer's concerns regarding the use of ANOVA tests. However, we want to point out that the ANOVA test is solely conducted to assess whether any of the well-established inversions or symbiont infection status (that, for simplification, we call “covariates” or “known covariates”) are associated with the phenotype of interest. This is merely informational, to help the user understand if their phenotype of interest is associated with a known covariate. But all of these known covariates are put in the model in any case, so PLINK2 will automatically correct for them, whatever the output of the ANOVA test.

      Still, we amended the manuscript to better explain this, and we added a Kruskal-Wallis test (in addition to the ANOVA test) in the results, so the users can have a better overview of potentially associated known covariates. We added this text on p. 10 of the revised manuscript:

      “The tool further runs a gene set enrichment analysis of the results filtered at p<0.001 to enrich the associated genes to gene ontology terms, and Flybase phenotypes. We also provide an ANOVA and a Kruskal-Wallis test between the phenotype and the six known covariates to uncover potential confounder effects (prior correction), which is displayed as a “warning” table to inform the user about potential associations of the phenotype and any of the six known covariates. It is important to note that these ANOVA and Kruskal tests are conducted for informational purposes only, to assess potential associations between well-established inversions or symbiont infection status and the phenotype of interest. However, all known covariates are included in the model regardless, and PLINK2 will automatically correct for them, irrespective of the results from the ANOVA or Kruskal tests.”

      We also acknowledge in the manuscript (Methods section) that the Kruskal-Wallis test is used for a single factor (independent variable) at a time. This is unlike the ANOVA test that we initially performed, which handled multiple factors simultaneously (given that it was performed in a multifactorial design). For a more direct comparison with our ANOVA model, we ran separate Kruskal-Wallis tests for each factor, and we acknowledge the potential limitations of this approach compared to our multifactorial ANOVA, since each of these tests treats the factor in question as the only source of variation, not considering other factors. But since the test is not intended for interactions or combined effects of these factors, we deem it to be sufficient.
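
      The one-factor-at-a-time design described above can be sketched as follows. This is a minimal illustration and not the DGRPool implementation: the function names and the covariate labels are hypothetical, and only the tie-corrected H statistic and its degrees of freedom are computed (a p-value would come from a chi-square distribution with those degrees of freedom).

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_wallis_h(values, groups):
    """Tie-corrected Kruskal-Wallis H statistic for a single factor."""
    n = len(values)
    if n < 2:
        return math.nan
    ranks = average_ranks(values)
    rank_sums = defaultdict(float)
    sizes = defaultdict(int)
    for rank, group in zip(ranks, groups):
        rank_sums[group] += rank
        sizes[group] += 1
    h = 12 / (n * (n + 1)) * sum(rank_sums[g] ** 2 / sizes[g] for g in sizes)
    h -= 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    tie_counts = defaultdict(int)
    for value in values:
        tie_counts[value] += 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    return h / correction if correction > 0 else math.nan


def kruskal_per_factor(values, covariates):
    """Run one independent test per covariate; other factors are ignored."""
    results = {}
    for name, groups in covariates.items():
        df = len(set(groups)) - 1
        results[name] = (kruskal_wallis_h(values, groups), df)
    return results


# Hypothetical per-line phenotype means and two illustrative covariates.
phenotype = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
covariates = {
    "inversion_status": ["ST", "ST", "ST", "INV", "INV", "INV"],
    "symbiont_status": ["y", "n", "y", "n", "y", "n"],
}
print(kruskal_per_factor(phenotype, covariates))
```

      Because each call sees only one set of group labels, the sketch makes the stated limitation concrete: variation explained by the other covariates is not accounted for in any single test.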

      Nevertheless, we concur with the reviewer that normalizing the data could potentially enhance GWAS results. Consequently, we have rerun the GWAS analyses using the PLINK2 --quantile-normalize and --variance-standardize options. We have updated all results on the website and also updated the plots in the manuscript, accordingly.
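
      For readers unfamiliar with the "rank and then project on a gaussian" idea raised by the reviewer, a minimal sketch follows. It is not PLINK2's code: the (rank - 0.5) / n offset and the use of the sample standard deviation are assumptions made for illustration, and the function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def rank_inverse_normal(values):
    """Map values to standard-normal quantiles of their (average) ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Tied values receive the mean of the ranks they span.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    std_normal = NormalDist()
    # Assumed offset: (rank - 0.5) / n keeps quantiles strictly inside (0, 1).
    return [std_normal.inv_cdf((rank - 0.5) / n) for rank in ranks]


def variance_standardize(values):
    """Linearly rescale values to mean 0 and (sample) standard deviation 1."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    centre, spread = mean(values), stdev(values)
    if spread == 0:
        raise ValueError("values have zero variance")
    return [(v - centre) / spread for v in values]


# Hypothetical skewed phenotype: one line mean is far from the others.
skewed = [0.1, 0.2, 0.3, 0.4, 25.0]
print(rank_inverse_normal(skewed))
print(variance_standardize(skewed))
```

      The rank-based transform depends only on the ordering of the values, so the outlying value of 25.0 ends up no further from the centre than the largest quantile allows, whereas the linear rescaling preserves the skew.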

      Minor points:

      On the introduction, I think the authors would find the extensive set of human GWAS/PheWAS resources useful; widespread examples include the GWAS Catalog, Open Targets PheWAS, MR-base, and the FinnGen portal. The GWAS Catalog also has summary statistics submission guidelines, and I think where possible meta-data harmonisation should be similar (not a big thing). Of course, DRGP has a very different structure (line and individuals) and of course, raw data can be freely shown, so this is not a one-to-one mapping.

      Thank you for the suggestion. We cited these resources in the Introduction.

      “This aligns with the harmonization effort undertaken by other human GWAS/PheWAS resources, such as the GWAS Catalog, Open Targets PheWAS, MR-base, and the FinnGen portal, which provide extensive examples of effective data use and accessibility. Although the structure of DGRPool differs from these human databases, we acknowledge the importance of similar meta-data harmonization guidelines. Inspired by the GWAS Catalog's summary statistics submission guidelines, we propose submission guidelines for DGRP phenotyping data in this paper.”

      For some authors coming from a human genetics background, they will be interpreting correlations of phenotypes more in the genetic variant space (eg LD score regression), rather than a more straightforward correlation between DRGP lines of different individuals. I would encourage explaining this difference somewhere.

      We understand that this is a potential issue and we made the distinction clearer in the manuscript to avoid any confusion. We added this text on p.7, at the beginning of the correlation results section:

      “Of note, by “phenotype correlations”, we mean direct phenotype-phenotype correlations, i.e. a straightforward Spearman’s correlation of two phenotypes between common DGRP lines, and we repeated this process for each pair of phenotypes.”
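
      The correlation described in this passage can be sketched as below. The sketch assumes each phenotype is available as a mapping from line identifier to a per-line mean; the identifiers, the minimum of three shared lines, and the function names are hypothetical and are not taken from DGRPool.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns None if either input is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def spearman_on_common_lines(pheno_a, pheno_b, min_lines=3):
    """Spearman's rho restricted to the lines measured in both phenotypes."""
    common = sorted(set(pheno_a) & set(pheno_b))
    if len(common) < min_lines:
        return None  # too few shared lines to say anything
    ranks_a = average_ranks([pheno_a[line] for line in common])
    ranks_b = average_ranks([pheno_b[line] for line in common])
    return pearson(ranks_a, ranks_b)


# Hypothetical per-line means from two studies with partly overlapping lines.
longevity = {"line_021": 1.0, "line_026": 2.0, "line_028": 3.0, "line_031": 4.0}
activity = {"line_026": 10.0, "line_028": 20.0, "line_031": 15.0, "line_900": 1.0}
print(spearman_on_common_lines(longevity, activity))
```

      Repeating the call for every pair of phenotypes gives the pairwise matrix; the number of shared lines differs from pair to pair, which is one reason to record it alongside each coefficient.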

      This leads to an interesting point that the inbred nature of the DRGP allows for both traditional genetic approaches and leveraging the inbred replication; there is something about looking at phenotype correlations through both these lenses, but this is for another paper I suspect that this harmonised pool of data can help.

      We agree with the reviewer and hope that more meta-analyses will be made possible by leveraging the harmonized data that are made available through DGRPool.

      I was surprised the authors did not crunch the number of transcript/gene expression phenotypes and have them in. Is this because this was better done in other datasets? Or too big and annoying on normalisation? I'd explain the rationale to leave these out.

      This is a very good point and is in fact something that we initially wanted to do. However, to render the analysis fair and robust, it would require processing all datasets in the same way. This implies cataloging all existing datasets and processing them through the same pipeline. In addition, it would require adding a “cell type” or “tissue” layer, because gene expression data from whole flies is obviously not directly comparable to gene expression data from specific tissues or even specific conditions. This would be key information as phenotypes are often tissue-dependent. Consequently, and as implied by the reviewer, we deemed this too big of a challenge beyond the scope of the current paper. Nevertheless, we plan to continue investigating this avenue in a potential follow-up paper.

      We still added a gene-centric tool to be able to query the GWAS results by gene. We also added orthologs and Flybase gene-phenotype information, both in this new gene-centric tool and also in all GWAS results.

      I think 25% FDR is dangerously close to "random chance of being wrong". I'd just redo this section at a higher FDR, even if it makes the results less 'exciting'. This is not the point of the paper anyway.

      We agree with the reviewer that this threshold implies a higher risk of false positive results. However, this is not an uncommonly used threshold (Li et al., PLoS Biology, 2008; Bevers et al., Nature Metabolism, 2019; Hwangbo et al., eLife, 2023), and one that seems robust enough in our analysis since similar phenotypes are significant in different studies at different FDR thresholds.

      Nevertheless, we revisited these results with a more stringent threshold of 5% FDR in the main Figure 3C. Most of the conclusions were maintained, except for the relation between longevity and “food intake”, as well as “sleep duration”. We modified the manuscript accordingly, notably removing these points from the abstract, and toning down the results section. We kept the 25% FDR results as supplemental information.
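
      To illustrate how the choice between a 5% and a 25% FDR threshold changes which results are retained, here is a minimal sketch. The response does not state which FDR procedure was used; the Benjamini-Hochberg adjustment below is an assumption, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant_at(p_values, fdr):
    """Indices of tests whose adjusted p-value is at or below the FDR level."""
    return [i for i, q in enumerate(benjamini_hochberg(p_values)) if q <= fdr]


# Invented nominal p-values for four phenotype-phenotype correlations.
nominal = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(nominal))
print(significant_at(nominal, 0.05))
print(significant_at(nominal, 0.25))
```

      With these invented inputs a single correlation passes at 5% while all four pass at 25%, which is the kind of difference the reviewer's comment is concerned with.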

      I didn't buy the extreme line piece as being informative. Something has to be on the top and bottom of the ranks; the phenotypes are an opportunity for collection and probably have known (as you show) and cryptic correlations. I think you don't need this section at all for the paper and worry it gives an idea of "super normals" or "true wild types" which ... I just don't think is helpful.

      We appreciate the reviewer’s feedback on the section regarding extreme DGRP lines and understand the concern about potential implications of “super normals” or “true wild types.” This section aimed to explore whether specific DGRP lines consistently rank in the extremes of phenotypic measures, particularly those tied to viability-related traits. Our hypothesis was that if particular lines consistently appear at the top or bottom, this might suggest some inherent bias or inbreeding-related weakness that could influence genetic association studies.

      However, as per the analyses presented, we did not discover support for this phenomenon. Importantly, the observed mild correlation in extremeness across sexes, while not profound, further suggested that this phenomenon is not a consistent population-wide feature.

      Nevertheless, we consider that this message is still important to convey. In response to the reviewer's feedback, we have provided a clearer conclusion of this paper section by adding the following paragraph:

      “In conclusion, this analysis showed that while certain lines exhibit lower longevity or outlier behavior for specific traits, we found no evidence of a general pattern of extremeness across all traits. Therefore, the data do not support the idea of 'super normals' or any other inherently biased lines that could significantly affect genetic studies.”

      I'd say "well-established inversion genotypes and symbiot levels" rather than generic covariates. Covariates could mean anything. You have specific "covariates" which might actually be the causal thing.

      We thank the reviewer for the suggestion. We agree and modified the manuscript accordingly.

      I wouldn't use the adjective tedious about curation. It's a bit of a value judgement and probably places the role of curation in the wrong way. Time-consuming due to lack of standards and best practice?

      We thank the reviewer for the suggestion. We agree and modified the manuscript accordingly, replacing the occurrences with “thorough” and “rigorous”, which correspond better to the initially intended meaning.

      Reviewer #2 (Public Review):

      Summary:

      In the present study, Gardeux et al provide a web-based tool for curated association mapping results from DRP studies. The tool lets users view association results for phenotypes and compare mean phenotype ~ phenotype correlations between studies. In the manuscript, the authors provide several example utilities associated with this new resource, including pan-study summary statistics for sex, traits, and loci. They highlight cross-trait correlations by comparing studies focused on longevity with phenotypes such as oxphos and activity.

      Strengths:

      -Considerable efforts were dedicated toward curating the many DRG studies provided.

      -Available tools to query large DRP studies are sparse and so new tools present appeal

      Weaknesses:

      The creation of a tool to query these studies for a more detailed understanding of physiologic outcomes seems underdeveloped. These could be improved by enabling usages such as more comprehensive queries of meta-analyses, molecular information to investigate given genes or pathways, and links to other information such as in mouse rat or human associations.

      We appreciate the reviewer's kind comments.

      Regarding the tools, we concur with the reviewer that incorporating additional tools could enhance DGRPool and facilitate users in conducting meta-analyses. Therefore, we developed two new tools: a gene-centric tool that enables users to query the database based on gene names, and a variant-centric tool mostly for studying the impact of specific genomic loci on phenotypes. Additionally, in all GWAS results, we added links to ortholog databases, thereby allowing users to extend fly gene associations to other species, if required.

      Furthermore, we added links to the Flybase database, for variants, phenotypes, and genes that are already present in Flybase. We also link out to a 'genome browser-like' view (Flybase’s JBrowse tool) of the GWAS results centered around the affected variants/genes.

      Finally, we now also perform a gene-set enrichment analysis for each GWAS result, both in the Flybase gene-phenotype database and the Gene Ontology (GO) database.

      Reviewer #2 (Recommendations For The Authors):

      (1) The authors discuss how current available DRG databases are basically data-dump sites and there is a need for integrative queries. Clearly, they spent (and are spending) considerable efforts into curating associations from available studies so the current resource seems to contain several areas of missed opportunities. The most clear addition would be to integrate gene-level queries. For example which genes underlie associations to given traits, what other traits map to a specific gene, or multiple genes which map to traits. This absence of integration is somewhat surprising given the lab's previous analyses of eQTL data in DRPs (https://doi.org/10.1371/journal.pgen.1003055 ) and readily available additional data (ex. 10.1101/gr.257592.119 ,flybase) simple intersections between these at the locus level would provide much deeper molecular support for searching this database.

      The point raised by the reviewer concerning eQTL / transcriptomic data is in fact similar to the one raised by reviewer #1. We strongly agree with both reviewers that incorporating eQTL results in the tool would be very valuable, and this is in fact something that we initially wanted to do. However, to render the analysis fair and robust, it would require re-processing multiple public datasets in the same way. This would imply cataloging all existing datasets and processing them through the same pipeline. In addition, it would require adding a “cell type” or “tissue” layer, because gene expression data from whole flies is obviously not directly comparable to gene expression data from specific tissues or even specific conditions. This would be key information as phenotypes are often tissue-dependent. Consequently, we deemed implementing all these layers too big of a challenge beyond the scope of the current paper, but we plan to continue investigating this avenue in a potential follow-up paper.

      As mentioned before, we still integrated gene-level queries in a new tool, querying genes in the context of GWAS results. We acknowledge that this is not directly related to gene expression, and thus not implicating eQTL datasets (at least for now), but we think that it is for now a good alternative, reinforcing the interpretation of the GWAS results.

      Since this point was raised by both reviewers, we added a discussion about this in the manuscript.

      “We recognize certain limitations of the current web tool, particularly the lack of eQTL or gene expression data integration. Properly integrating DGRP GWAS results with gene expression data in a fair and robust manner would require uniform processing of multiple public datasets, necessitating the cataloging and standardization of all available datasets through a consistent pipeline. Moreover, incorporating a “cell type” or “tissue” layer would be essential, as gene expression data from whole flies is not directly comparable to data from specific tissues or even specific conditions. Since phenotypes are often tissue-dependent, this information is vital. However, implementing these layers presented too big of a challenge and was beyond the scope of this paper.”

      (2) Another area that would help to improve is to provide either a subset or the ability to perform a meta-analysis of the studies proposed to see where phenotype intersections occur, as opposed to examining their correlation structure. For any given trait the PLINK data or association results seem already generated so running together and making them available seems fairly straightforward. This can be done in several ways to highlight the utility (for example w/wo specific covariates from Huang et al., 2014 and/or comparing associations that occur similarly or differently between sexes).

      We are not 100% sure what the reviewer refers to when mentioning “phenotype intersection”, but we interpreted it as a “PheWAS capability”. Currently, in DGRPool, for every variant, there is a PheWAS option, which scans all phenotypes across all studies to see if several phenotypes are impacted by this same variant.

      We tried to make this tool more visible, both in the GWAS section of the website, but also in the “Check your phenotype” tool, when users are uploading their own data to perform a GWAS. We have also created a “Variants” page, accessible from the top menu, where users can view particular variants and explore the list of phenotypes they are significantly associated with.

      From both result pages, users can download the data table as .tsv files.

      (3) As pointed out by the authors, an advantage of DRGs is the ease of testing on homozygous backgrounds. For each phenotype queried (or groups of related phenotypes would be of interest too), I imagine that subsetting strains by the response would help to prioritize lines used for follow-up studies. For example, resistant or sensitive lines to a given trait. This is already done in Fig 4C and 4E but should be an available analysis for all traits.

      For all quantitative phenotypes, we show the global distribution by sex, followed by the sorted distribution by DGRP line. Since the data can be directly downloaded from the corresponding plots, resistant and sensitive lines can then be readily identified for all phenotypes.
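
      Once the per-line values have been downloaded, picking out the lines at either end of the sorted distribution is a short step. The sketch below assumes the data have already been read into a mapping from line identifier to per-line mean; the identifiers, values, and function name are hypothetical.

```python
def extreme_lines(line_means, n=2):
    """Return the n lowest and n highest lines as (line, value) pairs."""
    if 2 * n > len(line_means):
        raise ValueError("not enough lines for two non-overlapping groups")
    ranked = sorted(line_means.items(), key=lambda item: item[1])
    return ranked[:n], ranked[-n:]


# Hypothetical per-line survival means for one sex.
survival = {
    "line_A": 0.82,
    "line_B": 0.15,
    "line_C": 0.47,
    "line_D": 0.91,
    "line_E": 0.33,
}
lowest, highest = extreme_lines(survival, n=2)
print("lowest:", lowest)
print("highest:", highest)
```

      Whether the lowest or the highest group counts as "sensitive" depends on the direction of the trait, so that label is left to the user.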

      (4) To researchers beyond the DRP community, one feature to consider would be seeing which other associations are conserved across species. While doing this at the phenotype level might be tricky to rename, assigning gene-level associations would make this streamlined. For example, a user could query longevity, subset by candidate gene associations then examine outputs for what is associated with orthologue genes in humans (ex. https://www.ebi.ac.uk/gwas/docs/file-downloads) or other reference panels such as mice and rats.

      In all GWAS results, and in the gene-centric tool, we have added links to ortholog databases. In short, when clicking on a variant, users can see which gene is potentially impacted by this variant (gene-level variant annotation). When clicking on these genes, the user can then open the corresponding, detailed gene page.

      To address the reviewer’s comment, in the gene page, we have added two ortholog databases (Flybase and OrthoDB), which enable cross-species association analyses.

      (5) Related to enabling a meta-data analysis, it would be helpful to let users download all PLINK or DGRP tables in one query. This would help others to query all data simultaneously.

      We would like to kindly point out that all phenotyping data can already be downloaded from the front page, which includes the phenotypes, the DGRP lines and the studies’ data and metadata. However, we did not provide the global GWAS results through a single file, because the data is too large. Instead, we provide each GWAS dataset via a unique file, available per phenotype, on the corresponding GWAS result page of this phenotype. This file is filtered for p<0.001, and contains GWAS results (PLINK beta, p and FDR) as well as gene and regulatory annotations.

      (6) Following analysis of association data an interesting feature would be to enable users to subset strains for putative LOF variants at a given significant locus. This is commonly done for mouse strains (ex. via MGI).

      The GWAS result table available for each phenotype can be filtered for any variant of interest. We added the capability to filter by variant impact; LOF variants being usually referred to as HIGH impact variants.
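
      The same filter can be applied offline to a downloaded result table. The sketch below assumes a tab-separated file with columns named `variant`, `gene`, `beta`, `p`, and `impact`; these column names and the sample rows are hypothetical, so the header of the actual download should be checked first.

```python
import csv
import io


def filter_gwas_rows(handle, max_p=0.001, impact="HIGH"):
    """Keep rows with p below max_p and the requested variant impact."""
    reader = csv.DictReader(handle, delimiter="\t")
    kept = []
    for row in reader:
        # Column names "p" and "impact" are assumed, not taken from DGRPool.
        if float(row["p"]) < max_p and row["impact"] == impact:
            kept.append(row)
    return kept


# Invented rows standing in for a downloaded per-phenotype result file.
sample_tsv = (
    "variant\tgene\tbeta\tp\timpact\n"
    "2L_100_SNP\tgeneA\t0.41\t0.0002\tHIGH\n"
    "2R_200_SNP\tgeneB\t-0.12\t0.0004\tMODIFIER\n"
    "3L_300_DEL\tgeneC\t0.30\t0.0500\tHIGH\n"
)
for row in filter_gwas_rows(io.StringIO(sample_tsv)):
    print(row["variant"], row["gene"], row["p"])
```

      For a real file, the `io.StringIO` object would be replaced by an opened file handle, for example `open(path, newline="")`.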

      (7) Viewing the locus underlying annotation can also provide helpful information. For example, several nice fly track views are shown in 10.1534/g3.115.018929, which would help users to interpret molecular mechanisms.

      We now link the GWAS results out to Flybase’s JBrowse genome browser.

    1. eLife Assessment

      This important study shows that a splice variant of the kainate receptor GluK1-1a that inserts 15 amino acids in the extracellular N-terminal region substantially changes the channel's desensitization properties, the sensitivity to glutamate and kainate, and the effects of modulatory Neto proteins. In the revised paper the authors have clarified several points raised by reviewers but the structural portion of the study has not been improved and consequently, more data are needed to determine the molecular mechanism by which the insert changes the functional profile of the channel. Even so, these solid findings advance our understanding of splice variants among glutamate receptors and will be of interest to neuro- and cell-biologists and biophysicists in the field.

    2. Reviewer #1 (Public Review):

      Kainate receptors play various important roles in synaptic transmission. The receptors can be divided into low affinity kainate receptors (GluK1-3) and high affinity kainate receptors (GluK4-5). The receptors can assemble as homomers (GluK1-3) or low-high affinity heteromers (GluK4-5). The functional diversity is further increased by RNA splicing. Previous studies have investigated C-terminal splice variants of GluK1, but GluK1 N-terminal (exon 9) insertions have not been previously characterized. In this study Dhingra et al investigate the functional implications of a GluK1 splice variant that inserts a 15 amino acid segment into the extracellular N-terminal region of the protein using whole-cell and excised outside-out electrophysiology.

      The authors convincingly show that the insertion profoundly impacts the function of GluK1-1a - the channels that have the insertion are slower to desensitize. The data also shows that the insertion changes the modulatory effects of Neto proteins, resulting in altered rates of desensitization and recovery from desensitization. To determine the mechanism by which the insertion exerts these functional effects, the authors perform pull-down assays of Neto proteins, and extensive mutagenesis on the insert.

      The electrophysiological part of the study is very rigorous and meticulous.

      The biggest weakness of the manuscript is the structural work. Due to issues with preferred orientation (a common problem in cryo-EM), the 3D reconstructions are at a low resolution (in the 5-8 Å range) and cannot offer much mechanistic insight into the effects of the insertion. The authors have opted to keep this data unchanged in the revised manuscript.

      Despite this, the study is a valuable contribution to the field because it characterizes a GluK1 variant that has not been studied before and highlights the functional diversity that exists within the kainate receptor family.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      We are grateful to all three reviewers and editors for their critical comments and suggestions.

      Reviewer #2 (Recommendations For The Authors):

      The authors responded satisfactorily to all my comments and suggestions.

      We thank the reviewer for his time and feedback.

      Reviewer #3 (Recommendations For The Authors):

      Comments for authors:

      The authors have addressed most of the reviewer's concerns. Although no additional data were included to strengthen the manuscript, they have clarified some relevant points, and the manuscript has been updated accordingly. In my view, the current manuscript is well-written and mostly straightforward.

      We thank the reviewer for his time and suggestions. Addressing them has improved the quality of our manuscript.

      After a second revision, I just have a few minor comments (mostly editorial) that should be easy to address.

      (1) Page 16: "The dominant presence of the GRIK1-1 gene was also reported in retinal Off bipolar cells..." Please include reference(s).

      We have now cited the following reference:

      Lindstrom, S.H., Ryan, D.G., Shi, J., DeVries, S.H., 2014. Kainate receptor subunit diversity underlying response diversity in retinal Off bipolar cells. J. Physiol. 592, 1457–1477. https://doi.org/10.1113/jphysiol.2013.265033

      (2) Page 18: "Based on our functional assays, the splice seems to affect the interaction between the receptor and auxiliary proteins". Please remove or tone down this statement; the current data do not support this claim.

      We have revised the sentence as follows: “Based on our functional assays, the splice may possibly affect the interaction between the receptor and auxiliary proteins.”

      (3) Page 24: "cultures ... at 0.5 µg/mL were transfected". In the current context, it is not clear what you mean with 0.5 µg/mL. Please check and correct.

      Thanks for pointing out this error. We have corrected it.

      (4) Page 30. He et al. reference is repeated.

      Thanks. We have fixed it now.

      (5) Figure 3, Panel C: Please incorporate the EC50 value for the red trace into the figure; it appears to be a different data set and, consequently, a different fitting compared with Figure 2C.

      The GluK1-1a data set (red trace) is identical to that in Figure 2c, though it may appear different due to the scale of the X and Y axis. As suggested, we have now included the EC50 value for this data set in Figure 3, panel C.

      (6) Figure legend 4: Please check two minor issues here:

      (a) "Bar graphs... with or without Neto1 protein..." This statement is apparently wrong; Figure 4 does not show the effect of Neto1.

      (b) "The wild type GluK1 splice variant data is the same as from Figure 1.." I think the authors mean Figure 2A instead of Fig. 1. Please check.

      Thanks for pointing out the error. We have fixed the same in the revised manuscript.

      (7) Please check and correct spelling/wording issues in the text. Here are some examples:

      (a) Page 9 " Figure 3G - I, Table2.." (There is no Panel I). 

      Fixed.

      (b) Page 16 "... and is involved in various pathophysiology..." 

      We have revised the sentence as “… and is involved in various pathophysiological conditions”.

      (c) Page 19 "The constructs used for this study were HEK293 WT mammalian cells were seeded on..." 

      Fixed. Thanks.

      (d) Page 23 "The immunoblots were probed..." Please check the whole paragraph and correct the issues.

      Fixed. Thanks.

      (e) Page 27 "initially, 1,97,908 particles were picked". Check the value; the same issue occurs in Fig.6 table supplement 1. 

      Thanks. We have now modified the sentence to clarify that for GluK1-1aEM ND-SYM, initially, 1,97,908 particles were picked and subjected to multiple rounds of clean-up using 2D and 3D classification. Finally, 24,531 particles were used for the final 3D reconstruction and refinement.

      (f) Legend Figure 2: Remove "(F)" from the legend. 

      Thanks. Fixed.

      (g) Legend Figure 2-Sup.1: Check/correct spelling issues. 

      Thanks. Fixed.

      (h) Figure 5-figure supplement 1: There is a mistake in panel B: "GFP" label is shown for Gluk1 and Neto2, but the authors mention that the pull-down was done with Anti-His antibodies. Please correct.

      Thanks. The pull-down experiments were done with anti-His for both the blots presented in panels A and B as mentioned in both the figures (right side panels of both A and B). However, for the GluK1 and Neto2 pull downs (panel B), the blots were probed with anti-GFP antibody which would detect both the receptor (as the receptor has both GFP-His8) and Neto2-GFP at their respective sizes. This has been indicated in the figure panel B.

      (8) Related to the point-by-point document:

      Major concern 2: Interpreting the effect of mutants on the regulation by Neto proteins requires knowing how the mutant is affecting the channel properties without Neto. In my view, if the data showing the K368/375/379/382H376-E mutant without Neto is missing (in this case due to low current amplitude), then, the pink bars in Fig. 5 should be removed from the figure. 

      We thank the reviewer for raising this interesting point and agree that it would be valuable to characterize the channel properties of all the mutants individually. However, as mentioned earlier, the functions of some mutant receptors are only rescued, or reliable, measurable currents are detected, when they are co-expressed with Neto proteins. We still believe that comparing wild-type and mutant receptors co-expressed with Neto proteins provides important insights, and therefore, we would like to retain the K368/375/379/382H376-E mutant data in the figure.

      Major concern 4: Figure 6-figure Supplement 8 is not mentioned in the manuscript. It would help to include a proper description in the Results section similar to the answer included in the point-by-point document.

      Figure 6-figure Supplement 8 has already been cited on page 15. We have also cited Figure 6-figure Supplement 9 on the same page and have added the following sentences in the text:

      “A superimposition of GluK1-1aEM (detergent-solubilized or reconstituted in nanodiscs) and GluK1-2a (PDB:7LVT) showed an overall conservation of the structures in the desensitized state. No significant movements were observed at both the ATD and LBD layers of GluK1-1a with respect to GluK1-2a (Figure 6; Figure 6-figure supplement 9).”

      Major concern 5: The ramp/recovery protocol was not included properly in the manuscript; please include the time of the ramp pulse and the time used for the recovery period.

      Detailed ramp and recovery protocols are included in the Methods section. The time used for the recovery period was variable and was tuned according to the recovery kinetics. All figures in which representative traces are shown include a scale bar showing the duration of agonist application.

      Minor concern 1: The proposed change was not included in the manuscript; check page 7.

      Thanks for highlighting this error. We have now changed it in the revised manuscript.

      Minor concern 10: The manuscript was not corrected as indicated. Please check.

      Thanks. We have now modified the sentence as follows: “…..a reduction was observed for K375/379/382H376-E receptors (1.17 ± 0.28 P=0.3733) compared to wild-type although differences do not reach statistical significance”.

      Minor concern 14: The figure was not corrected as indicated. Please check.

      Thanks for highlighting this error. We have now changed it in the revised manuscript.

      Minor concern 19: I suggest including this briefly in the Discussion section.

      Thanks for the suggestion. We have included the following sentence in the discussion:

      “The differences in observations could be due to variations in experimental conditions, such as the constructs and recording conditions used.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Weaknesses:

      Given that all mutants tested showed the same degree of activation by PEG400, it seemed possible that PEG400 might be an allosteric activator of WNK1/3 through direct binding interactions. Perhaps PEG400 eliminates CWN1/2 waters by inducing conformational changes so that water loss is an effect not a cause of activation. To address this it would be helpful to comment on whether new electron densities appeared in the X-ray structure of WNK1/SA/PEG400 that might reflect PEG400 interactions with chains A or B.

      We re-evaluated the WNK1/SA/PEG400 electron density looking for non-protein densities larger than water. No new densities were found. However, we do observe a PEG400-destabilizing effect using differential scanning fluorimetry, and have included these data in Figure 2. We conclude that the effects on the water structure and destabilization are due to demands on solvent.

      We have included in the second paragraph of the introduction references to primary literature that advance similar arguments to explain osmolyte induced effects on activity.

      Specifically, Colombo MF, Rau DC, Parsegian VA (1992) Protein solvation in allosteric regulation: a water effect on hemoglobin. Science 256: 655-659 and LiCata VJ, Allewell NM (1997) Functionally linked hydration changes in Escherichia coli aspartate transcarbamylase and its catalytic subunit. Biochemistry 36: 10161-10167.

      It would also be helpful to discuss any experiments that might have been done in previous work to examine the direct binding of glycerol and other osmolytes to WNKs.

      We did not observe PEG400 in WNK1/SA/PEG400 despite effects on the space group and subunit packing. On the other hand, glycerol was observed in WNK1/SA, which was cryoprotected in glycerol (PDB file 6CN9). We have highlighted these differences in the second section of the results. A thorough analysis on the effects of various osmolytes on WNK structure, stability, and activity is a potential future direction.

      The study would benefit from a deeper discussion about how to reconcile the different effects of mutations. For example, wouldn't most or all of the mutations be expected to disrupt the water network, and relieve the proposed autoinhibition? This seemed especially true for some of the residues, like Y420(Y346), D353(D279), and K310(K236), which based on Fig 3 appeared to interact with waters that were removed by PEG400.

      The manuscript has been updated with new data and better discussion of this point. Given the inconsistencies on the effects of mutation in static light scattering (SLS), we addressed the possibility that the reducing agent was not constant across experiments. In a repeated study, including reducing agent (1 mM TCEP), we obtained results on mutant mass more similar to wild-type than in the original experiment. An exception was that two of the mutants were much more monomeric than wild-type. It follows that the network CWN1 stabilizes the inactive dimer. The reduced activity of some of the mutants probably reflects the position of CWN1 and the AL-CL Cluster in the active site, such that mutants can affect substrate binding or catalysis. This is now better discussed both in the data and discussion sections.

      Mutants have a tendency to have complex effects on activity and structure. It was satisfying to find any activating mutants. We point out that we have been careful to present all of our data including mutants that are not easily explained by our models.

      Alternatively, perhaps the waters in CWN2 are more important for maintaining the autoinhibited structure. This possibility would be useful to discuss, and perhaps comment on what may be known about the energetic contributions of bound water towards stabilizing dimers.

      This research focused on the most salient unique feature of WNK1, CWN1. We also identified CWN2. Mutational analysis of CWN2 can’t be done without disrupting the dimer interface, greatly complicating data interpretation.

      It would also be useful to comment on why aggregation of E319Q/A (E314) shouldn't inhibit kinase activity instead of activating it.

      On re-collection of the SLS data in the presence of reducing agent, we saw reduced aggregation. WNK3/D279N and WNK3/E314Q were more monomeric, especially at the higher protein concentration used. WNK3/E314Q is one of the more active mutants.

      The X-ray work was done entirely with WNK1 while the mutational work was done entirely with WNK3. Therefore, a simple explanation for the disconnect between structure and mutations might be that WNK1 and WNK3 differ enough that predictions from the structure of one are not applicable to mutations of the other. It would be helpful to describe past work comparing the structure and regulation of WNK1 and WNK3 that support the assumption of their interchangeability.

      We have responded directly to this concern. We introduced our most interesting amino acid replacement WNK3/E314A into WNK1, making WNK1/E388A. Similar trends in chloride inhibition and mutational activation were observed in WNK1 as in WNK3. This supports the assumption of interchangeability of WNK1 and WNK3 we invoked for practical reasons. As expected, the overall activity of WNK1 is lower than WNK3. Overall, the lower activity limited data collection. However, the lower activity did allow us to fit the chloride inhibition data to a kinetic model for WNK1. Panels on WNK1 activity, mutation, and chloride inhibition were added to Figure 5 and to Supplemental data (Table S6).

      Reviewer #2 (Public Review):

      Strengths:

      The most interesting result presented here is that P1 crystals of WNK1 convert to P21 in the presence of PEG400 and still diffract (rather than being destroyed as the crystal contacts change, as one would expect). All of the assays for activity and osmolyte sensing are carried out well.

      Thank you. We have emphasized this point in the Results section with the word “remarkably”.

      Weaknesses:

      The rationale for using WNK3 for the mutagenesis study is that it is more sensitive to osmotic pressure than WNK1. I think that WNK1 would have been a better platform because of the direct correlation to the structural work leading to the hypothesis being tested. All of the crystallographic work is WNK1; it is not logical to jump to WNK3 without other practical considerations.

      This point is addressed in the last comment to Reviewer 1. We added autophosphorylation assay data on our most interesting mutant (WNK3/E314A) in WNK1 (WNK1/E388A). Conversely, we have crystallographic data on uWNK3 (on uWNK3/E314A collected to 3.3 Å). These new data justify the assumption of interchangeability of results obtained for uWNK1 and uWNK3.

      Osmolyte sensing was tested by measuring ATP consumption as a function of PEG400 (Figure 6). Data for the subset of mutants analyzed by this assay showed increasing activity. It is not clear why the same collection of mutant proteins analyzed in the experiments of Figure 5 was not also measured for osmolyte sensing in Figure 6.

      These data are now more complete, having been now collected for all of the WNK3 mutants (now Figure 7).

      The last set of data presented uses light scattering to test whether the WNK3 mutant proteins exhibit quaternary structural changes consistent with the monomer/dimer hypothesis. If they did, one would expect a higher degree of monomer for those that are activated by mutation, and a lower amount of monomer (like wt) for those that are not. Instead, one of the mutant proteins that showed the most chloride inhibition (Y346F) had a quaternary structure similar to the wt protein, and others have similar monomer/dimer mixtures but distinct chloride inhibition profiles (K307A and M301A). I don't see how the light scattering data contribute to this story other than to refute the hypothesis by showing a lack of correlation between quaternary structure, water binding, and activity. This is another reason why the disconnect between WNK1 and WNK3 could be a problem. All of the detailed structural work with WNK1 must be assumed with WNK3; perhaps the light scattering data are contradicting this assumption?

      As noted above, on re-collection of the SLS data in the presence of reducing agent, we saw reduced aggregation and more consistency with our model. Thus, we now feel it is a useful contribution to the manuscript. The table in Supplemental data has been updated.

      Reviewer #1 (Recommendations For The Authors):

      Fig 3D in the PDF manuscript seemed distorted - waters were cut off. Also Fig 2D would benefit from showing the whole molecule, instead of cutting off the top and bottom of the kinase domain.

      We suspect this is a data transfer problem, since we don’t see these truncations.

      Both Figures 2 and 3 have been changed, addressing these concerns and adding new differential scanning fluorimetry data as discussed in reply to Reviewer 1. Figure 2 was simplified by eliminating Figures 2A-2C, and replacing them with a new Figure 2B, the superposition of WNK1/SA/PEG400 (PDB 9D3F) and WNK1/SA (PDB 6CN9).

      In Figure 3, we added a panel highlighting the volume change around CWN1 in presence of PEG400 (Figure 3C). Hopefully, inappropriate cropping has been eliminated.

      Line 162: Y314F should be Y346F.

      This has been corrected. Thank you.

      Lines 211-213 - these two sentences do not seem to logically go together: "Two hyper-active mutants were discovered, WNK3/E314A, and WNK3/E314Q. These mutants are straightforward to interpret based on our model: the mutated residues support and stabilize inactive dimeric WNK."

      An extensive rewrite has been conducted to address the difference in activity between the higher activity mutants versus less active mutants, now discussed in two paragraphs, and two Figures, Figure 5 and 6. The SLS data, re-collected with more reducing agent, has given more consistent results (Supplemental), making the discussion more straightforward (discussed above).

      Reviewer #2 (Recommendations For The Authors)

      I think WNK1 would be a better platform for mutagenesis than WNK3. Or minimally the authors should better justify the switch to WNK3 from WNK1. Analyze the same set of mutants in Figure 5 into Figure 6.

      Again, we have added assay data on uWNK1/E388A, and structural data on uWNK3/E314A.

      I would analyze the same set of mutants in Figures 5 and 6.

      We have analyzed all of the WNK3 mutants in the ADP-Glo assays (Figure 7).

      Will the P21 crystal form grow independently in PEG400?

      Attempts to crystallize WNK1/SA or WNK3/SA or other constructs in PEG400 have been unsuccessful.

      I would also add some context about the role of water in allosteric mechanisms. I know there is a long history in hemoglobin in which specific waters have been associated with the T and R states such as that by Marcio Colombo. There is a relatively recent article in J. Chem. Phys. that would provide good context: Leitner et al., J. Chem. Phys. 152, 240901 (2020).

      Thank you. Good call.

    2. eLife Assessment

      This study presents an important investigation of water coordination in a specific kinase family with a focus on the regulation of osmosensing protein kinases. X-ray crystallographic approaches combined with functional assays are used to address the hypothesis that bound water participates in the osmosensing mechanism as an allosteric kinase inhibitor. The evidence for changes in kinase conformation and space group of the crystal as a function of added low molecular weight polyethylene glycol is solid. The work will be of considerable interest to the kinase field as well as colleagues studying allosteric regulation of protein function.

    3. Reviewer #1 (Public Review):

      This manuscript addresses the regulation of the osmosensing protein kinases, WNK1 and WNK3. Prior work by the authors has shown that these enzymes are activated by PEG400 or ethylene glycol and inhibited by chloride ion, and that activation is associated with a conformational transition from dimer to monomer. In X-ray structures of the WNK1/SA inactive dimer, a water-mediated hydrogen bond network was observed between the catalytic loop (CL) and the activation loop (AL), named CWN1. This led to the proposal that bound water may be part of the osmosensing mechanism.

      The current study carries this work further, by applying PEG400 to Xtals of dimeric WNK1/SA. This results in a change in kinase conformation and space group, along with 4-9 fewer waters in CWN1 and the complete disappearance of another water cluster (CWN2) located at the dimer interface. Six conserved residues lining the CWN1 pocket in WNK3 are mutated to determine effects on activity and inhibition by chloride ion (measured by AL autophosphorylation) and monomer-dimer interconversion (light scattering).

      The results show that two mutants (E314Q/A in WNK3) at a site central to the water cluster result in increased kinase activity (autophosphorylation), and increased SLS, interpreted as aggregation. Three sites (D279A, Y346F, M301A) inhibit kinase activity with varying effects on oligomerization - Y346A and M301A retain monomer-dimer ratios similar to WT while D279N promotes aggregation. K236A and K307A show activity and monomer:dimer ratios similar to WT. Selected mutants (E314Q, D279N, Y346F) and WT appear to retain osmosensitivity with comparable activation by PEG400.

      The study concludes that osmolytes may activate the kinase by removing waters from the CWN1 and CWN2 clusters, suggesting that waters might be considered allosteric ligands that promote the inactive structure of WNKs. The differing effects of mutations may be ascribed to disruption of the water networks as well as inhibitory perturbations at the active site.

      Comments on latest version:

      The revised manuscript incorporated new experiments that satisfactorily addressed my concerns.

    4. Reviewer #2 (Public Review):

      This work tests the hypothesis that water coordination in WNK kinases is linked to allosteric control of activity. It is proposed that dimeric WNK is inactive and bound to some conserved water molecules, and that monomerization/activation involves departure of these waters. New data here include a crystal structure of monomeric WNK1 which shows missing waters compared to the dimeric structure, in support of the hypothesis. Mutant proteins of a different isozyme (WNK3) designed to disrupt water coordination were produced, and activity and quaternary structure were measured.

      Comments on latest version:

      The authors have largely addressed my concerns by making sure the collection of mutants analyzed for autophosphorylation in Figure 6 is consistent with the measurement of osmotic sensitivity in Figure 7. The other changes in response to reviews have made a stronger manuscript in my opinion.

    1. eLife Assessment

      This fundamental study uses a creative experimental system to directly test Ohno's hypothesis, which describes how and why new genes might evolve by duplication of existing ones. In agreement with existing criticism of Ohno's original idea, the authors present compelling evidence that having two gene copies does not speed up the evolution of a new function as posited by Ohno, but instead leads to the rapid inactivation of one of the copies through the accumulation of mostly deleterious mutations. These findings will be of broad interest to evolutionary biologists and geneticists.

    2. Reviewer #1 (Public review):

      The authors construct a pair of E. coli populations that differ by a single gene duplication in a selectable fluorescent protein. They then evolve the two populations under differing selective regimes to assess whether the end result of the selective process is a "better" phenotype when starting with duplicated copies. Importantly, their starting duplicated population is structured to avoid the duplication-amplification process often seen in bacterial artificial evolution experiments. They find that while duplication increases robustness and speed of adaptation, it does not result in more highly adapted final states, in contrast to Ohno's hypothesis.

      Comments on revised version:

      The authors have addressed my prior concerns, and I have no further comments on the manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      Drawing from tools of synthetic biology, Mihajlovic et al. use a cleverly designed experimental system to dissect Ohno's hypothesis, which describes the evolution of functional novelty on the gene-level through the process of duplication & divergence.

      Ohno's original idea posits that the redundancy gained from having two copies of the same gene allows one of them to freely evolve a new function. To directly test this, the authors make use of a fluorescent protein with two emission maxima, which allows different selection regimes to be applied (e.g. selection for green AND blue, or, for green NOT blue). To achieve this feat without being distracted by more complex evolutionary dynamics caused by the frequent recombination between duplicates, the authors employ a well-controlled synthetic system to prevent recombination: Duplicates are placed on a plasmid as indirect repeats in a recombination-deficient strain of E. coli. The authors implement their directed evolution approach through in vitro mutagenesis and selection using fluorescence-activated cell sorting. Their in-depth analysis of evolved mutants in single-copy versus double-copy genotypes provides clear evidence for Ohno's postulate that redundant copies experience relaxed purifying selection. In contrast to Ohno's original postulate, however, the authors go on to show that this does not in fact lead to more rapid phenotypic evolution, but rather, the rapid inactivation of one of the copies.

      Strengths:

      This paper contributes with great experimental detail to an area where the literature predominantly leans on genomics data. Through the use of a carefully-designed, well-controlled synthetic system the authors are able to directly determine the phenotype & genotype of all individuals in their evolving populations and compare differences between genotypes with a single or double copy of coGFP. With it they find clear evidence for what critics of Ohno's original model have termed "Ohno's dilemma", the rapid non-functionalization by predominantly deleterious mutations.

      Including an expressed but non-functional coGFP in (phenotypically) single copy genotypes provides an especially thoughtful control that allows determining a baseline dN/dS ratio in the absence of selection. All in all the study is an exciting example of how the clever use of synthetic biology can lead to new insights.

      Weaknesses:

      In the revised version of the paper, the authors now discuss one potential weakness of their study, which is tied to its biggest strength (as often in experimental biology there is a trade-off between 'resolution' and 'realism').

      The experimental set-up leaves out an important component of the evolutionary process in order to disentangle dosage effects from other effects that carrying two copies might have on their evolution. Specifically, by employing a recombination-deficient strain and constructing their duplicates as inverted repeats, their experimental design completely abolishes recombination between the two copies. This was pointed out in my first review to be problematic for two reasons:

      (i) In nature, new duplicates do not arise as inverted, but rather as direct (tandem) repeats and - as the authors correctly point out - these are very unstable, due to the fact that repeated DNA is prone to recA-dependent homologous recombination (which arises orders of magnitude more frequently than point mutations).

      (ii) This instability often leads to further amplification of the duplicates under dosage selection both in the lab and in the wild (e.g. Andersson & Hughes, Annu. Rev. Genet. 2009), and would presumably also be an outcome under the current experimental set-up if it was not prevented from happening?

      In their revised version, the authors now address this point and with much clarity explain why their experimental system is so powerful to study the fate of a gene duplicate, not despite lacking recombination, but *because* it lacks recombination.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This fundamental study uses a creative experimental system to directly test Ohno's hypothesis, which describes how and why new genes might evolve by duplication of existing ones. In agreement with existing criticism of Ohno's original idea, the authors present compelling evidence that having two gene copies does not speed up the evolution of a new function as posited by Ohno, but instead leads to the rapid inactivation of one of the copies through the accumulation of mostly deleterious mutations. These findings will be of broad interest to evolutionary biologists and geneticists.

      We thank the editors and the reviewers for their positive feedback concerning our experimental system and for the constructive feedback on how to further improve the manuscript. We have now addressed the reviewers’ comments in a revised version.

      Reviewer #1 (Public Review):

      Overview:

      The authors construct a pair of E. coli populations that differ by a single gene duplication in a selectable fluorescent protein. They then evolve the two populations under differing selective regimes to assess whether the end result of the selective process is a "better" phenotype when starting with duplicated copies. Importantly, their starting duplicated population is structured to avoid the duplication-amplification process often seen in bacterial artificial evolution experiments. They find that while duplication increases robustness and speed of adaptation, it does not result in more highly adapted final states, in contrast to Ohno's hypothesis.

      Major comments:

      This is an excellent study with a very elegant experimental setup that allows a precise examination of the role of duplication in functional evolution, exclusive of other potential mechanisms. My main concern is to clarify some of the arguments relating to Ohno's hypothesis.

      I think my main confusion on first reading the manuscript was in the precise definition of Ohno's hypothesis. I think this confusion was mine and not the authors, but it is likely common and could be addressed.

      Most evolutionary biologists think of gene duplication as making neofunctionalization "easier" by providing functional redundancy and a larger mutational target, such that the evolutionary process of neofunctionalization is faster (as the authors observed). In this framework, the final evolved state might not differ when selection is applied to duplicated copies or a single-copy gene. Ohno's hypothesis, by contrast, argues that there generally exist adaptive conflicts between the ancestral function and the "desired" novel function, such that strong selection on a single-copy gene cannot produce the evolutionary optima that selection on two copies would. This idea is hinted at in the quotation from Ohno in paragraph 2 of the introduction. However, the sentences that follow I don't think reinforce this concept well enough and lead to some confusion.

      With that definition in mind, I agree with the authors' conclusion that these data do not support Ohno's hypothesis. My quibble would be that what is actually shown here is that adaptive conflict in function is not universal: there are cases where a single gene can be optimized for multiple functions just as well as duplicated copies. I do not think the authors have, however, refuted the possibility that such adaptive conflicts are nonetheless a significant barrier to evolutionary innovation in the absence of gene duplication generally. Perhaps just a sentence or two to this effect might be appropriate.

      We fully agree with the reviewer that trade-offs might play an important role in the evolution of single copy and of duplicated genes, depending on the gene and on the selection regime. And while trade-offs are not likely to play a big role in the selection regime we discuss in detail in the main text (evolution towards more green), they probably are important for at least one of our selection regimes. In fact, we state this in the following passage of the discussion. In addition, we have now added a sentence that acknowledges the importance of trade-offs for evolution in the absence of gene duplication:

      “A single gene encoding such a protein suffers from an adaptive conflict between the two activities. Gene duplication may provide an escape from this adaptive conflict, because each duplicate may specialize on one activity [14, 15]. For coGFP, a trade-off likely exists for fluorescence in these two colors, because improvement of green fluorescence entails a loss of blue fluorescence during evolution (Figure S8 and Figure S16). We therefore expected that during selection for both green and blue fluorescence, one cogfp copy in double-copy populations would “specialize” on green fluorescence whereas the other copy would specialize on blue fluorescence. However, when we analyzed individual population members with two active gene copies we could not find any such specialization (Figure S21). Moreover, the identified key mutations at positions 147 and 162 have a very low frequency (<1%) in these populations (Figure S15). Future experiments with different selection strategies might reveal the reasons for this observation and the conditions under which such a specialization can occur.”

      I also think the authors need to clarify their approach to normalizing fluorescence between the two populations to control for the higher relative protein expression of the population with a duplicated gene. Since each population was independently selected with the highest fluorescing 60% (or less) of the cells selected, I think this normalization is appropriate. Of course, if the two populations were to compete against each other, this dosage advantage of the duplicates would itself be a selective benefit. Even as it is, the dosage advantage should be a source of purifying selection on the duplication, and perhaps this should be noted.

      The reviewer is correct. To be able to follow the evolutionary trajectories of the different constructs, the populations were treated separately. The gates were adjusted for each library separately to select for the top 60, 1 or 0.01% of cells and the gates for the double-copy populations were set to slightly higher fluorescence, reflected in the higher fluorescence of these populations in Figure 3A. Indeed, if individuals in these populations were to compete against each other, the double-copy populations would have a benefit due to the dosage advantage. However, as we already pointed out in the manuscript, we did not see any additional advantage beyond the increased gene dosage provided by the second copy (Figure 3B). To discuss this issue in more detail, we have now added the following text to the discussion:

      “It is worth noting that we evolved each of our single- and double-copy populations separately and in parallel to follow their individual evolutionary trajectories. In a natural population, individuals with one or two copies might occur in the same population and compete against each other. In this situation any dosage advantage of a duplicate gene would itself entail selective benefit. Our approach allowed us to find out if gene duplication facilitates phenotypic evolution beyond any such gene dosage effect. At least for the specific genes, selection pressures, and mutation rates we used, the data suggest that it does not.”

      Finally, I am slightly curious about the nature of the adaptations that are evolving. The authors primarily discuss a few amino-acid changing mutations that seem to fix early in the experiment. Looking at Figure 3, however, it appears that the populations are still evolving late in the experiment, and so presumably other changes are occurring later on. Do the authors believe that perhaps expression changes to increase protein levels are driving these later changes?

      Figure S15 shows that some mutations are indeed still increasing in frequency during late evolutionary rounds, in particular S2L, V141L and V205L. We have measured the emission spectra of these mutants (Figure S16), and these mutations increase fluorescence both in green and blue. It is therefore likely that these mutations, similar to L98M, increase protein expression, solubility, or thermal stability, as suggested by the reviewer. We now clarify this matter in a new passage of the results:

      “Like L98M, the additional mutations S2I, V141I and V25L also occurred in all selection regimes, but they reached lower frequencies than L98M during the 5 generations of the experiment. We hypothesized that mutations observed in all selection regimes do not derive their benefit from increasing the intensity of any one fluorescent color. Instead, they may increase protein expression, solubility, or thermal stability.”

      Reviewer #2 (Public Review):

      Summary:

      Drawing from tools of synthetic biology, Mihajlovic et al. use a cleverly designed experimental system to dissect Ohno's hypothesis, which describes the evolution of functional novelty on the gene-level through the process of duplication & divergence.

      Ohno's original idea posits that the redundancy gained from having two copies of the same gene allows one of them to freely evolve a new function. To directly test this, the authors make use of a fluorescent protein with two emission maxima, which allows them to apply different selection regimes (e.g. selection for green AND blue, or, for green NOT blue). To achieve this feat without being distracted by more complex evolutionary dynamics caused by the frequent recombination between duplicates, the authors employ a well-controlled synthetic system to prevent recombination: Duplicates are placed on a plasmid as indirect repeats in a recombination-deficient strain of E. coli. The authors implement their directed evolution approach through in vitro mutagenesis and selection using fluorescence-activated cell sorting. Their in-depth analysis of evolved mutants in single-copy versus double-copy genotypes provides clear evidence for Ohno's postulate that redundant copies experience relaxed purifying selection. In contrast to Ohno's original postulate, however, the authors go on to show that this does not in fact lead to more rapid phenotypic evolution, but rather, the rapid inactivation of one of the copies.

      Strengths:

      This paper contributes with great experimental detail to an area where the literature predominantly leans on genomics data. Through the use of a carefully designed, well-controlled synthetic system the authors are able to directly determine the phenotype & genotype of all individuals in their evolving populations and compare differences between genotypes with a single or double copy of coGFP. With it they find clear evidence for what critics of Ohno's original model have termed "Ohno's dilemma", the rapid non-functionalization by predominantly deleterious mutations.

      Including an expressed but non-functional coGFP in (phenotypically) single copy genotypes provides an especially thoughtful control that allows determining a baseline dN/dS ratio in the absence of selection. All in all the study is an exciting example of how the clever use of synthetic biology can lead to new insights.

      Weaknesses:

      The major weakness of the study is tied to its biggest strength (as often in experimental biology there is a trade-off between 'resolution' and 'realism').

      The paper ignores an important component of the evolutionary process in favour of an in-depth characterization of how two vs one copy evolve. Specifically, by employing a recombination-deficient strain and constructing their duplicates as inverted repeats their experimental design completely abolishes recombination between the two copies.

      This is problematic for two reasons:

      i)  In nature, new duplicates do not arise as inverted, but rather as direct (tandem) repeats and - as the authors correctly point out - these are very unstable, due to the fact that repeated DNA is prone to recA-dependent homologous recombination (which arises orders of magnitude more frequently than point mutations).

      ii)  This instability often leads to further amplification of the duplicates under dosage selection both in the lab and in the wild (e.g. Andersson & Hughes, Annu. Rev. Genet. 2009), and would presumably also be an outcome under the current experimental set-up if it was not prevented from happening?

      So in sum, recombination between duplicate genes is not merely a nuisance in experiments, but occurs at extremely high frequencies in nature (such that the authors needed to devise a clever engineering solution to abolish it), and is often observed in evolving populations, be it in the laboratory or the wild.

      The manuscript sells controlling of copy number as a strength. And clearly, without it, the same insights could not be gained. However, if the basis for the very process of what Ohno's model describes is prevented from happening for the process to be studied, then, for reasons of clarity and context this needs pointing out, especially, to readers less familiar with the principles of molecular evolution.

      Connected to this, there are several places in the introduction and the discussion where I feel that the existing literature, in particular models put forward since Ohno that invoke dosage selection (such as IAD) end up being slightly misrepresented.

      My point is best exemplified in line 1 of Discussion: "To test Ohno's hypothesis and to distinguish its predictions from those of competing hypotheses, it is necessary to maintain a constant and stable copy number of duplicated genes during experimental evolution."

      We understand the reviewer’s position and fully agree that we needed to clarify better what our experiments aimed to achieve. To this end, we rewrote the beginning of the discussion to read:

      “Our aim was to study whether gene duplication can affect mutational robustness and phenotypic evolution beyond any effect of increased gene dosage provided by multiple gene copies. To this end, we needed to maintain a constant and stable copy number of duplicated genes during experimental evolution.”

      I think this statement is simply not true and might be misleading. To take the exaggerated position of a devil's advocate, the goal of evolutionary biology should be to find out how evolution actually proceeds in nature most of the time, rather than creating laboratory systems that manage to recapitulate influential ideas.

      On this point, we respectfully disagree. To ask questions like ours, laboratory experiments that are highly controlled albeit possibly “unnatural” can be essential. And we would argue that our experiments do not merely aim to “recapitulate” an influential idea but to validate it and potentially refute it, as we did for our study system. Validating theory is an essential aspect of experimental science. Textbooks in biology and beyond are rife with examples.

      While fixing copy number may be a necessary step to understand how one copy evolves if a second one is present, it seems that if Ohno's hypothesis only works out in recA-deficient bacterial strains and on engineered inverted repeats, that Ohno might have missed one crucial aspect of how paralogs evolve. The mentioned competing hypotheses have been put forward to (a) address Ohno's dilemma (which the present study beautifully demonstrates exists under their experimental conditions) and (b) to reflect a commonly observed evolutionary process in bacteria (dosage gain in response to selection, e.g. a classic way of gaining antibiotic resistance). Fixing the copy number allowed the authors to show which predictions of Ohno's model hold up and which don't (under these specific conditions). But they do so without even preventing the processes described by alternative models from happening, so the experimental system is hardly appropriate to distinguish between Ohno & alternatives. Therefore, I think it could be made clearer that the experimental system is great to look at certain aspects of Ohno's hypothesis in detail, but it can only inform us about a universe without recombination.

      (1)  Citing the works by ref 8, 26, 27 to merely state that "in some copies were gained and some were lost (ref 6, ref 25)" makes it seem as if fixing at 2 copies is some sort of sensible average. Yet ref 6 (Dhar et al.) specifically states that dosage is the most important response. Moreover, in this study gene copies are lost, but plasmid copies are gained instead. In Holloway et al. 2007 (ref 25), the 2 copies resided on different plasmids, so entirely different underlying molecular genetics might be at work (high cost of plasmid maintenance, and competitive binding on both proteins onto the respective (off)-target, where either way selection favored a single copy, so a different situation altogether). In both cited studies, fixing the copy would have prohibited learning something about the process of duplication & divergence.

      Hence this statement seems to distract the readers from the main message, which seems to be that preventing recombination experimentally allows one to follow the divergence of each copy and to study a response that does not involve dosage-increase.

      (2)  "These studies highlighted the importance of gene duplication in providing fast adaptation under changing environmental conditions but they focused on the importance of gene dosage." I think this constructs a false dichotomy. Instead, these studies pointed out that dosage (and with it, selection for dosage) is an important part of the equation that might have been missed by Ohno.

      Your points are well taken. To clarify the insights from previous experiments and the aims of our experiments, we rewrote the passage in question as follows:

      “These studies underline the importance of gene duplication in providing fast adaptation under changing environmental conditions. In some studies one copy was lost6, 25, while in others, additional copies were gained8, 26, 27. Together these studies highlight that gene dosage and selection for dosage can play an important role during the evolution of duplicated genes6, 8, 25-28.

      These studies also raise the question whether gene duplication can provide an advantage beyond its effects on gene dosage. To find out it is necessary to study the evolution of gene duplicates while keeping the copy number of the duplicated gene exactly at two. This is challenging because gene duplication causes recombinational instability and high variability in copy number. No previous experimental studies were designed to control copy number. Here, we present an experimental system that allowed us to keep the copy number fixed at one or two genes, and to follow the evolution of each gene copy in the absence of any dosage increase.”

      (3)  "Such models are also easier to test experimentally, because they do not require precise control of gene copy number. The necessary tests can even benefit from massive gene amplifications8. Although Ohno's hypothesis is more difficult to test experimentally (...)" - again, I feel the wording is slightly misleading. The point is not that IAD is easier to test and Ohno's model is harder to test in laboratory experiments, rather, experiments (and some more limited observations of naturally evolving populations) seem to suggest that in reality evolution proceeds (more often?) according to IAD rather than Ohno's neofunctionalization hypothesis. However, as the authors point out, it will be exciting to see their clever experimental system used to test other genes and conditions to get a more comprehensive understanding of what gene- and selection-parameter values would overcome Ohno's dilemma.

      We agree and in response rewrote the paragraph in question to read:

      “The challenge that a duplicated gene copy must remain free of frequent deleterious mutations long enough to acquire beneficial mutations that provide a new selectable phenotype is known as Ohno’s dilemma13. Our experiments confirm that this challenge is highly relevant for post-duplication evolution. Other models such as the innovation-amplification-divergence (IAD) model8, 13 postulate that this dilemma can be resolved through an increase in gene dosage that allows latent pre-duplication phenotypes to come under the influence of selection. To distinguish between the effects of gene dosage and other benefits of gene duplication, we prevented recombination and gene amplification to prevent copy number increases beyond two copies. We are aware that our experimental design does not reflect how evolution may occur in the wild. However, this design allowed us to study evolutionary forces separately that are otherwise difficult to disentangle.”

      Finally, we also made two changes in the abstract (highlighted in red) to take your feedback into account.

      Reviewer #2 (Recommendations For The Authors):

      The paper is very well written, with a lot of emphasis put on explaining every step and every finding. It was a joy to read.

      Thanks!

      Full stop missing in line 5 of abstract.

      Corrected.

    1. eLife Assessment

      In this valuable study, the authors investigate how inflammatory priming and exposure to irradiated Mycobacterium tuberculosis or the bacterial endotoxin LPS impact the metabolism of primary human airway macrophages and monocyte-derived macrophages. The work shows that metabolic plasticity is greater in monocyte-derived macrophages than alveolar macrophages, with solid experimental methods and overall evidence. The findings are relevant to the field of immunometabolism.

    2. Reviewer #3 (Public review):

      Summary:

      In this manuscript the authors explore the contribution of metabolism to the response of two subpopulations of macrophages to bacterial pathogens commonly encountered in the human lung, as well as the influence of priming signals typically produced at a site of inflammation. The two subpopulations are resident airway macrophages (AM) isolated via bronchoalveolar lavage and monocyte-derived macrophages (MDM) isolated from human blood and differentiated using human serum. The two cell types were primed using IFNγ and IL-4, which are produced at sites of inflammation as part of initiation and resolution of inflammation respectively, followed by stimulation with either heat-killed tuberculosis (Mtb) or LPS to simulate interaction with a bacterial pathogen that is either gram-negative in the case of Mtb or gram-positive in the case of LPS. The authors use human cells for this work, which makes use of widely reported and thoroughly described priming signals, as well as model antigens. This makes the observations on the functional response of these two subpopulations relevant to human health and disease to a greater extent than the mouse models typically used to interrogate these interactions. To examine the relationship between metabolism and functional response, the authors measure rates of oxidative phosphorylation and glycolysis under baseline conditions, primed using IFNγ or IL-4, and primed and stimulated with Mtb or LPS.

      Overall, this study reveals how inflammatory and anti-inflammatory cytokine priming contributes to the metabolic reprogramming of AM and MDM populations. Their conclusions regarding the relationship between cytokine secretion and inflammatory molecule expression in response to bacterial stimuli are supported by the data. The involvement of metabolism in innate immune cell function is relevant when devising treatment strategies that target the innate immune response during infection. The data presented in this paper further our understanding of that relationship and advance the field of innate immune cell biology.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #2 (Recommendations for the authors):

      Comments to the authors:

      R1. The authors show a similar reduction in ECAR as a measure of glycolytic inhibition upon treating H37Rv-infected unprimed MDMs with 5 mM 2-DG at 1 h and 24 h. However, the data pertaining to the extent of glycolytic inhibition upon 2-DG treatment in IFN-γ or IL-4 primed AMs or MDMs is not included.

      We acknowledge that we have not checked the ECAR of every dataset herein treated with 2DG. However, we have provided evidence that 2DG reduced ECAR in the control datasets, and moreover, 2DG is functionally affecting the cells (e.g. the presence of 2DG altered cytokine production in both AM and MDM, even in the presence of IFN-γ or IL-4).

      R2. The authors have replotted the same data as percent change and fold difference with different normalizing samples. While they have corrected one of the highlighted discrepancies in the data plotting of Fig. 1A and 1C, similar discrepancies are still evident in many other instances. Based on my understanding of the data and normalization methodology stated by the authors in response to comment (#5) by reviewer 1, the authors are plotting fold changes across all samples with respect to unstimulated and unprimed macrophages, whereas percent changes are plotted for stimulated (LPS or dead H37Rv) samples with respect to baseline measurements for each unstimulated sample under differently primed macrophages. I believe the slope of lines connecting unstimulated and LPS-stimulated/H37Rv-infected samples upon percent increase or decrease (from the baseline of unstimulated samples) will still maintain their trend in fold changes (relative to unstimulated and unprimed macrophages) irrespective of changes in absolute values. For instance, in Fig. 1F, there are at least 3 samples that show an increase in fold change in OCR upon H37Rv infection in IFN-γ primed MDMs. However, Fig. 1H, plotted from the same data, shows a decrease in OCR across all IFN-γ primed MDMs upon H37Rv infection. The authors have also highlighted that this decrease in OCR upon H37Rv infection in IFN-γ primed MDMs is highly significant (P < 0.01). The same data is again plotted as a bar plot in Fig. 1J as fold change relative to unstimulated and unprimed macrophages (mislabeled as percent change to unstimulated), showing no difference upon H37Rv infection of IFN-γ primed MDMs.

      We have amended the axis in Figure 1 and Supplemental Figure 1 to more accurately describe the two different forms of analysis. We have fixed the errors outlined. We have also amended the methods in the text to clarify the two analyses carried out on the metabolic data. Lines 113-122 as follows:

      “Fold change in ECAR and OCR was calculated compared to unstimulated unprimed controls at 150 minutes, where unstimulated unprimed macrophages were set to 1. This allows for analysis of the effects of both priming and subsequent stimulation and accounts for the variation in the raw ECAR and OCR readings between runs, thereby making each donor its own control.

      Percent change in ECAR and OCR was also calculated to equalise groups to the same point prior to stimulation. Each condition was compared to its own respective primed or unprimed baseline at 30 minutes, prior to stimulation, and this was set to 100%. This was carried out to examine the capacity of cells to increase metabolic parameters in response to stimulation. Post stimulation percent change data was then extracted and analysed at 150 minutes. This controls for the priming effect and enables the analysis of metabolic capacity in each dataset.”

      For figure 1J, the data is replotted from fold change datasets (not percentage change where the decrease in OCR is significant). The axis label has been revised for accuracy.
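
      The two normalisations described in the quoted methods can be sketched as follows. This is a minimal illustration only: the function names and the numbers are hypothetical and are not taken from the study; it assumes a single donor with one reading per condition and time point.

```python
# Hypothetical readings for one donor (arbitrary units); NOT data from the study.
UNPRIMED_UNSTIM_150 = 100.0  # unstimulated, unprimed control at 150 min
PRIMED_BASELINE_30 = 150.0   # a primed condition at 30 min, before stimulation
PRIMED_STIM_150 = 120.0      # the same primed condition at 150 min, after stimulation


def fold_change(reading_150, unprimed_unstim_150):
    """Reading relative to the unstimulated, unprimed control (control = 1)."""
    return reading_150 / unprimed_unstim_150


def percent_change(reading_150, own_baseline_30):
    """Reading relative to the condition's own pre-stimulation baseline (baseline = 100%)."""
    return 100.0 * reading_150 / own_baseline_30


fold = fold_change(PRIMED_STIM_150, UNPRIMED_UNSTIM_150)
percent = percent_change(PRIMED_STIM_150, PRIMED_BASELINE_30)
print(f"fold change vs unprimed control: {fold:.2f}")
print(f"percent of own baseline: {percent:.0f}%")
```

      With these made-up values the same reading sits above the unprimed control but below its own primed baseline, which shows that the two analyses use different reference points and answer different questions.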

    1. eLife Assessment

      This study investigates the role of Caspar (Casp), an orthologue of human Fas-associated factor-1, in regulating the number of primordial germ cells that form during Drosophila embryogenesis. The findings are important in that they reveal an additional pathway that contributes to germ cell specification and maintenance. The evidence supporting the conclusions is solid, as the authors identify Casp and its binding partner Transitional endoplasmic reticulum 94 (TER94) as factors that influence germ cell numbers.

    2. Reviewer #1 (Public review):

      Summary:

      The authors were seeking to define the roles of the Drosophila caspar gene in embryonic development and primordial germ cell (PGC) formation. They demonstrate that PGC number, and the distribution of the germ cell determinant Oskar, change as a result of changes in caspar expression; reduction of caspar reduces PGC number and the domain of Oskar protein expression, while overexpression of caspar does the reverse. They also observe defects in syncytial nuclear divisions in embryos produced from caspar mutant mothers. Previous work from the same group demonstrated that Caspar protein interacts with two partners, TER94 and Vap33. In this paper, they show that maternal knockdown of TER94 results in embryonic lethality and some overlap of phenotypes with reduction of caspar, supporting the idea that they may work together in their developmental roles. The authors propose models for how Caspar might carry out its developmental functions. The most specific of these is that Caspar and its partners might regulate oskar mRNA stability by recruiting ubiquitin to the translational regulator Smaug.

      Strengths:

      The work identifies a new factor that is involved in PGC specification and points toward an additional pathway that may be involved in establishing and maintaining an appropriate distribution of Oskar at the posterior pole of the embryo. It also ties together earlier observations about the presence of TER94 in the pole plasm that have not heretofore been linked to a function.

      Weaknesses:

      (1) A PiggyBac insertion allele casp[c04227] is used throughout the paper and referred to as a loss-of-function allele (casp[lof]). While the authors avoid the terms 'null' or 'amorph' and on one occasion refer to the allele as a 'strong hypomorph', nevertheless terming it a 'loss-of-function' allele is misleading. This is because the phenotype of the allele when homozygous is different from the phenotype produced when heterozygous over a deficiency.

      (2) The peptide counts in the mass spectrometry experiment aimed at finding protein partners for Casp are extremely low, except for Casp itself and TER94. Peptide counts of 1-2 seem to me to be of questionable significance.

      (3) The pole bud phenotypes from TER94 knockdown and casp mutant shown in Fig 5 appear to be quite different. These differences are unexplained and seem inconsistent with the model proposed that the two proteins work in a common pathway. Whole embryos should also be shown, as the TER94 KD phenotype could result from a more general dysmorphism.

      (4) Fig 6 is not quantitative, lacking even a second control staining to check for intensity variation artifacts. Therefore it shows that the distribution of Oskar protein changes in the various genotypes, but not convincingly that the level of Oskar changes as the paper claims.

      (5) The error bars are huge in the graphs in Fig 7H, I, and J, and in fact these changes are not statistically significant. Therefore the conclusion that 'Reduction in Casp activity specifically affects Smaug degradation during the MZT' is not supported by the data in this figure.

    3. Reviewer #2 (Public review):

      Summary:

      This study investigated the role of the Caspar (Casp) gene, a Drosophila homolog of human Fas-associated factor-1. It revealed that maternal loss of Casp led to centrosomal and cytoskeletal abnormalities during nuclear cycles in Drosophila early embryogenesis, resulting in defective gastrulation. Moreover, Casp regulates PGC numbers, likely by regulating the levels of Smaug and then Oskar. They demonstrate that Casp protein levels are linearly correlated to the PGC number. The partner protein TER94, an ER protein, shows similar but slightly distinct phenotypes. Based on the deletion mutant analysis, TER94 seems functionally relevant for the observed Casp phenotype. Additionally, it is likely involved in regulating protein degradation during PGC specification.

      Strengths:

      This paper uncovers a new function of the Caspar (Casp) gene, previously known for its role in immune response regulation and NF-kB signaling inhibition. This new function includes nuclear division and PGC formation in early fly embryos. The findings provide crucial insights into how this pathway contributes to the proper establishment of both somatic cells and the germline, particularly in the context of early embryogenesis. This research is therefore of significant interest to cell and developmental biologists.

      Future Research:

      While this study has made significant strides in understanding the role of the Casp gene in early embryogenesis, the functional relationships among molecules shown here (Casp, TER94, Osk) and other genes previously known to regulate these processes remain unclear. This underscores the need for future studies to delve deeper into these relationships and their implications.

    4. Reviewer #3 (Public review):

      Summary:

      Das et al. discovered a maternal role for Caspar (Casp), the Drosophila orthologue of human Fas-associated factor-1 (FAF1), in embryonic development and germ cell formation. They find that Casp interacts with Transitional endoplasmic reticulum 94 (TER94). Loss of Casp or TER94 leads to partial embryonic lethality, correlated with aberrant centrosome behavior and cytoskeletal abnormalities. This suggests that Casp, along with TER94, promotes embryonic development through a still unidentified mechanism. They also find that Casp regulates germ cell number by controlling a key determinant of germ cell formation, Oskar, through its negative regulator, Smaug.

      Strengths:

      Overall, the experiments are well-conducted, and the conclusions of this paper are mostly well-supported by data.

      Weaknesses:

      Some additional controls could be included, and the language could be clarified for accuracy.

    5. Author response:

      The following is the authors’ response to the original reviews.

      This study investigates the role of Caspar (Casp), an orthologue of human Fas-associated factor-1, in regulating the number of primordial germ cells that form during Drosophila embryogenesis. The findings are important in that they reveal an additional pathway involved in germ cell specification and maintenance. The evidence supporting the conclusions is solid, as the authors identify Casp and its binding partner Transitional endoplasmic reticulum 94 (TER94) as factors that influence germ cell numbers. Minor changes to the title, text, and experimental design are recommended.

      We thank the Editors and Reviewers for their overall positive and thoughtful feedback. Based on these comments, we have revised our manuscript. The changes in the manuscript have been highlighted in ‘blue’ font for easy visualisation.

      Reviewer #1 (Public Review):

      Summary:

      The authors were seeking to define the roles of the Drosophila caspar gene in embryonic development and primordial germ cell (PGC) formation. They demonstrate that PGC number, and the distribution of the germ cell determinant Oskar, change as a result of changes in caspar expression; reduction of caspar reduces PGC number and the domain of Oskar protein expression, while overexpression of caspar does the reverse. They also observe defects in syncytial nuclear divisions in embryos produced from caspar mutant mothers. Previous work from the same group demonstrated that Caspar protein interacts with two partners, TER94 and Vap33. In this paper, they show that maternal knockdown of TER94 results in embryonic lethality and some overlap of phenotypes with reduction of caspar, supporting the idea that they may work together in their developmental roles. The authors propose models for how Caspar might carry out its developmental functions. The most specific of these is that Caspar and its partners might regulate oskar mRNA stability by recruiting ubiquitin to the translational regulator Smaug.

      Strengths:

      The work identifies a new factor that is involved in PGC specification and points toward an additional pathway that may be involved in establishing and maintaining an appropriate distribution of Oskar at the posterior pole of the embryo. It also ties together earlier observations about the presence of TER94 in the pole plasm that have not heretofore been linked to a function.

      Weaknesses:

      (1)  A PiggyBac insertion allele casp[c04227] is used throughout the paper and referred to as a loss-of-function allele (casp[lof]). However, this allele does not appear to act strictly as a loss-of-function. Figure 1E shows that some residual Casp protein is present in early embryos produced by casp[lof]/Df females, and this protein is presumably functional as the PiggyBac insertion does not affect the coding region. Also, Figures 1B and 1C show that the phenotypes of casp[lof] homozygotes and casp[lof]/Df are not the same; surprisingly, the homozygous phenotypes are more severe. These observations are unexplained and inconsistent with the insertion being simply a loss-of-function allele. Might there be a second-site mutation in casp[c04227]?

      The term loss-of-function (lof) is used rather than null or amorph. casp[lof] is a strong hypomorph, with residual (and functional) protein estimated in the range of 5-10% of the wild-type level. The casp[c04227] allele was procured from BDSC, and based on the decrease in lethality of casp[lof]/casp(Df) compared to casp[lof], we assume that second-site hits in the casp[lof] line are the reason for the enhanced lethality. For this very reason, we have used casp[lof]/casp(Df) for all subsequent experiments. We also conducted rescue experiments with casp[WT] and various deletion variants of casp wherever possible to confirm specificity.

      (2)  TER94 knockdown phenotypes have been previously published (Zhang et al 2018 PMID 30012668), and their effects on embryonic viability and syncytial mitotic divisions were described there. This paper is inappropriately not cited, and the data in Figure 4 should be presented in the context of what has been published before.

      We apologize for the oversight. Indeed, Zhang et al. (2018) highlighted TER94 as one of the loci uncovered in their screen, and some of the relevant phenotypes are described there. We have referred to their findings at the appropriate junctures as suggested (pg 11, pg 13, pg 15).

      (3)  The peptide counts in the mass spectrometry experiment aimed at finding protein partners for Casp are extremely low, except for Casp itself and TER94. Peptide counts of 1-2 seem to me to be of questionable significance.

      Peptide counts are indeed low, but the fact that they are enriched at all, in comparison to controls, considering that we are using whole embryo lysates rather than isolated PGC lysates, suggests interaction with Casp could be biologically/ functionally meaningful. The data is restricted to the supplementary material and is not analyzed in isolation; we have combined data from multiple mass spectrometry experiments by other researchers to link Casp to pole plasm components.

      (4)  The pole bud phenotypes from TER94 knockdown and casp mutant shown in Fig 5 appear to be quite different. These differences are unexplained and seem inconsistent with the model proposed that the two proteins work in a common pathway. Whole embryos should also be shown, as the TER94 KD phenotype could result from a more general dysmorphism.

      We agree that TER94 KD is a stronger phenotype, with TER94 having essential cell division and patterning roles. In fact, the TER94 RNAi embryos, unlike casp[lof], stall in terms of their developmental program before Stage 4. This has been noted in the earlier study (Zhang et al., 2018). As a result, we focused on pole bud stage embryos that were rare - but present in the collections. We report that PGC from very early TER94 RNAi embryos have fewer pole buds.

      The rationale behind the presumption that these two proteins may work in a common pathway is clear-cut. We have validated the physical interaction using protein lysates from two developmental time points. Satisfyingly, an affinity purification using antibodies against TER94 or Casp invariably enriches the other protein as the primary interacting partner. Our model integrates data from mammalian and fly systems to support the idea that there must be an overlap between TER94/Casp function, with these two proteins working together to engineer the degradation of ubiquitinated Smaug. Future experiments are necessary to confirm and extend this claim.

      (5)  Figure 6 is not quantitative, lacking even a second control staining to check for intensity variation artifacts. Therefore, it shows that the distribution of Oskar protein changes in the various genotypes, but not convincingly that the level of Oskar changes as the paper claims.

      We appreciate that oskar RNA localization is also somewhat altered due to changes in casp levels. We have acknowledged the variability in the various phenotypes, and as such, it is unsurprising that it is also reflected in the Oskar levels. However, it is evident that a statistically significant number of mutant embryos show a decrease in Oskar levels.

      (6)  The error bars are huge in the graphs in Figure 7H, I, and J, leading me to question whether these changes are statistically significant. Calculations of statistical significance are missing from these graphs and need to be added.

      The data in the Western blots represents the whole embryo, as the lysates used are from embryos 0-1, 1-2, 2-3 hrs. We have averaged and plotted data from 5 Western blots. The changes are not statistically significant. Even without the statistical significance, the data for Fig. 7I led us to examine Smaug in the pole cells, rather than in the whole embryo. The pole cell data (Fig8-D3) is striking and led to the conclusion – that Smaug protein perdures in the pole cells during the stages of syncytial/cellular blastoderm.

      (7)  There are many instances of fuzzy and confusing language when describing casp phenotypes. For example, on lines 211-212, it is stated that 'casp[lof] adults are only partially homozygous viable as ~70% embryos laid by the homozygous mutant females failed to hatch into larvae'. Isn't this more accurately described as 'casp[c04227] is a maternal-effect lethal allele with incomplete penetrance'? Another example is on line 1165, what exactly is a 'semi-vital function'?

      We thank the reviewer for reading the manuscript in detail. We have tried to pay attention to reduce the ambiguity and fixed the text accordingly (pg 7, line 214; pg 33, line 1169, word semi-vital is deleted).

      Reviewer #2 (Public Review):

      Summary:

      This study investigated the role of the Caspar (Casp) gene, a Drosophila homolog of human Fas-associated factor-1. It revealed that maternal loss of Casp led to centrosomal and cytoskeletal abnormalities during nuclear cycles in Drosophila early embryogenesis, resulting in defective gastrulation. Moreover, Casp regulates PGC numbers, likely by regulating the levels of Smaug and then Oskar. They demonstrate that Casp protein levels are linearly correlated to the PGC number. The partner protein TER94, an ER protein, shows similar but slightly distinct phenotypes. Based on the deletion mutant analysis, TER94 seems functionally relevant for the observed Casp phenotype. Additionally, it is likely involved in regulating protein degradation during PGC specification.

      Strengths:

      The paper reveals an unexpected function of the maternally produced Casp gene, previously implicated in immune response regulation and NF-kB signaling inhibition, in nuclear division and PGC formation in early fly embryos. Experiments are properly conducted and strongly support the conclusion. The rescue experiment using deletion mutant form is particularly informative as it suggests the requirement of each domain function.

      Weaknesses:

      Functional relationships among molecules shown here (and other genes known to regulate these processes) are still unclear.

      We completely agree with this assessment. In our view this is an interesting albeit initial report. We also appreciate that understanding the mechanistic underpinnings of these results will be critical. We have ensured that our present claims are backed up by data; however, we are fully aware that newer observations will refine or even alter these claims. We are continuing to work on the problem and will hopefully make further inroads into the mechanism in the coming years.

      Reviewer #3 (Public Review):

      Summary:

      Das et al. discovered a maternal role for Caspar (Casp), the Drosophila orthologue of human Fas-associated factor-1 (FAF1), in embryonic development and germ cell formation. They find that Casp interacts with Transitional endoplasmic reticulum 94 (TER94). Loss of Casp or TER94 leads to partial embryonic lethality, correlated with aberrant centrosome behavior and cytoskeletal abnormalities. This suggests that Casp, along with TER94, promotes embryonic development through a still unidentified mechanism. They also find that Casp regulates germ cell number by controlling a key determinant of germ cell formation, Oskar, through its negative regulator, Smaug.

      Strengths:

      Overall, the experiments are well-conducted, and the conclusions of this paper are mostly well-supported by data.

      Weaknesses:

      Some additional controls could be included, and the language could be clarified for accuracy.

      Reviewer #1 (Recommendations For The Authors):

      (1)  The paper is inconsistent in using standard Drosophila nomenclature. Often the name of the mammalian counterpart is used instead. This needs to be cleaned up as it is very confusing to the reader.

      The names of the mammalian counterparts are used deliberately, to underscore the parallels between mammalian and Drosophila function, specifically for the major players in this study: TER94 vs VCP and Caspar vs FAF1. Since we do not have direct biochemical data indicating that TER94/Casp degrades Smaug, we use published mammalian literature to draw parallels. At no point have we swapped terminology casually.

      (2)  The Discussion is far too long and in my view extends too far beyond the experimental data in the paper. As a start for editing, its first two paragraphs (lines 1138-1164) include mostly general statements and could be greatly reduced or eliminated.

      Our aim was to emphasize the repurposing of factors between early development and later/adult stages for different functional contexts. Our laboratory (Ratnaparkhi) works on Casp in terms of its roles in NF-kappa B signalling. We serendipitously stumbled on the embryonic lethality while characterizing the casp[lof] allele, which later led us to examine the function of Casp during embryonic germ cell development.

      (3)  The Introduction is weak in its description of the developmental function of Toll and Dorsal. This could be summarized in a sentence or two.

      As suggested, a few sentences that highlight the developmental function of Toll/Dorsal signalling have been added to the text (pg 3, line 90-92).

      (4)  Even if correctly cited, it is not appropriate to simply reproduce an image from a public database, as was done in Figure S1C. This should be removed.

      Figure S1C has been deleted.

      (5)  The Materials and Methods section should be moved to after the Discussion so it does not interrupt the flow of the Results.

      The Section has been moved as suggested.

      Reviewer #2 (Recommendations For The Authors):

      For general readers, more detailed information about the PGC specification will be helpful in the Introduction or Results section.

      PGC specification is introduced in the text as the story transits from global embryonic effects of casp knockdown to specific effects on PGCs. A few additional sentences have been added to bolster the text (pg 11, first paragraph).

      The Methods section talks about live imaging, but I could not find the experiments in the figures. Are the data available for asynchronous nuclear divisions in the live imaging?

      The live imaging relates to DIC movies that are part of Suppl. Fig 2A. The movies are embedded in an MS PowerPoint slide, which has been uploaded as a PowerPoint (and not a PDF).

      To ensure that the mutant changes the Osk translation rate, showing the Osk RNA level may be helpful.

      oskar RNA localization is quite distinct as compared to Oskar and Vasa protein. It has been shown that oskar RNA is localized to the founder granules and is, in fact, excluded from the germ granules that contain Vasa, Oskar and nos RNA etc. Gavis lab recently reported (Eichler et al., 2020) that ectopic localization of osk RNA in the germ granules is toxic to pole cells. Thus, it will be of interest to analyze whether and how oskar RNA is localized in casp embryos.

      More discussion about the difference between Casp and ter94 phenotypes and potential reasons would be informative.

      TER94 appears to be an essential maternal gene. Hypomorphic knockdown of TER94 using RNAi is sufficient to induce early embryonic lethality. In fact, Zhang et al. (2018), using stronger/earlier maternal drivers, highlighted the lethality and somatic cell division defects caused by the severe loss of TER94. The UBX domain is present in multiple proteins, in addition to Casp. TER94 possibly plays a vital role in the degradation of critical cell cycle proteins, such as cyclins, that need to be degraded for efficient genomic duplication in the 10’ nuclear division cycles that predominate the first few hours of embryogenesis.

      N=3 (Fig1 legend) and N =15 (Fig2). What are those numbers?

      N=3 indicated the number of repeats of the western blot. This reference has been deleted. N=15, represents the number of embryos imaged for data in panels G and H.

      Reviewer #3 (Recommendations For The Authors):

      Major Suggestion:

      (1)  Oskar (Osk) mRNA Localization: Does Osk mRNA localization change upon overexpression or LOF of Casp? Since TER94 has been implicated in Osk mRNA localization (Ruden et al., 2000), this would be a good control to include.

      As mentioned earlier, in the response to editors, data presented in our manuscript indicates that Caspar is unique in its ability to regulate both Oskar levels and centrosome dynamics. As the reviewer pointed out, we are in the process of analysing the possible localization defects in oskar mRNA in the embryos. Since the preliminary data are promising, we are pursuing this carefully to better understand the involvement of Caspar. We are focusing on the ability of Caspar to regulate early nuclear divisions prior to pole cell formation. It is possible that in casp mutant embryos the nuclei/centrosomes that enter the pole plasm are already defective and thus can influence release of the pole plasm components. This needs to be examined carefully, and we are conducting these experiments.

      (2)  Western Blot for Osk Protein: It would also be beneficial to perform a western blot for Osk protein to demonstrate that it is indeed increased upon Casp overexpression.

      This is a good suggestion. However, Oskar antibodies are not readily available, and we have a very limited supply, which has been used for embryo staining experiments. We considered these more useful because, in addition to the absolute levels, staining experiments can reveal the localization pattern. It was thus possible to correlate Oskar function with the pole cell counts in the respective genetic backgrounds.

      (3)  Title Clarification: The title states, "Caspar determines primordial germ cell identity in Drosophila melanogaster." The current experiments do not show that Casp determines germ cell identity. It would be more accurate to conclude that Casp regulates germ cell numbers.

      Please refer to the introductory paragraphs where we explain our views in this regard. We have modified our title to “Caspar specifies primordial germ cell count and identity in Drosophila melanogaster."

      Minor Suggestions:

      (1)  Line 69: Delete the use of "recent" for papers published in 2001 and 2007. These papers are around 20 years old.

      The word has been deleted.

      (2)  Paragraph from Line 110: Consider splitting this paragraph into two for better readability and clarity.

      Paragraph has been split into two; this has improved readability.

      (3)  Line 266: Check and correct the formatting issues in this line.

      Edited, based on suggestion. A line break was added after the title.

      (4)  Line 328: Adding references to earlier studies here will be useful for providing context and supporting information.

      References that introduce Centrosomes and their roles as organizing centres have been added in line 336.

      (5)  Line 564: It is best to avoid using the word "master." Please consider using other terms such as "key" or "principal."

      Edited, based on suggestion.

      (6)  Citations: The authors should also cite Cinalli et al., 2013 for the Gcl reference to ensure comprehensive citation of relevant literature.

      Thank you for the suggestion. The reference has been added on pages 16 and 29.

      (7)  Overall Length: The paper is quite long. If it can be shortened, it will be easier to read. Consider condensing sections where possible without losing essential information.

      The paper is indeed longer than average, but the choice of eLife as the home for this study was, in part, determined by the platform's flexibility regarding length/ word count. It seemed worthwhile to elaborate the text in places to accentuate the novelty of the findings.

      These additions and adjustments would help to further substantiate the claims and improve the clarity of the paper.

      We hope that the claims made in our manuscript are substantiated by the data that are presented. Wherever possible, we have tried to modify the text suitably to improve clarity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The manuscript studies nutrient intake rates for stationary and motile microorganisms to assess the effectiveness of swim vs. stay strategies. This work provides valuable insights on how the different strategies perform in the context of a simplified mathematical model that couples hydrodynamics to nutrient advection and diffusion. The swim and stay strategies are shown to yield similar nutrient flux under a range of conditions.

      Strengths:

      Strengths of the work include (i) the model prediction in Fig. 3 of nutrient flux applied to a range of microorganisms including an entire clade that are known to use different feeding strategies and (ii) a study of the interaction between cilia and absorption coverage showing the robustness of their predictions provided these regions have sufficient overlap.

      We thank the referee for their thorough review of our manuscript and for their constructive feedback.

      Weaknesses: To improve the work, the authors should further expand their discussion of the following points:

      (1) The authors comment that a number of species alternate between sessile and motile behavior. It would be helpful to discuss what is known about what causes switching between these modes and whether this provides insights regarding the advantages of the different behaviors.

      The transition between sessile and motile states is often influenced by external environmental conditions, such as prey availability and predator presence, which determine the most advantageous state at any given time. For instance, members of the genus Stentor are known to detach from their colonies and exhibit solitary swimming behavior in response to low prey abundance (Tartar, 2013) or when avoiding predators (Dexter et al. 2019). Similarly, the transition in Vorticella is influenced by chemical cues, such as pH (Baufer et al., 1999) or algae concentration (Langlois, 1975).

      References:

      Dexter, J. P., Prabakaran, S., & Gunawardena, J. (2019). A complex hierarchy of avoidance behaviors in a single-cell eukaryote. Current Biology, 29(24), 4323-4329.

      Tartar, V. (2013). The Biology of Stentor: International Series of Monographs on Pure and Applied Biology: Zoology. Elsevier.

      Baufer, P. J. D., Amin, A. A., Pak, S. C., & Buhse Jr., H. E. (1999). A method for the synchronous induction of large numbers of telotrochs in Vorticella convallaria by monocalcium phosphate at low pH. Journal of Eukaryotic Microbiology, 46(1), 12-16.

      Langlois, G. A. (1975). Effect of algal exudates on substratum selection by motile telotrochs of the marine peritrich ciliate Vorticella marina. The Journal of Protozoology, 22(1), 115-123.

      (2) An encounter zone of R=1.1a appears to be used throughout the manuscript, but I could not find a biological justification for this particular value. The results appear to be quite sensitive to this choice, as shown in Supplement Fig. 3(B). In the Discussion, it is mentioned that using a much larger exclusion zone leads to significantly different nutrient flux, and it is implied that such a large exclusion zone is not biologically plausible, but this was not explained sufficiently.

      Thank you for pointing this out. We chose the value of the encounter zone based on a rough calculation of cilia length relative to body length. Cilia are typically of the order of 10 microns in length, and the cell body of a ciliate is typically of the order of 100-1000 microns. 

      For example, in the work of Jiang and Buskey (2024, parts I and II), nutrient encounter is reported at the leading edge of the ciliary band in Strombidium and Amphorides. Here, cilia appear to be about 20% of the body length and the particles are absorbed quite close to the cell surface. A similar encounter near the cell surface is reported in Gilmour (1978) and Thomazo et al. (2021).

      In the theoretical model of Andersen and Kiørboe (2020), a much larger encounter zone is used, extending 10 times the body length (that is, an encounter zone that is 1000% of the body length). This is obviously not biologically justifiable.

      We edited the manuscript to better justify our choices and provide supporting references. 

      References:

      Andersen, A., & Kiørboe, T. (2020). The effect of tethering on the clearance rate of suspension-feeding plankton. Proceedings of the National Academy of Sciences, 117(48), 30101-30103.

      Jiang, H., & Buskey, E. J. (2024). Relating ciliary propulsion morphology and flow to particle acquisition in marine planktonic ciliates II: the oligotrich ciliate Strombidium capitatum. Journal of Plankton Research, fbae011.

      Jiang, H., & Buskey, E. J. (2024). Relating ciliary propulsion morphology and flow to particle acquisition in marine planktonic ciliates I: the tintinnid ciliate Amphorides quadrilineata. Journal of Plankton Research, fbae012.

      Gilmour, T. H. J. (1978). Ciliation and function of the food-collecting and waste-rejecting organs of lophophorates. Canadian Journal of Zoology, 56(10), 2142-2155.

      Thomazo, J. B., Le Révérend, B., Pontani, L. L., Prevost, A. M., & Wandersman, E. (2021). A bending fluctuation-based mechanism for particle detection by ciliated structures. Proceedings of the National Academy of Sciences, 118(31), e2020402118.

      (3) In the schematic in Fig. 2(B), it was unclear whether the encounter zone in the envelope model is defined analogously to the Stokeslet model or whether a different formulation is used.

      Yes, we defined the encounter zone in the same way in both models. In fact, we used two metrics for evaluating nutrient uptake: one considers only the fluid flow rate through an encounter zone; the other considers the mass transport within the fluid and absorption at the entire ciliary surface. The first metric, the clearance rate Q, is evaluated by calculating the flow rate past an annular disk and is applied consistently to all models, as depicted in Figure 2(B). The second metric, the nutrient uptake rate, which we define as the dimensionless integral of the mass flux over the entire spherical surface, is also applied consistently to all models to evaluate the Sherwood number (Sh). Both metrics are evaluated for the Stokeslet and envelope models.

      We edited the main text to further clarify these two metrics in the revision.
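
      The first metric can be illustrated with a short numerical sketch. The function below integrates an axial velocity profile over the annular disk a <= r <= R to obtain a clearance rate Q. The velocity profile used here (uniform Stokes flow past a fixed no-slip sphere, evaluated in the equatorial plane) is only a stand-in chosen because it has a closed form; it is not the Stokeslet or envelope model of the manuscript, and the function names and default values are hypothetical.

```python
import math


def clearance_rate(u_axial, a, R, n=10_000):
    """Flow rate through the annular disk a <= r <= R.

    Approximates Q = integral of u_axial(r) * 2*pi*r dr with the midpoint rule.
    """
    dr = (R - a) / n
    total = 0.0
    for i in range(n):
        r = a + (i + 0.5) * dr  # midpoint of the i-th radial strip
        total += u_axial(r) * 2.0 * math.pi * r * dr
    return total


def stokes_equatorial_speed(r, a=1.0, U=1.0):
    """Axial speed in the equatorial plane for uniform Stokes flow past a
    fixed no-slip sphere of radius a (illustrative stand-in profile only)."""
    return U * (1.0 - 3.0 * a / (4.0 * r) - a**3 / (4.0 * r**3))


if __name__ == "__main__":
    a, R = 1.0, 1.1  # encounter zone R = 1.1a, as discussed in the review
    q_uniform = clearance_rate(lambda r: 1.0, a, R)
    q_stokes = clearance_rate(stokes_equatorial_speed, a, R)
    print(q_uniform, q_stokes)
```

      Because the no-slip condition forces the velocity to vanish at r = a, the flow rate through a thin annulus is much smaller than the uniform-flow value, which is one way to see why Q can be sensitive to the choice of R.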

      (4) The force balance argument should be clarified. Equation (3) of the supplement gives the force-velocity relation in the motile case. Since equation (4), which the authors state is the net force in the sessile case, seems to involve the same expression, would it not follow from U=0 in the sessile case that one would simply obtain quiescent flow with Fcilia = 0?

      The force balance equations for the model organism differ between the motile and sessile modes. In the submitted version, SI Eq.(3) and SI Eq.(4) are derived from different force balance equations, where the velocity U does not appear in the sessile Stokeslet model.

      Author response image 1.

      For the Stokeslet model, the force generated by the flagella acting on the fluid is modeled as a point force F.

      Motile Stokeslet model:

      The force balance on the sphere is given by:

      Here T is the thrust force generated by the flagella in the direction of swimming, the drag force is that on a sphere moving through the fluid with speed U, and K is the hydrodynamic force acting on the sphere by the flow generated by the point force F. For a given strength of the Stokeslet, the swimming speed U can be calculated from the force balance.

      Sessile Stokeslet model:

      The force balance on the sphere is given by:

      Where , T= -F, and K are defined as above. Similarly, for a given point force F, the required force provided by a stalk to fix the sphere can be calculated by the force balance.

      Therefore, SI Eq. (3) and (4) are not directly applicable across both the Stokeslet and envelope models. While the expressions appear similar due to the presence of the forces F and K, separate calculations are needed depending on the force model.
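
      The two balances described verbally above can be written schematically as follows. This is a sketch under stated assumptions rather than a transcription of SI Eq. (3) and (4): the symbols D (drag) and S (stalk force) and the viscosity \mu are labels introduced here, and the drag is assumed to take the Stokes form for a sphere of radius a.

```latex
% Motile Stokeslet model: thrust, drag, and flow-induced force sum to zero
\mathbf{T} + \mathbf{D} + \mathbf{K} = \mathbf{0},
\qquad \mathbf{T} = -\mathbf{F},
\qquad \mathbf{D} = -6\pi\mu a\,\mathbf{U}   % assumed Stokes drag

% Sessile Stokeslet model: U = 0, so the drag is replaced by the stalk force S
\mathbf{T} + \mathbf{K} + \mathbf{S} = \mathbf{0}
\quad\Longrightarrow\quad
\mathbf{S} = \mathbf{F} - \mathbf{K}
```

      In this reading, setting U = 0 does not force F to vanish: the stalk supplies whatever force is needed to hold the sphere in place while the cilia continue to drive flow.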

      We edited the SI document and SI Figure 3 to clarify this.

      Reference:

      Andersen, A., & Kiørboe, T. (2020). The effect of tethering on the clearance rate of suspension-feeding plankton. Proceedings of the National Academy of Sciences, 117(48), 30101-30103.

      Reviewer #2 (Public Review):

      Summary:

      The authors have collected a significant amount of data from the literature on the flow regimes associated with microorganisms whose propulsion is achieved through the action of cilia or flagella, with particular interest in the competition between sessile and motile lifestyles. They then use several distinct hydrodynamic models for the cilia-driven flows to quantify the nutrient uptake and clearance rate, reported as a function of the Peclet number. Among the interesting conclusions the authors draw concerns the question of whether, for certain ciliates, there is a clear difference in nutrient uptake rates in the sessile versus motile forms. The authors show that this is not the case, thereby suggesting that the evolutionary pressure associated with such a difference is not present. The analysis also includes numerical calculations of the uptake rate for spherical swimmers in the regime of large Peclet numbers, where the authors note an enhancement due to advection-generated thinning of the solutal boundary layer around the organism.

      Strengths:

      In addressing the whole range of organism sizes and Peclet numbers the authors have achieved an important broad perspective on the problem of nutrient uptake of ciliates, with implications for understanding evolutionary driving forces toward particular lifestyles (e.g. sessile versus motile).

      We thank the referee for their thorough review of our manuscript and for their feedback regarding the inclusion of more relevant references.

      Weaknesses:

      The authors appear to be unaware of rather similar calculations that were done some years ago in the context of Volvox, in which the issue of the boundary layer size and nutrient uptake enhancement were clearly recognized [M.B. Short, et al., Flows Driven by Flagella of Multicellular Organisms Enhance Long-Range Molecular Transport, PNAS 103, 8315-8319 (2006)]. This reference also introduced the model of a fixed shear stress at the surface of the sphere as a representation of the action of the cilia, which may be more realistic than the squirmer-type boundary condition, although the two lead to similar large-Pe scalings.

      We apologize for having omitted this reference from the submitted version of the manuscript. We have read this work thoroughly; it is indeed highly relevant to the present study.

      The findings reported in Figure 4, that the uptake rate is robust to variations in cilia coverage and absorption fraction, are similar in spirit to an observation made recently in the context of the somatic cell neighbourhood areas in Volvox [Day, et al., eLife 11, e72707 (2022)]. There, it was found that while there is a broad distribution of those areas, and hence of the coarse-grained tangential flagellar force acting on the fluid, the propulsion speed is rather insensitive to those variations.

      Thank you for pointing us to the work of Day, et al., eLife 11, e72707 (2022). We did not know about this study and had not read it before. The work is broadly relevant to our study, and we added a reference to this work in the discussion.

    2. eLife Assessment

      This important paper addresses the role of fluid flows in nutrient uptake by microorganisms propelled by the action of cilia or flagella. Using a range of mathematical models for the flows created by such appendages, the authors provide convincing evidence that the two strategies of swimming and sessile motion can be competitive. These results will have significant implications for our understanding of the evolution of multicellularity in its various forms.

    3. Reviewer #1 (Public review):

      Summary:

      The manuscript studies nutrient intake rates for stationary and motile microorganisms to assess the effectiveness of swim vs. stay strategies. This work provides valuable insights on how the different strategies perform in the context of a simplified mathematical model that couples hydrodynamics to nutrient advection and diffusion. The swim and stay strategies are shown to yield similar nutrient flux under a range of conditions.

      Strengths:

      Strengths of the work include (i) the model prediction in Fig. 3 of nutrient flux applied to a range of microorganisms including an entire clade that are known to use different feeding strategies and (ii) a study of the interaction between cilia and absorption coverage showing the robustness of their predictions provided these regions have sufficient overlap.

      Weaknesses:

      In the revision, the authors have adequately addressed the weaknesses I raised in the first round of review.

    4. Reviewer #2 (Public review):

      Summary:

      The authors have collected a significant amount of data from the literature on the flow regimes associated with microorganisms whose propulsion is achieved through the action of cilia or flagella, with particular interest in the competition between sessile and motile lifestyles. They then use several distinct hydrodynamic models for the cilia-driven flows to quantify the nutrient uptake and clearance rate, reported as a function of the Peclet number. Among the interesting conclusions the authors draw concerns the question of whether, for certain ciliates, there is a clear difference in nutrient uptake rates in the sessile versus motile forms. The authors show that this is not the case, thereby suggesting that the evolutionary pressure associated with such a difference is not present. The analysis also includes numerical calculations of the uptake rate for spherical swimmers in the regime of large Peclet numbers, where the authors note an enhancement due to advection-generated thinning of the solutal boundary layer around the organism.

      Strengths:

      In addressing the whole range of organism sizes and Peclet numbers the authors have achieved an important broad perspective on the problem of nutrient uptake of ciliates, with implications for understanding evolutionary driving forces toward particular lifestyles (e.g. sessile versus motile).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public Review):

      Summary: Wilmes and colleagues present a computational model of a cortical circuit for predictive processing which tackles the issue of how to learn predictions when different levels of uncertainty are present for the predicted sensory stimulus. When a predicted sensory outcome is highly variable, deviations from the average expected stimulus should evoke prediction errors that have less impact on updating the prediction of the mean stimulus. In the presented model, layer 2/3 pyramidal neurons represent either positive or negative prediction errors, SST neurons mediate the subtractive comparison between prediction and sensory input, and PV neurons represent the expected variance of sensory outcomes. PVs therefore can control the learning rate by divisively inhibiting prediction error neurons such that they are activated less, and exert less influence on updating predictions, under conditions of high uncertainty.
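
      The core idea summarized above, that prediction errors are scaled down when the expected variance is high, can be sketched in a few lines. This is a minimal illustration assuming a scalar stimulus and simple running estimates of mean and variance; it is not the authors' circuit model, and the function name, learning rates, and variance floor are hypothetical.

```python
def update_prediction(mu, var, stimulus, eta_mu=0.1, eta_var=0.1, var_floor=1e-3):
    """One learning step with an uncertainty-scaled prediction error.

    mu and var are the current estimates of the stimulus mean and variance.
    """
    raw_error = stimulus - mu                          # subtractive comparison (SST-like role)
    scaled_error = raw_error / max(var, var_floor)     # divisive scaling by variance (PV-like role)
    new_mu = mu + eta_mu * scaled_error                # the mean moves less when variance is high
    new_var = var + eta_var * (raw_error ** 2 - var)   # running estimate of the variance
    return new_mu, new_var


if __name__ == "__main__":
    # Same surprise (stimulus 2.0 against a prediction of 0.0)
    # under low and high expected variance.
    print(update_prediction(0.0, 1.0, 2.0))
    print(update_prediction(0.0, 4.0, 2.0))
```

      With the same raw error, the update to the predicted mean is four times smaller when the expected variance is four times larger, which is the sense in which the variance estimate acts as a context-dependent learning rate.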

      Strengths: The presented model is a very nice solution to altering the learning rate in a modality and context-specific way according to expected uncertainty and, importantly, the model makes clear, experimentally testable predictions for interneuron and pyramidal neuron activity. This is therefore an important piece of modelling work for those working on cortical and/or predictive processing and learning. The model is largely well-grounded in what we know of the cortical circuit.

      Weaknesses: Currently, the model has not been challenged with experimental data, presumably because data from an adequate paradigm is not yet available. I therefore only have minor comments regarding the biological plausibility of the model:

      Beyond the fact that some papers show SSTs mediate subtractive inhibition and PVs mediate divisive inhibition, the selection of interneuron types for the different roles could be argued further, given existing knowledge of their properties. For instance, is a high PV baseline firing rate, or broad sensory tuning that is often interpreted as a ’pooling’ of pyramidal inputs, compatible with or predicted by the model?

      Thank you for this nice suggestion. We added a section to the discussion expanding on this: “The model predicts that the divisive interneuron type, which we here suggest to be the PVs, receive a representation of the stimulus as an input. PVs could be pooling the inputs from stimulus-responsive layer 2/3 neurons to estimate uncertainty. The more the stimulus varies, the larger the variability of the pyramidal neuron responses and, hence, the variability of the PV activity. The broader sensory tuning of PVs (Cottam et al. 2013) is in line with the model insofar as uncertainty modulation could be more general than the specific feature, which is more likely for low-level features processed in primary sensory cortices. PVs were shown to connect more to pyramidal cells with similar feature-tuning (Znamenskyiy et al. 2024); this would be in line with the model, as uncertainty modulation should be feature-related. In our model, some SSTs deliver the prediction to the positive prediction error neurons. SSTs are already known to be involved in spatial prediction, as they underlie the effect of surround suppression (Adesnik et al. 2012), in which SSTs suppress the local activity dependent on a predictive surround.”

      On a related note, SSTs are thought to primarily target the apical dendrite, while PVs mediate perisomatic inhibition, so the different roles of the interneurons in the model make sense, particularly for negative PE neurons, where a top-down excitatory predicted mean is first subtractively compared with the sensory input, s, prior to division by the variance. However, sensory input is typically thought of as arising ’bottom-up’, via layer 4, so the model may match the circuit anatomy less in the case of positive PE neurons, where the diagram shows ’s’ arising in a top-down manner. Do the authors have a justification for this choice?

      We agree that ‘s’ is a bottom-up input and should have been clearer that we do not consider ‘s’ to be a top-down input like the prediction. We hence adjusted the figure correspondingly and added a few clarifying sentences to the manuscript. The reviewer, however, raises an important point that is not discussed enough: if the bottom-up input ‘s’ comes from L4, how can it be compared in a subtractive manner with the top-down prediction arriving in the superficial layers? In Attinger et al. it was shown that the visual stimulus had subtractive effects on SST neurons. The axonal fibers delivering the stimulus information are hence likely to arrive in the vicinity of the apical dendrites, where SSTs target pyramidal cells. Hence, those axons delivering stimulus information could also target the apical dendrites of pyramidal cells. As the reviewer probably had in mind, L4 input tends to arrive in the somatic layer. However, there are also stimulus-responsive cells in layer 2/3, such that the stimulus information does not need to come directly from L4; it could be relayed via those stimulus-responsive layer 2/3 cells. It has been shown that L2/3→L3 axons are mostly located in the upper basal dendrites and the apical oblique dendrites, above the input from L4 (Petreanu et al. The subcellular organization of neocortical excitatory connections). Hence, stimulus information could arrive on the apical dendrites, and be subtractively modulated by SSTs. We would also like to note that the model does not take into account the precise dendritic location of the inputs. The model only assumes that the difference between stimulus and prediction is calculated before the divisive modulation by the variance.

      In cortical circuits, assuming a 2:8 ratio of inhibitory to excitatory neurons, there are at least 10 pyramidal neurons to each SST and PV neuron. Pyramidal neurons are also typically much more selective about the type of sensory stimuli they respond to compared to these interneuron classes (e.g., Kerlin et al., 2012, Neuron). A nice feature of the proposed model is that the same interneurons can provide predictions of the mean and variance of the stimulus in a predictor-dependent manner. However, in a scenario where you have two types of sensory stimulus to predict (e.g., two different whiskers stimulated), with pyramidal neurons selective for prediction errors in one or the other, what does the model predict? Would you need specific SST and PV circuits for each type of predicted stimulus?

      If we understand correctly, this would be a scenario in which the same context (e.g., sound) is predicting two types of sensory stimulus. In that case, one may need specific SST and PV circuits for the different error neurons selective for prediction errors in these stimuli, depending on how different the predictions are for the two stimuli as we elaborate in the following. The reviewer is raising an important point here and that is why we added a section to the discussion elaborating on it.

      We think that there is a reason why interneurons are less selective than pyramidal cells and that this is also a feature in prediction error circuits. Similarly-tuned cells are more connected to each other, because they tend to be activated together as the stimuli they encode tend to be present in the environment together. Also, error neurons selective to nearby whiskers are more likely to receive similar stimulus information, and hence similar predictions. Hence, because nearby whiskers are more likely to be deflected similarly, a circuit structure may have emerged during development such that neurons selective for prediction errors of nearby whiskers receive inputs from the same inhibitory interneurons. In that case, the same SST and PV cells could innervate those different neurons. If, however, the sensory stimuli to be predicted are very different, such that their representations are likely to be located far away from each other, then it also makes sense that the predictions for those stimuli are more diverse, and hence the error neurons selective to these are unlikely to be innervated by the same interneurons.

      We added a shorter version of this to the discussion: “The lower selectivity of interneurons in comparison to pyramidal cells could be a feature in prediction error circuits. Error neurons selective to similar stimuli are more likely to receive similar stimulus information, and hence similar predictions. Therefore, a circuit structure may have developed such that prediction error neurons with similar selectivity may receive inputs from the same inhibitory interneurons.”

      Reviewer 2 (Public Review):

      Summary: This computational modeling study addresses the observation that variable observations are interpreted differently depending on how much uncertainty an agent expects from its environment. That is, the same mismatch between a stimulus and an expected stimulus would be less significant, and specifically would represent a smaller prediction error, in an environment with a high degree of variability than in one where observations have historically been similar to each other. The authors show that if two different classes of inhibitory interneurons, the PV and SST cells, (1) encode different aspects of a stimulus distribution and (2) act in different (divisive vs. subtractive) ways, and if (3) synaptic weights evolve in a way that causes the impact of certain inputs to balance the firing rates of the targets of those inputs, then pyramidal neurons in layer 2/3 of canonical cortical circuits can indeed encode uncertainty-modulated prediction errors. To achieve this result, SST neurons learn to represent the mean of a stimulus distribution and PV neurons its variance.

      The impact of uncertainty on prediction errors is an understudied topic, and this study provides an intriguing and elegant new framework for how this impact could be achieved and what effects it could produce. The ideas here differ from past proposals about how neuronal firing represents uncertainty. The developed theory is accompanied by several predictions for future experimental testing, including the existence of different forms of coding by different subclasses of PV interneurons, which target different sets of SST interneurons (as well as pyramidal cells). The authors are able to point to some experimental observations that are at least consistent with their computational results. The simulations shown demonstrate that if we accept its assumptions, then the authors’ theory works very well: SSTs learn to represent the mean of a stimulus distribution, PVs learn to estimate its variance, firing rates of other model neurons scale as they should, and the level of uncertainty automatically tunes the learning rate, so that variable observations are less impactful in a high uncertainty setting.

      Strengths: The ideas in this work are novel and elegant, and they are instantiated in a progression of simulations that demonstrate the behavior of the circuit. The framework used by the authors is biologically plausible and matches some known biological data. The results attained, as well as the assumptions that go into the theory, provide several predictions for future experimental testing.

      Weaknesses: Overall, I found this manuscript to be frustrating to read and to try to understand in detail, especially the Results section from the UPE/Figure 4 part to the end and parts of the Methods section. I don’t think the main ideas are so complicated, and it should be possible to provide a much clearer presentation.

      For me, one source of confusion is the comparison across Figure 1EF, Figure 2A, Figure 3A, Figure 4AB, and Figure 5A. All of these are meant to be schematics of the same circuit (although with an extra neuron in Figure 5), yet other than Figures 1EF and 4AB, no two are the same! There should be a clear, consistent schematic used, with identical labeling of input sources, neuron types, etc. across all of these panels.

      We changed all figures to make them more consistent and pointed out that we consider subparts of the circuit.

      The flow of the Results section overall is clear until the “Calculation of the UPE in Layer 2/3 error neurons” and Figure 4, where I find that things become significantly more confusing. The mention of NMDA and calcium spikes comes out of the blue, and it’s not clear to me how this fits into the authors’ theory. Moreover: Why would this property of pyramidal cells cause the PV firing rate to increase as stated? The authors refer to one set of weights (from SSTs to UPE) needing to match two targets (weights from s to UPE and weights from mean representation to UPE); how can one set of weights match two targets? Why do the authors mention “out-of-distribution detection” here when that property is not explored later in the paper? (see also below for other comments on Figure 4)

      We agree that the introduction of NMDA and calcium spikes was too short and understand that it was confusing. We therefore modified and expanded the section. To answer the two specific questions: First, why would this property of pyramidal cells cause the PV firing rate to increase as stated? This property of pyramidal cells does not cause the PV firing rate to increase. When, for example, the mean input to positive error neurons increases, the PVs receive higher stimulus input on average, which is not compensated by the inhibitory prediction (which is still at the old mean), such that the PV firing rate increases. Due to the nonlinear integration in PVs, the firing rate can increase a lot and inhibit the error neurons strongly. If the error neurons integrate the difference nonlinearly, they compensate for the increased inhibition by PVs. In Figure 5, we show that a circuit in which error neurons exhibit a dendritic nonlinearity matches an idealised circuit in which the PVs perfectly represent the variance. We modified the text to clarify this.

      Second, how can one set of weights match two targets? In our model, one set of weights does not need to match two targets. We apologise that this was written in such a confusing way. In positive error neurons, the inhibitory weights from the SSTs need to match the excitatory weights from the stimulus, and in negative error neurons, the inhibitory weights from the SSTs need to match the excitatory weights from the prediction. The weights in positive and negative circuits do not need to be the same. So, on a particular error neuron, the inhibition needs to match the excitation to maintain EI balance. Given experimental evidence for EI balance and heterosynaptic plasticity, we think that this constraint is biologically achievable. The inhibitory and excitatory synapses that need to match are targeting the same postsynaptic neuron and could hence have access to their postsynaptic effect. We modified the text to be clearer. Finally, we removed the mention of out-of-distribution detection, see our reply below.
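      The weight-matching requirement described in this reply can be illustrated with a minimal sketch. It assumes a rectified-linear positive error neuron whose excitatory drive is the stimulus and whose inhibitory drive is an SST rate equal to the predicted mean. The function name and the parameters `w_exc` and `w_inh` are hypothetical and are not taken from the manuscript.

```python
def positive_error_drive(s, mu, w_exc, w_inh):
    """Rectified input to a hypothetical positive error neuron.

    s     : stimulus rate (excitatory input)
    mu    : SST rate, assumed here to equal the predicted mean
    w_exc : weight from the stimulus onto the error neuron (illustrative)
    w_inh : weight from the SST onto the error neuron (illustrative)
    """
    return max(0.0, w_exc * s - w_inh * mu)


# Matched weights: the neuron is silent when the stimulus equals the prediction
# and responds in proportion to the positive mismatch otherwise.
matched_at_mean = positive_error_drive(5.0, 5.0, w_exc=1.0, w_inh=1.0)
matched_above_mean = positive_error_drive(6.0, 5.0, w_exc=1.0, w_inh=1.0)

# Unmatched weights: a spurious response remains even without any mismatch.
unmatched_at_mean = positive_error_drive(5.0, 5.0, w_exc=1.2, w_inh=1.0)
```

      Under these assumptions, only the excitation and inhibition converging on the same postsynaptic neuron need to balance, which is the sense in which one set of weights matches one target.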

      Coming back to one of the points in the previous paragraph: How realistic is this exact matching of weights, as well as the weight matching that the theory requires in terms of the weights from the SSTs to the PVs and the weights from the stimuli to the PVs? This point should receive significant elaboration in the discussion, with biological evidence provided. I would not advocate for the authors’ uncertainty prediction theory, despite its elegant aspects, without some evidence that this weight matching occurs in the brain. Also, the authors point out on page 3 that unlike their theory, “...SSTs can also have divisive effects, and PVs can have subtractive effects, dependent on circuit and postsynaptic properties”. This should be revisited in the Discussion, and the authors should explain why these effects are not problematic for their theory. In a similar vein, this work assumes the existence of two different populations of SST neurons with distinct UPE (pyramidal) targets. The Discussion doesn’t say much about any evidence for this assumption, which should be more thoroughly discussed and justified.

      These are very important points, we agree that the biological plausibility of the model’s predictions should be discussed and hence expanded the discussion with three new paragraphs:

      To enable the comparison between predictions and sensory information via subtractive inhibition, we pointed out that the weights of those inputs on the postsynaptic neuron need to match. This essentially means that there needs to be a balance of excitatory and inhibitory inputs. Such an EI balance has been observed experimentally (Tan and Wehr, 2009), and it has previously been suggested that error responses are the result of breaking this EI balance (Hertäg and Sprekeler, 2020, Barry and Gerstner, 2024). Heterosynaptic plasticity is a possible mechanism to achieve EI balance (Field et al. 2020). For example, spike pairing in pre- and postsynaptic neurons induces long-term potentiation at co-activated excitatory and inhibitory synapses with the degree of inhibitory potentiation depending on the evoked excitation (D’amour and Froemke, 2015), which can normalise EI balance (Field et al. 2020).

      In the model we propose, SSTs should be subtractive and PVs divisive. However, SSTs can also be divisive, and PVs subtractive, depending on circuit and postsynaptic properties (Seybold et al. 2015, Lee et al. 2012, Dorsett et al. 2021). This does not necessarily contradict our model, as circuits in which SSTs are divisive and PVs subtractive could implement a different function, as not all pyramidal cells are error neurons. Hence, our model suggests that error neurons which can calculate UPEs should have similar physiological properties to the layer 2/3 cells observed in the study by Wilson et al. 2012.

      Our model further posits the existence of two distinct subtypes of SSTs in positive and negative error circuits. Indeed, there are many different subtypes of SSTs. SST is expressed by a large population of interneurons, which can be further subdivided. There is e.g. a type called SST44, which was shown to specifically respond when the animal corrects a movement (Green et al. 2023). Our proposal is hence aligned with the observation of functionally specialised subtypes of SSTs.

      Finally, I think this is a paper that would have been clearer if the equations had been interspersed within the results. Within the given format, I think the authors should include many more references to the Methods section, with specific equation numbers, where they are relevant throughout the Results section. The lack of clarity is certainly made worse by the current state of the Methods section, where there is far too much repetition and poor ordering of material throughout.

      We implemented the reviewer’s detailed and helpful suggestions on how to improve the ordering and other aspects of the methods section and now either intersperse the equations within the results or refer to the relevant equation number from the Methods section within the Results section.

      Reviewer 3 (Public Review):

      Summary: The authors proposed a normative principle for how the brain’s internal estimate of an observed sensory variable should be updated during each individual observation. In particular, they propose that the update size should be inversely proportional to the variance of the variable. They then proposed a microcircuit model of how such an update can be implemented, in particular incorporating two types of interneurons and their subtractive and divisive inhibition onto pyramidal neurons. One type should represent the estimated mean while another represents the estimated variance. The authors used simulations to show that the model works as expected.

      Strengths: The paper addresses two important issues: how uncertainty is represented and used in the brain, and the role of inhibitory neurons in neural computation. The proposed circuit and learning rules are simple enough to be plausible. They also work well for the designated purposes. The paper is also well-written and easy to follow.

      Weaknesses: I have concerns with two aspects of this work.

      (1) The optimality analysis leading to Eq (1) appears simplistic. The learning setting the authors describe (estimating the mean of a stationary Gaussian variable from a stream of observations) is a very basic problem in online learning/streaming algorithm literature. In this setting, the real “optimal” estimate is simply the arithmetic average of all samples seen so far. This can be implemented in an online manner with $\hat{\mu}_t = \hat{\mu}_{t-1} + (s_t - \hat{\mu}_{t-1})/t$. This is optimal in the sense that the estimator is always the maximum likelihood estimator given the samples seen up to time $t$. On the other hand, doing gradient descent only converges towards the MLE after a large number of updates. Another critique is that while Eq (1) assumes an estimator of the mean ($\hat{\mu}$), it assumes that the variance is already known. However, in the actual model, the variance also needs to be estimated, and a more sophisticated analysis thus needs to take into account the uncertainty of the variance estimate and so on. Finally, the idea that the update should be inverse to the variance is connected to the well-established idea in neuroscience that more evidence should be integrated over when uncertainty is high. For example, in models of two-alternative forced choices it is known to be optimal to have a longer reaction time when the evidence is noisier.

      We agree with the reviewer that the simple example we gave was not ideal, as it could have been solved much more elegantly without gradient descent. And the reviewer correctly pointed out that our solution was not even optimal. We now present a better example in Figure 7, where the mean of the Gaussian variable is not stationary. Indeed, we did not intend to assume that the Gaussian variable is stationary, as we had in mind that the environment can change and hence also the Gaussian variable. If the mean is constant over time, it is indeed optimal to use the arithmetic mean. However, if the mean changes after many samples, then the maximum likelihood estimator model would be very slow to adapt to the new mean, because t is large and each new stimulus only has a small impact on the estimate. If the mean changes, uncertainty modulation may be useful: if the variance was small before, and the mean changes, then the resulting big error will influence the change in the estimate much more, such that we can more quickly learn the new mean. A combination of the two mechanisms would probably be ideal. We use gradient descent here, because not all optimisation problems the brain needs to solve are that simple. The problem with converging only after a large number of updates is a general problem of the algorithm. Here, we propose how the brain could estimate uncertainty to achieve the uncertainty modulation of inference and learning observed in behavioural studies. To give a more complex example, we present in a new Figure 8 how a hierarchy of UPE circuits can be used for uncertainty-based integration of prior and sensory information, similar to Bayes-optimal integration.

      Yes, indeed, there is well-known behavioural evidence, we would like to thank the reviewer for pointing out this connection to two-alternative forced choice tasks. We now cite this work. Our contribution is not on the already established computational or algorithmic level, but the proposal of a neural implementation of how uncertainty could modulate learning. The variance indeed needs to be estimated for optimal mean updating. That means that in the beginning, there will be non-optimal updating until the variance is learned. However, once the variance is learned, mean-updating can use the learned variance. There may be few variance contexts but many means to be learned, such that variance contexts can be reused. In any case, this is a problem on the algorithmic level, and not so much on the implementational level we are concerned with.
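      The contrast drawn in this exchange, between the online arithmetic mean and a fixed-rate update scaled by the inverse variance, can be sketched as follows. This is an illustration under stated assumptions rather than the simulation behind Figure 7: the variance is taken as known, and the learning rate `eta`, the sample counts, and the means 5 and 8 are arbitrary choices.

```python
import random


def running_average(samples):
    """Online arithmetic mean: mu_t = mu_{t-1} + (s_t - mu_{t-1}) / t."""
    mu = 0.0
    for t, s in enumerate(samples, start=1):
        mu += (s - mu) / t
    return mu


def variance_scaled_estimate(samples, sigma2, eta=0.1, mu0=0.0):
    """Fixed-rate update whose step is scaled by 1 / sigma2 (sigma2 assumed known)."""
    mu = mu0
    for s in samples:
        mu += eta * (s - mu) / sigma2
    return mu


def simulate_mean_shift(seed=0, sigma=0.5, n_before=500, n_after=50):
    """Draw samples whose mean jumps from 5 to 8 and return both final estimates."""
    rng = random.Random(seed)
    samples = [rng.gauss(5.0, sigma) for _ in range(n_before)]
    samples += [rng.gauss(8.0, sigma) for _ in range(n_after)]
    return running_average(samples), variance_scaled_estimate(samples, sigma ** 2)
```

      With these illustrative settings, the running average is expected to remain close to the old mean after the jump, because each new sample is weighted by 1/t, whereas the fixed-rate estimate should end up much closer to the new mean. For a stationary mean the running average remains the better estimator, as the reviewer notes.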

      (2) While the incorporation of different inhibitory cell types into the model is appreciated, it appears to me that the computation performed by the circuit is not novel. Essentially the model implements a running average of the mean and a running average of the variance, and gates updates to the mean with the inverse variance estimate. I am not sure about how much new insight the proposed model adds to our understanding of cortical microcircuits.

      We here suggest an implementation for how uncertainty could modulate learning via influencing prediction error computation. Our model can explain how humans could estimate uncertainty and weight prior versus sensory information accordingly. The focus of our work was not to design a better algorithm for mean and variance estimation, but rather to investigate how specialised prediction error circuits in the brain can implement these operations to provide new experimental hypotheses and predictions.

      Reviewer 1 (Recommendations For The Authors):

      Clarity and conciseness are a strength of this manuscript, but a more comprehensive explanation could improve the reader’s understanding in some instances. This includes the NMDA-based nonlinearity of pyramidal neuron activation - I am a little unclear exactly what problem this solves and how (alongside the significance of 5D and E).

      We agree that the introduction of the NMDA-based nonlinearity was too short and understand that it was confusing. We therefore modified and expanded the section, where we introduce the dendritic nonlinearity of the error neurons.

      Page 5: I think there is a ’positive’ and ’negative’ missing from the following sentence: ’the weights from the SSTs to the UPE neurons need to match the weights from the stimulus s to the UPE neuron and from the mean representation to the UPE neuron, respectively.’

      Thanks for pointing that out! We changed the sentence to be more clear to the following: “To ensure a comparison between the stimulus and the prediction, the inhibition from the SSTs needs to match the excitation it is compared to in the UPE neurons: In the positive PE circuit, the weights from the SSTs representing the prediction to the UPE neurons need to match the weights from the stimulus s to the UPE neurons. In the negative PE circuit, the weights from SSTs representing the stimulus to the negative UPE neurons need to match the weights from the mean representation to the UPE neurons, respectively.”

      Reviewer 2 (Recommendations For The Authors):

      Related to the first point above: I don’t feel that the authors adequately explained what the “s” and “a” information (e.g., in Figures 2A, 3A) represent, where they are coming from, what neurons they impact and in what way (and I believe Fig. 3A is missing one “a” label). I think they should elaborate more fully on these key, foundational details for their theory. To me, the idea of starting from the PV, SST, and pyramidal circuit, and then suddenly introducing the extra R neuron in Figure 5, just adds confusion. If the R neuron is meant to be the source, in practice, of certain inputs to some of the other cell types, then I think that should be included in the circuit from the start. Perhaps a good idea would be to start with two schematics, one in the form of Figure 5A (but with additional labeling for PV, SST) and one like Figure 1EF (but with auditory inputs as well), with a clear indication that the latter is meant to represent a preliminary, reduced form of the former that will be used in some initial tests of the performance of the PV, SST, UPE part of the circuit. Related to the Methods, I also can give a list of some specific complaints (in latex):

      (1) $\phi$, $\phi_{PV}$ are used in equations (10), (11), so they should be defined there, not many equations later.

      Thank you, we changed that.

      (2) β, 1 − β appear without justification or explanation in (11). They are only defined and derived several pages later.

      Thank you, we now define it right at the beginning.

      (3) Equations (10)-(12) should be immediately followed by information about plasticity, rather than deferring that.

      That’s a great idea. We changed it. Now the synaptic dynamics are explained together with the firing rate dynamics.

      (4) After the rate equations (10)-(12) and weight change equations (23)-(25) are presented, the same equations are simply repeated in the “Explanation of the synaptic dynamics” subsection.

      We agree that this was suboptimal. We moved the explanation of the synaptic dynamics up and removed the repetition.

      (5) In the circuit model (13)-(19), it’s not clear why $r_R$ shows up in the SST+ and PV− equations vs. $r_s$ in PV+ and SST−. Moreover, $r_s$ is not even defined! Also, I don’t see why $w_{PV^+,R}$ shows up in the equation for $r_{PV^-}$.

      We added more explanation to the Methods section as to why the neurons receive these inputs and renamed $r_s$ to $s$, which is defined. The “+” in $w_{PV^+,R}$ was a typo. Thank you for spotting that.

      (6) The authors should only number those equations that they will reference by number. Even more importantly, there are many numbers such as (20), (26), (32), (39) that are just floating there without referring to an equation at all.

      Thank you for spotting that. We corrected this.

      (7) The authors fail to specify what $r_a$ is in Figure 8. Moreover, it seems strange to me that $w_{PV,a}$ approaches $\sigma$ rather than $w_{PV,a} r_a$ approaching $\sigma$, since $\phi_{PV}$ is a function of $w_{PV,a} r_a$.

      You are right, $w_{PV,a} r_a$ should approach $\sigma$. However, since $r_a$ is either 1 or 0 to indicate the presence or absence of the cue, and only $w_{PV,a}$ is plastic, $w_{PV,a}$ approaches $\sigma$.

      (8) I don’t understand the rationale for the authors to introduce equation (30) when they already had plasticity equations earlier. What is the relation of (30), (31) to (24)?

      It is the same equation. In (30) we introduce simpler symbols for a better overview of the equations. (31) is equal to (30), with $r_{PV}$ replaced by its steady state.

      (9) η is omitted from (33) - it won’t affect the final result but should be there.

      We fixed this.

      I have many additional specific comments and suggestions, some related to errors that really should have been caught before manuscript submission. I will present these based on the order in which they arise in the manuscript.

      (1) In the abstract, the mention of layer 2/3 comes out of nowhere. Why this layer specifically? Is this meant to be an abstract/general cortical circuit model or to relate to a specific brain area? (Also watch for several minor grammatical issues in the abstract and later.)

      Thank you for pointing this out. We now mention that the observed error neurons can be found in layer 2/3 of diverse brain areas. It is meant to be a general cortical circuit model independent of brain area.

      (2) In par. 2 of the introduction, I find sentences 3-4 to be confusing and vague. Please rewrite what is meant more directly and clearly.

      We tried to improve those sentences.

      (3) Results subtitle 1: “suggests” → “suggest”

      Thank you.

      (4) Be careful to use math font whenever variables, such as $a$ and $N$, are referenced (e.g., use of a instead of $a$ at the bottom of pg. 2).

      We agree and checked the entire manuscript.

      (5) Ref. to Fig. 1B bottom pg. 2 should be Fig. 1CD. The panel order in the figure should then be changed to match how it is referenced.

      We fixed it and matched the ordering of the text with the ordering of the figure.

      (6) Fig. 2C and 3E captions mention std but this is not shown in the figures - should be added.

      It is there, it is just very small.

      (7) Please clarify the relation of Figure 2C to 2F, and Figure 3F to 3H.

      We colour-coded the points in 2F that correspond to the bars in 2C. We did the same for 3F and 3H.

      (8) Figures 3E,3F appear to be identical except for the y-axis label and inclusion of std in 3F. Either more explanation is needed of how these relate or one should be cut.

      The difference is that 3E shows the activity of PVs based on only the sound cue in the absence of a whisker stimulus. And 3F shows the activity of PVs based on both the sound cue and whisker stimuli. We state this more clearly now.

      (9) Bottom of pg. 4: clarify that a quadratic $\phi_{PV}$ is a model assumption, not derived from results in the figure.

      We added that we assume this.

      (10) When k is referenced in the caption of Figure 4, the reader has no idea what it is. More substantially, most panels of Figure 4 are not referenced in the paper. I don’t understand what point the authors are trying to make here with much of this figure. Indeed, since the claim is that the uncertainty prediction should be based on division by $\sigma^2$, why aren’t the numerical values for UPE rates much larger, since σ gets so small? The authors also fail to give enough details about the simulations done to obtain these plots; presumably these are after some sort of (unspecified) convergence, and in response to some sort of (unspecified) stimulus? Coming back to k, I don’t understand why k > 2 is used in addition to k = 2. The text mentions – even italicizes – “out-of-distribution detection”, but this is never mentioned elsewhere in the paper and seems to be outside the true scope of the work (and not demonstrated in Figure 4). Sticking with k = 2 would also allow authors to simply use $(\cdot)^k$ below (10), rather than the awkward positive part function that they have used now.

      We now introduce the equation for the error neurons in Eq. 3 within the text, such that k is introduced before the caption. It also explains why the numerical values do not become much larger. Divisive inhibition, unlike mathematical division, cannot lead to multiplication in neurons. To ensure this, we add 1 to the denominator.

      We show the error neuron responses to stimuli deviating from the learned mean after learning the mean and variance. The deviation is indicated either on the x-axis or in the legend depending on the plot. We now more explicitly state that these plots are obtained after learning the mean and the variance.

      We removed the mention of “out-of-distribution detection”, as a detailed treatment would indeed be outside the scope of this work.
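
      The role of the added 1 in the denominator can be illustrated with a minimal sketch. Eq. 3 of the manuscript is not reproduced in this response, so the functional form below (positive part of the deviation, raised to the power k, divided by 1 plus the variance) is an assumption made for illustration only, and the name `upe_rate` is hypothetical. The sketch shows why the rate stays bounded by the numerator as the variance approaches zero.

```python
def upe_rate(stimulus, mean, variance, k=2.0):
    """Hypothetical UPE+ rate: an assumed form, not the manuscript's Eq. 3."""
    # Positive part: this neuron responds only when the stimulus exceeds the learned mean.
    deviation = max(stimulus - mean, 0.0)
    # Divisive inhibition with 1 added to the denominator: as variance -> 0 the
    # denominator tends to 1, so the rate cannot exceed deviation ** k.
    return deviation ** k / (1.0 + variance)


# Same deviation (0.5 above the mean) under increasing learned variance.
for variance in (0.0, 0.01, 1.0, 4.0):
    print(variance, upe_rate(1.5, 1.0, variance))
```

      With this form, shrinking the variance increases the response only up to the undivided value of the numerator, which is consistent with the statement that divisive inhibition cannot act as a multiplication.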

      (11) Page 5, please clarify what is meant by “weights from the sound...”. You have introduced mathematical notation - use it so that you can be precise.

      We added the mathematical notation, thank you!

      (12) Figure 5D: legend has 5 entries but the figure panel only plots 4 quantities.

      The SST firing rate was below the R firing rate. We hence omitted the SST firing rate and its legend.

      (13) Figure 5: I don’t understand what point is being made about NMDA spikes. The text for Figure 5 refers to NMDA spikes in Figure 4, but nothing was said about NMDA spikes in the text for Figure 4 nor shown in Figure 4 itself.

      We were referring to the nonlinearity in the activation function of UPEs in Figure 4. We changed the text to clarify this point.

      (14) Figure 6: It is too difficult to distinguish the black and purple curves even on a large monitor. Also, the authors fail to define what they mean by “MM” and also do not define the quantities Y+ and Y− that they show. Another confusing aspect is that the model has PV+ and PV− neurons, so why doesn’t the figure?

      Thank you for the comment. We changed the colour for better visibility, replaced the Upsilons with UPE (we changed the notation at some point and forgot to change it in the figure), and defined MM, which is the mismatch stimulus that causes error activity. We did not distinguish between PV+ and PV− in the plot as their activity is the same on average. We plotted the activity of the PV+. We now mention that we show the activity of PV+ as the representative.

      (15) Also Figure 6: The authors do not make it clear in the text whether these are simulation results or cartoons. If the latter, please replace this with actual simulation results.

      They are actual simulation results. We clarified this in the text.

      (16) This work assumes the existence of two different populations of SST neurons with distinct UPE (pyramidal) targets. The Discussion doesn’t say much about any evidence for this assumption, which should be more thoroughly discussed and justified.

      We now discuss this in more detail in the discussion as mentioned in our response to the public review.

      (17) Par. 2 of the discussion refers to “Bayesian” and “Bayes-optimal” several times. Nothing was said earlier in the paper about a Bayesian framework for these results and it’s not clear what the authors mean by referring to Bayes here. This paragraph needs editing so that it clearly relates to the material of the results section and its implications.

      We added an additional results section (the last section with Figure 8) on integrating prior and sensory information based on their uncertainties, which is also the case for Bayes-optimal integration, and show that our model can reproduce the central tendency effect, which is a hallmark of Bayes-optimal behaviour.

      Reviewer 3 (Recommendations For The Authors):

      See public review. I think the gradient-descent type of update the authors do in Equation (1) could be more useful in a more complicated learning scenario where the MLE has no closed form and has to be computed with gradient-based algorithms.

      We responded in detail to your points in our point-by-point response to the public review.

    2. eLife Assessment

      This important study introduces a new cortical circuit model for predictive processing. Simulations effectively illustrate that, with appropriate synaptic plasticity, a canonical layer 2/3 cortical circuit - comprising two classes of interneurons providing subtractive and divisive inhibition - can generate uncertainty-modulated prediction errors by pyramidal neurons. The model is compelling; although it relies on many assumptions and has not yet been compared directly to data, the model does align with empirical observations and yields a range of testable predictions. The study is expected to be of great interest to those involved in cortical and predictive processing research.

    3. Reviewer #2 (Public review):

      Summary:

      This computational modeling study addresses the observation that variable observations are interpreted differently depending on how much uncertainty an agent expects from its environment. That is, the same mismatch between a stimulus and an expected stimulus would be less significant, and specifically would represent a smaller prediction error, in an environment with a high degree of variability than in one where observations have historically been similar to each other. The authors show that if two different classes of inhibitory interneurons, the PV and SST cells, (1) encode different aspects of a stimulus distribution and (2) act in different (divisive vs. subtractive) ways, and if (3) synaptic weights evolve in a way that causes the impact of certain inputs to balance the firing rates of the targets of those inputs, then pyramidal neurons in layer 2/3 of canonical cortical circuits can indeed encode uncertainty-modulated prediction errors. To achieve this result, SST neurons learn to represent the mean of a stimulus distribution and PV neurons its variance.

      The impact of uncertainty on prediction errors is an understudied topic, and this study provides an intriguing and elegant new framework for how this impact could be achieved and what effects it could produce. The ideas here differ from past proposals about how neuronal firing represents uncertainty. The developed theory is accompanied by several predictions for future experimental testing, including the existence of different forms of coding by different subclasses of PV interneurons, which target different sets of SST interneurons (as well as pyramidal cells). The authors are able to point to some experimental observations that are at least consistent with their computational results. The simulations shown demonstrate that if we accept its assumptions, then the authors' theory works very well: SSTs learn to represent the mean of a stimulus distribution, PVs learn to estimate its variance, firing rates of other model neurons scale as they should, and the level of uncertainty automatically tunes the learning rate, so that variable observations are less impactful in a high uncertainty setting.
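
      The idea that one quantity tracks the mean of a stimulus distribution while another tracks its variance can be sketched with a generic online delta rule. This is not the authors' synaptic plasticity rule or circuit model; the function name `learn_mean_and_variance`, the learning rate `eta`, and the alternating test stimulus are all assumptions chosen to keep the example deterministic.

```python
def learn_mean_and_variance(stimuli, eta=0.01):
    """Generic delta-rule estimates of mean and variance (illustrative only)."""
    mean, variance = 0.0, 0.0
    for s in stimuli:
        error = s - mean
        # The mean estimate moves a small step toward each new sample.
        mean += eta * error
        # The variance estimate moves toward the squared deviation from the mean.
        variance += eta * (error ** 2 - variance)
    return mean, variance


# Deterministic stimulus alternating between 4 and 6 (true mean 5, true variance 1).
stimuli = [4.0, 6.0] * 2500
mean, variance = learn_mean_and_variance(stimuli)
print(mean, variance)
```

      Dividing a prediction error by such a variance estimate is one way to picture how higher uncertainty could reduce the impact of a given mismatch.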

      Strengths:

      The ideas in this work are novel and elegant, and they are instantiated in a progression of simulations that demonstrate the behavior of the circuit. The framework used by the authors is biologically plausible and matches some known biological data. The results attained, as well as the assumptions that go into the theory, provide several predictions for future experimental testing. The authors have taken into account earlier review comments to revise their paper in ways that enhance its clarity.

      Weaknesses:

      One weakness could be that the proposed theory does rely on a fairly large number of assumptions. However, there is at least some biological support for these. Importantly, the authors do lay out and discuss their key assumptions in the Discussion section, so readers can assess their validity and implications for themselves.

    4. Reviewer #4 (Public review):

      Summary:

      Wilmes and colleagues develop a model for the computation of uncertainty modulated prediction errors based on an experimentally inspired cortical circuit model for predictive processing. Predictive processing is a promising theory of cortical function. An essential aspect of the model is the idea of precision weighting of prediction errors. There is ample experimental evidence for prediction error responses in cortex. However, a central prediction of the theory is that these prediction error responses are regulated by the uncertainty of the input. Testing this idea experimentally has been difficult due to a lack of concrete models. This work provides one such model and makes experimentally testable predictions.

      Strengths:

      The model proposed is novel and well-implemented. It has sufficient biological accuracy to make useful and testable predictions.

      Weaknesses:

      One key idea the model hinges on is that stimulus uncertainty is encoded in the firing rate of parvalbumin positive interneurons. This assumption, however, is rather speculative and there is no direct evidence for this.

    1. eLife Assessment

      Treatment of Pseudomonas aeruginosa (PA) infection is challenging because of intrinsic and acquired antibiotic resistance to most antibiotic drug classes. Therefore, by using donor B cells in subjects with cystic fibrosis who undergo intermittent or chronic airway PA infections, the authors tried to isolate BCRs against PA virulence factors and examine their biological activities. The data are solid and isolated protective antibodies could be useful for protection against PA.

    2. Joint Public Review:

      Summary:

      This study presents a strategy to efficiently isolate PcrV-specific BCRs from human donors with cystic fibrosis who have/had Pseudomonas aeruginosa (PA) infection. Isolation of mAbs that provide protection against PA may be a key to developing a new strategy to treat PA infection as the PA has intrinsic and acquired resistance to most antibiotic drug classes. Hale et al. developed fluorescently labeled antigen-hook and isolated mAbs with anti-PA activity. Overall, the authors' conclusion is supported by solid data analysis presented in the paper. Four of five recombinantly expressed PcrV-specific mAbs exhibited anti-PA activity in a murine pneumonia challenge model as potent as the V2L2MD mAb (equivalent to gremubamab). However, therapeutic potency for these isolated mAbs is uncertain as the gremubamab has failed in Phase 2 trials. Clarification of this point would greatly benefit this paper.

      Strengths:

      (1) High efficiency of isolating antigen-specific BCRs using an antigenic hook.

      (2) The authors' conclusion is supported by data.

      Weaknesses:

      Although the authors state that the goal of this study was to generate novel protective mAbs for therapeutic use (P12; Para. 2), it is unclear whether PcrV-specific mAbs isolated in this study have therapeutic potential better than the gremubamab, which has failed in Phase 2 trials. Four of five PcrV-specific mAbs isolated in this study reduced bacterial burdens in mice as potent as, but not superior to, gremubamab-equivalent mAb. Clarification of this concern by revising the text or providing experimental results that show better potential than gremubamab would greatly benefit this paper.

    3. Author response:

      Joint Public Review:

      Summary:

      This study presents a strategy to efficiently isolate PcrV-specific BCRs from human donors with cystic fibrosis who have/had Pseudomonas aeruginosa (PA) infection. Isolation of mAbs that provide protection against PA may be a key to developing a new strategy to treat PA infection as the PA has intrinsic and acquired resistance to most antibiotic drug classes. Hale et al. developed fluorescently labeled antigen-hook and isolated mAbs with anti-PA activity. Overall, the authors' conclusion is supported by solid data analysis presented in the paper. Four of five recombinantly expressed PcrV-specific mAbs exhibited anti-PA activity in a murine pneumonia challenge model as potent as the V2L2MD mAb (equivalent to gremubamab). However, therapeutic potency for these isolated mAbs is uncertain as the gremubamab has failed in Phase 2 trials. Clarification of this point would greatly benefit this paper.

      Strengths:

      (1) High efficiency of isolating antigen-specific BCRs using an antigenic hook.

      (2) The authors' conclusion is supported by data.

      Weaknesses:

      Although the authors state that the goal of this study was to generate novel protective mAbs for therapeutic use (P12; Para. 2), it is unclear whether PcrV-specific mAbs isolated in this study have therapeutic potential better than the gremubamab, which has failed in Phase 2 trials. Four of five PcrV-specific mAbs isolated in this study reduced bacterial burdens in mice as potent as, but not superior to, gremubamab-equivalent mAb. Clarification of this concern by revising the text or providing experimental results that show better potential than gremubamab would greatly benefit this paper.

      The authors thank the reviewer for their thoughtful positive assessment. As noted by the reviewer, the studies described here, which were performed in mice, show that our MBC-derived mAbs are as effective as V2L2MD, a mAb that is one component of the gremubamab bi-specific. However, key theoretical strengths of MBC-derived mAbs (reduced immunogenicity, full participation in effector functions) are not easily tested in mice. We have clarified and expanded our discussion of these points in our revised manuscript, particularly in the Discussion paragraph 4.

    1. eLife Assessment

      This valuable study addresses one way in which animals identify predator-associated cues and respond in a manner that reflects the imminence of the potential threat. The report shows that, in mice, fresh saliva from a natural predator (cat) elicits a greater defensive response compared to old cat saliva and implicates the vomeronasal organ and ventromedial hypothalamus as part of a circuit that underlies this process. The evidence supporting the main conclusions is solid. This study will be of interest to those interested in aversive behavior, its processes, and mechanisms.

    2. Reviewer #1 (Public review):

      Summary:

      Animals in natural environments need to identify predator-associated cues and respond with the appropriate behavioral response to survive. In rodents, some chemical cues produced by predators (e.g., cat saliva) are detected by chemosensory neurons in the vomeronasal organ (VNO). The VNO transmits predator-associated information to the accessory olfactory bulb, which in turn projects to the medial amygdala and the bed nucleus of the stria terminalis, two regions implicated in the initiation of antipredator defensive behaviors. A downstream area to these two regions is the ventromedial hypothalamus (VMH), which has been shown to control both active (i.e., flight) and passive (i.e., freezing) antipredator defensive responses via distinct efferent projections to the anterior hypothalamic nucleus or the periaqueductal gray, respectively. However, whether differences in predator-associated sensory information initially processed in the VNO and further conveyed to the VMH can trigger different types of behavioral responses remained unexplored. To address this question, here the authors investigated the behavioral responses of mice exposed to either fresh or old cat saliva, and further compared the underlying neural circuits that are activated by cat saliva with different freshness.

      The scientific question of the study is valid, the experiments were well-performed, and the statistical analyses are appropriate. However, there are some concerns that may directly affect the main interpretation of the results.

      In this revised version of the manuscript, the authors have made important modifications in the text, inserted new experiments and performed additional data analyses, as recommended. These modifications have significantly improved the quality of the manuscript and addressed all the major concerns detected during the prior submission.

    3. Reviewer #2 (Public review):

      In this study, Nguyen et al. showed that cat saliva can robustly induce freezing behavior in mice. This effect is mediated through the accessory olfactory system, as it requires physical contact and is abolished in Trp2 KO mice. The authors further showed that the V2R-A4 cluster is responsive to cat saliva. Lastly, they demonstrated c-Fos induction in the AOB and VMHdm/c by cat saliva. The c-Fos level in the VMHdm/c is correlated with the freezing response.

      Strength:

      The study opens an interesting direction. It reveals the potential neural circuit for detecting cat saliva and driving defense behavior in mice. The behavior results and the critical role of accessory olfactory system in detecting cat saliva are clear and convincing.

      Weakness:

      The findings are relatively preliminary. The identities of the receptor and the ligand in the cat saliva that induces the behavior remain unclear. The identity of VMH cells that are activated by the cat saliva remains unclear. There is a lack of targeted functional manipulation to demonstrate the role of V2R-A4 or VMH cells in the behavioral response to the cat saliva.

      Here are some specific comments:

      (1) This result suggests that V2R-A4 may be the dominant VR for mice to detect cat saliva. Future studies should determine the identity of the receptor and the ligand in the cat saliva. Additionally, the functional importance of V2R-A4 remains unclear. It is important to knockout the receptor and test changes in cat saliva-induced freezing.

      (2) The AOB does not project to the VMH directly. Other known important nodes for the predator defense circuit include MeApv, BNST, PMd, AHN and PAG. It will be helpful to provide c-Fos data in those regions (especially MEA and BNST as they are between AOB and VMH) to provide a complete picture regarding how the brain processes cat saliva to induce the behavior change.

      (3) It is interesting that the difference in VNO activation level between old and fresh cat saliva does not transfer to the AOB. It could be informative to examine the correlation between VNO and AOB pS6/c-Fos cell numbers, and between AOB and VMH c-Fos cell numbers, across animals to understand whether the activation levels across those regions are related. If they are not correlated, it could be helpful to add a discussion regarding potential reasons, e.g. neuromodulatory inputs to the AOB.

      (4) Please indicate n in all figure plots and specify what individual dots mean. In Figure 4h, there are 7 dots in the old saliva group, presumably indicating 7 animals. In Figure 6b, there appear to be more than 7 dots for the old cat saliva group. Are there more than 7 animals? If so, why are they not included in Figure 4h? If not, what does each dot mean? Note that each dot should represent an independent sample. One animal should not contribute more than one dot.

      (5) The identification of a cluster of VMHdm cells uniquely activated by fresh cat saliva urine is interesting. It will be important to identify the molecular handle of the cells to facilitate further investigation. This could be achieved using either activity dependent RNAseq or double in situ of saliva-induced c-Fos and candidate genes (candidate gene may be identified based on the known gene expression pattern).

    4. Reviewer #3 (Public review):

      Summary:

      Nguyen et al show data indicating that the vomeronasal organ (VNO) and ventromedial hypothalamus (VMH) are part of a circuit that elicits defensive responses induced by predator odors. They also suggest that using fresh or old predator saliva may be a method to change the perceived imminence of predation. The authors also identify a family of VNO receptors that are activated by cat saliva. Next, the authors show how different components of this defensive circuit are activated by saliva, as measured by fos expression. The work also shows that different VMH populations are activated by fresh and old saliva, demonstrating that these stimuli create qualitatively different neural activity profiles. However, the exact components that differ between fresh and old saliva remain unknown and may be identified in future studies.

      Strengths:

      (1) Predator saliva is a stimulus of high ethological relevance

      (2) The authors performed a careful quantification of fos induction across the anterior-posterior axis

      (3) Authors show that different VMH populations are activated by fresh and old saliva

      Weaknesses:

      (1) There is a lack of standard circuit dissection methods, such as characterizing the behavioral effects of increasing and decreasing neural activity of relevant cell bodies and axonal projections

      (2) Some of the findings are disconnected from the story. For example, the authors show V2R-A4-expressing cells are activated by predator odors, but the causal role of these cells in generating defensive actions is not shown

    5. Author response:

      The following is the authors’ response to the previous reviews

      We greatly appreciate all the reviewers’ constructive comments on our previously revised manuscript. In the current revision, we added several new experimental results to address the reviewers’ comments. Below we describe our point-by-point responses to their comments:

      Reviewer #1 (Public Review):

      Unaddressed and additional concerns (re-submission)

      In this revised version of the manuscript, the authors have made important modifications in the text, inserted new references, and incorporated additional quantifications of cFos immunolabeling in three brain regions, as recommended by the reviewers. While these modifications have significantly improved the quality of the manuscript, other critical concerns raised during the initial submission of the manuscript (Major concerns 1, 2, and 4; some of them also raised by the other reviewers) were not properly addressed by the authors. On several occasions, the authors recognize the importance of clarifying the points for the correct interpretation of the results but opt for leaving the open questions to be addressed during future studies. Therefore, the authors might consider adding a new section at the end of the manuscript to include all the caveats and future directions.

      In the current revision, to address Reviewer #1’s original concerns 1, 2, and 4, we added several new experimental results.

      Original major concerns 1) and 2): Regarding whether mice are detecting qualitative or quantitative differences between fresh and old cat saliva.

      To address these concerns, as shown in new Figure 1I and J, we measured volumes of saliva contained in individual swabs and total protein concentrations at the time of behavior tests: Fresh (15 minutes after collection) and Old (4 hours after collection). The saliva volumes at the time of behavioral testing were indistinguishable between fresh and old samples (Figure 1I). In addition, the concentrations of total proteins in both fresh and old saliva were also indiscernible (Figure 1J). Furthermore, we also examined the difference in the amount of Fel d 4 protein, one of the most abundant proteins in cat saliva, between fresh and old saliva by conducting western blotting analyses. As shown in new Supplemental Figure 2, the amount of Fel d 4 was nearly equivalent between fresh and old saliva. Indeed, our analyses using recombinant Fel d 4 protein showed that Fel d 4 does not induce freezing behavior (Supplemental Figure 5). Based on these findings, we believe that the difference between fresh and old cat saliva lies in specific components rather than the total or major saliva content. One possible explanation for this difference is the time-dependent reduction of specific freezing-inducing components in old saliva.

      To investigate such a possibility, we also examined mouse behavior directed toward swabs containing diluted fresh cat saliva. Indeed, exposure to diluted fresh saliva resulted in a shorter duration of freezing behavior. Fresh saliva diluted to 70% induced freezing behavior for a duration equivalent to that of undiluted fresh saliva, while freezing behavior in response to 50% and 30% fresh saliva was significantly reduced to the same duration as that observed with old saliva (Figure 1K). The duration of direct interaction with swabs containing 70% and 50–30% fresh saliva also exhibited a similar trend to that observed with fresh and old saliva swabs, respectively (Figure 1L).

      These new results provide compelling evidence that the differential freezing response of mice to fresh versus old cat saliva is not attributed to quantitative differences, such as total volume, total protein concentration, or the amount of major proteins like Fel d 4. However, when fresh saliva was diluted, we observed a corresponding reduction in freezing behavior, suggesting that specific components within the saliva—those responsible for inducing freezing—may decrease over time.

      Our findings indicate that while the overall content of saliva remains consistent over time, specific freezing-inducing components seem to degrade or reduce at a different rate than other components, which alters the composition of saliva over time. The speed of reduction of these freezing-inducing components appears to be different from more stable proteins such as Fel d 4. As a result, the composition of saliva changes over time, leading to a qualitative difference between fresh and old saliva that mice can detect. This ability to discern such subtle chemical changes likely reflects an adaptive sensory mechanism, allowing mice to respond to predator cues to induce optimal defensive behavior in a certain context. Identifying the specific freezing-inducing components through traditional purification processes, such as high-performance liquid chromatography followed by behavioral examination (Haga-Yamanaka et al., 2014; Kimoto et al., 2005), is crucial for a deeper understanding of the mechanisms underlying the observed behavior. Our research team is actively working to isolate these molecules, and we hope to report our findings in future studies.

      (4) The interpretation that fresh and old saliva activates different subpopulations of neurons in the VMH based on the observation that cFos positively correlates with freezing responses only with the fresh saliva lacks empirical evidence. To address this question, the authors should use two neuronal activity markers to track the response of the same population of VMH cells within the same animals during exposure to fresh vs. old saliva.

      To address this issue, as shown in the new Figure 7, we performed a double exposure experiment using Fos2A-iCreERT2; Ai9 (TRAP2) mice (Allen et al., 2017; DeNardo et al., 2019). In this experiment, mice were exposed to the first stimulus under the treatment of 4-hydroxytamoxifen (4-OHT). One week after the initial exposure, the same mice were subjected to a second stimulus exposure for one hour. Through this paradigm, neurons activated by the first stimulus were visualized by tdTomato, while those activated by the second stimulus were detected as cFos-IR (Figure 7A). Quantification of tdTomato and cFos-IR double-positive cells among tdTomato-labeled cells revealed that 43% (mean per animal: 61 / 143) of cells activated by fresh saliva during the first exposure were also activated by fresh saliva during the second exposure, whereas only 16% (17 / 106) of cells activated by old saliva during the first exposure were activated by fresh saliva during the second exposure (p = 7.5e-6, Chi-squared test). The difference in the fraction of overlapping cells between fresh and old saliva exposures was significant when we compared the two groups of animals (Figure 7D, p = 0.0035, permutation test). Additionally, quantification of tdTomato and cFos-IR double-positive cells among cFos-IR cells indicated that over 27% (61 / 226) of cells activated by fresh saliva during the second exposure were previously activated by fresh saliva, whereas only 15% (17 / 112) of cells activated by fresh saliva during the second exposure were previously activated by old saliva (p = 0.015, Chi-squared test). The difference in the fraction of overlapping cells between fresh and old saliva exposures was also significant in this analysis (Figure 7E, p = 0.0060, permutation test). Together, these results demonstrate that fresh and old cat saliva activate largely different populations of neurons within the VMH. These new results were described on page 11 line 18 – page 12 line 8.
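
      The two Chi-squared comparisons above can be re-derived from the quoted counts with a short sketch. It assumes a Pearson Chi-squared test on a 2x2 table without continuity correction, which the response does not specify, and it treats the "mean per animal" counts as pooled cell counts; the helper name `chi2_2x2` is hypothetical. Under these assumptions the p-values come out close to the reported ones.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-squared survival function is erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


# Among tdTomato-labeled cells: double-positive vs. not, first exposure fresh (61/143) or old (17/106).
stat1, p1 = chi2_2x2(61, 143 - 61, 17, 106 - 17)
# Among cFos-IR cells: double-positive vs. not, first exposure fresh (61/226) or old (17/112).
stat2, p2 = chi2_2x2(61, 226 - 61, 17, 112 - 17)
print(stat1, p1)
print(stat2, p2)
```

      The permutation tests on per-animal fractions (Figure 7D, E) cannot be reproduced this way because the per-animal values are not listed in the response.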

      In addition to these unaddressed concerns, some new issues have emerged in the new version of the manuscript. For example, the following paragraph introduced in the discussion section is not supported by the experimental findings.

      "We assume that such differential activations of the mitral cells between fresh and old saliva result in the differential activation of targeting neural substrates, possibly MeApv, which results in differential activation of VMH neurons (Figure 7)."

      Although the authors did not observe statistical differences in cFos expression in the pvMeA among groups, they claim that the differences in cFos expression in the VMH between fresh vs. old saliva are mediated by differential activation of upstream neurons in the MeApv. The lack of statistical differences may be caused by the reduced number of subjects in each group, as recognized in the text by the authors.

      We appreciate the reviewer's thoughtful comment. We agree that the paragraph in the comment, which presented a working hypothesis regarding differential activations of mitral cells and the MeApv between fresh and old saliva exposures, was speculative and not fully supported by our experimental findings. To address this, we have removed the assumptions related to the differential responses of mitral cells and the MeApv from the discussion and have updated the figure accordingly (now presented as new Figure 8).

      Moreover, the authors propose that in addition to fel d 4, multiple molecules present in the cat saliva can be inducing distinct defensive responses in the animals, but they do not provide any reference to support their claim.

      We thank the reviewer for highlighting this point. Our claim regarding the presence of other molecules in cat saliva inducing freezing defensive responses is based on our observation, as shown in the new Supplemental Figure 5, that recombinant Fel d 4 protein alone does not induce freezing behavior. This suggests the existence of other unidentified components in cat saliva that may contribute to freezing behavior. As we agree that identifying these specific freezing-inducing components is important for a more comprehensive understanding of the underlying mechanisms, our research team is actively working to isolate these molecules, and we hope to report our findings in future studies.

      Reviewer #2 (Public Review):

      The findings are relatively preliminary. The identities of the receptor and the ligand in the cat saliva that induces the behavior remain unclear. The identity of VMH cells that are activated by the cat saliva remains unclear. There is a lack of targeted functional manipulation to demonstrate the role of V2R-A4 or VMH cells in the behavioral response to the cat saliva.

      We thank the reviewer for the important insight on the need for further investigation into the molecular and neural mechanisms underlying the behavioral response to cat saliva. We recognize the importance of conducting studies involving V2R-A4 receptor knockouts and targeted functional manipulations within the VMH using neural circuit perturbation approaches.

      However, the V2R-A4 subfamily consists of 25 Vmn2r genes, most of which are closely grouped together, forming a V2R-A4 gene cluster within a 2.5-megabase chromosomal region. As we described in our recent review article (Rocha et al., 2024), the Vmn2r genes within the V2R-A4 subfamily display a high degree of homology, with nucleotide and amino acid identities among the several Vmn2rs surpassing 97-99%, suggesting possible redundancy among these receptor genes. This is in stark contrast to the diversity typically observed within other V2R subfamilies. Consequently, knockout strategies targeting a single receptor gene, which have been successful for other vomeronasal receptors, may not be effective for V2R-A4 receptor genes. The most appropriate strategy for examining the necessity of V2R-A4 receptors would be knocking out the entire V2R-A4 gene cluster, spanning a 2.5-megabase chromosomal region. Due to the technical challenges involved, addressing this issue is not feasible in the foreseeable future. Moreover, in our current study, we aimed to establish the foundational relationship between predator cues in cat saliva and defensive behaviors. We view our findings as an important first step that sets the stage for these more targeted and mechanistic studies involving the neural circuit perturbation experiments, such as optogenetics and Designer Receptors Exclusively Activated by Designer Drugs (DREADDs), in the next step.

      Reviewer #3 (Public Review):

      Weaknesses:

      (1)  It is unclear if fresh and old saliva indeed alter the perceived imminence of predation, as claimed by the authors. Prior work indicates that lower imminence induces anxiety-related actions, such as re-organization of meal patterns and avoidance of open spaces, while slightly higher imminence produces freezing. Here, the authors show that fresh and old predator saliva only provoke different amounts of freezing, rather than changing the topography of defensive behaviors, as explained above. Another prediction of predatory imminence theory would be that lower imminence induced by old saliva should produce stronger cortical activation, while fresh saliva would activate amygdala, if these stimuli indeed correspond to significantly different levels of predation imminence.

      We appreciate the reviewer’s insightful comments regarding the perceived imminence of predation and the behavioral responses to fresh and old saliva. Our study specifically focused on comparing the defensive behaviors of mice in response to 15-minute-old and 4-hour-old cat saliva, particularly within the context of freezing behavior in their home cages. We chose these specific time points to capture the potential variation in behavioral intensity rather than the full spectrum of defensive behaviors. While a more comprehensive analysis—including varying time points, different types of defensive behaviors, and broader neural activation patterns (e.g., cortical versus amygdala activation)—might provide further insights into predation imminence theory, these aspects were beyond the scope of our current study. Future research could certainly address these points by examining behavioral and neural responses across additional saliva aging intervals and in varied behavioral contexts. Such studies would complement and extend the findings presented here, further elucidating the relationship between predator cue characteristics and defensive behaviors.

      (2)  It is known that predator odors activate and require AOB, VNO and VMH, thus replications of these findings are not novel, decreasing the impact of this work.

      As the reviewer mentioned, the activation of the AOB, VNO, and VMH by predator odors has been established in prior studies. However, our study provides new insights by demonstrating that defensive freezing behavior in response to predator odors is mediated through the vomeronasal organ (VNO) sensory circuit, which has not been previously shown. The novelty of our work lies in two key findings: 1) the introduction of a new behavioral paradigm that assesses freezing responses to predator cues based on the freshness of chemosensory signals in cat saliva, and 2) the demonstration that the vomeronasal sensory circuit mediates defensive freezing behavior in response to cat saliva.

      Additionally, our results show that cat saliva of different freshness levels differentially activates VNO sensory neurons that express the same subfamily of sensory receptors. This differential activation subsequently modulates the downstream neural circuits, leading to varied freezing behavioral outcomes. We believe these findings provide a novel conceptual advance over previous studies by elucidating a more detailed mechanism of how predator-derived cues influence defensive behaviors through the accessory olfactory system.

      (3)  There is a lack of standard circuit dissection methods, such as characterizing the behavioral effects of increasing and decreasing neural activity of relevant cell bodies and axonal projections, significantly decreasing the mechanistic insights generated by this work

      We thank the reviewer for this valuable comment. Investigating the behavioral effects of manipulating specific cell types and axonal projections, as well as characterizing circuit connectivity, is essential for a more comprehensive understanding of the underlying neural circuits. These approaches, such as modulating neural activity in defined cell populations and dissecting circuit pathways, using optogenetics, DREADD, etc., would provide deeper mechanistic insights. In our current study, however, we aimed to establish the foundational relationship between predator cues in cat saliva and defensive behaviors. We view our findings as an important first step that sets the stage for these more targeted and mechanistic studies in the future.

      (4)  The correlation shown in Figure 5c may be spurious. It appears that the correlation is primarily driven by a single point (the green square point near the bottom left corner). All correlations should be calculated using Spearman correlation, which is non-parametric and less likely to show a large correlation due to a small number of outliers. Regardless of the correlation method used, there are too few points in Figure 5c to establish a reliable correlation. Please add more points to 5c.

      We appreciate the reviewer’s suggestion regarding the correlation analysis in Figure 5E. We assessed the normality of our data using both the Shapiro-Wilk and Kolmogorov-Smirnov tests; neither indicated a departure from normality, which supports the use of a parametric correlation method in this context. However, we acknowledge the concern about the limited number of data points and the influence of potential outliers on the observed correlation. A larger sample size would provide a more robust assessment of the correlation and reduce the impact of any single data point. This is an important direction for future research, but it is beyond the scope of the current study.
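
      To illustrate the reviewer's general point about outliers (this is not an analysis of the Figure 5 data), the sketch below compares Pearson and Spearman correlations on a small, entirely made-up dataset in which one distant point dominates. The values in `x` and `y` and the helper names are hypothetical. In this toy case the Pearson coefficient is close to 1 while the rank-based Spearman coefficient stays near 0.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: five clustered points plus one distant point (the last).
x = [5.0, 5.2, 4.8, 5.1, 4.9, 0.5]
y = [3.1, 2.9, 3.0, 2.8, 3.2, 0.3]

print("Pearson, all points:      ", round(pearson(x, y), 3))
print("Spearman, all points:     ", round(spearman(x, y), 3))
print("Pearson, without outlier: ", round(pearson(x[:5], y[:5]), 3))
```

      Dropping the single distant point flips the sign of the Pearson coefficient in this example, which is the kind of sensitivity the reviewer describes for small samples.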

      (5)  Please cite recent relevant papers showing VMH activity induced by predators, such as https://pubmed.ncbi.nlm.nih.gov/33115925/ and https://pubmed.ncbi.nlm.nih.gov/36788059/

      We thank the reviewer for suggesting these important papers. https://pubmed.ncbi.nlm.nih.gov/33115925/ (Esteban Masferrer et al., 2020) and https://pubmed.ncbi.nlm.nih.gov/36788059/ (Tobias et al., 2023) are now cited on page 16, line 10 of the Discussion, under “Differential activation of VMH neurons potentially underlying distinct intensities of freezing behavior.”

      (6)  Add complete statistical information in the figure legends of all figures, which should include n, name of test used and exact p values.

      We included statistical analysis results in figure legends; for Figure 6B, we provided statistical analysis results in Supplemental Table 1.

      (7)  Some of the findings are disconnected from the story. For example, the authors show V2R-A4-expressing cells are activated by predator odors. Are these cells more likely to be connected to the rest of the predatory defense circuit than other VNO cells?

      Yes, our hypothesis posits that V2R-A4-expressing VNO sensory neurons serve as receptor neurons for predator cues present in cat saliva. Additionally, we assume that these specific sensory neurons have stronger anatomical connections with the defensive circuit compared to VNO sensory neurons expressing other receptor subfamilies. In our modified Discussion section, we discussed this point under “V2R-A4 subfamily as the receptor for predator cues in cat saliva.”

      (8)  Please paste all figure legends directly below their corresponding figure to make the manuscript easier to read

      We have added figure legends directly below their corresponding figures.

      (9)  Were there other behavioral differences induced by fresh compared to old saliva? Do they provoke differences in stretch-attend risk evaluation postures, number of approaches, average distance to odor stimulus, velocity of movements towards and away from the odor stimulus, etc?

      We appreciate the reviewer's valuable comments. We have now incorporated an analysis of stretch-sniff risk assessment behavior, presented in new Figure 1F (graph) and Supplemental Figure 1B (raster plot). Mice exhibited stretch-sniff risk assessment behavior, which remained consistent across control, fresh saliva, and old saliva swabs. Additionally, we have also included a raster plot for direct investigation, previously noted as ‘interaction’ in the original manuscript (Supplemental Figure 1C). Mice exposed to a swab containing either fresh or old saliva significantly avoided directly investigating the swab. In contrast, mice exposed to a clean control swab spent a significant amount of time directly investigating the swab, engaging in behaviors such as sniffing and chewing (Figure 1G). A comparison of temporal behavioral patterns revealed a slightly higher frequency of direct investigation behavior toward old saliva compared to fresh saliva at the beginning of the exposure period (Supplemental Figure 1C).

      Reviewer #3 (Recommendations For The Authors):

      The authors have partially addressed several important points raised in the prior review, increasing the strength of the manuscript. However, two key questions raised previously were not addressed:

      (1)  Is old saliva qualitatively different from new saliva, or is it the same as a smaller amount of new saliva? As Reviewer 1 wrote: "An important point that the authors should clarify in this study is whether mice are detecting qualitative or quantitative differences between fresh and old cat saliva."

      Since one of the authors' main points is that fresh and old saliva elicit different perceived threat imminences, it is crucial to show that these two stimuli are somehow qualitatively different.

      One way to investigate this could be to show that animals perform different behaviors when exposed to a smaller amount of new saliva vs old saliva, or that the cfos activation patterns are different in these two conditions.

      The answers to these concerns are provided in the Public Review comment from Reviewer #1.

      (2)  The other key question is if different VMH populations are activated by new vs old saliva.

      The answer to this concern is provided in the Public Review comment from Reviewer #1.

      Lastly, although the new analysis and text changes improved the manuscript, many issues raised were addressed with some variation of 'future studies will be done', or 'we concur with the Reviewer'. However, the extra experiments required to answer these questions were not done. For this reason, even though the authors have numerous exciting pieces of data, overall the work is still incomplete. I highlight below some examples in which the authors agree with the Reviewer, but do not answer the question with the new work that would be required, or propose to do the work in future studies.

      In this revised manuscript, we have conducted several additional experiments to address key concerns raised by the reviewers that are directly relevant to our claims. Specifically, we have examined: 1) whether qualitative or quantitative differences between fresh and old cat saliva are detected by mice to modulate behavior (NEW Figure 1I, J, K, and L, and NEW Supplemental Figure 2); 2) the involvement of Fel d 4 in freezing behavior (NEW Supplemental Figure 5); and 3) whether different VMH populations are activated by fresh versus old saliva (NEW Figure 7). However, some concerns raised by the reviewers fall outside the scope of the current manuscript. These include: 1) identifying the specific components that induce freezing, 2) examining the necessity of V2R-A4 receptors, 3) conducting neural circuit perturbations, and 4) performing a comprehensive analysis of the mouse’s defensive response to different levels of predator threat imminence, including varying time points, different types of defensive behaviors, and broader neural activation patterns (e.g., cortical versus amygdala activation). As these aspects are beyond the focus of our current manuscript, we have noted this in our responses to the Public Review comments.

      References:

      Allen WE, DeNardo LA, Chen MZ, Liu CD, Loh KM, Fenno LE, Ramakrishnan C, Deisseroth K, Luo L. 2017. Thirst-associated preoptic neurons encode an aversive motivational drive. Science 357:1149–1155.

      DeNardo LA, Liu CD, Allen WE, Adams EL, Friedmann D, Fu L, Guenthner CJ, Tessier-Lavigne M, Luo L. 2019. Temporal evolution of cortical ensembles promoting remote memory retrieval. Nat Neurosci 22:460–469.

      Haga-Yamanaka S, Ma L, He J, Qiu Q, Lavis LD, Looger LL, Yu CR. 2014. Integrated action of pheromone signals in promoting courtship behavior in male mice. Elife 3:e03025.

      Kimoto H, Haga S, Sato K, Touhara K. 2005. Sex-specific peptides from exocrine glands stimulate mouse vomeronasal sensory neurons. Nature 437:898–901.

      Rocha A, Nguyen QAT, Haga-Yamanaka S. 2024. Type 2 vomeronasal receptor-A4 subfamily: Potential predator sensors in mice. Genesis 62:e23597.

    1. eLife Assessment

      This important work shows how a simple geophysical setting of gas flow over a narrow channel of water can create a physical environment that leads to the isothermal replication of nucleic acids. The work presents convincing evidence for an isothermal polymerase chain reaction in careful experiments involving evaporation and convective flows, complemented with fluid dynamics simulations. This work will be of interest to scientists working on the origin of life and more broadly, on nucleic acids and diagnostic applications.

    2. Reviewer #1 (Public review):

      This manuscript from Schwintek and coworkers describes a system in which gas flow across a small channel (10^-4-10^-3 m scale) enables the accumulation of reactants and convective flow. The authors go on to show that this can be used to perform PCR as a model of prebiotic replication.

      Strengths:

      The manuscript nicely extends the authors' prior work in thermophoresis and convection to gas flows. The demonstration of nucleic acid replication is an exciting one, and an enzyme-catalyzed proof-of-concept is a great first step towards a novel geochemical scenario for prebiotic replication reactions and other prebiotic chemistry.

      The manuscript nicely combines theory and experiment, which generally agree well with one another, and it convincingly shows that accumulation can be achieved with gas flows and that it can also be utilized in the same system for what one hopes is a precursor to a model prebiotic reaction. This continues efforts from Braun and Mast over the last 10-15 years extending a phenomenon that was appreciated by physicists and perhaps underappreciated in prebiotic chemistry to increasingly chemically relevant systems and, here, a pilot experiment with a simple biochemical system as a prebiotic model.

      I think this is exciting work and will be of broad interest to the prebiotic chemistry community.

      Weaknesses:

      The manuscript states: "The micro scale gas-water evaporation interface consisted of a 1.5 mm wide and 250 µm thick channel that carried an upward pure water flow of 4 nl/s ≈ 10 µm/s perpendicular to an air flow of about 250 ml/min ≈ 10 m/s." This was a bit confusing on first read because Figure 2 appears to show a larger channel - based on the scale bar, it appears to be about 2 mm across on the short axis and 5 mm across on the long axis. From reading the methods, one understands the thickness is associated with the Teflon, but the 1.5 mm dimension is still a bit confusing (and what is the dimension in the long axis?) It is a little hard to tell which portion (perhaps all?) of the image is the channel. This is because discontinuities are present on the left and right sides of the experimental panels (consistent with the image showing material beyond the channel), but not the simulated panels. Based on the authors' description of the apparatus (sapphire/CNC machined Teflon/sapphire) it sounds like the geometry is well-known to them. Clarifying what is going on here (and perhaps supplying the source images for the machined Teflon) would be helpful.

      The data shown in Figure 2d nicely shows nonrandom residuals (for experimental values vs. simulated) that are most pronounced at t~12 m and t~40-60m. It seems like this is (1) because some symmetry-breaking occurs that isn't accounted for by the model, and perhaps (2) because of the fact that these data are n=1. I think discussing what's going on with (1) would greatly improve the paper, and performing additional replicates to address (2) would be very informative and enhance the paper. Perhaps the negative and positive residuals would change sign in some, but not all, additional replicates?

      The authors will most likely be familiar with the work of Victor Ugaz and colleagues, in which they demonstrated Rayleigh-Bénard-driven PCR in convection cells (10.1126/science.298.5594.793, 10.1002/anie.200700306). Not including some discussion of this work is an unfortunate oversight, and addressing it would significantly improve the manuscript and provide some valuable context to readers. Something of particular interest would be their observation that wide circular cells gave chaotic temperature profiles relative to narrow ones and that these improved PCR amplification (10.1002/anie.201004217). I think contextualizing the results shown here in light of this paper would be helpful. Again, it appears n=1 is shown for Figure 4a-c - the source of the title claim of the paper - and showing some replicates and perhaps discussing them in the context of prior work would enhance the manuscript.

      I think some caution is warranted in interpreting the PCR results because a primer-dimer would be of essentially the same length as the product. It appears as though the experiment has worked as described, but it's very difficult to be certain of this given this limitation. Doing the PCR with a significantly longer amplicon would be ideal, or alternately discussing this possible limitation would be helpful to the readers in managing expectations.

    3. Reviewer #2 (Public review):

      Schwintek et al. investigated whether a geological setting of a rock pore with water inflow on one end and gas passing over the opening of the pore on the other end could create a non-equilibrium system that sustains nucleic acid reactions under mild conditions. The evaporation of water as the gas passes over it concentrates the solutes at the boundary of evaporation, while the gas flux induces momentum transfer that creates currents in the water that push the concentrated molecules back into the bulk solution. This leads to the creation of steady-state regions of differential salt and macromolecule concentrations that can be used to manipulate nucleic acids. First, the authors showed that fluorescent bead behavior in this system closely matched their fluid dynamic simulations. With that validation in hand, the authors next showed that fluorescently labeled DNA behaved according to their theory as well. Using these insights, the authors performed a FRET experiment that clearly demonstrated the hybridization of two DNA strands as they passed through the high Mg++ concentration zone, and, conversely, the dissociation of the strands as they passed through the low Mg++ concentration zone. This isothermal hybridization and dissociation of DNA strands allowed the authors to perform an isothermal DNA amplification using a DNA polymerase enzyme. Crucially, the isothermal DNA amplification required the presence of the gas flux and could not be recapitulated using a system that was at equilibrium. These experiments advance our understanding of the geological settings that could support nucleic acid reactions that were key to the origin of life.

      The presented data compellingly supports the conclusions made by the authors. To increase the relevance of the work for the origin of life field, the following experiments are suggested:

      (1) While the central premise of this work is that RNA degradation presents a risk for strand separation strategies relying on elevated temperatures, all of the work is performed using DNA as the nucleic acid model. I understand the convenience of using DNA, especially in the latter replication experiment, but I think that at least the FRET experiments could be performed using RNA instead of DNA.

      (2) Additionally, showing that RNA does not degrade under the conditions employed by the authors (I am particularly worried about the high Mg++ zones created by the flux) would further strengthen the already very strong and compelling work.

      (3) Finally, I am curious whether the authors have considered designing a simulation or experiment that uses the imidazole- or 2′,3′-cyclic phosphate-activated ribonucleotides. For instance, a fully paired RNA duplex and a fluorescently-labeled primer could be incubated in the presence of activated ribonucleotides +/- flux and subsequently analyzed by gel electrophoresis to determine how much primer extension has occurred. The reason for this suggestion is that, due to the slow kinetics of chemical primer extension, the reannealing of the fully complementary strands as they pass through the high Mg++ zone, which is required for primer extension, may outcompete the primer extension reaction. In the case of the DNA polymerase, the enzymatic catalysis likely outcompetes the reannealing, but this may not recapitulate the uncatalyzed chemical reaction.

    4. Author response:

      Reviewer #1 (Public review):

      This manuscript from Schwintek and coworkers describes a system in which gas flow across a small channel (10^-4-10^-3 m scale) enables the accumulation of reactants and convective flow. The authors go on to show that this can be used to perform PCR as a model of prebiotic replication.

      Strengths:

      The manuscript nicely extends the authors' prior work in thermophoresis and convection to gas flows. The demonstration of nucleic acid replication is an exciting one, and an enzyme-catalyzed proof-of-concept is a great first step towards a novel geochemical scenario for prebiotic replication reactions and other prebiotic chemistry.

      The manuscript nicely combines theory and experiment, which generally agree well with one another, and it convincingly shows that accumulation can be achieved with gas flows and that it can also be utilized in the same system for what one hopes is a precursor to a model prebiotic reaction. This continues efforts from Braun and Mast over the last 10-15 years extending a phenomenon that was appreciated by physicists and perhaps underappreciated in prebiotic chemistry to increasingly chemically relevant systems and, here, a pilot experiment with a simple biochemical system as a prebiotic model.

      I think this is exciting work and will be of broad interest to the prebiotic chemistry community.

      Weaknesses:

      The manuscript states: "The micro scale gas-water evaporation interface consisted of a 1.5 mm wide and 250 µm thick channel that carried an upward pure water flow of 4 nl/s ≈ 10 µm/s perpendicular to an air flow of about 250 ml/min ≈ 10 m/s." This was a bit confusing on first read because Figure 2 appears to show a larger channel - based on the scale bar, it appears to be about 2 mm across on the short axis and 5 mm across on the long axis. From reading the methods, one understands the thickness is associated with the Teflon, but the 1.5 mm dimension is still a bit confusing (and what is the dimension in the long axis?) It is a little hard to tell which portion (perhaps all?) of the image is the channel. This is because discontinuities are present on the left and right sides of the experimental panels (consistent with the image showing material beyond the channel), but not the simulated panels. Based on the authors' description of the apparatus (sapphire/CNC machined Teflon/sapphire) it sounds like the geometry is well-known to them. Clarifying what is going on here (and perhaps supplying the source images for the machined Teflon) would be helpful.

      We understand. We will update the figures to better show the dimensions of the experimental chamber, and we will add a more complete figure to the supplementary information. Part of the complexity of the chamber stems from the fact that the same design has also been used to create defined temperature gradients. These are not needed here, so the chamber is more complex than this experiment requires.
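
      As a rough consistency check on the quoted dimensions, the sketch below converts the stated volumetric flows into mean velocities through a 1.5 mm × 250 µm cross-section. The water figures come from the quoted sentence; applying the same cross-section to the air flow is an assumption made only for illustration, since the air channel geometry is not given in this exchange. The function and constant names are hypothetical.

```python
def mean_velocity(flow_m3_per_s, width_m, thickness_m):
    """Mean flow speed (m/s) = volumetric flow / rectangular cross-section."""
    return flow_m3_per_s / (width_m * thickness_m)


# Channel cross-section quoted from the manuscript.
WIDTH_M = 1.5e-3        # 1.5 mm
THICKNESS_M = 250e-6    # 250 um

# Water inflow: 4 nl/s. One nanolitre is 1e-12 cubic metres.
water_flow = 4 * 1e-12                     # m^3/s
water_velocity = mean_velocity(water_flow, WIDTH_M, THICKNESS_M)

# Air flow: 250 ml/min. One millilitre is 1e-6 cubic metres.
# ASSUMPTION: the air passes through the same 1.5 mm x 250 um cross-section.
air_flow = 250 * 1e-6 / 60                 # m^3/s
air_velocity = mean_velocity(air_flow, WIDTH_M, THICKNESS_M)

print(f"water: {water_velocity * 1e6:.1f} um/s")
print(f"air:   {air_velocity:.1f} m/s")
```

      Under these assumptions both results land close to the rounded values in the quoted sentence (about 10 µm/s and about 10 m/s), which suggests that the 1.5 mm × 250 µm figure describes the flow cross-section rather than the full field of view in Figure 2.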

      The data shown in Figure 2d nicely shows nonrandom residuals (for experimental values vs. simulated) that are most pronounced at t~12 m and t~40-60m. It seems like this is (1) because some symmetry-breaking occurs that isn't accounted for by the model, and perhaps (2) because of the fact that these data are n=1. I think discussing what's going on with (1) would greatly improve the paper, and performing additional replicates to address (2) would be very informative and enhance the paper. Perhaps the negative and positive residuals would change sign in some, but not all, additional replicates?

      To address this, we will show two more replicates of the experiment and include them in Figure 2.

      We are seeing two effects when we compare fluorescence measurements of the experiments.

      Firstly, degassing of water causes the formation of air-bubbles, which are then transported upwards to the interface, disrupting fluorescence measurements. This, however, mostly occurs in experiments with elevated temperatures for PCR reactions, such as displayed in Figure 4.

      Secondly, due to the high surface tension of water, the interface is quite flexible. As the inflow and evaporation work to balance each other, the shape of the interface adjusts, leading to alterations in the circular flow fields below.

      Thus the conditions, while in steady state overall, show some fluctuations. The strong dependence on interface shape is also seen in the simulation. However, modeling a dynamic interface shape is difficult, so we had to restrict the simulation to a single geometry. Here too, the added movies of two more experiments should clarify this issue.

      The authors will most likely be familiar with the work of Victor Ugaz and colleagues, in which they demonstrated Rayleigh-Bénard-driven PCR in convection cells (10.1126/science.298.5594.793, 10.1002/anie.200700306). Not including some discussion of this work is an unfortunate oversight, and addressing it would significantly improve the manuscript and provide some valuable context to readers. Something of particular interest would be their observation that wide circular cells gave chaotic temperature profiles relative to narrow ones and that these improved PCR amplification (10.1002/anie.201004217). I think contextualizing the results shown here in light of this paper would be helpful.

      Thank you for pointing this out and reminding us; we apologize for the oversight. We agree that the chaotic trajectories within Rayleigh-Bénard convection cells lead to temperature oscillations similar to the salt variations in our gas-flux system. Although the convection-driven PCR in Rayleigh-Bénard cells is not isothermal like our system, it provides a useful point of comparison and context for understanding environments that can support full replication cycles. We will add a section comparing the approaches and outlining the history of convective PCR and how it relates to the new isothermal implementation.

      Again, it appears n=1 is shown for Figure 4a-c - the source of the title claim of the paper - and showing some replicates and perhaps discussing them in the context of prior work would enhance the manuscript.

      We thank the reviewer for bringing this to our attention. We will now include the two additional repeats for the data shown in Figure 4c, while the repeats of the PAGE measurements are already displayed in Supplementary Fig. IX.2. Initially, we chose not to show the repeats in Figure 4c due to the dynamic and variable nature of the system. These variations are primarily caused by differences at the water-air interface, attributed to the high surface tension of water. Additionally, the stochastic formation of air bubbles in the inflow, despite our best efforts to avoid them, led to fluctuations in the fluorescence measurements across experiments. These bubbles cause a significant drop in fluorescence in a region of interest (ROI) until the area is refilled with the sample.

      Unlike our RNA-focused experiments, PCR requires high temperatures and degassing a PCR master mix effectively is challenging in this context. While we believe our chamber design is sufficiently gas-tight to prevent air from diffusing in, the high surface-to-volume ratio in microfluidics makes degassing highly effective, particularly at elevated temperatures. We anticipate that switching to RNA experiments at lower temperatures will mitigate this issue, which is also relevant in a prebiotic context.

      The reviewer’s comments are valid and prompt us to fully display these aspects of the system. We will now include these repeats in Figure 4c to give readers a deeper understanding of the experiment's dynamics. Additionally, we will provide videos of all three repeats, allowing readers to better grasp the nature of the fluctuations in SYBR Green fluorescence depicted in Figure 4c.

      I think some caution is warranted in interpreting the PCR results because a primer-dimer would be of essentially the same length as the product. It appears as though the experiment has worked as described, but it's very difficult to be certain of this given this limitation. Doing the PCR with a significantly longer amplicon would be ideal, or alternately discussing this possible limitation would be helpful to the readers in managing expectations.

      This is a good point and should be discussed more in the manuscript. Our gel electrophoresis is capable of distinguishing between the replication product and primer dimers. We know this because we optimized the primer and template sequences to minimize primer dimers and to make them distinguishable from the desired 61mer product. In addition, none of the experiments performed without an added template strand showed any band in the vicinity of the product band after 4 h of reaction, in contrast to the experiments with template, which is a strong argument against the presence of primer dimers.

      Reviewer #2 (Public review):

      Schwintek et al. investigated whether a geological setting of a rock pore with water inflow on one end and gas passing over the opening of the pore on the other end could create a non-equilibrium system that sustains nucleic acid reactions under mild conditions. The evaporation of water as the gas passes over it concentrates the solutes at the boundary of evaporation, while the gas flux induces momentum transfer that creates currents in the water that push the concentrated molecules back into the bulk solution. This leads to the creation of steady-state regions of differential salt and macromolecule concentrations that can be used to manipulate nucleic acids. First, the authors showed that fluorescent bead behavior in this system closely matched their fluid dynamic simulations. With that validation in hand, the authors next showed that fluorescently labeled DNA behaved according to their theory as well. Using these insights, the authors performed a FRET experiment that clearly demonstrated the hybridization of two DNA strands as they passed through the high Mg++ concentration zone, and, conversely, the dissociation of the strands as they passed through the low Mg++ concentration zone. This isothermal hybridization and dissociation of DNA strands allowed the authors to perform an isothermal DNA amplification using a DNA polymerase enzyme. Crucially, the isothermal DNA amplification required the presence of the gas flux and could not be recapitulated using a system that was at equilibrium. These experiments advance our understanding of the geological settings that could support nucleic acid reactions that were key to the origin of life.

      The presented data compellingly supports the conclusions made by the authors. To increase the relevance of the work for the origin of life field, the following experiments are suggested:

      (1) While the central premise of this work is that RNA degradation presents a risk for strand separation strategies relying on elevated temperatures, all of the work is performed using DNA as the nucleic acid model. I understand the convenience of using DNA, especially in the latter replication experiment, but I think that at least the FRET experiments could be performed using RNA instead of DNA.

      We understand the request only partially. The modification introduced by the two dye molecules in the FRET probe, which are needed to probe salt concentrations by melting, is of course much larger than the change of the backbone from RNA to DNA. For this reason we used the much more stable DNA construct, which is also manufactured at lower cost and in much higher purity, including with the modifications. We think the melting temperature characteristics of RNA and DNA in this range are well enough known that we can use DNA instead of RNA for probing the salt concentration in our flow cycling.

      Only at extreme conditions of pH and salt, especially under alkaline conditions, is RNA degradation through transesterification at least several orders of magnitude faster than the spontaneous degradative mechanisms acting upon DNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.]. The work presented in this article is, however, focussed on the hybridization dynamics of nucleic acids. Here, RNA and DNA share similar properties regarding the formation of double strands and their respective melting temperatures. While RNA has been shown to form more stable duplex structures exhibiting higher melting temperatures compared to DNA [Dimitrov, R. A., & Zuker, M. (2004). Prediction of hybridization and melting for double-stranded nucleic acids. Biophysical Journal, 87(1), 215-226.], the general impact of changes in salt, temperature and pH [Mariani, A., Bonfio, C., Johnson, C. M., & Sutherland, J. D. (2018). pH-Driven RNA strand separation under prebiotically plausible conditions. Biochemistry, 57(45), 6382-6386.] on the respective melting temperatures follows the same trend for both nucleic acid types. The diffusive properties of RNA and DNA are also very similar [Baaske, P., Weinert, F. M., Duhr, S., Lemke, K. H., Russell, M. J., & Braun, D. (2007). Extreme accumulation of nucleotides in simulated hydrothermal pore systems. Proceedings of the National Academy of Sciences, 104(22), 9346-9351.].

      Since this work is a proof of principle for the discussed environment being able to host nucleic acid replication, we aimed to avoid second-order effects such as degradation by hydrolysis by using DNA as a proxy polymer. This enabled us to focus on the physical effects of the environment on local salt and nucleic acid concentration. The experiments performed with FRET are used to visualize local salt concentration changes and their impact on the melting temperature of dissolved nucleic acids. While performing these experiments with RNA would without doubt cover a broader application within the field of origin of life, we aimed at a step-by-step, proof-of-principle approach, especially since the environmental phenomena studied here have not been previously investigated in the origin-of-life context. Incorporating RNA-related complexity into this system should, however, be addressed in future studies. This will likely require modifications to the experimental boundary conditions, such as adjusting pH, temperature, and salt concentration, to account for the greater duplex stability of RNA. For instance, lowering the pH would reduce the RNA melting temperature [Ianeselli, A., Atienza, M., Kudella, P. W., Gerland, U., Mast, C. B., & Braun, D. (2022). Water cycles in a Hadean CO2 atmosphere drive the evolution of long DNA. Nature Physics, 18(5), 579-585.].

      (2) Additionally, showing that RNA does not degrade under the conditions employed by the authors (I am particularly worried about the high Mg++ zones created by the flux) would further strengthen the already very strong and compelling work.

      Based on literature values for hydrolysis rates of RNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.], we estimate RNA to have a half-life of multiple months under the conditions deployed in the FRET experiment (high-concentration zones contain <1 mM Mg2+). Additionally, dsRNA is multiple orders of magnitude more stable than ssRNA with regard to degradation through hydrolysis [Zhang, K., Hodge, J., Chatterjee, A., Moon, T. S., & Parker, K. M. (2021). Duplex structure of double-stranded RNA provides stability against hydrolysis relative to single-stranded RNA. Environmental Science & Technology, 55(12), 8045-8053.], improving RNA stability especially in zones of high FRET signal. Furthermore, at the neutral pH deployed in this work, RNA does not readily degrade. In previous work from our lab [Salditt, A., Karr, L., Salibi, E., Le Vay, K., Braun, D., & Mutschler, H. (2023). Ribozyme-mediated RNA synthesis and replication in a model Hadean microenvironment. Nature Communications, 14(1), 1495.], we showed that the lifetime of RNA under conditions reaching 40 mM Mg2+ at the air-water interface at 45°C was sufficient to support ribozyme-mediated ligation reactions in experiments lasting multiple hours.

      With that in mind, gaining insight into the median Mg2+ concentration across multiple averaged nucleic acid trajectories in our system (see Fig. 3c&d) and numerically convoluting this with hydrolysis dynamics from literature would be highly valuable. We anticipate that longer residence times in trajectories distant from the interface will improve RNA stability compared to a system with uniformly high Mg2+ concentrations.
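
      The numerical step proposed above can be sketched as follows. This is only a minimal illustration, not the authors' analysis: it assumes first-order hydrolysis with a rate that depends linearly on the local Mg2+ concentration, and every number in it (rate constants, concentrations, residence times) is a hypothetical placeholder rather than a literature or measured value. The function names are likewise made up for this sketch.

```python
import math


def surviving_fraction(mg_trace_mM, dt_s, rate_fn):
    """Fraction of intact strands after following one Mg2+ trajectory.

    Integrates first-order decay dN/dt = -k(c(t)) * N with a Riemann sum,
    so the result is exp(-sum(k(c_i) * dt)).
    """
    integral = sum(rate_fn(c) * dt_s for c in mg_trace_mM)
    return math.exp(-integral)


def linear_rate(c_mM, k0=1e-9, k1=1e-8):
    """Hypothetical rate law k = k0 + k1 * [Mg2+] in 1/s (placeholder values)."""
    return k0 + k1 * c_mM


# Placeholder trajectory: brief visits to a high-Mg2+ zone near the interface,
# followed by long residence in the low-Mg2+ bulk (one sample per minute).
dt = 60.0
cycle = [1.0] * 5 + [0.1] * 55   # Mg2+ in mM
trace = cycle * 24               # 24 identical cycles
uniform = [1.0] * len(trace)     # same duration at uniformly high Mg2+

f_cycling = surviving_fraction(trace, dt, linear_rate)
f_uniform = surviving_fraction(uniform, dt, linear_rate)
print(f"cycling: {f_cycling:.6f}, uniform: {f_uniform:.6f}")
```

      Replacing the placeholder trace with simulated trajectories and the placeholder rate law with published hydrolysis kinetics would give the comparison described above; under this toy rate law, the cycling trajectory necessarily degrades less than the uniformly high-Mg2+ case.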

      (3) Finally, I am curious whether the authors have considered designing a simulation or experiment that uses the imidazole- or 2′,3′-cyclic phosphate-activated ribonucleotides. For instance, a fully paired RNA duplex and a fluorescently-labeled primer could be incubated in the presence of activated ribonucleotides +/- flux and subsequently analyzed by gel electrophoresis to determine how much primer extension has occurred. The reason for this suggestion is that, due to the slow kinetics of chemical primer extension, the reannealing of the fully complementary strands as they pass through the high Mg++ zone, which is required for primer extension, may outcompete the primer extension reaction. In the case of the DNA polymerase, the enzymatic catalysis likely outcompetes the reannealing, but this may not recapitulate the uncatalyzed chemical reaction.

      This is certainly on our to-do list. Our current focus is on templated ligation rather than templated polymerization, and we are working hard to implement an RNA-only, enzyme-free ligation chain reaction, based on more optimized parameters for templated ligation from 2′,3′-cyclic phosphate activation that were just published [High-Fidelity RNA Copying via 2′,3′-Cyclic Phosphate Ligation, Adriana C. Serrão, Sreekar Wunnava, Avinash V. Dass, Lennard Ufer, Philipp Schwintek, Christof B. Mast, and Dieter Braun, JACS doi.org/10.1021/jacs.3c10813 (2024)]. However, we would first try this at an air-water interface, which was shown to work with RNA in a temperature gradient [Ribozyme-mediated RNA synthesis and replication in a model Hadean microenvironment, Annalena Salditt, Leonie Karr, Elia Salibi, Kristian Le Vay, Dieter Braun & Hannes Mutschler, Nature Communications doi.org/10.1038/s41467-023-37206-4 (2023)], before making the jump to the isothermal setting we describe here. We understand the question, but it has been good practice in the past to first get to know a setting with PCR before making the jump to RNA.

      Reviewer #2 (Recommendations for the authors):

      (1) Could the authors comment on the likelihood of the geological environments where the water inflow velocity equals the evaporation velocity?

      This is an important point to mention in the manuscript, thank you for pointing that out. To produce a defined experiment, we pushed the water out with a syringe pump, regulated so that the flow rate matched the evaporation. We imagine that a real system would self-regulate the inflow of the water column through a more complex geometry of the gas flow, automatically matching the evaporation with the reflow of water. The interface would either move closer to the gas flux or recede, depending on whether the inflow exceeds or falls short of the evaporation rate. As the interface moves closer, evaporation speeds up, while moving away slows it down. This dynamic process stabilizes the system, with surface tension ultimately fixing the interface in place.
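
      The stabilizing feedback described above can be illustrated with a toy model. This is a sketch under strong assumptions, not a description of the actual experiment: the interface position x is measured toward the gas flux in arbitrary units, it moves at a rate equal to inflow minus evaporation, and the evaporation rate is assumed to grow linearly with x. The functional form and all parameter values are hypothetical.

```python
def simulate_interface(inflow, evap_at, x0, dt=0.01, steps=5000):
    """Euler integration of dx/dt = inflow - evap_at(x).

    x is the interface position measured toward the gas flux (arbitrary units).
    """
    x = x0
    for _ in range(steps):
        x += dt * (inflow - evap_at(x))
    return x


def evap_at(x, e0=0.5, slope=1.0):
    """Hypothetical evaporation rate that rises linearly as the interface nears the gas flux."""
    return e0 + slope * x


# Start once far from the gas flux and once too close to it.
x_from_far = simulate_interface(1.0, evap_at, x0=-2.0)
x_from_near = simulate_interface(1.0, evap_at, x0=3.0)
print(x_from_far, x_from_near)
```

      In this toy model both starting positions relax to the same point, where evaporation equals inflow (x = (inflow - e0) / slope). Any evaporation law that increases monotonically toward the gas flux would give a similar stable fixed point; surface tension and pore geometry are not modelled here.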

      We have already seen some of this dynamic in the experiments, but have so far not found a geometry within our 2-dimensional, constant-thickness design that makes it work for a longer time. Very likely a 3-dimensional reservoir of water with lower frictional forces would be able to do this, but it would require a full redesign as a multi-thickness microfluidic device. The more we think about it, the more we envisage implementing the next version of the experiment with a real porous volcanic rock inside a humidity chamber that simulates a full 6 h prebiotic day. We would then lose much of the reproducibility of the experiment, but would likely gain a way for recondensation of water by dew on a cold morning to refill the water reservoirs in the rocks. Apologies for digressing towards future experiments.

      (2) Could the authors speculate on using gases other than ambient air to provide the flux and possibly even chemical energy? For example, using carbonyl sulfide or vaporized methyl isocyanide could drive amino acid and nucleotide activation, respectively, at the gas-water interface.

      This is an interesting prospect for future work with this system. We have also thought about introducing ammonia for pH control and possible reactions. We were amazed in the past that using CO2 instead of air had a profound impact on the replication and the strand separation [Water cycles in a Hadean CO2 atmosphere drive the evolution of long DNA, Alan Ianeselli, Miguel Atienza, Patrick Kudella, Ulrich Gerland, Christof Mast & Dieter Braun, Nature Physics doi.org/10.1038/s41567-022-01516-z (2022)]. Going further in this direction absolutely makes sense, and because the gas acts mostly on the length-selectively accumulated molecules at the interface, only the selected molecules will be affected, which adds to the selection pressure of early evolutionary scenarios.

      Of course, in the manuscript, we use ambient air as a proxy for any gas, focusing primarily on the energy introduced through momentum transfer and evaporation. We speculate that soluble gases could establish chemical gradients, such as pH or redox potential, from the bulk solution to the interface, similar to the Mg2+ accumulation shown in Figure 3c. The nature of these gradients would depend on each gas's solubility and diffusivity. We have already observed such effects in thermal gradients [Keil, L. M., Möller, F. M., Kieß, M., Kudella, P. W., & Mast, C. B. (2017). Proton gradients and pH oscillations emerge from heat flow at the microscale. Nature Communications, 8(1), 1897.], and finding similar behavior in an isothermal environment would be a significant discovery.

      (3) Line 162: Instead of "risk," I suggest using "rate".

      Oh well - thanks for pointing this out! Will be changed.

      (4) Using FRET of a DNA duplex as an indicator of salt concentration is a decent proxy, but a more direct measurement of salt concentration would provide further merit to the explicit statement that it is the salt concentration that is changing in the system and not another hidden parameter.

      Directly observing salt concentration using microscopy is a difficult task. While there are dyes that change their fluorescence depending on the local Na+ or Mg2+ concentration, they do not operate differentially, i.e. by taking a ratio between two color channels. Only a ratiometric readout avoids artifacts from the dye molecules themselves being accumulated by the non-equilibrium setting. We were able to do this for pH in the past, but did not find comparable optical salt sensors. This is the reason we ended up with a FRET pair, with the advantage that we actually probe the strand separation that we are interested in anyhow. Using such a dye in future work would, however, without a doubt enhance the understanding of not only this system, but also our thermal gradient environments.

      (5) Figure 3a: Could the authors add information on "Dried DNA" to the caption? I am assuming this is the DNA that dried off on the sides of the vessel but cannot be sure.

      Thanks to the reviewer for pointing this out. This is correct and we will describe this better in the revised manuscript.

      (6) Figure 4b and c: How reproducible is this data? Have the authors performed this reaction multiple independent times? If so, this data should be added to the manuscript.

      The gel electrophoresis experiments were performed in triplicate and are shown in full in the supplementary information. The data in c are hard to reproduce, as the interface is not static and ROI measurements are therefore difficult to average over repeats. Including the data from the independent repeats will, however, give the reader insight into some of the experimental difficulties, such as air bubbles that form from degassing as the liquid heats up and travel upwards to the interface, disrupting the ongoing fluorescence measurements.

      (7) Line 256: "shielding from harmful UV" statement only applies to RNA oligomers as UV light may actually be beneficial for earlier steps during ribonucleoside synthesis. I suggest rephrasing to "shielding nucleic acid oligomers from UV damage.".

      Will be adjusted as mentioned.

      (8) The final paragraph in the Results and Discussion section would flow better if placed in the Conclusion section.

      This is a good point and we will merge results and discussion closer together.

      (9) Line 262, "...of early Life" is slightly overstating the conclusions of the study. I suggest rephrasing to "...of nucleic acids that could have supported early life."

      This is a fair comment. We thank the reviewer for their detailed analysis of the manuscript!

      (10) In references, some of the journal names are in sentence case while others are in title case (see references 23 and 26 for example).

      Thanks - this will be fixed.

    1. eLife assessment

      This important study demonstrates a mechanism underlying the sex-dependent regulation of the susceptibility to gut colonization by Methicillin-resistant Staphylococcus aureus (MRSA). The evidence supporting the conclusion is solid, but additional experiments would strengthen the findings. The work will interest biologists who are working on intestinal infection and immunity.

    2. Reviewer #1 (Public review):

      Summary:

      Lejeune et al. demonstrated sex-dependent differences in the susceptibility to MRSA infection. The authors demonstrated the role of the microbiota and sex hormones as potential determinants of susceptibility. Moreover, the authors showed that Th17 cells and neutrophils contribute to sex hormone-dependent protection in female mice.

      Strengths:

      The role of microbiota was examined in various models (gnotobiotic, co-housing, microbiota transplantation). The identification of responsible immune cells was achieved using several genetic knockouts and cell-specific depletion models. The involvement of sex hormones was clarified using ovariectomy and the FCG model.

      Weaknesses:

      The mechanisms by which specific microbiota confer female-specific protection remain unclear.

    3. Reviewer #2 (Public review):

      The current study by Lejeune et al. investigates factors that allow for persistent MRSA infection in the GI tract. They developed an intriguing model of intestinal MRSA infection that does not use the traditional antibiotic approach, thereby allowing for a more natural infection that includes the normal intestinal microbiota. This model is more akin to what might be expected to be observed in a healthy human host. They find that biological sex plays a clear role in bacterial persistence during infection but only in mice bred at an NYU Facility and not those acquired from Jackson Labs. This clearly indicates a role for the intestinal microbiome in affecting female bacterial persistence but not male persistence which was unaffected by the origin of the mice and thus the microbiome. Through a series of clever microbiome-specific transfer experiments, they determine that the NYU-specific microbiome plays a role in this sexual dimorphism but is not solely responsible. Additional experiments indicate that Th17 cells, estrogen, and neutrophils also participate in the resistance to persistent infection. Notably, they assess the role of sex chromosomes (X/Y) using the established four core genotype model and find that these chromosomes appear to play little role in bacterial persistence.

      Overall, the paper nicely adds to the growing body of literature investigating how biological sex impacts the immune system and the burden of infectious disease. The conclusions are mostly supported by the data although there are some aspects of the data that could be better addressed and clarified.

      (1) There is something of a disconnect between the initial microbiome data and the later data that analyzes sex hormones and chromosomes. While there are clearly differences in microbial species across the two sites (NYU and JAX) how these bacterial species might directly interact with immune cells to induce female-specific responses is left unexplored. At the very least it would help to try and link these two distinct pieces of data to try and inform the reader how the microbiome is regulating the sex-specific response. Indeed, the reader is left with no clear exploration of the microbiota's role in the persistence of the infection and thus is left wanting.

      (2) While the authors make a reasonable case that Th17 T cells are important for controlling infection (using RORgt knockout mice that cannot produce Th17 cells), it is not clear how these cells even arise during infection since the authors make most of the observations 2 days post-infection, which is long before a normal adaptive immune response would be expected to arise. The authors acknowledge this, but their explanation is incomplete. The increase in Th17 cells they observe is predicated on mitogenic stimulation, so they are not specific (at least in this study) for MRSA. It would be helpful to see a specific restimulation of these cells with MRSA antigens to determine if there are pre-existing, cross-reactive Th17 cells specific for MRSA and microbiota species which could then link these two as mentioned above.

      (3) The ovariectomy experiment demonstrates a role for ovarian hormones; however, it lacks a control of adding back ovarian hormones (or at least estrogen) so it is not entirely obvious what is causing the persistence in this experiment. This is especially important considering the experiments demonstrating no role for sex chromosomes thus demonstrating that hormonal effects are highly important. Here it leaves the reader without a conclusive outcome as to the exact hormonal mechanism.

      (4) The discussion is underdeveloped and is mostly a rehash of the results. It would greatly enhance the manuscript if the authors would more carefully place the results in the context of the current state of the field including a more enhanced discussion of the role of estrogen, microbiome, and T cells and how the field might predict these all interact and how they might be interacting in the current study as well.

    4. Reviewer #3 (Public review):

      Summary:

      Using a mouse model of Staphylococcus aureus gut colonization, Lejeune et al. demonstrate that the microbiome, immune system, and sex are important contributing factors for whether this important human pathogen persists in the gut. The work begins by describing differential gut clearance of S. aureus in female B6 mice bred at NYU compared to those from Jackson Laboratories (JAX). NYU female mice cleared S. aureus from the gut but NYU male mice and mice of both sexes from JAX exhibited persistent gut colonization. Further experimentation demonstrated that differences between staphylococcal gut clearance in NYU and JAX female mice were attributed to the microbiome. However, NYU male and female mice harbor similar microbiomes, supporting the conclusion that the microbiome cannot account for the observed sex-dependent clearance of S. aureus gut colonization. To identify factors responsible for female clearance of S. aureus, the authors performed RNAseq on intestinal epithelial cells and cells enriched within the lamina propria. This analysis revealed sex-dependent transcriptional responses in both tissues. Genes associated with immune cell function and migration were distinctly expressed between the sexes. To determine which immune cell types contribute to S. aureus clearance Lejeune et al employed genetic and antibody-mediated immune cell depletion. This experiment demonstrated that CD4+ IL17+ cells and neutrophils promote the elimination of S. aureus from the gut. Subsequent experiments, including the use of the 'four core genotype model' were conducted to discern between the roles of sex chromosomes and sex hormones. This work demonstrated that sex-chromosome-linked genes are not responsible for clearance, increasing the likelihood that hormones play a dominant role in controlling S. aureus gut colonization.

      Strengths:

      A strength of the work is the rigorous experimental design. Appropriate controls were executed and, in most cases, multiple approaches were conducted to strengthen the authors' conclusions. The conclusions are supported by the data.

      The following suggestions are offered to improve an already strong piece of scholarship.

      Weaknesses:

      The correlation between female sex hormones and the elimination of S. aureus from the gut could be further validated by quantifying sex hormones produced in the four core genotype mice in response to colonization. Additionally, and this may not be feasible, but according to the proposed model administering female sex hormones to male mice should decrease colonization. Finally, knowing whether the quantity of IL-17a CD4+ cells changes in the OVX mice has the potential to discern whether abundance/migration of the cells or their activation is promoted by female sex hormones.

      In the Discussion, the authors highlight previous work establishing a link between immune cells and sex hormone receptors, but whether the estrogen (and progesterone) receptor is differentially expressed in response to S. aureus colonization could be assessed in the RNAseq dataset. Differential expression of known X and Y chromosome-linked genes was discussed but specific sex hormones or sex hormone receptors, like the estrogen receptor, were not. This potential result could be highlighted.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      Lejeune et al. demonstrated sex-dependent differences in the susceptibility to MRSA infection. The authors demonstrated the role of the microbiota and sex hormones as potential determinants of susceptibility. Moreover, the authors showed that Th17 cells and neutrophils contribute to sex hormone-dependent protection in female mice.

      Strengths:

      The role of microbiota was examined in various models (gnotobiotic, co-housing, microbiota transplantation). The identification of responsible immune cells was achieved using several genetic knockouts and cell-specific depletion models. The involvement of sex hormones was clarified using ovariectomy and the FCG model.

      Weaknesses:

      The mechanisms by which specific microbiota confer female-specific protection remain unclear.

      We thank the reviewer for highlighting the strengths of the manuscript, including the models and techniques we employ. We agree that the relationship between the microbiota and sex-dependent protection is less developed compared with other aspects of the study. In preparing a revised manuscript, we intend to perform a more thorough comparison of male vs. female microbiota, along with quantification of sex hormones and downstream Th17 function (neutrophil recruitment and activation).

      Reviewer #2 (Public review):

      Overall, the paper nicely adds to the growing body of literature investigating how biological sex impacts the immune system and the burden of infectious disease. The conclusions are mostly supported by the data although there are some aspects of the data that could be better addressed and clarified.

      We thank the reviewer for appreciating our contribution. We intend to perform experiments to fill in gaps and to revise the text to increase clarity and acknowledge limitations.

      (1) There is something of a disconnect between the initial microbiome data and the later data that analyzes sex hormones and chromosomes. While there are clearly differences in microbial species across the two sites (NYU and JAX) how these bacterial species might directly interact with immune cells to induce female-specific responses is left unexplored. At the very least it would help to try and link these two distinct pieces of data to try and inform the reader how the microbiome is regulating the sex-specific response. Indeed, the reader is left with no clear exploration of the microbiota's role in the persistence of the infection and thus is left wanting.

      We agree. This comment is similar to Reviewer #1’s feedback. As mentioned above, we anticipate clarifying the association between sex differences and the microbiota. We will attempt to investigate specific bacteria, although some aspects of microbiota characterization may be outside the timeframe of the revision.

      (2) While the authors make a reasonable case that Th17 T cells are important for controlling infection (using RORgt knockout mice that cannot produce Th17 cells), it is not clear how these cells even arise during infection since the authors make most of the observations 2 days post-infection, which is long before a normal adaptive immune response would be expected to arise. The authors acknowledge this, but their explanation is incomplete. The increase in Th17 cells they observe is predicated on mitogenic stimulation, so they are not specific (at least in this study) for MRSA. It would be helpful to see a specific restimulation of these cells with MRSA antigens to determine if there are pre-existing, cross-reactive Th17 cells specific for MRSA and microbiota species which could then link these two as mentioned above.

      We acknowledge that this is a major limitation of our study. Although an experiment demonstrating pre-existing, cross-reactive T cells would help support our conclusion, aspects of MRSA biology may make the results of this experiment difficult to interpret. We have consulted with an expert on MRSA virulence factors, co-lead author Dr. Victor Torres, about the feasibility of this experiment. MRSA possesses superantigens, such as Staphylococcal enterotoxin B, which bind directly to specific Vβ regions of T-cell receptors (TCR) and major histocompatibility complex (MHC) class II on antigen-presenting cells, resulting in hyperactivation of T lymphocytes and monocytes/macrophages. Additionally, other MRSA virulence factors, such as α-hemolysin and LukED, can induce cell death of lymphocytes. MRSA’s enterotoxins are heat stable, so heat-inactivation of the bacterium may not help in this matter. For these reasons, the results of restimulating lymphocytes with MRSA antigens may be difficult to interpret. We humbly suggest that addressing this aspect of the mechanism is outside the scope of this manuscript.

      A study by Shao et al. provides an example of a host commensal species inducing Th17 cells with cross-reactivity against MRSA. Upon intestinal colonization, the intestinal fungus Candida albicans influences T cell polarization towards a Th17 phenotype in the spleen and peripheral lymph nodes which provided protection to the host against systemic candidemia. Interestingly, this induction of protective Th17 cells, increased IL-17 and responsiveness in circulating Ly6G+ neutrophils also protected mice from intravenous infection with MRSA, indicating that T cell activation and polarization by intestinal C. albicans leads to non-specific protective responses against extracellular pathogens.

      Shao TY, Ang WXG, Jiang TT, Huang FS, Andersen H, Kinder JM, Pham G, Burg AR, Ruff B, Gonzalez T, Khurana Hershey GK, Haslam DB, Way SS. Commensal Candida albicans Positively Calibrates Systemic Th17 Immunological Responses. Cell Host & Microbe. 2019 Mar 13;25(3):404-417.e6. doi: 10.1016/j.chom.2019.02.004. PMID: 30870622; PMCID: PMC6419754.

      Reviewer #3 (Public review):

      Strengths:

      A strength of the work is the rigorous experimental design. Appropriate controls were executed and, in most cases, multiple approaches were conducted to strengthen the authors' conclusions. The conclusions are supported by the data.

      The following suggestions are offered to improve an already strong piece of scholarship.

      Weaknesses:

      The correlation between female sex hormones and the elimination of S. aureus from the gut could be further validated by quantifying sex hormones produced in the four core genotype mice in response to colonization. Additionally, and this may not be feasible, but according to the proposed model administering female sex hormones to male mice should decrease colonization. Finally, knowing whether the quantity of IL-17a CD4+ cells changes in the OVX mice has the potential to discern whether abundance/migration of the cells or their activation is promoted by female sex hormones.

      In the Discussion, the authors highlight previous work establishing a link between immune cells and sex hormone receptors, but whether the estrogen (and progesterone) receptor is differentially expressed in response to S. aureus colonization could be assessed in the RNAseq dataset. Differential expression of known X and Y chromosome-linked genes was discussed but specific sex hormones or sex hormone receptors, like the estrogen receptor, were not. This potential result could be highlighted.

      We appreciate the comment on the scholarship and thank the Reviewer for the insightful suggestions to improve this manuscript. We intend to measure hormone levels and perform the recommended (or similar) experiments, depending on the availability of reagents and mice during the revision period. We also apologize for not including references that address some of the Reviewer’s questions. Other research groups have compared the levels of hormones between XX and XY males and females in the four core genotypes model and have found similar levels of circulating testosterone in adult XX and XY males. No difference was found in circulating estradiol levels in XX vs XY- females when tested at 4-6 or 7-9 months of age.

      Karen M. Palaszynski, Deborah L. Smith, Shana Kamrava, Paul S. Burgoyne, Arthur P. Arnold, Rhonda R. Voskuhl, A Yin-Yang Effect between Sex Chromosome Complement and Sex Hormones on the Immune Response. Endocrinology, Volume 146, Issue 8, 1 August 2005, Pages 3280–3285, https://doi.org/10.1210/en.2005-0284

      Sasidhar MV, Itoh N, Gold SM, Lawson GW, Voskuhl RR. The XX sex chromosome complement in mice is associated with increased spontaneous lupus compared with XY. Ann Rheum Dis. 2012 Aug;71(8):1418-22. doi: 10.1136/annrheumdis-2011-201246. Epub 2012 May 12. PMID: 22580585; PMCID: PMC4452281.

      Examination of the levels of estrogen, progesterone, and androgen receptors in our cecal-colonic lamina propria RNA-seq dataset is an excellent idea. We will add these analyses to the revised manuscript. We are planning additional experiments to better understand the contributions of hormones or their receptors and anticipate including such data in either a response letter or revised manuscript.

    1. eLife assessment

      This valuable manuscript investigated the role of glutamate signaling in the dorsomedial striatum of rats in a treadmill-based task and reported that it differs in goal-trackers compared to sign-trackers in a way that corresponds to differences in behaviour. The evidence supporting these claims is solid but could be further strengthened by adding more analyses and more detailed descriptions of current analyses. These findings will primarily be of interest to behavioural neuroscientists.

    2. Reviewer #1 (Public review):

      Summary:

      The authors measured glutamate transients in the DMS of rats as they performed an action selection task. They identified diverse patterns of behavior and glutamate dynamics depending on the pre-existing behavioral phenotype of the rat (sign tracker or goal tracker). Using pathway-specific DREADDs, they showed that these behavioral phenotypes and their corresponding glutamate transients were differentially dependent on input from the prelimbic cortex to the DMS.

      Strengths:

      Overall there are some very interesting results that make an important contribution to the field. Notably, the results seem to point to differential recruitment of the PL-DMS pathway in goal-tracking vs sign-tracking behaviors.

      Weaknesses:

      There is a lot of missing information and data that should be reported/presented to allow a complete understanding of the findings and what was done. The writing of the manuscript was mostly quite clear, however, there are some specific leaps in logic that require more elaboration, and the focus at the start and end on cholinergic neurons and Parkinson's disease are, at the moment, confusing and require more justification.

    3. Reviewer #2 (Public review):

      Summary:

      The authors aimed to determine whether goal-directed and cue-driven attentional strategies (goal- and sign-tracking phenotypes) were associated with variation in cued motor responses and dorsomedial striatal (DMS) glutamate transmission. They used a treadmill task in which cues indicated whether rats should turn or stop to receive a reward. They collected and analyzed several behavioral measures related to task performance with a focus on turns (performance, latency, duration) for which there are more measures than for stops. First, they established that goal-trackers perform better than sign-trackers in post-criterion turn performance (cued turns completed) and turn initiation. They used glutamate sensors to measure glutamate transmission in DMS. They performed analyses on glutamate traces that suggest phasic glutamate DMS dynamics to cues were primarily associated with successful turn performance and were more characteristic of goal-trackers (i.e., rats with "goal-directed" attentional strategy). Smaller and more frequent DMS glutamate peaks were associated with other task events, cued misses (missed turns), cued stops, and reward delivery and were more characteristic of sign-trackers (i.e., rats with "cue-driven" attentional strategies). Consistent with the reported glutamate findings, chemogenetic inhibition of prelimbic-DMS glutamate transmission had an effect on goal-trackers' turn performance without affecting sign-trackers' performance in the treadmill task.

      Strengths:

      The power of the sign- and goal-tracking model to account for neurobiological and behavioral variability is critically important to the field's understanding of the heterogeneity of the brain in health and disease. The approach and methodology are sound in their contribution to this important effort.

      The authors establish behavioral differences, measure a neurobiological correlate of relevance, and then manipulate that correlate in a broader circuitry and show a causal role in behavior that is consistent with neurobiological measurements and phenotypic differences.

      Sophisticated analyses provide a compelling description of the authors' observations.

      Weaknesses:

      It is challenging to assess what is considered the "n" in each analysis (trial, session, rat, trace (averaged across a session or single trial)). Representative glutamate traces (n = 5 traces (out of hundreds of recorded traces)) are used to illustrate a central finding, while more conventional trial-averaged population activity traces are not presented or analyzed. The latter would provide much-needed support for the reported findings and conclusions. Digging deeper into the methods, results, and figure legends provides some answers to the reader, but much can be done to clarify what each data point represents and, in particular, how each rat contributes to a reported finding (i.e., a single trial-averaged trace per session for multiple sessions, or dozens of single traces across multiple sessions).

      Representative traces should in theory be consistent with population averages within phenotype, and if not, discussion of such inconsistencies would enrich the conclusions drawn from the study. In particular, population traces of the phasic cue response in GT may resemble the representative peak examples, while smaller irregular peaks of ST may be missed in a population average (averaged prolonged elevation) and could serve as a rationale for more sophisticated analyses of peak probability presented subsequently.

    4. Reviewer #3 (Public review):

      Summary:

      Avila and colleagues investigate the role of glutamate signaling in the dorsomedial striatum in a treadmill-based task where rats learn to turn or stop their walking based on learning cue-associations that allow them to acquire rewards. Phenotypic variation in Pavlovian conditioned sign and goal-tracking behavior was examined, where behavioral differences in stopping and turning were observed. Glutamate signals in the DMS were recorded during the treadmill task and were related to features of cue-controlled movement, with a stronger relationship seen for goal trackers. Finally, chemogenetic inhibition of prelimbic neurons projecting to the DMS (the predicted source of those glutamate signals) preferentially affected cued movement in goal trackers. The authors couch these experiments in the context of cognitive control-attentional mechanisms, movement disorders, and individual differences in cue reactivity.

      Strengths:

      Overall these studies are interesting and are of general relevance to a number of research questions in neurology and psychiatry. The assessment of the intersection of individual differences in cue-related learning strategies with movement-related questions - in this case, cued turning behavior - is an interesting and understudied question. The link between this work and growing notions of corticostriatal control of action selection makes it timely.

      Weaknesses:

      The clarity of the manuscript could be improved in several places, including in the graphical visualization of data. It is sometimes difficult to interpret the glutamate results, as presented, in the context of specific behavior, for example.

    5. Author response:

      Reviewer #1 (Public Review):

      Strengths:

      Overall there are some very interesting results that make an important contribution to the field. Notably, the results seem to point to differential recruitment of the PL-DMS pathway in goal-tracking vs sign-tracking behaviors.

      Thank you.

      Weaknesses:

      There is a lot of missing information and data that should be reported/presented to allow a complete understanding of the findings and what was done. The writing of the manuscript was mostly quite clear, however, there are some specific leaps in logic that require more elaboration, and the focus at the start and end on cholinergic neurons and Parkinson's disease are, at the moment, confusing and require more justification.

      In the revised paper, we provide additional information in support of results and clarify procedures and findings. Furthermore, we expand the discussion of the proposed interpretational framework that suggests that the contrasts between the cortical-striatal processing of movement cues in sign- versus goal trackers are related to previously established, parallel contrasts in the cortical cholinergic detection of attention-demanding cues.

      Reviewer #2 (Public review):

      Strengths:

      The power of the sign- and goal-tracking model to account for neurobiological and behavioral variability is critically important to the field's understanding of the heterogeneity of the brain in health and disease. The approach and methodology are sound in their contribution to this important effort.

      The authors establish behavioral differences, measure a neurobiological correlate of relevance, and then manipulate that correlate in a broader circuitry and show a causal role in behavior that is consistent with neurobiological measurements and phenotypic differences.

      Sophisticated analyses provide a compelling description of the authors' observations.

      Thank you.

      Weaknesses:

      It is challenging to assess what is considered the "n" in each analysis (trial, session, rat, trace (averaged across a session or single trial)). Representative glutamate traces (n = 5 traces (out of hundreds of recorded traces)) are used to illustrate a central finding, while more conventional trial-averaged population activity traces are not presented or analyzed. The latter would provide much-needed support for the reported findings and conclusions. Digging deeper into the methods, results, and figure legends provides some answers to the reader, but much can be done to clarify what each data point represents and, in particular, how each rat contributes to a reported finding (i.e., a single trial-averaged trace per session for multiple sessions, or dozens of single traces across multiple sessions).

      Representative traces should in theory be consistent with population averages within phenotype, and if not, discussion of such inconsistencies would enrich the conclusions drawn from the study. In particular, population traces of the phasic cue response in GT may resemble the representative peak examples, while smaller irregular peaks of ST may be missed in a population average (averaged prolonged elevation) and could serve as a rationale for more sophisticated analyses of peak probability presented subsequently.

      Figures 5c-f depict individual data from all rats and trials. For all major analyses, the revised manuscript consolidates information about the number of rats per phenotype and sex, and the number of trials contributed by individual rats, in the result section.

      As detailed in the section on statistical methods, and as mentioned by the reviewer under Strengths, we used advanced statistical methods to assure that data from individual animals contribute equally to the overall result, and to minimize the possibility that an inordinate number of trials obtained from just one or a couple of rats biased the overall analysis.

      As the reviewer correctly pointed out, we have chosen not to show trial- or subject-averaged traces to illustrate glutamate release dynamics across trials. The present analyses focus on peak glutamate concentrations, the number of peaks, and the timing of peaks relative to a task cue or a behavioral event. Within a response bin, such as the 2-s period following turn cues, glutamate peaks (as defined in Methods) occur at variable times relative to cue onset. Averaging traces over a population of rats or trials would “wash out” the phenotype- and task event-dependent patterns of glutamate peaks, yielding, for example, a single, nearly 2-s long plateau for cue-locked glutamate recordings from STs (Figure 5b). Thus, subject- or trial-averaged traces would not illustrate the major findings described in this paper and would be uninformative. As already mentioned, individual data from all subjects and trials are shown in Figs 5c-f.
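
      The "wash-out" argument can be illustrated with a toy simulation. The sketch below is not the authors' analysis; the peak shape (a unit-amplitude Gaussian), its width, the sampling interval, the number of trials, and the uniform jitter of peak times within the 2-s bin are all assumptions chosen only for illustration. It shows that when each trial contains one sharp peak at a variable latency, the trial-averaged trace is a low, broad elevation whose height is far below that of any single-trial peak.

```python
import math
import random


def simulate_trial(peak_time, n_samples=200, dt=0.01, width=0.05):
    """One synthetic trace over a 2-s bin: a single unit-amplitude
    Gaussian peak centered at peak_time (seconds). All parameters
    are illustrative assumptions, not values from the paper."""
    return [math.exp(-0.5 * ((i * dt - peak_time) / width) ** 2)
            for i in range(n_samples)]


def average_traces(traces):
    """Point-by-point mean across trials (the trial-averaged trace)."""
    n = len(traces)
    return [sum(column) / n for column in zip(*traces)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical peak latencies, jittered uniformly within the 2-s bin.
peak_times = [rng.uniform(0.2, 1.8) for _ in range(200)]
traces = [simulate_trial(t) for t in peak_times]
mean_trace = average_traces(traces)

single_trial_peak = sum(max(trace) for trace in traces) / len(traces)
averaged_peak = max(mean_trace)

print(f"mean single-trial peak height: {single_trial_peak:.2f}")
print(f"peak height of averaged trace: {averaged_peak:.2f}")
```

      Under these assumptions, peak-based measures computed per trial (peak amplitude, peak count, peak latency) retain information that the averaged trace does not.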

      Reviewer #3 (Public review):

      Strengths:

      Overall these studies are interesting and are of general relevance to a number of research questions in neurology and psychiatry. The assessment of the intersection of individual differences in cue-related learning strategies with movement-related questions - in this case, cued turning behavior - is an interesting and understudied question. The link between this work and growing notions of corticostriatal control of action selection makes it timely.

      Thank you.

      Weaknesses:

      The clarity of the manuscript could be improved in several places, including in the graphical visualization of data. It is sometimes difficult to interpret the glutamate results, as presented, in the context of specific behavior, for example.

      We appreciate the reviewer’s concerns about the complexity of some of the graphics, particularly the results from the arguably innovative analysis illustrated in Figure 6. Figure 6 illustrates that the likelihood of a cued turn can be predicted based on single and combined glutamate peak characteristics. The revised legend for this figure provides additional information and examples to ease the readers’ access to this figure.

    1. eLife assessment

      This important study reports the formation of a new organelle, called giant unilocular vacuole (GUVac), in mammary epithelial cells through a macropinocytosis-like process. The evidence supporting conclusions is convincing, using state-of-the-art cell biology techniques. This work will be of interest to cell biologists and contribute to the understanding of cell survival mechanisms against anoikis.

    2. Reviewer #1 (Public review):

      The authors found that the loss of cell-ECM adhesion leads to the formation of giant unilocular vacuoles in mammary epithelial cells. This occurs via a macropinocytosis-like process and involves PI3 kinase. They further identified dynamin and septin as essential machinery for this process. Interestingly, this process is reversible and appears to protect cells from cell death.

      Strengths: The data are clean and convincing to support the conclusions. The analysis is comprehensive, using multiple approaches such as SIM and TEM. The discussion on lactation is plausible and interesting.

      Weaknesses: As the first paper describing this phenomenon, it is adequate. However, the elucidation of the molecular mechanisms is not as exciting as it does not describe anything new. It is hoped that novel mechanisms will be elucidated in the future. Especially the molecules involved in the reversing process could be quite interesting.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript describes an interesting observation and provides initial steps towards understanding the underlying molecular mechanism.

      The manuscript describes that the majority of non-tumorigenic mammary gland epithelial cells (MCF-10A) in suspension initiate entosis. A smaller fraction of cells form a single giant unilocular vacuole (hereafter referred to as a GUVac). GUVac appeared to be empty and did not contain invading (entotic) cells. The formation of GUVac could be promoted by disrupting actin polymerisation with LatB and CytoD. The formation of GUVacs correlated with resistance to anoikis. GUVac formation was detected in several other epithelial cells from secretory tissues.

      The authors then use electron microscopy and super-resolution imaging to describe the biogenesis of GUVac. They find that GUVac formation is initiated by a macropinocytosis-like phenomenon (that is independent of actin polymerisation). This process leads to the formation of large plasma membrane invaginations that pinch off from the PM to form larger vesicles that fuse with each other into GUVacs.

      Inhibition of actin polymerisation in suspended MCF-10A leads to the recruitment of Septin 6 to the PM via its amphipathic helix. Treatment with FCF (a septin polymerisation inhibitor) blocked GUVac biogenesis, as did pharmacological inhibition of dynamin-mediated membrane fission. The fusion of these vesicles into GUVacs required (perhaps not surprisingly) PI3P.

      Strengths:

      The authors have made an interesting and potentially important observation. They describe the formation of an endo-lysosomal organelle (a giant unilocular vacuole - GUVac) in suspended epithelial cells and correlate the formation of GUVacs with resistance to anoikis.

      Comments on revised version:

      Additional experiments, including a better characterization of GUVac biogenesis, as well as knockdown and knock out of class II PI3Kα (PI3K-C2α) or class III PI3K (VPS34), have improved the manuscript.

    4. Reviewer #3 (Public review):

      Summary:

      Loss of cell attachment to extracellular matrix (ECM) triggers anoikis (a type of programmed cell death), and resistance to anoikis plays a role in cancer development. However, mechanisms underlying anoikis resistance, and the precise role of F-actin, are not fully known.

      Here the authors describe the formation of a new organelle, giant unilocular vacuole (GUVac), in cells whose F-actin is disrupted during loss of matrix attachment. GUVac formation (diameter >500 nm) resulted from a previously unrecognised macropinocytosis-like process, characterized by inwardly curved micron-sized plasma membrane invaginations, dependent on F-actin depolymerization, septin recruitment and PI(3)P. Finally, the authors show GUVac formation after loss of matrix attachment promotes resistance to anoikis.

      From these results, the authors conclude that GUVac formation promotes cell survival in environments where F-actin is disrupted and conditions of cell stress.

      Strengths:

      The manuscript is clear and well-written, figures are all presented at a very high level.

      A variety of cutting edge cell biology techniques (eg time-lapse imaging, EM, super-resolution microscopy) are used to study the role of cytoskeleton in GUVac formation, discovering (i) a macropinocytosis-like process dependent on F-actin depolymerisation, SEPT6 recruitment and PI(3)P contributes to GUVac formation, and (ii) GUVac formation is associated with resistance to cell death.

      Experimental work was advanced in response to reviewers' comments, improving the manuscript message and mechanistic advance.

      Weaknesses:

      The manuscript is highly reliant on the use of drugs, or combinations of drugs, for long periods of time (6hr, 18hr). However, in the revised manuscript, authors test conclusions drawn from experiments involving drugs using other canonical cell biology approaches.

      The molecular characterisation of GUVacs has been advanced, although not fully resolved.

      The authors show (mostly using pharmacological inhibition) that F-actin is key for GUVac formation. The precise role of F-actin / GUVac formation in anoikis resistance will be the focus of future work.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The authors found that the loss of cell-ECM adhesion leads to the formation of giant unilocular vacuoles in mammary epithelial cells. This occurs via a macropinocytosis-like process and involves PI3 kinase. They further identified dynamin and septin as essential machinery for this process. Interestingly, this process is reversible and appears to protect cells from cell death.

      Strengths:

      The data are clean and convincing to support the conclusions. The analysis is comprehensive, using multiple approaches such as SIM and TEM. The discussion on lactation is plausible and interesting.

      We thank the reviewer for the summary of our study and the positive comment.

      Weaknesses:

      As the first paper describing this phenomenon, it is adequate. However, the elucidation of the molecular mechanisms is not as exciting as it does not describe anything new. It is hoped that novel mechanisms will be elucidated in the future. In particular, the molecules involved in the reversing process could be quite interesting.

      We agree with the reviewer’s comments and believe that investigating the molecular mechanisms involved in reversing GUVac formation, as illustrated in Figure 5J, would be valuable for future research.

      Additionally, the relationship to conventional endocytic compartments, such as early and late endosomes, is not analyzed.

      We thank the reviewer for the valuable comment. To determine whether GUVac displays markers of other endomembrane systems, we analyzed several markers, including EEA1, Rab5, LC3B, LAMP1, and Transferrin receptor (TfR). At early time points (1 h), we observed several large vesicles that had taken up 70kDa Dextran and exhibited EEA1 or Rab5, markers of early endosomes. By 6 hours, some of these large vesicles showed lysotracker positivity, indicating a transition from early to late endosomal fate, similar to the maturation process of conventional macropinocytic vesicles (see new Figure 1-figure supplement 2A). However, once the vesicles fused, grew, and became GUVac, these markers did not consistently correspond with the GUVac membrane but were instead unevenly distributed around it (new Figure 1-figure supplement 2B, C). This made it difficult to determine whether they were localized to separate organelles or part of the GUVac membrane. Interestingly, we found that the Transferrin receptor (TfR), which also marks a general membrane population involved in the endocytic pathway (such as PM invagination), was evenly distributed within the GUVac membrane (new Figure 1-figure supplement 2B, D). Therefore, GUVac appears to possess heterogeneous characteristics of the endocytic membrane, mainly with the TfR marker (likely due to PM invagination) and some partial endomembrane system markers. However, further analysis would be required to confirm this.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript "Formation of a giant unilocular vacuole via macropinocytosis-like process confers anoikis resistance" describes an interesting observation and provides initial steps towards understanding the underlying molecular mechanism.

      The manuscript describes that the majority of non-tumorigenic mammary gland epithelial cells (MCF-10A) in suspension initiate entosis. A smaller fraction of cells forms a single giant unilocular vacuole (hereafter referred to as a GUVac). GUVac appeared to be empty and did not contain invading (entotic) cells. The formation of GUVac could be promoted by disrupting actin polymerisation with LatB and CytoD. The formation of GUVacs correlated with resistance to anoikis. GUVac formation was detected in several other epithelial cells from secretory tissues.

      The authors then use electron microscopy and super-resolution imaging to describe the biogenesis of GUVac. They find that GUVac formation is initiated by a macropinocytosis-like phenomenon (that is independent of actin polymerisation). This process leads to the formation of large plasma membrane invaginations that pinch off from the PM to form larger vesicles that fuse with each other into GUVacs.

      Inhibition of actin polymerisation in suspended MCF-10A leads to the recruitment of Septin 6 to the PM via its amphipathic helix. Treatment with FCF (a septin polymerisation inhibitor) blocked GUVac biogenesis, as did pharmacological inhibition of dynamin-mediated membrane fission. The fusion of these vesicles into GUVacs required (perhaps not surprisingly) PI3P.

      Strengths:

      The authors have made an interesting and potentially important observation. They describe the formation of an endo-lysosomal organelle (a giant unilocular vacuole - GUVac) in suspended epithelial cells and correlate the formation of GUVacs with resistance to anoikis.

      We thank the reviewer for the summary of our study and the positive comment.

      Weaknesses:

      My major concern is the experimental strategy that is used throughout the paper to induce and study the formation of GUVac. Almost every experiment is conducted in suspended cells that were treated with actin depolymerising drugs (e.g. LatB) and thus almost all key conclusions are based on the results of these experiments. I only have a few suggestions that would improve these experiments or change their outcome and interpretation. Yet, I believe it is essential to identify the endogenous pathway leading to the actin depolymerisation that drives the formation of GUVacs in detached epithelial cells (or alternatively to figure out how it is suppressed in most detached cells). A first step in that direction would be to investigate the polymerization status of actin in MCF-10A cells that 'spontaneously' form GUVacs and to test if these cells also become resistant to anoikis.

      We thank the reviewer for the valuable comments and fully acknowledge the limitations of our approach. Many detached cells likely tend to contact one another and aggregate, which suppresses GUVac formation. However, it remains unclear whether cells that spontaneously form GUVac in suspension have a weakened F-actin structure, which would be valuable to investigate in future studies.

      Also, it would be great (and I believe reasonably easy) to better characterise molecular markers of GUVacs (LAMPs, Rabs, cathepsins, etc.) to discriminate them from other endosomal organelles.

      In response to a similar comment from Reviewer 1, we analyzed markers of other endocytic compartments, including EEA1, Rab5, Transferrin receptor (TfR), LC3B, and LAMP1. At early time points (1 h), we observed several large vesicles that had taken up 70kDa Dextran and exhibited EEA1 or Rab5, markers of early endosomes. By 6 hours, some of these large vesicles showed lysotracker positivity, indicating a transition from early to late endosomal fate, similar to the maturation process of conventional macropinocytic vesicles (see new Figure 1-figure supplement 2A). However, once the vesicles fused, grew, and became GUVac, these markers did not consistently correspond with the GUVac membrane but were instead unevenly distributed around it (new Figure 1-figure supplement 2B, C). This made it difficult to determine whether they were localized to separate organelles or part of the GUVac membrane. Interestingly, we found that the Transferrin receptor (TfR), which also marks a general membrane population involved in the endocytic pathway (such as PM invagination), was evenly distributed within the GUVac membrane (new Figure 1-figure supplement 2B, D). Therefore, GUVac appears to possess heterogeneous characteristics of the endocytic membrane, mainly with the TfR marker (likely due to PM invagination) and some partial endomembrane system markers. However, further analysis would be required to confirm this.

      Reviewer #3 (Public Review):

      Summary:

      Loss of cell attachment to extracellular matrix (ECM) triggers anoikis (a type of programmed cell death), and resistance to anoikis plays a role in cancer development. However, mechanisms underlying anoikis resistance, and the precise role of F-actin, are not fully known.

      Here the authors describe the formation of a new organelle, giant unilocular vacuole (GUVac), in cells whose F-actin is disrupted during loss of matrix attachment. GUVac formation (diameter >500 nm) resulted from a previously unrecognised macropinocytosis-like process, characterized by inwardly curved micron-sized plasma membrane invaginations, dependent on F-actin depolymerization, septin recruitment, and PI(3)P. Finally, the authors show GUVac formation after loss of matrix attachment promotes resistance to anoikis.

      From these results, the authors conclude that GUVac formation promotes cell survival in environments where F-actin is disrupted and conditions of cell stress.

      Strengths:

      The manuscript is clear and well-written, figures are all presented at a very high level.

      A variety of cutting-edge cell biology techniques (eg time-lapse imaging, EM, super-resolution microscopy) are used to study the role of the cytoskeleton in GUVac formation. It is discovered that: (i) a macropinocytosis-like process dependent on F-actin depolymerisation, SEPT6 recruitment, and PI(3)P contributes to GUVac formation, and (ii) GUVac formation is associated with resistance to cell death.

      We thank the reviewer for the concise summary of our study and positive comments.

      Weaknesses:

      The manuscript is highly reliant on the use of drugs, or combinations of drugs, for long periods of time (6 hr, 18 hr). Wherever possible, the authors should also test conclusions drawn from experiments involving drugs using other canonical cell biology approaches (e.g., siRNA, CRISPR). Although suggestive as a first approach, it is not reliable to draw conclusions from experiments where only drug combinations are being advanced (e.g., LatB + FCF).

      We thank the reviewer for the comment and suggestion. As suggested, we employed siRNAs targeting Septin2 and Septin9 in cells treated with LatB as an alternative to the drug combination approach. This genetic approach, combined with chemical treatment, led to a consistent reduction in GUVac formation, similar to the results observed with LatB+FCF treatment (see new Figure 3D-WB and graph).

      F-actin is well known to play a wide variety of roles in cell death and other canonical cell death pathways (PMID: 26292640). The authors show using pharmacological inhibition that F-actin is key for GUVac formation. However, especially when testing for physiological relevance, how can these other roles for F-actin be ruled out?

      In Figure 5, we investigate the physiological relevance of GUVac, highlighting its role in suppressing apoptosis and enhancing anoikis resistance. As the reviewer correctly noted, F-actin inhibition is known to reduce apoptotic signaling (PMID: 16072039). However, we observed that anoikis resistance is lost when GUVac is suppressed through knockout of either PI3KC2alpha or VPS34 in cells with F-actin disrupted by LatB (Figure 5I). This suggests that GUVac plays a role in suppressing apoptosis independently of F-actin depolymerization-induced apoptosis resistance.

      To test the role of septins in GUVac formation, only recruitment studies and no direct functional work are performed. The drug forchlorfenuron (FCF) is used, but this is well known to have off-target effects (PMID: 27473917).

      We thank the reviewer for the valuable comments. To eliminate potential off-target effects of FCF, as described above, we employed siRNA targeting Septin 2 and Septin 9 and observed similar results (see new Figure 3D).

      Cells that possess GUVac are resistant to anoikis, but how are these cells resistant? This report is focused on mechanisms underlying GUVac formation and does not directly test for mechanisms underlying anoikis resistance.

      We fully agree with the reviewer’s comments and recognize the importance of uncovering the mechanism behind GUVac-mediated anoikis resistance for future research. It will likely be essential to investigate how prosurvival signaling pathways are activated, such as PI3K-AKT signaling (as shown in Figure 5-Supplement 1) or the YAP/TAZ pathway.

      Reviewer #1 (Recommendations For The Authors):

      Figure 4 Supplemental 1. What are the faint bands in clones 23, 26, and 29? Are they cross-reacting bands? Or Vps34?

      We apologize if the data in our original manuscript were misleading. To clarify the specificity of the VPS34 antibody in the Western blot analysis of VPS34 KO clones, we compared these samples with those from siRNA-mediated VPS34-depleted cells (see new Figure 4-Supplement 1E, which replaces the original Figure). Consistent with the known size of VPS34 at approximately 100 kDa, we observed a clear disappearance of the VPS34 band at around 100 kDa in the sgVPS34 clones, which was comparable to the size observed in siRNA-treated cells.

      Reviewer #2 (Recommendations For The Authors):

      Figure 2B: Only 4 cells were counted. Please comment.

      At the outset of this study, we faced technical difficulties in preparing TEM samples, which limited the number of samples included in Figure 2B. However, subsequent experiments that combined TEM with super-resolution microscopy, as shown in Figure 4D-F, produced similar data on plasma membrane invagination, as depicted in Figure 2B, which is the initial step in the formation of GUVac.

      Figure 2C: do cells shrink after treatment with EIPA or LatB? Please comment.

      We apologize if the data presented in our original manuscript were misleading. Control cells treated with DMSO display multiple cell-in-cell structures (known as 'entosis'), which typically results in a larger overall cell size compared to EIPA or LatB-treated non-entotic single cells. This might have created the impression that cells shrink relative to the control under EIPA or LatB treatment. We hope this explanation has answered the reviewer’s question.

      Figure 3A: The changes in the localization of mCherry-Septin6 appear to be very dramatic. Are these results properly reflected by the quantification in Figure 3B? Is indeed the entire mCherry-Septin6 pool recruited to the plasma membrane? Wouldn't that imply that all other Septin6-regulated processes are blocked?

      Again, we apologize if the data presented in our original manuscript caused any confusion. In Figure 3B, we quantified only the number of filament-like Septin6 structures predominantly observed in LatB-treated cells, rather than measuring changes in the relative fluorescence intensity of Septin6 between the plasma membrane and the cytosol. Although we could not estimate the proportion of total Septin6 recruited to the plasma membrane from the cytosol based solely on Figure 3A-B, conducting plasma membrane fractionation experiments with endogenous Septin6, followed by Western blot analysis, would be valuable for addressing this issue in future studies.

      Figure 3D: Please also provide data for the 6h time-point (as in all other experiments).

      We apologize for omitting the 6-hour time point, which may have caused confusion. The new Figure 3E (previously Figure 3D) shows that recruitment of wild-type Septin6, but not the amphipathic helix (AH) deletion mutant, occurs at a 6-hour time point.

      Figure 3E: Molecular weight for western blot is missing.

      We thank the reviewer for pointing this out and have revised the figure accordingly.

      Line 188 - Title of subchapter could include dynamin.

      We appreciate the reviewer’s helpful suggestion and have updated the revised manuscript to reflect this. The phrase "Recruitment of Septin to the Fluctuating Plasma Membrane Drives Macropinocytosis-like Process" has been revised to "Septin and Dynamin Drive Macropinocytosis-like Process".

      Line 450 - please describe how the genotyping of MCF10a gene-engineered cells was performed.

      We confirmed the knockout of MCF10A cell lines by Western blot analysis using specific antibodies against VPS34 and PI3KC2α, rather than through genotyping.

      Reviewer #3 (Recommendations For The Authors):

      (1) The manuscript is highly reliant on the use of drugs, or combinations of drugs, for long periods of time (6 hr, 18 hr, etc.). Wherever possible, authors should test conclusions drawn from experiments involving drugs also using other canonical cell biology approaches (e.g., siRNA, CRISPR). Although suggestive as a first approach, it is not reliable to draw conclusions from experiments where only drug combinations are being advanced (e.g., LatB + FCF).

      We thank the reviewer for the comment. As suggested, we employed siRNAs targeting Septin2 and Septin9 in cells treated with LatB as an alternative to the drug combination approach. This genetic approach, combined with chemical treatment, led to a consistent reduction in GUVac formation, similar to the results observed with LatB+FCF treatment (see new Figure 3D-WB and graph).

      (2) SEPT6 is recruited at an inwardly curved plasma membrane. Can the authors better describe what type of structure is being recruited/quantified (filaments, collar-like structures, etc)?

      We apologize if the data presented were unclear. As outlined in the Methods section in the original manuscript, we detected puncta-like Septin6 structures using the Find Maxima tool in ImageJ, which could include both filamentous and collar-like structures that were less apparent in the DMSO control. We have added additional explanations in the revised manuscript in the legend of Figure 3B to clarify the recruitment of Septin6.

      Previous work has shown that octameric septin complexes are linking actin to the plasma membrane (PMID: 36562751). Tests for the recruitment/function of other key septins such as SEPT7 and SEPT9 to support conclusions.

      As previously mentioned, to further explore the role of other septin family members in GUVac formation, we tested the roles of Septin9 and Septin2 using siRNAs and found that they are essential for this process (see new Figure 3D). Unfortunately, we were unable to assess the localization of Septin2 and Septin9 due to the lack of suitable antibodies for detecting endogenous proteins by immunofluorescence.

      (3) SEPT6 recruitment is impaired when cells are treated with FCF. FCF is well known to have off-target effects (PMID: 25217460, PMID: 27473917). siRNA for SEPT2, SEPT7 and/or SEPT9 can be used to test phenotypes obtained using FCF.

      We thank the reviewer for the comment. As also mentioned above, to eliminate potential off-target effects of FCF, we used siRNA to target Septin2 and Septin9, and obtained similar results (see new Figure 3D).

      (4) SEPT6 is recruited to the fluctuating cell membrane via the amphipathic helix (AH) domain (Figure 3D). Are these only representative images? It is not clear what readers should be looking at - can the authors provide arrows to highlight what is the difference +/- AH? Can something be quantified?

      We thank the reviewer for the suggestion and have added arrows to the inset of the merge panel in Figure 3E, along with line profile analysis, to emphasize the failure of the AH deletion mutant of Septin6 to be recruited to the plasma membrane.

      Throughout Figure 3, why use LatB treatment at different times?

      We apologize if this was not clearly addressed in our original manuscript. Throughout the study, we primarily used an 18-hour LatB treatment to evaluate GUVac formation, as this longer period allows for gradual vesicle fusion. In contrast, we utilized 6-hour treatments to demonstrate that Septin6 recruitment and subsequent plasma membrane invagination occur at earlier time points, as evidenced by the data in Figure 2G (super-resolution live imaging) and Figure 4D (electron microscopy analysis). This clarification has been incorporated into the revised manuscript.

      (5) F-actin is well known to play a wide variety of roles in cell death and other canonical cell death pathways (PMID: 26292640). The authors show using pharmacological inhibition that F-actin is key for GUVac formation. However, especially when testing for physiological relevance, how can these other roles for F-actin be ruled out?

      In Figure 5, we investigate the physiological relevance of GUVac, highlighting its role in suppressing apoptosis and enhancing anoikis resistance. As the reviewer correctly noted, F-actin inhibition is known to reduce apoptotic signaling (PMID: 16072039). However, when GUVac is suppressed through knockout of either PI3KC2alpha or VPS34 in cells with F-actin disrupted by LatB, anoikis resistance is lost (see Figure 5H, I). This suggests that GUVac plays a role in suppressing apoptosis independently of F-actin depolymerization-induced apoptosis resistance.

      (6) Cells that possess GUVac are resistant to anoikis, but how are these cells resistant? This report is focused on mechanisms underlying GUVac formation and does not directly test for mechanisms underlying anoikis resistance.

      We fully agree with the reviewer’s comments and recognize the importance of uncovering the mechanism behind GUVac-mediated anoikis resistance for future research. It will likely be essential to investigate how prosurvival signaling pathways are activated, like the PI3K-AKT signaling (as shown in Figure 5-Supplement 1) or the YAP/TAZ pathway.

      (7) In the Discussion, there is a lot of text on involution and speculative relevance of GUVac formation. I would focus the Discussion more on the clear results discovered here.

      We thank the reviewer for the feedback and have revised the Discussion to shorten the text concerning involution.

      (8) Figure 5. GUVac formation promotes cell survival in altered actin and matrix environments. In Figure 5J, it will not be clear to readers outside the field what is being shown here.

      We appreciate the reviewer’s suggestion and have added two distinct dotted lines around the vacuole and cell area in the revised figure to emphasize the gradual reduction in its size over time.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      SUMO proteins are processed and then conjugated to other proteins via a C-terminal di-glycine motif. In contrast, the N-terminus of some SUMO proteins (SUMO2/3) contains lysine residues that are important for the formation of SUMO chains. Using NMR studies, the N-terminus of SUMO was previously reported to be flexible (Bayer et al., 1998). The authors are investigating the role of the flexible (referred to as intrinsically disordered) N-terminus of several SUMO proteins. They report their findings and modeling data that this intrinsically disordered N-terminus of SUMO1 (and the C. elegans Smo1) regulates the interaction of SUMO with SUMO interacting motifs (SIMs).

      Strengths:

      Among the strongest experimental data suggesting that the N-terminus plays an inhibitory function are their observations that

      (1) SUMO1∆N19 binds more efficiently to SIM-containing Usp25, Tdp2, and RanBp2,
      (2) SUMO1∆N19 shows improved sumoylation of Usp25,
      (3) changing negatively-charged residues, ED11,12KK in the SUMO1 N-terminus increased the interaction and sumoylation with/of USP25.

      The paper is very well-organized, clearly written, and the experimental data are of high quality. There is good evidence that the N-terminus of SUMO1 plays a role in regulating its binding and conjugation to SIM-containing proteins. Therefore, the authors are presenting a new twist in the ever-evolving saga of SUMO, SIMs, and sumoylation.

      Weaknesses:

      Much has been learned about SUMO through structure-function analyses and this study is another excellent example. I would like to suggest that the authors take some extra time to place their findings into the context of previous SUMO structure-function analyses. Furthermore, it would be fitting to place their finding of a potential role of N-terminally truncated Smo1 into the context of the many prior findings that have been made with regard to the C. elegans SUMO field. Finally, regarding their data modeling/simulation, there are questions regarding the data comparisons and whether manipulations of the N-terminus also have an effect on the 70/80 region of the core.

      We thank the reviewer for insightful and constructive comments to improve our manuscript. We have now placed our findings in the context of previous structure-function analyses at several occasions, details of which can be found in our replies to the detailed comments.

      We are also placing the C. elegans data into the context of previously published findings on the various functions of SMO-1 in controlling development and maintaining genomic stability (lines 510ff). Finally, we addressed all questions and suggestions regarding the comparison of MD simulation and NMR data, and addressed the question of whether mutations in the N-terminus affected the 70/80 region. We have now clarified in the manuscript that the sum of MD and NMR data does not allow a clear-cut conclusion on the 70/80 interactions.

      Reviewer #2 (Public Review):

      Summary:

      This very interesting study originated from a serendipitous observation that the deletion of the disordered N-terminal tail of human SUMO1 enhances its binding to its interaction partners. This suggested that the N terminus of SUMO1 might be an intrinsic competitive inhibitor of SUMO-interacting motif (SIM) binding to SUMO1. Subsequent experiments support this mechanism, showing that in humans it is specific to SUMO1 and does not extend to SUMO2 or SUMO3 (except, perhaps, when the N terminus of SUMO2 becomes phosphorylated, as the authors intriguingly suggest - and partially demonstrate). The auto-inhibition of SUMO1 via its N-terminal tail apparently explains the lower binding of SUMO1 compared to SUMO2 to some SIMs and lower SIM-dependent SUMOylation of some substrates with SUMO1 compared to SUMO2, thus adding an important element to the puzzle of SUMO paralogue preference. In line with this explanation, N-terminally truncated SUMO1 was equally efficient to SUMO2 in the studied cases. The inhibitory role of SUMO1's N terminus appears conserved in other species including S. cerevisiae and C. elegans, both of which contain only one SUMO. The study also elucidates the molecular mechanism by which the disordered N-terminal region of SUMO1 can exert this auto-inhibitory effect. This appears to depend on the transient, very highly dynamic physical interaction between the N terminus and the surroundings of the SIM-binding groove based mostly on electrostatic interactions between acidic residues in the N terminus and basic residues around the groove.

      Strengths:

      A key strength of this study is the interplay of different techniques, including biochemical experiments, NMR, molecular dynamics simulations, and, at the end, in vivo experiments. The experiments performed with these different techniques inform each other in a productive way and strengthen each other's conclusions. A further strength is the detailed and clear text, which patiently introduces, describes, and discusses the study. Finally, in terms of the message, the study has a clear, mechanistic message of fundamental importance for various aspects of the SUMO field, and also more generally for protein biochemists interested in the functional importance of intrinsically disordered regions.

      Weaknesses:

      Some of the authors' conclusions are similar to those from a recent study by Lussier-Price et al. (NAR, 2022), the two studies likely representing independent inquiries into a similar topic. I don't see it as a weakness by itself (on the contrary), but it seems like a lost opportunity not to discuss at more length the congruence between these two studies in the discussion (Lussier-Price is only very briefly cited). Another point that can be raised concerns the wording of conclusions from molecular dynamics. The use of molecular dynamics simulations in this study has been rigorous and fruitful - indeed, it can be a model for such studies. Nonetheless, parameters derived from molecular dynamics simulations, including kon and koff values, could be more clearly described as coming from simulations and not experiments. Lastly, some of the conclusions - such as enhanced binding to SIM-containing proteins upon N-terminal deletion - could be additionally addressed with a biophysical technique (e.g. ITC) that is more quantitative than gel-based pull-down assays - but I don't think it is a must.

      Thank you very much for pointing towards the study of Lussier-Price. We now point out congruent findings in more detail in the discussion.

      We also thank the reviewer for the advice to present and discuss the MD findings more clearly, and more explicitly specify which parameters were obtained from MD. We have made changes throughout the Results and Discussion sections.

      We agree that it would be a nice addition to use ITC measurements as a more quantitative method to assess differences in binding affinities upon deletion of the SUMO N-terminus. We had tried to measure affinities between SUMO and SIM-containing binding partners by ITC, but in our hands this failed. In the study of Lussier-Price et al., the authors were able to measure differences in SIM binding upon deleting the N-terminus, but only when they used phosphorylated SIM peptides. Follow-up studies, e.g., on the effect of SUMO’s N-terminal modifications, should certainly include more quantitative measurements such as ITC; however, these studies will have to be picked up by others. The main PI Frauke Melchior and most contributing authors moved on to new challenges.

      Reviewing Editor (Recommendations For The Authors):

      Both reviewers agreed that your manuscript presents novel results and the key findings including the self-inhibitory role of the N-terminal tail of SUMO proteins in their interaction with SIM are overall well supported by the data. The reviewers also provided constructive suggestions. They pointed out that some simulation results are not clear, which could be strengthened by control analysis and by toning down the related descriptions. In addition, Reviewer 2 suggested that the conclusions from the current biochemical and simulation studies could be further reinforced by more quantitative binding measurements. We hope that these points can be addressed in the revision.

      We thank both reviewers for their insightful and constructive comments and the appreciative tone. In our replies above and below we address most of the raised concerns.

      We strongly recommend the change of the current title. eLife advises that the authors avoid unfamiliar abbreviations or acronyms, or spell out in full or provide a brief explanation for any acronyms in the title.

      We changed the title to “The intrinsically disordered N-terminus of SUMO1 is an intramolecular inhibitor of SUMO1 interactions” to avoid acronyms in the title.

      Reviewer #1 (Recommendations For The Authors):

      Major:

      Lines 190-262: The authors use NMR experiments and all-atom molecular dynamics (MD) simulations. They state that this approach reveals a highly dynamic interaction of the SUMO1 N-terminus with the core and that the SIM binding groove and the 70/80 region are temporarily occupied by the SUMO1 N-terminus (Fig. 3C). After comparing SUMO1, Smt3, SUMO2, and Smo1 by this approach they state that the most striking differences exist for the interaction with the SIM-binding groove, while interactions with the 70/80 region are rather comparable.

      The authors then compare the average binding time data of Figure 3C, D, E, F in Figure 3G.

      It is not clear which data points are included in the bar graphs of Figure 3G and how the individual data points (there are maybe 8 shown in each bar) correspond to the data shown in 3C, D, E, and F or if they are iterations (n?) of the modeled data. This should be clarified. Also, for comparison, the authors should also graph the average data of the 70/80 region.

      We clarified the data shown in Figure 3G as well as 3C-F, and how they relate to each other. Indeed, Figure 3G shows 8 data points for 8 trajectories, and their average. Figure 3C-F are based on the same 8 trajectories, in this case broken down per residue of the protein. The average data of the 70/80 region do not show any significant differences across the proteins, as is already visible from panels 3C-F.

      Line 322: More concerning, in Figure 5, the authors model how the ED11,12KK mutations disrupt the interaction between the N-terminus and the SIM-binding groove and state that this mutation leaves interactions with the 70/80 region largely untouched. Again, it is not clear which data points are included in the bar graphs 5D and 5G and how many iterations there were. Furthermore, data of 5B, C (SUMO1) and 5E, F (smo1) do show clear differences between the WT and mutants affecting both the SIM binding groove and the 70/80 region. The double mutation clearly seems to affect the 70/80 region when comparing 5B, C (SUMO1) and 5E, F (smo1), but this result is not mentioned. Indeed, the authors state that the double mutants leave the interactions with the 70/80 region largely untouched, but this is not borne out by the data presented.

      We improved the clarity of the legend of Figure 5 as suggested. We also thank the reviewer for the comment on the changes in the 70/80 region, to which we now explicitly point the reader in the corresponding Results section. We, however, refrain from drawing conclusions from the MD in this case, as this change is not supported by the NMR measurements (Fig. 5A). Charge-charge interactions in the charge-rich double mutants might be overstabilized in the MD simulations, a known problem of the canonical force fields used here, even though they were tailored for IDPs. We now cite a corresponding reference. Another potential explanation for why the CMPs do not pick up this change upon mutation could be a pronounced fuzziness in this region, which, however, is not apparent from the simulations. We would therefore not overinterpret these differences in the 70/80 region. Our key conclusion is the loss of interactions with the SIM-binding groove, and thus of cis-inhibition, upon mutation, which is supported by both MD and NMR.

      341: In their N-termini substitution experiments, the authors show that the SUMO1 core that carries the SUMO2 N-terminus (S2N-S1C) binds USP25 more efficiently than wt SUMO1. However, the SUMO1 core that carries the SUMO2 N-terminus is also reduced in its interaction with Usp25. This is concerning as the SUMO2 N-terminus was not predicted to interfere with SIM binding.

      We were excited to see that the inhibitory potential could be partially transplanted by swapping the N-termini of SUMO1 and SUMO2 demonstrating that some important determinants are contained within the N-terminal tail of SUMO proteins. However, the observed effects were partial indicating that also other determinants contribute and that we do not yet understand all aspects. Obviously, the SUMO1 and SUMO2 cores are similar (also in the area comprising the SIM binding groove) but not identical, and as the inhibition arises from dynamic interactions of the N-terminus with the SIM binding area, differences in the SUMO cores and in residues flanking SUMO’s N-terminus are likely to influence the inhibitory potential as well.

      Blue bars in 3G, 5D, and 6A look surprisingly similar down to the individual data points - does that mean that the same SUMO1 WT data was recycled for these different experiments? This is concerning to me.

      The data displayed in the figures listed above are derived from in silico simulations and indeed display the same data set for the case of SUMO1 WT repeatedly, as we also state in the figure legends (we had done so for 5D “(identical to Fig. 3C)”, and now added the same comment to 6A, thanks for pointing this out). We show the SUMO1 WT data again to facilitate comparing the different SUMO variants in MD simulations.

      Lines 352 and 496: The authors used phosphomimetic mutants to assess the effect of SUMO2 N-term phosphorylation on interaction with Usp25. The data suggest a mild phenotype (6G), which is borne out by the quantification in 6H. In contrast, the effect of an array of modifications for SUMO1 (Figures 6A - C) was solely analyzed by MD simulation. If possible, this data should be confirmed, at least by using a phosphomimetic at the Ser9 position of SUMO1. Alternatively, a caveat explaining the need to confirm these predictions by actual experiments should be added to the text.

      We already state in “Limitations of the study” that “While our MD simulations and in vitro studies with selected mutants point in this direction, we have not been able to generate quantitatively acetylated and/or phosphorylated SUMO variants to test this hypothesis.”

      We agree that the hypothesis needs experimental validation. Phosphomimetic amino acids can be a useful tool in some cases but fail to mimic a phosphate group in other cases. In the past, we had tested whether replacing Ser9 with a potentially phospho-mimicking amino acid (Glu) would further diminish binding of SIM-containing proteins compared to the already strongly reduced binding to wt SUMO1, but the effect was too mild to yield a significant difference, at least in our assay. Whether this is due to a failure of Glu to mimic phosphorylation of Ser9, to the limited sensitivity of our pulldown assay combined with the challenge of detecting inhibition relative to an already inhibited state, or to a flaw in our hypothesis, we have not been able to clarify so far. We therefore now also added a sentence to the paragraph introducing phosphoSer9 MD simulations (now line 367) stating that this hypothesis needs to be tested experimentally.

      Minor:

      Line 110: the authors should include references for their summary statement that "A defining feature of SUMO proteins is the intrinsically disordered N-terminus, whose function is only partly understood." Also cite in line 119.

      Thank you, we now included some references.

      Line 75: Please indicate early on that the N-terminus of some SUMO proteins contains lysines for the formation of SUMO chains. Please list them.

      We now list, which of the SUMO proteins used in this study contain lysine residues in their N-termini.

      Line 113: Please cite studies that elucidated the sumoylation of lysines in the N-terminus of SUMO2/3 proteins.

      Thank you, we now included some references.

      Line 153: The authors should include additional references on Smt3 structure function analyses to provide better context. One important detail, for example, is the important finding that Yeast SUMO (Smt3) deletion can be complemented by hsSUMO1 but not hsSUMO2 and hsSUMO3. Additionally, in yeast the entire Smt3 N-terminus can be deleted without detectable effects on growth, underscoring the enigmatic role of the N-terminus (Newman et al., 2017). Caveat also applies to line 266.

      Thank you, we now included some additional information and references around line 153 and below.

      164: The hypothesis that the SUMO1 N-terminus interferes with SIM binding groove ignores the previous observation that deletion of the SUMO2 N-terminus does not have an effect on binding (in vitro). While this is addressed later, the authors should clarify this e.g. by stating "a unique feature of the SUMO1 N-terminus".

      We now explicitly mention that this feature appears to be unique to SUMO1.

      374 and 499: The authors should discuss the caveat that the deletion of the N-terminus of Smt3 does not have a phenotype in yeast in vivo (Newman et al., 2017).

      We now discuss that Smt3’s N-terminus can be deleted without detectable phenotype, both in the results as well as in “Limitations of the study”.

      Line 367: I feel this is overstated and I do not see any evidence that post-translational modifications of the SUMO core play a role. Therefore, I suggest: Our data and modeling are consistent with an interpretation that the N-termini of human and C. elegans SUMO1 proteins are inhibitory and that other SUMO N-termini may acquire such a function upon posttranslational modification of the N-terminus.

      We agree that this is pure speculation and therefore restrict our hypothesis to modifications of the N-terminus.

      Line 374 ff: Since Smo-∆N12 increases sumoylation (Fig. 2I), it is likely that the in vivo defect is due to over-sumoylation in C. elegans. The authors should discuss this possibility and quote appropriate literature e.g.: Rytinki et al., Overexpression of SUMO perturbs the growth and development of Caenorhabditis elegans. Cell Mol Life Sci. 2011 Oct;68(19):3219-32. PMID: 21253676.

      In our study, we employ in vitro SUMOylation as a means to assess the SIM binding capability in an in-solution assay. For this, we use USP25 as a specific substrate known to depend on a SIM for its SUMOylation. We cannot exclude that some specific substrates depending on this same mechanism for their modification may be upregulated in modification also in the Smo-1∆N12 worms. In vivo however, the majority of SUMO substrates is not subject to SIM-dependent SUMOylation. We now added a control experiment showing that we neither observe significantly increased SUMO levels nor upregulated steady state levels of SUMOylation in these worms (Supplemental figure 8).

      The phenotypes shown in the paper by Rytinki et al. do not resemble the smo-1∆N12 mutants. Rather, we observed a specific defect in the meiotic germ cells at the pachytene stage causing increased apoptosis. Moreover, we show by western blot analysis that there is no global over-sumoylation occurring in smo-1∆N12 mutants (Fig. S8). Together, our data point to a germline-specific function of the SMO-1 N-terminus in maintaining genome stability (lines 510ff).

      Reviewer #2 (Recommendations For The Authors):

      Page 2 - "Small Ubiquitin-related modifiers of the SUMO family regulate thousands of proteins in eukaryotic cells" - The authors could consider a more precise statement, e.g. that SUMO modifiers have been detected on thousands of proteins and their regulatory effect on many proteins has been demonstrated.

      To be a bit more precise, the sentence now reads: “Ubiquitin-related proteins of the SUMO family are reversibly attached to thousands of proteins”. The summary has a word limit, hence we did not expand further at this place.

      Page 4 - "Both events require SUMO-binding motifs (reviewed, e.g. in 7 ." - The end bracket is missing. Also, isn't it too strong a statement that paralogue specificity always requires a SIM? I don't know all the literature sufficiently well, but the authors could double-check if it is correct to say that paralogue-specific SUMOylation always depends on a SIM.

      Thank you, we added the missing bracket. We agree that it would not be correct to say that paralogue-specificity always depends on a SIM. One alternative example is Dpp9, which shows a clear preference for SUMO1 without owning a SIM. Instead, Dpp9 harbors an alternative SUMO-binding motif, the E67-interacting loop, with a strong paralogue-preference (Pilla et al., 2012). We never intended to imply that a SIM is required for paralogue preference and we also rather generically wrote “SUMO binding motif” instead of “SIM”. However, in the subsequent paragraph about SUMO binding motifs we only go into details of SIMs as one of three classes of SUMO binding motifs not even mentioning the alternative classes. To make this more obvious, we now list the two other known classes of SUMO binding motifs hoping that it will shed the correct light onto our previous statement about paralogue preference.

      Page 4 - In the nice discussion of different types of SIMs, the authors could consider mentioning also the special case of TDP2, which is used later by them as a model binding protein. This could provide an occasion to explain what the unusual "split SIM", mentioned on page 6, but not discussed, is, and what its relation to a normal SIM is. Also, it can perhaps be mentioned that TDP2 contacts SUMO2 not only through the two hydrophobic elements contiguous in space that mimic a SIM but also through a slightly larger interface around these regions on the surface of a folded domain.

      Thank you for pointing this out. In the introduction, we extended our section on SUMO binding and now also included TDP2’s “split SIM”.

      Page 11-12 - In the section "Interaction between SUMO's disordered N-termini and the SIM binding groove is highly dynamic" (and corresponding figures), it should be stated that the discussed kinetic parameters are derived from molecular dynamics simulations and not experimental measurements. It was not very clear to me. This also applies to this sentence on page 17: "First, we observed a very fast (ns) rate of the binding/unbinding process", which in its current form suggests direct observation rather than simulation.

      We thank the reviewer for pointing this out; in fact, Reviewer #1 made the same comment. We now state clearly, in the Results and Discussion sections (pages 11-12 and 18, previously 17), that the rates were calculated from MD simulations.

      Page 16 - The authors could briefly mention that this relatively long disordered N-terminal tail is a specific feature of SUMO proteins that distinguishes them from ubiquitin. I guess it is obvious to people from the SUMO field, but I don't think it is explicitly stated anywhere in the text and it could be interesting for readers who are less familiar with SUMO/ubiquitin differences.

      Thank you, we added a short half-sentence pointing out this difference.

      Page 17 - "The N-terminal region remains fully disordered in the bound state and is thus a classic example of intrinsic disorder irrespective of the binding state." - it could be added to this sentence that this is suggested by molecular dynamics simulations and not directly observed.

      We added the information that this finding is based on the MD simulations.

      Page 18 - "(e.g., 41,53 or flanking the SIM binding groove24,42" - the end bracket is missing.

      Thanks, we added it.

      Page 19 - "Our analysis in C. elegans (Fig. 7) suggests that this N-terminal function is particularly important in DNA damage response, a pathway that is strongly dependent on the SUMO system." - this brief description of the in vivo data seems to overgeneralise them a little bit. Perhaps one can describe what was observed with slightly more nuance.

      See changes on p.19, lines 510ff.

    2. eLife assessment

      This work demonstrates an important regulatory role of the N-terminal disordered tail of small ubiquitin-like modifier (SUMO) proteins, which modulate the function of various proteins in eukaryotic cells. The authors present convincing evidence that the N-terminal tail of SUMO inhibits SUMO's interaction with downstream effector proteins and SUMOylation targets, and that this regulatory mechanism depends on the SUMO paralogue or the phosphorylation of the N-terminal tail. This discovery significantly advances the field by providing a possible explanation of how SUMO paralogues select their effectors and SUMOylation targets.

    3. Reviewer #2 (Public review):

      Summary:

      This very interesting study originated from a serendipitous observation that the deletion of the disordered N-terminal tail of human SUMO1 enhances its binding to its interaction partners. This suggested that the N terminus of SUMO1 might be an intrinsic competitive inhibitor of SUMO-interacting motif (SIM) binding to SUMO1. Subsequent experiments support this mechanism, showing that in humans it is specific to SUMO1 and does not extend to SUMO2 or SUMO3 (except, perhaps, when the N terminus of SUMO2 becomes phosphorylated, as the authors intriguingly suggest - and partially demonstrate). The auto-inhibition of SUMO1 via its N-terminal tail apparently explains lower binding of SUMO1 compared to SUMO2 to some SIMs and lower SIM-dependent SUMOylation of some substrates with SUMO1 compared to SUMO2, thus adding an important element to the puzzle of SUMO paralogue preference. In line with this explanation, N-terminally truncated SUMO1 was equally efficient to SUMO2 in the studied cases. The inhibitory role of SUMO1's N terminus appears conserved in other species including S. cerevisiae and C. elegans, both of which contain only one SUMO. The study also elucidates the molecular mechanism by which the disordered N-terminal region of SUMO1 can exert this auto-inhibitory effect. This appears to depend on the transient, very highly dynamic physical interaction between the N terminus and the surroundings of the SIM-binding groove based mostly on electrostatic interactions between acidic residues in the N terminus and basic residues around the groove.

      Strengths:

      A key strength of this study is the interplay of different techniques, including biochemical experiments, NMR, molecular dynamics simulations, and, at the end, in vivo experiments. The experiments performed with these different techniques inform each other in a productive way and strengthen each other's conclusions. A further strength is the detailed and clear text, which patiently introduces, describes, and discusses the study. Finally, in terms of the message, the study has a clear, mechanistic message of fundamental importance for various aspects of the SUMO field, and also more generally for protein biochemists interested in the functional importance of intrinsically disordered regions. In revision, the authors have further improved the text.

      Weaknesses:

      In the future, further experimental validation will be required, particularly with regards to the biological importance of the uncovered mechanism. These limitations are satisfactorily pointed out by the authors themselves in the revised manuscript.

    1. Author response

      The following is the authors’ response to the original reviews.

      We thank the editors and reviewers for their thoughtful comments on our manuscript. We greatly appreciated the suggestions and recommendations that helped us to improve the study. With adaptations, and inclusion of novel data and analyses, we have addressed all points raised, and hope that by these improvements the study further meets the standards for eLife. 

      Reviewer #1 (Recommendations For The Authors):

      Minor text edits should be made.

      (1.1) As a recent study from the Wong lab also showed sebaceous gland regeneration following complete ablation (Veniaminova et al., 2023), this finding should be mentioned in the text, and the abstract ("Most strikingly...") should be toned down.

      We thank the reviewer for the positive feedback, and for highlighting this part of the study from the Wong lab. Although we cited this study in a different context, we had not discussed the sebaceous gland regeneration finding. We have now added this to the discussion section of the manuscript.

      (1.2) Introduction: In lines 31-33 discussing the connection of sebaceous glands with skin disorders, the 5 references cited seem to replicate the citations from a similar sentence in Veniaminova et al., 2019. The authors should vary their citations, as there are likely other publications that can be cited here.

      Additional references have been added.

      Reviewer #2 (Recommendations For The Authors):

      The manuscript is well written and the data are well presented in the figures.

      We thank the reviewer for the positive feedback.

      (2.1) Here are some points that could be taken into consideration to improve the manuscript:

      - Row 75 "the primary" regulator could be changed to "a crucial".

      We appreciate this suggestion and have made the text edit.

      - Row 86 could be added: ...is the dominant ligand of the Notch signalling.

      We have made the text edit as suggested.

      (2.2) Row 107-109 from the quantification of Figure 1G and Figure 2 it seems that only the aJ2 treatment has an SG phenotype. Why aJ1 doesn't have any effect? (same is true in other figures). If the data on aJ1 are maintained in the manuscript, this should be argued in the discussion section.

      The reviewer is correct in noting that the aJ1 treatment does not cause the phenotype, and this is indeed one of the key findings of the study. This is maintained throughout the manuscript. We have also cited references showing that embryonic and adult deletions of Jag1 do not cause any sebaceous gland defects. All these data argue that Jag1 is not the relevant Notch signaling ligand in sebocyte differentiation. We have further clarified this in the manuscript.

      (2.3) Related to Figure 3G. As the Lrig1 stem cells can go towards both the sebocyte differentiation, or the sebaceous duct differentiation, it would be interesting to evaluate if the differentiation impairment caused by the antibody treatment affects in a similar manner (or not) the sebaceous duct differentiation. This could be tested through immunofluorescence, selecting markers of sebaceous duct.

      We thank the reviewer for this thoughtful question. We were unable to find in the literature any markers unique to the sebaceous duct (that is, markers not expressed in other parts of the sebaceous gland, especially sebocytes). Any marker analysis would therefore be confounded by changes in expression due to the loss of sebocytes.

      However, we have evaluated the histology using bursting sebocytes releasing sebum as a proxy of a functional sebaceous duct. We have not found any significant differences between treatments using this metric (Fig. S1).

      (2.4) As the word "therapeutic" is often underlined in the manuscript, maybe a few sentences on the translational aspects of the results could be added to the discussion.

      We thank the reviewer for highlighting this point. We have added this to the discussion.

      (2.5) Figure 3 suggests that Jag2 is produced by basal sebocytes and used by these cells to induce sebocyte differentiation. I'm wondering if in an in vitro cell system (with a mixture of marked Jag2-expressing cells and marked Jag2-negative cells), it would be possible to understand if this mechanism of differentiation is a cell-autonomous mechanism or a mechanism based on cell competition (for instance, it would be possible that the progenitors compete for their niche on the basal layer by pushing neighbouring basal cells to differentiate presenting them Jag2).

      We thank the reviewer for the insightful suggestion. The mechanistic underpinning of how Notch signaling induces sebocyte differentiation is still unclear, and we find the reviewer’s suggestion very interesting. However, establishing an in vitro model that captures the aspects mentioned would require a lot of optimization and validation. To help rapid dissemination of our findings, we elected to keep this out of the manuscript, but we will certainly consider it for future studies.

      Reviewer #3 (Recommendations For The Authors):

      (3.1) The authors focussed on mouse back skin sebaceous glands to analyse the phenotype. Are the effects also reproducible in the sebaceous glands of the mouse ears and tail epidermis? If so, the data should be strengthened by quantifying the phenotype using tail epidermal whole mounts (Braun et al., 2003; Development, PMID: 12954714), ideally by co-staining sebaceous glands for differentiation markers (e.g. FASN, Adipophilin) or lipid deposits (e.g., Oil red O). Also, the authors need to clarify how many sebaceous glands were scored per mouse. If not, please provide a rationale explaining the location restriction.

      We thank the reviewer for pointing this out. Indeed, we have only incorporated data from the telogen dorsal skin of the animals. We have now more accurately reflected this in the revised manuscript. Additionally, we have added the number of sebaceous glands quantified in each figure per the reviewer’s suggestion.

      Since the stage of hair growth cycle can affect the sebaceous glands, we chose the resting (telogen) phase of the hair cycle to reliably study the sebaceous glands. At 8 weeks of age, hair follicles have uniformly entered the telogen phase. As subsequent re-entry into the anagen phase is asynchronous in the adult skin, the color of the dorsal skin of C57BL/6 mice can be used to determine whether the hair follicles are in the telogen phase or not. These reasons led us to choose this location, allowing us to study only telogen phase hair follicles.

      We also point out that previously reported data (Estrach et al., 2006) did not show differences between dorsal and tail skin, so we assume the mechanisms must largely be conserved. However, as the reviewer rightfully points out, we cannot be sure and have, therefore, indicated the dorsal location throughout the manuscript.

      (3.2) The micrographs in Figure 2 suggest that expression of both Jagged2 and Notch1 (intercellular domain) is not restricted to the sebaceous glands, as both molecules appear to be detected also in the isthmus and lower hair follicle. Of note, the online tool provided by the Kasper and Linnarsson labs (http://linnarssonlab.org/epidermis/) shows that both molecules are more widely expressed in mouse back skin. Please provide some analysis of the overall expression of these molecules in mouse skin. In line, is the observed effect of using the antagonising antibodies restricted to the sebaceous glands? Please provide additional data on proliferation and differentiation in the interfollicular epidermis, hair follicle cycling, and other skin compartments. For instance, the data published in the cited paper by Lafkas et al. (2015) suggest a thickening of the dermal adipocyte layer upon Jagged2 inhibition using monoclonal therapeutic antibodies.

      The reviewer is correct in noting that expression of both Jag2 and Notch1 is not restricted to the sebaceous gland. The Notch signaling pathway is a well-known regulator for epidermal differentiation, and members of the pathway are expressed in various locations of the skin, including the interfollicular epidermis and the hair follicle. The expression and function of Notch signaling in these locations has been reviewed in (Hsu et al., 2014; Nowell and Radtke, 2013; Watt et al., 2008). We have also added zoomed out images showing expression of Jag2 and Notch1 in the skin (Figure S2e,f).

      The effect of the antagonizing antibodies is not restricted to sebaceous glands, as we already noted in our discussion section: “While injections of the Notch blocking antibodies are systemic, we only observed a reduction in the number of Notch-active cells in the IFE, but not a complete loss.” The functional impact of the antibodies likely extends beyond the sebaceous gland, as the reviewer points out, but we consider understanding the full effect in other compartments to be beyond the scope of the current study.

      In our previous study (Lafkas et al., 2015), the skin was examined at different animal ages/gender and using different antibody dosing regimens, which is the likely explanation for the differences observed. We have now quantified the width of the adipocyte layer and the IFE and show that there are no significant differences between treatments (Figure S1g-j). This, together with the histology, suggests that there are no significant differences in the differentiation and proliferation of these compartments.

      (3.3) Since Jagged1 is a Wnt/beta-catenin target gene that is essential for (ectopic) hair follicle formation and differentiation (Estrach et al., 2006, Development, PMID: 17035290) and the sebaceous gland is widely considered as an epidermal compartment with absent/low Wnt/beta-catenin pathway activity during normal homeostasis (Lim & Nusse, 2013, Cold Spring Harbor Perspectives in Biology, PMID: 23209129), how is the expression of Notch1 and Jagged2 regulated upstream in sebocyte progenitors? It would be important to bring some more mechanistic insights into the upstream regulation of Notch activity. In line with comment 2, how are the compartment-specific effects molecularly regulated if the effects are not restricted to the sebaceous glands?

      The reviewer is correct in noting that the Wnt pathway does not seem to be a likely candidate for driving sebocyte differentiation through Notch signaling. Indeed, Wnt inhibition is required for sebocyte differentiation (Merrill et al., 2001; Niemann et al., 2002), and the Jag2 promoter region also does not contain TCF binding sites (Katoh and Katoh, 2006).

      We speculate that Myc might regulate Notch signaling in the sebaceous gland. It is expressed in the sebaceous gland basal stem cells and has been reported to positively regulate sebocyte differentiation (Cottle et al., 2013). In addition, studies have shown that Jag2 is a Myc target gene (Fiaschetti et al., 2014; Yustein et al., 2010). However, evaluating which upstream pathway potentially regulates Notch signaling, and resolving the regulatory network of sebocyte differentiation beyond the direct Notch ligands and receptors would require extensive in vivo modeling using KO and transgenic animals, which we consider to be beyond the scope of the current manuscript.

      References

      Cottle DL, Kretzschmar K, Schweiger PJ, Quist SR, Gollnick HP, Natsuga K, Aoyagi S, Watt FM. 2013. c-MYC-Induced Sebaceous Gland Differentiation Is Controlled by an Androgen Receptor/p53 Axis. Cell Rep 3:427–441. doi:10.1016/j.celrep.2013.01.013

      Estrach S, Ambler CA, Celso CLL, Hozumi K, Watt FM. 2006. Jagged 1 is a β-catenin target gene required for ectopic hair follicle formation in adult epidermis. Development 133:4427–4438. doi:10.1242/dev.02644

      Fiaschetti G, Schroeder C, Castelletti D, Arcaro A, Westermann F, Baumgartner M, Shalaby T, Grotzer MA. 2014. NOTCH ligands JAG1 and JAG2 as critical pro-survival factors in childhood medulloblastoma. Acta Neuropathol Commun 2:39. doi:10.1186/2051-5960-2-39

      Hsu Y-C, Li L, Fuchs E. 2014. Emerging interactions between skin stem cells and their niches. Nat Med 20:847–856. doi:10.1038/nm.3643

      Katoh Masuko, Katoh Masaru. 2006. Notch ligand, JAG1, is evolutionarily conserved target of canonical WNT signaling pathway in progenitor cells. Int J Mol Med. doi:10.3892/ijmm.17.4.681

      Lafkas D, Shelton A, Chiu C, Boenig G de L, Chen Y, Stawicki SS, Siltanen C, Reichelt M, Zhou M, Wu X, Eastham-Anderson J, Moore H, Roose-Girma M, Chinn Y, Hang JQ, Warming S, Egen J, Lee WP, Austin C, Wu Y, Payandeh J, Lowe JB, Siebel CW. 2015. Therapeutic antibodies reveal Notch control of transdifferentiation in the adult lung. Nature 528:127–131. doi:10.1038/nature15715

      Merrill BJ, Gat U, DasGupta R, Fuchs E. 2001. Tcf3 and Lef1 regulate lineage differentiation of multipotent stem cells in skin. Genes Dev 15:1688–1705. doi:10.1101/gad.891401

      Niemann C, Owens DM, Hülsken J, Birchmeier W, Watt FM. 2002. Expression of ΔNLef1 in mouse epidermis results in differentiation of hair follicles into squamous epidermal cysts and formation of skin tumours. Development 129:95–109. doi:10.1242/dev.129.1.95

      Nowell C, Radtke F. 2013. Cutaneous Notch Signaling in Health and Disease. Cold Spring Harb Perspect Med 3:a017772. doi:10.1101/cshperspect.a017772

      Watt FM, Estrach S, Ambler CA. 2008. Epidermal Notch signalling: differentiation, cancer and adhesion. Curr Opin Cell Biol 20:171–179. doi:10.1016/j.ceb.2008.01.010

      Yustein JT, Liu Y-C, Gao P, Jie C, Le A, Vuica-Ross M, Chng WJ, Eberhart CG, Bergsagel PL, Dang CV. 2010. Induction of ectopic Myc target gene JAG2 augments hypoxic growth and tumorigenesis in a human B-cell model. Proc Natl Acad Sci 107:3534–3539. doi:10.1073/pnas.0901230107

    2. eLife assessment

      This work aimed at deconstructing how sebaceous gland differentiation is controlled in adult skin. Using monoclonal antibodies designed to inhibit specific Notch ligands or receptors, the authors present convincing evidence that the Jag2/Notch1 signaling axis is a crucial regulator of sebocyte progenitor proliferation and sebocyte differentiation. The valuable findings presented here contribute to the growing evidence that Notch signaling is not only key during the development of the skin and its appendages but also regulates cell fate in adult homeostatic tissues. From a translational perspective, it is intriguing that the effect of Jag2 or Notch1 inhibition, which leads to the accumulation of proliferative stem/progenitor cells in the sebaceous gland and prevents sebocyte differentiation, is reversible.

    3. Reviewer #1 (Public review):

      Summary:

      In this study, Abidi and colleagues used Notch pathway neutralizing antibodies to inhibit sebaceous glands in the skin. The authors find that blocking either the Notch1 receptor or the Jag2 ligand caused loss of the glands and increased retention of sebaceous progenitor cells. Moreover, these glands began to reappear 14 days after treatment.

      Strengths:

      Overall, this study definitively identifies the Notch receptor/ligand combination that maintains these glands in the adult. The manuscript is clearly written and the figures are beautifully made.

      In this resubmitted manuscript, the authors have adequately addressed all the previous critiques.

    4. Reviewer #2 (Public review):

      Summary:

      In this report, Abidi et al. use an antibody against Jag2, a Notch1 ligand, to inhibit its activity in skin. A single dose of this treatment leads to an impairment of sebocyte differentiation and an accumulation of basal sebocytes. Consistently, Notch1 activity, measured as the cleaved form of the Notch1 intracellular domain, is detected in basal sebocytes together with the expression of Jag2. Interestingly, the phenotype caused by the antibody treatment is reversible.

      Strengths:

      The quality of the histological data with a clear phenotype, together with the quantification, represents a solid base for the authors' claims.

      This work identifies that the ligand Jag2 is the Notch1 ligand required for sebocyte differentiation.

      From a therapeutic point of view, it is interesting that the treatment with the anti-Jag2 is reversible.

      Weaknesses:

      The authors use a single approach to support their claims.

      Future in vitro studies will be needed to understand how Notch signaling induces sebocyte differentiation (i.e. a cell-autonomous mechanism, a mechanism based on cell competition, etc.).

    5. Reviewer #3 (Public review):

      Abidi et al. investigated the role of Notch signalling for sebaceous gland differentiation and sebocyte progenitor proliferation in adult mouse skin. By injecting antagonising antibodies against different Notch receptors and ligands into mice, the authors identified that the Notch1 receptor and, to a lesser extent, Notch2 receptor, as well as the Notch ligand Jagged2, contribute to the regulation of sebaceous gland differentiation. In situ hybridisation confirmed that treatment with anti-Jagged2 dramatically reduced the number of basal sebocytes staining for the transcriptionally active intracellular domain of Notch1. Loss of Notch activity in sebocyte progenitors robustly inhibited sebaceous gland differentiation. Under these conditions, the number of sebocyte progenitors marked by Lrig1 was not affected, while the number of proliferating basal sebocytes was increased. Upon recovery of Notch activity, sebaceous gland differentiation could likewise be recovered. By suggesting that Notch activity in sebocyte progenitors is required to balance proliferation and differentiation, these data bring valuable new and relevant findings for the skin field on the sebaceous gland homeostasis.

    1. eLife assessment

      In this small study involving patients with a history of myocardial infarction, Fawaz et al. found no significant contribution of clonal hematopoiesis and mosaic loss of the Y chromosome to the incidence of myocardial infarction and atherosclerosis. Although the evidence provided by the study is incomplete due to its small sample size, the findings are valuable for guiding future larger studies that will further investigate this significant and controversial subject.

    2. Reviewer #2 (Public review):

      Summary:

      The preprint by Fawaz et al. presents the findings of a study that aimed to assess the relationship between somatic mutations associated with clonal hematopoiesis (CHIP) and the prevalence of myocardial infarction (MI). The authors conducted targeted DNA sequencing analyses on samples from 149 MI patients and 297 non-MI controls from a separate cohort. Additionally, they investigated the impact of the loss of the Y chromosome (LOY), another somatic mutation frequently observed in clonally expanded blood cells. The results of the study primarily demonstrate no significant associations, as neither CHIP nor LOY were found to be correlated with an increased prevalence of MI. The null findings regarding CHIP are partly in conflict with several larger studies in the literature. However, it must be noted that the authors did find trends to an association between CHIP and a higher incidence of MI during follow-up among those without a history of MI at baseline, which is more consistent with previous research work. The association with incident MI reached statistical significance in men, particularly in those not showing LOY, suggesting potential interactions between different clonally-expanded somatic mutations.

      Strengths:

      Overall, this is a useful research work on an emerging risk factor for cardiovascular disease (CVD). The use of a targeted sequencing approach is a strength, as it offers higher sensitivity than the whole exome sequencing approaches used in many previous studies. Reporting null findings is definitely relevant in an emerging field such as the role of somatic mutations in cardiovascular disease.

      Weaknesses:

      The study suffers from important limitations, which cast some doubts onto the authors' conclusions, as detailed below:

      (1) The small sample size of the study population is a critical limitation, particularly when reporting null findings that conflict (partly) with positive findings in much larger studies, totaling hundreds of thousands of individuals (e.g. Zekavat et al, Nature CVR 2023, Vlasschaert et al, Circulation 2023; Zhao et al, JAMA Cardio 2024). The authors claim that they have 90% power to detect an effect size of CHIP on MI comparable to that in previous reports (a hazard ratio of 1.7, mainly based on the findings by Jaiswal et al, NEJM 2014,2017). However, this analysis is simply based on the predicted prevalence of CHIP in MI(+) and MI(-) patients, and it does not consider the complex relationship between age, CHIP, and atherosclerotic disease. More advanced approaches to calculate statistical power may have provided a more accurate estimation. It must also be noted that recent work in much larger populations suggests that the overall effect of CHIP on atherosclerotic CVD is smaller than 1.7, most likely due to the heterogeneity of effects of different mutated genes (e.g. Zekavat et al, Nature CVR 2023, Vlasschaert et al, Circulation 2023; Zhao et al, JAMA Cardio 2024). In addition, several analyses in the current manuscript are conducted separately in MI(+) (n=149) and MI(-) (n=297) individuals, further limiting statistical power. Power is even lower in the investigation of the effects of LOY and its interaction with CHIP, as only men are included in these analyses. Overall, I believe the study is underpowered from a statistical point of view, so the authors' findings need to be interpreted with caution.

      (2) Related to the above, it is widely accepted that the effects of CHIP on CVD are highly heterogeneous, as some mutated genes appear to have a strong impact on atherosclerosis, whereas the effect of others is negligible (e.g. Zekavat et al, Nature CVR 2023, Vlasschaert et al, Circulation 2023, among others). TET2 mutations are frequently considered a "positive control", given the multiple lines of evidence suggesting that these mutations confer a higher risk of atherosclerotic disease. However, no association with MI or related variables was found for TET2 mutations in the current work, which likely reflects the limited statistical power of the study to assess accurately the effects of CHIP mutations on atherosclerotic disease.

      (3) One of the most essential features of CHIP is the tight correlation with age. In this study, the effect of age on CHIP (e.g. Supp. Tables S5, S6) is statistically significant, but substantially milder than in previous studies. Given the relatively modest effect size of age on CHIP here, it is not surprising that no association with MI or atherosclerotic disease was found, considering that this association would have a much smaller effect size. It must be considered, however, that the advanced age of the population may have confounded the analysis of these relationships, as acknowledged by the authors.

      (4) CHIP represents just one type of clonal hematopoiesis (e.g. see https://doi.org/10.1182/blood.2023022222). In this context, it must be noted that the mutated genes included in the definition of "CHIP" here are markedly different than in most previous studies, particularly when considering specifically the studies that demonstrated an association between CHIP and atherosclerotic CVD. For instance, the definition of CHIP in this manuscript includes genes such as ANKRD26, CALR, CCND2, DDX41... that are not prototypical CHIP genes. This is unlikely to have major impact on the main results, as the vast majority of mutations detected are indeed in bona fide CHIP genes, but it needs to be considered when interpreting the authors' findings. Furthermore, the strategy used here for CHIP variant calling and curation is substantially different than that used in previous studies. This is important, because such differences in the definition of CHIP and the curation of variants are at the basis of most conflicting findings in the literature regarding the effects of this condition. The authors estimate that the effect of these discrepancies on the definition of CHIP is limited, but small differences can have substantial impact in a study with limited sample size.

      (5) A major limitation of the current study is the cross-sectional design of most of the analyses. For instance, it is not surprising that no association is found between CHIP and prevalent atherosclerosis burden by ultrasound imaging, considering that many individuals may have developed atherosclerosis years or decades before the expansion of the mutant clones, limiting the possible effect of CHIP on atherosclerosis burden. Similarly, the analysis of the relationship between CHIP and a history of MI may be confounded by the potential effects of MI on the expansion of mutant clones. In this context, it is noteworthy that the only positive results here are found in the analysis of the relationship between CHIP at baseline and incident MI development over follow-up. A larger sample size in these longitudinal analyses would provide deeper insights into the relationship between CHIP and MI.
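
      As a rough illustration of the prevalence-based power reasoning discussed in point (1), the following is a minimal sketch of a normal-approximation power calculation for comparing CHIP prevalence between two groups. Only the group sizes (149 and 297) come from the review above; the prevalence values and the function name are hypothetical, chosen purely for illustration, and this is not the calculation the authors performed. It also ignores age and gene-level heterogeneity, which is precisely the simplification the reviewer questions.

```python
from statistics import NormalDist


def two_proportion_power(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    p1, p2: assumed true prevalences in each group (hypothetical inputs).
    n1, n2: group sizes.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standard error under the null hypothesis (pooled prevalence)
    p_bar = (p1 * n1 + p2 * n2) / (n1 + n2)
    se_null = (p_bar * (1 - p_bar) * (1 / n1 + 1 / n2)) ** 0.5
    # Standard error under the alternative (separate prevalences)
    se_alt = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    delta = abs(p1 - p2)
    # Probability that the test statistic falls in either rejection region
    upper = z.cdf((delta - z_crit * se_null) / se_alt)
    lower = z.cdf((-delta - z_crit * se_null) / se_alt)
    return upper + lower


# Group sizes from the review; prevalences are illustrative assumptions only.
power = two_proportion_power(p1=0.30, p2=0.20, n1=149, n2=297)
print(f"approximate power: {power:.2f}")
```

      Rerunning the sketch with smaller prevalence differences, or with the smaller subgroups used in the LOY analyses, shows how quickly power drops under this simple model.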

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      This manuscript examines the individual and dual effects of CHIP and LOY in MI employing a cohort of ~460 individuals. CHIP is assessed by NGS and LOY is assessed by PCR. The threshold for CHIP is set at 2% (an arbitrary cutoff that is often used) and LOY at 9% (according to the Discussion text - this reviewer may have missed the section that describes why this threshold was employed). The investigation assessed whether LOY could modulate inflammation, atherosclerotic burden, or MI risk associated with CHIP. Neither CHIP nor LOY independently affected hsCRP, atherosclerotic burden, or MI incidence, nor did LOY presence diminish these outcomes in CHIP+ male subjects.

      This study represents the first dual analysis of CHIP and LOY on CVD outcomes. The results are largely negative, contradictory to other studies (many with much larger sample sizes). I would attribute the limitation of sample size as a major contributor to the negative data. While the negative data are suspect, the "positive" finding that LOY abolishes the prognostic significance of CHIP on MI is of interest (and consistent with what is understood from mechanistic studies).

      Overall, I enjoyed reading the paper, and it is of interest to the research community.

      However, I disagree with some of the authors' interpretations of the data.

      Generally, many conclusions on CHIP interpretation are based on the comparison of findings from very large datasets that have been evaluated by shallow NGS DNA sequencing. These studies lack sensitivity and accuracy, but this is counterbalanced by their very large sample sizes. Thus, they draw conclusions from the sickest individuals (ICD codes) with the largest clones (explaining the 10% VAF threshold). Here, the study has a well-phenotyped cohort, but as far as this reviewer can tell, the DNA sequencing is "shallow" NGS. Typically, to assess smaller datasets, investigators employ an error-correction method (DNA barcodes, duplex sequencing, etc.) for the sensitivity and accuracy of calling variants. Thus, the current study appears to suffer from this limitation (small sample sizes combined with NGS).

      We thank the reviewer for his/her positive and open comment. We acknowledge that we did not use an error-corrected sequencing method for our study. However, we do not fully agree with the statement that our NGS sequencing technique is “shallow”.

      Considering our entire sequencing panel, we achieve a sequencing depth ≥100X and ≥300X for 100% [99%;100%] and 99% [99%;100%] of the targeted regions respectively. This corresponds to a median depth of 2111X [1578;2574] for all regions sequenced. When considering “CHIP genes”, the median depth is 2694X [1875;3785] for patients from the CHAth study and 3455X [2266;4885] for patients from the 3C study. More specifically, for the DNMT3A and TET2 genes, the median sequencing depths are 2531X [1818;3313] and 3710X [2444;4901] for patients from the CHAth and 3C studies respectively. These values are much higher than the 300X recommended by the French National Institute of Cancer for capture-based NGS. Coupling this high sequencing depth with our bioinformatic pipeline that uses 3 different variant callers, manual curation of all variants by trained hematobiologists, and a bioinformatic tool to estimate the background noise allows us to detect somatic mutations with a VAF of 1% with high accuracy. Of note, our accuracy in detecting mutations in leukemia-associated genes is tested twice a year as part of the quality control program organized by the French Group of Molecular Biologists in Hematology (GBMHM). We added the information about the depth of sequencing in the Supplementary Methods section.
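      To illustrate why depth matters for calling a 1% VAF variant, here is a minimal sketch that treats the number of variant reads as a binomial draw at a given depth. It is not the authors' pipeline: the minimum of 10 supporting reads is a hypothetical threshold chosen only for illustration, and the model ignores sequencing error, strand bias and the background-noise estimation mentioned above.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability mass, computed in log space to avoid overflow at high depth."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def prob_at_least(min_alt_reads: int, depth: int, vaf: float) -> float:
    """Probability of observing at least `min_alt_reads` variant reads
    when the true variant allele fraction is `vaf` and coverage is `depth`."""
    lower_tail = sum(binom_pmf(k, depth, vaf) for k in range(min_alt_reads))
    return 1.0 - lower_tail


# Depths quoted in the response (300X recommendation and the reported medians).
# The 10-read minimum is an assumption made for this sketch only.
for depth in (300, 2111, 2694, 3455):
    print(depth, round(prob_at_least(10, depth, 0.01), 4))
```

      Under this simplified model, the chance of collecting enough supporting reads for a 1% clone rises steeply between 300X and the median depths reported here. Whether those reads can be told apart from noise is a separate question, and it is the one that error-corrected methods address.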

      While the "negative" data from this study are inconclusive, the positive data (i.e. CHIP being prognostic for MI in the absence but not presence of LOY) is of interest. Thus, the investigators may want to consider a shorter report that largely focuses on this finding.

      We thank the reviewer for his/her interest in this result. We also agree that it would be interesting to focus specifically on demonstrating the impact of mLOY in countering the cardiovascular risk associated with CHIP. We performed additional analyses to demonstrate that this effect was independent of age and cardiovascular risk factors and included this information in the Results section.

      However, we believe that it is also of interest to show negative results that, although probably due to limited sample size, suggest that the cardiovascular risk associated with CHIP is not as strong and clinically pertinent as initially suggested. Of note, if CHIP really increased the risk of myocardial infarction significantly, it would be detected more frequently in subjects who suffered from a MI than in those who did not, which was not observed in our cohort. Moreover, we were able to determine that if CHIP increases the risk of MI, it does so to a much lesser extent (HR = 1.03) than other established cardiovascular risk factors such as hypercholesterolemia or tobacco use (HR = 1.47 and HR = 1.86 respectively in our cohort), which questions the pertinence of considering CHIP in the management of patients with atherothrombosis. These data have been added in the Results and Discussion sections.

      We also believe that our study has the merit of directly assessing the impact of CHIP on atheroma burden, which has been done in only a limited number of studies in the context of coronary artery disease. This would not be possible if we analyzed only the male subjects in our cohort, because it would further decrease the statistical power of our analyses.

      Reviewer #2 (Public Review):

      Summary: 

      The preprint by Fawaz et al. presents the findings of a study that aimed to assess the relationship between somatic mutations associated with clonal hematopoiesis (CHIP) and the prevalence of myocardial infarction (MI). The authors conducted targeted DNA sequencing analyses on samples from 149 MI patients and 297 non-MI controls from a separate cohort. Additionally, they investigated the impact of the loss of the Y chromosome (LOY), another somatic mutation frequently observed in clonally expanded blood cells. The results of the study primarily demonstrate no significant associations, as neither CHIP nor LOY were found to be correlated with an increased prevalence of MI. Of note, the null findings regarding CHIP are in conflict with several larger studies in the literature.

      Strengths:

      Overall, this is a useful research work on an emerging risk factor for cardiovascular disease (CVD). The use of a targeted sequencing approach is a strength, as it offers higher sensitivity than the whole exome sequencing approaches used in many previous studies.

      Weaknesses:

      Reporting null findings is definitely relevant in an emerging field such as the role of somatic mutations in cardiovascular disease. Nevertheless, the study suffers from severe limitations, which casts doubts on the authors' conclusions, as detailed below:

      (1) The small sample size of the study population is a critical limitation, particularly when reporting null findings that conflict (partly) with positive findings in much larger studies, totaling hundreds of thousands of individuals (e.g. Zekavat et al, Nature CVR 2023, Vlasschaert et al, Circulation 2023; Zhao et al, JAMA Cardio 2024). The authors claim that they have 90% power to detect an effect size of CHIP on MI comparable to that in a previous report (Jaiswal et al, NEJM 2017). However, the methodology used to estimate statistical power is not described.

      We thank the reviewer for his/her pertinent and constructive comments. We totally agree that our study presents a substantially smaller sample size as compared to the studies of Zekavat et al, Vlasschaert et al or Zhao et al.

      The CHAth study was designed as a prospective study (which is not frequent in CHIP reports) to demonstrate that, if CHIP increases the risk of MI, it would be detected more frequently in patients who suffered from a MI than in those who did not. To achieve this, we defined eligibility criteria to obtain a rather high prevalence of CHIP and optimize the statistical power of a study based on a limited number of patients. We thus enrolled patients who suffered from a first MI after the age of 75 years. These patients were compared with subjects from the Three-City study who were 65 years or older at inclusion and did not present any cardiovascular event before inclusion.

      To determine the number of patients necessary to achieve our objective, we considered a CHIP prevalence of 20% in the general population after the age of 75 years, as estimated when we set up our study (Genovese et al, NEJM 2014, Jaiswal et al, NEJM 2014, Jaiswal et al, NEJM 2017). At that time, the relative risk of MI associated with CHIP was reported to be 1.7, leading to an expected prevalence of CHIP of 37% in subjects who presented a MI. Based on these hypotheses, the recruitment of 112 patients in the CHAth study would have been sufficient to detect a significantly higher prevalence of CHIP in MI(+) patients compared to MI(-) subjects with a power of 0.90 at a type I error rate of 5%. These calculations were performed by the Research Methodology Support Unit of the University Hospital of Bordeaux. These data were added in the Supplementary Methods section to present the design and objectives of the CHAth study more clearly.
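      The figures above can be checked with a textbook normal approximation for comparing two proportions. This is only a sketch and not necessarily the method used by the Research Methodology Support Unit. The response does not state the allocation ratio, so the value of 2 controls per case is an assumption, suggested only by the final cohort of 149 patients and 297 controls.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_controls: float, p_cases: float, n_cases: int,
                          controls_per_case: float = 1.0,
                          alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing two proportions
    (unpooled normal approximation; the tiny opposite-tail term is ignored)."""
    n_controls = n_cases * controls_per_case
    se = sqrt(p_cases * (1 - p_cases) / n_cases
              + p_controls * (1 - p_controls) / n_controls)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(p_cases - p_controls) / se
    return NormalDist().cdf(z_effect - z_crit)


# 20% prevalence in controls and 37% in MI patients, 112 cases (from the response).
# controls_per_case=2.0 is an assumption, not a stated design parameter.
print(power_two_proportions(0.20, 0.37, n_cases=112, controls_per_case=2.0))
print(power_two_proportions(0.20, 0.37, n_cases=112, controls_per_case=1.0))
```

      With two controls per case this approximation gives a power close to 0.9, in line with the stated design. With equal group sizes the power is lower, which shows how much the result depends on the assumed allocation.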

      Finally, we recruited 149 patients in the CHAth study and compared them to 297 control subjects. Although we recruited more patients than initially needed, we observed a similar prevalence of CHIP in our 2 cohorts, suggesting that the cardiovascular risk associated with CHIP is lower than the 1.7-fold increased risk claimed in most publications related to CHIP in the cardiovascular field. We should note that our study was not designed to demonstrate the impact of CHIP on the occurrence of MI during follow-up, which could explain our negative results due to a limited number of patients, as stated by the reviewers. This statement has been added in the Supplementary Methods section. However, performing this analysis allowed us to confirm that the risk of MI associated with CHIP was lower than 1.7 and lower than that associated with hypercholesterolemia or smoking.

      We would also like to note that the eligibility criteria for both the CHAth and the Three-City studies may have led to a selection bias, possibly contributing to the contradiction between our results and those of other studies. As stated before, in the CHAth study, only patients who experienced a first MI after the age of 75 were enrolled. In the Three-City study, all subjects were 65 years or older at inclusion. In contrast, most of the cohorts showing an association between CHIP and cardiovascular events were composed of younger subjects:

      -          Bioimage : median age 70 years (55-80 years)

      -          MDC : median age 60 years

      -          ATVB : subjects with a MI before 45 years

      -          PROMIS : subjects between 30 and 80 years

      -          UK Biobank : between 40 and 70 years at inclusion, median age of 58 years in the study of Vlasschaert et al.

      -          Zhao et al : median age of 53.83 years (45.35-62.39 years).

      This last information was added in the Discussion section (lines 452-454).

      Furthermore, the work by Jaiswal et al (NEJM 2017) showed a hazard ratio of approx. 2.0, but more recent work in much larger populations suggests that the overall effect of CHIP on atherosclerotic CVD is smaller, most likely due to the heterogeneity of effects of different mutated genes (e.g. Zekavat et al, Nature CVR 2023, Vlasschaert et al, Circulation 2023; Zhao et al, JAMA Cardio 2024).

      We thank the reviewer for insisting on the fact that the initial HR of 2.0 observed by Jaiswal et al was shown to be smaller in more recent studies. This corresponds to what we wrote in the introduction (lines 103-109) and discussion (lines 365-370, 465-471).

      In addition, several analyses in the current manuscript are conducted separately in MI(+) (n= 149) and MI(-) (N=297) individuals, further limiting statistical power. Power is still lower in the investigation of the effects of LOY and its interaction with CHIP, as only men are included in these analyses. Overall, I believe the study is severely underpowered, which calls into question the validity of the reported null findings.

      We agree with the reviewer that the statistical power of our study is lower than that of other studies, in particular those based on several hundred thousand patients. Whenever possible, we analyzed our data by combining MI(+) and MI(-) subjects. However, for some aspects such as atherosclerosis, we did not have the same parameters available for these 2 groups and had to analyze them separately, leading to a more limited statistical power. We also have to acknowledge that our study was not designed to demonstrate an effect of CHIP on incident MI (as stated before), limiting our statistical power to demonstrate an effect of CHIP +/- mLOY on the incident risk of coronary artery disease.

      However, when designing our prospective study (CHAth study), we aimed to address the limitations of a small cohort and obtain rapid, significant results regarding the impact of CHIP. We hypothesized that if CHIP really increases the risk of myocardial infarction (MI), it would be detected more frequently in patients who have experienced a MI compared to those who have not. This study design would demonstrate the importance of CHIP in MI pathophysiology without requiring thousands of patients. However, we did not observe such an association, which questions the relevance of detecting CHIP for the management of patients in the field of Cardiology. This was confirmed by the fact that in our cohort, the cardiovascular risk associated with CHIP appears to be low (HR = 1.03 [0.657;1.625] after adjustment for sex, age and cardiovascular risk factors) compared to hypercholesterolemia (HR = 1.474 [0.758;2.866]) or smoking (HR = 1.865 [0.943;3.690]). These data have been added in the Results and Discussion sections.

      In addition, we would like to mention that despite the limited number of subjects studied, we do not have only negative results. When studying only male subjects, we were able to show that CHIP accelerates the occurrence of MI, particularly in the absence of mLOY (Figure 2D). This effect was independent of age and cardiovascular risk factors (diabetes, cholesterol and high blood pressure). We added this last information in the Results section of the manuscript, although we acknowledge that this has to be confirmed in future work.

      (2) Related to the above, it is widely accepted that the effects of CHIP on CVD are highly heterogeneous, as some mutated genes appear to have a strong impact on atherosclerosis, whereas the effect of others is negligible (e.g. Zekavat et al, Nature CVR 2023, Vlasschaert et al, Circulation 2023, among others). TET2 mutations are frequently considered a "positive control", given the multiple lines of evidence suggesting that these mutations confer a higher risk of atherosclerotic disease.

      However, no association with MI or related variables was found for TET2 mutations in the current work. Reporting the statistical power specifically for assessing the effect of TET2 mutations would enhance the interpretation of these results.

      We thank the reviewer for this pertinent remark. It has indeed been shown that depending on the somatic mutation, the impact of CHIP on inflammation, atherosclerosis and cardiovascular risk is different. The studies cited by the reviewer suggest that DNMT3A mutations have a low impact on atherosclerosis/atherothrombosis while other “non-DNMT3A” mutations, including TET2 mutations, have a greater impact. In particular, Zekavat et al suggested that TP53, PPM1D, ASXL1 and spliceosome mutations have a similar impact on atherosclerosis/atherothrombosis to TET2.

      To answer the reviewer: in our cohort, we did not find a clear association between the detection of TET2 mutations with a VAF≥2% and:

      -          A history of MI at inclusion (p=0.5339)

      -          Inflammation (p=0.440)

      -          Atherosclerosis burden :

      -   In the CHAth study:

      -  p=0.031 for stenosis≥50%

      -  p=0.442 for multitruncular lesions

      -  p=0.241 for atheroma volume

      -   In the 3C study:

      -  p=0.792 for the presence of atheroma

      -  p=0.3966 for the number of plaques

      -  p=0.876 for intima-media thickness

      -          Incidence of MI (p=0.5993)

      Similarly we did not find any association between the detection of TET2 mutations with a VAF≥1% and:

      -          A history of MI at inclusion (p=0.5339)

      -          Inflammation (p=0.802)

      -          Atherosclerosis burden :

      -   In the CHAth study :

      -  p=0.104 for stenosis≥50%

      -  p=0.617 for multitruncular lesions

      -  p=0.391 for atheroma volume

      -   In the 3C study:

      -  p=0.3291 for the presence of atheroma

      -  p=0.2060 for the number of plaques

      -  p=0.2300 for intima-media thickness

      -          Incidence of MI (p=0.195)

      However, analyzing the specific effect of TET2 mutations reduces the cohort of CHIP(+) subjects to 61 individuals. In these conditions, considering a prevalence of “TET2-CHIP” of 13.5% (in our cohort) and a hazard ratio of 1.3 (Vlasschaert et al), the statistical power to show an increased risk of MI is only 16%.
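      The response does not say how the 16% figure was obtained. As a rough way to see how a low carrier prevalence and a modest hazard ratio limit power, the sketch below uses Schoenfeld's approximation for a two-group comparison in a time-to-event analysis. The choice of this formula and every event count passed to it are assumptions made for illustration; the number of incident MIs in the cohort is not given in this passage.

```python
from math import ceil, log, sqrt
from statistics import NormalDist


def schoenfeld_power(hazard_ratio: float, carrier_fraction: float,
                     n_events: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided log-rank/Cox test (Schoenfeld),
    given the total number of observed events and the fraction of carriers."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    signal = sqrt(n_events * carrier_fraction * (1 - carrier_fraction)) \
        * abs(log(hazard_ratio))
    return NormalDist().cdf(signal - z_crit)


def events_needed(hazard_ratio: float, carrier_fraction: float,
                  power: float = 0.80, alpha: float = 0.05) -> int:
    """Total number of events required to reach the requested power."""
    z_sum = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    denom = carrier_fraction * (1 - carrier_fraction) * log(hazard_ratio) ** 2
    return ceil(z_sum ** 2 / denom)


# HR = 1.3 and a 13.5% prevalence of TET2-CHIP come from the response;
# the event counts below are hypothetical inputs.
for hypothetical_events in (50, 100, 200):
    print(hypothetical_events,
          round(schoenfeld_power(1.3, 0.135, hypothetical_events), 3))
print(events_needed(1.3, 0.135))
```

      Under this approximation, a hazard ratio of 1.3 with 13.5% carriers would require on the order of a thousand events to reach 80% power. That is far more than a cohort of a few hundred subjects can provide.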

      (3) One of the most essential features of CHIP is the tight correlation with age. In this study, the effect of age on CHIP (Supplementary Tables S5, S6) seems substantially milder than in previous studies. Given the relatively weak association with age here, it is not surprising that no association with MI or atherosclerotic disease was found, considering that this association would have a much smaller effect size.

      We thank the reviewer for highlighting this point. Although the difference in median age between subjects with and without CHIP is not very large in our cohort, we did observe a significant association of CHIP with age:

      -          The differences in age were statistically significant both in the CHAth and 3C study (Supplementary Tables S5 and S6)

      -          We observed a significant association between age and CHIP prevalence (p<0.001 for the total cohort, p=0.0197 for the CHAth study, and p=0.0394 for the 3C cohort after adjustment for sex). This association was already shown in Figure 1. We added the significant association between age and CHIP prevalence in the Results section (line 279).

      As stated before, we have to remind the reviewer that we enrolled only subjects of ≥75 years and ≥65 years in the CHAth and 3C studies respectively. This led to a median age in our cohort that was substantially higher than in other cohorts (in particular the UK Biobank and the different cohorts studied by Jaiswal et al). This could have contributed to an apparent milder effect of age on CHIP, even if this association was still observed.

      In addition, there are previous reports of sex-related differences in the prevalence of CHIP, is there an association between CHIP and age after adjusting for sex? 

      The reviewer correctly pointed out that sex has been associated with various aspects of CHIP. While Zekavat et al reported that CHIP carriers were more frequently males, Kar et al (Nature Genetics 2022), and Kamphuis et al (Hemasphere 2023) did not observe a difference in the prevalence of CHIP between males and females, but rather a difference in the mutational spectrum. Males more frequently presented SRSF2, ASXL1, SF3B1, U2AF1, JAK2, TP53 and PPM1D mutations, while females more frequently had DNMT3A, CBL and GNB1 mutations.

      In our study, the association between CHIP prevalence and age was indeed significant even after adjustment for sex (p<0.001 for the total cohort, p=0.0197 for the CHAth study and p=0.0394 for the 3C study).

      (4) The mutated genes included in the definition of "CHIP" here are markedly different than those in most previous studies, particularly when considering specifically the studies that demonstrated an association between CHIP and atherosclerotic CVD. For instance, the definition of CHIP in this manuscript includes genes such as ANKRD26, CALR, CCND2, and DDX41... that are not prototypical CHIP genes. This is unlikely to have a major impact on the main results, as the vast majority of mutations detected are indeed in bona fide CHIP genes, but it should be at least acknowledged.

      We agree with the reviewer that our gene panel includes genes that are not considered prototypical CHIP genes. This acknowledgment has been added in the Supplementary Methods section. To perform this study, we did not design a specific targeted sequencing panel. We used the one that is used for the diagnosis of myeloid malignancies at the University Hospital of Bordeaux. ANKRD26 and DDX41 are genes that, when mutated, predispose to the development of hematological malignancies. CALR mutations are frequently detected in Myeloproliferative Neoplasms while CCND2 mutation can be detected in acute myeloid leukemia among other diseases. As usually performed in our routine practice, we analyzed all the genes in the panel. However, as stated by the reviewer, most of the mutations we detected involved bona fide CHIP genes.

      Furthermore, the strategy used here for the CHIP variant calling and curation seems substantially different than that used in previous studies, which precludes a direct comparison. This is important because such differences in the definition of CHIP and the curation of variants are the basis of most conflicting findings in the literature regarding the effects of this condition. Ideally, the authors should conduct sensitivity analyses restricted to prototypical CHIP genes, using the criteria that have been previously established in the field (e.g. Vlasschaert et al, Blood 2023).

      We agree with the reviewer that our strategy for CHIP variant calling and curation was substantially different from what has been used in other studies. We decided to apply the criteria we used in previous studies for the analysis of somatic mutations in myeloid malignancies. Because CHIP is defined by the detection of “somatic mutations in leukemia driver genes”, this appeared to follow the definition of CHIP.

      We also acknowledge that this discrepancy with the criteria defined by Vlasschaert et al could contribute to our findings that differ from those of other studies. We thus checked whether the variants detected were in accordance or not with the criteria defined by Vlasschaert et al. Pooling the 2 cohorts, we detected 439 variants, 381 of which were in accordance with the criteria established by Vlasschaert et al, representing a concordance rate of 86.8%. Moreover, the variants “wrongly” retained according to these criteria changed the conclusion on the detection of CHIP in only 15 patients (the remaining variants either co-occurred with a mutation in a bona fide CHIP gene or had a VAF below 2%). Thus, CHIP variant calling and curation had only a limited impact on our results. This has been added in the discussion (lines 455-459).

      However, we would like to discuss the criteria that have been defined by Vlasschaert et al, which are probably too restrictive. For some genes, such as ZRSR2, in addition to frameshift and nonsense mutations that are expected to be associated with a loss of function, only some single nucleotide variations were retained (probably those detected by this group). In our patient 20785, we detected a c.524A>G, p.(Tyr175Cys) mutation that was not reported in the list published by Vlasschaert et al. However, this variant presents a VAF presumptive of a somatic origin (3%), affects the Zn finger domain of the protein and is observed in a male subject. Thus, it meets several criteria for considering it as associated with a loss of function. Similarly, the CBL variant c.1139T>C, p.(Leu380Pro) observed in our patient 21536, although not affecting the residues 381-421 of the protein (the criteria defined by Vlasschaert et al), has been reported in 29 cases of hematological malignancies. It is thus likely to have a significant impact on the behavior of hematopoietic cells. Moreover, in the same patient, a TET2 c.4534G>A, p.(Ala1512Thr) variant was detected. Although not directly affecting the CD1 domain, it has been reported in a case of AML with a VAF suggestive of a somatic origin (Papaemmanuil et al, NEJM 2016). The SH2B3 gene is not considered by Vlasschaert et al as a bona fide CHIP gene, contrary to other genes involved in cell signaling such as JAK2, GNAS, GNB1, CBL. However, inactivating mutations in SH2B3 can be detected in myeloid malignancies and were recently shown to drive the phenotype in some patients with a MPN (Zhang et al, American Journal of Hematology 2024). We could thus expect that this also happens in our patients 22591 and 21998 who harbor mutations of SH2B3 (a SNV in the PH domain and a frameshift mutation respectively).

      Regarding BCOR, STAG2, SMC3 and RAD21 genes, although frameshift mutations are the most prevalent, there are several reports on the existence of SNV in the context of hematological malignancies (COSMIC, Blood (2021) 138 (24): 2455–2468, Blood Cancer Journal (2023)13:18 ; https://doi.org/10.1038/s41408-023-00790-1).

      We can also add that although Vlasschaert et al did not consider CSF3R and CALR as CHIP genes, Kessler et al did. Because CHIP is an emerging field, the concepts that define it are expected to evolve, as demonstrated by the recent study from Jyoti Nangalia’s group (Bernstein et al, Nature Genetics 2024), which showed that 17 additional genes (including SH2B3) should be considered as drivers of clonal hematopoiesis.

      (5) An important limitation of the current study is the cross-sectional design of most of the analyses. For instance, it is not surprising that no association is found between CHIP and prevalent atherosclerosis burden by ultrasound imaging, considering that many individuals may have developed atherosclerosis years or decades before the expansion of the mutant clones, limiting the possible effect of CHIP on atherosclerosis burden. Similarly, the analysis of the relationship between CHIP and a history of MI may be confounded by the potential effects of MI on the expansion of mutant clones. In this context, it is noteworthy that the only positive results here are found in the analysis of the relationship between CHIP at baseline and incident MI development over follow-up. Increasing the sample size for these longitudinal analyses would provide deeper insights into the relationship between CHIP and MI. 

      We agree with the reviewer that increasing the sample size for longitudinal analyses would provide deeper insights into the relationship between CHIP and MI. Unfortunately, for the moment, we do not have access to additional samples of the 3C study and are not able to perform these additional analyses.

      (6) The description of some analyses lacks detail, but it seems that statistical analyses were exclusively adjusted for age or age and sex. The lack of adjustment for conventional cardiovascular risk factors in statistical analyses may confound results, particularly given the marked differences in several variables observed between groups.

      The reviewer is right when saying that we adjusted our analyses on age and/or sex. This was done because as stated before, our results did not show a lot of significant differences. However, we reanalyzed our data, adjusting further the tests for conventional cardiovascular risk factors, and observed similar results. These data have been added in the results section (lines 286-287, 303, 319, 331-332, 341).

      (7) The variant allele fraction (VAF) threshold for identifying clinically relevant clonal hematopoiesis is still a subject of debate. The authors state that subjects without any detectable mutation or with mutations with a VAF below 2% were considered non-CHIP carriers. While this approach is frequent in the field, it likely misses many impactful mutations with lower VAFs. Such false negatives could contribute to the null findings reported here. Ideally, the authors should determine the lower detection limit of their sequencing approach (either computationally or through serial dilution experiments) and identify the threshold of VAF that can be detected reliably with their sequencing assay. The association between CHIP and MI should then be evaluated considering all mutations above this VAF threshold, in addition to sensitivity analyses with other thresholds frequent in the literature, such as 1% VAF, 2% VAF, and 10% VAF.

      We agree with the reviewer that the VAF threshold for identifying clinically relevant CH is still debated. As stated in the manuscript and by the reviewer, we used the conventional threshold of 2%. Considering that different studies have shown that the cardiovascular risk is increased to a greater extent for CHIP with a high VAF (Jaiswal et al, NEJM 2017, Kessler et al, Nature 2022, Vlasschaert et al, Circulation 2023), it is not certain that considering variants with a very low VAF (below 2%) would help us find an impact of CHIP on inflammation, atherosclerosis or atherothrombotic risk.

      However, as mentioned by the reviewer, variants with a low VAF could have a clinical impact, as recently reported by Zhao et al. In France, the use of biological analyses for medical purposes requires demonstrating that all their aspects, including their performance, are mastered. In that context, we determined that our NGS strategy allowed us to reliably detect mutations with a VAF down to 1% (data not shown). As stated in the discussion, we also analyzed our results considering variants with a VAF of 1% and found similar results (lines 394-395). The sensitivity analyses were already mentioned in the manuscript, as we also searched for an effect of CHIP with a high VAF (≥5%) and found no effect either. We did not have a sufficient number of subjects carrying variants with a VAF≥10% to perform an analysis with this threshold.

      (8) The authors should justify the use of 3D vascular ultrasound imaging exclusively in the supra-aortic trunk. I am not familiar with this technique, but it seems to be most typically used to evaluate atherosclerosis burden in superficial vascular beds such as carotids or femorals. I am concerned about the potential impact of tissue depth on the accurate quantification of atherosclerosis burden in the current study (e.g. https://doi.org/10.1016/j.atherosclerosis.2016.03.002). It is unclear whether the carotids or femorals were imaged in the study population. 

      We apologize for the lack of precision in the Methods section. As stated by the reviewer, we evaluated the atherosclerosis burden in superficial vascular beds. We measured atheroma volume at the site of the common carotid (as described by B Lopez-Melgar, in Atherosclerosis, 2016). We did not analyze femoral arteries in this study. The sentence is now corrected in the Methods (lines 176-179).

      (9) The specific criteria used to define LOY need to be justified. LOY is stated to be defined based on a "A cut off of 9% of cells with mLOY defined the detection of a mLOY based on the study of 30 men of less than 40 years who had a normal karyotype as assessed by conventional cytogenetic study." As acknowledged by the authors, this definition of LOY is substantially different than that used in recent studies employing the same technique to detect LOY (Mas-Peiro et al, EHJ 2023). In addition, it seems essential to provide more detailed information on the ddPCR assay used to determine LOY, including the operating range and, more importantly, the lower limit of detection (%LOY) of the assay. A dilution series of a control DNA with no LOY would be helpful in this context. 

      We apologize if the definition of the threshold for detecting mLOY was unclear. To test the performance of our ddPCR technique, we first determined the background noise by testing DNA obtained from total leukocytes in 30 men of ≤40 years who presented a normal karyotype as assessed by conventional cytogenetic techniques. In this control population, presumed not to carry mLOY, we detected a proportion of cells with mLOY of 2.34 ± 1.98% (see Author response image 1, panel A). We thus considered a value above 9% as being different from background noise (mean + 3 times the standard deviation).
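      The "mean + 3 standard deviations" rule can be written out as a short sketch. The summary values are those quoted in the response. The individual control measurements are not available here, so `noise_threshold` is shown only as the form the calculation would take with raw data, and its function name and the example list are hypothetical.

```python
from statistics import mean, stdev


def threshold_from_summary(mean_pct: float, sd_pct: float,
                           n_sd: float = 3.0) -> float:
    """Background-noise threshold from summary statistics (mean + n_sd * SD)."""
    return mean_pct + n_sd * sd_pct


def noise_threshold(control_values: list[float], n_sd: float = 3.0) -> float:
    """Same rule applied to raw control measurements (sample standard deviation)."""
    return mean(control_values) + n_sd * stdev(control_values)


# Summary statistics quoted in the response: 2.34 +/- 1.98 (% of cells with mLOY).
print(threshold_from_summary(2.34, 1.98))

# Hypothetical raw values, included only to show the call on a list of measurements.
print(noise_threshold([1.0, 2.0, 3.0]))
```

      Applied to the quoted summary statistics, the rule gives a value just above 8%, slightly below the 9% cut-off retained by the authors. The response does not explain how the final figure was rounded or adjusted.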

      We then compared the proportion of cells with mLOY measured by ddPCR and conventional karyotype and observed a rather good correlation between the 2 techniques (R²=0.6430, p=0.0053, see Author response image 1, panel B). Finally, we tested the reliability of our ddPCR assay in detecting different levels of mLOY using a dilution series of control DNA (from an equivalent of 2% of cells with mLOY to 98% of cells with mLOY). We observed a very strong correlation between the theoretical and measured proportions of cells with mLOY (R²=0.9989, p<0.001, see Author response image 1, panel C). Of note, the proportions of mLOY measured for values ≤10% were concordant with theoretical values. However, considering the background noise determined with control DNA, we were unable to confirm that this “signal” was different from the background noise. Therefore, we set a threshold of 9% to define the detection of mLOY by ddPCR. It is also noteworthy that the 10% cell population with mLOY was consistently detected by the ddPCR technique. This has been added in the Methods section (lines 228-235).

      Author response image 1.

      (10) Our understanding of the relationship between CHIP and CVD is evolving fast, and the manuscript should be considered in the context of recent literature in the field. For instance, the recent work by Zhao et al (JAMA Cardio 2024, doi:10.1001/jamacardio.2023.5095) should be considered, as it used a similar targeted DNA sequencing approach as the one used here, but found a clear association between CHIP and coronary heart disease (in a population of 6181 individuals). 

      We thank the reviewer for this pertinent reference. We did not include it in the first version of our manuscript because it was not published yet when we submitted our work. We included this reference in the discussion (lines 451, 455, 464). We also included the recent study of Heimlich et al (Circ Gen Pre Med 2024, lines 464-468) who studied the association of CHIP with atherosclerosis burden.

      (11) The use of subjective terms like "comprehensive" or "thorough" in the title of the manuscript does not align with the objective nature of scientific reporting. 

      We removed the terms “comprehensive” and “thorough” from the title and the text.

      Recommendations for the authors:

      Reviewing Editor:

      The Editors believe that in light of the small study the word Comprehensive has to be removed (including from the title and abstract).

      We agree and removed the term comprehensive from the title and the text.

      Reviewer #1 (Recommendations For The Authors):

      Other comments:

      It has long been recognized that hsCRP does not adequately address the inflammation associated with CHIP. For example, see Bick et al Nature 2020; 586:763. Through an assessment of a large dataset, the regulation of multiple inflammatory mediators was associated with CHIP but not with CRP. 

      We agree that hsCRP is probably not the most sensitive marker of the inflammatory state associated with CHIP. However, it is the one most commonly used in medical practice. Moreover, as indicated in the discussion (lines 418-420), we did not observe any association between CHIP and the plasma levels of different cytokines (IL1β, IL6, IL18 and TNFα) in patients enrolled in the CHAth study.

      Many of the citations lack journal names, volumes, page numbers, etc. 

      We apologize for this and corrected the citations.

      Please provide more details on the methodology (i.e. is CHIP assessed only through NGS with no error correction?). Specify the rationale for why the 9% LOY threshold was employed. Provide this information in the Methods section.

      We added more details on the methodology, as requested, in the results section (lines 212-214 and 228-235).

      Supplementary Table S3 lacks headings. What are the designations for columns 6-8? 

      We apologize for this and corrected the Table. Columns 6-8 correspond to the VAF, coverage of the variants and depth of sequencing, as for Table S4.

    1. eLife assessment

      This important study describes the discovery of a mechanism by which multiple species of bacteria synthesize and localize polar flagella via a novel protein, FipA, which interacts with FlhF. The authors use appropriate methodological approaches (biochemistry, molecular microbiology, quantitative microscopy, and bacterial genetics) to obtain and present convincing results and interpretations. This work will particularly interest those studying bacterial motility and bacterial cell biologists.

    2. Reviewer #1 (Public review):

      Summary:

      Bacteria exhibit species-specific numbers and localization patterns of flagella. How specificity in number and pattern is achieved is poorly understood but often depends on a soluble GTPase called FlhF. Here the authors take an unbiased protein-pulldown approach to identify a protein FipA in V. parahaemolyticus that interacts with FlhF. They show that FipA co-occurs with FlhF in the genomes of bacteria with polarly-localized flagella and study the role of FipA in three different bacteria: V. parahaemolyticus, S. putrefaciens, and P. putida. In each case, they show that FipA contributes to FlhF polar localization, flagellar assembly, flagellar patterning, and motility to different species-specific extents.

      Strengths:

      The authors perform a comprehensive analysis of FipA, including phenotyping of mutants, protein localization, localization dependence, and domains of FipA necessary for each. Moreover, they perform a time-series analysis indicating that FipA localizes to the cell pole likely prior to, or at least coincident with, flagellar assembly. They also show that the role of FipA appears to differ between organisms in detail but the overarching idea that it is a flagellar assembly/localization factor remains convincing.

      Weaknesses:

      For me, the comparative analysis in the different organisms was, on balance, a weakness. Because the data for each of the organisms were mixed together, I found it difficult to read and to take away key points from the results. In its current form, the individual details seem to crowd out the model.

    3. Reviewer #2 (Public review):

      Summary:

      The authors identify a novel protein, FipA, which facilitates recruitment of FlhF to the membrane at the cell pole together with the known recruitment factor HubP. This finding is key to understanding the mechanism of polar localization. By comparing the role of FipA in polar flagellum assembly in three different species from Vibrio, Shewanella and Pseudomonas, they discover that, while FipA is required in all three systems, evolution has brought different nuances that open avenues for further discoveries.

      Strengths:

      The discovery of a novel factor for polar flagellum development. A significant contribution to our understanding of flagellar evolution. The solid nature and flow of the experimental work.

      Weaknesses:

      All my concerns have been addressed. I find no weaknesses. A nice, solid piece of work.

    4. Reviewer #3 (Public review):

      Summary:

      The authors investigate how polar flagellation is achieved in gamma-proteobacteria. By probing for proteins that interact with the known flagellar placement factor FlhF, they uncover a new regulator (FipA) for flagellar assembly and polar positioning in three flagellated gamma-proteobacteria. They convincingly demonstrate that FipA interacts genetically and biochemically with previously known spatial regulators HubP and FlhF. FipA is a membrane protein with a cytoplasmic DUF2802 and it co-localizes to the flagellated pole with HubP and FlhF. The DUF2802 mediates the interaction between FipA and FlhF and this interaction is required for FipA function. FipA localization depends on HubP and FlhF.

      Strengths:

      The work is thoroughly executed, relying on bacterial genetics, cell biology and protein interaction studies. The analysis is deep, beginning with the discovery of a new and conserved factor, to the molecular dissection of the protein and probing localisation and interaction determinants. Finally, they show that these determinants are important for function and they perform these studies in parallel in three model systems.

      Weaknesses:

      Because some of the phenotypes and localisation dependencies differ somewhat between model systems, the comparison is challenging to the reader because it is sometimes not obvious what these differences mean and why they arise.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important research uses an elegant combination of protein-protein biochemistry, genetics, and microscopy to demonstrate that the novel bacterial protein FipA is required for polar flagella synthesis and binds to FlhF in multiple bacterial species. This manuscript is convincing, providing evidence for the early stages of flagellar synthesis at a cell pole; however, the protein biochemistry is incomplete and would benefit from additional rigorous experiments. This paper could be of significant interest to microbiologists studying bacterial motility, appendages, and cellular biology.

      We are very grateful for the very positive and helpful evaluation.

      Joint Public Review:

      Bacteria exhibit species-specific numbers and localization patterns of flagella. How specificity in number and pattern is achieved in Gamma-proteobacteria needs to be better understood but often depends on a soluble GTPase called FlhF. Here, the authors take an unbiased protein-pulldown approach with FlhF, resulting in identifying the protein FipA in V. parahaemolyticus. They convincingly demonstrate that FipA interacts genetically and biochemically with previously known spatial regulators HubP and FlhF. FipA is a membrane protein with a cytoplasmic DUF2802; it co-localizes to the flagellated pole with HubP and FlhF. The DUF2802 mediates the interaction between FipA and FlhF, and this interaction is required for FipA function. Altogether, the authors show that FipA likely facilitates the recruitment of FlhF to the membrane at the cell pole together with the known recruitment factor HubP. This finding is crucial in understanding the mechanism of polar localization. The authors show that FipA co-occurs with FlhF in the genomes of bacteria with polarly-localized flagella and study the role of FipA in three of these organisms: V. parahaemolyticus, S. putrefaciens, and P. putida. In each case, they show that FipA contributes to FlhF polar localization, flagellar assembly, flagellar patterning, and motility, though the details differ among the species. By comparing the role of FipA in polar flagellum assembly in three different species, they discover that, while FipA is required in all three systems, evolution has brought different nuances that open avenues for further discoveries.

      Strengths:

      The discovery of a novel factor for polar flagellum development. The solid nature and flow of the experimental work.

      The authors perform a comprehensive analysis of FipA, including phenotyping of mutants, protein localization, localization dependence, and domains of FipA necessary for each. Moreover, they perform a time-series analysis indicating that FipA localizes to the cell pole likely before, or at least coincident with, flagellar assembly. They also show that the role of FipA appears to differ between organisms in detail, but the overarching idea that it is a flagellar assembly/localization factor remains convincing.

      The work is well-executed, relying on bacterial genetics, cell biology, and protein interaction studies. The analysis is deep, beginning with discovering a new and conserved factor, then the molecular dissection of the protein, and finally, probing localization and interaction determinants. Finally, the authors show that these determinants are important for function; they perform these studies in parallel in three model systems.

      Weaknesses:

      The comparative analysis in the different organisms was, on balance, a weakness. Mixing the data for the organisms together made the text difficult to read and the key points of the results difficult to take away. In its current form, the individual details crowded out the model. Indeed, because some of the phenotypes and localization dependencies differ between model systems, the comparison is challenging to the reader. The authors could more clearly state what these differences mean, why they arise, and (in the discussion) how they might relate to the organism's lifestyle.

      More experiments would be needed to fully analyze the effects of interacting proteins on individual protein stability; this absence slightly detracted from the conclusions.

      We have tried our best to improve the manuscript according to the insightful suggestions of the reviewers. Please find our answers to the raised issues below.

      Reviewer #1 (Recommendations For The Authors):

      We are very grateful to this reviewer for the very positive evaluation and the great suggestions to improve the manuscript.

      I think there is value to the comparative analysis but how to present it in such a way that the key similarities and differences stand out is the challenge. Perhaps a table that compares the three datasets is sufficient. Or tell the story of V. parahaemolyticus first to establish the model, followed by comparative analysis of the other two organisms highlighting differences and relegating similarities to supplemental?

      We agree that our previous presentation of the comparative analysis made it very hard to follow the major findings and the general role(s) of FipA, and we are very grateful for the suggestions on how to improve this. We have decided to change the presentation as the reviewer recommended. We used V. parahaemolyticus as a “lead model” to describe the role of FipA, and we then compared the major findings to the other two species. We hope that the story is now easier to follow.

      This is not something that needs to be addressed in the text but I wanted to bring the protein SwrB to the authors' attention which may further expand FipA relevance. Bacillus subtilis uses FlhFG to somehow pattern flagella in a peritrichous arrangement and there are a number of striking similarities, in my opinion, between FipA and SwrB. The two proteins have very similar domain architecture/topology, both proteins promote flagellar assembly, and the genetic neighborhood/operon organization is uncannily similar. There are other more minor similarities dependent on the organism in this paper.

      Phillips, Kearns. 2021. Molecular and cell biological analysis of SwrB in Bacillus subtilis. J Bacteriol 203:e0022721

      Phillips, Kearns. 2015. Functional activation of the flagellar type III secretion export apparatus. PLoS Genet 11:e1005443.

      We thank this reviewer for pointing out these intriguing similarities. For this study we have decided to concentrate exclusively on polarly flagellated bacteria. FlhF and FlhG are also present in B. subtilis, where they play a role in organizing flagellation, but we feel that this would be out of scope for this manuscript.

      Reviewer #2 (Recommendations For The Authors):

      We would like to thank this reviewer for the very positive evaluation and for pointing out several issues to strengthen the story.

      Figure 3A data are problematic since everything is too small to visualize. Since these are functional GFP fusions (or mCherry for 2E data), why are they not presented in color?

      Again - why are color figures not used to help the reader in Fig 4A and 5F & 5G to confirm what is asserted?

      Again, it is difficult to see the images presented. It is asserted that FipA is recruited to the cell pole after cell division and before flagellum assembly, but one has to take their word for it.

      We fully agree that in some cases the localization pattern is hard to see on the micrographs presented. We have, therefore, provided enlarged micrographs in the supplemental part, which make the fluorescent foci within the cells easier to see. With respect to presentation in color, we found that this did not improve the visibility of the localizations and have therefore decided to use the grayscale images.

      Here, what is missing are turnover assays. Do FipA, FlhF, and HubP all co-localize as complex or is the absence of one leading to the protein turnover of other partners? I think this needs to be sorted out before final conclusions can be made.

      Thanks for raising this important point. We have now provided western blot analysis, which demonstrates that FipA and FlhF are produced and stable in the absence of the other partners (see Supplemental Figure 5). The stability of HubP, a general polar marker that is required not only for flagellation, was not determined.

      Minor comments:

      Line 58: change "around" to "in timing with"

      Line 79: what "signal" is transferred from the C-ring to the MS-ring? Are they not fully connected such that rotation is of the entire structure - C-ring-MS-ring-Rod-Hook-Filament? Is it not the change in the relationship to the stator complex where the signal is transferred?

      Line 85: change "counting" to "control of flagellar numbers per cell"

      Line 110: change "is (co-)responsible for recruiting" to "facilitates recruitment of"

      Thanks for pointing this out. We have adjusted the wording according to the reviewer’s suggestions.

      Given that motility phenotypes vary on individual plates (volumes and dryness vary), why in Figure 2C are the motility assays for fipA and flhF mutants of P. putida done on different plates?

      For better visualisation, we have rearranged the spreading halos for the figure. All strain spreading comparisons on soft agar were always conducted on the same plate due to the reasons this reviewer mentioned.

      Reviewer #3 (Recommendations For The Authors):

      We thank this reviewer for the very positive evaluation and the great suggestions.

      One possibility is to describe first all the results relating to FipA in Vibrio and then add the result sections at the end to illustrate the differences between Vibrio and Shewanella, and then Vibrio and Pseudomonas. This may make it easier to follow for the reader.

      We agree that our previous presentation of the comparative analysis made it very hard to follow the major findings and the general role(s) of FipA, and we are very grateful for the suggestions on how to improve this. We have decided to change the presentation as the reviewer recommended. We used V. parahaemolyticus as a “lead model” to describe the role of FipA, and we then compared the major findings to the other two species. We hope that the story is now easier to follow.

      I would have liked to see some TEM analysis of flagella in fipA/hubP double mutants strains and was also wondering if FipA/FlhF/HubP colocalization had been studied in E. coli when all proteins are expressed together, at least with two bearing fluorescent tags.

      Thanks for these great suggestions. In this study, we have concentrated on the localization of FlhF by FipA and HubP. HubP has multiple functions in the cell and may also affect flagellar synthesis to some extent in a species-specific fashion. Therefore, any findings would have to be discussed very carefully, so we have decided to leave that out for the time being.

      With respect to FipA/HubP/FlhF production in a heterologous host such as E. coli, this has been partly done (without FipA) in a second, parallel story (see reference to Dornes et al (2024) in this manuscript). Rebuilding larger parts of the system in a heterologous host is currently being done in an independent study. Therefore, we have decided not to include it here.

      From the Reviewing Editor:

      We are grateful for the fair handling of the review process, for the positive evaluation, and for the helpful hints.

      The microscopy was inconsistent (DIC versus phase) for unclear reasons. Did using different microscopes impact the ability to acquire low-intensity fluorescence signals? Please add a sentence in the Methods section to clarify.

      We are sorry for this inconsistency. As the imaging was carried out by different labs (in part before the projects were joined), each lab used its preferred microscopy settings. We have added an explanatory sentence to the Methods section.

      Also, some subcellular fluorescence localizations were not visible in the selected images (e.g., Figures 3 and 5). The reader had to rely on the authors' statements and analyses. The conclusions could be more robust with fluorescence measurements across the cell body for a subset of cells. The authors could provide this data analysis in the Supplemental; this measurement would more clearly show an accumulation of fluorescence at the cell pole, particularly in low-intensity images.

      We fully agree that in some cases the localization pattern is hard to see on the micrographs presented. Unfortunately, the signal is often not strong enough to provide proper demographs. We have, therefore, provided enlarged micrographs in the supplemental part, which make the fluorescent foci within the cells easier to see.

    1. Author response:

      We sincerely thank the reviewers for their thoughtful, critical, and constructive comments, which will help us in further exploring the mechanisms by which LDH regulates glycolysis, the tricarboxylic acid cycle, and oxidative phosphorylation in future studies. The following are our responses to the reviewers' comments.

      Reviewer #1 (Public Review):

      Summary:

      Zeng et al. have investigated the impact of inhibiting lactate dehydrogenase (LDH) on glycolysis and the tricarboxylic acid cycle. LDH is the terminal enzyme of aerobic glycolysis or fermentation that converts pyruvate and NADH to lactate and NAD+ and is essential for the fermentation pathway as it recycles NAD+ needed by upstream glyceraldehyde-3-phosphate dehydrogenase. As the authors point out in the introduction, multiple published reports have shown that inhibition of LDH in cancer cells typically leads to a switch from fermentative ATP production to respiratory ATP production (i.e., glucose uptake and lactate secretion are decreased, and oxygen consumption is increased). The presumed logic of this metabolic rearrangement is that when glycolytic ATP production is inhibited due to LDH inhibition, the cell switches to producing more ATP using respiration. This observation is similar to the well-established Crabtree and Pasteur effects, where cells switch between fermentation and respiration due to the availability of glucose and oxygen. Unexpectedly, the authors observed that inhibition of LDH led to inhibition of respiration and not activation as previously observed. The authors perform rigorous measurements of glycolysis and TCA cycle activity, demonstrating that under their experimental conditions, respiration is indeed inhibited. Given the large body of work reporting the opposite result, it is difficult to reconcile the reasons for the discrepancy. In this reviewer's opinion, a reason for the discrepancy may be that the authors performed their measurements 6 hours after inhibiting LDH. Six hours is a very long time for assessing the direct impact of a perturbation on metabolic pathway activity, which is regulated on a timescale of seconds to minutes. The observed effects are likely the result of a combination of many downstream responses that happen within 6 hours of inhibiting LDH, which causes a large decrease in ATP production, inhibition of cell proliferation, and likely a range of stress responses, including gene expression changes.

      Strengths:

      The regulation of metabolic pathways is incompletely understood, and more research is needed, such as the one conducted here. The authors performed an impressive set of measurements of metabolite levels in response to inhibition of LDH using a combination of rigorous approaches.

      Weaknesses:

      Glycolysis, TCA cycle, and respiration are regulated on a timescale of seconds to minutes. The main weakness of this study is the long drug treatment time of 6 hours, which was chosen for all the experiments. In this reviewer's opinion, if the goal was to investigate the direct impact of LDH inhibition on glycolysis and the TCA cycle, most of the experiments should have been performed immediately after or within minutes of LDH inhibition. After 6 hours of inhibiting LDH and ATP production, cells undergo a whole range of responses, and most of the observed effects are likely indirect due to the many downstream effects of LDH and ATP production inhibition, such as decreased cell proliferation, decreased energy demand, activation of stress response pathways, etc.

      We appreciate the reviewer’s critical comments. The main argument is whether the inhibition of LDH induces a temporal perturbation in glycolysis, the TCA cycle, and OXPHOS, or if it leads to a shift to a new steady state. We argue that this shift represents a transition between two steady states; specifically, GNE-140 treatment drives metabolism from one steady state to another.

      Before conducting the experiment, we performed a time course experiment, measuring glucose consumption and lactate production in cells treated with GNE-140. The results demonstrated very good linearity, indicating that the glycolytic rate remained constant and thus confirming that glycolysis was at steady state. Given the tight coupling between glycolysis, the TCA cycle, and OXPHOS, we infer that the TCA cycle and OXPHOS were also at steady state. However, this inference requires further confirmation.
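
      The linearity argument above amounts to fitting a straight line to cumulative consumption against time and checking how well it fits. The sketch below shows that check with an ordinary least-squares fit; the time points and values are hypothetical placeholders, not data from the study, and the function name is an assumption.

```python
# Checking that cumulative glucose consumption is linear in time, i.e. that
# the consumption rate is constant over the sampled window.
# All numbers below are hypothetical placeholders, not data from the study.


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


hours = [0, 1, 2, 3, 4, 5, 6]                       # hypothetical sampling times
glucose_used = [0.0, 1.1, 1.9, 3.0, 4.1, 4.9, 6.0]  # hypothetical, arbitrary units

slope, intercept, r2 = linear_fit(hours, glucose_used)
print(f"rate = {slope:.3f} units/h, R^2 = {r2:.4f}")
```

      An R² close to 1 with a stable slope indicates a constant net rate over the window that was sampled; it says nothing about time points that were not measured.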

      Multiple published reports have shown that LDH inhibition in cancer cells causes a shift from fermentative ATP production to respiratory ATP production. This notion persists because it is often compared to the well-established Crabtree and Pasteur effects, where cells toggle between fermentation and respiration based on glucose and oxygen availability. However, in the Pasteur or Crabtree effects, the deprivation of oxygen—the terminal electron acceptor—drives the switch, which is fundamentally different from LDH inhibition.

      Reviewer #2 (Public Review):

      Summary:

      Zeng et al. investigated the role of LDH in determining the metabolic fate of pyruvate in HeLa and 4T1 cells. To do this, three broad perturbations were applied: knockout of two LDH isoforms (LDH-A and LDH-B), titration with a non-competitive LDH inhibitor (GNE-140), and exposure to either normoxic (21% O2) or hypoxic (1% O2) conditions. They show that knockout of either LDH isoform alone, though reducing both protein level and enzyme activity, has virtually no effect on either the incorporation of a stable 13C-label from a 13C6-glucose into any glycolytic or TCA cycle intermediate, nor on the measured intracellular concentrations of any glycolytic intermediate (Figure 2). The only apparent exception to this was the NADH/NAD+ ratio, measured as the ratio of F420/F480 emitted from a fluorescent tag (SoNar).

      The addition of a chemical inhibitor, on the other hand, did lead to changes in glycolytic flux, the concentrations of glycolytic intermediates, and in the NADH/NAD+ ratio (Figure 3). Notably, this was most evident in the LDH-B-knockout, in agreement with the increased sensitivity of LDH-A to GNE-140 (Figure 2). In the LDH-B-knockout, increasing concentrations of GNE-140 increased the NADH/NAD+ ratio, reduced glucose uptake, and lactate production, and led to an accumulation of glycolytic intermediates immediately upstream of GAPDH (GA3P, DHAP, and FBP) and a decrease in the product of GAPDH (3PG). They continue to show that this effect is even stronger in cells exposed to hypoxic conditions (Figure 4). They propose that a shift to thermodynamic unfavourability, initiated by an increased NADH/NAD+ ratio inhibiting GAPDH explains the cascade, calculating ΔG values that become progressively more endergonic at increasing inhibitor concentrations.

      Then - in two separate experiments - the authors track the incorporation of 13C into the intermediates of the TCA cycle from a 13C6-glucose and a 13C5-glutamine. They use the proportion of labelled intermediates as a proxy for how much pyruvate enters the TCA cycle (Figure 5). They conclude that the inhibition of LDH decreases fermentation, but also the TCA cycle and OXPHOS flux - and hence the flux of pyruvate to all of those pathways. Finally, they characterise the production of ATP from respiratory or fermentative routes, the concentration of a number of cofactors (ATP, ADP, AMP, NAD(P)H, NAD(P)+, and GSH/GSSG), the cell count, and cell viability under four conditions: with and without the highest inhibitor concentration, and at normoxia and hypoxia. From this, they conclude that the inhibition of LDH inhibits glycolysis, the TCA cycle, and OXPHOS simultaneously (Figure 7).

      Strengths:

      The authors present an impressively detailed set of measurements under a variety of conditions. It is clear that a huge effort was made to characterise the steady-state properties (metabolite concentrations, fluxes) as well as the partitioning of pyruvate between fermentation as opposed to the TCA cycle and OXPHOS.

      A couple of intermediary conclusions are well supported, with the hypothesis underlying the next measurement clearly following. For instance, the authors refer to literature reports that LDH activity is highly redundant in cancer cells (lines 108 - 144). They prove this point convincingly in Figure 1, showing that both the A- and B-isoforms of LDH can be knocked out without any noticeable changes in specific glucose consumption or lactate production flux, or, for that matter, in the rate at which any of the pathway intermediates are produced. Pyruvate incorporation into the TCA cycle and the oxygen consumption rate are also shown to be unaffected.

      They checked the specificity of the inhibitor and found good agreement between the inhibitory capacity of GNE-140 on the two isoforms of LDH and the glycolytic flux (lines 229 - 243). The authors also provide a logical interpretation of the first couple of consequences following LDH inhibition: an increased NADH/NAD+ ratio leading to the inhibition of GAPDH, causing upstream accumulations and downstream metabolite decreases (lines 348 - 355).

      Weaknesses:

      Despite the inarguable comprehensiveness of the data set, a number of conceptual shortcomings afflict the manuscript. First and foremost, reasoning is often not pursued to a logical conclusion. For instance, the accumulation of intermediates upstream of GAPDH is proffered as an explanation for the decreased flux through glycolysis. However, in Figure 3C it is clear that there is no accumulation of the intermediates upstream of PFK. It is unclear, therefore, how this traffic jam is propagated back to a decrease in glucose uptake. A possible explanation might lie with hexokinase and the decrease in ATP (and constant ADP) demonstrated in Figure 6B, but this link is not made.

      We appreciate the reviewer's critical comment. In Figure 3C, there is no accumulation of F6P or G6P, which are upstream of PFK1. This is because the PFK1-catalyzed reaction sets a significant thermodynamic barrier. Even with 30 μM GNE-140 treatment, the Gibbs free energy change of the PFK1-catalyzed reaction (ΔG_PFK1) remains -9.455 kJ/mol (Figure 3D), indicating that the reaction is still far from thermodynamic equilibrium, thereby preventing the accumulation of F6P and G6P.

      We agree with the reviewer that hexokinase inhibition may play a role; this requires further investigation.
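
      The thermodynamic argument above relies on the standard relation ΔG = ΔG°′ + RT ln Q, where Q is the mass-action ratio computed from intracellular concentrations. The sketch below illustrates that relation for the PFK1 reaction. The standard free energy is an approximate textbook value and every concentration is a hypothetical placeholder; none of these numbers is taken from the manuscript, and the function name is an assumption.

```python
# Gibbs free energy of a reaction from its mass-action ratio:
#     dG = dG0' + R * T * ln(Q)
# The standard free energy and all concentrations below are illustrative
# assumptions, not values taken from the manuscript.
import math

R = 8.314e-3  # gas constant, kJ / (mol * K)
T = 310.15    # 37 degrees C in kelvin


def delta_g(delta_g0_prime, products, substrates, temperature=T):
    """Return dG in kJ/mol for concentrations given in mol/L."""
    q = math.prod(products) / math.prod(substrates)
    return delta_g0_prime + R * temperature * math.log(q)


# PFK1: F6P + ATP -> FBP + ADP
dg0_pfk1 = -14.2  # kJ/mol, approximate textbook value (assumption)
dg = delta_g(
    dg0_pfk1,
    products=[5e-4, 1e-3],    # [FBP], [ADP], hypothetical
    substrates=[1e-4, 3e-3],  # [F6P], [ATP], hypothetical
)
print(f"dG(PFK1) = {dg:.2f} kJ/mol")
```

      With these placeholder values the product FBP is five times more concentrated than the substrate F6P, yet ΔG stays clearly negative. This is the sense in which a strongly exergonic step can remain far from equilibrium while its product accumulates.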

      The obvious link between the NADH/NAD+ ratio and pyruvate dehydrogenase (PDH) is also never addressed, a mechanism that might explain how the pyruvate incorporation into the TCA cycle is impaired by the inhibition of LDH (the observation with which they start their discussion, lines 511 - 514).

      We agree with the reviewer’s comment. In this study, we did not explore how the inhibition of LDH affects pyruvate incorporation into the TCA cycle. As this mechanism was not investigated, we have titled the study: "Elucidating the Kinetic and Thermodynamic Insights into the Regulation of Glycolysis by Lactate Dehydrogenase and Its Impact on the Tricarboxylic Acid Cycle and Oxidative Phosphorylation in Cancer Cells."

      It was furthermore puzzling how the ΔG, calculated with intracellular metabolite concentrations (Figures 3 and 4) could be endergonic (positive) for PGAM at all conditions (also normoxic and without inhibitor). This would mean that under the conditions assayed, glycolysis would never flow completely forward. How any lactate or pyruvate is produced from glucose, is then unexplained.

      This issue also concerned us during the study. However, given the high reproducibility of the data, we consider the result to be real, although it requires explanation.

      The PGAM-catalyzed reaction is tightly linked to both upstream and downstream reactions in the glycolytic pathway. In glycolysis, three key reactions catalyzed by HK2, PFK1, and PK are highly exergonic, providing the driving force for the conversion of glucose to pyruvate. The other reactions, including the one catalyzed by PGAM, operate near thermodynamic equilibrium and primarily serve to equilibrate glycolytic intermediates rather than control the overall direction of glycolysis, as previously described by us (J Biol Chem. 2024 Aug 8;300(9):107648).

      The endergonic nature of the PGAM-catalyzed reaction does not prevent it from proceeding in the forward direction. Instead, the directionality of the pathway is dictated by the exergonic reaction of PFK1 upstream, which pushes the flux forward, and by PK downstream, which pulls the flux through the pathway. The combined effects of PFK1 and PK may account for the observed endergonic state of the PGAM reaction.

      However, if the PGAM-catalyzed reaction were isolated from the glycolytic pathway, it would tend toward equilibrium and never surpass it, as there would be no driving force to move the reaction forward.
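      As an illustration of the quantity under discussion (not part of the authors' analysis), the sketch below computes ΔG = ΔG°' + RT ln Q from a mass-action ratio. The function name, the concentrations, and the ΔG°' of about +4.4 kJ/mol for 3PG to 2PG (an approximate textbook figure) are assumptions chosen for illustration; none of them are taken from the manuscript. The sign of ΔG depends only on whether the measured product/substrate ratio lies below or above the equilibrium constant, which is why accurate metabolite concentrations matter for this argument.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def reaction_delta_g(delta_g0_prime, products, substrates, temp_k=310.15):
    """Return dG = dG0' + RT*ln(Q) in kJ/mol.

    products / substrates are lists of concentrations (mol/L);
    Q is the product of products divided by the product of substrates.
    """
    q = math.prod(products) / math.prod(substrates)
    return delta_g0_prime + R * temp_k * math.log(q)


# Hypothetical values for illustration only (not from the manuscript).
DG0_PGAM = 4.4  # kJ/mol, approximate textbook value for 3PG -> 2PG

# [2PG]/[3PG] = 0.1, below the equilibrium ratio: forward direction favourable.
dg_low_ratio = reaction_delta_g(DG0_PGAM, products=[0.01e-3], substrates=[0.10e-3])

# [2PG]/[3PG] = 0.5, above the equilibrium ratio: forward direction unfavourable.
dg_high_ratio = reaction_delta_g(DG0_PGAM, products=[0.05e-3], substrates=[0.10e-3])

print(f"ratio 0.1: dG = {dg_low_ratio:.2f} kJ/mol")
print(f"ratio 0.5: dG = {dg_high_ratio:.2f} kJ/mol")
```

      With these assumed numbers, a fivefold change in the measured ratio is enough to flip the sign of ΔG, so a reaction operating near equilibrium is especially sensitive to measurement error.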

      Finally, the interpretation of the label incorporation data is rather unconvincing. The authors observe an increasing labelled fraction of TCA cycle intermediates as a function of increasing inhibitor concentration. Strangely, they conclude that less labelled pyruvate enters the TCA cycle while simultaneously fewer labelled intermediates exit the TCA cycle pool, leading to increased labelling of this pool. The reasoning that they present for this (decreased m2 fraction as a function of GNE-140 concentration) is by no means a consistent or striking feature of their titration data and comes across as rather unconvincing. Yet they treat this anomaly as resolved in the discussion that follows.

      GNE-140 treatment increased the labeling of TCA cycle intermediates by [13C6]glucose but decreased the OXPHOS rate. We consider these conflicting results an 'anomaly' that warrants further explanation. To address this, we analyzed the labeling pattern of TCA cycle intermediates using both [13C6]glucose and [13C5]glutamine. Tracing the incorporation of glucose- and glutamine-derived carbons into the TCA cycle suggests that LDH inhibition leads to a reduced flux of glucose-derived acetyl-CoA into the TCA cycle, coupled with a decreased flux of glutamine-derived α-KG, and a reduction in the efflux of intermediates from the cycle. These results align with theoretical predictions. Under any condition, the reactions that distribute TCA cycle intermediates to other pathways must be balanced by those that replenish them. In the GNE-140 treatment group, the entry of glutamine-derived carbon into the TCA cycle was reduced, implying that glucose-derived carbon (as acetyl-CoA) entering the TCA cycle must also be reduced, or vice versa.

      This step-by-step investigation is detailed under the subheading "The Effect of LDHB KO and GNE-140 on the Contribution of Glucose Carbon to the TCA Cycle and OXPHOS" in the Results section in the manuscript.

      In the Discussion, we emphasize that caution should be exercised when interpreting isotope tracing data. In this study, treatment of cells with GNE-140 led to an increased labeling percentage of TCAC intermediates by [13C6]glucose (Figure 5A-E). However, this does not necessarily imply an increase in glucose carbon flux into TCAC; rather, it indicates a reduction in both the flux of glucose carbon into TCAC and the flux of intermediates leaving TCAC. When interpreting the data, multiple factors must be considered, including the carbon-13 labeling pattern of the intermediates (m1, m2, m3, ...) (Figure 5G-K), replenishment of intermediates by glutamine (Figure 5M-V), and mitochondrial oxygen consumption rate (Figure 5W). All these factors should be taken into account to derive a proper interpretation of the data.
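      To make the general point concrete, here is a toy one-pool model (an illustrative assumption, not the authors' analysis) in which a metabolite pool at isotopic steady state is fed by one labeled inflow and one unlabeled inflow. The labeled fraction is then simply the labeled inflow divided by the total inflow. The function name and all flux values are hypothetical, and the model ignores isotopologue distributions (m1, m2, ...) entirely.

```python
def labeled_fraction(v_labeled, v_unlabeled):
    """Steady-state labeled fraction of a pool fed by two inflows.

    v_labeled:   carbon inflow from the labeled source (arbitrary units)
    v_unlabeled: carbon inflow from all unlabeled sources (same units)
    """
    total = v_labeled + v_unlabeled
    if total <= 0:
        raise ValueError("total inflow must be positive")
    return v_labeled / total


# Hypothetical fluxes, chosen only to illustrate the arithmetic.
baseline = labeled_fraction(v_labeled=10.0, v_unlabeled=30.0)
treated = labeled_fraction(v_labeled=6.0, v_unlabeled=9.0)

print(f"baseline labeled fraction: {baseline:.2f}")
print(f"treated labeled fraction:  {treated:.2f}")
```

      In this sketch the labeled inflow drops by 40% and the unlabeled inflow by 70%, yet the labeled fraction rises, which is why fractional enrichment alone cannot establish the direction of a flux change.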

      Reviewer #3 (Public Review):

      Hu et al in their manuscript attempt to interrogate the interplay between glycolysis, TCA activity, and OXPHOS using LDHA/B knockouts as well as LDH-specific inhibitors. Before I discuss the specifics, I have a few issues with the overall manuscript. First of all, based on numerous previous studies it is well established that glycolysis inhibition or forcing pyruvate into the TCA cycle (studies with PDK inhibitors) leads to upregulation of TCA cycle activity and OXPHOS, activation of glutaminolysis, etc. (in this work the authors claim that lowered glycolysis leads to lower levels of TCA activity/OXPHOS). The authors in the current work completely ignore recent studies that suggest that lactate itself is an important signaling metabolite that can modulate metabolism (actual mechanistic insights were recently presented by at least two groups: the Thompson and Chouchani labs). In addition, extensive effort was dedicated to understanding the crosstalk between glycolysis/TCA cycle/OXPHOS using metabolic models (Titov, Rabinowitz labs). I have several comments on how experiments were performed. In the Methods section, it is stated that both HeLa and 4T1 cells were grown in RPMI-1640 medium with regular serum - but under these conditions, pyruvate is certainly present in the medium - this can easily complicate/invalidate some findings presented in this manuscript. In the LDH enzymatic assays with cell homogenates, as described, controls were not explained or presented (a lot of enzymes in the homogenate can react with NADH!). One of the major issues I have is that glycolytic intermediates were measured in multiple enzyme-coupled assays. Although one might think it is a good approach to have quantitative numbers for each metabolite, the way it was done is that cell homogenates (potentially with still traces of activity of multiple glycolytic enzymes) were incubated with various combinations of the SAME enzymes and substrates they were supposed to measure as a part of the enzyme-based cycling reaction. I would prefer to see a comparison between numbers obtained in enzyme-based assays with GC-MS/LC-MS experiments (using calibration curves for respective metabolites, of course). Correct measurements of these metabolites are crucial especially when thermodynamic parameters for respective reactions are calculated. Concentrations in multiple graphs (Figure 1g etc.) are in "mM"; I do not think that this is correct.

      While the roles of lactate as a signaling metabolite and metabolic models are important areas of research, our work focuses on different aspects.

      It is true that cell homogenates contain many enzymes that use NAD as a hydride acceptor or NADH as a hydride donor. However, in our assay system, the substrates are pyruvate and NADH, meaning only enzymes that catalyze the conversion of pyruvate + NADH to NAD + lactate can utilize NADH. Other enzymes do not interfere with this reaction. Although some enzymes may also catalyze this reaction, their catalytic efficiency is markedly lower than that of LDH, ensuring the validity of this assay.

      Similarly, the assays for glycolytic intermediates are validated by the substrate specificity.

      We have developed an LC-MS methodology for some glycolytic intermediates, but the accuracy of quantification remains unsatisfactory due to inherent limitations of this methodology.

    2. eLife assessment

      This study presents an assessment of the effect of lactate dehydrogenase (LDH) inhibition on the activity of glycolysis and tricarboxylic acid cycle. The data were collected and analyzed using solid and validated methodology. This paper makes a useful contribution to the field as it considers a control analysis of LDH flux.

    3. Reviewer #1 (Public Review):

      Summary:

      Zeng et al. have investigated the impact of inhibiting lactate dehydrogenase (LDH) on glycolysis and the tricarboxylic acid cycle. LDH is the terminal enzyme of aerobic glycolysis or fermentation that converts pyruvate and NADH to lactate and NAD+ and is essential for the fermentation pathway as it recycles NAD+ needed by upstream glyceraldehyde-3-phosphate dehydrogenase. As the authors point out in the introduction, multiple published reports have shown that inhibition of LDH in cancer cells typically leads to a switch from fermentative ATP production to respiratory ATP production (i.e., glucose uptake and lactate secretion are decreased, and oxygen consumption is increased). The presumed logic of this metabolic rearrangement is that when glycolytic ATP production is inhibited due to LDH inhibition, the cell switches to producing more ATP using respiration. This observation is similar to the well-established Crabtree and Pasteur effects, where cells switch between fermentation and respiration due to the availability of glucose and oxygen. Unexpectedly, the authors observed that inhibition of LDH led to inhibition of respiration and not activation as previously observed. The authors perform rigorous measurements of glycolysis and TCA cycle activity, demonstrating that under their experimental conditions, respiration is indeed inhibited. Given the large body of work reporting the opposite result, it is difficult to reconcile the reasons for the discrepancy. In this reviewer's opinion, a reason for the discrepancy may be that the authors performed their measurements 6 hours after inhibiting LDH. Six hours is a very long time for assessing the direct impact of a perturbation on metabolic pathway activity, which is regulated on a timescale of seconds to minutes. The observed effects are likely the result of a combination of many downstream responses that happen within 6 hours of inhibiting LDH, which causes a large decrease in ATP production, inhibition of cell proliferation, and likely a range of stress responses, including gene expression changes.

      Strengths:

      The regulation of metabolic pathways is incompletely understood, and more research is needed, such as the one conducted here. The authors performed an impressive set of measurements of metabolite levels in response to inhibition of LDH using a combination of rigorous approaches.

      Weaknesses:

      Glycolysis, TCA cycle, and respiration are regulated on a timescale of seconds to minutes. The main weakness of this study is the long drug treatment time of 6 hours, which was chosen for all the experiments. In this reviewer's opinion, if the goal was to investigate the direct impact of LDH inhibition on glycolysis and the TCA cycle, most of the experiments should have been performed immediately after or within minutes of LDH inhibition. After 6 hours of inhibiting LDH and ATP production, cells undergo a whole range of responses, and most of the observed effects are likely indirect due to the many downstream effects of LDH and ATP production inhibition, such as decreased cell proliferation, decreased energy demand, activation of stress response pathways, etc.

    4. Reviewer #2 (Public Review):

      Summary:

      Zeng et al. investigated the role of LDH in determining the metabolic fate of pyruvate in HeLa and 4T1 cells. To do this, three broad perturbations were applied: knockout of two LDH isoforms (LDH-A and LDH-B), titration with a non-competitive LDH inhibitor (GNE-140), and exposure to either normoxic (21% O2) or hypoxic (1% O2) conditions. They show that knockout of either LDH isoform alone, though reducing both protein level and enzyme activity, has virtually no effect on either the incorporation of a stable 13C-label from a 13C6-glucose into any glycolytic or TCA cycle intermediate, nor on the measured intracellular concentrations of any glycolytic intermediate (Figure 2). The only apparent exception to this was the NADH/NAD+ ratio, measured as the ratio of F420/F480 emitted from a fluorescent tag (SoNar).

      The addition of a chemical inhibitor, on the other hand, did lead to changes in glycolytic flux, the concentrations of glycolytic intermediates, and in the NADH/NAD+ ratio (Figure 3). Notably, this was most evident in the LDH-B-knockout, in agreement with the increased sensitivity of LDH-A to GNE-140 (Figure 2). In the LDH-B-knockout, increasing concentrations of GNE-140 increased the NADH/NAD+ ratio, reduced glucose uptake and lactate production, and led to an accumulation of glycolytic intermediates immediately upstream of GAPDH (GA3P, DHAP, and FBP) and a decrease in the product of GAPDH (3PG). They go on to show that this effect is even stronger in cells exposed to hypoxic conditions (Figure 4). They propose that a shift to thermodynamic unfavourability, initiated by an increased NADH/NAD+ ratio inhibiting GAPDH, explains the cascade, calculating ΔG values that become progressively more endergonic at increasing inhibitor concentrations.

      Then - in two separate experiments - the authors track the incorporation of 13C into the intermediates of the TCA cycle from a 13C6-glucose and a 13C5-glutamine. They use the proportion of labelled intermediates as a proxy for how much pyruvate enters the TCA cycle (Figure 5). They conclude that the inhibition of LDH decreases fermentation, but also the TCA cycle and OXPHOS flux - and hence the flux of pyruvate to all of those pathways. Finally, they characterise the production of ATP from respiratory or fermentative routes, the concentration of a number of cofactors (ATP, ADP, AMP, NAD(P)H, NAD(P)+, and GSH/GSSG), the cell count, and cell viability under four conditions: with and without the highest inhibitor concentration, and under normoxia and hypoxia. From this, they conclude that the inhibition of LDH inhibits glycolysis, the TCA cycle, and OXPHOS simultaneously (Figure 7).

      Strengths:

      The authors present an impressively detailed set of measurements under a variety of conditions. It is clear that a huge effort was made to characterise the steady-state properties (metabolite concentrations, fluxes) as well as the partitioning of pyruvate between fermentation as opposed to the TCA cycle and OXPHOS.

      A couple of intermediary conclusions are well supported, with the hypothesis underlying the next measurement clearly following. For instance, the authors refer to literature reports that LDH activity is highly redundant in cancer cells (lines 108 - 144). They prove this point convincingly in Figure 1, showing that both the A- and B-isoforms of LDH can be knocked out without any noticeable changes in specific glucose consumption or lactate production flux, or, for that matter, in the rate at which any of the pathway intermediates are produced. Pyruvate incorporation into the TCA cycle and the oxygen consumption rate are also shown to be unaffected.

      They checked the specificity of the inhibitor and found good agreement between the inhibitory capacity of GNE-140 on the two isoforms of LDH and the glycolytic flux (lines 229 - 243). The authors also provide a logical interpretation of the first couple of consequences following LDH inhibition: an increased NADH/NAD+ ratio leading to the inhibition of GAPDH, causing upstream accumulations and downstream metabolite decreases (lines 348 - 355).

      Weaknesses:

      Despite the inarguable comprehensiveness of the data set, a number of conceptual shortcomings afflict the manuscript. First and foremost, reasoning is often not pursued to a logical conclusion. For instance, the accumulation of intermediates upstream of GAPDH is proffered as an explanation for the decreased flux through glycolysis. However, in Figure 3C it is clear that there is no accumulation of the intermediates upstream of PFK. It is unclear, therefore, how this traffic jam is propagated back to a decrease in glucose uptake. A possible explanation might lie with hexokinase and the decrease in ATP (and constant ADP) demonstrated in Figure 6B, but this link is not made.

      The obvious link between the NADH/NAD+ ratio and pyruvate dehydrogenase (PDH) is also never addressed, a mechanism that might explain how the pyruvate incorporation into the TCA cycle is impaired by the inhibition of LDH (the observation with which they start their discussion, lines 511 - 514).

      It was furthermore puzzling how the ΔG, calculated with intracellular metabolite concentrations (Figures 3 and 4) could be endergonic (positive) for PGAM at all conditions (also normoxic and without inhibitor). This would mean that under the conditions assayed, glycolysis would never flow completely forward. How any lactate or pyruvate is produced from glucose, is then unexplained.

      Finally, the interpretation of the label incorporation data is rather unconvincing. The authors observe an increasing labelled fraction of TCA cycle intermediates as a function of increasing inhibitor concentration. Strangely, they conclude that less labelled pyruvate enters the TCA cycle while simultaneously fewer labelled intermediates exit the TCA cycle pool, leading to increased labelling of this pool. The reasoning that they present for this (decreased m2 fraction as a function of GNE-140 concentration) is by no means a consistent or striking feature of their titration data and comes across as rather unconvincing. Yet they treat this anomaly as resolved in the discussion that follows.

    5. Reviewer #3 (Public Review):

      Hu et al in their manuscript attempt to interrogate the interplay between glycolysis, TCA activity, and OXPHOS using LDHA/B knockouts as well as LDH-specific inhibitors. Before I discuss the specifics, I have a few issues with the overall manuscript. First of all, based on numerous previous studies it is well established that glycolysis inhibition or forcing pyruvate into the TCA cycle (studies with PDK inhibitors) leads to upregulation of TCA cycle activity and OXPHOS, activation of glutaminolysis, etc. (in this work the authors claim that lowered glycolysis leads to lower levels of TCA activity/OXPHOS). The authors in the current work completely ignore recent studies that suggest that lactate itself is an important signaling metabolite that can modulate metabolism (actual mechanistic insights were recently presented by at least two groups: the Thompson and Chouchani labs). In addition, extensive effort was dedicated to understanding the crosstalk between glycolysis/TCA cycle/OXPHOS using metabolic models (Titov, Rabinowitz labs). I have several comments on how experiments were performed. In the Methods section, it is stated that both HeLa and 4T1 cells were grown in RPMI-1640 medium with regular serum - but under these conditions, pyruvate is certainly present in the medium - this can easily complicate/invalidate some findings presented in this manuscript. In the LDH enzymatic assays with cell homogenates, as described, controls were not explained or presented (a lot of enzymes in the homogenate can react with NADH!). One of the major issues I have is that glycolytic intermediates were measured in multiple enzyme-coupled assays. Although one might think it is a good approach to have quantitative numbers for each metabolite, the way it was done is that cell homogenates (potentially with still traces of activity of multiple glycolytic enzymes) were incubated with various combinations of the SAME enzymes and substrates they were supposed to measure as a part of the enzyme-based cycling reaction. I would prefer to see a comparison between numbers obtained in enzyme-based assays with GC-MS/LC-MS experiments (using calibration curves for respective metabolites, of course). Correct measurements of these metabolites are crucial especially when thermodynamic parameters for respective reactions are calculated. Concentrations in multiple graphs (Figure 1g etc.) are in "mM"; I do not think that this is correct.

    1. eLife assessment

      In this valuable work, Lodhiya et al. provide evidence that excessive ATP underlies the killing of the model organism Mycobacterium smegmatis by two mechanistically-distinct antibiotics. Clarification of the role(s) of reactive oxygen species and ADP, as well as discrepancies with existing literature, would strengthen the model proposed. The data are generally solid as the authors deploy multiple, orthogonal readouts and methods for manipulating reactive oxygen species and ATP. The work will be of interest to those studying antibiotic mechanisms of action.

    2. Reviewer #1 (Public review):

      Summary:

      Lodhiya et al. demonstrate that antibiotics with distinct mechanisms of action, norfloxacin, and streptomycin, cause similar metabolic dysfunction in the model organism Mycobacterium smegmatis. This includes enhanced flux through the TCA cycle and respiration as well as a build-up of reactive oxygen species (ROS) and ATP. Genetic and/or pharmacologic depression of ROS or ATP levels protect M. smegmatis from norfloxacin and streptomycin killing. Because ATP depression is protective, but in some cases does not depress ROS, the authors surmise that excessive ATP is the primary mechanism by which norfloxacin and streptomycin kill M. smegmatis. In general, the experiments are carefully executed; alternative hypotheses are discussed and considered; the data are contextualized within the existing literature. Clarification of the effect of 1) ROS depression on ATP levels and 2) ADP vs. ATP on divalent metal chelation would strengthen the paper, as would discussion of points of difference with the existing literature. The authors might also consider removing Figures 9 and 10A-B as they distract from the main point of the paper and appear to be the beginning of a new story rather than the end of the current one. Finally, statistics need some attention.

      Strengths:

      The authors tackle a problem that is both biologically interesting and medically impactful, namely, the mechanism of antibiotic-induced cell death.

      Experiments are carefully executed, for example, numerous dose- and time-dependency studies; multiple, orthogonal readouts for ROS; and several methods for pharmacological and genetic depletion of ATP.

      There has been a lot of excitement and controversy in the field, and the authors do a nice job of situating their work in this larger context.

      Inherent limitations to some of their approaches are acknowledged and discussed e.g., normalizing ATP levels to viable counts of bacteria.

      Weaknesses:

      The authors have shown that treatments that depress ATP do not necessarily repress ROS, and therefore conclude that ATP is the primary cause of norfloxacin and streptomycin lethality for M. smegmatis. Indeed, this is the most impactful claim of the paper. However, GSH and dipyridyl beautifully rescue viability. Do these and other ROS-repressing treatments impact ATP levels? If not, the authors should consider a more nuanced model and revise the title, abstract, and text accordingly.

      Does ADP chelate divalent metal ions to the same extent as ATP? If so, it is difficult to understand how conversion of ADP to ATP by ATP synthase would alter metal sequestration without concomitant burst in ADP levels.

      Some of the results in the paper diverge from what has been previously reported by some of the referenced literature. These discrepancies should be clarified.

    3. Reviewer #2 (Public review):

      Summary:

      The authors are trying to test the hypothesis that ATP bursts are the predominant driver of antibiotic lethality of Mycobacteria.

      Strengths:

      This reviewer has not identified any significant strengths of the paper in its current form.

      Weaknesses:

      A major weakness is that M. smegmatis has a doubling time of three hours and the authors are trying to conclude that their data would reflect the physiology of M. tuberculosis which has a doubling time of 24 hours. Moreover, the authors try to compare OD measurements with CFU counts and thus observe great variabilities.

      If the authors had evidence to support the conclusion that ATP burst is the predominant driver of antibiotic lethality in mycobacteria then this paper would be highly significant. However, with the way the paper is written, it is impossible to make this conclusion.

    4. Author response:

      Reviewer #1 (Public review):

      Summary:

      Lodhiya et al. demonstrate that antibiotics with distinct mechanisms of action, norfloxacin, and streptomycin, cause similar metabolic dysfunction in the model organism Mycobacterium smegmatis. This includes enhanced flux through the TCA cycle and respiration as well as a build-up of reactive oxygen species (ROS) and ATP. Genetic and/or pharmacologic depression of ROS or ATP levels protect M. smegmatis from norfloxacin and streptomycin killing. Because ATP depression is protective, but in some cases does not depress ROS, the authors surmise that excessive ATP is the primary mechanism by which norfloxacin and streptomycin kill M. smegmatis. In general, the experiments are carefully executed; alternative hypotheses are discussed and considered; the data are contextualized within the existing literature. Clarification of the effect of 1) ROS depression on ATP levels and 2) ADP vs. ATP on divalent metal chelation would strengthen the paper, as would discussion of points of difference with the existing literature. The authors might also consider removing Figures 9 and 10A-B as they distract from the main point of the paper and appear to be the beginning of a new story rather than the end of the current one. Finally, statistics need some attention.

      Strengths:

      The authors tackle a problem that is both biologically interesting and medically impactful, namely, the mechanism of antibiotic-induced cell death.

      Experiments are carefully executed, for example, numerous dose- and time-dependency studies; multiple, orthogonal readouts for ROS; and several methods for pharmacological and genetic depletion of ATP.

      There has been a lot of excitement and controversy in the field, and the authors do a nice job of situating their work in this larger context.

      Inherent limitations to some of their approaches are acknowledged and discussed e.g., normalizing ATP levels to viable counts of bacteria.

      We sincerely appreciate the reviewer’s encouraging feedback.

      Weaknesses:

      The authors have shown that treatments that depress ATP do not necessarily repress ROS, and therefore conclude that ATP is the primary cause of norfloxacin and streptomycin lethality for M. smegmatis. Indeed, this is the most impactful claim of the paper. However, GSH and dipyridyl beautifully rescue viability. Do these and other ROS-repressing treatments impact ATP levels? If not, the authors should consider a more nuanced model and revise the title, abstract, and text accordingly.

      We thank the reviewer for asking this question. In the revised version of the manuscript, we will include data on the impact of the antioxidant GSH on ATP levels.

      Does ADP chelate divalent metal ions to the same extent as ATP? If so, it is difficult to understand how conversion of ADP to ATP by ATP synthase would alter metal sequestration without concomitant burst in ADP levels.

      We sincerely thank the reviewer for raising this insightful question. Indeed, ADP and AMP can also form complexes with divalent metal ions; however, these complexes tend to be less stable. According to the existing literature, ATP-metal ion complexes exhibit a higher formation constant compared to ADP or AMP complexes. This has been attributed to the polyphosphate chain of ATP, which acts as an active site, forming a highly stable tridentate structure (Khan et al., 1962; Distefano et al., 1953). An antibiotic-induced increase in ATP levels, irrespective of any changes in ADP levels, could still result in the formation of more stable complexes with metal ions, potentially leading to metal ion depletion. Although recent studies indicate that antibiotic treatment stimulates purine biosynthesis (Lobritz MA et al., 2022; Yang JH et al., 2019), thereby imposing energy demands and enhancing ATP production, the possibility of a corresponding increase in total purine nucleotide levels (ADP+ATP) exists (as mentioned in the Discussion section). However, this hypothesis requires further investigation.

      Khan MMT, Martell AE. Metal Chelates of Adenosine Triphosphate. Journal of Physical Chemistry (US). 1962 Jan 1;66(1):10–5.

      Distefano V, Neuman WF. Calcium complexes of adenosinetriphosphate and adenosinediphosphate and their significance in calcification in vitro. Journal of Biological Chemistry. 1953 Feb 1;200(2):759–63.

      Lobritz MA, Andrews IW, Braff D, Porter CBM, Gutierrez A, Furuta Y, et al. Increased energy demand from anabolic-catabolic processes drives β-lactam antibiotic lethality. Cell Chem Biol. 2022 Feb 17.

      Yang JH, Wright SN, Hamblin M, McCloskey D, Alcantar MA, Schrübbers L, et al. A White-Box Machine Learning Approach for Revealing Antibiotic Mechanisms of Action. Cell. 2019 May 30.

      Some of the results in the paper diverge from what has been previously reported by some of the referenced literature. These discrepancies should be clarified.

      We apologize for any confusion, but we are uncertain about the specific discrepancies the reviewer is referring to. In the discussion section, we have addressed and analysed our results within the broader context of the existing literature, regardless of whether our findings align with or differ from previous studies.

      Reviewer #2 (Public review):

      Summary:

      The authors are trying to test the hypothesis that ATP bursts are the predominant driver of antibiotic lethality of Mycobacteria.

      Strengths:

      This reviewer has not identified any significant strengths of the paper in its current form.

      Weaknesses:

      A major weakness is that M. smegmatis has a doubling time of three hours and the authors are trying to conclude that their data would reflect the physiology of M. tuberculosis which has a doubling time of 24 hours. Moreover, the authors try to compare OD measurements with CFU counts and thus observe great variabilities.

      If the authors had evidence to support the conclusion that ATP burst is the predominant driver of antibiotic lethality in mycobacteria then this paper would be highly significant. However, with the way the paper is written, it is impossible to make this conclusion.

      We have identified this new mechanism of antibiotic action in Mycobacterium smegmatis and have also mentioned that whether, and to what extent, this mechanism holds in other organisms needs to be tested, as argued extensively in the discussion section of the manuscript.

      We have always drawn inferences from CFU counts, as reported in all of our experiments, because OD600nm is not a reliable method.

    1. eLife assessment

      This valuable study discusses a hot topic in post-endoscopic retrograde cholangiopancreatography pancreatitis. The new score for predicting post-ERCP pancreatitis offers an idea about the risk of pancreatitis before the procedure. Although most scores depend on intraprocedural manoeuvres, such as the number of attempts to cannulate the papilla, this is a solid retrospective single-center study in one country. To be validated, this score should be tested in many countries and on large numbers of patients; nevertheless, this paper should interest gastrointestinal endoscopists.

    2. Joint Public Review:

      Summary:

      This work provides a new general tool for predicting post-ERCP pancreatitis before the procedure depending on pancreatic calcification, female sex, intraductal papillary mucinous neoplasm, a native papilla of Vater, or the use of pancreatic duct procedures. Even though it is difficult for the endoscopist to predict before the procedure which case might have post-ERCP pancreatitis, this new model score can help plan the maneuver and identify when the patient is at high risk of pancreatitis (which can sometimes be deadly), so that experienced endoscopists can do the procedure from the start. This paper provides a model for stratifying patients before the ERCP procedure into low, moderate, and high risk for pancreatitis. To be validated, this score should be tested in many countries and on large numbers of patients. Risk factors can also be identified and added to the score to increase rank.

      Strengths:

      (1) One of the severe complications of the endoscopic retrograde cholangiopancreatography procedure is pancreatitis, so investigators try all the time to find a score that can predict which patients will probably have pancreatitis after the procedure. Most scores depend on the intraprocedural maneuver. Some studies discuss a preprocedural score that can predict pancreatitis before the procedure. This study discusses a new preprocedural score for post-ERCP pancreatitis.

      (2) This score identifies low-, moderate-, and high-risk patients for post-ERCP pancreatitis, so that from the start experienced and well-trained endoscopists can do the procedure, or patients can be referred to tertiary hospitals, or interventional radiology or endoscopic retrograde cholangiopancreatography can be used.

      (3) The number of patients in this study is sufficient to analyze data correctly.

      Weaknesses:

      (1) It is a single-country, retrospective study.

      (2) Many cases were excluded, so the score cannot be applied to those patients.

      (3) Many other studies discussing the same issue have been published before, e.g., https://link.springer.com/article/10.1007/s00464-021-08491-1, https://pubmed.ncbi.nlm.nih.gov/36344369/, so what is new with this score?

      (4) The discussion section needs reformulation to express the study's aim and results.

      (5) Why did the authors select these items in their scoring system and did not add more variables?

    1. eLife assessment

      This important study combines multiple techniques to investigate how caspase activity regulates non-lethal caspase-dependent processes. Through a combination of various approaches, and the development of new techniques, the authors provide compelling evidence supporting the claim that Fas3G-overexpression promotes non-lethal caspase activation in olfactory receptor neurons.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Muramoto and colleagues have examined a mechanism by which the executioner caspase Drice is activated in a non-lethal context in Drosophila. The authors have comprehensively examined this in the Drosophila olfactory receptor neurons using sophisticated techniques. In particular, they had to engineer a new reporter by which non-lethal caspase activation could be detected. The authors conducted a proximity labeling experiment and identified Fasciclin 3 as a key protein in this context. While the removal of Fasciclin 3 did not block non-lethal caspase activation (likely because of redundant mechanisms), its overexpression was sufficient to induce non-lethal caspase activation.

      Strengths:

      While non-lethal functions of caspases have been reported in several contexts, far less is known about the mechanisms by which caspases are activated in these non-lethal contexts. So, the topic is very timely. The overall detail of this work is impressive and the results for the most part are well-controlled and justified.

      Weaknesses:

      The behavioral results shown in Figure 6 need more explanation and clarification (more details below). As currently shown, the results of Figure 6 seem uninterpretable. Also, the overall presentation of the figures and the descriptions in the legends can be improved.

    3. Reviewer #2 (Public review):

      In this study, the authors investigate the role of caspases in neuronal modulation through non-lethal activation. They analyze proximal proteins of executioner caspases using a variety of techniques, including TurboID and a newly developed monitoring system based on Gal4 manipulation, called MASCaT. They demonstrate that overexpression of Fas3G promotes the non-lethal activation of caspase Dronc in olfactory receptor neurons. In addition, they investigate the regulatory mechanisms of the non-lethal function of caspases by performing a comprehensive analysis of proximal proteins of the executioner caspase Drice. It is important to point out that the authors use an array of techniques, from western blots to behavioral experiments, and also that they generated several reagents, from fly lines to antibodies.

      This is an interesting work that would appeal to readers of multiple disciplines. As a whole these findings suggest that overexpression of Fas3G enhances a non-lethal caspase activation in ORNs, providing a novel experimental model that will allow for exploration of molecular processes that facilitate caspase activation without leading to cell death.

    1. eLife assessment

      This valuable study combines electrophysiology experiments and modeling to investigate the encoding of dynamic patterns of polarized light by identified neurons of the bumblebee central complex. The scientific question and methodology are compelling. However, the evidence supporting the authors' conclusions is incomplete without more comprehensive statistical analyses.

    2. Reviewer #1 (Public review):

      Summary:

      The authors of this valuable study use linearly polarized UV light rotating at different angular velocities to stimulate photoreceptors in bumblebees and study the response of TL3 neurons to polarized light. Previous work has typically used a single constant rotation velocity of the polarized light, while the authors of this study explore a range of constant rotational velocities spanning from 30deg/s to 1920deg/s. The authors also use linearly polarized UV light rotating at continuously varying velocities following the angular velocity of the head of a flying bumblebee. 

      Strengths:

      The authors investigate the neuronal responses of TL3 neurons to a variety of rotational velocities. This approach has the potential to reveal the neuronal response to dynamically changing stimuli experienced by the animal as it moves around its environment.

      The authors make good use of physiology and modeling to validate their hypotheses and findings. If done right, this line of investigation has the potential to provide a very useful methodology for utilizing more complex stimuli in studies of the visual pathway and central complex than are traditionally used.

      Weaknesses: 

      The attempt of the authors to use more naturalistic stimuli than previous studies is very important, but the stimulus they use, i.e. linearly polarized UV light projected on the whole dorsal rim of the animal's eyes, is very different from the circular pattern of UV light polarization coming through the sky. In particular, as a bumblebee turns under the sky, the light projected on each ommatidium of the dorsal rim area will not smoothly change like the rotating linearly polarized light used in the experiments. The authors need to discuss this and other limitations of their study. 

      The authors should also comment on the light intensity confound common in polarized light setups, as discussed by Reinhard Wolf et al, J. Comp. Physiol. 1980 and in the thesis of Peter Weir, California Institute of Technology, 2013. It is unclear whether the authors performed measurements to quantify the intensity pattern and whether they took measures to compensate and make the polarized light intensity uniform.

      The authors show that the neuronal responses of TL3 neurons depend on the recent history of the polarized light stimulus. As evidence, they use the different neuronal firing rates measured when arriving at the same polarization stimulus by following two different preceding stimulus sequences. It would have been worthwhile to investigate to what extent the difference in neuronal response is due to the history alone and to what extent it is due to spike timing stochasticity inherent in the neurons. According to the raster plots in Figure 2F, there is substantial stochasticity in the timing of the action potential firing events.

      The authors appear to base their delay calculations and analysis on the response of one single neuron (Figures 2 and 3) even though they have recorded the responses of several TL3 neurons. There is no reason for the authors not to use all neuron recordings in their calculations and analysis.

      Another concern is that while the authors make good use of modeling, like any model, the presented models only partially explain the observed phenomena. A discussion about the limitations of their model therefore needs to be provided. Actually, observing the discrepancies between the model's output and the intracellular recordings reveals what the model is missing. That is, careful consideration of the discrepancies would have led the authors to try adding some noise in their model, which would partially resolve the differences observed at the lower rotational speeds (see stars deviating from the fitted line in Figure 2A) and to consider that introducing an asymmetry between the post-stimulus inhibition and excitation time constants could result in a model not deviating as much at the higher rotation velocities during counter-clockwise rotation of the polarized light (see stars deviating from the fitted line in Figure 2A).

      In the end, the authors use the observation that during saccades, the average activity in their model-with-history increases to claim that when the animal does not turn, it uses less neuronal activity and energy. This is not a convincing line of reasoning. To make a claim about energy efficiency, the authors must instead compare their model with alternatives and show that the neuronal activity of their model during straight flight is indeed lower than those alternative models. Note that such a comparison would be meaningful only if the alternative models compared against capture physiology equally well in all other respects. However, the evident deviations of the presented model from the physiology measurements and the short duration of the test stimulus used would make any such claims difficult to substantiate. 

      Finally, for most experiments, the models are stimulated with a single short yaw sequence lasting a few seconds to measure responses. Given the dependence of the model on history, using such a small sample, we cannot see how generalizable the observations are. The authors need to show that the same effect is produced using multiple different trajectories.

    3. Reviewer #2 (Public review):

      Summary:

      The compass network is a higher-order circuit in insects that integrates sensory cues, like the angle of polarized light, with self-motion information to estimate the animal's angular position in space. This paper by Rother et al. uses sharp electrode recordings to measure intracellular voltage activity from individual compass neurons while polarization patterns are presented to the bee. They present patterns that rotate with variable speed or simulate the sensory experience created by a flight trajectory. The authors discover that at low rotational speeds, TL neuron responses diverge from the tuning expected from a systematic synaptic delay, suggesting that recent experience (history) impacts TL responses. A population model of 180 TL neurons is then used to argue that having cells that are impacted by spiking history could be advantageous for estimating heading. The model activity showed an anticipation of polarization angle for rapid turns that followed prolonged straight flights or turns in the opposite direction. The model also had reduced spiking activity during translational straight flight.

      Strengths:

      One strength of this paper is that it focuses on a question that is underexplored in the field: How does the compass network handle the processing delay caused by multi-synaptic relay from the DRA to the sensory input neurons (TL) to the compass network while the insect is turning rapidly and thus sampling distinct polarization angles in rapid succession? Another strength is the fact that they were able to present neurons with both simulated naturalistic polarization patterns that could occur during flight and synthetic stimuli with a range of rotational velocities. This provides an important data set where these responses can be compared. Another strength is the exploration of how adding a history term to a model of a population of TL neurons can lead the population coding of polarization angle to vary in how delayed it is from changes to the sensory stimuli. They find that angular coding is more anticipatory (shorter delay) following prolonged periods of fixating a single angle, such as what occurs during translational movement, or following turns in the opposite direction of the current turn.

      Weaknesses:

      A challenge for this experimental approach is the relatively low power for data sets in some of the experimental conditions. Low throughput is expected for this experimental approach, as intracellular recordings are a challenging and time-consuming method. A weakness of the manuscript in its current form is that the data from all cells that were able to be recorded are not always presented or quantified. For example, only a single neuron example is used to show the impact of history on preferred polarization and how this tuning varied with rotation velocity. This is also true for the claim that TL3 neurons exhibit post-inhibitory excitation and post-excitatory inhibition. Another concern is that the term "spiking-history" is potentially confusing to readers, who might assume this process is cell intrinsic. The authors' data show evidence of an effect of stimulus history on the responses of the neurons. However, as the authors describe in the discussion, the current data set does not distinguish between an effect that occurs in the recorded neurons (e.g. an effect of intrinsic excitability) vs adaptation elsewhere in the circuit or in DRA photoreceptors. A final challenge for this approach, shared with other studies that measure neural responses from an insect fixed in place, is that it assumes that these TL neurons are purely sensory and that their response properties (or those upstream of them) do not change when the bee performs a motor action or maneuver. This caveat should be considered when interpreting these data; however, these data still represent novel information and important progress in exploring this question.

    4. Reviewer #3 (Public review):

      This manuscript reports the temporal history dependence of central complex TL/ring neuron spiking activity to polarized light patterns. Using sharp recording in tethered bumblebees with synthetic and natural visual stimulation, the authors nicely measured activities to rotating polarized UV light, and made the interesting finding that spiking activity depends on not just current stimulus but also recent activity.

      (1) History dependence has been reported before in ring neurons in Drosophila (Sun et al., Nature Neuroscience, 2017; Shiozaki et al., Nature Neuroscience, 2017). While there are differences in the nature of the visual stimulation used, the basic phenomenology of temporal history dependence bears some resemblance. What are the differences in the physiological properties of ring/TL neurons between different insect species that are relevant to history dependence? What are the structural similarities and differences in the circuits that may help to explain history dependence? These are just a few examples. To give further insight into such questions, the manuscript may benefit from putting the findings here into context.

      (2) Figure 3b serves as critical evidence for history dependence. However, it is unclear from these data whether this is history dependence or another physiological process, such as an OFF response to sensory stimulation or sensory adaptation. One way to test this is to examine whether such an effect can be detected after a delay period. For example, history dependence in fly ring neurons is mediated by delay period activity present for several seconds. This can be easily tested here as well.

      (3) The properties of the history dependence can be better characterized to help understand its nature. What are the statistical characteristics of post-stimulus inhibition to preferred AoP and post-stimulus excitation to anti-preferred AoP? What are the temporal dynamics of such an effect, e.g., how long does it take to return to baseline? Are the differences in these properties recorded across the TL neuron population? Is it possible to categorize these TL neurons based on these properties and morphology? These properties are important for understanding the physiological basis of such an effect. The authors only presented two traces in Figure 3b, beautiful example traces, but without any further population data and statistical analysis.

      (4) A major point of the manuscript is energy efficiency via reduction of firing rate. However, the only evidence comes from simulation, and it seems to be a weak effect of 0.5 APs/s.

      (5) Another major point of the manuscript is "increases sensitivity for course deviations during straight flight". However, this again is supported by simulation only. To validate these claims, empirical support of behavioral experiments is highly desired. Otherwise, it is recommended to minimize emphasizing such behavioral predictions.

      (6) A substantial portion of the text emphasizes the importance of natural stimulation. While natural stimulation is indeed a desirable experimental approach, it is unclear if natural stimulation is exploited to its full in this manuscript. History dependence can be explored with synthetic stimulation.

      (7) A phenomenological model was used to account for the history effect, by assuming a linear integration process and a linear history effect. However, such an assumption is not adequately backed up by rigorous statistical analysis of the experimental data or at least by proper conceptual discussion.

      (8) Population responses, as in Figure 4, are based on strong assumptions of neuronal properties without clear experimental support, thus seeming to be quite a stretch.

      (9) There are interesting observations in simulation results from Figure 5; it would be nice to experimentally test at least some of these ideas.

      (10) "anticipate future head directions" seems to be quite a stretch to me without mechanistic explanations.

      (11) The visual stimulation design used can be improved and expanded. The synthetic stimulation used in Figure 1c follows a stereotyped order, according to angular velocities. As the focus of the manuscript is to probe the history effect and to test again the findings made with this stimulation, randomized stimulation should ideally be examined.

      (12) State dependence was observed in ring neurons in Drosophila (Sun et al., Nature Neuroscience, 2017) which might be related to ongoing neural activity and history dependence. While I realize that the animal is tethered, I was wondering if there was any signature of neural activity state dependence observed in this study.

    1. eLife assessment

      This study makes an important effort to observe and quantify synaptic integration in a large and active network of cultured neurons, using simultaneous patch-clamp and large-scale extracellular recordings. The authors developed a method to distinguish excitatory and inhibitory contributions and show compelling evidence that the subthreshold activity of these neurons is dominated by a few presynaptic neurons. They provide convincing statistics about connectivity and network dynamics.

    1. eLife assessment

      Using microscopy experiments and theoretical modelling, the authors present convincing evidence of cellular coordination in the gliding filamentous cyanobacterium Fluctiforma draycotensis. The results are important for the understanding of cyanobacterial motility and the underlying molecular and mechanical pathways of cellular coordination.

    2. Reviewer #1 (Public review):

      Summary:

      The authors use microscopy experiments to track the gliding motion of filaments of the cyanobacterium Fluctiforma draycotensis. They find that filament motion consists of back-and-forth trajectories along a "track", interspersed with reversals of movement direction, with no clear dependence between filament speed and length. It is also observed that longer filaments can buckle and form plectonemes. A computational model is used to rationalize these findings.

      Strengths:

      Much work in this field focuses on molecular mechanisms of motility; by tracking filament dynamics this work helps to connect molecular mechanisms to environmentally and industrially relevant ecological behavior such as aggregate formation.

      The observation that filaments move on tracks is interesting and potentially ecologically significant.

      The observation of rotating membrane-bound protein complexes and tubular arrangement of slime around the filament provides important clues to the mechanism of motion.

      The observation that long filaments buckle has the potential to shed light on the nature of mechanical forces in the filaments, e.g. through the study of the length dependence of buckling.

      Weaknesses:

      The manuscript makes the interesting statement that the distribution of speed vs filament length is uniform, which would constrain the possibilities for mechanical coupling between the filaments. However, Figure 1C does not show a uniform distribution but rather an apparent lack of correlation between speed and filament length, while Figure S3 shows a dependence that is clearly increasing with filament length. Also, although it is claimed that the computational model reproduces the key features of the experiments, no data is shown for the dependence of speed on filament length in the computational model. The statement that is made about the model "all or most cells contribute to propulsive force generation, as seen from a uniform distribution of mean speed across different filament lengths", seems to be contradictory, since if each cell contributes to the force one might expect that speed would increase with filament length.

      The computational model misses perhaps the most interesting aspect of the experimental results which is the coupling between rotation, slime generation, and motion. While the dependence of synchronization and reversal efficiency on internal model parameters are explored (Figure 2D), these model parameters cannot be connected with biological reality. The model predictions seem somewhat simplistic: that less coupling leads to more erratic reversal and that the number of reversals matches the expected number (which appears to be simply consistent with a filament moving backwards and forwards on a track at constant speed).

      Filament buckling is not analysed in quantitative detail, which seems to be a missed opportunity to connect with the computational model, eg by predicting the length dependence of buckling.

    3. Reviewer #2 (Public review):

      Summary:

      The authors combined time-lapse microscopy with biophysical modeling to study the mechanisms and timescales of gliding and reversals in filamentous cyanobacterium Fluctiforma draycotensis. They observed the highly coordinated behavior of protein complexes moving in a helical fashion on cells' surfaces and along individual filaments as well as their de-coordination, which induces buckling in long filaments.

      Strengths:

      The authors provided concrete experimental evidence of cellular coordination and de-coordination of motility between cells along individual filaments. The evidence is comprised of individual trajectories of filaments that glide and reverse on surfaces as well as the helical trajectories of membrane-bound protein complexes that move on individual filaments and are implicated in generating propulsive forces.

      Limitations:

      The biophysical model is one-dimensional and thus does not capture the buckling observed in long filaments. I expect that the buckling contains useful information since it reflects the competition between bending rigidity, the speed at which cell synchronization occurs, and the strength of the propulsion forces.

      Future directions:

      The study highlights the need to identify molecular and mechanical signaling pathways of cellular coordination. In analogy to the many works on the mechanisms and functions of multi-ciliary coordination, elucidating coordination in cyanobacteria may reveal a variety of dynamic strategies in different filamentous cyanobacteria.

    4. Reviewer #3 (Public review):

      Summary:

      The authors present new observations related to the gliding motility of the multicellular filamentous cyanobacterium Fluctiforma draycotensis. The bacteria move forward by rotating about their long axis, which causes points on the cell surface to move along helical paths. As filaments glide forward they form visible tracks. Filaments preferentially move within the tracks. The authors devise a simple model in which each cell in a filament exerts a force that pushes either forward or backward. Mechanical interactions between cells cause neighboring cells to align the forces they exert. The model qualitatively reproduces the tendency of filaments to move in a concerted direction and reverse at the end of tracks.

      Strengths:

      The observations of the helical motion of the filament are compelling.

      The biophysical model used to describe cell-cell coordination of locomotion is clear and reasonable. The qualitative consistency between theory and observation suggests that this model captures some essential qualities of the true system.

      The authors suggest that molecular studies should be directly coupled to the analysis and modeling of motion. I agree.

      Weaknesses:

      There is very little quantitative comparison between theory and experiment. It seems plausible that mechanisms other than mechano-sensing could lead to equations similar to those in the proposed model. As there is no comparison of model parameters to measurements or similar experiments, it is not certain that the mechanisms proposed here are an accurate description of reality. Rather the model appears to be a promising hypothesis.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors use microscopy experiments to track the gliding motion of filaments of the cyanobacterium Fluctiforma draycotensis. They find that filament motion consists of back-and-forth trajectories along a "track", interspersed with reversals of movement direction, with no clear dependence between filament speed and length. It is also observed that longer filaments can buckle and form plectonemes. A computational model is used to rationalize these findings.

      We thank the reviewer for this accurate summary of the presented work.

      Strengths:

      Much work in this field focuses on molecular mechanisms of motility; by tracking filament dynamics this work helps to connect molecular mechanisms to environmentally and industrially relevant ecological behavior such as aggregate formation.

      The observation that filaments move on tracks is interesting and potentially ecologically significant.

      The observation of rotating membrane-bound protein complexes and tubular arrangement of slime around the filament provides important clues to the mechanism of motion.

      The observation that long filaments buckle has the potential to shed light on the nature of mechanical forces in the filaments, e.g. through the study of the length dependence of buckling.

      We thank the reviewer for listing these positive aspects of the presented work.

      Weaknesses:

      The manuscript makes the interesting statement that the distribution of speed vs filament length is uniform, which would constrain the possibilities for mechanical coupling between the filaments. However, Figure 1C does not show a uniform distribution but rather an apparent lack of correlation between speed and filament length, while Figure S3 shows a dependence that is clearly increasing with filament length. Also, although it is claimed that the computational model reproduces the key features of the experiments, no data is shown for the dependence of speed on filament length in the computational model. The statement that is made about the model "all or most cells contribute to propulsive force generation, as seen from a uniform distribution of mean speed across different filament lengths", seems to be contradictory, since if each cell contributes to the force one might expect that speed would increase with filament length.

      We agree that the data shows in general a lack of correlation, rather than strictly being uniform. In the revised manuscript, we intend to collect more data from observations on glass to better understand the relation between filament length and speed. 

      In considering longer filaments, one also needs to consider the increased drag created by each additional cell. In other words, overall friction will either increase or stay constant as filament length increases. Therefore, if only one cell (or a few cells) were generating motility forces, then adding more cells in longer filaments would decrease speed.

      Since the current data do not show any decrease in speed with increasing filament length, we stand by the argument that the data support the view that all (or most) cells in a filament are involved in force generation for motility. We will revise the manuscript to make this point, and our arguments for assuming that multiple or most cells in a filament contribute to motility, clear.
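
      The drag argument above can be illustrated with a small numerical sketch. It is not the model from the manuscript: it assumes an overdamped filament in which thrust is balanced by a drag that grows linearly with the number of cells, and the function name `filament_speed` and its parameters (`force_per_cell`, `drag_per_cell`, in arbitrary units) are hypothetical.

```python
def filament_speed(n_cells, n_active, force_per_cell=1.0, drag_per_cell=1.0):
    """Steady-state speed of a toy overdamped filament.

    Assumptions (not taken from the manuscript): each active cell
    contributes the same thrust, and every cell adds the same drag,
    so speed = total thrust / total drag coefficient.
    """
    if not 0 < n_active <= n_cells:
        raise ValueError("need 0 < n_active <= n_cells")
    total_thrust = n_active * force_per_cell
    total_drag_coefficient = n_cells * drag_per_cell
    return total_thrust / total_drag_coefficient


lengths = [5, 10, 20, 40]

# Scenario A: every cell generates thrust -> speed independent of length.
all_cells_push = [filament_speed(n, n) for n in lengths]

# Scenario B: a single cell generates thrust -> speed falls as 1 / length.
one_cell_pushes = [filament_speed(n, 1) for n in lengths]

print("all cells active:", all_cells_push)
print("one cell active: ", one_cell_pushes)
```

      Under these assumptions, a speed that does not decrease with filament length is what one would expect if thrust scales with cell number, whereas thrust from a single cell would give a speed falling inversely with length. If drag grew more slowly than linearly with length, the contrast between the two scenarios would be weaker.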

      The computational model misses perhaps the most interesting aspect of the experimental results which is the coupling between rotation, slime generation, and motion. While the dependence of synchronization and reversal efficiency on internal model parameters are explored (Figure 2D), these model parameters cannot be connected with biological reality. The model predictions seem somewhat simplistic: that less coupling leads to more erratic reversal and that the number of reversals matches the expected number (which appears to be simply consistent with a filament moving backwards and forwards on a track at constant speed).

      We agree that the coupling between rotation, slime generation and motion is interesting and important when studying the specific mechanism leading to filament motion. However, we believe it is even more fundamental to consider the intercellular coordination that is needed to realise this motion. Individual filaments are a collection of independent cells. This raises the question of how they can coordinate their thrust generation in such a way that the whole filament can both move and reverse direction of motion as a single unit. With the presented model, we want to start addressing precisely this point.

      The model allows us to qualitatively understand the relation between coupling strength and reversals (erratic vs. coordinated motion of the filament). It also provides a hint about the possibility of de-coordination, which we then look for and identify in longer filaments.

      While the model results seem obvious in hindsight, the analysis of the model allows phrasing the question of cell-to-cell coordination, which has not been brought up previously when considering the inherently multi-cell process of filament motility.

      Filament buckling is not analysed in quantitative detail, which seems to be a missed opportunity to connect with the computational model, eg by predicting the length dependence of buckling.

      Please note that Figure S10 provides an analysis of filament length and number of buckling instances observed. This suggests that buckling happens only in filaments above a certain length.

      We do agree that further analyses of buckling, both experimentally and through modelling, would be interesting. This study, however, focussed on cell-to-cell coupling / coordination during filament motility. We have identified the possibility of de-coordination through the use of a simple 1D model of motion, and found evidence of such de-coordination in experiments. Notice that the buckling we report does not depend on the filament hitting an external object. It is a direct result of filament activity which, in this context, serves as evidence of cellular de-coordination.
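
      As a rough illustration of why de-coordination alone could load a filament, the sketch below computes the internal force at each cell junction of a 1D chain. It is a toy calculation under stated assumptions and not the model of the manuscript: each cell pushes with the same thrust in direction +1 or -1, each cell feels the same linear drag, and the chain moves rigidly. The function name `junction_forces` and its parameters are hypothetical.

```python
def junction_forces(directions, thrust=1.0, drag=1.0):
    """Speed and junction forces for a rigid 1D chain of cells.

    directions: list of +1 / -1, the thrust direction of each cell,
                ordered along the +x axis.
    Returns (speed, forces) where forces[k-1] is the force that cells
    k..N-1 exert on cells 0..k-1. Positive means the junction is under
    tension, negative means it is under compression.
    """
    n = len(directions)
    # Overdamped force balance for the whole chain.
    speed = thrust * sum(directions) / (drag * n)
    forces = []
    pushed = 0.0  # cumulative thrust of the cells left of the junction
    for k in range(1, n):
        pushed += thrust * directions[k - 1]
        # Force balance on cells 0..k-1: pushed - k*drag*speed + T = 0
        forces.append(k * drag * speed - pushed)
    return speed, forces


# Fully coordinated filament: moves at full speed, no internal load.
print(junction_forces([1, 1, 1, 1]))

# De-coordinated filament whose halves push toward each other:
# it stalls and the middle junction is compressed.
print(junction_forces([1, 1, -1, -1]))
```

      In this toy picture a coordinated filament carries no internal load, while two halves pushing toward each other stall the filament and compress its middle, with the load growing with the number of opposing cells. Whether such compression leads to buckling would depend on the bending rigidity, which a 1D description cannot capture.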

      Now that we have observed buckling and plectoneme formation, these processes need to be analysed with additional experiments and modelling. The appropriate model for this process needs to be 3D, and should ideally include torques arising from filament rotation. Experimentally, we need to identify means of influencing filament length and motion and see if we can measure buckling frequency and position across different filament lengths. These works are ongoing and will have to be summarised in a separate, future publication.

      Reviewer #2 (Public review):

      Summary:

      The authors combined time-lapse microscopy with biophysical modeling to study the mechanisms and timescales of gliding and reversals in filamentous cyanobacterium Fluctiforma draycotensis. They observed the highly coordinated behavior of protein complexes moving in a helical fashion on cells' surfaces and along individual filaments as well as their de-coordination, which induces buckling in long filaments.

      We thank the reviewer for this accurate summary of the presented work.

      Strengths:

      The authors provided concrete experimental evidence of cellular coordination and de-coordination of motility between cells along individual filaments. The evidence is comprised of individual trajectories of filaments that glide and reverse on surfaces as well as the helical trajectories of membrane-bound protein complexes that move on individual filaments and are implicated in generating propulsive forces.

      We thank the reviewer for listing these positive aspects of the presented work.

      Limitations:

      The biophysical model is one-dimensional and thus does not capture the buckling observed in long filaments. I expect that the buckling contains useful information since it reflects the competition between bending rigidity, the speed at which cell synchronization occurs, and the strength of the propulsion forces.

      Cell-to-cell coordination is a more fundamental phenomenon than the buckling and twisting of longer filaments, in that the latter is a consequence of limits of the former. In this sense, we are focussing here on something that we think is the necessary first step to understand filament gliding. The 3D motion of filaments (bending, plectoneme formation) is fascinating and can have important consequences for collective behaviour and macroscopic structure formation. As a consequence of cellular coupling, however, it is beyond the scope of the present paper.

      Please also see our response above. We believe that the detailed analysis of buckling and plectoneme formation requires (and merits) dedicated experiments and modelling which go beyond the focus of the current study (on cellular coordination) and will constitute a separate analysis that stands on its own. We are currently working in that direction.

      Future directions:

      The study highlights the need to identify molecular and mechanical signaling pathways of cellular coordination. In analogy to the many works on the mechanisms and functions of multi-ciliary coordination, elucidating coordination in cyanobacteria may reveal a variety of dynamic strategies in different filamentous cyanobacteria.

      We thank the reviewer for highlighting this point again and seeing the value in combining molecular and dynamical approaches.

      Reviewer #3 (Public review):

      Summary:

      The authors present new observations related to the gliding motility of the multicellular filamentous cyanobacteria Fluctiforma draycotensis. The bacteria move forward by rotating about their long axis, which causes points on the cell surface to move along helical paths. As filaments glide forward they form visible tracks. Filaments preferentially move within the tracks. The authors devise a simple model in which each cell in a filament exerts a force that either pushes forward or backwards. Mechanical interactions between cells cause neighboring cells to align the forces they exert. The model qualitatively reproduces the tendency of filaments to move in a concerted direction and reverse at the end of tracks.

      We thank the reviewer for this accurate summary of the presented work.

      Strengths:

      The observations of the helical motion of the filament are compelling.

      The biophysical model used to describe cell-cell coordination of locomotion is clear and reasonable. The qualitative consistency between theory and observation suggests that this model captures some essential qualities of the true system.

      The authors suggest that molecular studies should be directly coupled to the analysis and modeling of motion. I agree.

      We thank the reviewer for listing these positive aspects of the presented work and highlighting the need for combining molecular and biophysical approaches.

      Weaknesses:

      There is very little quantitative comparison between theory and experiment. It seems plausible that mechanisms other than mechano-sensing could lead to equations similar to those in the proposed model. As there is no comparison of model parameters to measurements or similar experiments, it is not certain that the mechanisms proposed here are an accurate description of reality. Rather the model appears to be a promising hypothesis.

      We agree with the referee that the model we put forward is one of several possible. We note, however, that the assumption of mechanosensing by each cell - as done in this model - results in capturing both the alignment of cells within a filament (with some flexibility) and reversal dynamics. We have explored an even more minimal 1D model, where the cell’s direction of force generation is treated as an Ising-like spin and coupled between nearest neighbours (without assuming any specific physico-chemical basis). We found that this model was not fully able to capture both phenomena. In that model, we found that alignment required high levels of coupling (which is hard to justify except for mechanical coupling) and reversals were not readily explainable (and required additional assumptions). These points led us to the current, mechanically motivated model.
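
      To make the Ising-like alternative mentioned above concrete, here is a minimal sketch. It is not the authors' implementation: it assumes each cell's force direction is a spin of +1 or -1, coupled only to its nearest neighbours along an open chain and updated with Metropolis dynamics at unit temperature. The names `simulate_chain`, `n_cells`, `coupling` and `sweeps` are hypothetical and chosen for illustration only.

```python
import math
import random


def simulate_chain(n_cells=30, coupling=1.0, sweeps=200, seed=0):
    """Metropolis dynamics for a 1D chain of +/-1 'force direction' spins.

    Hypothetical illustration: energy E = -coupling * sum_i s_i * s_{i+1},
    open chain ends, unit temperature.
    Returns the net direction (mean spin) recorded after each sweep.
    """
    rng = random.Random(seed)
    spins = [rng.choice((-1, 1)) for _ in range(n_cells)]
    net_direction = []
    for _ in range(sweeps):
        for _ in range(n_cells):
            i = rng.randrange(n_cells)
            # Sum over the nearest neighbours that exist (open chain ends).
            neighbours = 0
            if i > 0:
                neighbours += spins[i - 1]
            if i < n_cells - 1:
                neighbours += spins[i + 1]
            # Energy change if spin i were flipped.
            delta_e = 2.0 * coupling * spins[i] * neighbours
            if delta_e <= 0 or rng.random() < math.exp(-delta_e):
                spins[i] = -spins[i]
        net_direction.append(sum(spins) / n_cells)
    return net_direction
```

      In such a sketch, a net direction close to +1 or -1 corresponds to a filament whose cells push the same way, which only emerges at strong coupling. Nothing in these dynamics ties reversals to the filament reaching the end of a track, which is in line with the limitation described in the response.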

      The parameterisation of the current model would require measuring cellular forces. To this end, a recent study has attempted to measure some of the physical parameters in a different filamentous cyanobacterium [1], and in our revision we will re-evaluate model parameters and dynamics in light of that study. We will also attempt to directly verify the presence of mechano-sensing by obstructing the movement of filaments.

    1. eLife assessment

      The authors present a solid statistical framework for using sibling phenotype data to assess whether there is evidence for de-novo or rare variants causing extreme trait values. Their valuable method is promising and will be of interest to researchers studying complex trait genetics.

    2. Reviewer #1 (Public review):

      This is a clever and well-done paper. The authors sought to craft a method, applicable to biobank-scale data but without necessarily using genotyping or sequencing, to detect the presence of de novo mutations and rare variants that stand out from the polygenic background of a given trait. Their method depends essentially on sibling pairs where one sibling is in an extreme tail of the phenotypic distribution and whether the other sibling's regression to the mean shows a systematic deviation from what is expected under a simple polygenic architecture.

      Their method is successful in that it builds on a compelling intuition, rests on a rigorous derivation, and seems to show reasonable statistical power in the UK Biobank. (More biobanks of this size will probably become available in the near future.) It is somewhat unsuccessful in that rejection of the null hypothesis does not necessarily point to the favored hypothesis of de novo or rare variants. The authors discuss the alternative possibility of rare environmental events of large effect.

      Comments on current version:

      The authors have addressed the concerns of the reviewers. I have no further comments.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors present valuable findings on how to determine the genetic architecture of extreme phenotype values by using data on sibling pairs. While the authors' derivations of the method are correct, the scenarios considered are incomplete, making it difficult to have confidence in the interpretation of the results as demonstrating the influence of de-novo or Mendelian (rare, penetrant-variant) architectures. The method nevertheless shows promise and will be of interest to researchers studying complex trait genetics. 

      A.1: We have now expanded our consideration of the scenarios and we have ensured that we do not over-interpret our results as being due to de novo or Mendelian architectures. Instead, we make clear that our statistical tests are powered to identify these architectures but that there are other potential causes of significant results (e.g. measurement error or uncontrolled environmental factors from heavy-tailed distributions), making follow-up validation studies necessary before underlying architectures can be confirmed. We consider this to be typical of observational research, in which significant results may indicate causal effects unless uncontrolled confounding factors explain the observed associations, requiring experimental/trial follow-up for validation. We believe that our tests are useful for providing initial inference, and that in some settings – e.g. prioritising samples for sequencing to identify rare variants – could be useful as an initial screening step to increase the efficacy of a planned analysis or study.

      Additionally, we have now developed “SibArc”, an openly available software tool that takes sibling trait data as input and estimates conditional sibling heritability across the trait distribution. Then, based on the theoretical framework developed and described in the paper, it estimates effect sizes for each tail of the trait distribution, generates P-values corresponding to our de novo and Mendelian tests, and performs a Kolmogorov-Smirnov test to identify general departures from our null model. SibArc also provides additional functionality in preliminary beta form, for example an iterative optimisation routine that infers the approximate relative degrees of polygenic, de novo, and Mendelian architectures prevailing in each trait tail. We have made this software tool, a Quick Start tutorial, and sample data available online on GitHub and are hosting these on a dedicated website: www.sibarc.net.

      Reviewer #1 (Public Review):

      This is a clever and well-done paper that should be published. The authors sought to craft a method, applicable to biobank-scale data but without necessarily using genotyping or sequencing, to detect the presence of de novo mutations and rare variants that stand out from the polygenic background of a given trait. Their method depends essentially on sibling pairs where one sibling is in an extreme tail of the phenotypic distribution and whether the other sibling's regression to the mean shows a systematic deviation from what is expected under a simple polygenic architecture. 

      Their method is successful in that it builds on a compelling intuition, rests on a rigorous derivation, and seems to show reasonable statistical power in the UK Biobank. (More biobanks of this size will probably become available in the near future.)  It is somewhat unsuccessful in that rejection of the null hypothesis does not necessarily point to the favored hypothesis of de novo or rare variants. The authors discuss the alternative possibility of rare environmental events of large effect. Maybe attention should be drawn to this in the abstract or the introduction of the paper. Nevertheless, since either of these possibilities is interesting, the method remains valuable. 

      A.2: We agree with the reviewer that we should have made it clearer that - while our statistical tests are powered to identify de novo and Mendelian architectures – significant findings from our tests could also be explained by rare environmental events of large effect (specifically by uncontrolled environmental factors with heavy-tailed distributions). We have now made this clear throughout the manuscript (see A.1).

      Moreover, we agree with the reviewer that whether deviations from expectations are due to de novo or rare variants, or to environmental factors, either possibility is interesting. For example, in either scenario, our results can highlight inaccuracy in PRS prediction of extreme trait values for certain traits, and also provide a relative measure across different traits of large effects impacting on the trait tails, irrespective of whether they are genetic or environmental. We now place more emphasis on this point throughout the manuscript.

      Reviewer #2 (Public Review):

      Souaiaia et al. attempt to use sibling phenotype data to infer aspects of genetic architecture affecting the extremes of the trait distribution. They do this by considering deviations from the expected joint distribution of siblings' phenotypes under the standard additive genetic model, which forms their null model. They ascribe excess similarity compared to the null as due to rare variants shared between siblings (which they term 'Mendelian') and excess dissimilarity as due to de-novo variants. While this is a nice idea, there can be many explanations for rejection of their null model, which clouds interpretation of Souaiaia et al.'s empirical results.

      A.3: We agree with the reviewer that we should have made clearer that there are other explanations for significant results from our tests, and we have now fully addressed this point (see A.1, A.2, A.4, A.5 for more detail). In addition, we now elaborate on exactly what our null hypothesis is: not only that the expected joint distribution of siblings’ phenotypes is governed by the standard additive genetic model, but also that environmental effects are either controlled for or else their combined effect is approximately Gaussian. Furthermore, by selecting from the UK Biobank only those traits whose raw trait distribution most closely corresponds to a Gaussian distribution, we increase the probability that significant results from our tests are due to rare variants (shared or unshared among siblings).

      The authors present their method as detecting aspects of genetic architecture affecting the extremes of the trait distribution. However, I think it would be better to characterize the method as detecting whether siblings are more or less likely to be aggregated in the extremes of the phenotype distribution than would be predicted under a common variant, additive genetic model.

      A.4: As discussed above, we should have stated more clearly that significant results could be due to non-genetic factors; we have now addressed this.

      However, we do not think that it would be appropriate to characterise our tests as merely corresponding to over and under aggregation of siblings in the tails. Firstly, environmental factors should be controlled for as part of our testing, increasing the probability that significant results are due to genetic, and not environmental factors. Secondly, tests for identifying broad over and under aggregation of siblings in the tails should be designed differently and, accordingly, the tests that we have developed here would not be optimal to detect over/under aggregation of siblings in trait tails. Our test for inference of de novo variants, for example, exploits the fact that de novo alleles of large effect result in one sibling being extreme and all others being drawn from the background distribution, so that the mean of other siblings is relatively low – not merely that other siblings are less likely to be found in the tail. For more discussion on this issue in relation to one of reviewer 1’s points, see A.9.
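
      The intuition behind the de novo test can be sketched in a small simulation. This is an illustration under stated assumptions, not the authors' SibArc code: traits are standardised, the sibling correlation under the polygenic null is taken to be h2 / 2, and a de novo allele is modelled as a fixed shift applied to the index sibling only. The function name `cosib_tail_means` and the parameters `denovo_freq` and `denovo_effect` are hypothetical.

```python
import math
import random


def cosib_tail_means(n_pairs=100_000, h2=0.5, denovo_freq=0.0,
                     denovo_effect=3.0, tail=0.01, seed=1):
    """Compare observed and polygenic-expected co-sibling means in the upper tail.

    Assumptions (illustrative only): standardised traits, sibling correlation
    h2 / 2 under the additive polygenic null, and de novo alleles that shift
    the index sibling only.
    Returns (observed_mean, expected_mean) for co-siblings of tail index siblings.
    """
    rng = random.Random(seed)
    r = h2 / 2.0
    pairs = []
    for _ in range(n_pairs):
        s1 = rng.gauss(0.0, 1.0)
        # Co-sibling drawn from the conditional distribution given s1.
        s2 = r * s1 + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
        if rng.random() < denovo_freq:
            s1 += denovo_effect  # unshared large-effect allele in the index sibling
        pairs.append((s1, s2))
    # Index siblings in the top `tail` fraction of the trait distribution.
    pairs.sort(key=lambda p: p[0], reverse=True)
    top = pairs[: max(1, int(tail * n_pairs))]
    observed = sum(s2 for _, s2 in top) / len(top)
    expected = r * sum(s1 for s1, _ in top) / len(top)
    return observed, expected
```

      With `denovo_freq=0.0` the observed and expected co-sibling means should agree up to sampling noise. With a small positive `denovo_freq`, many tail index siblings owe their extreme value to an unshared allele, so their co-siblings regress further towards the mean than the polygenic expectation. This is a shift in the co-sibling mean, which is a stronger signal than co-siblings simply being less likely to fall in the tail.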

      Exactly how the rareness and penetrance of a genetic variant influence the conditional sibling phenotype distribution at the extremes is not made clear. The contrast between de-novo and 'Mendelian' architectures is somewhat odd since these are highly related phenomena: a 'Mendelian' architecture could be due to a de-novo variant of the previous generation. The fact that these two phenomena are surmised to give opposing signatures in the authors' statistical tests seems suboptimal to me: would it not be better to specify a parameter that characterizes the degree of sharing between siblings of rare factors of large effect? This could be related to the mixture components in the bimodal distribution displayed in Fig 1. In fact, won't the extremes of all phenotypes be influenced by all three types of variants (common, rare, de-novo) to greater or lesser degree? By framing the problem as a hypothesis testing problem, I think the authors are obscuring the fact that the extremes of real phenotypes likely reflect a mixture of causes: common, de-novo, and rare variants (and shared and non-shared environmental factors).

      A.5: We absolutely recognise that there will typically be a complex and continuous mix of genetic architectures underlying complex traits in their tails, dictated by the 2-dimensional relationship between allele frequency and effect size. We did consider developing a fully Bayesian statistical framework to model this, but soon realised that doing this properly would require a substantial amount of model development, accounting for multiple factors in ways that would require a great deal of further investigation; for example, performing a range of complex simulations to investigate the effects of different selective pressures over time, different patterns of assortative mating, and effect size generating distributions. We are in the process of applying for funding for a multi-year project that will perform exactly these investigations as a step towards developing more sophisticated models of inference. In the meantime, we do believe that the simpler hypothesis-testing framework that we have developed here does have important value. Assuming that environmental factors are accounted for, or that any that are not accounted for have combined Gaussian effects, then our tests will indeed infer enrichments of de novo and ‘Mendelian’ rare alleles of large effect in the tails of complex traits. Results from these tests can also be compared within and across traits to compare the relative degree of such enrichments among traits. For some traits we observe significant results from both tests, and for other traits we observe highly significant results from one of our tests but not the other. Thus, while our tests do not provide a complete picture about the genetic architecture in the tails of complex traits, they do offer some intriguing initial insights into tail architecture, important given the enrichment of disease in trait tails.

      To better enable interpretation of the results of this method, a more comprehensive set of simulations is needed. Factors that may influence the conditional distribution of siblings' phenotypes beyond those considered include: non-normal distribution, assortative mating, shared environment, interactions between genetic and shared environmental factors, and genetic interactions. 

      A.6: As described above (see A.5) we do agree that a more comprehensive set of simulations is exactly what is needed to further extend this work. However, we believe that the tests that we have developed so far, which make some simplifying assumptions that we think would often hold in practice, are a useful start to what is an entirely novel approach to inferring genetic architecture from family trait-only (non-genetic) data. Our work could already be useful for method developers who may wish to extend our approach in ways that we may not think of. It could also be useful for applied scientists focusing on specific traits who will be able to gain initial, inference-level insights by applying our tests to their data, while the results of applying our tests may even guide study design of rare variant mapping studies.

      In summary, I think this is a promising method that is revealing something interesting about extreme values of phenotypes. Determining exactly what is being revealed is going to take a lot more work, however. 

      A.7: We thank the reviewer for highlighting the promise in our approach and agree that it is revealing something interesting about complex traits. We also agree that it is going to take a lot more work to reveal exactly what that is for different traits, which we plan to work on ourselves and hope that this paper will help other interested scientists to follow-up on and extend as well.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      R.1.1: Why these particular traits (body fat, mean corpuscular haemoglobin, neuroticism, heel bone mineral density, monocyte count, sitting height)? 

      A.8: Traits were initially selected to cover a variety of traits (anthropometric, metabolic, personality, etc.) and to illustrate different examples of tail architecture. However, in response to a point from reviewer 2 (see A.17), we have now overhauled our quality control of traits to ensure that only traits closely matching Gaussian distributions are included. In total, 18 traits were selected, with detailed results presented in Appendix 4 and results corresponding to 6 of the traits presented in the main text (Figure 6) to show examples of different types of tail architecture.

      R.1.2: Why are there separate tests for de novo and Mendelian architectures? It seems that one could use either of the derived tests for both purposes, simply by switching to a two-sided test for each tail. My guess is that the score test of whether alpha is zero would be the more statistically powerful test. 

      A.9: The score test of whether alpha is zero has limited power to detect Mendelian architectures. This is because under Mendelian effects, half the siblings in a family have trait values reflecting the background distribution, such that the mean of sibling trait values is not so different from the polygenic expectation (i.e. alpha close to 0). The Mendelian score test that we developed is substantially more powerful because it evaluates co-occurrence of siblings in the tails, which is far higher under Mendelian architecture in the tail than under polygenic architecture.

      However, in order to test for general departures from our null model, including those due to non-Gaussian environmental factors, we now include results from performing a Kolmogorov-Smirnov test of difference from the expected distribution, and also provide this test as an option in our ‘SibArc’ software tool.
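
      As a rough illustration of how a Kolmogorov-Smirnov check against the null could look, the sketch below standardises co-sibling values by their conditional expectation under the polygenic null and compares the residuals with a standard normal distribution. It is an assumption-laden sketch and not necessarily how SibArc implements the test: the helper names `ks_one_sample` and `cosib_residuals` are hypothetical, traits are assumed standardised with sibling correlation h2 / 2, and the P-value uses the asymptotic Kolmogorov distribution.

```python
import math
from statistics import NormalDist


def ks_one_sample(values, dist=NormalDist(0.0, 1.0)):
    """One-sample Kolmogorov-Smirnov statistic and asymptotic P-value."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = dist.cdf(x)
        # Largest gap between the empirical and reference CDFs at each jump.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The survival function is numerically 1 here and the series
        # below converges poorly, so return 1 directly.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


def cosib_residuals(pairs, h2):
    """Standardised co-sibling residuals under the polygenic null.

    Assumes standardised traits and sibling correlation h2 / 2, so that
    s2 | s1 has mean (h2 / 2) * s1 and variance 1 - (h2 / 2) ** 2.
    """
    r = h2 / 2.0
    scale = math.sqrt(1.0 - r * r)
    return [(s2 - r * s1) / scale for s1, s2 in pairs]
```

      A typical use would be `ks_one_sample(cosib_residuals(tail_pairs, h2))`, where `tail_pairs` holds (index sibling, co-sibling) values for index siblings in one tail. Unlike the two directional tests, this check is sensitive to any difference in shape from the expected distribution.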

      R.1.3: This method assumes that assortative mating is absent. I worry that sitting height might not be a good trait to analyze, since there is some assortative mating (~0.3) for height (e.g., Yengo et al., 2018). Perhaps this trait should not be included among those that are analyzed in this paper. Then again, it is possible that there is less assortative mating for sitting height than total height (i.e., leg length) (Jensen & Sinha, 1993). 

      A.10: It is true that our method assumes random mating. We note that while assortative mating increases sibling similarity relative to expectation, if it is stable across the trait distribution it will also bias heritability estimation upward, which is its likely impact in our framework. However, if assortative mating is more prevalent in the tails of the distribution, it can result in excess kurtosis, an impact that can increase false positive Mendelian tests and false negative de novo tests. Given that the trait distribution for Sitting Height has only moderate excess kurtosis (~0.4, see Fig 9, Appendix 4) and we inferred de novo architecture only for this trait, we feel that including it in the paper is appropriate.

      R.1.4: I wonder if it's possible to discuss the impact of non-additive genetic variance on the method. How does this affect the estimation of heritability, which calibrates the expectation for regression to the mean? Can non-additive genetic deviations explain a rejection of the null hypothesis of simple polygenicity? 

      A.11: Yes, the heritability estimation, which calibrates expectation for regression to the mean, assumes additivity of effects, as do the most popular estimators of heritability from GWAS data in the field: GCTA-GREML, LD Score regression and LDAK. Accordingly, non-additive genetic effects could result in rejection of the null hypothesis. We have highlighted this point in the Discussion. However, we also point out that current evidence suggests that the contribution of non-additive genetic effects to complex trait variation is relatively small (Hivert 2021) and that non-additive genetic effects that have a similar impact across the trait distribution should not be a problem for our approach (only those that have an increasing effect towards the tails would be).

      R.1.5: p.5: Maybe a more realistic way to simulate a genetic architecture is to draw the MAF from the distribution [MAF(1 - MAF)]^{-1} and then an effect of the minor allele from some mound-shaped distribution (e.g., mixture of normals). The absolute or squared effect of the minor allele should increase as the MAF decreases, and there have been some papers trying to estimate this relationship (e.g., Zeng et al., 2021). Maybe make the number of causal SNPs 10,000. I don't rate this as an urgent suggestion because my sense is that the method should be robust, making adequate even a fairly minimal simulation confirming its accuracy.

      A.11: In separate work, we have performed a comprehensive simulation study using the forward-in-time population genetic simulator SLiM 3 (Haller and Messer, 2019), which generates genetic effects according to Gaussian and Gamma distributions and models different selective pressures on complex traits. We plan to publish this work shortly and also extend the simulations to family data, from which we will be able to test the performance of our methods here under a range of different scenarios of genetic variation generation, including a variety of relationships between allele frequency and effect sizes. We agree with the reviewer that at this point, however, our minimal simulation should be sufficient to confirm our tests’ general robustness and so we will perform further testing once we have extended our more sophisticated simulation study.
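
      For readers who want to try the reviewer's suggested architecture in R.1.5, the following minimal sketch draws a minor allele frequency with density proportional to 1 / [MAF(1 - MAF)] by inverse-CDF sampling, and then a Gaussian effect whose variance grows as the MAF falls. It is not taken from the paper or from SLiM. The lower bound `maf_min`, the exponent `s` linking effect variance to heterozygosity, and the function names are assumptions made for illustration.

```python
import math
import random


def sample_maf(rng, maf_min=0.001):
    """Draw a MAF with density proportional to 1 / (x * (1 - x)) on [maf_min, 0.5).

    Inverse-CDF sampling: the antiderivative of 1 / (x * (1 - x)) is logit(x),
    so the draw is uniform on the logit scale.
    """
    lo = math.log(maf_min / (1.0 - maf_min))  # logit(maf_min); logit(0.5) = 0
    z = lo * (1.0 - rng.random())             # uniform on [lo, 0)
    return 1.0 / (1.0 + math.exp(-z))


def sample_variant(rng, maf_min=0.001, s=-0.5):
    """Draw (maf, effect) with effect variance [2 * maf * (1 - maf)] ** s.

    A negative `s` (hypothetical default) gives rarer alleles larger effects.
    """
    maf = sample_maf(rng, maf_min)
    sd = (2.0 * maf * (1.0 - maf)) ** (s / 2.0)
    return maf, rng.gauss(0.0, sd)
```

      Calling `sample_variant(random.Random(0))` 10,000 times would give a causal-variant set of the size the reviewer mentions. A mixture of normals could replace the single Gaussian without changing the MAF step.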

      R.1.6: p.6: Step D seems to leave out a normalization of G to have unit variance. Also, the last part should say "the square of the correlation between the genetic liability and the trait is equal to the heritability." 

      A.12: Corrected – we thank the reviewer for spotting this.
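
      The point raised in R.1.6 can be checked numerically. Step D of the paper is not reproduced in this exchange, so the sketch below only illustrates the general statement, under the assumption that the trait is built as sqrt(h2) * G + sqrt(1 - h2) * E with independent standard normal E. The names `build_trait` and `squared_correlation` are hypothetical.

```python
import math
import random
from statistics import fmean, pstdev


def build_trait(raw_g, h2, rng):
    """Standardise genetic liability, then add Gaussian environmental noise.

    With G scaled to unit variance, trait = sqrt(h2) * G + sqrt(1 - h2) * E has
    unit variance and corr(G, trait) ** 2 equal to h2 in expectation.
    """
    mu, sd = fmean(raw_g), pstdev(raw_g)
    g = [(x - mu) / sd for x in raw_g]
    trait = [math.sqrt(h2) * gi + math.sqrt(1.0 - h2) * rng.gauss(0.0, 1.0)
             for gi in g]
    return g, trait


def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov * cov / (pstdev(x) ** 2 * pstdev(y) ** 2)
```

      If the normalisation step is skipped and the raw liability has a variance other than one, the squared correlation between liability and trait no longer equals the intended heritability.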

      R.1.7: Figure 5: The power being adequate if roughly 1 of a 1000 index siblings with an extreme trait value owes their values to de novo mutations makes me think that there should be a discussion of the prior probability. The average person carries about 80 de novo mutations. How many of these are likely to affect, e.g., height? Zeng et al. (2021) gave estimates of mutational targets. Given that a mutation affects height, will its likely effect size be large enough to be detected with the method? Kemper et al. (2012) discussed this point in a perhaps useful way. 

      A.13: We find the work investigating mutational target sizes and generating effect sizes of different mutations (de novo or rare) to be extremely interesting and critical for understanding the causes of observed genetic variation. However, we think that this work is insufficiently progressed at this point to build on directly here for making more nuanced interpretation of our results. We are, however, exploring the impact of mutational target sizes, effect size distributions and selection effects, on the genetic architecture of complex traits via population genetic simulations (see A.11), and so we hope to be able to provide more in-depth interpretation of our results in the future.

      R.1.8: Figure 6: The number in the tables for Mendelian architecture are presumably observed and expected counts. But what about the numbers for de novo architecture? Those don't look like counts. Maybe they are conditional expectations of standardized trait values. Whatever the case may be, the caption should provide an explanation. 

      A.14: The observed and expected values for the de novo statistical test represent the expected and observed mean standardized trait values for siblings of individuals in the bottom and top 1% of the distribution. We have now made this clear in our updated figure.

      R.1.9: p. 16: Element (2,1) in the precision matrix after Equation 15 is missing a negative sign. 

      A.15: Corrected – we thank the reviewer for spotting this.

      R.1.10: p. 20: Shouldn't Equation 20 place an exponent of n on the factor outside of the exponential? 

      A.16: Corrected – we thank the reviewer for spotting this.

      Reviewer #2 (Recommendations For The Authors):

      R.2.1: The first concern that I have is that their statistical tests rely heavily on an assumption of bivariate normal distribution for sibling pairs' phenotypes. Real phenotypes do not have such a distribution in general. The authors rely upon an inverse-normal transform when applying their method to real data. While the inverse-normal transform will ensure that the siblings' phenotypes have a marginal normal distribution, such a transform does not ensure that the joint distribution is bivariate normal. The authors should examine their procedure for simulated phenotypes with a non-normal distribution to see if their statistical tests remain properly calibrated. Related to this, I am concerned about applying an inverse normal transform to the neuroticism phenotype that contains only 13 unique values in UKB. How does the transform deal with tied values? Can we sensibly talk about extreme trait values for such a set of observations?

      A.17: The reviewer is correct that a bivariate normal distribution for sibling pairs’ trait values does not necessarily hold, and only does so if the assumptions of our null model are met (polygenic effects, Gaussian environmental effects, random mating, etc.). We have now more clearly described the assumptions of our null model, and to increase the matching of our selected traits to those assumptions we have expanded our analyses and now present results on traits that are close to Gaussian. As part of this stricter quality control, only traits with more than 50 unique values are included, meaning that neuroticism is excluded in our final analysis. We also now note that performing an inverse normal transformation on the traits only increases the robustness of the tests to some of our modelling assumptions. In future work we plan to investigate how best to model the conditional sibling distribution under a variety of non-Gaussian environmental effects and different non-random patterns of mating.
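
      The reviewer's question about tied values can be made concrete with a small sketch of a rank-based inverse normal transform. The response does not state how ties were handled, so the choice below (average ranks for ties, and a rank offset of 0.5) is an assumption, and `inverse_normal_transform` is a hypothetical name.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transform using average ranks for ties.

    Tied observations share one rank and so map to the same normal score;
    a trait with few unique values therefore stays discrete after transformation.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2.0 + 1.0  # 1-based average rank of the block
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]
```

      Under this scheme a trait with 13 unique values still has only 13 distinct scores after transformation, so the "top 1%" cannot be separated from the rest of the highest category. This is consistent with the decision to require more than 50 unique values.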

      R.2.2: The joint sibling phenotype distribution (Equation 4) can be derived by applying the formula for the conditional distribution of a multivariate Gaussian to the standard additive genetic model. The authors' derivation is unnecessarily complex. Furthermore, many of the formulae have been used in Shai Carmi's work on embryo screening, but this work is not cited. 

      A.18: We now state in the text that the conditional sibling distribution can also be derived from the joint trait distribution of related individuals, which we use in our extension to the 3-sibling scenario, and cite Shai Carmi’s work where this is used. The joint distribution is a more straightforward way to derive the conditional sibling distribution, but our derivation based on considering mid-parents is generalisable to cases where assumptions of random mating, Gaussian population trait distribution and no selection do not hold. We also think that our mid-parent based derivation will be more intuitive to many readers, leading to greater understanding and potential for extension. Therefore, overall we believe that its presentation is worthwhile and we have now elaborated on this in the Methods.
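
      For reference, the shorter route described by the reviewer can be written out as follows. This is a sketch under the standard assumptions (standardised traits, additive effects, random mating, no shared environment), in which the sibling correlation is h^2/2. Equation 4 of the paper is not reproduced in this exchange, so agreement with it is assumed here and has not been checked.

```latex
\begin{pmatrix} s_1 \\ s_2 \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} 1 & h^2/2 \\ h^2/2 & 1 \end{pmatrix}
\right)
\quad\Longrightarrow\quad
s_2 \mid s_1 \;\sim\; \mathcal{N}\!\left( \tfrac{h^2}{2}\, s_1,\; 1 - \tfrac{h^4}{4} \right)
```

      The conditional mean and variance follow from the usual formula for a bivariate Gaussian: the mean is the covariance times s_1, and the variance is one minus the squared covariance.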

      R.2.3: Equation 8: this probability should be conditional on s1 

      A.19: Corrected – we thank the reviewer for spotting this.

      R.2.4: The empirical application to UKB data is lacking methodological details. Also, the number of siblings used is low compared to the number of available sibling pairs. Around 19k sibling pairs are available in the UKB white British subsample, but only 10k were used for height. Why? Also, why are extreme values excluded? Isn't this removing the signal the authors are looking to explain?

      A.20: We have now provided more methodological details throughout the Methods section, in particular in relation to the samples used and quality control performed. We remove individuals with extreme values because unusually low or high trait values are more likely than typical values to be due to measurement error (e.g. an imperfect measuring device, or storage/assaying problems). While this may result in some loss of power (albeit small, since few individuals have values more than 8 s.d. from the trait mean), we consider it worthwhile for the potential reduction in type I error. In performing our newly expanded analysis (described above), and in following up the reviewer’s point about sample size, we found a bug in our pipeline that meant we had not included as many sibling pairs as were available. We thank the reviewer for raising this, since fixing it contributed to our new analysis being substantially more powerful than the original (including up to ~17k sibling pairs depending on completeness of trait data).

      Benjamin C Haller, Phillip W Messer. SLiM 3: Forward Genetic Simulations Beyond the Wright–Fisher Model. Molecular Biology and Evolution. 2019. 36(3): 632-637.

      SD Whiteman, SM McHale, A Soli. Theoretical Perspectives on Sibling Relationships. J Fam Theory Rev. 2011 Jun 1;3(2):124-139.

      Nicholas H Barton, Alison M Etheridge, and Amandine Véber. The infinitesimal model: Definition, derivation, and implications. Theoretical population biology, 118:50–73, 2017.

      Valentin Hivert et al. “Estimation of non-additive genetic variance in human complex traits from a large sample of unrelated individuals.” American journal of human genetics vol. 108,5 (2021)

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors demonstrate that it is possible to carry out eQTL experiments for the model eukaryote S. cerevisiae, in "one pot" preparations, by using single-cell sequencing technologies to simultaneously genotype and measure expression. This is a very appealing approach for investigators studying genetic variation in single-celled and other microbial systems, and will likely inspire similar approaches in non-microbial systems where comparable cell mixtures of genetically heterogeneous individuals could be achieved.

      Strengths:

      While eQTL experiments have been done for nearly two decades (the corresponding author's lab are pioneers in this field), this single-cell approach creates the possibility for new insights about cell biology that would be extremely challenging to infer using bulk sequencing approaches. The major motivating application shown here is to discover cell occupancy QTL, i.e. loci where genetic variation contributes to differences in the relative occupancy of different cell cycle stages. The authors dissect and validate one such cell cycle occupancy QTL, involving the gene GPA1, a G-protein subunit that plays a role in regulating the mating response MAPK pathway. They show that variation at GPA1 is associated with proportional differences in the fraction of cells in the G1 stage of the cell cycle. Furthermore, they show that this bias is associated with differences in mating efficiency.

      Weaknesses:

      While the experimental validation of the role of GPA1 variation is well done, the novel cell cycle occupancy QTL aspect of the study is somewhat underexploited. The cell occupancy QTLs that are mentioned all involve loci that the authors have identified in prior studies that involved the same yeast crosses used here. It would be interesting to know what new insights, besides the "usual suspects", the analysis reveals. For example, in Cross B there is another large effect cell occupancy QTL on Chr XI that affects the G1/S stage. What candidate genes and alleles are at this locus? And since cell cycle stages are not biologically independent (a delay in G1 could have a knock-on effect on the frequency of cells with that genotype in G1/S), it would seem important to consider the set of QTLs in concert.

      We thank the reviewer for this suggested clarification. We have modified the text to make it clear that cell cycle occupancy is a compositional phenotype. Like the reviewer, we also noticed the distal trans eQTL hotspot on Chr XI in Cross B, but we were not able to identify compelling candidate gene(s) or variant(s) despite extensive effort.

      Reviewer #2 (Public Review):

      Boocock and colleagues present an approach whereby eQTL analysis can be carried out by scRNA-Seq alone, in a one-pot-shot experiment, due to genotypes being able to be inferred from SNPs identified in RNA-Seq reads. This approach obviates the need to isolate individual spores, genotype them separately by low-coverage sequencing, and then perform RNA-Seq on each spore separately. This is a substantial advance and opens up the possibility to straightforwardly identify eQTLs over many conditions in a cost-efficient manner. Overall, I found the paper to be well-written and well-motivated, and have no issues with either the methodological/analytical approach (though eQTL analysis is not my expertise), or with the manuscript's conclusions.

      I do have several questions/comments.

      393 segregant experiment:

      For the experiment with the 393 previously genotyped segregants, did the authors examine whether averaging the expression by genotype for single cells gave expression profiles similar to the bulk RNA-Seq data generated from those genotypes? Also, is it possible (and maybe not, due to the asynchronous nature of the cell culture) to use the expression data to aid in genotyping for those cells whose genotypes are ambiguous? I presume it might be if one has a sufficient number of cells for each genotype, though, for the subsequent one-pot experiments, this is a moot point.

      As mentioned in our preliminary response, while it is possible to expand the analysis along these lines, this is not relevant for the subsequent one-pot experiments. We have made all the data available so that anyone interested can try these analyses.

      Figure 1B:

      Is UMAP necessary to observe an ellipse/circle - I wouldn't be surprised if a simple PCA would have sufficed, and given the current discussion about whether UMAP is ever appropriate for interpreting scRNA-Seq (or ancestry) data, it seems the PCA would be a preferable approach. I would expect that the periodic elements are contained in 2 of the first 3 principal components. Also, it would be nice if there were a supplementary figure similar to Figure 4 of Macosko et al (PMID 26000488) to indeed show the cell cycle dependent expression.

      We have added two new figures (S2 and S3) that represent alternative visualizations of the cell-cycle that are not dependent on UMAP. Figure S2 shows plots of different pairs of principal components, with each cell colored by its assigned cell-cycle stage. We do not observe a periodic pattern in the first 3 principal components as the reviewer expected, but when we explore the first 6 principal components, we see combinations of components that clearly separate the cell cycle clusters. We emphasize that the clusters were generated using the Louvain algorithm and assigned to cell-cycle stages using marker genes, and that UMAP was used only for visualization.

      We could not create a figure similar to Macosko et al. because of differences between the cell cycle categories we used and those of Spellman et al (PMID 9843569). We instead created Figure S3 to address the reviewer's comment. This figure uses a heatmap in a style similar to that of Macosko et al. to display cell-cycle-dependent expression of the 22 genes we used as cell cycle markers across each of the five cell cycle stages (M/G1, G1, G1/S, S, G2/M).

      We have renumbered the supplementary figures after incorporating these two additional supplementary figures into the manuscript.

      Aging, growth rate, and bet-hedging:

      The mention of bet-hedging reminded me of Levy et al (PMID 22589700), where they saw that Tsl1 expression changed as cells aged and that this impacted a cell's ability to survive heat stress. This bet-hedging strategy meant that the older, slower-growing cells were more likely to survive, so I wondered a couple of things. Is it possible from single-cell data to identify either an aging or a growth rate signature? A number of papers from David Botstein's group culminated in a paper that showed that they could use a gene expression signature to predict instantaneous growth rate (PMID 19119411) and I wondered a) if this is possible from single-cell data, and b) whether in the slower growing cells, they see markers of aging, whether these two signatures might impact the ability to detect eQTLs, and if they are detected, whether they could in some way be accounted for to improve detection.

      As mentioned in our preliminary response, we are not sure how to look for gene expression signatures of aging in yeast scRNA-seq data. We believe that the proposed analyses are beyond the scope of the current paper. As noted above, we have made all the data available so that anyone interested can explore these hypotheses.

      AIL vs. F2 segregants:

      I'm curious if the authors have given thought to the trade-offs of developing advanced intercross lines for scRNA-Seq eQTL analysis. My impression is that AIL provides better mapping resolution, but at the expense of having to generate the lines. It might be useful to see some discussion on that.

      We thank the reviewer for the comments. We believe that a discussion of trade-offs between different approaches for constructing mapping populations, such as AIL and F2 segregants, is beyond the scope of this paper.

      10x vs. SPLiT-Seq:

      10x is a well established, but fairly expensive approach for scRNA-Seq - I wondered how the cost of the 10x approach compares to the previously used approach of genotyping segregants and performing bulk RNA-Seq, and how those costs would change if one used SPLiT-Seq (see PMID 38282330).

      We thank the reviewer for the comments. We believe that a discussion of cost trade-offs between 10x and other approaches is beyond the scope of this paper, especially given the rapidly evolving costs of different technologies.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Throughout the results section the authors point to File S1 for additional information. This file is a tarball with about 20 Excel documents in it, each with several sheets embedded. The authors should provide a detailed README describing how to understand the organizations of the files in File S1 and the many embedded sheets in each file. Statements made in the manuscript about File S1 should explicitly direct the reader to a specific spreadsheet and table to refer to.

      We have added a README file to the tarball that explains the organization of File S1 and describes the data contained in each sheet. Throughout the text, we now reference specific spreadsheets to assist the reader. In addition, these spreadsheets have been added to a GitHub repository: https://github.com/theboocock/finemapping_spreadsheets_single_cell

      Neither of the two GitHub repositories referenced under "Code availability" has adequate documentation that would allow a reader to try and reproduce the analyses presented here. The one entitled https://github.com/joshsbloom/single_cell_eQTL has no functional README, while https://github.com/theboocock/yeast_single_cell_post_analysis is somewhat better but still hard to navigate. Basic information on expected inputs, file formats, file organization, output types, and formats, etc. is required to get any of these pipelines to run and should be provided at a minimum.

      We thank the reviewer for the comment. In response, we have refactored both GitHub repositories and added extensive documentation to improve usability. We have also updated the versions of software and packages, and this is reflected in the Methods section.

      S. cerevisiae strains are preferentially diploid in nature and many genes involved in the mating pathway are differentially regulated in diploids vs haploids. Have the authors explored the fitness effects of the GPA1 82R allele in diploids? What is the dominance relationship between 82W and 82R?

      We thank the reviewer for the comment. In diploid yeast, the mating pathway is repressed, and thus we would not expect there to be any fitness consequences due to the presence of different alleles of GPA1.

      The diploid expression profiling (page 5 and Table S9) doesn't implicate GPA1; can the authors comment on this in light of their finding in haploids?

      The mating pathway, including GPA1, is repressed in diploids, and hence the expression of GPA1 cannot be studied in these strains (PMID: 3113739). In addition, allele-specific expression differences only identify cis-regulatory effects. We know that the GPA1 variant results in a protein-coding change, which may or may not influence the levels of mRNA in cis, so that even if GPA1 were expressed in diploids, there would be no expectation of an allele-specific difference in expression.

      With respect to the candidate CYR1 QTL -- note that strains with compromised Cyr1 function also generally show increased sporulation rates and/or sporulation in rich media conditions (cAMP-PKA signaling represses sporulation). Is this the case in diploids with the CBS2888 allele at CYR1? If the CBS2888 allele is a CYR1 defect one might expect reduced cAMP levels. It is possible to estimate adenylate cyclase levels using a fairly straightforward ELISA assay. This would provide more convincing evidence of the causal mechanism of the alleles identified.

      We thank the reviewer for the comment, and we agree that a functional study of the CYR1 alleles would provide more convincing evidence for the causal mechanism of the connection between cell cycle occupancy, cAMP levels, and growth. However, we believe that the proposed experiments are beyond the scope of our current study. The evidence we provide is sufficient to establish that CYR1 is a strong candidate gene for the eQTL hotspot.

      Re: CYR1 candidate QTL -- The authors should reference the work of Patrick Van Dijck (https://pubmed.ncbi.nlm.nih.gov/?sort=date&term=Van+Dijck+P&cauthor_id=20924200) and Johan M Thevelein (https://pubmed.ncbi.nlm.nih.gov/?sort=date&term=Thevelein+JM&cauthor_id=20924200) on CYR1 allelic variation, and other papers besides the Matsumoto/Ishikawa papers, as the effects of cAMP-PKA signaling on stress can be quite variable. cAMP pathway variants, including in CYR1, have popped up in quite a few other yeast QTL mapping and experimental evolution papers. These should be referenced as well.

      We thank the reviewer for these references; we have added a comment about the relationship between stress tolerance and CYR1 variation, and cited the relevant references accordingly.

      Figure S10 - the subfigure showing the frequency of the GPA1 82R allele compared to 82W suggests a fairly large and deleterious fitness effect of this allele; on the order of 7-8% fewer cells per cell cycle stage than the 82W allele. Can the authors reconcile this with the more modest growth rate effect they report on page 8?

      Figure S12C displays the allele frequency of the 82R allele across the cell cycle in the single-cell data from allele-replacement strains. These strains were grown separately and processed using two individual 10x chromium runs. The resulting sequenced library had 11,695 cells with the 82R allele and 14,894 cells with the 82W allele. The 7-8% difference in the number of cells is due to slight differences in the number of captured cells per run, not due to growth differences, because we attempted to pool cells in equal numbers from separate mid-log cultures.

      The proportion of cells in G1 increases by ~3% in strains with the 82R allele relative to the baseline proportion of cells in the experiment, which, to the reviewer's point, is still larger than the ~1% growth difference we observed. Cell cycle occupancy is a compositional phenotype. As shown in Figure S12C, the 82R variant increases the fraction of cells in G1 and slightly decreases the fraction of cells in M/G1. There is no obvious expectation for quantitatively translating a change in cell cycle occupancy to a change in growth rate.
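
      To make the baseline comparison concrete, here is a minimal sketch (not the authors' analysis code) that computes the overall 82R fraction from the two cell counts quoted above, plus a hypothetical helper for expressing a stage-specific allele frequency as a shift from that baseline. Per-stage counts are not given in this response, so none are filled in; the function names are assumptions.

```python
# Cell counts quoted in the response for the two allele-replacement runs.
CELLS_82R = 11_695
CELLS_82W = 14_894


def allele_fraction(n_allele, n_other):
    """Fraction of cells carrying the allele of interest."""
    return n_allele / (n_allele + n_other)


def shift_from_baseline(stage_allele, stage_other, baseline):
    """Stage-specific allele fraction minus the experiment-wide baseline.

    Hypothetical helper: one reading of 'relative to the baseline proportion'.
    """
    return allele_fraction(stage_allele, stage_other) - baseline


baseline_82r = allele_fraction(CELLS_82R, CELLS_82W)
print(f"baseline 82R fraction: {baseline_82r:.3f}")
```

      Because the two runs captured unequal numbers of cells, the baseline is about 0.44 and not 0.5, so a per-stage allele frequency is informative only as a deviation from that value.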

      The authors refer to the Lang et al. 2009 paper w/respect to GPA1 variant S469I but that paper seems to have explored a different GPA1 allele, GPA1-G1406T, with respect to growth rates.

      We thank the reviewer for their comment. The S469I variant is the same as the G1406T variant, one denoting the amino acid change at position 469 in the protein and the other denoting the corresponding nucleotide change at position 1406 in the DNA coding sequence. We have altered the text to make this clear to the reader.
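
      The correspondence between the two names can be checked with simple codon arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the actual serine codon at position 469 of GPA1 is not stated here, so both serine codons that carry a G in the second position are shown.

```python
def codon_position(nt_pos):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    codon_number = (nt_pos - 1) // 3 + 1
    position_in_codon = (nt_pos - 1) % 3 + 1
    return codon_number, position_in_codon


def substitute(codon, position_in_codon, new_base):
    """Return the codon with the base at the given 1-based position replaced."""
    i = position_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


# Minimal slice of the standard genetic code, only the entries needed here.
CODON_TO_AA = {"AGT": "S", "AGC": "S", "ATT": "I", "ATC": "I"}

codon_number, position_in_codon = codon_position(1406)
print(f"nucleotide 1406 -> codon {codon_number}, position {position_in_codon}")

for ref in ("AGT", "AGC"):  # the serine codons with G at the second position
    alt = substitute(ref, position_in_codon, "T")
    print(f"{ref} ({CODON_TO_AA[ref]}) -> {alt} ({CODON_TO_AA[alt]})")
```

      Nucleotide 1406 is the second base of codon 469, and a G-to-T change there converts either serine codon into an isoleucine codon, which is consistent with G1406T and S469I naming the same variant.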

      Reviewer #2 (Recommendations For The Authors):

      I make no recommendations as to additional work for the authors. The manuscript is complete. I suggested some things I would like to see in my review, but it's up to them to decide whether they think any of those would further enhance the manuscript.

      However, I do have some pedantic formatting notes:

      - Microliters are variously presented as uL, ul, and µl - it should be µL

      - Similarly, milliliters are presented as ml and ML - it should be mL

      - Also, there should be a space between the number and the unit, e.g. 10 µL

      - Some gene names in the manuscript are not italicized in all instances, e.g., GPA1
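
      The unit conventions in the first three notes could be applied mechanically. The following is a rough sketch under stated assumptions, not a tool used by the reviewer or the authors: the function name is hypothetical, it only touches unit strings that directly follow a digit, and it assumes that "ML" in this manuscript always means millilitres.

```python
import re

# Each pattern requires a preceding digit so that ordinary words are left alone.
_UNIT_FIXES = [
    (re.compile(r"(\d)\s*(?:uL|ul|µl|µL)\b"), r"\1 µL"),  # microlitre variants
    (re.compile(r"(\d)\s*(?:ml|ML|mL)\b"), r"\1 mL"),     # millilitre variants
]


def normalize_volume_units(text):
    """Rewrite volume units as 'µL' / 'mL' with a single space after the number."""
    for pattern, replacement in _UNIT_FIXES:
        text = pattern.sub(replacement, text)
    return text


print(normalize_volume_units("Add 10ul buffer to 5 ML culture, then 2µl enzyme."))
```

      Gene-name italics (the last note) depend on markup and context, so they are better checked by hand.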

      We thank the reviewer for these formatting suggestions; we have made these changes throughout the text.

    2. eLife assessment

      This manuscript describes the mapping of natural DNA sequence variants that affect gene expression and its noise, as well as cell cycle timing, using as input single-cell RNA-sequencing of progeny from crosses between wild yeast strains. The method represents an important advance in the study of natural genetic variation. The findings, especially given the follow-up validation of the phenotypic impact of a mapped locus of major effect, provide convincing support for the rigor and utility of the method.

    3. Reviewer #1 (Public Review):

      The authors demonstrate that it is possible to carry out eQTL experiments for the model eukaryote S. cerevisiae, in "one pot" preparations, by using single-cell sequencing technologies to simultaneously genotype and measure expression. This is a very appealing approach for investigators studying genetic variation in single-celled and other microbial systems, and will likely inspire similar approaches in non-microbial systems where comparable cell mixtures of genetically heterogeneous individuals could be achieved.

      While eQTL experiments have been done for nearly two decades (the corresponding author's lab are pioneers in this field), this single-cell approach creates the possibility for new insights about cell biology that would be extremely challenging to infer using bulk sequencing approaches. The major motivating application shown here is to discover cell occupancy QTL, i.e. loci where genetic variation contributes to differences in the relative occupancy of different cell cycle stages. The authors dissect and validate one such cell cycle occupancy QTL, involving the gene GPA1, a G-protein subunit that plays a role in regulating the mating response MAPK pathway. They show that variation at GPA1 is associated with proportional differences in the fraction of cells in the G1 stage of the cell cycle. Furthermore, they show that this bias is associated with differences in mating efficiency.

    4. Reviewer #2 (Public Review):

      Boocock and colleagues present an approach whereby eQTL analysis can be carried out by scRNA-Seq alone, in a one-pot-shot experiment, due to genotypes being able to be inferred from SNPs identified in RNA-Seq reads. This approach obviates the need to isolate individual spores, genotype them separately by low-coverage sequencing, and then perform RNA-Seq on each spore separately. This is a substantial advance and opens up the possibility to straightforwardly identify eQTLs over many conditions in a cost-efficient manner. Overall, I found the paper to be well-written and well-motivated, and have no issues with either the methodological/analytical approach (though eQTL analysis is not my expertise), or with the manuscript's conclusions.

    1. eLife assessment

      This important study investigates neurobiological mechanisms underlying the maintenance of stable, functionally appropriate rhythmic motor patterns during changing environmental conditions - temperature in this study in the crab Cancer borealis stomatogastric central neural pattern generating circuits producing the rhythmic pyloric motor pattern, which is naturally subjected to temperature perturbations over a substantial range. The authors present compelling evidence that the neuronal hyperpolarization-activated inward current (Ih), known to contribute to rhythm control, plays a vital role in the ability of these circuits to appropriately adjust the frequency of rhythmic neural activity in a smooth monotonic fashion while maintaining the relative timing of different phases of the activity pattern that determines proper functional motor coordination transiently and persistently to temperature perturbations. This study will be of interest to neurobiologists studying rhythmic motor circuits and systems and their physiological adaptations.

    2. Reviewer #1 (Public review):

      Summary:

      This important study investigates the neurobiological mechanisms underlying the stable operation and maintenance of functionally appropriate rhythmic motor patterns during changing environmental conditions - temperature in this study in the crab Cancer borealis stomatogastric neural pattern generating network producing the pyloric motor rhythm, which is naturally subjected to temperature perturbations over a substantial range. This study is relevant to the general problem that some rhythmic motor systems adjust to changing environmental conditions and state changes by increasing the cycle frequency in a smooth monotonic fashion while maintaining the relative timing of different network activity pattern phases that determine proper motor coordination. How this is achieved mechanistically in complex dynamic motor networks is not understood, particularly how the frequency and phase adjustments are achieved as conditions change while avoiding operational instabilities on different time scales. The authors specifically studied the contributions of the hyperpolarization-activated inward current (Ih), which is involved in rhythm control, to the adjustments of frequency and phases in the pyloric rhythmic pattern as the temperature was altered from 11 degrees C to 21 degrees C. They present compelling evidence that this current is a critical biophysical feature in the ability of this system to adjust transiently and persistently to temperature perturbations appropriately. After blocking Ih in the pyloric network with cesium, the network was unable to reliably produce its characteristic rapid and smooth increase in the frequency of the triphasic rhythmic motor pattern in response to increasing temperature or its typical steady-state increase in frequency over this Q10 temperature range.
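
      For readers unfamiliar with the term, Q10 is the factor by which a rate changes for a 10 °C increase in temperature, Q10 = (R2/R1)^(10/(T2 - T1)). A minimal sketch follows; the function name is hypothetical and the frequencies in the example are illustrative values, not measurements from the study.

```python
def q10(rate_low, rate_high, temp_low, temp_high):
    """Temperature coefficient: fold change in rate per 10 degree C increase."""
    return (rate_high / rate_low) ** (10.0 / (temp_high - temp_low))


# Illustrative (hypothetical) values: a rhythm that doubles in frequency
# between 11 and 21 degrees C has a Q10 of 2.
print(q10(rate_low=1.0, rate_high=2.0, temp_low=11.0, temp_high=21.0))
```

      Because the exponent rescales to a 10 °C interval, the same doubling over only 5 °C would correspond to a Q10 of 4.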

      Strengths:

      (1) The authors addressed this problem by technically rigorous experiments in the crab Cancer borealis stomatogastric ganglion (STG) in vitro, which readily allows for neuronal activity recording in a behaviorally and architecturally defined rhythmic neural circuit in conjunction with the application of blockers of Ih and synaptic receptors to disrupt circuit interactions. This approach is an effective way to experimentally investigate how complex rhythmic networks, at least in poikilotherms, mechanistically adjust to environmental perturbations such as temperature.

      (2) While previous work demonstrated that Ih increases in pyloric neurons as temperature increases, the authors here establish that this increase is necessary for normal responses of STG neural activity to temperature, which consist of a smooth monotonic increase in the frequency of rhythmic activity with increasing temperature.

      (3) The data show that blocking Ih with cesium causes the frequency to transiently decrease ("jags") when the temperature increases and then increase after the temperature stabilizes at a steady state, revealing a non-monotonic frequency response to temperature perturbations.

      (4) The authors dissect some of the underlying neuronal and circuit dynamics, presenting evidence that after blocking Ih, the non-monotonic jags in the frequency response are mediated by intrinsic properties of pacemaker neurons, while in the steady state, Ih determined the overall frequency change (i.e., temperature sensitivity) through network interactions.

      (5) The authors' results highlight more complex dynamic responses to increasing temperature for the first time, suggesting a longer timescale process than previously recognized that may result from interactions between multiple channels and/or ion channel kinetics.

      Weaknesses:

      (1) The involvement of Ih in achieving the frequency and phase adjustments as conditions change and allowing smooth transitions to avoid operational instabilities in other complex rhythmic motor networks, for example, in homeotherms, is not established, so the present results may have limited general extrapolations.

    3. Reviewer #2 (Public review):

      Summary:

      Using the crustacean stomatogastric nervous system (STNS), the authors present an interesting study wherein the contribution of the Ih current to temperature-induced changes in the frequency of a rhythmically active neural circuit is evaluated. Ih is a hyperpolarization-activated cation current that depolarizes neurons. Under normal conditions, increasing the temperature of the STNS increases the frequency of the spontaneously active pyloric rhythm. Notably, under normal conditions, as temperature systematically increases, the concomitant increase in pyloric frequency is smooth (i.e., monotonic). By contrast, blocking Ih with extracellular cesium produces temperature-induced pyloric frequency changes that follow a characteristic sawtooth response (i.e., non-monotonic). That is, in cesium, increasing temperature initially results in a transient drop in pyloric frequency that then stabilizes at a higher frequency. Thus, the authors conclude that Ih establishes a mechanism that ensures smooth changes in neural network frequency during environmental disturbances, a feature that likely bestows advantages to the animal's function.

      The study describes several surprising and interesting findings. In general, the study's primary observation of the cesium-induced sawtooth response is remarkable. To my knowledge, this type of response has not yet been described in neurobiological systems, and I suspect that the unexpected response will be of interest to many readers.

      At first glance, I had some concerns regarding the use of extracellular cesium to understand network phenomena. Yes, extracellular cesium blocks Ih. But extracellular cesium has also been shown to block astrocytic potassium channels, at least in mammalian systems (i.e., K-IR, PMID: 10601465), and such a blockade can elevate extracellular potassium. I was heartened to see that the authors acknowledge the non-specificity of cesium (lines 320-325) and I agree with the authors' contention that "a first approximation most of the effects seen here can likely be attributed to Cs+ block of Ih". Upon reflecting on the potential confound, I was also reassured to see that extracellular cesium alone does not in fact increase pyloric frequency, an effect that might be expected if cesium indirectly raises [K+]outside. If the authors agree, then I suggest including that point in their discussion.

      In summary, the authors present a solid investigation of a surprising biological phenomenon. In general, my comments are fairly minor. Thanks for contributing an interesting study.

      Strengths:

      A major strength of the study is the identification of an ionic conductance that mediates stable, monotonic changes in oscillatory frequency that accompany changes in the environment (i.e., temperature).

      Weaknesses:

      A potential experimental concern stems from the use of extracellular cesium to attribute network effects specifically to Ih. Previous work has shown that extracellular cesium also blocks inward-rectifier potassium channels expressed by astrocytes, and that such blockade may also elevate extracellular potassium, an action that generally depolarizes neurons. Notably, the authors address this potential concern in the discussion.

    4. Reviewer #3 (Public review):

      Summary:

      This paper presents a systematic analysis of the role of the hyperpolarization-activated inward current (the h current) in the response of the pyloric rhythm of the stomatogastric ganglion (STG) of the crab. In a detailed set of experiments, the authors analyze the effect of blocking h current with bath infusion of the h current blocker cesium (perfused as CsCl). They show an interesting and reproducible effect: blockade of h current results in a period of frequency decrease after an upward step in temperature, followed by a slow increase in frequency. This contrasts with the normal temperature response, which shows an increase in frequency with an increase in temperature without a downward "jag" in the frequency response. This is an important paper for showing the role of h current in stabilizing network dynamics in response to perturbations such as a temperature change.

      Strengths of the paper:

      The major effects are shown very clearly and convincingly in a range of experiments with combined intracellular recording from neurons during changes in temperature.

      Weaknesses:

      The Marder lab has detailed models of the pyloric rhythm. These temperature effects have not yet been modeled and could be the focus of future modeling studies.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Response to Public Reviews:

      We thank the reviewers for their kind comments and have implemented many of their suggestions. Our paper has greatly benefited from their advice. Like Reviewer 1, we acknowledge that while the exact involvement of Ih in allowing smooth transitions is likely not universal across all systems, our demonstration of the ways in which such currents can affect the dynamics of the response of complex rhythmic motor networks provides valuable insight. To address the concerns of Reviewer 2, we included a sentence in the discussion to highlight the fact that cesium neither increased the pyloric frequency nor caused consistent depolarization in intracellular recordings. We also highlighted that these observations suggest that cesium is not indirectly raising [K+]outside and support the conclusion that the effects of cesium are primarily through blockade of Ih rather than other potassium channels.

      Reviewer 3 raised some important points about modeling. While the lab has models that explore the effects of temperature on artificial triphasic rhythms, these models do not account for all the biophysical nuances of the full biological system. We have limited data about the exact nature of temperature-induced parameter changes and the extent to which these changes are mediated by intrinsic effects of temperature on protein structure versus protein interactions/modification by processes such as phosphorylation. With respect to the A current, Tang et al., 2010 reported that the activation and inactivation rates are differentially temperature sensitive, but we do not have the data to suggest whether or not the time courses of such sensitivities are different. As such, we focus our discussion on the properties we know are modulated by temperature, i.e. activation rates. Within the discussion we now include the suggestion that future, more comprehensive modeling may be appropriate to further elucidate the ways in which reducing Ih may produce the experimentally observed effects reported here.

      Reviewer #1 (Recommendations For The Authors):

      Suggested revisions:

      A figure showing examples of the voltage-clamp traces for the critical measurements of the extent of Ih block by 5 mM CsCl in PD and LP neurons at the temperature extremes in these preparations is not shown, and the authors should consider including such a figure, perhaps as a supplemental figure.

      We have added Supplemental Figure 1 containing voltage-clamp traces demonstrating the extent of Ih block by 5mM CsCl in PD and LP neurons at 11 and 21°C.  Due to technical concerns, different preparations were used in the measurements at 11°C and 21°C, but the point that the H-current is reduced is demonstrated in all cases.

      Reviewer #2 (Recommendations for The Authors):

      Specific (Minor) Comments:

      (1) Line 83: In Cs+ "at 11°C, the pyloric frequency was significantly decreased compared to control conditions (Saline: 1.2± 0.2 Hz; Cs+ 0.9± 0.2 Hz)".

      As above, the authors often report that cesium generally reduces pyloric frequency. Figure 5A demonstrates this action quite nicely. However, cesium's effect on pyloric frequency at 11°C seems less robust in Figure 1C. Why the discrepancy?

      There is variability in the effects of Cs+ on the pyloric frequency.  As noted, the standard deviation in frequency in both conditions is 0.2Hz.  As such, there are some cases in which the initial frequency drop in Cs+ compared to control was relatively small.  1C is one such case, but was selected as an example because of its clear reduction in temperature sensitivity. 

      (2) I don't understand what the arrows/dashed lines are trying to convey in Figure 3C.

      The arrows/dashed lines represent the criteria used to define a cycle as “decreasing in frequency” (Temperature Increasing) or “increasing in frequency” (Temperature Stable).  We have amended lines 130 and 137 in the text to hopefully clarify this point, as well as the figure legend.

      (3) Lines 118/168. The description of cesium's specific action on the depolarizing portion of PD activity is a bit confusing. In my mind, "depolarization phase" refers to the point at which PD is most depolarized. Perhaps restating the phrase to "elongation of the depolarizing trajectory" is less confusing. The authors may also want to consider labeling this trajectory in Figure 2C.

      We have changed “depolarization phase” to “depolarizing phase” to highlight that this is the period during which the cell is depolarizing, rather than at its most depolarized.  We consider the plateau of the slow wave and spiking (the point at which PD is most depolarized) to be the “bursting phase”.  We have labeled these phases in Figure 2C as suggested.

      (4) Figure 3C legend: a few words seem to be missing. I suggest "the change in mean frequency was more likely TO decrease IN Cs+ than in saline".

      Thank you for catching this typo, it has been corrected.

      (5) Line 165: Awkward phrasing. “In one experiment, the decrease in frequency while temperature increased and subsequent increase in frequency after temperature stabilized was particularly apparent in Cs+ PTX”.

      How about: “One Cs+ PTX experiment wherein elevating the temperature transiently decreased pyloric frequency is shown in Figure 4F.”

      We have amended this sentence to read, “One Cs++PTX experiment in which elevating the temperature produced a particularly pronounced transient decrease in frequency is shown in Figure 4F.”

      (6) Line 186: Awkward phrasing. "LP OFF was also significantly advanced in Cs+, although duty cycle (percent of the period a neuron is firing) was preserved".

      The use of the word "although" seems a bit strange. If both LP onset and LP offset phase advance by the same amount, then isn't an unchanged duty cycle expected?

      “Although” has been changed to “and subsequently”.
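
      The relationship discussed in this exchange can be made concrete with a minimal sketch. It assumes the usual convention in which phase is an event time divided by the cycle period, so that duty cycle equals the OFF phase minus the ON phase. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def phase(event_time, cycle_start, period):
    """Unitless phase: fraction of the cycle elapsed when the event occurs."""
    return (event_time - cycle_start) / period


def duty_cycle(on_phase, off_phase):
    """Fraction of the period during which the neuron is firing."""
    return off_phase - on_phase


# Hypothetical LP burst in a 1.0 s cycle that starts at t = 0 s.
control_on = phase(0.40, 0.0, 1.0)
control_off = phase(0.70, 0.0, 1.0)

# Advance both ON and OFF by the same (assumed) 0.05 of a cycle.
shift = 0.05
advanced_on = control_on - shift
advanced_off = control_off - shift

print(duty_cycle(control_on, control_off))
print(duty_cycle(advanced_on, advanced_off))
```

      Under these assumptions the two printed values agree up to floating-point rounding, which is the reviewer's point: an equal advance of ON and OFF phases leaves the duty cycle unchanged.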

      Reviewer #3 (Recommendations For The Authors):

      Major comments:

      (1) I know the Marder lab has detailed models of the pyloric rhythm. I am not saying they have to add modeling to this already extensive and detailed paper, but it would be useful to know how much of these temperature effects have been modeled, for example in the following locations.

      (2) Line 259 - "Mathematically..." - Is there a computational model of H current that has shown this decrease in frequency in pyloric neurons? If you are working on one for the future, you could mention this.

      There is not currently a model in which the reduction of the H-current results in the non-minimum phase dynamics in the frequency response to temperature seen experimentally. It should be noted that our existing models of pyloric activity responses to temperature are not well suited to investigate such dynamics in their current iterations.  Further work is necessary to demonstrate the principles observed experimentally in computational modeling, and we have added a sentence to the paper to reflect this point (Line 268).

      (3) Line 318 - "therefore it remains unclear" - I thought they had models of the circuit rhythmicity. Do these models include temperature effects? Can they comment on whether their models of the circuit show an opposite effect to what they see in the experiment? I'm not saying they have to model these new effects as that is probably an entirely different paper, but it would be interesting to know whether current models show a different effect.

      We have some models of the pyloric response to temperature, but these models were specifically selected to maintain phase across the range of temperature. When Ih was reduced in these models, a variety of effects on phase and duty cycle were seen. These models were selected to have the same key features of behavior as the pyloric rhythm, but do not capture all the biophysical nuances of the complete system, and therefore should not necessarily be expected to reflect the experimental findings in their current iterations. Furthermore, these models are meant to have temperature as a static, rather than dynamic, input, and thus are ill-suited to examine the conditions of our experiments. The models in their current state are not sufficiently relevant to these experimental findings that they can illuminate the present paper.

      (4) "If deinactivation is more accelerated or altered by temperature than inactivation...While temperature continued to change, the difference in parameters would continue to grow" - This is described as a difference in temperature sensitivity, but it seems like it is also a function of the time course of the response to change in temperature (i.e. the different components could have the same final effect of temperature but show a different time course of the change).

      We know from Tang et al., 2010, that activation and inactivation rates of the A current are differentially temperature sensitive. We have no evidence to suggest that the time courses of the responses of various parameters to temperature differ. The physical actions of temperature on proteins are likely to be extremely rapid, making a time course difference on the order of tens of seconds unlikely, though not impossible. Modeling of the biophysics might illuminate the relative plausibility of these different mechanisms of action, but we feel that our current suggested explanation is reasonable based on existing information.

      (5) Is it known how temperature is altering these channel kinetics? Is it via an intrinsic rearrangement of the protein structure, or is it a process that involves phosphorylation (that could explain differences in time course?). Some mention of the mechanism of temperature changes would be useful to readers outside this field.

      It is not known exactly how temperature alters channel parameters.  Invariably some, if not all, of it is due to an intrinsic rearrangement of protein structure, and our current models treat all parameter changes as an instantaneous consequence.  However, it is possible that some effects of temperature are due to longer timescale processes such as phosphorylation or cAMP interactions.  Current work in the lab is actively exploring these questions, but there is no definitive answer. Given that this paper focuses on the phenomenon and plausible biomolecular explanations based on existing data, we have not altered the paper to include more exhaustive  coverage of all the possible avenues by which temperature may alter channel properties.

      Specific comments:

      Title: misspelling of "Cancer" ?

      We are unsure how that extra “w” got into the earliest version of the manuscript and have removed it.

      Line 66 "We used 5mM CsCl" - might mention right up front that this was a bath application of the substance.

      We have altered this line to read “used bath application of 5mM CsCl”.  

      Figure 4 - "The only feedback synapse to the pacemaker kernel neurons, LP to PD, and is blocked by picrotoxin" - I think the word "and" should be removed from this phrase in the figure legend.

      Fixed

      Figure 4 legend - "Reds denote temperature...yellows denote..." - I think it should be "Red dots denote temperature...yellow dots denote...".

      Done

      Figure 4B - Why does the change in frequency in cesium look so different in Figure 4B compared to Figure 1C or Figure 3B? In the earlier figures, the increase of frequency is smaller but still present in cesium, whereas, in Figure 4B, cesium seems to completely block the increase in frequency. I'm not sure why this is different, but I guess it's because 3B and 4B are just mean traces from single experiments. Presumably, 4B is showing an experiment in which the cesium was subsequently combined with picrotoxin?

      Figures 1C, 3B, and 4B are indeed all from different single experiments. As acknowledged in our concluding paragraph, there was substantial variability in the exact response of the pyloric rhythm to temperature while in cesium.  The most consistent effect was that the difference in frequency between cesium and saline at a particular temperature increased, as demonstrated across 21 preparations in Figure 1D. It may be noted in Figure 1E that the Q10 was not infrequently <1, meaning that there was a net decrease in frequency as temperature increased in some experiments such as seen in the example of Figure 4B.  The “fold over” (initial increase in steady-state frequency with temperature, then decrease at higher temperatures) has been observed at higher temperatures (typically around 23-30 degrees C) even under control conditions but has not been highlighted in previous publications.  The example in 4B was chosen because it demonstrated both the similarity in jags between Cs+ and Cs++PTX and an overall decrease in temperature sensitivity, even though in this instance the steady-state change in frequency with temperature was not monotonic. 
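
      For readers outside the field, the statement that a Q10 below 1 implies a net decrease in frequency can be illustrated with a short sketch. It assumes the standard two-point definition of Q10 (fold change in rate per 10°C); the procedure actually used to compute Q10 in the paper is not described in this response, and the frequencies below are hypothetical.

```python
def q10(rate_low, rate_high, temp_low, temp_high):
    """Two-point temperature coefficient: fold change in rate per 10 degrees C."""
    return (rate_high / rate_low) ** (10.0 / (temp_high - temp_low))


# Hypothetical pyloric frequencies (Hz) at 11 and 21 degrees C, for illustration only.
rising = q10(1.0, 2.0, 11.0, 21.0)   # frequency increases with temperature
falling = q10(1.0, 0.8, 11.0, 21.0)  # frequency decreases with temperature

print(rising > 1.0)   # Q10 above 1 when frequency rises
print(falling < 1.0)  # Q10 below 1 when frequency falls
```

      With this definition, Q10 equals 1 when frequency is unchanged across the temperature range, so values below 1 correspond to the net decreases described above.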

      Figure 6A - "Phase 0 to 1.0" - The y-axis should provide units of phase. Presumably, these are units of radians so 1.0=2*pi radians (or 360 degrees, but probably best to avoid using degrees of phase due to confusion with degrees of temperature).

      Phase, with respect to pyloric rhythm cycles, does not traditionally have units as it is a proportion rather than an angle. As such, we have not changed the figure.

      Line 275 - "the pacemaker neuron can increase" - Does this indicate that the main effects of H current are in the follower neurons (i.e. LP and PY versus the driver neuron PD)?

      Not necessarily.  We posit in the next paragraph that the effect of the H current on the temperature sensitivity could be due to its phase advance of LP, but that phase advance of LP is not particularly expected to increase frequency.  We favor the possibility that temperature increases Ih in the pacemaker, which in turn advances the PRC of the rhythm, allowing the frequency increase seen under normal conditions.  In Cs+, this advance does not occur, resulting in the lower temperature sensitivity.  In Cs++PTX, the lack of inhibition from LP means compensatory advance of the pacemaker PRC by Ih is unnecessary to allow increased frequency.

      Line 285 - "either increase frequency have no effect" - Is there a missing "or" in this phrase?

      Thank you, we have added the “or”.

    1. eLife assessment

      This important study highlights cell types preserving long-lived proteins and lays a foundation for identifying exceptionally long-lived proteins in the ovary. Convincing evidence describes helpful data about protein turnover and identifies long-lived macromolecules in oocytes and somatic cells during mouse ovarian aging. This work will be of interest to researchers working on aging and reproductive health.

    2. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer 2:

      In addition, it is still unacceptable for me that the number of ovulated oocytes in mice at 6 months of age is only one third of young mice (10 vs 30; Fig. S1E). The most of published literature show that mice at 12 months of age still have ~10 ovulated oocytes.

      We disagree with the reviewer’s comment, and the concerns raised were not shared by the other reviewers.  We have reported our data with full transparency (each data point is plotted). In the current study, we observed an intermediate phenotype in gamete number (assessed by both ovarian follicle counts and ovulated eggs) when comparing 6 month old mice to 6 week or 10 month old mice; this is as expected. It is well accepted that follicle counts are highly mouse strain dependent.  Although the reviewer mentions that mice at 12 months have ~10 ovulated oocytes, no actual references are provided nor are the mouse strain or other relevant experimental details mentioned.  Therefore, we do not know how these quoted metrics relate to the female FVB mice used in our current study.   As clearly explained and justified in our manuscript, we used mice at 6 months and 10 months to represent a physiologic aging continuum. 

      Moreover, based on the follicle counting method used in the present study (Fig. S1D), there are no antral follicles observed in mice at 6 months and 10 months of age, which is not reasonable.

      This statement is incorrect. Antral follicles were present at 6 and 10 months of age, but due to the scale of the y-axis and the normalization of follicle number/area in Fig. S1D, the values are small.  The absolute number of antral follicles per ovary (counted in every 5th section) was 31.3 ± 3.8 follicles for 6-week old mice, 9.3 ± 2.3 follicles for 6-month old mice, and 5.3 ± 1.8 follicles for 10-month old mice.  Moreover, it is important to note that these ovaries were not collected in a specific stage of the estrous cycle, so the number of antral follicles may not be maximal.  In addition, as described in the Materials and Methods, antral follicles were only counted when the oocyte nucleus was present in a section to avoid double counting.  Therefore, this approach (which was applied consistently across samples) could potentially underestimate the total number.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript by Bomba-Warczak describes a comprehensive evaluation of long-lived proteins in the ovary using transgenerational radioactive labelled 15N pulse-chase in mice. The transgenerational labeling of proteins (and nucleic acids) with 15N allowed the authors to identify regions enriched in long-lived macromolecules at the 6 and 10-month chase time points. The authors also identify the retained proteins in the ovary and oocyte using MS. Key findings include the relative enrichment in long-lived macromolecules in oocytes, pregranulosa cells, CL, stroma, and surprisingly OSE. Gene ontology analysis of these proteins revealed enrichment for nucleosome, myosin complex, mitochondria, and other matrix-type protein functions. Interestingly, compared to other post-mitotic tissues where such analyses have been previously performed such as the brain and heart, they find a higher fractional abundance of labeled proteins related to the mitochondria and myosin respectively.

      Response: We thank the reviewer for this thoughtful summary of our work.  We want to clarify that our pulse-chase strategy relied on a two-generation stable isotope-based metabolic labelling of mice using 15N from spirulina algae (for reference, please see (Fornasiero & Savas, 2023; Hark & Savas, 2021; Savas et al., 2012; Toyama et al., 2013)).  We did not utilize any radioactive isotopes.

      Strengths:

      A major strength of the study is the combined spatial analyses of LLPs using histological sections with MS analysis to identify retained proteins.

      Another major strength is the use of two chase time points allowing assessment of temporal changes in LLPs associated with aging.

      The major claims such as an enrichment of LLPs in pregranulosa cells, GCs of primary follicles, CL, stroma, and OSE are soundly supported by the analyses, and the caveat that nucleic acids might differentially contribute to this signal is well presented.

      The claims that nucleosomes, myosin complex, and mitochondrial proteins are enriched for LLPs are well supported by GO enrichment analysis and well described within the known body of evidence that these proteins are generally long-lived in other tissues.

      Weaknesses:

      Comment 1: One small potential weakness is the lack of a mechanistic explanation of if/why turnover may be accelerating at the 6-10 month interval compared to 1-6.

      Response 1: At the 6-month time point, we detected more long-lived proteins than at the 10-month time point in both the ovary and the oocyte. We anticipated this because proteins are degraded over time, and substantially more time has elapsed at the later time point. Moreover, at the 6–10-month time point, age-related tissue dysfunction is already evident in the ovary. For example, in 6-9 month old mice, there is already a deterioration of chromosome cohesion in the egg which results in increased interkinetochore distances (Chiang et al., 2010), and by 10 months, there are multinucleated giant cells present in the ovarian stroma which is consistent with chronic inflammation (Briley et al., 2016). Thus, the observed changes in protein dynamics may be another early feature of aging progression in the ovary.

      Comment 2: A mild weakness is the open-ended explanation of OSE label retention. This is a very interesting finding, and the claims in the paper are nuanced and perfectly reflect the current understanding of OSE repair. However, if the sections are available and one could look at the spatial distribution of OSE signal across the ovarian surface, it would be interesting to note if label retention varied by regions such as the CLs or hilum where more/less OSE division may be expected.

      Response 2: We agree that the enrichment of long-lived molecules in the OSE is interesting. To make interpretable conclusions about the dynamics of long-lived molecules in the OSE, we would need to generate a series of samples at precise stages of the estrous cycle or ideally across a timecourse of ovulation to capture follicular rupture and repair.  These samples do not currently exist and are beyond the scope of this study. However, this idea is an important future direction and it has been added to the discussion (lines 221-223). Furthermore, from a practical standpoint, MIMS imaging is resource and time intensive. Thus, we are not able to readily image entire ovarian sections.  Instead, we focused on structures within the ovary and took select images of follicles, stroma, and OSE.  We, therefore, do not have a comprehensive series of images of the OSE from the entire ovarian section for each mouse analyzed.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Bomba-Warczak et al. applied multi-isotope imaging mass spectrometry (MIMS) analysis to identify the long-lived proteins in mouse ovaries during reproductive aging, and found some proteins related to cytoskeletal and mitochondrial dynamics persisting for 10 months.

      Response: We thank the reviewer for their summary and feedback.

      Strengths:

      The manuscript provides a useful dataset about protein turnover during ovarian aging in mice.

      Weaknesses:

      Comment 1: The study is pretty descriptive and short of further new findings based on the dataset. In addition, some results such as the numbers of follicles and ovulated oocytes in aged mice are not consistent with the published literature, and the method for follicle counting is not accurate. The conclusions are not fully supported by the presented evidence.

      Response 1: We agree with the reviewer that this study is descriptive. Our goal, as stated, was to use a discovery-based approach to define the long-lived proteome of the ovary and oocyte across a reproductive aging continuum. As the prominent aging researcher, Dr. James Kirkland, stated: “although ‘descriptive’ is sometimes used as a pejorative term…descriptive or discovery research leading to hypothesis generation has become highly sophisticated and of great relevance to the aging field (Kirkland, 2013).” We respectfully disagree with the reviewer that our study is short of new findings. In fact, this is the first time that two-generation stable isotope-based metabolic labelling of mice, in combination with two different state-of-the-art mass spectrometry methods, has been used to identify and localize long-lived molecules in the ovary and oocyte along this particular reproductive aging continuum in an unbiased manner. We have identified protein groups that were previously not known to be long lived in the ovary and oocyte. Our hope is that this long-lived proteome will become an important hypothesis-generating resource for the field of reproductive aging.

      The age-dependent decline in number of follicles and eggs ovulated in mice has been well established by our group as well as others (Duncan et al., 2017; Mara et al., 2020).  Thus, we are unclear about the reviewer’s comments that our results are not consistent with the published literature.  The absolute numbers of follicles and eggs ovulated as well as the rate of decline with age are highly strain dependent.  Moreover, mice can have a very small ovarian reserve and still maintain fertility (Kerr et al., 2012).  In our study, we saw a consistent age-dependent decrease in the ovarian reserve (Figure 1 – figure supplement 1 D), the number of oocytes collected from large antral follicles following hyperstimulation with PMSG (used for LC-MS/MS), and the number of eggs collected from the oviduct following hyperstimulation and superovulation with PMSG and hCG (Figure 1 – figure supplement 1 E and F).  In all cases, the decline was greater in 10 month old compared to 6 month old mice demonstrating a relative reproductive aging continuum even at these time points.

      Our research team has significant expertise in follicle classification and counting as evidenced by our publication record (Duncan et al., 2017; Kimler et al., 2018; Perrone et al., 2023; Quan et al., 2020). We used our established methods, which we have further clarified in the manuscript text (lines 395-397). Follicle counts were performed on every 5th tissue section of serially sectioned ovaries, and one ovary from each of three mice per timepoint was counted. Therefore, follicle counts were performed on an average of 48-62 total sections per ovary. The number of follicles was then normalized per total area (mm²) of the tissue section, and the counts were averaged. Figure 1 – figure supplement 1 C and D represent data averaged from all ovarian sections counted per mouse. It is important to note that the same criteria were applied consistently to all ovaries across the study, and thus regardless of the technique used, the relative number of follicles or oocytes across ages can be compared.
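
      The normalization described above can be sketched as follows. This is an illustrative reading of the stated procedure (count follicles in every 5th section, divide by section area, then average across sections), not the authors' actual analysis code. The function names, the choice of starting section, and the example values are assumptions.

```python
def every_fifth(serial_sections):
    """Select every 5th section, starting from the first (starting point is assumed)."""
    return serial_sections[::5]


def mean_follicle_density(sections):
    """Average follicles per mm^2 over (follicle_count, area_mm2) pairs."""
    densities = [count / area for count, area in sections]
    return sum(densities) / len(densities)


# Hypothetical section numbers and per-section (count, area in mm^2) values.
sampled = every_fifth(list(range(1, 21)))
example_sections = [(12, 4.0), (9, 3.0), (10, 5.0)]

print(sampled)
print(mean_follicle_density(example_sections))
```

      Because each count is divided by the area of its own section before averaging, differences in ovary size between age groups do not by themselves change the reported density.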

      Reviewer #3 (Public Review):

      Summary:

      In this study, Bomba-Warczak et al focused on reproductive aging, and they presented a map for long-lived proteins that were stable during reproductive lifespan. The authors used MIMS to examine and show distinct molecules in different cell types in the ovary and tissue regions in a 6 month mice group, and they also used proteomic analysis to present different LLPs in ovaries between these two timepoints in 6-month and 10-month mice. The authors also examined the LLPs in oocytes in the 6-months mice group and indicated that these were nuclear, cytoskeleton, and mitochondria proteins.

      Response: We thank the reviewer for their summary and feedback.

      Strengths:

      Overall, this study provided basic information or a 'map' of the pattern of long-lived proteins during aging, which will contribute to the understanding of the defects caused by reproductive aging.

      Weaknesses:

      Comment 1: The 6-month mice were used as an aged model; no validation experiments were performed with proteomics analysis only.  

      Response 1:  We did not select the 6-month time point to be representative of the “aged model” but rather one of two timepoints on the reproductive aging continuum – 6 and 10 months.  In the manuscript (Figure 1 – figure supplement 1) we have demonstrated the relevance of the two timepoints by illustrating a decrease in follicle counts, number of fully grown oocytes collected, and number of eggs ovulated as well as a tendency towards increased stromal fibrosis (highlighted in the main text lines 78-85).  Inclusion of the 6-month timepoint ultimately turned out to be informative and essential as many long-lived proteins were absent by the 10 month timepoint. These results suggest that important shifts in the proteome occur during mid to advanced reproductive age.  The relevance of these timepoints is mentioned in the discussion (lines 247-270).

      Two independent mass spectrometry approaches (MIMS and LC-MS/MS) were used to validate the presence of long-lived macromolecules in the ovary and oocyte. Studies focused on the role of specific long-lived proteins in oocyte and ovarian biology as well as how they change with age in terms of function, turnover, and modification are beyond the scope of the current study but are ongoing.  We have acknowledged these important next steps in the manuscript text (lines 286-288, 311-312).

      It is important to note that oocytes are biomass-limited cells, and their numbers decrease with age. Thus, we had to select ages where we could still collect enough from the mice available to perform LC-MS/MS.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Comment 1: The writing and figures are beautiful - it would be hard to improve this manuscript.

      Response 1: We greatly appreciate this enthusiastic evaluation of our work.

      Comment 2: In Fig S1E/F it would help to list the N number here. Why are there 2 groups at 6-12 wk?

      Response 2:  We did not have 6 month and 10-month-old mice available at the same time to be able to run the hyperstimulation and superovulation experiment in parallel.  Therefore, we performed independent experiments comparing the number of eggs collected from either 6-month-old or 10 month old mice relative to 6-12 week old controls.  In each trial, eggs were collected from pooled oviducts from between 3-4 mice per age group, and the average total number of eggs per mouse was reported.  Each point on the graph corresponds to the data from an individual trial, and two trials were performed.  This has been clarified in the figure legend (lines 395-397).  Of note, while addressing this reviewer’s comments, we noticed that we were missing Materials and Methods regarding the collection of eggs from the oviduct following hyperstimulation and superovulation with PMSG and hCG.  This information has now been added in Methods Section, lines 477-481.

      Comment 3: The manuscript would benefit from an explanation of why the pups were kept on a 1-month N15 diet after birth, since the oocytes are already labeled before birth, and granulosa at most by day 3-4. Would ZP3 have not been identified otherwise?

      Response 3: The pups used in this study were obtained from fully labeled female dams that were maintained on a 15N diet. These pups had to be kept with their mothers through weaning. To limit the pulse period only through birth, the pups would have had to be transferred to unlabeled foster mothers. However, this would have risked pup loss, which would have significantly impacted our ability to conduct the studies given that we only had 19 labeled female pups from three breeding pairs. We have clarified this in the manuscript text in lines 78-80. It is hard to know, without doing the experiment, whether we would have detected ZP3 if we only labeled through birth. The expression of ZP3 in primordial follicles, albeit in human, would suggest that this protein is expressed quite early in development.

      Comment 4: What is happening to the mitochondria at 6-10 months? Does their number change in the oocyte? Is there a change in the rate of fission? Any chance to take a stab at it with these or other age-matched slides?

      Response 4:  The reviewer raises an excellent point.  As mentioned previously in the Discussion (lines 290-301), there are well documented changes in mitochondrial structure and function in the oocyte in mice of advanced reproductive age.  However, there is a paucity of data on the changes that may happen at earlier mid-reproductive age time points.  From the oocyte mitochondrial proteome perspective, our data demonstrate a prominent decline in the persistence of long-lived proteins between 6 and 10 months, and this occurs in the absence of a change in the total pool of mitochondrial proteins (both long and short lived populations) as assessed by spectral counts or protein IDs (figure below).  These data, which we have added into Figure 3 – figure supplement 1 and in the manuscript text (lines 164-170) are suggestive of similar numbers of mitochondria at these two timepoints. It would be informative to do a detailed characterization of oocyte mitochondrial structure and function within this window to see if there is a correlation with this shift in long lived mitochondrial proteins.  Although this analysis is beyond the scope of the current manuscript, it is an important next line of inquiry which we have highlighted in the manuscript text (lines 255-257 and 311-312).

      Reviewer #2 (Recommendations For The Authors):

      Several concerns are raised as shown below.

      Comment 1: In Fig. 2F, it is surprising that ZP3 disappeared in the ovary from mice at the age of 10 months by MIMS analysis, because quite a few oocytes with intact zona pellucida can still be obtained from mice at this age. Notably, ZP would not be renewed once formed.

      Response 1: To clarify, Figure 2F shows LC-MS/MS data and not MIMS data.  As mentioned in the Discussion, the detection of long-lived pools of ZP3 at 6 months cannot be derived from newly synthesized zona pellucidae in growing follicles because they would not have been present during the pulse period.  The only way we could detect ZP3 at 6 months is if it forms a primitive zona scaffold in the primordial follicle or if ZPs from atretic follicles of the first couple of waves of folliculogenesis incorporate into the extracellular matrix of the ovary.  The lack of persistence of ZP3 at 10 months could be due to protein degradation. Should ZP3 indeed form a primitive zona, its loss at 10 months would be predicted to result in poor formation of a bona fide zona pellucida upon follicle growth.  Interestingly, aging has been associated with alterations in zona pellucida structure and function.   These data open novel hypotheses regarding the zona pellucida (e.g. a primitive zona scaffold and part of the extracellular matrix) and will require significant further investigation to test. These points are highlighted in the Discussion lines 227-245.

      Comment 2: To determine whether those proteins that can not be identified by MIMS at the time point of 10 months are degraded or renewed, the authors should randomly select some of them to examine their protein expression levels in the ovary by immunoblotting analysis.

      Response 2: To clarify, proteins were identified by LC-MS/MS and not by MIMS, which was used to visualize long-lived macromolecules. Each protein will be comprised of old pools (15N containing) and newly synthesized pools (14N containing). Degradation of the old pool of protein does not mean that there will be a loss of total protein. Moreover, immunoblotting cannot distinguish old and newly synthesized pools of protein. Overall peptide counts are listed for each protein identified at both time points in the table provided with the manuscript. As peptides derive from proteins, this table reflects what immunoblotting would, but on a larger and more precise scale.

      Comment 3: I think those proteins that can be identified by MIMS at the time point of 6 months but not 10 months deserve more analyses as they might be the key molecules that drive ovarian aging.

      Response 3:  This comment conflicts with comment 2 from Reviewer #3 (Recommendations For The Authors).  This underscores that different researchers will prioritize the value and follow up of such rich datasets differently.  We agree that the LLP identified at 6 months are of particular interest to reproductive aging, and we are planning to follow up on these in future studies.

      Comment 4:  Figure 1 – figure supplement 1 C-F, compared with the published literature, the numbers of follicles at different developmental stages and ovulated oocytes at both ages of 6 months and 10 months were dramatically low in this study. For 6-month-old female mice, the reproductive aging just begins, thus these numbers should not be expected to decrease too much. In addition, follicle counting was carried out only in an area of a single section, which is an inaccurate way, because the numbers and types of follicles in various sections differ greatly. Also, the data from a single section could not represent the changes in total follicle counts.

      Response 4: We have addressed these points in response to Comment 1 in the Reviewer #2 Public Review, and corresponding changes in the text have been noted.    

      Comment 5:  The study lacks follow-up verification experiments to validate their MIMS data.

      Response 5: Two independent mass spectrometry approaches (MIMS and LC-MS/MS) were used to validate the presence of long-lived macromolecules in the ovary and oocyte. Studies focused on the role of specific long-lived proteins in oocyte and ovarian biology as well as how they change with age in terms of function, turnover, and modification are beyond the scope of the current study but ongoing.  We have acknowledged these important next steps in the manuscript text (lines 286-288 and 311-312).

      Reviewer #3 (Recommendations For The Authors):

      Comment 1: The authors used the 6-month mice group to represent the aged model, and examined the LLPs from 1 month to 6 months. Indeed, 6-month-old mice start to show age-related changes; however, for the reproductive aging model, the most widely accepted model is that 10-month-old age mice start to show reproductive-related changes and 12-month-old mice (corresponding to 35-40 year-old women) exhibit the representative reproductive aging phenotypes. Therefore, the data may not present the typical situation of LLPs during reproductive aging.

      Response 1: As described in the response to Comment 1 in the Reviewer #3 Public Review, there were clear logistical and technical feasibility reasons why the 6 month and 10-month timepoints were selected for this study.  Importantly, however, these timepoints do represent a reproductive aging continuum as evidenced by age-related changes in multiple parameters.  Furthermore, there were ultimately very few LLPs that remained at 10 months in both the oocyte and ovary, so inclusion of the 6-month time point was an important intermediate.  Whether the LLPs at the 6-month timepoint serve as a protective mechanism in maintaining gamete quality or whether they contribute to decreased quality associated with reproductive aging is an intriguing dichotomy which will require further investigation.  This has been added to the discussion (lines 247-257).

      Comment 2:  Following the point above, the authors examined the ovaries in 6 months and 10 months mice by proteomics, and found that 6 months LLPs were not identical compared with 10 months, while there were Tubb5, Tubb4a/b, Tubb2a/b, Hist2h2 were both expressed at these two time points (Fig 2B), why the authors did not explore these proteins since they expressed from 1 month to 10 months, which are more interesting.

      Response 2: The objective of this study was to profile the long-lived proteome in the ovary and oocyte as a resource for the field rather than delving into specific LLPs at a mechanistic level. That being said, we wholeheartedly agree with the reviewer that the proteins that were identified at both 6 months and 10 months are the most robust and long-lived and are worth prioritizing for further study. Interestingly, Tubb5 and Tubb4a have high homology to primate-specific Tubb8, and Tubb8 mutations in women are associated with meiosis I arrest in oocytes and infertility (Dong et al., 2023; Feng et al., 2016). Thus, perturbation of these specific proteins by virtue of their long-lived nature may be associated with impaired function and poor reproductive outcomes. We have highlighted the importance of these LLPs, which are present at both timepoints and persist to at least 10 months, in the manuscript text (lines 259-270).

      Comment 3:  The authors also need to provide a hypothesis or explanation as to why LLDs from 6 months LLPs were not identical compared with 10 months.

      Response 3: We agree that LLPs identified at 10 months should also be identified as long-lived at 6 months. This is a common limitation of mass spectrometry-based proteomics, where each sample is prepared and run individually, which introduces variability between biological replicates, especially when it comes to low-abundance proteins. It is key to note that failing to identify a protein does not mean the protein is not there; it merely means that we were not able to detect it in this particular experiment, and low levels of the protein may still be present. To compensate for this known and inherent variability, we applied stringent filtering criteria in which we required long-lived peptides to be identified in an independent MS scan (the alternative is to identify a peptide in either the heavy or the light scan and use modeling to infer the FA value based on the m/z shift), which gave us peptides of the highest confidence. Ideally, these experiments would be done using a TMT (tandem mass tag) approach. However, TMT-based experiments typically require a substantial amount of input (80-100 µg per sample), which unfortunately is not feasible with oocytes obtained from a limited number of pulse-chased animals. We have added this explanation to the discussion (lines 265-270).

      Comment 4:  The reviewer thinks that LLPs from 6 months to 10 months may more closely represent the long-lived proteins during reproductive aging.

      Response 4: We fully agree that understanding the identity of LLPs between the 6-month and 10-month timepoints will be quite informative, given that this is a dynamic period when many of the LLPs are degraded, which might be key to the decline observed during reproductive aging. This is a very important point that we hope to explore in future follow-up studies.

      Comment 5: The authors used proteomics for the detection of ovaries and oocytes, however, there are no validation experiments at all. Since proteomics is mainly for screening and prediction, the authors should examine at least some typical proteins to confirm the validity of proteomics. For example, the authors specifically emphasized the finding of ZP3, a protein that is critical for fertilization.

      Response 5: Thank you, we agree that closer examination of proteins relevant and critical for fertilization is important. However, a detailed analysis of specific proteins fell outside the scope of this study, which aimed at unbiased identification of long-lived macromolecules in ovaries and oocytes. We hope to continue this important work in the near future.

      Comment 6: For the oocytes, the authors indicated that cytoskeleton, mitochondria-related proteins were the main LLPs, however, previous studies reported the changes of the expression of many cytoskeleton and mitochondria-related proteins during oocyte aging. How do the authors explain this contrary finding?   

      Response 6: Our findings are not contrary to the studies reporting changes in protein expression levels during oocyte aging; the two concepts are not mutually exclusive. The average FA value at the 6-month chase for oocyte proteins is 41.3%, which means that while 41.3% of the long-lived protein pool persisted for 6 months, the other 58.7% has in fact been renewed. With the exception of a few mitochondrial proteins (Cmkt2 and Apt5l) and myosins (Myl2 and Myh7), which had FA values close to 100% (no turnover), most of the LLPs had a portion of their protein pools that was indeed turned over. Moreover, we included a new data analysis illustrating that we identify a comparable number of mitochondrial proteins between the two time points, indicating that while the long-lived pools are changing over time, the total content remains stable (Figure 3 – figure supplement 1E-G).
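
      The arithmetic behind the FA statement in Response 6 can be sketched as follows. This is only an illustration, assuming that FA is expressed as the percentage of a protein pool that still carries the 15N label after the chase; the function name `renewed_percent` is hypothetical and is not taken from the authors' analysis pipeline.

```python
def renewed_percent(fa_percent: float) -> float:
    """Return the percentage of a protein pool that has been renewed,
    given the fractional abundance (FA) of the old, 15N-labeled pool.

    Assumption for this sketch: old and new pools sum to 100%.
    """
    if not 0.0 <= fa_percent <= 100.0:
        raise ValueError("FA must be a percentage between 0 and 100")
    return 100.0 - fa_percent


# Average FA reported for oocyte proteins at the 6-month chase.
average_fa = 41.3
print(f"persisted: {average_fa:.1f}%, renewed: {renewed_percent(average_fa):.1f}%")

# An FA close to 100% corresponds to essentially no turnover.
print(f"renewed at FA = 100%: {renewed_percent(100.0):.1f}%")
```

      Under this reading, a protein with an FA near 100% has essentially no renewed pool, whereas the average oocyte LLP has had more than half of its pool replaced by the 6-month timepoint.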

      Comment 7:  The authors also should provide in-depth discussion about the findings of the current study for long-lived proteins. In this study, the authors reported the relationship between these "long-lived" proteins with aging, a process with multiple "changes". Do long-lived proteins (which are related to the cytoskeleton and mitochondria) contribute to the aging defects of reproduction? or protect against aging?

      Response 7: This is a very important comment and one that needs further exploration. The fact is – we do not know at this moment whether these proteins are protective or deleterious, and such a statement would be speculative at this stage of research into LLPs in ovaries and oocytes. Future work is needed to address this question in detail.

      Briley, S. M., Jasti, S., McCracken, J. M., Hornick, J. E., Fegley, B., Pritchard, M. T., & Duncan, F. E. (2016). Reproductive age-associated fibrosis in the stroma of the mammalian ovary. Reproduction, 152(3), 245-260. https://doi.org/10.1530/REP-16-0129

      Chiang, T., Duncan, F. E., Schindler, K., Schultz, R. M., & Lampson, M. A. (2010). Evidence that Weakened Centromere Cohesion Is a Leading Cause of Age-Related Aneuploidy in Oocytes. Current Biology, 20(17), 1522-1528. https://doi.org/10.1016/j.cub.2010.06.069

      Dong, J., Jin, L., Bao, S., Chen, B., Zeng, Y., Luo, Y., Du, X., Sang, Q., Wu, T., & Wang, L. (2023). Ectopic expression of human TUBB8 leads to increased aneuploidy in mouse oocytes. Cell Discov, 9(1), 105. https://doi.org/10.1038/s41421-023-00599-z

      Duncan, F. E., Jasti, S., Paulson, A., Kelsh, J. M., Fegley, B., & Gerton, J. L. (2017). Age-associated dysregulation of protein metabolism in the mammalian oocyte. Aging Cell, 16(6), 1381-1393. https://doi.org/10.1111/acel.12676

      Feng, R., Sang, Q., Kuang, Y., Sun, X., Yan, Z., Zhang, S., Shi, J., Tian, G., Luchniak, A., Fukuda, Y., Li, B., Yu, M., Chen, J., Xu, Y., Guo, L., Qu, R., Wang, X., Sun, Z., Liu, M., . . . Wang, L. (2016). Mutations in TUBB8 and Human Oocyte Meiotic Arrest. N Engl J Med, 374(3), 223-232. https://doi.org/10.1056/NEJMoa1510791

      Fornasiero, E. F., & Savas, J. N. (2023). Determining and interpreting protein lifetimes in mammalian tissues. Trends Biochem Sci, 48(2), 106-118. https://doi.org/10.1016/j.tibs.2022.08.011

      Hark, T. J., & Savas, J. N. (2021). Using stable isotope labeling to advance our understanding of Alzheimer's disease etiology and pathology. J Neurochem, 159(2), 318-329. https://doi.org/10.1111/jnc.15298

      Kerr, J. B., Hutt, K. J., Michalak, E. M., Cook, M., Vandenberg, C. J., Liew, S. H., Bouillet, P., Mills, A., Scott, C. L., Findlay, J. K., & Strasser, A. (2012). DNA damage-induced primordial follicle oocyte apoptosis and loss of fertility require TAp63-mediated induction of Puma and Noxa. Mol Cell, 48(3), 343-352. https://doi.org/10.1016/j.molcel.2012.08.017

      Kimler, B. F., Briley, S. M., Johnson, B. W., Armstrong, A. G., Jasti, S., & Duncan, F. E. (2018). Radiation-induced ovarian follicle loss occurs without overt stromal changes. Reproduction, 155(6), 553-562. https://doi.org/10.1530/REP-18-0089

      Kirkland, J. L. (2013). Translating advances from the basic biology of aging into clinical application. Exp Gerontol, 48(1), 1-5. https://doi.org/10.1016/j.exger.2012.11.014

      Mara, J. N., Zhou, L. T., Larmore, M., Johnson, B., Ayiku, R., Amargant, F., Pritchard, M. T., & Duncan, F. E. (2020). Ovulation and ovarian wound healing are impaired with advanced reproductive age. Aging (Albany NY), 12(10), 9686-9713. https://doi.org/10.18632/aging.103237

      Perrone, R., Ashok Kumaar, P. V., Haky, L., Hahn, C., Riley, R., Balough, J., Zaza, G., Soygur, B., Hung, K., Prado, L., Kasler, H. G., Tiwari, R., Matsui, H., Hormazabal, G. V., Heckenbach, I., Scheibye-Knudsen, M., Duncan, F. E., & Verdin, E. (2023). CD38 regulates ovarian function and fecundity via NAD(+) metabolism. iScience, 26(10), 107949. https://doi.org/10.1016/j.isci.2023.107949

      Quan, N., Harris, L. R., Halder, R., Trinidad, C. V., Johnson, B. W., Horton, S., Kimler, B. F., Pritchard, M. T., & Duncan, F. E. (2020). Differential sensitivity of inbred mouse strains to ovarian damage in response to low-dose total body irradiation. Biol Reprod, 102(1), 133-144. https://doi.org/10.1093/biolre/ioz164

      Savas, J. N., Toyama, B. H., Xu, T., Yates, J. R., 3rd, & Hetzer, M. W. (2012). Extremely long-lived nuclear pore proteins in the rat brain. Science, 335(6071), 942. https://doi.org/10.1126/science.1217421

      Toyama, B. H., Savas, J. N., Park, S. K., Harris, M. S., Ingolia, N. T., Yates, J. R., 3rd, & Hetzer, M. W. (2013). Identification of long-lived proteins reveals exceptional stability of essential cellular structures. Cell, 154(5), 971-982. https://doi.org/10.1016/j.cell.2013.07.037

    3. Reviewer #1 (Public Review):

      Summary:

      This manuscript by Bomba-Warczak describes a comprehensive evaluation of long-lived proteins in the ovary using a transgenerational diet-derived 15N-labelling in pulse-chased mice. The transgenerational labeling of proteins (and nucleic acids) with 15N allowed the authors to identify regions enriched in long-lived macromolecules at the 6 and 10-month chase time points. The authors also identified the retained proteins in the ovary and oocyte using MS. Key findings include the relative enrichment in long-lived macromolecules in oocytes, pregranulosa cells, CL, stroma, and surprisingly OSE. Gene ontology analysis of these proteins revealed an enrichment for nucleosome, myosin complex, mitochondria, and other matrix-type protein functions. Interestingly, compared to other post-mitotic tissues where such analyses have been previously performed such as the brain and heart, they find a higher fractional abundance of labeled proteins related to the mitochondria and myosin respectively.

      Strengths:

      A major strength of the study is the combined spatial analyses of LLPs using histological sections with MS analysis to identify retained proteins.

      Another major strength is the use of two chase time points allowing assessment of temporal changes in LLPs associated with aging.

      The major claims such as an enrichment of LLPs in pregranulosa cells, GCs of primary follicles, CL, stroma, and OSE are soundly supported by the analyses and the caveat that nucleic acids might differentially contribute to this signal is well presented.

      The claims that nucleosomes, myosin complex, and mitochondrial proteins are enriched for LLPs are well supported by GO enrichment analysis and well described within the known body of evidence that these proteins are generally long-lived in other tissues.

      Weaknesses:

      All weaknesses were addressed in the revised manuscript.

      Impact of the work:

      This work represents the first study addressing the turnover and retention of long-lived protein in the ovary and will be an invaluable resource for the research community, particularly for those studying ovarian aging. This work also raises important unanswered questions worthy of follow-up including interesting findings regarding the timing of turnover of cell types such as the OSE, organelles such as mitochondria, and ECM proteins such as ZP3 and Tubb family proteins. Most striking are the differences between the two timepoints used (6 and 10 months) which lead the authors to infer trajectories and kinetics of replacement of proteins potentially contributing to ovarian longevity or decline. As such I expect the work will contribute to hypothesis generation and stand to have an important impact on the field.

    4. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Bomba-Warczak et al. applied multi-isotope imaging mass spectrometry (MIMS) analysis to identify the long-lived proteins in mouse ovaries during reproductive aging, and found some proteins related to cytoskeletal and mitochondrial dynamics persisting for 10 months.

      Strengths:

      The manuscript provides a useful dataset about protein turnover during ovarian aging in mice.

      Weaknesses:

      The study is pretty descriptive and short of further new findings based on the dataset. In addition, some results such as the numbers of follicles and ovulated oocytes in aged mice are not consistent with the published literature.

      Comments on revised version:

      The authors did not fully address my previous concerns, especially regarding the verification of the identified proteins and follow-up functional experiments. In addition, it is still unacceptable to me that the number of ovulated oocytes in mice at 6 months of age is only one third of that in young mice (10 vs 30; Fig. S1E). Most of the published literature shows that mice at 12 months of age still have ~10 ovulated oocytes. Moreover, based on the follicle counting method used in the present study (Fig. S1D), there are no antral follicles observed in mice at 6 months and 10 months of age, which is not reasonable.

    5. Reviewer #3 (Public Review):

      Summary:

      In this study Bomba-Warczak et al focused on the reproductive aging, and they presented a map for long-lived proteins which were stable during the reproductive lifespan. The authors used MIMS to examine and show distinct molecules in different cell types in the ovary and tissue regions in 6 months mice, and they also used proteomic analysis to present different LLPs in ovaries between these two timepoints in 6 months and 10 months mice; besides, the authors also examined the LLPs in oocytes in 6 months mice and indicated that these were nuclear, cytoskeleton and mitochondria proteins.

      Strengths:

      Overall, this study provided important information about the pattern of long-lived proteins during aging, which will contribute to the understanding of the defects caused by reproductive aging.

      Weaknesses:

      12 months mice were not examined as the typical aged model.

      Comments on revised version:

      The authors responded to my comments and suggestions. Due to the limitation of the manuscript type, most suggestions of my comments in first round could be considered for future studies by the authors.

    1. eLife assessment

      This potentially valuable study examines the role of IL17-producing Ly6G PMNs as a reservoir for Mycobacterium tuberculosis to evade host killing activated by BCG immunisation. The authors report that IL17-producing polymorphonuclear neutrophils harbour a significant bacterial load in both wild-type and IFNg-/- mice and that targeting IL17 and Cox2 improved disease outcomes whilst enhancing BCG efficacy. Although the authors suggest that targeting these pathways may improve disease outcomes in humans, the evidence as it stands is incomplete and requires additional experimentation for the study to realise its full impact.

    2. Reviewer #1 (Public review):

      Summary:

      Recruitment of neutrophils to the lungs is known to drive susceptibility to infection with M. tuberculosis. In this study, the authors present data in support of the hypothesis that neutrophil production of the cytokine IL-17 underlies the detrimental effect of neutrophils on disease. They claim that neutrophils harbor a large fraction of Mtb during infection, and are a major source of IL-17. To explore the effects of blocking IL-17 signaling during primary infection, they use IL-17 blocking antibodies, SR221 (an inverse agonist of TH17 differentiation), and celecoxib, which they claim blocks Th17 differentiation, and observe modest improvements in bacterial burdens in both WT and IFN-γ deficient mice using the combination of IL-17 blockade with celecoxib during primary infection. Celecoxib enhances control of infection after BCG vaccination.

      Strengths:

      The most novel finding in the paper is that treatment with celecoxib significantly enhances control of infection in BCG-vaccinated mice that have been challenged with Mtb. It was already known that NSAID treatments can improve primary infection with Mtb.

      Weaknesses:

      The major claim of the manuscript - that neutrophils produce IL-17 that is detrimental to the host - is not strongly supported by the data. Data demonstrating neutrophil production of IL-17 lacks rigor. The experiments examining the effects of inhibitors of IL-17 on the outcome of infection are very difficult to interpret. First, treatment with IL-17 inhibitors alone has no impact on bacterial burdens in the lung, either in WT or IFN-γ KO mice. This suggests that IL-17 does not play a detrimental role during infection. Modest effects are observed using the combination of IL-17 blocking drugs and celecoxib, however, the interpretation of these results mechanistically is complicated. Celecoxib is not a specific inhibitor of Th17. Indeed, it affects levels of PGE2, which is known to have numerous impacts on Mtb infection separate from any effect on IL-17 production, as well as other eicosanoids. Finally, the human data simply demonstrates that neutrophils and IL-17 both are higher in patients who experience relapse after treatment for TB, which is expected and does not support their specific hypothesis. The use of genetic ablation of IL-17 production specifically in neutrophils and/or IL-17R in mice would greatly enhance the rigor of this study. The authors do not address the fact that numerous studies have shown that IL-17 has a protective effect in the mouse model of TB in the context of vaccination. Finally, whether and how many times each animal experiment was repeated is unclear.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, Sharma et al. demonstrated that Ly6G+ granulocytes (Gra cells) serve as the primary reservoirs for intracellular Mtb in infected wild-type mice and that excessive infiltration of these cells is associated with severe bacteremia in genetically susceptible IFNγ-/- mice. Notably, neutralizing IL-17 or inhibiting COX2 reversed the excessive infiltration of Ly6G+Gra cells, mitigated the associated pathology, and improved survival in these susceptible mice. Additionally, Ly6G+Gra cells were identified as a major source of IL-17 in both wild-type and IFNγ-/- mice. Inhibition of RORγt or COX2 further reduced the intracellular bacterial burden in Ly6G+Gra cells and improved lung pathology.

      Of particular interest, COX2 inhibition in wild-type mice also enhanced the efficacy of the BCG vaccine by targeting the Ly6G+Gra-resident Mtb population.

      Strengths:

      The experimental results showing improved BCG-mediated protective immunity through targeting IL-17-producing Ly6G+ cells and COX2 are compelling and will likely generate significant interest in the field. Overall, this study presents important findings, suggesting that the IL-17-COX2 axis could be a critical target for designing innovative vaccination strategies for TB.

      Weaknesses:

      However, I have the following concerns regarding some of the conclusions drawn from the experiments, which require additional experimental evidence to support and strengthen the overall study.

      Major Concerns:

      (1) Ly6G+ Granulocytes as a Source of IL-17: The authors assert that Ly6G+ granulocytes are the major source of IL-17 in wild-type and IFN-γ KO mice based on colocalization studies of Ly6G and IL-17. In Figure 3D, they report approximately 500 Ly6G+ cells expressing IL-17 in the Mtb-infected WT lung. Are these low numbers sufficient to drive inflammatory pathology? Additionally, have the authors evaluated these numbers in IFN-γ KO mice?

      (2) Role of IL-17-Producing Ly6G Granulocytes in Pathology: The authors suggest that IL-17-producing Ly6G granulocytes drive pathology in WT and IFN-γ KO mice. However, the data presented only demonstrate an association between IL-17+ Ly6G cells and disease pathology. To strengthen their conclusion, the authors should deplete neutrophils in these mice to show that IL-17 expression, and consequently the pathology, is reduced.

      (3) IL-17 Secretion by Mtb-Infected Neutrophils: Do Mtb-infected neutrophils secrete IL-17 into the supernatants? This would serve as confirmation of neutrophil-derived IL-17. Additionally, are Ly6G+ cells producing IL-17 and serving as pathogenic agents exclusively in vivo? The authors should provide comments on this.

      (4) Characterization of IL-17-Producing Ly6G+ Granulocytes: Are the IL-17-producing Ly6G+ granulocytes a mixed population of neutrophils and eosinophils, or are they exclusively neutrophils? Sorting these cells followed by Giemsa or eosin staining could clarify this.

    4. Reviewer #3 (Public review):

      Summary:

      The authors examine how distinct cellular environments differentially control Mtb following BCG vaccination. The key findings are that IL17-producing PMNs harbor a significant Mtb load in both wild-type and IFNg-/- mice. Targeting IL17 and Cox2 improved disease and enhanced BCG efficacy over 12 weeks and neutrophils/IL17 are associated with treatment failure in humans. The authors suggest that targeting these pathways, especially in MSMD patients may improve disease outcomes.

      Strengths:

      The experimental approach is generally sound and consists of low-dose aerosol infections with distinct readouts including cell sorting followed by CFU, histopathology, and RNA sequencing analysis. By combining genetic approaches and chemical/antibody treatments, the authors can probe these pathways effectively.

      Understanding how distinct inflammatory pathways contribute to control or worsen Mtb disease is important and thus, the results will be of great interest to the Mtb field.

      Weaknesses:

      A major limitation of the current study is overlooking the role of non-hematopoietic cells in the IFNg/IL17/neutrophil response. Chimera studies from Ernst and colleagues (PMCID: PMC2807991) previously described this IDO-dependent pathway following the loss of IFNg through an increased IL17 response. This study is not cited nor discussed even though it may alter the interpretation of several experiments.

      Several of the key findings in mice have previously been shown (albeit with less sophisticated experimentation) and human disease and neutrophils are well described - thus the real new finding is how intracellular Mtb in neutrophils are more refractory to BCG-mediated control. However, given there are already high levels of Mtb in PMNs compared to other cell types, and there is a decrease in intracellular Mtb in PMNs following BCG immunization the strength of this finding is a bit limited.

    5. Author response:

      eLife assessment

      This potentially valuable study examines the role of IL17-producing Ly6G PMNs as a reservoir for Mycobacterium tuberculosis to evade host killing activated by BCG immunisation. The authors report that IL17-producing polymorphonuclear neutrophils harbour a significant bacterial load in both wild-type and IFNg-/- mice and that targeting IL17 and Cox2 improved disease outcomes whilst enhancing BCG efficacy. Although the authors suggest that targeting these pathways may improve disease outcomes in humans, the evidence as it stands is incomplete and requires additional experimentation for the study to realise its full impact.

      Thank you for evaluating our manuscript. We understand the concern related to the direct role of Ly6G+Gra-derived IL17 in TB pathogenesis. For the revised manuscript, we will provide additional experimental evidence through direct regulation of IL-17 production in Mtb-infected mice and its impact on improving BCG efficacy.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Recruitment of neutrophils to the lungs is known to drive susceptibility to infection with M. tuberculosis. In this study, the authors present data in support of the hypothesis that neutrophil production of the cytokine IL-17 underlies the detrimental effect of neutrophils on disease. They claim that neutrophils harbor a large fraction of Mtb during infection, and are a major source of IL-17. To explore the effects of blocking IL-17 signaling during primary infection, they use IL-17 blocking antibodies, SR221 (an inverse agonist of TH17 differentiation), and celecoxib, which they claim blocks Th17 differentiation, and observe modest improvements in bacterial burdens in both WT and IFN-γ deficient mice using the combination of IL-17 blockade with celecoxib during primary infection. Celecoxib enhances control of infection after BCG vaccination. 

      Thank you for the summary.

      Strengths:

      The most novel finding in the paper is that treatment with celecoxib significantly enhances control of infection in BCG-vaccinated mice that have been challenged with Mtb. It was already known that NSAID treatments can improve primary infection with Mtb.

      Thank you.

      Weaknesses:

      The major claim of the manuscript - that neutrophils produce IL-17 that is detrimental to the host - is not strongly supported by the data. Data demonstrating neutrophil production of IL17 lacks rigor. 

      Our response: Neutrophil production of IL-17 is supported by two independent methods/ techniques in the current version: 

      (1) Through Flow cytometry- a large fraction of Ly6G+CD11b+ cells from the lungs of Mtb-infected mice were also positive for IL-17 (Fig. 3C).

      (2) IFA co-staining of Ly6G+ cells with IL-17 in the lung sections from Mtb-infected mice (Fig. 3E-G, Fig. 4H and Fig. 5I).

      However, to further strengthen this observation, we plan to analyse sorted Ly6G+Gra from the lungs of infected mice using an IL-17 ELISPOT assay. This will unequivocally prove the Ly6G+Gra production of IL-17. Several publications support the production of IL-17 by neutrophils (Li et al. 2010; Katayama et al. 2013; Lin et al. 2011). For example, neutrophils have been identified as a source of IL-17 in human psoriatic lesions (Lin et al. 2011), in neuroinflammation induced by traumatic brain injury (Xu et al. 2023) and in several mouse models of infectious and autoimmune inflammation (Ferretti et al. 2003; Hoshino et al. 2008; Li et al. 2010). However, ours is the first study reporting neutrophil IL-17 production during Mtb pathology.

      The experiments examining the effects of inhibitors of IL-17 on the outcome of infection are very difficult to interpret. First, treatment with IL-17 inhibitors alone has no impact on bacterial burdens in the lung, either in WT or IFN-γ KO mice. This suggests that IL-17 does not play a detrimental role during infection. Modest effects are observed using the combination of IL-17 blocking drugs and celecoxib, however, the interpretation of these results mechanistically is complicated. Celecoxib is not a specific inhibitor of Th17. Indeed, it affects levels of PGE2, which is known to have numerous impacts on Mtb infection separate from any effect on IL-17 production, as well as other eicosanoids. 

      The reviewer correctly says that Celecoxib is not a specific inhibitor of Th17. However, COX-2 inhibition does have an effect on IL-17 levels, and numerous reports support this observation (Paulissen et al. 2013; Napolitani et al. 2009; Lemos et al. 2009). We elaborate on the results below for better clarity.

      Firstly, in the WT mice, Celecoxib treatment led to a complete loss of IL-17 production in the lungs of Mtb-infected mice (Fig. 5D). Interestingly, IL-17 production independent of IL-23 is known to require PGE2 (Paulissen et al. 2013; Polese et al. 2021). In the WT or IFNγ KO mice, we rather noted a decline in IL-23 levels post-infection, suggesting a possible role of PGE2 in IL-17 production. However, in Mtb-infected IFNγ KO mice, Celecoxib had no effect on IL-17 levels in the lung homogenates. Thus, celecoxib controls IL-17 levels only in the Mtb-infected WT mice. Including celecoxib with anti-IL17 in the IFNγ KO mice controls pathology and extends their survival.

      Second, the reviewer’s observation is only partially correct that IL-17 inhibition has a modest effect on the outcome of infection. While IL-17 neutralization and inhibition alone in the IFNγ KO mice and WT mice, respectively, did not bring down the lung CFU burden significantly, in both these cases, there was an improvement in the lung pathology. The reduced pathology coincided with reduced neutrophil recruitment and a reduced Ly6G+Gra-resident Mtb population in the WT mice. IL-17 neutralization alone improved IFNγ KO mice survival by ~10 days (Fig. 4F-G).

      Third, regarding the SR2211 and Celecoxib combination study, we agree with the reviewer that Celecoxib has roles independent of IL-17 regulation. However, in the results presented in this study, there are three key aspects- 1) neutrophil-derived IL-17-dependent neutrophil recruitment, 2) the presence of a large proportion of intracellular Mtb in the neutrophils and 3) dissemination of Mtb to the spleen. Celecoxib treatment alone helps reduce lung Mtb burden in the WT mice. However, SR2211 fails to do so. It is evident that celecoxib is doing more than just inhibiting IL-17 production. The result shows that celecoxib blocks neutrophil recruitment (which could be an IL-17-dependent mechanism) and also controls the intraneutrophil bacterial population. Finally, either SR2211 or celecoxib could block dissemination to the spleen. The role of neutrophils in TB dissemination is only beginning to emerge (Hult et al. 2021). We will revise the description in the results and discussion section for this data to make it easier to understand.

      Finally, we have also done experiments with SR2211 in BCG-vaccinated animals, which shows the direct impact of IL-17 inhibition on the BCG vaccine efficacy. We will add this result in the revised version.

      Finally, the human data simply demonstrates that neutrophils and IL-17 both are higher in patients who experience relapse after treatment for TB, which is expected and does not support their specific hypothesis. 

      We disagree with the above statement. Why would higher IL-17 be expected in patients who show relapse, death or failed treatment outcomes? Classically, IL-17 is believed to be protective against TB, and the reviewer also points to that in the comments below. A very limited set of studies support the non-protective/pathological role of IL-17 in tuberculosis (Cruz et al. 2010). High IL-17 and neutrophilia at the baseline in the human subjects (i.e. at the time of recruitment in the study) highlight severe pathology in those subjects, which could have contributed to the failed treatment outcome. This observation in the human cohort strongly supports the overall theme and central observation in this study.

      The use of genetic ablation of IL-17 production specifically in neutrophils and/or IL-17R in mice would greatly enhance the rigor of this study. 

      The reviewer’s point is well-taken. Having a genetic ablation of IL-17 production, specifically in the neutrophils, would be excellent. At present, however, we lack this resource, and therefore, it is not feasible to do this experiment within a defined timeline. Instead, for the revised manuscript, we will present the data with SR2211, a direct inhibitor of RORgt and, therefore, IL-17, in BCG-vaccinated mice.

      The authors do not address the fact that numerous studies have shown that IL-17 has a protective effect in the mouse model of TB in the context of vaccination.

      Yes, there are a few articles that talk about the protective effect of IL-17 in the mouse model of TB in the context of vaccination (Khader et al. 2007; Desel et al. 2011; Choi et al. 2020). This part was discussed in the original manuscript (in the Introduction section). For the revised manuscript, we will also provide results from the experiment where we blocked IL-17 production by inhibiting RORgt using SR2211 in BCG-vaccinated mice. The results clearly show IL-17 as a negative regulator of BCG-mediated protective immunity. We believe some of the reasons for the observed differences could be 1) in our study, we analysed IL-17 levels in the lung homogenates at late phases of infection, and 2) most published studies rely on ex vivo stimulation of immune cells to measure cytokine production, whereas we actually measured the cytokine levels in the lung homogenates. We will elaborate on these points in the revised version.

      Finally, whether and how many times each animal experiment was repeated is unclear.

      We will provide the details of the number of experiments in the revised version. Briefly, the BCG vaccination experiment (Figure 1) and BCG vaccination with Celecoxib treatment experiment (Figure 6) were performed twice and thrice, respectively. The IL-17 neutralization experiment (Figure 4) and the SR2211 treatment experiment (Figure 5) were done once. We will add data from another SR2211 experiment in the revised version.

      Reviewer #2 (Public review):

      Summary:

      In this study, Sharma et al. demonstrated that Ly6G+ granulocytes (Gra cells) serve as the primary reservoirs for intracellular Mtb in infected wild-type mice and that excessive infiltration of these cells is associated with severe bacteremia in genetically susceptible IFNγ-/- mice. Notably, neutralizing IL-17 or inhibiting COX2 reversed the excessive infiltration of Ly6G+Gra cells, mitigated the associated pathology, and improved survival in these susceptible mice. Additionally, Ly6G+Gra cells were identified as a major source of IL-17 in both wild-type and IFNγ-/- mice. Inhibition of RORγt or COX2 further reduced the intracellular bacterial burden in Ly6G+Gra cells and improved lung pathology.

      Of particular interest, COX2 inhibition in wild-type mice also enhanced the efficacy of the BCG vaccine by targeting the Ly6G+Gra-resident Mtb population.

      Thank you for the summary.

      Strengths:

      The experimental results showing improved BCG-mediated protective immunity through targeting IL-17-producing Ly6G+ cells and COX2 are compelling and will likely generate significant interest in the field. Overall, this study presents important findings, suggesting that the IL-17-COX2 axis could be a critical target for designing innovative vaccination strategies for TB.

      Thank you for highlighting the overall strengths of the study.

      Weaknesses:

      However, I have the following concerns regarding some of the conclusions drawn from the experiments, which require additional experimental evidence to support and strengthen the overall study.

      Major Concerns:

      (1) Ly6G+ Granulocytes as a Source of IL-17: The authors assert that Ly6G+ granulocytes are the major source of IL-17 in wild-type and IFN-γ KO mice based on colocalization studies of Ly6G and IL-17. In Figure 3D, they report approximately 500 Ly6G+ cells expressing IL-17 in the Mtb-infected WT lung. Are these low numbers sufficient to drive inflammatory pathology? Additionally, have the authors evaluated these numbers in IFN-γ KO mice? 

      Thank you for pointing out the numbers in Fig. 3D. It was our oversight to label the axis as No. of IL17+Ly6G+Gra/lung. For this data, only a part of the lung was used. For the revised manuscript, we will provide the number of these cells at the whole lung level from Mtb-infected WT mice. Unfortunately, we did not evaluate these numbers in IFN-γ KO mice through FACS.

      For the assertion that Ly6G+Gra are the major source of IL-17 in TB, we have used two separate strategies- a) IFA and b) FACS. 

      However, as described above in response to the first reviewer, for the revision, we propose to perform an IL-17 ELISpot assay on the sorted Ly6G+Gra from the lungs of Mtb-infected WT mice.

      (2) Role of IL-17-Producing Ly6G Granulocytes in Pathology: The authors suggest that IL17-producing Ly6G granulocytes drive pathology in WT and IFN-γ KO mice. However, the data presented only demonstrate an association between IL-17+ Ly6G cells and disease pathology. To strengthen their conclusion, the authors should deplete neutrophils in these mice to show that IL-17 expression, and consequently the pathology, is reduced.

      Thank you for this suggestion. Others have done neutrophil depletion studies in TB, and so far, the outcomes remain inconclusive. In some studies, neutrophil depletion helps the pathogen (Rankin et al. 2022; Pedrosa et al. 2000; Appelberg et al. 1995), and in others, it helps the host (Lovewell et al. 2021; Mishra et al. 2017). One reason for this variability is the stage of infection when neutrophil depletion was done. However, another crucial factor is the heterogeneity in the neutrophil population. There are reports that suggest neutrophil subtypes with protective versus pathological trajectories (Nwongbouwoh Muefong et al. 2022; Lyadova 2017; Hellebrekers, Vrisekoop, and Koenderman 2018; Leliefeld et al. 2018). Depleting the entire population using anti-Ly6G could impact this heterogeneity and may impact the inferences drawn. A better approach would be to characterise this heterogeneous population, efforts towards which could be part of a separate study.

      For the revised manuscript, we will provide results from the SR2211 experiment in BCG-vaccinated mice and other results to show the role of IL-17-producing Ly6G+Gra in TB pathology.   

      (3) IL-17 Secretion by Mtb-Infected Neutrophils: Do Mtb-infected neutrophils secrete IL-17 into the supernatants? This would serve as confirmation of neutrophil-derived IL-17. Additionally, are Ly6G+ cells producing IL-17 and serving as pathogenic agents exclusively in vivo? The authors should provide comments on this.

      We have not directly measured IL-17 secretion by neutrophils in our experiments. However, Hu et al. have reported IL-17 secretion by Mtb-infected neutrophils in vitro (Hu et al. 2017). Whether some neutrophil roles are seen exclusively under in vivo conditions is an interesting proposition. We do have some observations that suggest the in vitro phenotype of Mtb-infected neutrophils is different from the in vivo phenotype.

      (4) Characterization of IL-17-Producing Ly6G+ Granulocytes: Are the IL-17-producing Ly6G+ granulocytes a mixed population of neutrophils and eosinophils, or are they exclusively neutrophils? Sorting these cells followed by Giemsa or eosin staining could clarify this.

      This is a very important point. While usually eosinophils do not express Ly6G markers in laboratory mice, under specific contexts, including infections, eosinophils can express Ly6G. Since we have not characterized these potential Ly6G+ sub-populations, that is one of the reasons we refer to the cell types as Ly6G+ granulocytes, which do not exclude Ly6G+ eosinophils. A detailed characterization of these subsets could be taken up as a separate study.

      Reviewer #3 (Public review):

      Summary:

      The authors examine how distinct cellular environments differentially control Mtb following BCG vaccination. The key findings are that IL17-producing PMNs harbor a significant Mtb load in both wild-type and IFNg-/- mice. Targeting IL17 and Cox2 improved disease and enhanced BCG efficacy over 12 weeks and neutrophils/IL17 are associated with treatment failure in humans. The authors suggest that targeting these pathways, especially in MSMD patients may improve disease outcomes.

      Thank you.

      Strengths:

      The experimental approach is generally sound and consists of low-dose aerosol infections with distinct readouts including cell sorting followed by CFU, histopathology, and RNA sequencing analysis. By combining genetic approaches and chemical/antibody treatments, the authors can probe these pathways effectively.

      Understanding how distinct inflammatory pathways contribute to control or worsen Mtb disease is important and thus, the results will be of great interest to the Mtb field.

      Thank you.

      Weaknesses:

      A major limitation of the current study is overlooking the role of non-hematopoietic cells in the IFNg/IL17/neutrophil response. Chimera studies from Ernst and colleagues (PMCID: PMC2807991) previously described this IDO-dependent pathway following the loss of IFNg through an increased IL17 response. This study is not cited nor discussed even though it may alter the interpretation of several experiments.

      Thank you for pointing out this earlier study, which we concede we missed discussing. We disagree on the point that results from that study may alter the interpretation of several experiments in our study. On the contrary, the main observation that loss of IFNγ causes severely elevated IL-17 levels is aligned in both studies.

      IDO1 is known to alter Th cell differentiation towards Tregs and away from Th17 (Baban et al. 2009). It is absolutely feasible for the non-hematopoietic cells to regulate these events. However, that does not rule out the neutrophil production of IL-17 and the downstream pathological effect shown in this study. We will discuss and cite this study in the revised manuscript.

      Several of the key findings in mice have previously been shown (albeit with less sophisticated experimentation) and human disease and neutrophils are well described - thus the real new finding is how intracellular Mtb in neutrophils are more refractory to BCG-mediated control. However, given there are already high levels of Mtb in PMNs compared to other cell types, and there is a decrease in intracellular Mtb in PMNs following BCG immunization the strength of this finding is a bit limited.

      The reviewer’s interpretation of the BCG-refractory Mtb population in the neutrophil is interesting. The reviewer is right that neutrophils had a higher intracellular Mtb burden, which decreased in the BCG-vaccinated animals. Thus, on that account, the reviewer rightly mentions that BCG is able to control Mtb even in neutrophils. However, BCG almost clears intracellular burden from other cell types analysed, and therefore, the remnant pool of intracellular Mtb in the lungs of BCG-vaccinated animals could be mostly those present in the neutrophils. This is a substantial novel development in the field and attracts focus towards innate immune cells for vaccine efficacy. 

      References:

      Appelberg, R., A. G. Castro, S. Gomes, J. Pedrosa, and M. T. Silva. 1995. 'Susceptibility of beige mice to Mycobacterium avium: role of neutrophils', Infect Immun, 63: 3381-7.

      Baban, B., P. R. Chandler, M. D. Sharma, J. Pihkala, P. A. Koni, D. H. Munn, and A. L. Mellor. 2009. 'IDO activates regulatory T cells and blocks their conversion into Th17-like T cells', J Immunol, 183: 2475-83.

      Choi, H. G., K. W. Kwon, S. Choi, Y. W. Back, H. S. Park, S. M. Kang, E. Choi, S. J. Shin, and H. J. Kim. 2020. 'Antigen-Specific IFN-gamma/IL-17-Co-Producing CD4(+) T-Cells Are the Determinants for Protective Efficacy of Tuberculosis Subunit Vaccine', Vaccines (Basel), 8.

      Cruz, A., A. G. Fraga, J. J. Fountain, J. Rangel-Moreno, E. Torrado, M. Saraiva, D. R. Pereira, T. D. Randall, J. Pedrosa, A. M. Cooper, and A. G. Castro. 2010. 'Pathological role of interleukin 17 in mice subjected to repeated BCG vaccination after infection with Mycobacterium tuberculosis', J Exp Med, 207: 1609-16.

      Desel, C., A. Dorhoi, S. Bandermann, L. Grode, B. Eisele, and S. H. Kaufmann. 2011. 'Recombinant BCG DeltaureC hly+ induces superior protection over parental BCG by simulating a balanced combination of type 1 and type 17 cytokine responses', J Infect Dis, 204: 1573-84.

      Ferretti, S., O. Bonneau, G. R. Dubois, C. E. Jones, and A. Trifilieff. 2003. 'IL-17, produced by lymphocytes and neutrophils, is necessary for lipopolysaccharide-induced airway neutrophilia: IL-15 as a possible trigger', J Immunol, 170: 2106-12.

      Hellebrekers, P., N. Vrisekoop, and L. Koenderman. 2018. 'Neutrophil phenotypes in health and disease', Eur J Clin Invest, 48 Suppl 2: e12943.

      Hoshino, A., T. Nagao, N. Nagi-Miura, N. Ohno, M. Yasuhara, K. Yamamoto, T. Nakayama, and K. Suzuki. 2008. 'MPO-ANCA induces IL-17 production by activated neutrophils in vitro via classical complement pathway-dependent manner', J Autoimmun, 31: 79-89.

      Hu, S., W. He, X. Du, J. Yang, Q. Wen, X. P. Zhong, and L. Ma. 2017. 'IL-17 Production of Neutrophils Enhances Antibacteria Ability but Promotes Arthritis Development During Mycobacterium tuberculosis Infection', EBioMedicine, 23: 88-99.

      Hult, C., J. T. Mattila, H. P. Gideon, J. J. Linderman, and D. E. Kirschner. 2021. 'Neutrophil Dynamics Affect Mycobacterium tuberculosis Granuloma Outcomes and Dissemination', Front Immunol, 12: 712457.

      Katayama, M., K. Ohmura, N. Yukawa, C. Terao, M. Hashimoto, H. Yoshifuji, D. Kawabata, T. Fujii, Y. Iwakura, and T. Mimori. 2013. 'Neutrophils are essential as a source of IL-17 in the effector phase of arthritis', PLoS One, 8: e62231.

      Khader, S. A., G. K. Bell, J. E. Pearl, J. J. Fountain, J. Rangel-Moreno, G. E. Cilley, F. Shen, S. M. Eaton, S. L. Gaffen, S. L. Swain, R. M. Locksley, L. Haynes, T. D. Randall, and A. M. Cooper. 2007. 'IL-23 and IL-17 in the establishment of protective pulmonary CD4+ T cell responses after vaccination and during Mycobacterium tuberculosis challenge', Nat Immunol, 8: 369-77.

      Leliefeld, P. H. C., J. Pillay, N. Vrisekoop, M. Heeres, T. Tak, M. Kox, S. H. M. Rooijakkers, T. W. Kuijpers, P. Pickkers, L. P. H. Leenen, and L. Koenderman. 2018. 'Differential antibacterial control by neutrophil subsets', Blood Adv, 2: 1344-55.

      Lemos, H. P., R. Grespan, S. M. Vieira, T. M. Cunha, W. A. Verri, Jr., K. S. Fernandes, F. O. Souto, I. B. McInnes, S. H. Ferreira, F. Y. Liew, and F. Q. Cunha. 2009. 'Prostaglandin mediates IL-23/IL-17-induced neutrophil migration in inflammation by inhibiting IL-12 and IFNgamma production', Proc Natl Acad Sci U S A, 106: 5954-9.

      Li, L., L. Huang, A. L. Vergis, H. Ye, A. Bajwa, V. Narayan, R. M. Strieter, D. L. Rosin, and M. D. Okusa. 2010. 'IL-17 produced by neutrophils regulates IFN-gamma-mediated neutrophil migration in mouse kidney ischemia-reperfusion injury', J Clin Invest, 120: 331-42.

      Lin, A. M., C. J. Rubin, R. Khandpur, J. Y. Wang, M. Riblett, S. Yalavarthi, E. C. Villanueva, P. Shah, M. J. Kaplan, and A. T. Bruce. 2011. 'Mast cells and neutrophils release IL-17 through extracellular trap formation in psoriasis', J Immunol, 187: 490-500.

      Lovewell, R. R., C. E. Baer, B. B. Mishra, C. M. Smith, and C. M. Sassetti. 2021. 'Granulocytes act as a niche for Mycobacterium tuberculosis growth', Mucosal Immunol, 14: 229-41.

      Lyadova, I. V. 2017. 'Neutrophils in Tuberculosis: Heterogeneity Shapes the Way?', Mediators Inflamm, 2017: 8619307.

      Mishra, B. B., R. R. Lovewell, A. J. Olive, G. Zhang, W. Wang, E. Eugenin, C. M. Smith, J. Y. Phuah, J. E. Long, M. L. Dubuke, S. G. Palace, J. D. Goguen, R. E. Baker, S. Nambi, R. Mishra, M. G. Booty, C. E. Baer, S. A. Shaffer, V. Dartois, B. A. McCormick, X. Chen, and C. M. Sassetti. 2017. 'Nitric oxide prevents a pathogen-permissive granulocytic inflammation during tuberculosis', Nat Microbiol, 2: 17072.

      Napolitani, G., E. V. Acosta-Rodriguez, A. Lanzavecchia, and F. Sallusto. 2009. 'Prostaglandin E2 enhances Th17 responses via modulation of IL-17 and IFN-gamma production by memory CD4+ T cells', Eur J Immunol, 39: 1301-12.

      Nwongbouwoh Muefong, C., O. Owolabi, S. Donkor, S. Charalambous, A. Bakuli, A. Rachow, C. Geldmacher, and J. S. Sutherland. 2022. 'Neutrophils Contribute to Severity of Tuberculosis Pathology and Recovery From Lung Damage Pre- and Posttreatment', Clin Infect Dis, 74: 1757-66.

      Paulissen, S. M., J. P. van Hamburg, N. Davelaar, P. S. Asmawidjaja, J. M. Hazes, and E. Lubberts. 2013. 'Synovial fibroblasts directly induce Th17 pathogenicity via the cyclooxygenase/prostaglandin E2 pathway, independent of IL-23', J Immunol, 191: 1364-72.

      Pedrosa, J., B. M. Saunders, R. Appelberg, I. M. Orme, M. T. Silva, and A. M. Cooper. 2000. 'Neutrophils play a protective nonphagocytic role in systemic Mycobacterium tuberculosis infection of mice', Infect Immun, 68: 577-83.

      Polese, B., B. Thurairajah, H. Zhang, C. L. Soo, C. A. McMahon, G. Fontes, S. N. A. Hussain, V. Abadie, and I. L. King. 2021. 'Prostaglandin E(2) amplifies IL-17 production by gamma-delta T cells during barrier inflammation', Cell Rep, 36: 109456.

      Rankin, A. N., S. V. Hendrix, S. K. Naik, and C. L. Stallings. 2022. 'Exploring the Role of Low-Density Neutrophils During Mycobacterium tuberculosis Infection', Front Cell Infect Microbiol, 12: 901590.

      Xu, X. J., Q. Q. Ge, M. S. Yang, Y. Zhuang, B. Zhang, J. Q. Dong, F. Niu, H. Li, and B. Y. Liu. 2023. 'Neutrophil-derived interleukin-17A participates in neuroinflammation induced by traumatic brain injury', Neural Regen Res, 18: 1046-51.

    1. eLife assessment

      This useful study investigates the role of Complement 3a Receptor 1 (C3aR) in the pathogenesis of Metabolic Dysfunction-Associated Steatotic Liver Disease (MASLD) using mouse models with specific target deletions in various cell types. While the relevance of C3aR in inflammatory contexts has been established, the authors provide helpful but incomplete evidence that C3aR does not contribute significantly to MASLD pathogenesis in their models, a claim that would require additional experiments for support.

    2. Reviewer #1 (Public review):

      Summary:

      In this paper Homan et al used mouse models of Metabolic Dysfunction-Associated Steatotic Liver Disease and different specific target deletions in cells to rule out the role of Complement 3a Receptor 1 in the pathogenesis of disease. They provided limited evidence and only descriptive results that despite C3aR being relevant in different contexts of inflammation, however, these tenets did not hold true.

      Weaknesses:

      (1) The results are based on readouts showing that C3aR is not involved in the pathogenesis of liver metabolic disease.

      (2) The description of the mouse models they used to validate their findings is not clear. Lysm-cre mice - which are claimed to delete C3aR in (?) macrophages are not specific for these cells, and the genetic strategy to delete C3aR in Kupffer cells is not clear.

      (3) Taking this into account, it is very challenging to determine the validity of these data, also considering that they are merely descriptive and correlative.

    3. Reviewer #2 (Public review):

      Summary:

      Homan et al. examined the effect of macrophage- or Kupffer cell-specific C3aR1 KO on MASLD/MASH-related metabolic or liver phenotypes.

      Strengths:

      Established macrophage- or Kupffer cell-specific C3aR1 KO mice.

      Weaknesses:

      Lack of in-depth study; flaws in comparisons between KC-specific C3aR1KO and WT in the context of MASLD/MASH, because MASLD/MASH WT mice likely have a low abundance of C3aR1 on KCs.

      Homan et al. reported a set of observation data from macrophage or Kupffer cell-specific C3aR1KO mice. Several questions and concerns as follows could challenge the conclusions of this study:

      (1) As C3aR1 is robustly repressed in MASLD or MASH liver, GAN feeding likely reduced C3aR1 abundance in the liver of WT mice. Thus, it is not surprising that there were no significant differences in liver phenotypes between WT vs. C3aR1KO mice after prolonged GAN diet feeding. It would give more significance to the study if restoring C3aR1 abundance in KCs in the context of MASLD/MASH.

      (2) Would C3aR1KO mice develop liver abnormalities after a short period of GAN diet feeding?

      (3) What would be the liver macrophage phenotypes in WT vs C3aR1KO mice after GAN feeding?

      (4) In Fig 1D, >25wks GAN feeding had minimal effects on female body weight gain. Do these GAN-fed female mice also develop MASLD/MASH liver abnormalities?

      (5) Would C3aR1KO result in differences in liver phenotypes, including macrophage population/activation, liver inflammation, lipogenesis, in lean mice?

      (6) The authors should provide more information regarding the generation of KC-specific C3aR1KO. Which Cre mice were used to breed with C3aR1 flox mice?

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary: 

      In this paper Homan et al used mouse models of Metabolic Dysfunction-Associated Steatotic Liver Disease and different specific target deletions in cells to rule out the role of Complement 3a Receptor 1 in the pathogenesis of disease. They provided limited evidence and only descriptive results that despite C3aR being relevant in different contexts of inflammation, however, these tenets did not hold true. 

      Weaknesses: 

      (1) The results are based on readouts showing that C3aR is not involved in the pathogenesis of liver metabolic disease. 

      (2) The description of the mouse models they used to validate their findings is not clear. Lysm-cre mice - which are claimed to delete C3aR in (?) macrophages are not specific for these cells, and the genetic strategy to delete C3aR in Kupffer cells is not clear. 

      (3) Taking this into account, it is very challenging to determine the validity of these data, also considering that they are merely descriptive and correlative. 

      We generated 2 different cohorts of mice using LysM-Cre (Jackson Strain #004781) to drive deletion in all macrophages and Clec4f-Cre (Jackson Strain #033296) to specifically ablate C3ar1 in Kupffer cells. We will ensure that the experimental models are clearly defined in the revised manuscript. The reviewer’s point is well taken that the LysM-Cre transgene can also be active in granulocytes and some dendritic cells. Even so, despite deletion of C3ar1 in macrophages and granulocytes, we do not see a major effect on hepatic steatosis and fibrosis in this GAN diet induced model of MASLD/MASH. This was a somewhat surprising finding. We do not agree that our findings are correlative. We specifically ablated C3aR1 in macrophages or Kupffer cells and found no significant differences in the major readouts of steatosis and fibrosis for MASLD/MASH between control and knockout mice. It is possible that in other models of liver injury that we did not test (e.g., short-term treatment with a hepatotoxin such as carbon tetrachloride), there may be differences in liver injury in mice lacking C3ar1 in macrophages, but the GAN diet model has been shown to better parallel the gene expression changes in human MASLD/MASH.

      Reviewer #2 (Public review):

      Summary:

      Homan et al. examined the effect of macrophage- or Kupffer cell-specific C3aR1 KO on MASLD/MASH-related metabolic or liver phenotypes.

      Strengths:

      Established macrophage- or Kupffer cell-specific C3aR1 KO mice. 

      Weaknesses:

      Lack of in-depth study; flaws in comparisons between KC-specific C3aR1KO and WT in the context of MASLD/MASH, because MASLD/MASH WT mice likely have a low abundance of C3aR1 on KCs. 

      Homan et al. reported a set of observation data from macrophage or Kupffer cell-specific C3aR1KO mice. Several questions and concerns as follows could challenge the conclusions of this study: 

      (1) As C3aR1 is robustly repressed in MASLD or MASH liver, GAN feeding likely reduced C3aR1 abundance in the liver of WT mice. Thus, it is not surprising that there were no significant differences in liver phenotypes between WT vs. C3aR1KO mice after prolonged GAN diet feeding. It would give more significance to the study if restoring C3aR1 abundance in KCs in the context of MASLD/MASH. 

      GAN diet feeding resulted in higher liver C3ar1 compared to regular diet (Figure 1H). This thus became an impetus for studying the effects of C3ar1 deletion in macrophages or Kupffer cells, which are responsible for the majority of liver C3ar1 expression, in MASLD/MASH (Figures 2B and 3H).  

      (2) Would C3aR1KO mice develop liver abnormalities after a short period of GAN diet feeding?  

      We did not assess whether short-term GAN diet feeding resulted in significant differences in liver abnormalities in the C3ar1 macrophage or Kupffer cell knockout mice. Perhaps the reviewer’s point is that with shorter periods of GAN diet feeding there may be a phenotype in the KO mice. We agree that this is entirely possible, though with shorter feeding timeframes what is typically seen is hepatic steatosis without fibrosis. Nevertheless, the most important element in our opinion for a disease preventing or modifying model lies with the longer-term GAN diet feeding. With long-term GAN diet feeding that has been previously shown to model human MASLD/MASH, we did not observe significant differences in liver abnormalities with the KO mice.

      (3) What would be the liver macrophage phenotypes in WT vs C3aR1KO mice after GAN feeding? 

      Similar to the above point, given the lack of a major MASLD/MASH phenotype in hepatic steatosis and fibrosis, we did not further characterize the liver macrophage profiles of the macrophage or Kupffer cell C3ar1 KO mice with GAN feeding.  

      (4) In Fig 1D, >25wks GAN feeding had minimal effects on female body weight gain. Do these GAN-fed female mice also develop MASLD/MASH liver abnormalities? 

      We thank the reviewer for this question. In general, female GAN-fed mice develop milder MASLD/MASH abnormalities. We will include additional data in the revised manuscript.

      (5) Would C3aR1KO result in differences in liver phenotypes, including macrophage population/activation, liver inflammation, lipogenesis, in lean mice? 

      Likewise, we will include data further characterizing liver inflammation, lipogenesis and macrophages in macrophage C3ar1 KO mice under lean/regular diet conditions.

      (6) The authors should provide more information regarding the generation of KC-specific C3aR1KO. Which Cre mice were used to breed with C3aR1 flox mice? 

      Clec4f-Cre transgenic mice were used to generate the Kupffer cell-specific KO of C3ar1. This will be clarified and explicitly stated in the revised manuscript.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study puts forth the model that under IFN-B stimulation, liquid-phase WTAP coordinates with the transcription factor STAT1 to recruit MTC to the promoter region of interferon-stimulated genes (ISGs), mediating the installation of m6A on newly synthesized ISG mRNAs. This model is supported by strong evidence that the phosphorylation state of WTAP, regulated by PPP4, is regulated by IFN-B stimulation, and that this results in interactions between WTAP, the m6A methyltransferase complex, and STAT1, a transcription factor that mediates activation of ISGs. This was demonstrated via a combination of microscopy, immunoprecipitations, m6A sequencing, and ChIP. These experiments converge on a set of experiments that nicely demonstrate that IFN-B stimulation increases the interaction between WTAP, METTL3, and STAT1, that this interaction is lost with the knockdown of WTAP (even in the presence of IFN-B), and that this IFN-B stimulation also induces METTL3-ISG interactions.

      Strengths:

      The evidence for the IFN-B stimulated interaction between METTL3 and STAT1, mediated by WTAP, is quite strong. Removal of WTAP in this system seems to be sufficient to reduce these interactions and the concomitant m6A methylation of ISGs. The conclusion that the phosphorylation state of WTAP is important in this process is also quite well supported.

      Weaknesses:

      The evidence that the above mechanism is fundamentally driven by different phase-separated pools of WTAP (regulated by its phosphorylation state) is weaker. These experiments rely relatively heavily on the treatment of cells with 1,6-hexanediol, which has been shown to have some off-target effects on phosphatases and kinases (PMID 33814344).

      Given that the model invoked in this study depends on the phosphorylation (or lack thereof) of WTAP, this is a particularly relevant concern.

      Related to this point, it is also interesting (and potentially concerning for the proposed model) that the initial region of WTAP that was predicted to be disordered is in fact not the region that the authors demonstrate is important for the different phase-separated states. Taking all the data together, it is also not clear to me that one has to invoke phase separation in the proposed mechanism.

      We are grateful for the Reviewer’s positive comments and constructive feedback. In this article, we propose a novel and important mechanism in which a dephosphorylation-driven solid-to-liquid phase transition of WTAP mediates co-transcriptional m6A modification. We first observed that WTAP underwent a phase transition during virus infection and IFN-β stimulation, and confirmed the driving force of this phase transition through multiple experiments. In addition to 1,6‐hexanediol (1,6-hex) treatment, we introduced S/T-to-D/A mutations to mimic phosphorylated and dephosphorylated WTAP in vitro and in cells, and identified the 5ST-D mutant as an SLPS mutant and the 5ST-A mutant as an LLPS mutant. We then performed 1,6-hex experiments to confirm the importance of phase separation for WTAP function, and revealed that the 5ST-D SLPS mutant and the 5ST-A LLPS mutant had different effects on the WTAP-promoter region interaction and on co-transcriptional m6A modification. Following the reviewer’s suggestion, we need to further clarify the role of phosphorylation in WTAP phase separation. We plan to repeat the experiments using the potent PP4 inhibitor fostriecin, and to perform further experiments to explore the effect of the WTAP IDR domain, which is reported to play a critical role in its phase separation.

      1,6-hex was initially regarded as an inhibitor of the hydrophobic interactions involved in various kinds of protein-protein interaction, which means that off-target effects of 1,6-hex are inevitable. It has been reported that 1,6-hex at 5% concentration impairs RNA pol II CTD-specific phosphatase and kinase activity (3). However, 1,6-hex is still widely used in LLPS-associated functional studies despite its off-target effects. Relevant to this article, 10% 1,6-hex was reported to dissolve WTAP phase separation droplets (2). Besides WTAP, 1,6-hex (5%-10% w/v) has also been used to explore the phase separation characteristics and function of phosphorylated proteins or even kinases, including p‐tau441, TAZ, HSF1 and so on (4-6). 10% 1,6-hex inhibited the crucial role of phosphorylation-driven HSF1 LLPS in chromatin binding and transcription, as shown by an RNA-seq dataset (6), indicating that the effect of 1,6-hex on kinases or phosphatases might not be global. To avoid 1,6-hex-mediated kinase/phosphatase impairment in this project, we introduced the WTAP SLPS and LLPS mutations in addition to 1,6-hex treatment to explore the m6A modification function of the WTAP phase transition. We plan to repeat the experiments with a lower 1,6-hex concentration, check the WTAP phosphorylation status after 1,6-hex treatment, and address these points in the Discussion.

      A considerable number of proteins undergo phase separation via interactions between intrinsically disordered regions (IDRs). IDRs contain more charged and polar amino acids, which present multiple weakly interacting elements, and lack hydrophobic amino acids, which allows flexible conformations (7). In our article, we used the PLAAC website (http://plaac.wi.mit.edu/) to predict the IDR domain of WTAP, and a fragment (amino acids 234-249) was predicted to be a prion-like domain. However, deletion of this fragment failed to abolish the phase separation properties of WTAP, which might be the main source of confusion for the reviewers. To address this issue, we checked the WTAP structure (as part of the MTC complex) in the Protein Data Bank (https://www.rcsb.org/structure/7VF2) and found that the IDR prediction has been revised owing to an updated algorithm. The IDR of WTAP has expanded to amino acids 245-396, containing the whole CTD region. According to our results, lack of the CTD inhibited WTAP liquid-liquid phase separation both in vitro and in cells, while the phosphorylation status of the CTD had a dramatic impact on the WTAP phase transition, which is consistent with the LLPS-regulating function of an IDR. Therefore, we will revise our description of the WTAP IDR and perform further experiments to test its function.

      Taken together, given the strong association of WTAP phosphorylation with its phase separation status and function during IFN-β stimulation, it is necessary to include WTAP phase separation in our mechanism. We will perform further experiments to provide more convincing evidence and strengthen our study.

      Reviewer #2 (Public review):

      In this study, Cai and colleagues investigate how one component of the m6A methyltransferase complex, the WTAP protein, responds to IFNb stimulation. They find that viral infection or IFNb stimulation induces the transition of WTAP from aggregates to liquid droplets through dephosphorylation by PPP4. This process affects the m6A modification levels of ISG mRNAs and modulates their stability. In addition, the WTAP droplets interact with the transcription factor STAT1 to recruit the methyltransferase complex to ISG promoters and enhance m6A modification during transcription. The investigation dives into a previously unexplored area of how viral infection or IFNb stimulation affects m6A modification on ISGs. The observation that WTAP undergoes a phase transition is significant in our understanding of the mechanisms underlying m6A's function in immunity. However, there are still key gaps that should be addressed to fully accept the model presented.

      Major points:

      (1) More detailed analyses on the effects of WTAP sgRNA on the m6A modification of ISGs:

      a. A comprehensive summary of the ISGs, including the percentage of ISGs that are m6A-modified.

      b. The distribution of m6A modification across the ISGs.

      c. A comparison of the m6A modification distribution in ISGs with non-ISGs.

      In addition, since the authors propose a novel mechanism where the interaction between phosphorylated STAT1 and WTAP directs the MTC to the promoter regions of ISGs to facilitate co-transcriptional m6A modification, it is critical to analyze whether the m6A modification distribution holds true in the data.

      We appreciate the reviewer’s summary of our manuscript and the constructive assessment. We plan to perform the related analyses to present the m6A modification of ISGs in our model. 

      (2) Since a key part of the model includes the cytosol-localized STAT1 protein undergoing phosphorylation to translocate to the nucleus to mediate gene expression, the authors should focus on the interaction between phosphorylated STAT1 and WTAP in Figure 4, rather than the unphosphorylated STAT1. Only phosphorylated STAT1 localizes to the nucleus, so the presence of pSTAT1 in the immunoprecipitate is critical for establishing a functional link between STAT1 activation and its interaction with WTAP.

      We plan to repeat the immunoprecipitation experiments to clarify the function of pSTAT1 in WTAP interaction and m6A modification as the reviewer suggested.

      (3) The authors should include pSTAT1 ChIP-seq and WTAP ChIP-seq on IFNb-treated samples in Figure 5 to allow for a comprehensive and unbiased genomic analysis for comparing the overlaps of peaks from both ChIP-seq datasets. These results should further support their hypothesis that WTAP interacts with pSTAT1 to enhance m6A modifications on ISGs.

      We first performed MeRIP-seq and RNA-seq and explored the critical role of WTAP in ISG m6A modification and expression. Through immunoprecipitation and immunofluorescence experiments, we found that the phase transition of WTAP enhanced its interaction with pSTAT1. These results indicate that WTAP mediates ISG m6A modification and expression through enhanced interaction with pSTAT1 during virus infection and IFN-β stimulation. However, we were still not sure how WTAP-mediated m6A modification relates to pSTAT1-mediated transcription. By analyzing METTL3 ChIP-seq or caPAR-CLIP-seq data, several studies have revealed that the m6A methylation complex (MTC) is recruited to the transcription start sites (TSS) of coding genes and to R-loop structures through interaction with the transcription factor STAT5B or the DNA helicase DDX21, indicating that the MTC mediates m6A modification of nascent transcripts at the very beginning of transcription (8-10). Thus, we proposed that phase-transitioned WTAP could be recruited to ISG promoter regions by pSTAT1, and verified this hypothesis by pSTAT1/WTAP ChIP-qPCR. We believe a ChIP-seq experiment is a good idea for exploring the mechanism in depth, but the results in this article are for now sufficient to support our mechanism. We will continue to focus on the genome-wide chromatin distribution of WTAP and explore further functional effects of the transcription factor-dependent WTAP-promoter region interaction in the future.

      Minor points:

      (1) Since IFNb is primarily known for modulating biological processes through gene transcription, it would be informative if the authors discussed the mechanism of how IFNb would induce the interaction between WTAP and PPP4.

      (2) The authors should include mCherry alone controls in Figure 1D to demonstrate that mCherry does not contribute to the phase separation of WTAP. Does mCherry have or lack a PLD?

      (3) The authors should clarify the immunoprecipitation assays in the methods. For example, the labeling in Figure 2A suggests that antibodies against WTAP and pan-p were used for two immunoprecipitations. Is that accurate?

      (4) The authors should include overall m6A modification levels quantified of GFPsgRNA and WTAPsgRNA cells, either by mass spectrometry (preferably) or dot blot.

      We thank the reviewer for these useful suggestions. We will perform the related experiments and revise the manuscript carefully as the reviewer suggested.

      Reviewer #3 (Public review):

      Summary:

      This study presents a valuable finding on the mechanism used by WTAP to modulate the IFN-β stimulation. It describes the phase transition of WTAP driven by IFN-β-induced dephosphorylation. The evidence supporting the claims of the authors is solid, although major analysis and controls would strengthen the impact of the findings. Additionally, more attention to the figure design and to the text would help the reader to understand the major findings.

      Strength:

      The key finding is the revelation that WTAP undergoes phase separation during virus infection or IFN-β treatment. The authors conducted a series of precise experiments to uncover the mechanism behind WTAP phase separation and identified the regulatory role of 5 phosphorylation sites. They also succeeded in pinpointing the phosphatase involved.

      Weaknesses:

      However, as the authors acknowledge, it is already widely known in the field that IFN and viral infection regulate m6A mRNAs and ISGs. Therefore, a more detailed discussion could help the reader interpret the obtained findings in light of previous research.

      It is well-known that protein concentration drives phase separation events. Similarly, previous studies and some of the figures presented by the authors show an increase in WTAP expression upon IFN treatment. The authors do not discuss the contribution of WTAP expression levels to the phase separation event observed upon IFN treatment. Similarly, METTL3 and METTL14, as well as other proteins of the MTC are upregulated upon IFN treatment. How does the MTC protein concentration contribute to the observed phase separation event?

      How is PP4 related to the IFN signaling cascade?

      In general, it is very confusing to talk about WTAP KO as WTAPgRNA.

      We are grateful for the reviewer’s positive comments and unbiased advice. To interpret our findings in light of previous research, we will revise the manuscript carefully and provide a more detailed discussion of ISG m6A modification during virus infection or IFN stimulation. As previously reported, the WTAP protein level is induced by long-term IFN-β or LPS stimulation, and LPS-induced WTAP expression promotes its phase separation ability (2,11). Although there was no significant upregulation of WTAP expression in our short-term treatment, we hypothesize that WTAP phase separation is promoted by the higher protein concentration after long-term IFN stimulation, enhancing m6A deposition on ISG mRNAs and revealing a feedback loop between WTAP phase separation and m6A modification during specific stimulation. To address the effect of MTC protein concentration on the proposed event, we will perform immunoblotting of MTC proteins and examine phase separation at different WTAP concentrations.

      Protein phosphatase 4 (PP4) is a multi-subunit Ser/Thr phosphatase complex that participates in diverse cellular pathways, including the DNA damage response (DDR), cell cycle progression, and apoptosis (12). Protein phosphatase 4 catalytic subunit 4C (PPP4C) is one of the components of the PP4 complex. Previous research showed that knockout of PPP4C enhanced IFN-β downstream signaling and gene expression, which is consistent with our findings that knockdown of PPP4C impaired WTAP-mediated m6A modification and enhanced ISG expression. Since there was no significant increase in the PPP4C expression level during IFN-β stimulation in our results, we will consider exploring post-translational modifications that may influence the protein-protein interaction, such as ubiquitination.

      In this project, all the WTAP-deficient THP-1 cells were bulk cells treated with WTAPsgRNA, not monoclonal knockout cells. We confirmed that WTAP expression was efficiently knocked down in WTAPsgRNA THP-1 cells and that the m6A modification level was impaired, avoiding a compensatory effect on m6A modification by other proteins. Thus, we prefer to call them WTAPsgRNA THP-1 cells rather than WTAP KO THP-1 cells.  

      References

      (1) Raja, R., Wu, C., Bassoy, E.Y., Rubino, T.E., Jr., Utagawa, E.C., Magtibay, P.M., Butler, K.A., and Curtis, M. (2022). PP4 inhibition sensitizes ovarian cancer to NK cell-mediated cytotoxicity via STAT1 activation and inflammatory signaling. J Immunother Cancer 10. 10.1136/jitc-2022-005026.

      (2) Ge, Y., Chen, R., Ling, T., Liu, B., Huang, J., Cheng, Y., Lin, Y., Chen, H., Xie, X., Xia, G., et al. (2024). Elevated WTAP promotes hyperinflammation by increasing m6A modification in inflammatory disease models. J Clin Invest 134. 10.1172/JCI177932.

      (3) Duster, R., Kaltheuner, I.H., Schmitz, M., and Geyer, M. (2021). 1,6-Hexanediol, commonly used to dissolve liquid-liquid phase separated condensates, directly impairs kinase and phosphatase activities. J Biol Chem 296, 100260. 10.1016/j.jbc.2021.100260.

      (4) Wegmann, S., Eftekharzadeh, B., Tepper, K., Zoltowska, K.M., Bennett, R.E., Dujardin, S., Laskowski, P.R., MacKenzie, D., Kamath, T., Commins, C., et al. (2018). Tau protein liquid-liquid phase separation can initiate tau aggregation. The EMBO journal 37. 10.15252/embj.201798049.

      (5) Lu, Y., Wu, T., Gutman, O., Lu, H., Zhou, Q., Henis, Y.I., and Luo, K. (2020). Phase separation of TAZ compartmentalizes the transcription machinery to promote gene expression. Nat Cell Biol 22, 453-464. 10.1038/s41556-020-0485-0.

      (6) Zhang, H., Shao, S., Zeng, Y., Wang, X., Qin, Y., Ren, Q., Xiang, S., Wang, Y., Xiao, J., and Sun, Y. (2022). Reversible phase separation of HSF1 is required for an acute transcriptional response during heat shock. Nat Cell Biol 24, 340-352. 10.1038/s41556-022-00846-7.

      (7) Hou, S., Hu, J., Yu, Z., Li, D., Liu, C., and Zhang, Y. (2024). Machine learning predictor PSPire screens for phase-separating proteins lacking intrinsically disordered regions. Nat Commun 15, 2147. 10.1038/s41467-024-46445-y.

      (8) Hao, J.D., Liu, Q.L., Liu, M.X., Yang, X., Wang, L.M., Su, S.Y., Xiao, W., Zhang, M.Q., Zhang, Y.C., Zhang, L., et al. (2024). DDX21 mediates co-transcriptional RNA m(6)A modification to promote transcription termination and genome stability. Mol Cell 84, 1711-1726 e1711. 10.1016/j.molcel.2024.03.006.

      (9) Barbieri, I., Tzelepis, K., Pandolfini, L., Shi, J., Millan-Zambrano, G., Robson, S.C., Aspris, D., Migliori, V., Bannister, A.J., Han, N., et al. (2017). Promoter-bound METTL3 maintains myeloid leukaemia by m(6)A-dependent translation control. Nature 552, 126-131. 10.1038/nature24678.

      (10) Bhattarai, P.Y., Kim, G., Lim, S.C., and Choi, H.S. (2024). METTL3-STAT5B interaction facilitates the co-transcriptional m(6)A modification of mRNA to promote breast tumorigenesis. Cancer Lett 603, 217215. 10.1016/j.canlet.2024.217215.

      (11) Ge, Y., Ling, T., Wang, Y., Jia, X., Xie, X., Chen, R., Chen, S., Yuan, S., and Xu, A. (2021). Degradation of WTAP blocks antiviral responses by reducing the m(6) A levels of IRF3 and IFNAR1 mRNA. EMBO Rep 22, e52101. 10.15252/embr.202052101.

      (12) Dong, M.Z., Ouyang, Y.C., Gao, S.C., Ma, X.S., Hou, Y., Schatten, H., Wang, Z.B., and Sun, Q.Y. (2022). PPP4C facilitates homologous recombination DNA repair by dephosphorylating PLK1 during early embryo development. Development 149. 10.1242/dev.200351.

    2. eLife assessment

      This important study demonstrates that interferon beta stimulation induces WTAP transition from aggregates to liquid droplets, coordinating m6A modification of a subset of mRNAs that encode interferon-stimulated genes and restricting their expression. The evidence presented is solid, supported by microscopy, immunoprecipitations, m6A sequencing, and ChIP, to show that WTAP phosphorylation controls phase transition and its interaction with STAT1 and the methyltransferase complex.

    3. Reviewer #1 (Public review):

      Summary:

      This study puts forth the model that under IFN-B stimulation, liquid-phase WTAP coordinates with the transcription factor STAT1 to recruit MTC to the promoter region of interferon-stimulated genes (ISGs), mediating the installation of m6A on newly synthesized ISG mRNAs. This model is supported by strong evidence that the phosphorylation state of WTAP, regulated by PPP4, is regulated by IFN-B stimulation, and that this results in interactions between WTAP, the m6A methyltransferase complex, and STAT1, a transcription factor that mediates activation of ISGs. This was demonstrated via a combination of microscopy, immunoprecipitations, m6A sequencing, and ChIP. These experiments converge on a set of experiments that nicely demonstrate that IFN-B stimulation increases the interaction between WTAP, METTL3, and STAT1, that this interaction is lost with the knockdown of WTAP (even in the presence of IFN-B), and that this IFN-B stimulation also induces METTL3-ISG interactions.

      Strengths:

      The evidence for the IFN-B stimulated interaction between METTL3 and STAT1, mediated by WTAP, is quite strong. Removal of WTAP in this system seems to be sufficient to reduce these interactions and the concomitant m6A methylation of ISGs. The conclusion that the phosphorylation state of WTAP is important in this process is also quite well supported.

      Weaknesses:

      The evidence that the above mechanism is fundamentally driven by different phase-separated pools of WTAP (regulated by its phosphorylation state) is weaker. These experiments rely relatively heavily on the treatment of cells with 1,6-hexanediol, which has been shown to have some off-target effects on phosphatases and kinases (PMID 33814344). Given that the model invoked in this study depends on the phosphorylation (or lack thereof) of WTAP, this is a particularly relevant concern. Related to this point, it is also interesting (and potentially concerning for the proposed model) that the initial region of WTAP that was predicted to be disordered is in fact not the region that the authors demonstrate is important for the different phase-separated states. Taking all the data together, it is also not clear to me that one has to invoke phase separation in the proposed mechanism.

    4. Reviewer #2 (Public review):

      In this study, Cai and colleagues investigate how one component of the m6A methyltransferase complex, the WTAP protein, responds to IFNb stimulation. They find that viral infection or IFNb stimulation induces the transition of WTAP from aggregates to liquid droplets through dephosphorylation by PPP4. This process affects the m6A modification levels of ISG mRNAs and modulates their stability. In addition, the WTAP droplets interact with the transcription factor STAT1 to recruit the methyltransferase complex to ISG promoters and enhance m6A modification during transcription. The investigation dives into a previously unexplored area of how viral infection or IFNb stimulation affects m6A modification on ISGs. The observation that WTAP undergoes a phase transition is significant in our understanding of the mechanisms underlying m6A's function in immunity. However, there are still key gaps that should be addressed to fully accept the model presented.

      Major points:

      (1) More detailed analyses on the effects of WTAP sgRNA on the m6A modification of ISGs:

      a. A comprehensive summary of the ISGs, including the percentage of ISGs that are m6A-modified.

      b. The distribution of m6A modification across the ISGs.

      c. A comparison of the m6A modification distribution in ISGs with non-ISGs.

      In addition, since the authors propose a novel mechanism where the interaction between phosphorylated STAT1 and WTAP directs the MTC to the promoter regions of ISGs to facilitate co-transcriptional m6A modification, it is critical to analyze whether the m6A modification distribution holds true in the data.

      (2) Since a key part of the model includes the cytosol-localized STAT1 protein undergoing phosphorylation to translocate to the nucleus to mediate gene expression, the authors should focus on the interaction between phosphorylated STAT1 and WTAP in Figure 4, rather than the unphosphorylated STAT1. Only phosphorylated STAT1 localizes to the nucleus, so the presence of pSTAT1 in the immunoprecipitate is critical for establishing a functional link between STAT1 activation and its interaction with WTAP.

      (3) The authors should include pSTAT1 ChIP-seq and WTAP ChIP-seq on IFNb-treated samples in Figure 5 to allow for a comprehensive and unbiased genomic analysis for comparing the overlaps of peaks from both ChIP-seq datasets. These results should further support their hypothesis that WTAP interacts with pSTAT1 to enhance m6A modifications on ISGs.

      Minor points:

      (1) Since IFNb is primarily known for modulating biological processes through gene transcription, it would be informative if the authors discussed the mechanism of how IFNb would induce the interaction between WTAP and PPP4.

      (2) The authors should include mCherry alone controls in Figure 1D to demonstrate that mCherry does not contribute to the phase separation of WTAP. Does mCherry have or lack a PLD?

      (3) The authors should clarify the immunoprecipitation assays in the methods. For example, the labeling in Figure 2A suggests that antibodies against WTAP and pan-p were used for two immunoprecipitations. Is that accurate?

      (4) The authors should include overall m6A modification levels quantified of GFPsgRNA and WTAPsgRNA cells, either by mass spectrometry (preferably) or dot blot.

    5. Reviewer #3 (Public review):

      Summary:

      This study presents a valuable finding on the mechanism used by WTAP to modulate the IFN-β stimulation. It describes the phase transition of WTAP driven by IFN-β-induced dephosphorylation. The evidence supporting the claims of the authors is solid, although major analysis and controls would strengthen the impact of the findings. Additionally, more attention to the figure design and to the text would help the reader to understand the major findings.

      Strength:

      The key finding is the revelation that WTAP undergoes phase separation during virus infection or IFN-β treatment. The authors conducted a series of precise experiments to uncover the mechanism behind WTAP phase separation and identified the regulatory role of 5 phosphorylation sites. They also succeeded in pinpointing the phosphatase involved.

      Weaknesses:

      However, as the authors acknowledge, it is already widely known in the field that IFN and viral infection regulate m6A mRNAs and ISGs. Therefore, a more detailed discussion could help the reader interpret the obtained findings in light of previous research.

      It is well-known that protein concentration drives phase separation events. Similarly, previous studies and some of the figures presented by the authors show an increase in WTAP expression upon IFN treatment. The authors do not discuss the contribution of WTAP expression levels to the phase separation event observed upon IFN treatment. Similarly, METTL3 and METTL14, as well as other proteins of the MTC are upregulated upon IFN treatment. How does the MTC protein concentration contribute to the observed phase separation event?

      How is PP4 related to the IFN signaling cascade?

      In general, it is very confusing to talk about WTAP KO as WTAPgRNA.

    1. eLife assessment

      This valuable study confirms the association between the human leukocyte antigen (HLA)-II region and tuberculosis (TB) susceptibility in genetically admixed South African populations, specifically identifying a near-genome-wide significant association in the HLA-DPB1 gene, which originates from KhoeSan ancestry. Whilst some of the evidence supporting the association between the HLA-II region and TB susceptibility is solid, the analysis is incomplete and requires further work for the study to achieve its full value. The work will be of interest to those studying the genetic basis of tuberculosis susceptibility/infection resistance.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript is about using different analytical approaches to allow ancestry adjustments to GWAS analyses amongst admixed populations. This work is a follow-on from the recently published ITHGC multi-population GWAS (https://doi.org/10.7554/eLife.84394), with a focus on the admixed South African populations. Ancestry adjustment models detected a peak of SNPs in the class II HLA DPB1, distinct from the class II HLA DQA1 loci significant in the ITHGC analysis.

      Strengths:

      Excellent demonstration of GWAS analytical pipelines in highly admixed populations. Further confirmation of the importance of the HLA class II locus in genetic susceptibility to TB.

      Weaknesses:

      Limited novelty compared to the group's previous publications and the body of work linking HLA class II alleles with TB susceptibility in South Africa or other African populations. This work includes only ~100 new cases and controls beyond what has already been published. High-resolution HLA typing has detected significant signals in both the DQA1 and DPB1 regions identified by the larger ITHGC and in this GWAS analysis respectively (Chihab L et al. HLA. 2023 Feb; 101(2): 124-137).

      Despite the availability of strong methods for imputing HLA from GWAS data (Karnes J et al. PLoS One 2017), the authors did not confirm with HLA typing the importance of their SNP peak in the class II region. This would have supported the importance of this ancestry adjustment versus prior ITHGC analysis.

      The study populations comprise active TB cases and healthy controls (from high-burden, presumed exposed communities), and the authors do not provide QFT or other data to identify latent TB infection.

      Important methodological points for clarification and for readers to be aware of when reading this paper:

      (1) One of the reasons cited for the lack of African ancestry-specific associations or suggestive peaks in the ITHGC study was the small African sample size. The current association test includes a larger African cohort and yields a near-genome-wide significant association in the HLA-DPB1 gene originating from KhoeSan ancestry. An investigation is needed into whether the increase in power is due to the larger number of African samples rather than to the use of the LAAA model, as stated on lines 295 and 296.

      (2) In line 256, the number of SNPs included in the LAAA analysis was 784,557 autosomal markers; the number of SNPs after quality control of the imputed dataset was 7,510,051 SNPs (line 142). It is not clear how or why ~90% of the SNPs were removed. This needs clarification.

      (3) The authors have used the significance threshold estimated by STEAM (p-value < 2.5 × 10⁻⁶) in the LAAA analysis. Grinde et al. (2019) implemented their significance threshold estimation approach tailored to admixture mapping (local ancestry (LA) model), where there is a reduction in testing burden. The authors should justify why this threshold would apply to the LAAA model (a joint genotype and ancestry approach).

      (4) Batch effect screening and correction (line 174) is a quality control check. This section is discussed after global and local ancestry inferences in the methods. Was this QC step conducted after the inference? If so, the authors should justify how the SNPs removed due to the batch effect did not affect the global and local ancestry inferences, or should order the methods section correctly to avoid confusion.
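
      The marker counts quoted in point (2) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the two counts are those cited in the comment, while the function name and variable names are hypothetical and not taken from the authors' pipeline.

```python
# SNP counts as quoted in point (2) of the review (manuscript lines 142 and 256).
SNPS_AFTER_QC = 7_510_051   # imputed SNPs remaining after quality control
SNPS_IN_LAAA = 784_557      # autosomal markers entering the LAAA analysis


def fraction_removed(before: int, after: int) -> float:
    """Return the share of markers dropped between two pipeline stages."""
    if before <= 0 or not 0 <= after <= before:
        raise ValueError("expected 0 <= after <= before and before > 0")
    return 1 - after / before


removed = fraction_removed(SNPS_AFTER_QC, SNPS_IN_LAAA)
print(f"{removed:.1%} of post-QC SNPs are absent from the LAAA analysis")
```

      The result is consistent with the "~90%" figure in the comment; it does not explain which filtering step accounts for the loss, which is the clarification being requested.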

    3. Reviewer #1 (Public review):

      Summary:

      The authors aimed to confirm the association between the human leukocyte antigen (HLA)-II region and tuberculosis (TB) susceptibility within admixed African populations. Building upon previous findings from the International Tuberculosis Host Genetics Consortium (ITHGC), this study sought to address the limitations of small sample size and the inclusion of admixed samples by employing the Local Ancestry Allelic Adjusted (LAAA) model, as well as to identify TB susceptibility loci in an admixed South African cohort.

      Strengths:

      The major strengths of this study include the use of six TB case-control datasets collected over 30 years from diverse South African populations and the use of ADMIXTURE for global ancestry inference. The former provides a comprehensive dataset and the latter ensures accurate determination of ancestral contributions. In addition, the identified association in the HLA-DPB1 gene shows near-genome-wide significance, enhancing the credibility of the findings.

      Weaknesses:

      The major weaknesses of this study include insufficient significant discoveries and reliance on cross-validation. This study only identified one variant significantly associated with TB status, located in an intergenic region with an unclear link to TB susceptibility. Despite identifying multiple lead SNPs, no other variants reached the genome-wide significance threshold, limiting the overall impact of the findings. The absence of an independent validation cohort, with the study relying solely on cross-validation, is also a major limitation. This approach restricts the ability to independently confirm the findings and evaluate their robustness across different population samples.

      Appraisal:

      The authors successfully achieved their aims of confirming the association between the HLA-II region and TB susceptibility in admixed African populations. However, the limited number of significant discoveries, reliance on cross-validation, and insufficient discussion of model performance and SNP significance weaken the overall strength of the findings. Despite these limitations, the results support the conclusion that considering local ancestry is crucial in genetic studies of admixed populations.

      Impact:

      The innovative use of the LAAA model and the comprehensive dataset in this study make substantial contributions to the field of genetic epidemiology.

    1. eLife assessment

      This valuable study presents compelling evidence that a single member of the Ly49 gene family (Ly49a) provides sufficient inhibitory signaling to license NK cell activity when its H-2Dd ligand is present. There is also convincing evidence of the effect of Ly49a expression on in vitro killing and IFNgamma production. The use of the authors' system to investigate additional Ly49 receptors, such as Ly49c/i on the H2b background, could provide information on their relative contribution to NK cell licensing. Improvements to the presentation with respect to figure clarity and terminology would allow a better understanding of this complex system by non-experts.

    2. Reviewer #1 (Public review):

      Summary:

      The article by Piersma et al. aims to reduce the complex process of NK cell licensing to the action of a single inhibitory receptor for MHC class I. This is achieved using a mouse strain lacking all of the Ly49 receptors expressed by NK cells and inserting the Ly49a gene into the Ncr1 locus, leading to expression on the majority of NK cells.

      Strengths:

      The mouse model used represents a precise deletion of all NK-expressed genes within the Ly49 cluster. The re-introduction of the Ly49a gene into the Ncr1 locus allows expression by most NK cells. Convincing effects of Ly49a expression on in vitro activation and in vivo killing assays are shown.

      Weaknesses:

      The choice of Ly49a provides a clear picture of H-2Dd recognition by this Ly49. It would be valuable to perform additional studies investigating Ly49c and Ly49i receptors for H-2b. This is of interest because there are reports indicating that Ly49c may not be a functional receptor in B6 mice due to strong cis interactions.

      This work generates an excellent mouse model for the study of NK cell licensing by inhibitory Ly49s that will be useful for the community. It provides a platform whereby the functional activity of a single Ly49 can be assessed.

    3. Reviewer #2 (Public review):

      Piersma et al. continue to work on deciphering the role and function of Ly49 NK cell receptors. This manuscript shows that a single inhibitory Ly49 receptor is sufficient to license NK cells and eliminate MHC-I-deficient target cells in mice. In short, they refined the mouse model ∆Ly49-1 (Parikh et al., 2020) into the Ly49KO model in which all Ly49 genes are disrupted. Using this model, they confirmed that NK cells from Ly49KO mice cannot be licensed, produce lower levels of IFN-gamma, and cannot reject MHC-I-deficient cells. To study the effect of a single Ly49 receptor in the function of NK cells, the authors backcrossed Ly49KO mice to H-2Dd transgenic KODO (D8-KODO) Ly49A knock-in mice in which a single inhibitory Ly49A receptor that recognizes H-2Dd ligands is expressed. By doing so, they demonstrate that a single inhibitory Ly49 receptor expressed by all NK cells is sufficient for licensing and missing-self killing.

      While the results of the study are largely consistent with the conclusions, it is important to address some discrepancies. For instance, in the title of Figure 1, the authors state that NK cells in Ly49KO mice compared to WT mice have a less mature phenotype, which is not consistent with the corresponding text in the Results section (lines 170-171) that states there is no difference in maturation. These differences are not evident in Figure 1, panel D. It is crucial to acknowledge these inconsistencies to ensure a comprehensive understanding of the research findings.

      In the legend of Figure 2, the text related to panel C indicates the use of dyes to label the splenocytes, and CFSE, CTV, and CTFR are mentioned. However, only CTV and CTFR are shown on the plots and mentioned in the corresponding text in the Results section. Similarly, in the legend of Figure 4, which is related to panel C, the authors write that splenocytes were differentially labeled with CFSE and CTV as indicated; however, in Figure 4C and the Results section text, there is no mention of CFSE.

      The authors should clarify why they assume that KLRG1 expression is influenced by the expression of inhibitory Ly49 receptors and not by manipulations on chromosome 6, where the genes for both KLRG1 and Ly49 receptors are located. In addition, a better explanation for the possible influence of other inhibitory NK cell receptors still needs to be included. In the study by Zhang et al. (doi: 10.1038/s41467-019-13032-5), the authors showed the synergized regulation of NK cell education by the NKG2A receptor and the specific Ly49 family members. Although in this study, Piersma and colleagues show the control of MHC-I deficient cells by Ly49A+ NKG2A- NK cells in Figure 4, this receptor is not mentioned in the Results or in the Discussion section, so its role in this story needs to be clarified. Therefore, the reader would benefit from more information regarding the NKG2A receptor and NKG2A+/- populations in their results.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, Piersma et al. successfully generated a mouse model with all Ly49 genes knocked out, resulting in the complete absence of Ly49 receptor expression on the cell surface. The absence of Ly49 expression led to the loss of NK cell education/licensing and consequently, a failure in responsiveness against missing-self target cells. The experimental work and findings are partially overlapping with the previous work by Zhang et al. (2019), who also performed knockout of the entire Ly49 locus in mice and demonstrated that loss of NK responsiveness was due to the removal of inhibitory, and not activating Ly49 genes. The authors demonstrate the restoration of NK cell licensing by knocking in a single Ly49 gene, Ly49A, in a mouse expressing the H-2Dd ligand for this receptor, which is a novel and important finding.

      Strengths:

      The authors established a novel mouse model enabling them to perform a clean and thorough study of the function of Ly49 in NK cell licensing. Also, by knocking in a single Ly49, they were able to investigate the function of a given Ly49 receptor excluding the "contamination" of co-expression of any other Ly49 genes. Their idea and method were novel, though the mouse model was somewhat genetically similar to that of a previous study. The experimental design and data interpretation were logically clear and the evidence was solid.

      Weaknesses:

      The paper is very poorly written and confusing. The authors should be more accurate in the usage of terminology, provide more details on experimental procedures, and revise much of the text to improve clarity and coherence. A thorough revision aiming to clarify the paper would be helpful.

    1. eLife assessment

      This paper reports the synthesis of covalent inhibitors bearing a unique fragment as a protected covalent warhead for irreversible binding to histidine in carbonic anhydrase (CA) enzymes. These findings are important due to the broad utility of the approach for covalent drug discovery applications and could have long-term impacts on related covalent targeting approaches. The data convincingly support the main conclusions of the paper.

    2. Reviewer #1 (Public review):

      Summary:

      This paper describes the covalent interactions of small molecule inhibitors of carbonic anhydrase IX, utilizing a precursor molecule capable of undergoing beta-elimination to form the vinyl sulfone covalent warhead.

      Strengths:

      The use of a novel covalent precursor molecule that undergoes beta-elimination to form the vinyl sulfone in situ. Sufficient structure-activity relationships across a number of leaving groups, as well as binding moieties that impact binding and dissociation constants.

      Overall, the paper is clearly written and provides sufficient data to support the hypothesis and observations. The findings and outcomes are significant for covalent drug discovery applications and could have long-term impacts on related covalent targeting approaches.

      Weaknesses:

      No major weaknesses were noted by this reviewer.

    3. Reviewer #2 (Public review):

      Summary:

      The authors utilized a "ligand-first" targeted covalent inhibition approach to design potent inhibitors of carbonic anhydrase IX (CAIX) based on a known non-covalent primary sulfonamide scaffold. The novelty of their approach lies in their use of a protected pre(pro?)-vinylsulfone as a precursor to the common vinylsulfone covalent warhead to target a nonstandard His residue in the active site of CAIX. In addition to a biochemical assessment of their inhibitors, they showed that their compounds compete with a known probe on the surface of HeLa cells.

      Strengths:

      The authors use a protected warhead for what would typically be considered an "especially hot" or even "undevelopable" vinylsulfone electrophile. This would be the first report of doing so making it a novel targeted covalent inhibition approach specifically with vinylsulfones.

      The authors used a number of orthogonal biochemical and biophysical methods including intact MS, 2D NMR, x-ray crystallography, and an enzymatic stopped-flow setup to confirm the covalency of their compounds and even demonstrate that this novel pre-vinylsulfone is activated in the presence of CAIX. In addition, they included a number of compelling analogs of their inhibitors as negative controls that address hypotheses specific to the mechanism of activation and inhibition.

      The authors employed an assay that allows them to assess target engagement of their compounds with the target on the surface of cells and a fluorescent probe which is generally a critical tool to be used in tandem with phenotypic cellular assays.

      Weaknesses:

      While the authors show that the pre-vinyl moiety is shown biochemically to be transformed into the vinylsulfone, they do not show what the fate of this -SO2CH2CH2OCOR group is in a cellular context. Does the pre-vinylsulfone in fact need to be in the active site of CAIX on the surface of the cell to be activated or is the vinylsulfone revealed prior to target engagement?

      I appreciate the authors acknowledging the limitations of using an assay such as thermal shift to derive an apparent binding affinity, however, it is not entirely convincing and leaves a gap in our understanding of what is happening biochemically with these inhibitors, especially given the two-step inhibitory mechanism. It is very difficult to properly understand the activity of these inhibitors without a more comprehensive evaluation of kinact and Ki parameters. This can then bring into question how selective these compounds actually are for CAIX over other carbonic anhydrases.

      The authors did not provide any cellular data beyond target engagement with a previously characterized competitive fluorescent probe. It would be critical to know the cytotoxicity profile of these compounds or even how they affect the biology of interest regarding CAIX activity if the intention is to use these compounds in the future as chemical probes to assess CAIX activity in the context of tumor metastasis.

    4. Reviewer #3 (Public review):

      Summary:

      Targeted covalent inhibition of therapeutically relevant proteins is an attractive approach in drug development. This manuscript reports a series of covalent inhibitors for human carbonic anhydrase (CA) isozymes (CAI, CAII, CAIX, and CAXIII) for irreversible binding to a critical histidine amino acid in the active site pocket. To support their findings, the authors included co-crystal structures of CAI, CAII, and CAIX in the presence of three such inhibitors. Mass spectrometry and enzymatic recovery assays validate these findings, and the results and cellular activity data are convincing.

      Strengths:

      The authors designed a series of covalent inhibitors and carefully selected non-covalent counterparts to make their findings about the selectivity of covalent inhibitors for CA isozymes quite convincing. The supportive X-ray crystallography and MS data are significant strengths. Their approach of targeted binding of the covalent inhibitors to histidine in CA isozyme may have broad utility for developing covalent inhibitors.

      Weaknesses:

      This reviewer did not find any significant weaknesses. However, I suggest several points in the Recommendations for the authors section for the authors to consider.

    1. eLife assessment

      This study presents an important platform for mapping mutation effects onto higher-level protein structural information, addressing a significant gap in current research. While the work is ambitious and incorporates often-overlooked aspects of higher-order structure, the strength of the evidence supporting some results seems incomplete. The quaternary structure modeling appears to underestimate oligomeric proteins compared to previous studies, and the mutation analysis lacks crucial baseline information. Despite these limitations, the method has potential for broader applications and generalization to additional organisms, warranting further development and refinement.

    2. Reviewer #1 (Public review):

      Summary:

      This work presents a computational platform that integrates currently available experimental or precomputed datasets and/or state-of-the-art modeling methods to assemble a proteome structure from a given list of genes (representing a whole proteome of an organism, or some specific subset of interest). The main advancement is that the proteome structure contains not only the tertiary structure information (such as is provided by precomputed AlphaFold predicted proteomes) but also information about the quaternary structure. Adding quaternary structure information on the whole proteomes is a challenging problem (and the manuscript would benefit from a more comprehensive introduction section presenting these challenges). Importantly, this addition of quaternary structure information is likely to significantly improve any downstream modelling or prediction. This is because most proteins form either stable or transient complexes, and a significant proportion of proteins interacts with cellular structures such as the different biological membranes. These interactions provide important context for interpreting residue-level information, such as for example the fitness/functional effects of point mutations.

      Strengths:

      The main strength of this work is that it approaches the question of protein quaternary structure in a comprehensive way. Namely, in addition to oligomeric state, it also includes membrane and cellular localization. It also demonstrates how to use and combine the available experimental and precomputed modelling to achieve the same for any set of genes.

      Weaknesses:

      The feasibility of obtaining a similar dataset (of useful/informative size) for a more complex organism is not clear.

    3. Reviewer #2 (Public review):

      In this study, a methodology called QSPACE is developed and presented. It integrates structural information for a specific organism, here E. coli. The process entails the gathering of individual structures, including oligomeric information/stoichiometry, the incorporation of data on transmembrane regions, and the utilization of the resulting dataset for the analysis of mutation effects and the allocation of proteomes.

      This work aims high, setting an ambitious goal of modeling the quaternary structure of a proteome. The method could be applied to other organisms in the future and has value in that respect. At the same time, the work tries to cover (too?) much ground and some of the results/analyses don't measure up. There are indeed a number of shortcomings and/or inconsistencies in the results presented. The comments below will help improve the work and its usefulness.

      (1) It is described that "QSPACE then finds the 3D coordinate file (i.e. "structure") that best reflects the user-defined (input #2) multi-subunit protein assembly". What is meant by "best reflects"? What if two different structures with the same stoichiometry are available? Which one is picked?

      (2) There appears to be a significant under-estimation of oligomer formation: it is reported that "31% (1,334/4,309) of E. coli genes participate in 1,047 oligomeric complexes, 667 genes are annotated as monomers, and 2,308 genes are not included". However, it is generally observed that ~50% of E. coli genes form homo-oligomers (see PMID 10940245 or more recently 38325366), and adding hetero-oligomers on top of that should increase the fraction of oligomers further. In that respect, the estimate forming the basis of this work (31% of genes participating in oligomeric complexes) seems incorrect. It is unclear why the authors did not identify more proteins as adopting a quaternary structure. It is generally hard to grasp details of the dataset, for example, the simple statistic of how many genes participate in homo- versus hetero-oligomers. Such information is partially presented in panels 2c & 2d, but it is very small and hard to see (I would suggest removing the structures of the ABC transporters to make space to present this with more detail).

      (3) There are a number of misleading statements/overstatements that I encourage the authors to revise. For example (not exhaustive):
      "to our knowledge this result is the most advanced genome-scale structural representation of the E. coli proteome and de facto represents a major advancement in genome annotation."
      "angstrom-level subcellular compartmentalization" - Can we really talk about sub-atomic precision when even side chains can move by several angstroms?
      "we provide a global accounting of all functionally important regions" - "all" is not justified
      "Incorporated into genome-scale models that compute protein expression" - what does that mean? There are gene expression & protein abundance datasets, why is the "compute" necessary?
      "Likewise, sequence-based prediction software (e.g., DeepTMHMM49) and structure-based prediction software (e.g., OPM50) are agnostic to membrane orientation and can also generate erroneous results" - what does "erroneous results" mean in this context? Those tools are not supposed to predict orientation.

      (4) What was the benchmark used to estimate the accuracy of orientation assignments?

      (5) It is not clear why structural information is required to calculate the volume taken up by different proteins across the proteome. For each protein, the expression level (copy number) is expected to have a significant effect, but I'm unsure of why oligomerization is considered key here. It will modulate the volume exclusion associated with interface contact areas, but isn't this negligible compared to other factors, in particular expression?

      (6) Models aiming at predicting deleterious effects of mutations typically use sequence conservation, but I do not see such information used in Figure 4. Assessing the added value of structural information should include such evolutionary information (residue-level sequence conservation) in the baseline.

      (7) The "proteome allocation" analysis is presented as an important result, but I did not find details of equations used to conduct this analysis. I assume that "proteome allocation" is based solely on expression, and that "cell volume" uses structural information on top of it. There is a significant difference between "proteome allocation" and "cell volume" as reflected in the proteomaps shown in panels 4e & 4f, but there is no explanation for it. Are the proteins' identities the same in these two panels? Were only proteins counted or was RNA considered as well? Clarifications are needed for RNA, for example, how were volumes calculated in structures containing RNAs? Datasets used to derive these maps should also be provided to enable reproducing them.

      (8) I did not see that the structures generated are available - they should be deposited on a permanent repository with a DOI.
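
      The gene counts quoted in point (2) can be tallied as follows. This is a minimal sketch of the arithmetic only: the four counts are those quoted in the comment, and treating "annotated" genes as oligomeric plus monomeric genes is an assumption made here for illustration, not a definition taken from the manuscript.

```python
# Gene counts as quoted in point (2); nothing here comes from the QSPACE code.
TOTAL_GENES = 4_309
IN_OLIGOMERS = 1_334
MONOMERS = 667
NOT_INCLUDED = 2_308

# The three categories should partition the gene list.
assert IN_OLIGOMERS + MONOMERS + NOT_INCLUDED == TOTAL_GENES


def share(part: int, whole: int) -> float:
    """Return part/whole as a fraction."""
    return part / whole


# Denominator 1: every gene, including those with no structural assignment.
share_of_all = share(IN_OLIGOMERS, TOTAL_GENES)

# Denominator 2 (assumption): only genes annotated as oligomeric or monomeric.
share_of_annotated = share(IN_OLIGOMERS, IN_OLIGOMERS + MONOMERS)

print(f"oligomeric, all genes as denominator:       {share_of_all:.1%}")
print(f"oligomeric, annotated genes as denominator: {share_of_annotated:.1%}")
```

      The 31% figure uses all 4,309 genes as the denominator, so the 2,308 genes that are not included weigh heavily on it. Reporting the homo- versus hetero-oligomer breakdown alongside the choice of denominator would make the comparison with published estimates easier to follow.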

    1. eLife assessment

      This study focuses on the role of a T-cell-specific receptor, ctla-4, in a new zebrafish model of IBD-like phenotype. Although implicated in IBD diseases, the function of ctla-4 has been hard to study in mice as the KO is lethal. Ctla-4 mutant zebrafish exhibited significant intestinal inflammation and dysbiosis, mirroring the pathology of inflammatory bowel disease (IBD) in mammals, providing a new valuable model to the field of IBD research. However, although many of the results are solid, the methods as provided are incomplete, without information on methods for many data panels.

    2. Reviewer #1 (Public review):

      "Unraveling the Role of Ctla-4 in Intestinal Immune Homeostasis: Insights from a novel Zebrafish Model of Inflammatory Bowel Disease" suggests the identification of the zebrafish homolog of ctla-4 and generates a 14bp deletion/early stop codon mutation that is viable. This mutant exhibits an IBD-like phenotype, including decreased intestinal length, abnormal intestinal folds, decreased goblet cells, abnormal cell junctions between epithelial cells, increased inflammation, and alterations in microbial diversity. Bulk and single-cell RNA-seq show upregulation of immune and inflammatory response genes in this mutant (especially in neutrophils, B cells, and macrophages) and downregulation of genes involved in adhesion and tight junctions in mutant enterocytes. The work suggests that the makeup of immune cells within the intestine is altered in these mutants, potentially due to changes in lymphocyte proliferation. Introduction of recombinant soluble Ctla-4-Ig to mutant zebrafish rescued body weight, histological phenotypes, and gene expression of several pro-inflammatory genes, suggesting a potential future therapeutic route.

      Strengths:

      - Generation of a useful new mutant.

      - The demonstration of an IBD-like phenotype in this mutant is extremely comprehensive.

      - Demonstrated gene expression differences provide mechanistic insight into how this mutation leads to IBD-like symptoms.

      - Demonstration of rescue with a soluble protein suggests exciting future therapeutic potential.

      - The manuscript is mostly well organized and well written.

      Weaknesses:

      - Given the sequence similarity between CTLA-4 and its related receptor CD28, and the difference in subcellular localization of this protein vs. human CTLA-4, some confusion remains about which gene is mutated in this manuscript (CD28 or CTLA-4/CD152).

      - Some conclusions made from scRNAseq data (e.g. increased apoptosis, changes in immune cell numbers) could potentially result from dissociation artifacts and would be stronger with validation staining.

      - The Methods section is woefully incomplete and describes fewer than half of the experiments performed in this manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      The authors aimed to elucidate the role of Ctla-4 in maintaining intestinal immune homeostasis by using a novel Ctla-4-deficient zebrafish model. This study addresses the challenge of linking CTLA-4 to inflammatory bowel disease (IBD) due to the early lethality of CTLA-4 knockout mice. Four lines of evidence were presented to show that Ctla-4-deficient zebrafish exhibited hallmarks of IBD in mammals:
      (1) impaired epithelial integrity and infiltration of inflammatory cells;
      (2) enrichment of inflammation-related pathways and the imbalance between pro- and anti-inflammatory cytokines;
      (3) abnormal composition of immune cell populations; and
      (4) reduced diversity and altered microbiota composition.
      By employing various molecular and cellular analyses, the authors established ctla-4-deficient zebrafish as a convincing model of human IBD.

      Strengths:

      The characterization of the mutant phenotype is very thorough, from anatomical to histological and molecular levels. The finding effectively established ctla-4 mutants as a novel zebrafish model for investigating human IBD. Evidence from the histopathological and transcriptome analysis was very strong and supported a severe interruption of immune system homeostasis in the zebrafish intestine. Additional characterization using sCtla-4-Ig further probed the molecular mechanism of the inflammatory response and provided a potential treatment plan for targeting Ctla-4 in IBD models.

      Weaknesses:

      Since CTLA-4 is one of the most well-established immune checkpoint molecules, it is not clear whether the ctla-4 mutant zebrafish exhibits inflammatory phenotypes in tissues other than the intestine. Although the evidence for intestinal phenotypes is clear and similar to human IBD, it remains ambiguous whether the mutant is a specific model for IBD, or for abnormal immune response in general.

      To probe the molecular mechanism of Ctla-4, the authors used a spectrum of antibodies that target Ctla-4 or its receptors. The phenotype assayed was lymphocyte proliferation, whereas it was the composition rather than the number of immune cells that was observed to differ in the scRNASeq assay. Although sCtla-4 has an effect of alleviating the IBD-like phenotypes, I found this explanation a bit oversimplified.

    4. Reviewer #3 (Public review):

      Summary:

      The use of mutant zebrafish for IBD modeling in the current study is a worthwhile effort. The authors provided a lot of evidence, including histopathological observations, gut microflora data, and transcriptomic data from intestinal tissue and mucosal cells. The multi-omic study has demonstrated the enteritis pathology at multiple levels in the zebrafish model. However, poor writing of the methods and insufficient discussion of the current findings were the main defects.

      Strengths:

      The important immune checkpoint of Treg cells was knocked out in zebrafish, and enteritis was subsequently observed. It could serve as a substitute for the mouse knockout model to investigate the molecular mechanism of gut disease.

      Weaknesses:

      (1) The use of the English language requires further editing.

      (2) The background of this study has not been introduced sufficiently.

      (3) The medical concepts were overstated for immune cell populations.

      (4) A lot of methods were not provided.

      (5) The age of fish varied a lot in this study.

      (6) The pathological index can't reflect the detailed changes in intestinal mucosa.

      (7) Many of the findings in the current study were not discussed.

      (8) The structuring of the text is poor and lacks good logic.

    1. eLife assessment

      Leafhoppers coat their body surface with nanoparticles, called brochosomes, which are an evolutionary innovation in this insect clade. This important paper adds significant evidence for the biological role of these structures consisting of a reflection effect of UV light as a defense against predatory spiders. Convincing support is provided for a new functional aspect of brochosomes, elucidating the emergence of the underlying genes and the principles of self-assembly of these biological nanoparticles.

    2. Reviewer #1 (Public review):

      Summary:

      Evading predation is of utmost importance for most animals and camouflage is one of the predominant mechanisms. Wu et al. set out to test the hypothesis of a unique camouflage system in leafhoppers. These animals coat themselves with brochosomes, which are spherical nanostructures that are produced in the Malpighian tubules and are distributed on the cuticle after eclosion. Based on previous findings on the reflectivity properties of brochosomes, the authors provide very good evidence that these nanostructures indeed reduce the reflectivity of the animals thereby reducing predation by jumping spiders. Further, they identify four proteins, which are essential for the proper development and function of brochosomes. In RNAi experiments, the regular brochosome structure is lost, the reflectivity reduced and the respective animals are prone to increased predation. Finally, the authors provide some phylogenetic sequence analyses and speculate about the evolution of these essential genes.

      Strengths:

      The study is very comprehensive including careful optical measurements, EM and TM analysis of the nanoparticles and their production line in the Malpighian tubules, in vivo predation tests, and knock-down experiments to identify essential proteins. Indeed, the results are very convincingly in line with the starting hypothesis such that the study robustly assigns a new biological function to the brochosome coating system.

      A key strength of the study is that the biological relevance of the brochosome coating is convincingly shown by an in vivo predation test using a known predator from the same habitat.

      Another major step forward is an RNAi screen, which identified four proteins that are essential for the brochosome structure (BSMs). After respective RNAi knock-downs, the brochosomes show curious malformations that are interesting in terms of the self-assembly of these nanostructures. The optical and in vivo predation tests provide excellent support for the model that the RNAi knock-down leads to a change of brochosome structure, which reduces reflectivity, which in turn leads to a decrease of the antipredatory effect.

      Weaknesses:

      The reduction of reflectivity by aberrant brochosomes or after ageing is only around 10%. This may seem little to have an effect in real life. On the other hand, the in vivo predation tests confirm an influence. Hence, this is not a real weakness of the study - just a note to reconsider the wording for describing the degree of reflectivity.

      The single gene knockdowns seemed to lead to a very low penetrance of malformed brochosomes (Figure Supplement 3). Judging from the overview slides, less than 1% of brochosomes may have been affected. A quantification of regular versus abnormal particles in both wildtype and RNAi treatments would have helped to exclude that the shown aberrant brochosomes just reflected a putative level of "normal" background defects. Of note, the quadruple knock-down of all BSMs seemed to lead to a high penetrance (Figure 4), which was already reflected in the microtubule production line. While the data shown are convincing, a quantification might strengthen the argument.

      While the RNAi effects seemed to be very specific to brochosomes and therefore very likely specific, an off-target control for RNAi was still missing. Finding the same/similar phenotype with a non-overlapping dsRNA fragment in one off-target experiment is usually considered required and sufficient. Further, the details of the targeted sequence will help future workers on the topic.

      The main weakness in the current manuscript may be the phylogenetic analysis and the model of how the genes evolved. Several aspects were not clearly or consistently stated such that I felt unsure about what the authors actually think. For instance: Are all the 4 BSMs related to each other or only BSM2 and 3? If so, not only BSM2 and 3 would be called "paralogs" but also the other BSMs. If they were all related, then a phylogenetic tree including all BSMs should be shown to visualize the relatedness (including the putative ancestral gene if that is the model of the authors). Actually, I was not sure about how the authors think about the emergence of the BSMs. Are they real orphan genes (i.e. not present outside the respective clade) or was there an ancestral gene that was duplicated and diverged to form the BSMs? Where in the phylogeny does the first of the BSMs or ancestral proteins emerge (is the gene found in Clastoptera arizonana the most ancestral one?)? Maybe, the evolution of the BSMs would have to be discussed individually for each gene as they show somewhat different patterns of emergence and loss (BSM4 present in all species, the others with different degrees of phylogenetic restriction). Related to these questions I remained unsure about some details in Figure 5. On what kind of analysis is the phylogeny based? Why are some species not colored, although they are located on the same branch as colored ones? What is the measure for homology values - % identity/similarity? The homology labels for Nephotettix cincticeps and N. virescens seem to be flipped: the latter is displayed with 100% identity for all genes with all proteins while the former should actually show this. As a consequence of these uncertainties, I could not fully follow the respective discussion and model for gene evolution.

      Conclusion:

      The authors successfully tested their hypothesis in a multidisciplinary approach and convincingly assigned a new biological function to the brochosome system. The results fully support their claims - only the quantification of the penetrance in the RNAi experiments would be helpful to strengthen the point. The authors' analysis of the evolution of BSM genes remained a bit vague and I remained unsure about their respective conclusions.

      The work is a very interesting study case of the evolutionary emergence of a new system to evade predators. Based on this study, the function of the BSM genes could now be studied in other species to provide insights into putative ancestral functions. Further, studying the self-assembly of such highly regular complex nano-structures will be strongly fostered by the identification of the four key structural genes.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the authors investigate the optical properties of brochosomes produced by leafhoppers. They hypothesize that brochosomes reduce light reflection on the leafhopper's body surface, aiding in predator avoidance. Their hypothesis is supported by experiments involving jumping spiders. Additionally, the authors employ a variety of techniques including micro-UV-Vis spectroscopy, electron microscopy, transcriptome and proteome analysis, and bioassays. This study is highly interesting, and the experimental data is well-organized and logically presented.

      Strengths:

      The use of brochosomes as a camouflage coating has been hypothesized since 1936 (R.B. Swain, Entomol. News 47, 264-266, 1936) with evidence demonstrated by similar synthetic brochosome systems in a number of recent studies (S. Yang, et al. Nat. Commun. 8:1285, 2017; L. Wang, et al., PNAS. 121: e2312700121, 2024). However, direct biological evidence or relevant field studies have been lacking to directly support the hypothesis that brochosomes are used for camouflage. This work provides the first biological evidence demonstrating that natural brochosomes can be used as a camouflage coating to reduce the leafhoppers' observability to their predators. The design of the experiments is novel.

      Weaknesses:

      (1) The observation that brochosome coatings become sparse after 25 days in both male and female leafhoppers, resulting in increased predation by jumping spiders, is intriguing. However, since leafhoppers consistently secrete and groom brochosomes, it would be beneficial to explore why brochosomes become significantly less dense after 25 days.

      (2) The authors demonstrate that brochosome coatings reduce UV (specular) reflection compared to surfaces without brochosomes, which can be attributed to the rough geometry of brochosomes as discussed in the literature. However, it would be valuable to investigate whether the proteins forming the brochosomes are also UV absorbing.

      (3) The experiments with jumping spiders show that brochosomes help leafhoppers avoid predators to some extent. It would be beneficial for the authors to elaborate on the exact mechanism behind this camouflage effect. Specifically, why does reduced UV reflection aid in predator avoidance? If predators are sensitive to UV light, how does the reduced UV reflectance specifically contribute to evasion?

      (4) An important reference regarding the moth-eye effect is missing. Please consider including the following paper: Clapham, P. B., and M. C. Hutley. "Reduction of lens reflection by the 'Moth Eye' principle." Nature 244: 281-282 (1973).

      (5) The introduction should be revised to accurately reflect the related contributions in the literature. Specifically, the novelty of this work lies in the demonstration of the camouflage effect of brochosomes using jumping spiders, which is verified for the first time in leafhoppers. However, the proposed use of brochosome powder for camouflage was first described by R.B. Swain (R.B. Swain, Notes on the oviposition and life history of the leafhopper Oncometopia undata Fabr. (Homoptera: Cicadellidae), Entomol. News. 47: 264-266 (1936)). Recently, the antireflective and potential camouflage functions of brochosomes were further studied by Yang et al. based on synthetic brochosomes and simulated vision techniques (S. Yang, et al. "Ultra-antireflective synthetic brochosomes." Nature Communications 8: 1285 (2017)). Later, Lei et al. demonstrated the antireflective properties of natural brochosomes in 2020 (C.-W. Lei, et al., "Leafhopper wing-inspired broadband omnidirectional antireflective embroidered ball-like structure arrays using a nonlithography-based methodology." Langmuir 36: 5296-5302 (2020)). Very recently, Wang et al. successfully fabricated synthetic brochosomes with precise geometry akin to the natural ones, and further elucidated the antireflective mechanisms based on the brochosome geometry and their role in reducing the observability of leafhoppers to their predators (L. Wang et al. "Geometric design of antireflective leafhopper brochosomes." Proceedings of the National Academy of Sciences 121: e2312700121 (2024)).

    1. eLife assessment

      This paper reports a novel mechanism of regulation of the heat shock response in plants that acts as a brake to prevent hyperactivation of the stress response. The findings are valuable to understand and potentially manipulate the plant's response to heat stress and the presented evidence is overall solid. However, in some cases, the data are either poorly presented or insufficient to support the primary claims.

    2. Reviewer #1 (Public review):

      In the present work, Chen et al. investigate the role of short heat shock factors (S-HSF), generated through alternative splicing, in the regulation of the heat shock response (HSR). The authors focus on S-HsfA2, an HSFA2 splice variant containing a truncated DNA-binding domain (tDBD) and a known transcriptional-repressor leucine-rich domain (LRD). The authors found a two-fold effect of S-HsfA2 on gene expression. On the one hand, the specific binding of S-HsfA2 to the heat-regulated element (HRE), a novel type of heat shock element (HSE), represses gene expression. This mechanism was also shown for other S-HSFs, including HsfA4c and HsfB1. On the other hand, S-HsfA2 is shown to interact with the canonical HsfA2, as well as with a handful of other HSFs, and this interaction prevents HsfA2 from activating gene expression. The authors also identified potential S-HsfA2 targets and selected one, HSP17.6B, to investigate the role of the truncated HSF in the HSR. They conclude that S-HsfA2-mediated transcriptional repression of HSP17.6B helps avoid hyperactivation of the HSR by counteracting the action of the canonical HsfA2.

      The manuscript is well written and the reported findings are, overall, solid. The described results are likely to open new avenues in the plant stress research field, as several new molecular players are identified. Chen et al. use a combination of appropriate approaches to address the scientific questions posed. However, in some cases, the data are inadequately presented or insufficient to fully support the claims made. As such, the manuscript would highly benefit from tackling the following issues:

      (1) While the authors report the survival phenotypes of several independent lines, thereby strengthening the conclusions drawn, they do not specify whether the presented percentages are averages of multiple replicates or if they correspond to a single repetition. The number of times the experiment was repeated should be reported. In addition, Figure 7c lacks the quantification of the hsp17.6b-1 mutant phenotype, which is the background of the knock-in lines. This is an essential control for this experiment.

      (2) In Figure 1c, the transcript levels of HsfA2 splice variants are not evident, as the authors only show the quantification of the truncated variant. Moreover, similar to the phenotypes discussed above, it is unclear whether the reported values are averages and, if so, what is the error associated with the measurements. This information could explain the differences observed in the rosette phenotypes of the S-HsfA2-KD lines. Similarly, the gene expression quantification presented in Figures 4 and 5, as well as the GUS protein quantification of Figure 3F, also lacks this crucial information.

      (3) The quality of the main figures is low, which in some cases prevents proper visualization of the data presented. This is particularly critical for the quantification of the phenotypes shown in Figure 1b and for the fluorescence images in Figures 4f and 5b. Also, Figure 9b lacks essential information describing the components of the performed experiments.

      (4) Mutants with low levels of S-HsfA2 yield smaller plants than the corresponding wild type. This appears contradictory, given that the proposed role of this truncated HSF is to counteract the growth repression induced by the canonical HSF. What would be a plausible explanation for this observation? Was this phenomenon observed with any of the other tested S-HSFs?

      (5) In some cases, the authors make statements that are not supported by the results:

      (i) the claim that only the truncated variant expression is changed in the knock-down lines is not supported by Figure 1c;

      (ii) the increase in GUS signal in Figure 3a could also result from local protein production;

      (iii) in Figure 6b, the deletion of the HRE abolishes heat responsiveness, rather than merely altering the level of response; and

      (iv) the phenotypes in Figure 8b are not clear enough to conclude that HSP17.6B overexpressors exhibit a dwarf but heat-tolerant phenotype.

    3. Reviewer #2 (Public review):

      Summary:

      The authors report that Arabidopsis short HSFs S-HsfA2, S-HsfA4c, and S-HsfB1 confer extreme heat. They have truncated DNA binding domains that bind to a new heat-regulated element. Considering Short HSFA2, the authors have highlighted the molecular mechanism by which S-HSFs prevent HSR hyperactivation via negative regulation of HSP17.6B. The S-HsfA2 protein binds to the DNA binding domain of HsfA2, thus preventing its binding to HSEs, eventually attenuating HsfA2-activated HSP17.6B promoter activity. This report adds insights to our understanding of heat tolerance and plant growth.

      Strengths:

      (1) The manuscript represents ample experiments to support the claim.

      (2) The manuscript covers a robust number of experiments and provides specific figures and graphs in support of their claim.

      (3) The authors have chosen a topic to focus on stress tolerance in a changing environment.

      Weaknesses:

      (1) S-HsfA2 alone is used to represent the other S-HSFs, S-HsfA4c and S-HsfB1. However, S-HSFs can be functionally different, and their regulation may be positive or negative. Some S-HSFs may positively regulate height and be suppressed by the activity of other S-HSFs.

      (2) Previous reports on gene regulations by hsfs can highlight the mechanism.

      (3) The Materials and Methods section could be rearranged so that it is based on the correct flow of the procedure performed by the authors.

      (4) Graphical representation could explain the days after sowing data, to provide information regarding plant growth.

      (5) Clear images concerning GFP and RFP data could be used.

    1. eLife assessment

      This important study reveals a novel mechanism by which hypoxia-ischemia damages the neonatal brain and how hypothermia protects from brain injury. The paper presents an interesting combination of state-of-the-art optical measurements, mitochondrial assays, and the use of various control experiments providing solid evidence for the derived conclusions. Reviewers caution that possible adverse effects of prolonged anesthesia, as well as pain and stress after a major surgical procedure might influence the outcomes and should be carefully considered. This work will be of interest to the fields of hypoxia and brain metabolism research.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript addresses an important problem of the uncoupling of oxidative phosphorylation due to hypoxia-ischemia injury of the neonatal brain and provides insight into the neuroprotective mechanisms of hypothermia treatment.

      Strengths:

      The authors used a combination of in vivo imaging of awake P10 mice and experiments on isolated mitochondria to assess various key parameters of the brain metabolism during hypoxia-ischemia with and without hypothermia treatment. This unique approach resulted in a comprehensive data set that provides solid evidence for the derived conclusions.

      Weaknesses:

      (1) The experiments were performed acutely on the same day when the surgery was performed. There is a possibility that the physiology of mice at the time of imaging was still affected by the previously applied anesthesia. This is particularly of concern since the duration of anesthesia was relatively long. Is it possible that the observed relatively low baseline OEF (~20%) and trends of increased OEF and CBF over several hours after the imaging start were partially due to slow recovery from prolonged anesthesia? The potential effects of long exposure to anesthesia before imaging experiments were not discussed.

      (2) The Methods Section does not provide information about drugs administered to reduce the pain. If pain was not managed, mice could be experiencing significant pain during experiments in the awake state after the surgery. Since the imaging sessions were long (my impression based on information from the manuscript is that imaging sessions were ~4 hours long or even longer), the level of pain was also likely to change during the experiments. It was not discussed how significant and potentially evolving pain during imaging sessions could have affected the measurements (e.g., blood flow and CMRO2). If mice received pain management during experiments, then it was not discussed if there are known effects of used drugs on CBF, CMRO2, and lesion size after 24 hr.

      (3) Animals were imaged in the awake state, but they were not previously trained for the imaging procedure with head restraint. Did animals receive any drugs to reduce stress? Our experience with well-trained young-adult as well as old mice is that they can typically endure 2 and sometimes up to 3 hours of head-restrained awake imaging with intermittent breaks for receiving the rewards before showing signs of anxiety. We do not have experience with imaging P10 mice in the awake state. Is it possible that P10 mice were significantly stressed during imaging and that their stress level changed during the imaging session? This concern about the potential effects of stress on the various measured parameters was not discussed.

      (4) The temperature of the skull was measured during the hypothermia experiment by lowering the water temperature in the water bath above the animal's head. Considering high metabolism and blood flow in the cortex, it could be challenging to predict cortical temperature based on the skull temperature, particularly in the deeper part of the cortex.

      (5) The map of estimated CMRO2 (Fig. 4B) looks very heterogeneous across the brain surface. Is it a coincidence that the highest CMRO2 is observed within the central part of the field of view? Is there previous evidence that CMRO2 in these parts of the mouse cortex could vary a few folds over a 1-2 mm distance?

      (6) The justification for using P10 mice in the experiments has not been well presented in the manuscript.

      (7) It was not discussed how the observations made in this manuscript could be affected by the potential discrepancy between the developmental stages of P10 mice and human babies regarding cellular metabolism and neurovascular coupling.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, the authors have hypothesized that mitochondrial injury in HIE is caused by OXPHOS-uncoupling, which is the cause of secondary energy failure in HI. In addition, therapeutic hypothermia rescues secondary energy failure. The methodologies used are state-of-the-art and include the PAM technique in live animals, bioenergetic studies in isolated mitochondria, and others.

      Strengths:

      The study is comprehensive and impressive. The article is well written and statistical analyses are appropriate.

      Weaknesses:

      (1) The manuscript does not discuss the limitation of this animal model study in view of the clinical scenario of neonatal hypoxia-ischemia.

      (2) I see many studies on Pubmed on bioenergetics and HI. Hence, it is unclear what is novel and what is known.

      (3) What are the limitations of ex-vivo mitochondrial studies?

      (4) PAM technique limits the resolution of the image beyond 500-750 micron depth. Assessing basal ganglia may not be possible with this approach.

      (5) Hypothermia in the present study reduces the brain temperature from 37 to 29-32 degrees centigrade. In the clinical setting, head temperature is reduced to 33-34.5 degrees in neonatal hypoxia-ischemia. Hence, a drop in temperature to 29 degrees is much lower than in clinical practice. How can the present study, with its greater drop in head temperature, be interpreted for understanding the pathophysiology of therapeutic hypothermia in neonatal HIE? Moreover, using a higher temperature of 37 degrees in the HIE model and dropping to 29 degrees seems to be quite different from the clinical scenario. Please discuss.

      (6) NMR was assessed ex-vivo. How does it relate to in vivo assessment? Infants admitted to the neonatal intensive care unit frequently get MRI with spectroscopy. How do the MRS findings in human newborns with HIE correlate with the ex-vivo evaluation of metabolites?

    4. Reviewer #3 (Public review):

      Sun et al. present a comprehensive study using a novel photoacoustic microscopy setup and mitochondrial analysis to investigate the impact of hypoxia-ischemia (HI) on brain metabolism and the protective role of therapeutic hypothermia. The authors elegantly demonstrate three connected findings: (1) HI initially suppresses brain metabolism, (2) subsequently triggers a metabolic surge linked to oxidative phosphorylation uncoupling and brain damage, and (3) therapeutic hypothermia mitigates HI-induced damage by blocking this surge and reducing mitochondrial stress.

      The study's design and execution are great, with a clear presentation of results and methods. Data is nicely presented, and methodological details are thorough.

      However, a minor concern is the extensive use of abbreviations, which can hinder readability. Although all the abbreviations are introduced in the text, their overuse may render the text hard to read for non-specialist audiences. Additionally, sharing the custom Matlab and other software scripts online, particularly those used for blood vessel segmentation, would be a valuable resource for the scientific community. In addition, while the study focuses on the short-term effects of HI, exploring the long-term consequences and definitively elucidating HI's impact on mitochondria would further strengthen the manuscript's impact.

      Despite these minor points, this manuscript is very interesting.

    1. eLife assessment

      This important study provides a comprehensive assessment of mitochondrial function across age and sex in mice. The strength of evidence supporting this resource is compelling, given the exhaustive number of tissues profiled and in-depth analyses performed.

    2. Reviewer #1 (Public review):

      In this study, Sarver and colleagues carried out an exhaustive analysis of the functioning of various components (Complex I/II/IV) of the mitochondrial electron transport chain (ETC) using a real-time cell metabolic analysis technique (commonly referred to as the Seahorse oxygen consumption rate (OCR) assay). The authors aimed to generate an atlas of ETC function in about 3 dozen tissue types isolated from all major mammalian organ systems. They used a recently published improvised method by which ETC function can be quantified in freshly frozen tissues. This method enabled them to collect data from almost all organ systems from the same mouse and use many biological replicates (10 mice/experiment) required for an unbiased and statistically robust analysis. Moreover, they studied the influence of sex (male and female) and aging (young adult and old age) on ETC function in these organ systems. The main findings of this study are (1) cells in the heart and kidneys have very active ETC complexes compared to other organ systems, (2) the sex of the mice has little influence on the ETC function, and (3) aging undermined the mitochondrial function in most tissues, but surprisingly in some tissues aging promoted the activity of ETC complexes (e.g., quadriceps, plantaris muscle, and diaphragm).

      Comments on revised version:

      The revised manuscript has improved significantly, addressing some of my previous concerns in the discussion. There is no doubt the method used to estimate the maximal uncoupled respiration rate in mitochondria across different organ systems and ages is excellent for getting an overview of the mitochondrial state. However, the correlation between the measured maximal respiration rate and the actual mitochondrial ATP production is still not adequately addressed. The authors could perform a few straightforward experiments on freshly isolated mitochondria from 1-2 tissue samples of their choice to provide data linking maximal respiration rates with mitochondrial ATP production. Such direct evidence would help readers understand how mitochondrial function is affected in various tissues.

    3. Reviewer #2 (Public review):

      Summary:

      The authors utilize a new technique to measure mitochondrial respiration from frozen tissue extracts, which goes around the historical problem of purifying mitochondria prior to analysis, a process that requires a fair amount of time and cannot be easily scaled up.

      Strengths:

      A comprehensive analysis of mitochondrial respiration across tissues, sexes, and two different ages provides foundational knowledge needed in the field.

      Weaknesses:

      While many of the findings are mostly descriptive, this paper provides a large amount of data for the community and can be used as a reference for further studies. As the authors suggest, this is a new atlas of mitochondrial function in the mouse. The inclusion of a middle-aged time point and a slightly older young time point (3-6 months) would be beneficial to the study.

    4. Reviewer #3 (Public review):

      The aim of the study was to map (a) whether different tissues exhibit different metabolic profiles (this is known already), (b) what differences are found between female and male mice, and (c) how the profiles change with age. In particular, the study recorded the activity of respirasomes, i.e. the concerted activity of mitochondrial respiratory complex chains consisting of CI+CIII2+CIV, CII+CIII2+CIV or CIV alone.

      The strength is certainly the atlas of oxidative metabolism in the whole mouse body, the inclusion of the two different sexes and the comparison between young and old mice. The measurement was performed on frozen tissue, which is possible as already shown (Acin-Perez et al, EMBO J, 2020).

      Weakness:

      The assay reveals the maximum capacity of enzyme activity, which is an artificial situation and may differ from in vivo respiration, as the authors themselves discuss. The material used was a very crude preparation of cells containing mitochondria and other cytosolic compounds and organelles. Thus, the conditions are not well defined and the respiratory chain activity was certainly uncoupled from ATP synthesis. Preparation of more pure mitochondria and testing for coupling would allow evaluation of additional parameters: P/O ratios, feedback mechanism, basal respiration, and ATP-coupled respiration, which reflect in vivo conditions much better. The discussion is rather descriptive and cautious and could lead to some speculations about what could cause the differences in respiration and also what consequences these could have, or what certain changes imply.

      Nevertheless, this study is an important step towards this kind of analysis.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Although this study provides a comprehensive outlook on the ETC function in various tissues, the main caveat is that it's too technical and descriptive. The authors didn't invest much effort in putting their findings in the context of the biological function of the tissue analyzed, i.e., some tissues might be more glycolytic than others and have low ETC activity.

      To better contextualize our results, we have added a substantial amount of new information to the Discussion Section.

      Also, it is unclear what slight changes in the activity of one or the other ETC complex mean in terms of mitochondrial ATP production.

      Unfortunately, the method we used can only determine oxygen consumption rate through complex I (CI), CII, or CIV. It cannot tell us about ATP production. This method only measures maximal uncoupled respiration.

      Likely, these small changes reported do not affect the mitochondrial respiration.

      We are indeed looking at mitochondrial respiration. Some changes are more dramatic while others are much more modest. We are looking at the normal aging process across tissues (focusing on mitochondrial respiration) and not pathological states. As such, we expect many of the changes in mitochondrial respiration across tissues to be mild or relatively modest. After all, aging is slow and progressive. In fact, the variations we observed in mitochondrial respiration across tissues are consistent with the known heterogeneous rate of aging across tissues.

      With such a detailed dataset, the study falls short of deriving more functionally relevant conclusions about the heterogeneity of mitochondrial function in various tissues. In the current format, the readers get lost in the large amount of data presented in a technical manner.

      We agree that the paper contains a large amount of information. In the revised manuscript, we did our best to contextualize our results by substantially expanding the Discussion Section.

      Also, it is highly recommended that all the raw data and the values be made available as an Excel sheet (or other user-friendly formats) as a resource to the community.

      We included all the data in two Excel sheets (Figure 1 – data source 1; Figure 1 – data source 2). We presented them in such a way that it will be easy for other investigators to follow and re-use our dataset in their own studies for comparison.

      Major concerns

      (1) In this study, the authors used the method developed by Acin-Perez and colleagues (EMBO J, 2020) to analyze ETC complex activities in mitochondria derived from the snap-frozen tissue samples. However, the preservation of cellular/mitochondrial integrity in different types of tissues after being snap-frozen was not validated.

      All the samples are actually maximally preserved due to being snap frozen. Freezing the samples disrupts the mitochondria to produce membrane fragments. Subsequent thawing, mincing, and homogenization in a non-detergent based buffer (mannose-sucrose) ensures that all tissue samples are maximally disrupted into fragments which contain ETC units in various combinations. This allows the assay to give an accurate representation of maximal respiratory capacity given the ETC units present in a tissue sample.

      Since aging has been identified as the most important effector in this study, it is essential to validate how aging affects respiration in various fresh frozen tissues. Such analysis will ensure that the results presented are not due to the differential preservation of the mitochondrial respiration in the frozen tissue. In addition, such validations will further strengthen the conclusions and promote the broad usability of this "new" method.

      We adopted this method because it has been rigorously validated in the original publication (PMID: 32432379) and a subsequent methods paper (PMID: 33320426). The authors of the original paper benchmarked their frozen tissue method against freshly isolated mitochondria from the same set of tissues. Their work showed highly comparable mitochondrial respiration from frozen tissues and isolated mitochondria. For this reason, we did not repeat those validation studies.

      (2) In this study, the authors sampled the maximal activity of ETC complex I, II, and IV, but throughout the manuscript, they discussed the data in the context of mitochondrial function.

      We apologize that we did not make this clearer in our manuscript. We have corrected this in our revised manuscript (the Discussion Section). Our method measures respiration starting at Complex I (CI; via NADH), starting at CII (via succinate), or starting at CIV (using TMPD and ascorbate). Regardless of whether electrons (donated by the substrate) enter the respiratory chain through CI, CII or CIV, oxygen (as the final electron acceptor) is only consumed at CIV. Therefore, the method measures mitochondrial respiration and function through CI, CII, or CIV. This high-resolution respirometry analysis method is different from the classic enzymatic method of assessing CI, CII, or CIV activity individually; the enzymatic method does not actually measure oxygen consumption due to electrons flowing through the respiratory complexes.

      However, it is unclear how the changes in CI, CII, and CIV activity affect overall mitochondrial function (if at all) and how small changes seen in the maximal activity of one or more complexes affect the efficiency and efficacy of ATP production (OxPhos).

      Please see our response to the previous question. The method measures mitochondrial respiration through CI, CII or CIV. The limitation of this method is that it measures maximal uncoupled respiration; namely, mitochondrial respiration is not coupled to ATP synthesis since the measurements are not performed on intact mitochondria. As such, we cannot say anything about the efficiency and efficacy of ATP production. It will be interesting for future studies to further investigate tissue-level variations in mitochondrial OXPHOS.

      The authors report huge variability between the activity of different complexes - in some tissues all three complexes (CI, CII, and CIV) and often in others, just one complex was affected. For example, as presented in Figure 4, there is no difference in CI activity in the hippocampus and cerebellum, but there is a slight change in CII and CIV activity. In contrast, in heart atria, there is a change in the activity of CI but not in CII and CIV. However, the authors still suggest that there is a significant difference in mitochondrial activity (e.g., "Old males showed a striking increase in mitochondrial activity via CI in the heart atria....reduced mitochondrial respiration in the brain cortex..." - Lines 5-7, Page 9). Until and unless a clear justification is provided, the authors should not make these broad claims on mitochondrial respiration based on small changes in the activity of one or more complexes (CI/CII/CIV). With such a data-heavy and descriptive study, it is confusing to track what is relevant and what is not for the functioning of mitochondria.

      We have attempted to address these issues in the revised Discussion section.

      (3) What do differences in the ETC complex CI, CII, and CIV activity in the same tissue mean? What role does the differential activity of these complexes (CI, CII, and CIV) play in mitochondrial function? What do changes in Oxphos mean for different tissues? Does that mean the tissue (cells involved) shift more towards glycolysis to derive their energy? In the best world, a few experiments related to the glycolytic state of the cells would have been ideal to solidify their finding further. The authors could have easily used ECAR measurements for some tissues to support their key conclusions.

      We have attempted to address these issues in the revised Discussion section. The frozen tissue method does not involve intact mitochondria. As such, the method cannot measure ECAR, which requires the presence of intact mitochondria.

      (4) The authors further analyzed parameters that significantly changed across their study (Figure 7, 98 data points analyzed). The main caveat of such analysis is that some tissue types would be represented three or even more times (due to changes in the activity of all three complexes - CI, CII, and CIV, and across different ages and sexes), and some just once. Such a method of analysis will skew the interpretation towards a few over-represented organ/tissue systems. Perhaps the authors should separately analyze tissue where all three complexes are affected from those with just one affected complex.

      Figure 7 summarizes the differences between male vs female, and between young vs old. All the tissue-by-tissue comparisons (data separated by CI-linked respiration, CII-linked respiration, and CIV-linked respiration) can be found in earlier figures (Figures 1-6).

      The focus of Figure 7 is to help us better appreciate all the changes seen in the preceding Figures 1-6:

      Panel A and B indicate all changes that are considered significant

      Panel C indicates total tissues with at least one significantly affected respiration

      Panel D indicates total magnitude of change (i.e., which tissue has the highest OCR) offering a non-relative view

      Panel E indicates whole body separations

      Panel F indicates whole body separations and age vs sex clustering

      (5) The current protocol does not provide cell-type-specific resolution and will be unable to identify the cellular source of mitochondrial respiration. This becomes important, especially for those organ systems with tremendous cellular heterogeneity, such as the brain. The authors should discuss whether the observed changes result from an altered mitochondria respiratory capacity or if changes in proportions of cell types in the different conditions studied (young vs. aged) might also contribute to differential mitochondrial respiration.

      We agree with the reviewer that this is a limitation of the method. We have addressed this issue in the revised Discussion section.

      (6) Another critical concern of this study is that the same datasets were repeatedly analyzed and reanalyzed throughout the study with almost the same conclusion - namely, aging affects mitochondrial function, and sex-specific differences are limited to very few organs. Although this study has considerable potential, the authors missed the chance to add new insights into the distinct characteristics of mitochondrial activity in various tissue and organ systems. The author should invest significant efforts in putting their data in the context of mitochondrial function.

      We have attempted to address these issues in the revised Discussion section.

      Reviewer #2 (Public Review):

      Summary:

      The authors utilize a new technique to measure mitochondrial respiration from frozen tissue extracts, which goes around the historical problem of purifying mitochondria prior to analysis, a process that requires a fair amount of time and cannot be easily scaled up.

      Strengths:

      A comprehensive analysis of mitochondrial respiration across tissues, sexes, and two different ages provides foundational knowledge needed in the field.

      Weaknesses:

      While many of the findings are mostly descriptive, this paper provides a large amount of data for the community and can be used as a reference for further studies. As the authors suggest, this is a new atlas of mitochondrial function in mouse. The inclusion of a middle aged time point and a slightly older young point (3-6 months) would be beneficial to the study.

      We agree with the reviewer that inclusion of additional time points (e.g., 3-6 months) would further strengthen the study. However, the cost, labor, and time associated with another set of samples (660 tissue samples from male and female mice and 1980 respirometry assays) are too high for our lab, which has a limited budget and manpower. Regrettably, we will not be able to carry out the extra work as requested by the reviewer.

      Reviewer #3 (Public Review):

      The aim of the study was to map a) whether different tissues exhibit different metabolic profiles (this is known already), b) what differences are found between female and male mice, and c) how the profiles change with age. In particular, the study recorded the activity of respirasomes, i.e. the concerted activity of mitochondrial respiratory complex chains consisting of CI+CIII2+CIV, CII+CIII2+CIV or CIV alone.

      The strength is certainly the atlas of oxidative metabolism in the whole mouse body, the inclusion of the two different sexes and the comparison between young and old mice. The measurement was performed on frozen tissue, which is possible as already shown (Acin-Perez et al, EMBO J, 2020).

      Weakness:

      The assay reveals the maximum capacity of enzyme activity, which is an artificial situation and may differ from in vivo respiration, as the authors themselves discuss. The material used was a very crude preparation of cells containing mitochondria and other cytosolic compounds and organelles. Thus, the conditions are not well defined and the respiratory chain activity was certainly uncoupled from ATP synthesis. Preparation of more pure mitochondria and testing for coupling would allow evaluation of additional parameters: P/O ratios, feedback mechanism, basal respiration, and ATP-coupled respiration, which reflect in vivo conditions much better. The discussion is rather descriptive and cautious and could lead to some speculations about what could cause the differences in respiration and also what consequences these could have, or what certain changes imply.

      Nevertheless, this study is an important step towards this kind of analysis.

      We have attempted to address some of these issues in the revised Discussion Section. The frozen tissue method can only measure maximal uncoupled respiration. Because we are not measuring mitochondrial respiration using intact mitochondria, several of the functional parameters the reviewer alluded to (e.g., P/O ratios, feedback mechanism, basal respiration, and ATP-coupled respiration) simply cannot be obtained with the current set of samples. Nevertheless, we agree that all the additional data (if obtained) would be very informative.

      Reviewer #1 (Recommendations For The Authors):

      (1) For most of the comparative analysis, the authors normalized OCR/min to MitoTracker Deep RedFM (MTDR) fluorescence intensity. Why was the data normalized to the total protein content not used for comparative analysis? Is there a correlation between MTDR fluorescence and the protein content across different tissues?

      Given that we used the crude extract method, total protein content does not equal total mitochondrial protein content. This is why the MTDR method was used, as this represents a high throughput method of assessing mitochondrial mass in this volume of samples. In general, the total protein concentration is used to ensure the respiration intensity was approximately the same across all samples loaded into the Seahorse machine.
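
      As a minimal sketch of the normalization described above (an illustration only, not the analysis code used in the study; the function name, sample names, and numerical values are hypothetical), dividing OCR by the MTDR signal gives different answers for two lysates that have the same raw OCR but different mitochondrial mass:

```python
def normalize_ocr(ocr_pmol_per_min: float, mtdr_fluorescence: float) -> float:
    """Return OCR per unit of MTDR signal (MTDR is used as a proxy for mitochondrial mass)."""
    if mtdr_fluorescence <= 0:
        raise ValueError("MTDR fluorescence must be positive")
    return ocr_pmol_per_min / mtdr_fluorescence


# Hypothetical values: two lysates with identical raw OCR but different MTDR signal.
samples = {
    "tissue_a": {"ocr": 120.0, "mtdr": 400.0},
    "tissue_b": {"ocr": 120.0, "mtdr": 800.0},
}

# OCR/MTDR for each sample; the raw OCR values alone would not distinguish them.
normalized = {
    name: normalize_ocr(values["ocr"], values["mtdr"])
    for name, values in samples.items()
}
```

      In this toy case the two samples have equal raw OCR, yet the sample with half the MTDR signal has twice the respiration per unit of mitochondrial mass, which is the distinction that normalization to total protein would miss in a crude lysate.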

      (2) To test the mitochondrial isolation yield, the authors should run immunoblot against canonical mitochondrial proteins in both homogenates and mitochondrial-containing supernatants and show that the protocol followed effectively enriched mitochondria in the supernatant fraction. This would also strengthen the notion that the "µg protein" value used to normalize the total mitochondrial content comes from isolated mitochondria and not other extra-mitochondrial proteins.

      Because we are using crude tissue lysate (from frozen tissue), the total µg protein content does not come from isolated mitochondria; for this reason, it was not used for normalization and MTDR was used instead. Total mitochondrial protein content is subject to change depending on tissue for non-mitochondrial reasons. This method does not use isolated mitochondria; we only use tissue lysates enriched for mitochondrial proteins. This method has been rigorously validated in the original study (PMID: 32432379) and a subsequent methods paper (PMID: 33320426). In those studies, the authors performed the requisite quality checks the reviewer has asked for (e.g., immunoblot against canonical mitochondrial proteins in both homogenates and mitochondrial-containing supernatants to show effective enrichment of mitochondrial proteins). For this reason, we did not repeat this.

      (3) MitoTracker loads into mitochondria in a membrane potential-dependent manner. The authors should rule out the possibility that samples from different ages and sexes might have different mitochondrial membrane potentials and exhibit a differential MitoTracker loading capacity. This becomes relevant for data normalization based on MTDR (MTDR/µg protein) since it was assumed that loading capacity is the same for mitochondria across different tissue and age groups.

      MitoTracker Deep Red is not membrane potential dependent and can be effectively used to quantify mitochondrial mass even when mitochondrial membrane potential is lost. This is highlighted in the original study (PMID: 32432379).

      (4) Page 11, line 3 typo - across, not cross.

      Response: We have fixed the typo.

      Reviewer #2 (Recommendations For The Authors):

      If possible, I would include a middle aged time point between 12 and 14 months of age.

      We agree with the reviewer that inclusion of additional time points (e.g., 3-6 months) would further strengthen the study. However, the cost, labor, and time associated with another set of samples (660 tissue samples from male and female mice and 1980 respirometry assays) are too high for our lab, which has a limited budget and manpower. Regrettably, we will not be able to carry out the extra work as requested by the reviewer.

      Reviewer #3 (Recommendations For The Authors):

      Overall, the work is well done and the data are well processed making them easy to understand. Some minor adjustments would improve the manuscript further:

      - Significance OCR in Figure 2, maybe add error bars?

      We have added the error bars and statistical significance to revised Figure 2.

      - Tissue comparison A-C, right panel: graphs are cropped

      We are not sure what the reviewer meant here. We have double checked all our revised figures to make sure nothing is accidentally cropped.

      - Heart ventricle: Old males and females have higher CI- and CII-dependent respiration than young males and females? Only CIV respiration is lower?

      Comparing old to young male or female heart ventricle respiration via CI or CII shows an increase in maximal capacity with age. CIV-linked respiration is in the upward direction as well, although not significant, when comparing old to young. When comparing the respiration values among themselves within a mouse, i.e. old male CI- or CII-linked respiration compared to old male CIV-linked respiration, we can see that the old male CIV-linked respiration is very similar. When comparing the same in the old female mouse, there appears to be something special about electrons entering through CI as compared to CII or CIV, as CI-linked respiration appears to be elevated compared to both CII and CIV. Although we do not know if this is significantly different, the trend in the data is clear. We do not know the exact reason why this occurred in the heart ventricles. To differing degrees, the connected nature of CI-, CII-, and CIV-linked respirations seems to follow a generally similar pattern in most skeletal muscles as well, and in the old male heart atria. Again, the root of this discrepancy is unknown; it potentially indicates an interesting physiologic trait of certain types of muscle and merits further exploration.

      - What is plotted in Fig.3: The mean of all OCR of all tissues? A,B,C: Plot with break in x-axis to expand the violin, add mean/median values as numbers to the graph (same for Fig4)

      The leftmost side of Figure 3 A, B, and C shows the average OCR/MTDR value across all tissues in a group. Each tissue assayed is represented in the violin plot as an open circle.

      - Fig. 3D: add YM/YF to graph for better understanding, same in following figures

      This is in the scale bar next to all heat maps presented in the figures. We have also added it to the revised figure to improve clarity.

      - Additional figures: x-axis title (time) is missing in OCR graphs

      Time has been added to the x axis of all additional figures for clarity.

      - Also a more general question is: were the concentrations of substrates and inhibitors optimized before starting the series of experiments?

      All the assay optimization was carried out in the original study (PMID: 32432379) and the subsequent methods paper (PMID: 33320426). Because we had to survey 33 different tissues, we tested and optimized the protein concentrations to use; the primary goal was to obtain enough respiration signal, without too much, across all tissue types, so as to keep all the diverse tissues within the Seahorse machine’s detection range. Through our optimization, mostly of the very high respiring tissues like heart and kidney, we were also able to show that all substrates and inhibitors were at saturating concentrations, since respiration went higher when more sample was added and all signal was lost in these samples with the same amount of inhibitors.

    1. eLife assessment

      This study offers a valuable description of the layer-and sublayer specific outputs of the somatosensory cortex based on compelling evidence obtained with modern tools for the analysis of brain connectivity, together with functional validation of the connectivity using optogenetic approaches in vivo. Beyond bridging together, in one dataset, the results of disparate studies, this effort brings new insights on layer specific outputs, and on differences between primary and secondary somatosensory areas. This study will be of interest to neuroanatomists and neurophysiologists.

    2. Reviewer #1 (Public review):

      Summary:

      This is a fine paper that serves the purpose to show that the use of light sheet imaging may be used to provide whole brain imaging of axonal projections. The data provided suggest that at this point the technique provides lower resolution than with other techniques. Nonetheless, the technique does provide useful, if not novel, information about particular brain systems.

      Strengths:

      The manuscript is well written. In the introduction, a clear description of the functional organization of the barrel cortex is provided, which gives the context for applying specific Cre-driver lines to map the projections of the main cortical projection types using whole brain neuroanatomical tracing techniques. The results provided are also well written, with sufficient detail describing the specifics of how techniques were used to obtain relevant data. Appropriate controls were done, including the identification of whisker fields for viral injections and determination of the laminar pattern of Cre expression. The mapping of the data provides a good way to visualize low resolution patterns of projections.

      Weaknesses:

      (1) The results provided are, as stated in the discussion, "largely in agreement with previously reported studies of the major projection targets". However it must be stated that the study does not "extend current knowledge through the high sensitivity for detecting sparse axons, the high specificity of labeling of genetically defined classes of neurons and the brain wide analysis for assigning axons to detailed brain regions" which have all been published in numerous other studies (the Allen connectivity project and related papers, along with others). If anything, the labeling of axons obtained with light sheet imaging in this study does not provide as detailed mapping as that obtained with other techniques. Some detail is provided of how the raw images are processed to resolve labeled axons, but the images shown in the figures do not demonstrate how well individual axons may be resolved; of particular interest would be to see labeling in terminal areas such as other cortical areas, striatum and thalamus. As presented, the light sheet imaging appears to be rather low resolution compared to the many studies that have used viral tracing to look at cortical projections from genetically identified cortical neurons.

      (2) Amongst the limitations of this study is the inability to resolve axons of passage and terminal fields. This has been done in other studies with viral constructs labeling synaptophysin. This should be mentioned.

      (3) Figure 5 is an example of the type of large sets of data that can be generated with whole brain mapping and registration to the Allen CCF that provides information of questionable value. Ordering the 50 plus structures by the density of labeling does not provide much in terms of relative input to different types of areas. There are multiple subregions for different functional types (i.e., different visual areas and different motor subregions) that are scattered rather than grouped together, which makes it difficult to understand any organizing principles.

      (4) The GENSAT Cre driver lines used must have the specific line name used, not just the gene name, as the GENSAT BAC-Cre lines had multiple lines for each gene and often with very different expression patterns. Rbp4_KL100, Tlx3_PL56, Sim1_KJ18, Ntsr1_GN220.

    3. Reviewer #2 (Public review):

      Summary:

      This study takes advantage of multiple methodological advances to perform layer-specific staining of cortical neurons and tracking of their axons to identify the pattern of their projections. This publication offers a mesoscale view of the projection patterns of neurons in the whisker primary and secondary somatosensory cortex. The authors report that, consistent with the literature, the pattern of projection is highly different across cortical layers and subtype, with targets being located around the whole brain. This was tested across 6 different mouse types that expressed a marker in layer 2/3, layer 4, layer 5 (3 sub-types) and layer 6.

      Looking more closely at the projections from primary somatosensory cortex into the primary motor cortex, they found that there was a significant spatial clustering of projections from topographically separated neurons across the primary somatosensory cortex. This was true for neurons with cell bodies located across all tested layers/types.

      Strengths:

      This study successfully looks at the relevant scale to study projection patterns, which is the whole brain. This is achieved thanks to an ambitious combination of mouse lines, immuno-histochemistry, imaging and image processing, which results in a standardized histological pipeline that processes the whole-brain projection patterns of layer-selected neurons of the primary and secondary somatosensory cortex.

      This standardization means that comparisons between cell-type projection patterns are possible and that both the large scale structure of the pattern and the minute details of the intra-area pattern are available.

      This reference dataset and the corresponding analysis code are made available to the research community.

      Weaknesses:

      One major question raised by this dataset is the risk of missing axons during the post-processing step. Following the previous review round, my concerns have been addressed regarding this point.

    4. Reviewer #3 (Public review):

      Summary:

      The paper offers a systematic and rigorous description of the layer-and sublayer specific outputs of the somatosensory cortex using a modern toolbox for the analysis of brain connectivity which combines: 1) Layer-specific genetic drivers for conditional viral tracing; 2) whole brain analyses of axon tracts using tissue clearing and imaging; 3) Segmentation and quantification of axons with normalization to the number of transduced neurons; 4) registration of connectivity to a widely used anatomical reference atlas; 5) functional validation of the connectivity using optogenetic approaches in vivo.

      Strengths:

      Although the connectivity of the somatosensory cortex is already known, precise data are dispersed in different accounts (papers, online resources) using different methods. So the present account has the merit of condensing this information in one very precisely documented report. It also brings new insights on the connectivity, such as the precise comparison of layer specific outputs, and of the primary and secondary somatosensory areas. It also shows a topographic organization of the circuits linking the somatosensory and motor cortices. The paper also offers a clear description of the methodology and of a rigorous approach to quantitative anatomy.

      Weaknesses:

      The weakness relates to the intrinsic limitations of the in toto approaches, which currently lack the precision and resolution needed to identify single axons, axon branching or synaptic connectivity. These limitations are identified and discussed by the authors.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This is a fine paper that serves the purpose to show that the use of light sheet imaging may be used to provide whole brain imaging of axonal projections. The data provided suggest that at this point the technique provides lower resolution than with other techniques. Nonetheless, the technique does provide useful, if not novel, information about particular brain systems. 

      Strengths: 

      The manuscript is well written. In the introduction, a clear description of the functional organization of the barrel cortex is provided, which gives the context for applying specific Cre-driver lines to map the projections of the main cortical projection types using whole brain neuroanatomical tracing techniques. The results provided are also well written, with sufficient detail describing the specifics of how techniques were used to obtain relevant data. Appropriate controls were done, including the identification of whisker fields for viral injections and determination of the laminar pattern of Cre expression. The mapping of the data provides a good way to visualize low resolution patterns of projections.

      Weaknesses: 

      (1) The results provided are, as stated in the discussion, "largely in agreement with previously reported studies of the major projection targets". However it must be stated that the study does not "extend current knowledge through the high sensitivity for detecting sparse axons, the high specificity of labeling of genetically defined classes of neurons and the brain wide analysis for assigning axons to detailed brain regions" which have all been published in numerous other studies (the Allen connectivity project and related papers, along with others). If anything, the labeling of axons obtained with light sheet imaging in this study does not provide as detailed mapping as that obtained with other techniques. Some detail is provided of how the raw images are processed to resolve labeled axons, but the images shown in the figures do not demonstrate how well individual axons may be resolved; of particular interest would be to see labeling in terminal areas such as other cortical areas, striatum and thalamus. As presented, the light sheet imaging appears to be rather low resolution compared to the many studies that have used viral tracing to look at cortical projections from genetically identified cortical neurons.

      We agree with the reviewer that the resolution of imaging should be further improved in future studies, as also mentioned in the original manuscript. On P. 17 of the revised manuscript we write “Probably most important for future studies is the need to increase the light-sheet imaging resolution perhaps combined with the use of expansion microscopy to provide brain-wide micron-resolution data (Glaser et al., 2023; Wassie et al., 2019).” However, even at somewhat lower resolution, through bright sparse labelling, individual axonal segments can nonetheless be traced through machine learning to define axonal skeletons, whose length can be quantified as we do in this study. This methodology highlights sparse wS1 and wS2 innervation of a large number of brain areas, some of which are not typically considered, and our anatomical results might therefore help the neuronal circuit analysis underlying various aspects of whisker sensorimotor processing. Despite impressive large-scale projection mapping projects such as the Allen connectivity atlas, there remain relatively sparse cell type-specific projection map data for the representations of the large posterior whiskers in wS1 and wS2, and our data in this study thus add to a growing body of cell type-specific projection mapping with the specific focus on the output connectivity of these whisker-related neocortical regions of sensory cortex.

      In the revised manuscript, we now provide an additional supplementary figure (Figure 1 – figure supplement 2) showing examples of the axonal segmentation from further additional image planes including branching axons in the key innervation regions mentioned by the reviewer, namely “other cortical areas, striatum and thalamus”.

      (2) Amongst the limitations of this study is the inability to resolve axons of passage and terminal fields. This has been done in other studies with viral constructs labeling synaptophysin. This should be mentioned. 

      The reviewer brings up another important point for future methodological improvements to enhance connectivity mapping. Indeed, we already mentioned this in our original submission near the end of the first paragraph under the Limitations and future perspectives section. In the revised manuscript on P. 17, we write “Future studies should also aim to identify neurotransmitter release sites along the axon, which could be achieved by fluorescent labeling of prominent synaptic components, such as synaptophysin-GFP (Li et al., 2010).”

      (3) There is no quantitative analysis of differences between the genetically defined neurons projecting to the striatum, what is the relative area innervated by, density of terminals, other measures. 

      The reviewer raises an interesting question, and in the revised manuscript, we now present a more detailed analysis of cell class-specific axonal projections focusing specifically on the striatum. Following the reviewer’s suggestion, in a new supplementary figure (Figure 7 – figure supplement 1), we now report spatial axonal density maps in the striatum from SSp-bfd and SSs, finding potentially interesting differences comparing the projections of Rasgrf2-L2/3, Scnn1a-L4 and Tlx3-L5IT neurons. On P. 12 of the revised manuscript, we now write “We also investigated the spatial innervation pattern of Rasgrf2-L2/3, Scnn1a-L4 and Tlx3-L5IT neurons in the striatum (Figure 7 – figure supplement 1), where we found that axonal density from Rasgrf2-L2/3 neurons in both SSp-bfd and SSs was concentrated in a posterior dorsolateral part of the ipsilateral striatum, whereas Tlx3-L5IT neurons had extensive axonal density across a much larger region of the striatum, including bilateral innervation by SSp-bfd neurons. Striatal innervation by Scnn1a-L4 neurons was intermediate between Rasgrf2-L2/3 and Tlx3-L5IT neurons.” We think the reviewer’s comment has helped reveal further interesting aspects of our data set, and we thank the reviewer.

      (4) Figure 5 is an example of the type of large data set that can be generated with whole-brain mapping and registration to the Allen CCF but that provides information of questionable value. Ordering the 50-plus structures by the density of labeling does not provide much in terms of relative input to different types of areas. There are multiple subregions for different functional types (i.e., different visual areas and different motor subregions are scattered, not grouped together), which makes it difficult to understand any organizing principles.

      We agree with the reviewer, and fully support the importance of considering subregions within the relatively coarse compartmentalization of the current Allen CCF. In order to provide some further information about connectivity that may help give the reader further insights into the data, we have now added further quantification of cortex-specific axonal density ranked according to functional subregions in a new supplementary figure (Figure 5 – figure supplement 2). 

      (5) The GENSAT Cre driver lines used must be given with the specific line name, not just the gene name, as the GENSAT BAC-Cre collection had multiple lines for each gene, often with very different expression patterns: Rbp4_KL100, Tlx3_PL56, Sim1_KJ18, Ntsr1_GN220. 

      In the revised manuscript, we now write out a fuller description of the mouse lines the first time they are mentioned in the Results section on P. 7. The full mouse line names, accession numbers and references were of course already described in the methods section, which remains the case in the revised manuscript.

      Reviewer #2 (Public Review): 

      Summary: 

      This study takes advantage of multiple methodological advances to perform layer-specific staining of cortical neurons and tracking of their axons to identify the pattern of their projections. This publication offers a mesoscale view of the projection patterns of neurons in the whisker primary and secondary somatosensory cortex. The authors report that, consistent with the literature, the pattern of projection is highly different across cortical layers and subtype, with targets being located around the whole brain. This was tested across 6 different mouse types that expressed a marker in layer 2/3, layer 4, layer 5 (3 sub-types) and layer 6.  Looking more closely at the projections from primary somatosensory cortex into the primary motor cortex, they found that there was a significant spatial clustering of projections from topographically separated neurons across the primary somatosensory cortex. This was true for neurons with cell bodies located across all tested layers/types. 

      Strengths: 

      This study successfully looks at the relevant scale to study projection patterns, which is the whole brain. This is achieved thanks to an ambitious combination of mouse lines, immunohistochemistry, imaging and image processing, which results in a standardized histological pipeline that processes the whole-brain projection patterns of layer-selected neurons of the primary and secondary somatosensory cortex. 

      This standardization means that comparisons between cell-types projection patterns are possible and that both the large-scale structure of the pattern and the minute details of the intra-areas pattern are available. 

      This reference dataset and the corresponding analysis code are made available to the research community. 

      Weaknesses: 

      One major question raised by this dataset is the risk of missing axons during the postprocessing step. Indeed, it appears that the control and training efforts have focused on the risk of false positives (see Figure 1 supplementary panels). And indeed, the risk of overlooking existing axons in the raw fluorescence data is discussed in the article. 

      Based on the data reported in the article, this is more than a risk. In particular, Figure 2 shows an example Rbp4-L5 mouse where axonal spread seems massive in Hippocampus, while there is no mention of this area in the processed projection data for this mouse line. 

      In Figure 2, we show the expression of tdTomato in double-transgenic mice in which the Cre-driver lines were crossed with a Cre-dependent reporter mouse expressing cytosolic tdTomato. In addition to the specific labelling of L5PT neurons in the somatosensory cortex, Rbp4-Cre mice also express Cre-recombinase in other brain regions including the hippocampus. In the reporter mice crossed with Rbp4-Cre mice, tdTomato is expressed in neurons with cell bodies in the hippocampus which is clearly visualized in Figure 2. Because our axonal labelling is based on localized viral vector expression of tdTomato in SSp-bfd and SSs, the expression of Cre in hippocampus does not affect our analysis. In order to clarify to the reader, in the legend to Figure 2D, we now specifically write “As for panel A, but for Rbp4-L5 neurons. Note strong expression of Cre in neurons with cell bodies located in the hippocampus, which does not affect our analysis of axonal density based on virus injected locally into the neocortex.” Consistent with this observation, the Allen Institute’s ISH data support expression of Rbp4 in neurons of the hippocampus e.g. https://mouse.brainmap.org/gene/show/19425 and https://mouse.brainmap.org/experiment/show/68632655.

      Similarly, the Ntsr1-L6CT example shows a striking level of fluorescence in Striatum that is not reflected in the amount of axons detected by the algorithms in the next figures. These apparent discrepancies may be due to non-axon-specific fluorescence in the samples. In any case, further analysis of such anatomical areas would be useful to consolidate the valuable dataset provided by the article. 

      As pointed out above, Figure 2 shows cytosolic tdTomato fluorescence in transgenic crosses of the Cre-driver mice with Cre-dependent tdTomato reporter mice. For the Ntsr1-Cre x LSL-tdTomato mice, all corticothalamic L6CT neurons from across the entire cortex drive tdTomato expression. The axon of each neuron must traverse the striatum giving rise to fluorescence in the striatum. As discussed above, labelling of synaptic specialisations will be important in future studies to separate travelling axon from innervating axon. However, the overall impact of the axons traversing the striatum is again mitigated in our study by considering the axonal projections from local sparse infections in SSp-bfd and SSs rather than from cortex-wide tdTomato expression.

      Reviewer #3 (Public Review): 

      Summary: 

      The paper offers a systematic and rigorous description of the layer- and sublayer-specific outputs of the somatosensory cortex using a modern toolbox for the analysis of brain connectivity which combines: 1) layer-specific genetic drivers for conditional viral tracing; 2) whole-brain analyses of axon tracts using tissue clearing and imaging; 3) segmentation and quantification of axons with normalization to the number of transduced neurons; 4) registration of connectivity to a widely used anatomical reference atlas; 5) functional validation of the connectivity using optogenetic approaches in vivo. 

      Strengths: 

      Although the connectivity of the somatosensory cortex is already known, precise data are dispersed across different accounts (papers, online resources) using different methods. So the present account has the merit of condensing this information in one very precisely documented report. It also brings new insights on the connectivity, such as the precise comparison of layer-specific outputs, and of the primary and secondary somatosensory areas. It also shows a topographic organization of the circuits linking the somatosensory and motor cortices. The paper also offers a clear description of the methodology and of a rigorous approach to quantitative anatomy. 

      Weaknesses: 

      The weakness relates to the intrinsic limitations of the in toto approaches, that currently lack the precision and resolution allowing to identify single axons, axon branching or synaptic connectivity. These limitations are identified and discussed by the authors. 

      We agree with the reviewer.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      No additional comment 

      OK

      Reviewer #2 (Recommendations For The Authors): 

      In Figure 8, we don't get to see much raw data, while the diversity of functional responses pattern to the primary and supplementary S1 activations is highly intriguing (and this diversity exists as suggested by the results in Figure 8E, LRPT). 

      Can Figure 8C be less blurred? Maybe give more space to individual examples, such as an overlay of the delineations of the activated area across the tested mice? 

      Also, can we have a view on the time dynamics of the functional activation and integration window? 

      Raw data - We have now added a new supplementary figure (Figure 8 – figure supplement 1) to show data from individual mice, as well as plotting the time-course of the evoked jRGECO fluorescence signals in the frontal cortex hotspot. 

      Image blur - Each pixel represents 62.5 x 62.5 µm on the cortical surface. The images in Figure 8B&C were averaged across mice, which causes some additional spatial blurring. However, the most likely explanation for the ‘blurred’ impression is the large horizontal extent of the axonal innervation, together with the likely rapid lateral spread of excitation both at the stimulation area and in the target region, as also indicated, for example, in rapid voltage-sensitive imaging experiments (Ferezou et al., 2007).  

      Reviewer #3 (Recommendations For The Authors): 

      At the time being, the abstract is really centred on the methodology, which is no longer very novel as it has already been described previously by other groups. In my view the paper would gain visibility, and be a useful tool for the community, if amended to better point out the significant results of the study, for instance: i) the layer and sub-layer specificity of the outputs, using the listed genetic drivers; ii) the comparison of primary and secondary somatosensory areas; iii) the functional validation. The layer specificity of each Cre-line should be indicated in the abstract. 

      We have tried to improve the writing of the abstract along the lines suggested by the reviewer. Specifically, we have now added layer and projection class of the various Cre-lines, and we now also highlight the most obvious differences in the innervation patterns.

      There is some degree of redundancy in the description in the result section. One suggestion, for an easier flow of reading, would be to join the paragraphs " Laminar characterization of the Cre-lines.." and: "Axonal projections...". Start for each Cre-line with a description of the laminar localisation of recombination in the somatosensory cortices, followed therefrom by the description of outputs from SSp-bfd and SSs; Then the general description/overview of the outputs can be summarized as a legend to Figure 5-supplementary 2, which could appear as a main figure. 

      Although we agree with the reviewer that there is some level of redundancy in the text, the characterization of the Cre-lines (Figure 2) is quite a different experiment from the viral injections described in other figures, and we therefore prefer to keep these sections separate.

      Other minor points: 

      In the text; Indicate the genetic background of the transgenic mouse lines. 

      On P. 18, we now indicate that all mice were “back-crossed with C57BL/6 mice”.

      Keep consistency in the designation of the areas, S1 appears sometimes as SSp-bfd or as SSp 

      We thank the reviewer for pointing out the inconsistent nomenclature, which we have now corrected in the revised manuscript. ‘SSp’ remains used on P. 9 and P. 16 of the revised manuscript to indicate a region including SSp-bfd but also extending beyond.

      Figure 1 supplement 2 is not really necessary to show (as the viral tools have previously been validated) can just be stated in the text. Conversely one would like to see a higher resolution image of the injection sites that allowed to do the cell counts used for normalization, as this can be pretty tricky. 

      In response to the reviewer’s suggestion, we have now added a new supplemental figure to show an example of how cells in the injection site were counted (Figure 1 – figure supplement 3).

      Figure 2: the most important here is the higher magnification to show the precise laminar localisation of the recombination, rather than the atlas landmarks that are already shown in Figure 1. This would allow more space for clearer higher magnification panels comparing SSs and SSp. The present image hints to some real differences, but these are difficult to appreciate with the current resolution. The legend should also comment on the labelling seen in layer 1, in the Tlx3 and Rbp4 lines. Could be dendritic labelling, but this needs a word of clarification.

      We think both the overview images as well as the high-resolution images are of value to the reader. Following the reviewer’s comment, in the legends to Figure 2C&D, we have now added text suggesting that the layer 1 fluorescence is likely axonal or dendritic in origin: “Labelling in layer 1 is likely of axonal or dendritic origin, and no cell bodies were labelled in this layer.” In addition, we have added a new supplemental figure which shows the cortical labelling in SSp and SSs in a more magnified view (Figure 2 – figure supplement 1).

      Figure 3: the comparison of the 3 transgenic lines labelling layer 5 and showing sublaminar identities is really interesting in showing the heterogeneity of this layer and possible regional differences. However, the cases shown for illustration for Rbp4 and Tlx3 seem pretty massive in comparison with the other drivers. Maybe cases with smaller injections could be chosen for illustration. 

      Figure 3 shows grand average axonal density maps across different mice normalized to the number of neurons in the injection site. The large amount of axon per neuron observed in Rbp4 and Tlx3 mice therefore shows their long, wide-ranging axons compared to other neuronal classes.
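
      The normalization described in this response can be illustrated with a minimal sketch. It assumes each mouse is summarized as a mapping from region name to raw axonal density plus a count of labelled neurons at the injection site; the function names, region labels and numbers are hypothetical and are not taken from the authors' analysis code.

```python
def normalize_per_neuron(axon_density_by_region, n_labelled_neurons):
    """Divide each region's raw axonal density by the number of labelled neurons."""
    if n_labelled_neurons <= 0:
        raise ValueError("n_labelled_neurons must be positive")
    return {
        region: density / n_labelled_neurons
        for region, density in axon_density_by_region.items()
    }


def grand_average(per_mouse_maps):
    """Average the per-neuron density of each region across mice.

    Assumes every mouse map contains the same set of regions.
    """
    if not per_mouse_maps:
        raise ValueError("need at least one mouse")
    regions = per_mouse_maps[0].keys()
    return {
        region: sum(m[region] for m in per_mouse_maps) / len(per_mouse_maps)
        for region in regions
    }


# Hypothetical example: a large and a small injection in two mice.
mouse_a = normalize_per_neuron({"MOp": 200.0, "CP": 100.0}, n_labelled_neurons=100)
mouse_b = normalize_per_neuron({"MOp": 40.0, "CP": 40.0}, n_labelled_neurons=10)
average_map = grand_average([mouse_a, mouse_b])
```

      Under this scheme a larger injection does not by itself produce a denser map, because each mouse contributes axon per labelled neuron rather than total axon.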

      Figure 6A could be a supplementary figure in my view; 6B is clearer. 

      We think both representations are useful, and we think different readers might better appreciate either of the two analyses.

    1. eLife assessment

      The authors presented a valuable bioinformatics pipeline for screening and identifying inhibitory receptors for potential drug targets. They provided solid evidence showing a sequential reduction in the search space through various screening tools and algorithms and demonstrated that this pipeline can be used to "rediscover" known targets. Further experimental validation on putative and unknown inhibitory receptors will strengthen the evidence reported in this work. This study will be of interest to bioinformaticians and computational biologists working on immune regulation, sequence screening, and target identification of immune checkpoint inhibitors.

    2. Reviewer #2 (Public review):

      Summary:

      The authors developed a bioinformatic pipeline to aid the screening and identification of inhibitory receptors suitable as drug targets. The challenge lies in the large search space and lack of tools for assessing the likelihood of their inhibitory function. To make progress, the authors used a consensus protein membrane topology and sequence motif prediction tool (TOPCONS) combined with both a statistical measure assessing their likelihood function and a machine learning protein structural prediction model (AlphaFold) to greatly cut down the search space. After obtaining a manageable set of 398 high-confidence known and putative inhibitory receptors through this pipeline, the authors then mapped these receptors to different functional categories across different cell types based on their expression both in the resting and activated state. Additionally, by using publicly available pan-cancer scRNA-seq data for tumor-infiltrating T cells, they showed that these receptors are expressed across various cellular subsets.

      Strengths:

      The authors presented sound arguments motivating the need to efficiently screen inhibitory receptors and to identify those that are functional. Key components of the algorithm were presented along with solid justification for why they addressed challenges faced by existing approaches. To name a few:

      • The TOPCONS algorithm was selected to optimize the prediction of membrane topology.

      • A statistical measure was used to remove potential false positives.

      • AlphaFold is used to filter out putative receptors that are low confidence (and likely intrinsically disordered).

      To examine receptors screened through this pipeline through a functional lens, the authors proposed to look at their expression of various immune cell subsets to assign functional categories. This is a reasonable and appropriate first step for interpreting and understanding how potential drug targets are differentially expressed in some disease contexts. They also presented an example showing this pipeline can be used to "rediscover" known targets.

      Weaknesses:

      The paper has strength in the pipeline they presented, but the weakness, in my opinion, lies in the lack of direct experimental validation on putative receptors. That said, the authors presented in the revised manuscript, as a proof-of-concept, an analytic approach for using functional categorization of putative inhibitory receptors to select therapeutic targets based on in vitro RNAseq. Such analysis will benefit from further investigation across different cancer types using in vivo expression.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This work is potentially useful because it has generated a mineable yield of new candidate immune inhibitory receptors, which can serve both as drug targets and as subjects for further biological investigation. It is noted however that the argument of the work is rather incomplete, in that it does very little to validate the putative new receptors, and merely makes a study of their putative distribution across cell types. Experimental follow-up to demonstrate the claimed properties for the proteins identified, or mining existing experimental data sources on gene expression across tissues to at least show that the pipeline correctly identified genes likely to be specific to immune cells (or something along these lines), would make this work more complete and compelling. 

      We thank the editors for their critical reading and assessment of our manuscript. We acknowledge that the present study is limited by a lack of experimental follow-up. However, we purposely chose to make this pipeline of putative novel inhibitory receptors public at this early stage for our work to be a starting point for further functional investigation of these targets by the scientific community.   

      Public Reviews:

      Reviewer #1 (Public Review):

      This manuscript proposes a new bioinformatics approach identifying several hundreds of previously unknown inhibitory immunoreceptors. When expressed in immune cells (such as neutrophils, monocytes, CD8+, CD4+, and T-cells), such receptors inhibit the functional activity of these cells. Blocking inhibitory receptors represents a promising therapeutic strategy for cancer treatment.

      As such, this is a high-quality and important bioinformatics study. One general concern is the absence of direct experimental validation of the results. In addition to the fact that the authors bioinformatically identified 51 known receptors, providing such experimental evaluation (of at least one, or better few identified receptors) would, in my opinion, significantly strengthen the presented evidence.

      I will now briefly summarize the results and give my comments.

      First, using sequence comparison analysis, the authors identify a large set of putative receptors based on the presence of immunoreceptor tyrosine-based inhibitory motifs (ITIMs), or immunoreceptor tyrosine-based switch motifs (ITSMs). They further filter the identified set of receptors for the presence of the ITIMs or ITSMs in an intracellular domain of the protein. Second, using AlphaFold structure modeling, the authors select only receptors containing ITIMs/ITSMs in structurally disordered regions. Third, the evaluation of gene expression profiles of known and putative receptors in several immune cell types was performed. Fourth, the authors classified putative receptors into functional categories, such as negative feedback receptors, threshold receptors, threshold disinhibition, and threshold-negative feedback. The latter classification was based on the available data from Nat Rev Immunol 2020. Fifth, using publicly available single-cell RNA sequencing data of tumor-infiltrating CD4+ and CD8+ cells from nearly twenty types of cancer, the authors demonstrate that a significant fraction of putative receptors are indeed expressed in these datasets.

      In summary, in my opinion, this is an interesting, important, high-quality bioinformatics work. The manuscript is clearly written and all technical details are carefully explained.

      One comment/suggestion regarding the methodology of evaluating gene expression profiles of putative receptors: perhaps it might be important to look at clusters of genes that are co-expressed with putative inhibitory receptors. 

      We thank the reviewer for their comments and suggestions. We acknowledge that looking at co-expressed genes and subsequently at gene ontology enrichment could be an interesting approach to prioritize the inhibitory receptors. However, since there are many ways to approach the results of the gene co-expression networks, which also depend on the cell type and activation status of interest, we have chosen to discuss the implications of these networks in the discussion with the following paragraph, rather than reporting all these different approaches in the paper:

      “To further prioritize inhibitory receptors in immune cell subsets or diseases of interest, gene co-expression networks of putative inhibitory receptors could be assessed. On the one hand, the co-occurrence of putative inhibitory receptors with known inhibitory receptors within a module could be one approach, while on the other hand the presence of putative inhibitory receptors in a different module could suggest novel regulation of different biological functions than the known receptors. The location of the putative inhibitory receptors in the network could also change depending on the cell type and the activation status of the cell. Additionally, one could look at the co-expression of candidates with other genes within a gene module to look at potential biological function, and at co-expression with signalling molecules known to interact with inhibitory receptors, such as Csk, SHP-1, SHP-2 and SHIP1, although their regulation might be more post-translationally regulated rather than at mRNA level.”

      Reviewer #2 (Public Review):

      Summary:

      The authors developed a bioinformatic pipeline to aid the screening and identification of inhibitory receptors suitable as drug targets. The challenge lies in the large search space and lack of tools for assessing the likelihood of their inhibitory function. To make progress, the authors used a consensus protein membrane topology and sequence motif prediction tool (TOPCONS) combined with both a statistical measure assessing their likelihood function and a machine learning protein structural prediction model (AlphaFold) to greatly cut down the search space. After obtaining a manageable set of 398 high-confidence known and putative inhibitory receptors through this pipeline, the authors then mapped these receptors to different functional categories across different cell types based on their expression both in the resting and activated state. Additionally, by using publicly available pan-cancer scRNA-seq for tumor-infiltrating T-cell data, they showed that these receptors are expressed across various cellular subsets.

      Strengths:

      The authors presented sound arguments motivating the need to efficiently screen inhibitory receptors and to identify those that are functional. Key components of the algorithm were presented along with solid justification for why they addressed challenges faced by existing approaches. To name a few:

      • The TOPCONS algorithm was selected to optimize the prediction of membrane topology.

      • A statistical measure was used to remove potential false positives.

      • AlphaFold is used to filter out putative receptors that are low confidence (and likely intrinsically disordered).

      To examine receptors screened through this pipeline through a functional lens, the authors proposed to look at their expression of various immune cell subsets to assign functional categories. This is a reasonable and appropriate first step for interpreting and understanding how potential drug targets are differentially expressed in some disease contexts.

      Weaknesses:

      The paper has strength in the pipeline they presented, but the weakness, in my opinion, lies in the lack of concrete demonstration on how this pipeline can be used to at least "rediscover" known targets in a disease-specific manner. For example, the result that both known and putative immune inhibitory receptors are expressed across a wide variety of tumor-infiltrating T-cell subsets is reassuring, but this would have been more informative and illustrative if the authors could demonstrate using a disease with known targets, as opposed to a pan-cancer context. Additionally, a discussion that contrasts the known and putative receptors in the context above would help readers better identify use cases suitable for their research using this pipeline. Particularly,

      • For known receptors, does the pipeline and the expression analysis above rediscover the known target in the disease of interest?

      • For putative receptors, what do the functional category mapping and the differential expression across various tumor-infiltrating T-cell subsets imply on a potential therapeutic target?

      We thank the reviewer for their assessment and comments. The primary purpose of the bioinformatics pipeline was to identify putative inhibitory receptors in a disease-agnostic manner and allow the scientific community to further explore targets in their specific diseases of interest. We performed our pan-cancer expression analysis as a preliminary proof of concept and agree that exploring targets in specific diseases, cancer or otherwise, could be more informative. To validate that we rediscovered known immunotherapeutic targets, we analyzed the expression of known inhibitory receptors on tumor-infiltrating T cells of melanoma patients using the same dataset as figure 3. We find high expression of known therapeutic targets, such as PD-1, in addition to other known inhibitory receptors that are being targeted in clinical trials, one of which is TIGIT. We have added this information to the results section and added the corresponding graph as supplementary figure 5. 

      For the putative inhibitory receptors, we believe the functional categorization can assist in selecting targets that are more likely to be successful in a therapeutic context. As we previously proposed in our perspective on functional categorization of inhibitory receptors (Rumpret et al., Nat Imm, 2020), it might be beneficial to target inhibitory receptors of different functional categories in cancer immunotherapy. Targeting a threshold receptor to lower the threshold for activation and a negative feedback receptor to lengthen and strengthen the cellular response might therefore be more effective than targeting two receptors of a single functional category. Even though we realize RNA sequencing data of in vitro stimulated immune cells is not identical to data from TILs, we have tried to characterize the functional categories expressed by TILs by extrapolating the defined functional categorization per gene from figure 2, and added the corresponding graphs as supplementary figure 4. This shows that mainly threshold receptors and some (threshold-)negative feedback receptors are expressed by the different T cell subsets, which would open the possibility of using the proposed therapeutic strategy of targeting different functional categories. However, we acknowledge that this will require further validation of expression patterns in vivo in different cancers and immune cell subsets. 

      Reviewer #1 (Recommendations For The Authors):

      One comment/suggestion regarding the methodology of evaluating gene expression profiles of putative receptors: perhaps it might be important to look at clusters of genes that are co-expressed with putative inhibitory receptors.

      See our reply to the suggestion above.

      Reviewer #2 (Recommendations For The Authors):

      Results section

      (a) "Putative ITIM/ITSM-bearing immune inhibitory receptors can be found in the human genome"

      i. Figure 1 could benefit from additional labeling. For example, in B, the grey line indicates 5%, etc. Additionally, in panel B&C, I assume by "predicted" the author meant using TOPCONS?

      ii. Figure 1B doesn't seem to be consistent with this sentence "However, for 10 out of 51, we observed ITIM/ITSM sequences in the permutated sequence up to ~25% of the time" [page 2, line 1-3], as all 51 data points in Figure 1B (under "Known" panel) are below the 0.25 horizontal line?

      i. We have adjusted the figure legend to better indicate the information provided in the figures. The predicted genes are all unknown transmembrane candidates that contain an ITIM or ITSM in their intracellular domain, as determined using TOPCONS.

      ii. Due to the nature of permutation testing, there is some variation in the individual likelihood values for each protein sequence. However, as they were generally below 0.25 in any given iteration, we decided to define this value as a threshold for inclusion. 
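
      The iteration-to-iteration variation mentioned in this reply is a general property of permutation tests, which a minimal sketch can make concrete. The sketch assumes the commonly cited consensus patterns for ITIMs ([I/V/L/S]-x-Y-x-x-[L/V/I]) and ITSMs (T-x-Y-x-x-[V/I]); the manuscript's exact motif definitions, number of permutations and shuffling scheme are not stated here, so the function name, parameters and defaults below are hypothetical.

```python
import random
import re

# Assumed consensus motifs (not taken from the manuscript):
#   ITIM: [I/V/L/S]-x-Y-x-x-[L/V/I]    ITSM: T-x-Y-x-x-[V/I]
MOTIF = re.compile(r"[ILVS].Y..[LVI]|T.Y..[VI]")


def motif_chance_frequency(sequence, n_permutations=1000, seed=0):
    """Fraction of shuffled copies of `sequence` containing an ITIM/ITSM-like motif.

    Shuffling preserves amino-acid composition but destroys residue order, so the
    returned fraction estimates how often a motif arises by chance alone.
    """
    rng = random.Random(seed)
    residues = list(sequence)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(residues)
        if MOTIF.search("".join(residues)):
            hits += 1
    return hits / n_permutations
```

      Because the estimate is a proportion over a finite number of random shuffles, repeating the procedure with a different seed gives slightly different values for the same protein, which is consistent with individual likelihoods fluctuating around a cutoff such as 0.25.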

      (b) "AlphaFold structure predictions can assist in identifying likely functional ITIM/ITSMs"

      i. Readability would increase if the authors indicated how the pLDDT score is computed and what its range is (between 0 and 100).

      ii. Third paragraph. Can the author comment on why 80 pLDDT is chosen as the cutoff? The first sentence of this paragraph states "We found that 99 out of 101 ITIM/ITSMs of the 51 known receptors had low confidence score, i.e., less than 80 pLDDT, with an average confidence score of 49.3 pLDDT..." However, it was later stated in the Discussion, page 10, starting Line 11 "We determined a threshold of 80 pLDDT based on the average prediction scores of the ITIM/ITSMs in known inhibitory receptors....". If 99 out of 101 ITIM/ITSMs had pLDDT<80, then it seems strange that the average of the 101 is at 80pLDDT, even in the extreme where the remaining 101-99=2 ITIM/ITSMs attain the maximum pLDDT score at 100, unless the distribution of those 99 is narrowly centered around 80? A distribution of the pLDDT would help clarify.

      i. The pLDDT scores are computed by AlphaFold as a way to determine how well a specific residue and/or region is expected to be modelled in three-dimensional space. We now refer to the corresponding AlphaFold publications and references therein to clarify this (10.1093/nar/gkab1061, 10.1038/s41586-021-03819-2, 10.1093/bioinformatics/btt473). We also have now included the range (i.e., 0-100) in the text.

      ii. The threshold of 80 pLDDT was chosen as this still encompasses all known inhibitory receptors and was not calculated based on an average of the prediction scores. In this way, we still included ITIM/ITSMs with a relatively high pLDDT, such as those observed in PD-1 and LAIR-1. The previous text ‘average prediction scores of the ITIM/ITSMs in known inhibitory receptors’ referred to the averaging of the confidence score for each of the six amino acids encompassing the ITIM/ITSM into one overall score per ITIM/ITSM. We have adjusted the text to better reflect this.
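
      The per-motif averaging described in this response can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 0-based `start` index, and the example pLDDT values are all hypothetical, and only the six-residue window and the 80 pLDDT cutoff come from the text above.

```python
from statistics import mean

MOTIF_LENGTH = 6        # an ITIM/ITSM spans six amino acids (from the response)
PLDDT_THRESHOLD = 80.0  # cutoff described in the response; pLDDT ranges 0-100


def motif_plddt(per_residue_plddt, start, length=MOTIF_LENGTH):
    """Average the per-residue pLDDT over one motif.

    `start` is a 0-based index into the per-residue list (an assumption
    made for this sketch, not a convention taken from the manuscript).
    """
    if start < 0:
        raise ValueError("start must be non-negative")
    window = per_residue_plddt[start:start + length]
    if len(window) != length:
        raise ValueError("motif window runs past the end of the sequence")
    return mean(window)


def is_low_confidence(score, threshold=PLDDT_THRESHOLD):
    """True when the motif-level score falls below the cutoff."""
    return score < threshold


# Made-up per-residue values, for illustration only.
example_plddt = [92.0, 90.5, 55.0, 48.0, 51.0, 47.5, 50.0, 44.5, 88.0]
score = motif_plddt(example_plddt, start=2)
low = is_low_confidence(score)
```

      Collapsing the six residues into one score per motif is what makes a single cutoff applicable to every ITIM/ITSM, which is the distinction the response draws from an average taken across all motifs.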

      (c) "Putative inhibitory receptors are expressed across immune cell subsets"

      Figure S2, the last sentence in the caption (relevant for panel C) states "Cell subsets without uniquely expressed putative inhibitory receptors i.e., B cells and T cell, are excluded from the panel for clarity", but B cells and T cells are present in panel C?

      Indeed, but they are only included for the cases where the cell subsets share receptor expression with other immune cell subsets. The B and T cells do not express any unique putative multi-spanning receptors; all receptors are shared with at least one other immune cell subset.

      (d) "Known and putative inhibitory receptors are expressed on tumour infiltrating T cells"

      i. Missing panel C label in Figure 3 and S3.

      ii. By comparing Figure 3 and S3, it looks to me that there's not a big difference between single-spanning and multi-spanning inhibitory receptors. I wonder if the authors can comment or speculate on this similarity in addition to differences of expression among T-cell subsets. Would the similarities and differences above be explained by cancer type?

      i. Figures 3 and S3 do not contain a panel C; panel B consists of a lower (CD8+) and an upper (CD4+) subpanel. We have indicated this more clearly in the figure legend in the revised manuscript.

      ii. While some T cell subsets, such as exhausted CD8+ T cells and CD4+ regulatory T cells, appear to not differ much in their expression of either single- or multi-spanning receptors, we do observe that, for example, effector memory CD4+ T cells or EMRA CD8+ T cells express single-spanning inhibitory receptors to a higher extent than multi-spanning inhibitory receptors. It is possible that these differences and similarities reflect some of the roles multi-spanning inhibitory receptors could play in regulating immune cells, for example in response to chemokines, as many chemokine receptors are multi-spanning proteins. 

      Data and Code availability

      Although the Methods section provides some context for the computational analysis and citations for relevant data, software availability and a data availability statement are lacking.

      We have included a data availability statement for the data files and code in the revised manuscript.

    1. eLife assessment

      This important study investigates the intracellular localization patterns of G proteins involved in GPCR signaling, presenting compelling evidence for their preference for plasma and lysosomal membranes over endosomal, endoplasmic reticulum, and Golgi membranes. This discovery has significant implications for understanding GPCR action and signaling from intracellular locations. This research will interest cell biologists studying protein trafficking and pharmacologists exploring localized signaling phenomena.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Jang et al. describes the application of new methods to measure the localization of GTP-binding signaling proteins (G proteins) on different membrane structures in a model mammalian cell line (HEK293). G proteins mediate signaling by receptors found at the cell surface (GPCRs), with evidence from the last 15 years suggesting that GPCRs can induce G-protein mediated signaling from different membrane structures within the cell, with variation in signal localization leading to different cellular outcomes. While it has been clearly shown that different GPCRs efficiently traffic to various intracellular compartments, it is less clear whether G proteins traffic in the same manner, and whether GPCR trafficking facilitates "passenger" G protein trafficking. This question was a blind spot in the burgeoning field of GPCR localized signaling in need of careful study, and the results obtained will serve as an important guidepost for further work in this field. The extent to which G proteins localize to different membranes within the cell is the main experimental question tested in this manuscript. This question is pursued through two distinct methods, both relying on genetic modification of the G-beta subunit with a tag. In one method, G-beta is modified with a small fragment of the fluorescent protein mNG, which combines with the larger mNG fragment to form a fully functional fluorescent protein to facilitate protein trafficking by fluorescent microscopy. This approach was combined with expression of fluorescent proteins directed to various intracellular compartments (different types of endosomes, lysosome, endoplasmic reticulum, Golgi, mitochondria) to look for colocalization of G-beta with these markers. These experiments showed compelling evidence that G-beta co-localizes with markers at the plasma membrane and the lysosome, with weak or absent co-localization for other markers. 
A second method for measuring localization relied on fusing G-beta with a small fragment from a miniature luciferase (HiBit) that combines with a larger luciferase fragment (LgBit) to form an active luciferase enzyme. Localization of G-beta (and luciferase signal) was measured using a method known as bystander BRET, which relies on expression of a fluorescent protein BRET acceptor in different cellular compartments. Results using bystander BRET supported findings from fluorescence microscopy experiments. These methods for tracking G protein localization were also used to probe other questions. The activation of GPCRs from different classes had virtually no impact on the localization of G-beta, suggesting that GPCR activation does not result in shuttling of G proteins through the endosomal pathway with activated receptors.

      In the revised version of this manuscript the authors have performed informative and important new experiments in addition to adding new text to address conceptual questions. These new data and discussions are commendable and address most or all of the weaknesses listed in the initial review.

      Strengths:

      The question probed in this study is quite important and, in my opinion, understudied by the pharmacology community. The results presented here are an important call to be cognizant of the localization of GPCR coupling partners in different cellular compartments. Abundant reports of endosomal GPCR signaling need to consider how the impact of lower G protein abundance on endosomal membranes will affect the signaling responses under study.

      The work presented is carefully executed, with seemingly high levels of technical rigor. These studies benefit from probing the experimental questions at hand using two different methods of measurement (fluorescent microscopy and bystander BRET). The observation that both methods arrive at the same (or a very similar) answer inspires confidence about the validity of these findings.

      Weaknesses:

      As noted by the authors, they do not demonstrate that the tagged G-beta is predominantly found within heterotrimeric G protein complexes. In the revised manuscript the authors have added new discussion text on why it is likely that G-beta is mostly found in complexes. This line of reasoning is convincing, although more robust experimental methods for assessing the assembly status of G-beta could be a valuable target for future experimental developments.

    3. Reviewer #2 (Public review):

      This study assesses the subcellular distribution of a major G protein subunit (Gβ1) when expressed at an endogenous level in a well-studied model cell system (293 cells). The approach elegantly extends a gene editing strategy described by Leonetti's group and combines it with a FRET-based proximity assay to detect the presence of endogenously tagged Gβ1 on membrane compartments of 293 cells. The authors achieve their goal, and the data are convincing and interesting. The authors do a nice job of integrating their results with previous work in the field. The methods are now sufficiently well-described to enable other investigators to apply or adapt them in future studies.

    4. Reviewer #3 (Public review):

      Summary:

      This article addresses an important and interesting question concerning intracellular localization and dynamics of endogenous G proteins. The fate and trafficking of G protein-coupled receptors (GPCRs) have been extensively studied but so far little is known about the trafficking routes of their partner G proteins that are known to dissociate from their respective receptors upon activation of the signaling pathway. Authors utilize modern cell biology tools including genome editing and bystander bioluminescence resonance energy transfer (BRET) to probe intracellular localization of G proteins in various membrane compartments in steady state and also upon receptor activation. Data presented in this manuscript shows that while G proteins are mostly present on the plasma membrane, they can be also detected in endosomal compartments, especially in late endosomes and lysosomes. This distribution, according to data presented in this study, seems not to be affected by receptor activation. These findings will have implications in further studies addressing GPCR signaling mechanisms from intracellular compartments.

      Strengths:

      The methods used in this study are adequate for the question asked. In particular, the use of genome-edited cells (for addition of the tag on one of the G proteins) is a great choice to prevent effects of overexpression. Moreover, the use of bystander BRET allowed the authors to probe intracellular localization of G proteins in a very high-throughput fashion. By combining imaging and BRET, the authors convincingly show that G proteins are present at very low abundance on early endosomes (as well as the ER, mitochondria, and medial Golgi) but seem to accumulate on membranes of late endosomal compartments. Moreover, the authors also looked at the dynamics of G protein trafficking by tracking them over multiple time points in different compartments.

      Weaknesses:

      While authors provide a novel dataset, many questions regarding G protein trafficking remain open. For example, it is not entirely clear which pathway is utilized to traffic G proteins from the plasma membrane to intracellular compartments. Additionally, future studies should also include more quantitative details considering G-protein distribution in different compartments as well as more detailed dynamic data on G protein internalization as well as intracellular trafficking kinetics.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Jang et al. describes the application of new methods to measure the localization of GTP-binding signaling proteins (G proteins) on different membrane structures in a model mammalian cell line (HEK293). G proteins mediate signaling by receptors found at the cell surface (GPCRs), with evidence from the last 15 years suggesting that GPCRs can induce G-protein mediated signaling from different membrane structures within the cell, with variation in signal localization leading to different cellular outcomes. While it has been clearly shown that different GPCRs efficiently traffic to various intracellular compartments, it is less clear whether G proteins traffic in the same manner, and whether GPCR trafficking facilitates "passenger" G protein trafficking. This question was a blind spot in the burgeoning field of GPCR localized signaling in need of careful study, and the results obtained will serve as an important guidepost for further work in this field. The extent to which G proteins localize to different membranes within the cell is the main experimental question tested in this manuscript. This question is pursued through two distinct methods, both relying on genetic modification of the G-beta subunit with a tag. In one method, G-beta is modified with a small fragment of the fluorescent protein mNG, which combines with the larger mNG fragment to form a fully functional fluorescent protein to facilitate protein trafficking by fluorescent microscopy. This approach was combined with the expression of fluorescent proteins directed to various intracellular compartments (different types of endosomes, lysosome, endoplasmic reticulum, Golgi, mitochondria) to look for colocalization of G-beta with these markers. These experiments showed compelling evidence that G-beta co-localizes with markers at the plasma membrane and the lysosome, with weak or absent co-localization for other markers. 
A second method for measuring localization relied on fusing G-beta with a small fragment from a miniature luciferase (HiBit) that combines with a larger luciferase fragment (LgBit) to form an active luciferase enzyme. Localization of G-beta (and luciferase signal) was measured using a method known as bystander BRET, which relies on the expression of a fluorescent protein BRET acceptor in different cellular compartments. Results using bystander BRET supported findings from fluorescence microscopy experiments. These methods for tracking G protein localization were also used to probe other questions. The activation of GPCRs from different classes had virtually no impact on the localization of G-beta, suggesting that GPCR activation does not result in the shuttling of G proteins through the endosomal pathway with activated receptors.

      Strengths:

      The question probed in this study is quite important and, in my opinion, understudied by the pharmacology community. The results presented here are an important call to be cognizant of the localization of GPCR coupling partners in different cellular compartments. Abundant reports of endosomal GPCR signaling need to consider how the impact of lower G protein abundance on endosomal membranes will affect the signaling responses under study.

      The work presented is carefully executed, with seemingly high levels of technical rigor. These studies benefit from probing the experimental questions at hand using two different methods of measurement (fluorescent microscopy and bystander BRET). The observation that both methods arrive at the same (or a very similar) answer inspires confidence about the validity of these findings.

      Weaknesses:

      The rationale for fusing G-beta with either mNG2(11) or SmBit could benefit from some expansion. I understand the speculation that using the smallest tag possible may have the smallest impact on protein performance and localization, but plenty of researchers have fused proteins with whole fluorescent proteins to provide conclusions that have been confirmed by other methods. Many studies even use G proteins fused with fluorescent proteins or luciferases. Is there an important advantage to tagging G-beta with small tags? Is there evidence that G proteins with full-size protein tags behave aberrantly? If the studies presented here would not have been possible without these CRISPR-based tagging approaches, it would be helpful to provide more context to make this clearer. Perhaps one factor would be interference from newly synthesized G proteins-fluorescent protein fusions en route to the plasma membrane (in the ER and Golgi).

      There are several advantages to using small peptide tags that we did not fully explain. From a practical standpoint the most important advantage of using the HiBit tag instead of full-length Nanoluc is that it allows us to restrict luminescence output to cells transiently transfected with LgBit. In this way untransfected cells contribute no background signal. Although we did not take advantage of it here, this also applies to fluorescent protein complementation, and will be useful for visualizing proteins in individual cells within tissues. The HiBit tag also allows PAGE analysis by probing membranes with LgBit (as in Fig. 1). We are not aware of evidence that tagging Gb or Gg subunits on the N terminus results in aberrant behavior, while there is some evidence that Ga subunits tagged with full-size protein tags (in some positions) have altered functional properties (PMID: 16371464). We do think that editing endogenous genes is critical, as studies using transient overexpression (usually driven by strong promoters) have sometimes reported accumulation of tagged G proteins in the biosynthetic pathway (e.g., PMID: 17576765), as the reviewer suggests. Ga and Gbg appear to be mutually dependent on each other for appropriate trafficking to the plasma membrane (reviewed in PMID: 23161140), therefore the native (presumably matched) stoichiometry is likely to be critical.

      To clarify this context the revised manuscript includes the following:

      “For bioluminescence experiments we added the HiBit tag (Schwinn et al., 2018) and isolated clonal “HiBit-b1“ cell lines. An advantage of this approach over adding a full-length Nanoluc luciferase is that it requires coexpression of LgBit to produce a complemented luciferase. This limits luminescence to cotransfected cells and thus eliminates background from untransfected cells.”

      “Some studies using overexpressed G protein subunits have suggested that a large pool of G proteins is located on intracellular membranes, including the Golgi apparatus (Chisari et al., 2007; Saini et al., 2007; Tsutsumi et al., 2009), whereas others have indicated a distribution that is dominated by the plasma membrane (Crouthamel et al., 2008; Evanko, Thiyagarajan, & Wedegaertner, 2000; Marrari et al., 2007; Takida & Wedegaertner, 2003). A likely factor contributing to these discrepant results is the stoichiometry of overexpressed subunits, as neither Ga nor Gbg traffic appropriately to the plasma membrane as free subunits (Wedegaertner, 2012). Our gene-editing approach presumably maintains the native subunit stoichiometry, providing a more accurate representation of native G protein distribution.”

      As noted by the authors, they do not demonstrate that the tagged G-beta is predominantly found within heterotrimeric G protein complexes. If there is substantial free G-beta, then many of the conclusions need to be reconsidered. Perhaps a comparison of immunoprecipitated tagged G beta vs immunoprecipitated supernatant, with blotting for other G protein subunits would be informative.

      We do think that HiBit-b1 exists predominantly within heterotrimeric complexes, for several reasons. First, overexpression studies have shown that Gbg requires association with Ga to traffic to the plasma membrane, and that by itself Gbg is retained on the endoplasmic reticulum (PMID: 12609996; PMID: 12221133). We find almost no endogenous Gb1 on the endoplasmic reticulum, and a high density on the plasma membrane. Second, we are able to detect large increases in free HiBit-Gbg after G protein activation using free Gbg sensors (e.g. Fig. 1). Third, many proteins that bind to free Gbg are found entirely in the cytosol of HEK 293 cells (e.g. PMID: 10066824), suggesting there is not a large population of free Gbg. We have added discussion of these points to the revised manuscript as follows:

      “Endogenous Ga and Gb subunits are expressed at approximately a 1:1 ratio, and Gb subunits are tightly associated with Gg and inactive Ga subunits (Cho et al., 2022; Gilman, 1987; Krumins & Gilman, 2006). Moreover, proteins that bind to free Gbg dimers are found in the cytosol of unstimulated HEK 293 cells, suggesting at most only a small population of free Gbg in these cells. Therefore, we assume that the large majority of mNG-b1 and HiBit-b1 subunits in unstimulated cells are part of heterotrimers.”

      “Notably, when Gbg dimers are expressed alone they accumulate on the endoplasmic reticulum (Michaelson et al., 2002; Takida & Wedegaertner, 2003). That we detect almost no endogenous Gbg on the endoplasmic reticulum supports our conclusion that the large majority of Gbg in unstimulated HEK 293 cells is associated with Ga, although we cannot rule out a small population of free Gbg.”

      We do not entirely understand the suggested experiment, as free Gbg will still be largely associated with the membrane fraction. Notably, we find almost no HiBit-b1 in the supernatant after lysis in hypotonic buffer and preparation of membrane fractions, and the small amount that we do find does not change if Ga is overexpressed.

      Additional context and questions:

      (1) There exists some evidence that certain GPCRs can form enduring complexes with G-beta-gamma (PubMed: 23297229, 27499021). That would seem to offer a mechanism that would enable receptor-mediated transport of G protein subunits. It would be helpful for the authors to place the findings of this manuscript in the context of these previous findings since they seem somewhat contradictory.

      We agree. In our original submission we noted “It is possible that other receptors will influence G protein distribution using mechanisms not shared by the receptors we studied.” In the revised manuscript we have added:

      “For example, a few receptors are thought to form relatively stable complexes with Gbg, which could provide a mechanism of trafficking to endosomes (Thomsen et al., 2016; Wehbi et al., 2013).”

      (2) There is some evidence that GaS undergoes measurable dissociation from the plasma membrane upon activation (see the mechanism of the assay in PubMed: 35302493). It seems possible that G-alpha (and in particular GaS) might behave differently than the G-beta subunit studied here. This is not entirely clear from the discussion as it now stands.

      Indeed, there is abundant evidence that some Gas translocates away from the plasma membrane upon activation. We referred to translocation of “some Ga subunits” in the introduction, although we did not specify that Gas is by far the most studied example. In a previous study (PMID: 27528603) we found that overexpressed Gas samples many intracellular membranes upon activation and returns to the plasma membrane when activation ceases. This is similar to activation-dependent translocation of free Gbg dimers. Because these translocation mechanisms depend on activation and are reversible they are unlikely to be a major source of inactive heterotrimers for intracellular membranes.

      We did a poor job of making it clear that we intentionally avoided translocation mechanisms that operate only during receptor and G protein stimulation. In the revised manuscript we have added new data showing reversible activation-dependent translocation of endogenous HiBit-Gb1.

      (3) The authors say "The presence of mNG-b1 on late endosomes suggested that some G proteins may be degraded by lysosomes". The mechanism of lysosomal degradation by proteins on the outside of the lysosome is not clear. It would be helpful for the authors to clarify.

      We agree we didn’t connect the dots here. Our initial idea was that G proteins on the surface of late endosomes might reach the interior of late endosomes and then lysosomes by involution into multivesicular bodies. However, the reviewer correctly points out that much of the G protein associated with lysosomes still appears to be on the cytosolic surface, where it would not be subject to degradation. In fact, since lysosomes can fuse with the plasma membrane under certain circumstances, this could even represent a pathway for recycling G proteins to the plasma membrane.

      We have revised the text to avoid giving the impression that lysosomes degrade G proteins, since we have scant evidence that this occurs. In the revised discussion we point out that we do not know the fate of G proteins located on the surface of lysosomes and speculate that these could be returned to the plasma membrane:

      “We do not know the fate of G proteins located on the surface of lysosomes. Since lysosomes may fuse with the plasma membrane under certain circumstances (Xu & Ren, 2015), it is possible that this represents a route of G protein recycling to the plasma membrane.”

      (4) Although the authors do a good job of assessing G protein dilution in endosomal membranes, it is unclear how this behavior compares to the measurement of other lipid-anchored proteins using the same approach. Is the dilution of G proteins what we would expect for any lipid-anchored protein at the inner leaflet of the plasma membrane?

      This is a great question. To begin to address it we have studied a model lipid-anchored protein consisting of mNeongreen2 anchored to the plasma membrane by the C terminus of HRas, which is palmitoylated and prenylated. We find that this protein is also diluted on endocytic vesicles, although to a lesser degree than heterotrimeric G proteins. We have added a section to the results and a new figure supplement describing these results:

      “To test if other peripheral membrane proteins are similarly depleted from endocytic vesicles, we performed analogous experiments by overexpressing mNG bearing the C-terminal membrane anchor of HRas (mNG-HRas ct). We found that mNG-HRas ct was also less abundant on FM4-64-positive endocytic vesicles than expected based on plasma membrane abundance, although not to the same extent as mNG-b1 (Figure 4 - figure supplement 2); mNG-HRas ct density on FM4-64-positive vesicles was 64 ± 17% (mean ± 95% CI; n=78) of the nearby plasma membrane.”
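
      A summary statistic of the form "mean ± 95% CI" for vesicle density relative to the nearby plasma membrane could be computed along the following lines. This is only a sketch under stated assumptions: the response does not say how the interval was calculated, so the normal approximation (1.96 × SEM) used here is an assumption, and the function names and any input intensities are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_density(vesicle_intensity, membrane_intensity):
    """Vesicle signal as a percentage of the nearby plasma membrane signal."""
    if membrane_intensity <= 0:
        raise ValueError("membrane intensity must be positive")
    return 100.0 * vesicle_intensity / membrane_intensity


def mean_with_ci95(values):
    """Return (mean, half-width) of a 95% CI.

    Uses the normal approximation 1.96 * SEM, which is an assumption of
    this sketch rather than the method reported by the authors.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    half_width = 1.96 * stdev(values) / sqrt(n)
    return mean(values), half_width


# Hypothetical (vesicle, membrane) intensity pairs, for illustration only.
pairs = [(32.0, 50.0), (20.0, 40.0), (45.0, 60.0)]
densities = [relative_density(v, m) for v, m in pairs]
center, half_width = mean_with_ci95(densities)
```

      Normalizing each vesicle to its own neighbouring membrane segment, rather than to a cell-wide average, keeps the comparison local, which matches the "nearby plasma membrane" wording in the quoted passage.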

      Reviewer #2 (Public Review):

      This is an interesting method that addresses the important problem of assessing G protein localization at endogenous levels. The data are generally convincing.

      Specific comments

      Methods:

      The description of the gene editing method is unclear. There are two different CRISPR cell lines made in two different cell backgrounds. The methods should clearly state which CRISPR guides were used on which cell line. It is also not clear why HiBit is included in the mNG-β1 construct. Presumably, this is not critical but it would be helpful to explicitly note. In general, the Methods could be more complete.

      We have added the following to the methods to clarify that the same gRNA was used to produce both cell lines:

      “The human GNB1 gene was targeted at a site corresponding to the N-terminus of the Gb1 protein; the sequence 5’-TGAGTGAGCTTGACCAGTTA-3’ was incorporated into the crRNA, and the same gRNA was used to produce both HiBit-b1 and mNG-b1 cell lines.”

      We have added the following to the methods to clarify why HiBit is included in the mNG-b1 construct:

      “HiBit was included in the repair template for producing mNG-b1 cells to enable screening for edited clones using luminescence.”

      Results:

      The explanation of validation experiments in Figures 1 C and D is incomplete and difficult to follow. The rationale and explanation of the experiments could be expanded. In addition, because this is an interesting method, it would be helpful to know if the endogenous editing affects normal GPCR signaling. For example, the authors could include data showing an Iso-induced cAMP response. This is not critical to the present interpretation but is relevant as a general point regarding the method. Also, it may be relevant to the interpretation of receptor effects on G protein localization.

      We have expanded the rationale and explanation of experiments in Figures 1C and D by adding:

      “For example, we observed agonist-induced BRET between the D2 dopamine receptor and mNG-b1, an interaction that requires association with endogenous Ga subunits (Figure 1C). Similarly, we observed BRET between HiBit-b1 and the free Gbg sensor memGRKct-Venus after activation of receptors that couple Gi/o, Gs, and Gq heterotrimers, indicating that HiBit-b1 associated with endogenous Ga subunits from these three families (Figure 1D).”

      We have done the suggested cAMP experiment and provide the data in a new figure supplement:

      “We also found that cyclic AMP accumulation in response to stimulation of endogenous b adrenergic receptors was similar in edited cell lines and their unedited parent lines (Figure 1 - figure supplement 1).”

      Discussion:

      The conclusion that beta-gamma subunits do not redistribute after GPCR activation seems new and different from previous reports. Is this correct? Can the authors elaborate on how the results compare to previous literature?

      Many previous studies have indeed shown that free Gbg dimers can redistribute after GPCR activation and sample intracellular membranes. Our initial focus was on possible changes in heterotrimer distribution after GPCR activation, but in retrospect we should have directly addressed free Gbg translocation and made the distinction clear. 

      In the revised manuscript we show that during stimulation we observe changes consistent with modest translocation of endogenous Gbg from the plasma membrane and sampling of intracellular compartments. To our knowledge this is the first demonstration of endogenous Gbg translocation.

      We have added:

      “With overexpressed G proteins free Gbg dimers translocate from the plasma membrane and sample intracellular membrane compartments after activation-induced dissociation from Ga subunits. Consistent with this, we observed small decreases in bystander BRET at the plasma membrane and small increases in bystander BRET at intracellular compartments during activation of GPCRs, suggesting that endogenous Gbg subunits undergo similar translocation (Figure 5 - figure supplement 1). Notably, these changes occurred at room temperature, suggesting that endocytosis was not involved, and developed over the course of minutes. The latter observation and the small magnitude of agonist-induced changes are both consistent with expression of primarily slowly-translocating endogenous Gg subtypes in HEK 293 cells. Moreover, as shown previously for overexpressed Gbg, the changes we observed with endogenous Gbg were readily reversible (Figure 5 - figure supplement 1), suggesting that most heterotrimers reassemble at the plasma membrane after activation ceases.”

      Can the authors note that OpenCell has endogenously tagged Gβ1 and reports more obvious internal localization? Can the authors comment on this point?

      OpenCell has tagged GNB1 and the Leonetti group kindly provided a parent cell line we used to add a slightly different tag. Although their study did not identify any specific intracellular compartments, our impression is that most of the internal structures visible in their images are likely to be lysosomes, as they are large, round and often have a clear lumen. Overall their images and ours are comfortingly similar. We have added:

      “Unsurprisingly, our images are quite similar to those made as part of a previous study that labeled Gb1 subunits with mNG2 (Cho et al., 2022).”

      Notably, the Leonetti group has recently reported the subcellular distribution of many untagged proteins using a proteomic approach. They find that Gb1 is enriched on the plasma membrane and lysosomes but is not enriched on endosomes, the Golgi apparatus, endoplasmic reticulum or mitochondria (https://www.biorxiv.org/content/10.1101/2023.12.18.572249v1). We have cited this work in the revised manuscript.

      Is this the first use of CRISPR / HiBit for BRET assay? It would be helpful to know this or cite previous work if not. Also, as this is submitted as a tools piece, the authors might say a little more about the potential application to other questions.

      The only previous study we are aware of utilizing a similar combination of methods is a 2020 report from the group of Dr. Stephen Hill, in which the authors studied binding of fluorescent ligands to HiBit-tagged GPCRs. This work is now cited.

      We have also added the following to our previous brief statement about potential applications:

      “In addition, it may also be possible to use these cells in combination with targeted sensors to study endogenous G protein activation in different subcellular compartments. More broadly, our results show that subcellular localization of endogenous membrane proteins can be studied in living cells by adding a HiBit tag and performing bystander BRET mapping. Applied at large scale this approach would have some advantages over fluorescent protein complementation, most notably the ability to localize endogenous membrane proteins that are expressed at levels that are too low to permit fluorescence microscopy.”

      Reviewer #3 (Public Review):

      Summary:

      This article addresses an important and interesting question concerning intracellular localization and dynamics of endogenous G proteins. The fate and trafficking of G protein-coupled receptors (GPCRs) have been extensively studied but so far little is known about the trafficking routes of their partner G proteins that are known to dissociate from their respective receptors upon activation of the signaling pathway. The authors utilize modern cell biology tools including genome editing and bystander bioluminescence resonance energy transfer (BRET) to probe intracellular localization of G proteins in various membrane compartments in steady state and also upon receptor activation. Data presented in this manuscript shows that while G proteins are mostly present on the plasma membrane, they can be also detected in endosomal compartments, especially in late endosomes and lysosomes. This distribution, according to data presented in this study, seems not to be affected by receptor activation. These findings will have implications in further studies addressing GPCR signaling mechanisms from intracellular compartments.

      Strengths:

      The methods used in this study are adequate for the question asked. In particular, the use of genome-edited cells (for the addition of the tag on one of the G proteins) is a great choice to prevent the effects of overexpression. Moreover, the use of bystander BRET allowed the authors to probe the intracellular localization of G proteins in a very high-throughput fashion. By combining imaging and BRET, the authors convincingly show that G proteins are of very low abundance on early endosomes (and also on the ER, mitochondria, and medial Golgi) but seem to accumulate on membranes of late endosomal compartments.

      Weaknesses:

      While the authors provide a novel dataset, many questions regarding G protein trafficking remain open. For example, it is not entirely clear which pathway is utilized to traffic G proteins from the plasma membrane to intracellular compartments. Additionally, future studies should also address the dynamics of G protein trafficking, for example by tracking them over multiple time points.

      We agree, there is much more to do.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      On page 7 the text says "the difference did reach significance (Figure 5D)". It looks like the difference did not reach significance. Please check on this.

      Thank you, this was an unfortunately significant typo.

      Reviewer #3 (Recommendations For The Authors):

      This article addresses an important and interesting question concerning intracellular localization and dynamics of endogenous G proteins. While the posed question is indeed a grand one and the methods used by the authors are novel, I believe that the data presented in this manuscript are still insufficient to support all claims posed by the authors. Below I list my major concerns:

      (1) The authors claim that they provide a "detailed subcellular map of endogenous G protein distribution"; however, the map is in my opinion not sufficiently detailed (e.g. the trans-Golgi network is not included) and not quantitative enough (e.g. % of proteins present on one compartment vs. another, as the authors claim that BRET signals "cannot be directly compared between different compartments"). To strengthen this statement, in addition to providing more extensive and quantitative data, it would be beneficial to provide such a "map" as an illustration based on the findings presented in this article.

      “Detailed” is certainly a subjective term. While we maintain that our description of endogenous G protein distribution is far more detailed than any previous study, we now simply claim to provide a “subcellular map”. We have added images of TGNP (TGN46; TGOLN2), showing that endogenous G proteins are readily detectable on the structures labeled by this marker. These data are now provided in Figure 3 – figure supplement 7.

      We did not claim that our study was quantitative; we did not try to count G proteins. However, if we use published estimates of total G proteins and surface area for HEK 293 cells, we estimate that there are roughly 2,500 G proteins µm⁻² on the plasma membrane and 500 G proteins µm⁻² on endocytic vesicles. For other intracellular compartments, relative density can be approximated by inspecting images, but a truly quantitative estimate would require a surface area standard analogous to FM4-64 for each compartment. The percentage of the total G protein pool on a given compartment is, in our opinion, less important than the density of G proteins on that compartment, as the latter is more likely to affect the efficiency of local signal transduction. Since we do not claim to have accurate G protein density estimates for many intracellular compartments, we prefer to provide several raw images for each compartment rather than a schematized map.

      Bystander BRET values cannot be compared directly across compartments due to differences in expression and energy transfer efficiency of different markers and compartment surface area. This method is well suited for following changes in distribution as a function of time or after perturbations and for sensitive detection of weak colocalization but can only provide approximate “maps” of absolute distribution.
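
      As a minimal sketch of why within-compartment changes stay interpretable even when absolute values are not comparable, each marker's BRET ratio can be referenced to its own baseline. The function names and luminescence reads below are hypothetical illustrations, not the authors' analysis code or data.

```python
from statistics import mean

def bret_ratio(acceptor_emission, donor_emission):
    """Raw BRET ratio: acceptor-channel emission divided by donor-channel emission."""
    if donor_emission <= 0:
        raise ValueError("donor emission must be positive")
    return acceptor_emission / donor_emission

def delta_bret(timecourse, n_baseline):
    """Change in BRET ratio relative to the mean of the first n_baseline reads."""
    ratios = [bret_ratio(a, d) for a, d in timecourse]
    baseline = mean(ratios[:n_baseline])
    return [r - baseline for r in ratios]

# Hypothetical (acceptor, donor) reads for ONE compartment marker;
# the first two reads are taken before agonist addition.
reads = [(500.0, 10000.0), (510.0, 10200.0), (480.0, 10000.0), (460.0, 10000.0)]
changes = delta_bret(reads, n_baseline=2)
print(changes)
```

      Because each marker is compared only with itself, differences in marker expression or energy transfer efficiency drop out of the change, although they still prevent comparison of the raw ratios between markers.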

      (2) Probing of the intracellular distribution of these proteins, especially after GPCR activation, includes a single chosen timepoint. I believe that the manuscript would greatly benefit from including some dynamic data on internalization and intracellular trafficking kinetics. What is the turnover of tested G proteins? What is the fraction that is going to recycling compartments and/or lysosomes? Authors could perhaps turn to other methods to be able to dynamically track proteins over time e.g. via photoconversion techniques.

      Because G protein trafficking appears to be largely constitutive there is no easy way for us to assess how long it takes G proteins to transit various intracellular compartments, although we agree this would be interesting. As the reviewer suggests, dynamic data on constitutive trafficking would require methods (such as photoconversion) not currently available to us for endogenous G proteins. Accordingly, we have made no claims regarding the kinetics of G protein trafficking. As for possible redistribution after GPCR activation, in the revised manuscript we have added 5- and 15-minute timepoints after agonist stimulation for our bystander BRET mapping (Figure 5- figure supplement 2). These timepoints were chosen to correspond to persistent signaling mediated by internalized receptors. 

      (3) Exemplary images with cells showing significant colocalization with lysosomal compartments seem to contain more intracellular vesicles visible in the mNG channel than in the case of the other compartment. Is it an effect of the treatment to stain lysosomes? It would be beneficial to compare it with some endogenous marker e.g. LAMP1 without additional treatments.

      The visibility of intracellular vesicles in our lysosome images likely reflects our selection of cells and regions with visible and abundant lysosomes, specifically peripheral regions directly adhered to the coverslip, rather than treatment with lysosomal stains (LV 633 and dextran). As suggested, we now include images of cells expressing LAMP1 as an alternative lysosome marker (Figure 3 - figure supplement 6).

      (4) The authors probe the abundance of G proteins along the constitutive endocytic pathway. However, to prove that G proteins are endocytosed rather than de-palmitoylated, the authors should perform control experiments where endocytosis is blocked, e.g. pharmacologically or via a knockdown approach. Additionally, various endocytic pathways can be probed.

      We did not claim that depalmitoylation plays no role in delivery of G proteins to internal compartments. In fact, we pointed out that we cannot at present rule out other pathways and delivery mechanisms. Importantly, if some of the G proteins that we detect along the endocytic pathway do arrive there by trafficking through the cytosol this would only strengthen our major conclusion that endocytosis is inefficient.

      Having said this, we have now conducted extensive experiments investigating the role of palmitate cycling in the trafficking of heterotrimeric G proteins and the small G protein H-Ras. Our results suggest that a depalmitoylation-repalmitoylation cycle is not important for the distribution of heterotrimers, but these findings will be the subject of a separate publication focused on this specific question for both large and small G proteins.

      We agree that it will be interesting to probe different endocytic pathways, as suggested using a genetic approach. Our main interest here was in endocytic membranes that were defined functionally (with FM4-64 or internalized receptors) rather than biochemically.

      Minor comments:

      (5) "Imaging" paragraph in the Methods section refers to a non-existent figure called "SI Appendix S9".

      Thank you.

      (6) It is not clear what was used as a "control" in Figure 5E.

      “Control” refers to DPBS vehicle alone. This information is now added to the legend for Figure 5E.

    1. eLife assessment

      This paper presents a valuable automated method to track individual mammalian cells as they progress through the cell cycle using the FUCCI system. The authors have developed a technique for analyzing cells that grow in suspension and used their method to look at different tumor cell lines that grow in suspension and determine the effect of drugs that directly affect the cell cycle. They show solid evidence that the method can be applied to both adherent and non-adherent cell lines. This paper will be of interest to cell biologists investigating cell cycle effects.

    2. Reviewer #3 (Public review):

      Summary:

      This paper presents an automated method to track individual mammalian cells as they progress through the cell cycle using the FUCCI system, and applies the method to different tumor cell lines that grow in suspension to determine their cell cycle profile and the effect of drugs that directly affect the cell cycle on progression through the cell cycle over a 72-hour period.

      Strengths:

      This is a METHODS paper. The one potentially novel finding is that they can identify cells at the G1-S transition by the change in color as one protein starts to go up and the other goes down, similar to the change seen as cells enter G2/M. They have provided detailed data in the resubmission, demonstrating how this can be done in different cell lines and that the brief period during which cells are determined to be in the transition from G1 to S can be resolved to about 1 hr. They further showed how one can explore this period, using EdU labeling in conjunction with FUCCI to determine whether cells have entered S-phase. This nicely addressed a weakness identified in the previous review.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Comment 1 and 2: “The pipeline relies on a large number of hard-coded conditions: size of Gaussian blur (Gaussian should be written in uppercase), values of contrast, size of filters, levels of intensity, etc. Presumably, the authors followed a heuristic approach and tried values of these and concluded that the ones proposed were optimal. A proper sensitivity analysis should be performed. That is, select a range of values of the variables and measure the effect on the output.”

      “Linked to the previous comments. Other researchers that want to follow the pipeline would have either to have exactly the same acquisition conditions as the manuscript or start playing with values and try to compensate for any difference in their data (cell diameter, fluorescent intensity, etc.) to see if they can match the results of the manuscript.”

      We thank the Reviewer for his insightful comments. We have modified the “Usage” section of the GitHub page (https://github.com/ieoresearch/cellcycle-image-analysis) to include, for each step of the image processing, a paragraph explaining the significance of the operation and a paragraph named “Suggested Values Range” where tips for optimal parameter settings are given and examples with different parameter settings are shown. We believe that these new paragraphs help researchers easily customize the pipeline to their own data.
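
      To make the requested sensitivity analysis concrete, the sketch below sweeps two parameters (Gaussian blur width and intensity threshold) and records how the number of detected objects changes. It is a simplified one-dimensional illustration under assumed values; the profile, parameter ranges and function names are hypothetical and do not reproduce the pipeline in the repository.

```python
import math
from itertools import product

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(profile, sigma):
    """Convolve a 1D intensity profile with the kernel, clamping at the edges."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(profile)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * profile[j]
        out.append(acc)
    return out

def count_objects(profile, sigma, threshold):
    """Count contiguous runs that stay above the threshold after blurring."""
    count, inside = 0, False
    for value in blur(profile, sigma):
        above = value > threshold
        if above and not inside:
            count += 1
        inside = above
    return count

def sensitivity_table(profile, sigmas, thresholds):
    """Evaluate the output (object count) for every parameter combination."""
    return {(s, t): count_objects(profile, s, t)
            for s, t in product(sigmas, thresholds)}

# Hypothetical intensity profile: two bright "cells" on a dark background.
profile = [0.0] * 10 + [100.0] * 6 + [0.0] * 10 + [100.0] * 6 + [0.0] * 10
table = sensitivity_table(profile, sigmas=[0.5, 1.0, 2.0],
                          thresholds=[20.0, 50.0, 90.0])
for (sigma, threshold), n in sorted(table.items()):
    print(f"sigma={sigma} threshold={threshold} objects={n}")
```

      Parameter combinations that give the same count as their neighbours indicate a robust operating range, whereas a combination at which the count changes marks a setting that a user with different acquisition conditions would need to retune.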

      Reviewer 2:

      Comment 1: “It would be useful to include frames from the movie showing a G1/S cell in Figures 1 and S1 with some indication of how long that cell is present. From Figure S4 it looks like it is substantially less than an hour.

      It would definitely be nice to validate this observation. A brief pulse of EdU together with the FUCCI colors could allow you to do that in a culture of cycling cells. It appears that the green color as cells enter S-phase develops slowly (and maybe gets brighter continuously) as does the red color as cells progress through G1. It would be nice to validate what color the cells are when they actually initiate DNA replication.”

      We thank the Reviewer for the opportunity to further investigate our results and clarify points that were unclear in the first version of the manuscript. As suggested, we have included all acquired frames depicting the G1 to S transition/early S phase of three cells: the Kasumi-1 untreated cell and the PF-06873600 treated NB4 cell shown in Fig. 1A, and the MDA-MB-231 cell shown in Fig. S1; they are shown in panels D of Fig. 4 and S5, respectively.

      For the Kasumi-1 and NB4 cells, the G1 to S transition/early S phase, defined in the pipeline refinement step as a yellow phase appearing before the S phase, is visible at the 12-hour frame. Conversely, the MDA-MB-231 cell shown in Fig. S5D does not exhibit the yellow G1 to S transition/early S phase; it transitions abruptly from red to green within our acquisition timeframe (30 min in this case), producing a green early S phase. This observation supports the Reviewer's suggestion that the yellow G1 to S transition is often shorter than one hour and is not identifiable in all cells.
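
      The distinction between a yellow frame that precedes S phase and one that follows it can be sketched as a simple rule on a per-cell time series. This is an illustrative sketch only: the thresholds, intensities and function names are hypothetical, and the actual refinement step is the one documented in the repository.

```python
def color_of(red, green, red_on, green_on):
    """Map one frame's red and green intensities to a FUCCI color."""
    if red >= red_on and green >= green_on:
        return "yellow"
    if red >= red_on:
        return "red"
    if green >= green_on:
        return "green"
    return "none"

def assign_phases(track, red_on, green_on):
    """Label each frame of a single-cell track.

    Yellow frames seen before any green frame in the current cycle are
    labeled G1/S; yellow frames seen after a green frame are labeled G2/M.
    """
    phases, seen_green = [], False
    for red, green in track:
        color = color_of(red, green, red_on, green_on)
        if color == "red":
            phases.append("G1")
            seen_green = False  # a red frame starts a new cycle
        elif color == "green":
            phases.append("S")
            seen_green = True
        elif color == "yellow":
            phases.append("G2/M" if seen_green else "G1/S")
        else:
            phases.append("unassigned")
    return phases

# Hypothetical (red, green) intensities per frame for two cells.
with_transition = [(900, 50), (850, 400), (100, 800), (600, 900), (900, 40)]
abrupt = [(900, 50), (100, 800)]
print(assign_phases(with_transition, red_on=300, green_on=300))
print(assign_phases(abrupt, red_on=300, green_on=300))
```

      In the second track no frame is yellow before the first green frame, which mirrors the case described above in which the transition is shorter than the acquisition interval and therefore goes undetected.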

      To further investigate this point, we also conducted the EdU experiments kindly suggested by the Reviewer. Kasumi-1 and MDA-MB-231 cells expressing the FUCCI(CA)2 probes were exposed to a pulse of EdU, and subsequently analyzed using flow cytometry and confocal microscopy. A new paragraph titled “The workflow allows the identification of the G1 to S phase transition” has been added to the Results section, with the corresponding data presented in Fig. 4 and Fig. S5 for Kasumi-1 and MDA-MB-231 cells, respectively. The Methods section has also been updated to describe the new experiments.

      Additionally, in BOX1 under the 'Cell phase assignment' paragraph, point (III), we have removed point 'a. Re-assign the G2/M frames to G1'. Although theoretically possible according to the pipeline, this reassignment is incorrect in practice because mVenus fluorescence indicates that the cells are starting or have already initiated DNA replication.

      All the modifications we made in the text and Figure captions are highlighted in red. We would be thankful if the co-first authorship of Kourosh Hayatigolkhatmi, Chiara Soriani and Emanuel Soda is acknowledged in the final published version of the article.

      We believe that the revisions have strengthened our manuscript, and we hope that it now meets the reviewers' suggestions for greater clarity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In the present study, Rincon-Torroella et al. developed ME3BP-7, a microencapsulated formulation of 3BP, as an agent to target MCT1-overexpressing PDACs. They provided evidence showing the specific killing of MCT1-overexpressing PDAC cells in vitro, along with demonstrating the safety and anti-tumor efficacy of ME3BP-7 in PDAC orthotopic mouse models.

      Strengths:

      * Developed a novel agent.

      * Well-designed experiments and an organized presentation of data that support the conclusions drawn.

      Weaknesses:

      There are some minor issues that could enhance the clarity and completeness of the study:

      (1) Statistical results should be visually presented in Figure 4 and Figure S1.

      (2) Given the tumor heterogeneity and the identification of focal high expression of MCT1 in Figure 7 and Figure S5B, it is suggested that the authors include the results of immunohistochemical (IHC) analysis of MCT1 expression in both control and ME3BP-7 treated tumor tissues. This addition may offer insight into whether the remaining tumors are composed of PDAC cells with negative MCT1 expression, while the cells with relatively high levels of MCT1 expression were eliminated by ME3BP-7 treatment.

      (3) The authors are encouraged to discuss the future directions for improving the efficacy of this study. For example, exploring the combination of ME3BP-7 with a glutaminase-1 inhibitor (PMID 37891897) could be a valuable avenue for further research.

      We thank the reviewer for pointing these out. We have addressed these individually in detail in the next section.

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript by Rincon-Torroella et al, the authors evaluated the therapeutic potential of ME3BP-7, a microencapsulated formulation of 3BP which specifically targets MCT-1 high tumor cells, in pancreatic cancer models. The authors showed that, compared to 3BP, ME3BP-7 exhibited much-enhanced stability in serum. In addition, the authors confirmed the specificity of ME3BP-7 toward MCT-1 high tumor cells and demonstrated the in vivo anti-tumor effect of ME3BP-7 in orthotopic xenograft of human PDAC cell line and PDAC PDX model.

      Strengths:

      (1) The study convincingly demonstrated the superior stability of ME3BP-7 in serum.

      (2) The specificity of ME3BP-7 and 3BP toward MCT-1 high PDAC cells was clearly demonstrated with CRISPR-mediated knockout experiments.

      Weaknesses:

      The advantage of ME3BP-7 over 3BP under an in vivo situation was not fully established.

      This is a helpful observation indeed, and we have attempted to address it in the revised manuscript and have clarified the details in the following section.

      Reviewer #1 (Recommendations For The Authors):

      There are some minor issues that could enhance the clarity and completeness of the study:

      We appreciate these comments and have addressed them to the best of our abilities in the revised manuscript.

      (1) Statistical results should be visually presented in Figure 4 and Figure S1.

      Figure 4 and S1 have been updated to include visual representation of statistical results.

      (2) Given the tumor heterogeneity and the identification of focal high expression of MCT1 in Figure 7 and Figure S5B, it is suggested that the authors include the results of immunohistochemical (IHC) analysis of MCT1 expression in both control and ME3BP-7 treated tumor tissues. This addition may offer insight into whether the remaining tumors are composed of PDAC cells with negative MCT1 expression, while the cells with relatively high levels of MCT1 expression were eliminated by ME3BP-7 treatment.

      This is an excellent suggestion, but unfortunately, we were unable to implement it. We identified a single antibody that showed specificity in our MCT1 knockout isogenic panel after testing 6 different commercial anti-MCT1 antibodies. While the chosen antibody (sc-365501) worked well on fixed human pancreatic cancer samples, it exhibited significant cross-reactivity against background mouse tissue, making it difficult to effectively visualize the orthotopically implanted PDX samples.

      (3) The authors are encouraged to discuss the future directions for improving the efficacy of this study. For example, exploring the combination of ME3BP-7 with a glutaminase-1 inhibitor (PMID 37891897) could be a valuable avenue for further research.

      We have included potentially useful combinations of ME3BP-7 in the discussion section.

      Reviewer #2 (Recommendations For The Authors):

      The overall study is straightforward with translational significance. However, additional clarification is needed to determine the novelty of the study. As cited by the authors, the same group previously published a paper in Clinical Cancer Research, demonstrating the anti-tumor effect of beta-CD-3BP which is also a microencapsulated form of 3BP prepared with succinyl-beta-cyclodextrin. Please clarify what is the major difference between the ME3BP-7 and beta-CD-3BP.

      We designed the first generation of beta-CD-3BP and presented the preliminary results in the Clinical Cancer Research paper. Over the last several years, we sought to optimize the formulation so that it would be a robust clinical candidate. The current manuscript describes our in-depth exploration.

      We used a combination of SEC HPLC analyses (representative chromatogram in Fig. 3A) along with a newly developed assay to assess serum stability (representative data in Fig. 3B) of a panel of ME-3BP complexes. The panel was created by varying the molar ratios of three different beta-CDs (succinyl beta-CD, native beta-CD and hydroxypropyl beta-CD) to 3BP. We discovered that an excess of succinyl-beta-CD (1.2:1) resulted in the most stable agent with no noticeable batch effects, and this formulation was dubbed ME3BP-7.

      The study clearly demonstrated the superior stability of ME3BP-7 in serum compared to 3BP. To further support the advantage of ME3BP-7, it will be important to include the same dose of 3BP as a control in the in vivo treatment experiment to evaluate the difference in both toxicity and anti-tumor effect.

      We wanted to include a control arm in our study wherein the same dose of 3BP was used. However, in toxicity studies on three different species of mice, we found that infusion of 3BP at the identical dose was highly toxic, killing the animals within a few days.  We have highlighted this toxicity of the non-microencapsulated 3BP in the revised manuscript.

    2. eLife assessment

      This study presents a valuable finding and developed ME3BP-7 as a novel microencapsulated formulation of 3BP, which specifically targets MCT1-overexpressing PDAC cells. It demonstrates its specificity and efficacy in vitro and in PDAC mouse models, with significant anti-tumor effects and improved serum stability. Overall, the evidence supporting the authors' claims is solid.

    3. Reviewer #1 (Public review):

      Summary:

      In this revised manuscript, Rincon-Torroella et al. developed ME3BP-7, a microencapsulated formulation of 3BP, as a potential agent to target MCT1 overexpressing PDACs. The authors provided compelling experimental evidence demonstrating the specific and rapid killing of MCT1 overexpressing PDAC cells in vitro, along with the safety and significant anti-tumor efficacy of ME3BP-7 in multiple PDAC orthotopic mouse models. Overall, this study is very novel, with well-designed experiments and a clear, organized presentation of data that supports the conclusions. The authors have effectively addressed the questions raised in the primary review and provided a thorough discussion of the study's significance, limitations, and future directions, which enhances the readers' understanding of the potential clinical impact of this research.

      Strengths:

      * Developed a novel agent.

      * Well-designed experiments and an organized presentation of data that support the conclusions.

      Weaknesses:

      No significant weaknesses are noticed.

    4. Reviewer #2 (Public review):

      Summary:

      In the manuscript by Rincon-Torroella et al, the authors evaluated the therapeutic potential of ME3BP-7, a microencapsulated formulation of 3BP which specifically targets MCT-1 high tumor cells, in pancreatic cancer models. The authors showed that, compared to 3BP, ME3BP-7 exhibited much-enhanced stability in serum. In addition, the authors confirmed the specificity of ME3BP-7 toward MCT-1 high tumor cells and demonstrated the in vivo anti-tumor effect of ME3BP-7 in orthotopic xenograft of human PDAC cell line and PDAC PDX model.

      Strengths:

      (1) The study convincingly demonstrated the superior stability of ME3BP-7 in serum.

      (2) The specificity of ME3BP-7 and 3BP toward MCT-1 high PDAC cells was clearly demonstrated with CRISPR-mediated knockout experiments.

      (3) The advantage of ME3BP-7 over 3BP in an in vivo setting is highlighted in the revised manuscript.

    1. eLife assessment

      This important study identifies a new class of small molecules that activate the integrated stress response via the kinase HRI. Solid evidence indicates that two of these compounds promote mitochondrial elongation. The findings would be strengthened if the mutant cells with reduced fusion activity of Mfn2 were analyzed for the rescue of mitochondrial functions.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Weaknesses:

      While the novel compound showed promising potency in HER2-positive gastric cancer cells and a xenograft model, it would be great for it also to be evaluated in HER2-positive breast cancer cell models. The authors did not compare the current compounds with other therapeutic strategies targeting HER2 expression at the genetic level. It is unclear why the EGFR inhibitors gefitinib and canertinib, but not HER2-specific inhibitors (e.g. tucatinib), were used as controls in the manuscript.

      We appreciate the reviewer’s insightful comments. Evaluating compound 10 on HER2-positive breast cancer cells is indeed crucial, especially given the established HER2-targeting therapies for breast cancer. In response to this concern, we conducted additional experiments to investigate the impact of compound 10 on HER2-positive breast cancer cell lines AU565 and BT474, specifically assessing its HER2 downregulating activity (Author response image 1).

      Author response image 1.

      HER2 downregulatory effect of compound 10 in HER2-positive breast cancer cell lines, AU565 and BT474.

      The selection of gefitinib (an EGFR tyrosine kinase inhibitor) and canertinib (a pan-HER inhibitor) as positive controls in our manuscript is based on their demonstrated ability to inhibit the protein-protein interaction (PPI) between ELF3 and MED23, as previously reported (J Adv Res. 47, (2023) 173-87. 10.1016/j.jare.2022.08.003; Cancer letters. 325, (2012) 72-9. 10.1016/j.canlet.2012.06.004). In the referenced studies, a SEAP reporter gene assay was used to screen compounds for their capacity to disrupt the ELF3-MED23 PPI. This assay involves GAL4-ELF3 binding to a GAL4 binding site in the SEAP reporter gene, followed by interaction with MED23, leading to RNA polymerase II recruitment and SEAP expression in cells (J Am Chem Soc. 2004, 126(49), 15940. doi: 10.1021/ja0445140). Canertinib exhibited stronger inhibitory activity against the ELF3-MED23 PPI than gefitinib, but also showed non-specific cytotoxicity. YK1 was subsequently developed based on structural analysis of the interfaces between gefitinib and MED23, and between ELF3 and MED23. Considering the previously validated inhibitory activities of gefitinib and canertinib, these drugs were selected as positive controls in the current study to compare the ELF3-MED23 inhibitory efficacy of novel compounds.

      Reviewer #1 (Recommendations For the Authors):

      (1) It is unclear why compound 5 did not inhibit HER2 overexpression at the mRNA level but did so at the protein level, as compounds 3 and 10 did. Could the authors further explain the potential mechanism for compound 5?

      While the exact mechanism remains unclear, the results indicated that compound 5 likely affects the protein level of HER2 through somewhat non-specific mechanisms rather than by inhibiting the ELF3-MED23 PPI. Based on this assessment, compound 5 was excluded from further investigation.

      (2) The approach used for the assays of HER2 expression and its downstream signaling pathway is unclear. It needs to be described in the Methods or the Supplementary Information.

      We investigated the ELF3-MED23 PPI inhibitory activity and its subsequent effect on HER2 downregulation using a comprehensive approach involving multiple techniques to ensure precise and unbiased experimental results.

      To assess PPI inhibition, we employed the following assays:

      · SEAP reporter gene assay

      · Fluorescence polarization (FP)

      · Split-luciferase complementation assay

      · GST-pulldown

      · Immunoprecipitation (IP)

      HER2 expression levels were evaluated through:

      · SEAP reporter gene assay

      · Luciferase promoter assay

      · Quantification of HER2 mRNA using qPCR

      · Measurement of HER2 protein levels via western blot analysis

      To evaluate downstream signaling of HER2, we analyzed:

      · Phosphorylation levels of MAPK (pMAPK) and AKT (pAKT)

      These methods were systematically applied to elucidate the mechanism of action of compound 10 in inhibiting ELF3-MED23 interaction and subsequently downregulating HER2.

      For clarity, we have revised the manuscript to provide a detailed description of the experimental methods to assess PPI, as described below.

      “A SEAP assay was performed as previously described to measure ELF3-MED23 PPI-dependent HER2 transcription [29]. In this assay, the GAL4-ELF3 fusion protein binds to one of the five GAL4 binding sites on the reporter gene (pG4IL2SX). The interaction between the GAL4-ELF3 fusion protein and endogenous MED23 induces the expression of SEAP. Once expressed, SEAP acts as a phosphatase on the substrate 4-MUP (4-methylumbelliferyl phosphate), resulting in increased fluorescence. The mammalian expression vector, …”

      “An FP assay was conducted following a previously described method to evaluate the molecular interaction between ELF3 and MED23 [29]. The FP assay operates on the principle of molecular rotation dynamics. When a fluorescently labeled small molecule is excited by polarized light, the emitted fluorescence can be polarized or depolarized depending on the molecular status. Free small molecules rotate rapidly, altering the orientation of their fluorescence dipole and emitting depolarized light. However, when these small molecules bind to large molecules, such as proteins, the resulting complex rotates more slowly, and the emitted light retains much of its original polarization. In this study, different concentrations of (His)₆-MED23₃₉₁–₅₈₂, as the large molecule, and 10 nM of FITC-labeled ELF3₁₂₉–₁₄₅ peptide, as the fluorescence-labeled small molecule, were combined in …”
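
      The principle described above is usually quantified from the emission intensities measured parallel and perpendicular to the excitation plane. The sketch below uses the standard definitions of polarization and anisotropy; the intensity values are hypothetical and the instrument correction (G-factor) is assumed to be 1 unless stated, so this is an illustration rather than the authors' analysis.

```python
def polarization_mP(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in millipolarization units (mP)."""
    perp = g_factor * i_perpendicular
    total = i_parallel + perp
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - perp) / total

def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence anisotropy r, a related measure of retained polarization."""
    perp = g_factor * i_perpendicular
    denom = i_parallel + 2.0 * perp
    if denom <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - perp) / denom

# Hypothetical intensities: a freely tumbling labeled peptide emits nearly
# depolarized light, whereas a protein-bound peptide retains polarization.
free_peptide = polarization_mP(1050.0, 950.0)
bound_peptide = polarization_mP(1300.0, 700.0)
print(free_peptide, bound_peptide, anisotropy(1300.0, 700.0))
```

      In a competition format, a compound that disrupts the peptide-protein interaction would be expected to shift the reading from the bound value back toward the free value.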

      (3) It is confusing to me about the order of the experiments, in which the SAR work came after the synthesis and a series of biochemical studies for the characterization of the synthetic compounds. What is the specific reason for this order?

      We concluded that the current approach is appropriate because the analysis was not intended for structural modification and optimization through SAR (Structure-Activity Relationship) analysis. Instead, the primary objective was to elucidate the structural basis underlying the efficacy of PPI inhibition among compounds sharing the same scaffold. We believe this will provide valuable insights for future design and synthesis of new compounds.

      (4) The yield for each step of the general synthesis needs to be included in the scheme 1.

      Scheme 1 has been updated to include the yield of each step of the synthesis process.

      (5) In line 532, the authors stated 28 compounds, should it be 26?

      ‘Twenty-eight compounds’ includes 26 newly synthesized compounds and 2 positive controls, gefitinib and canertinib.

      (6) Introduction part, lines 74 to 75, "While HER2 gene amplification is the primary mechanism responsible for HER2 overexpression" may not be confirmed in lung cancers.

      HER2 overexpression is usually a direct consequence of gene amplification, although overexpression can occur by other mechanisms [Nat Rev Cancer. 2009;9:463–475. doi: 10.1038/nrc2656.; Cell. 2007;129:1275–1286. doi: 10.1016/j.cell.2007.04.034.]. The levels of HER2 protein expression and gene amplification are linearly associated and highly concordant in breast cancer, colorectal cancer, ovarian cancer, and esophageal adenocarcinoma [World J Gastrointest Oncol. 2019, 11(4): 335–347. doi: 10.4251/wjgo.v11.i4.335; J Clin Oncol. 2002;20:719–26. doi.org/10.1200/JCO.2002.20.3.71; Oncology. 2001;61(Suppl 2):14–21. doi.org/10.1159/000055397; Science. 1989, 244(4905):707-12. doi: 10.1126/science.2470152; Cancer. 2014 Feb 1; 120(3): 415–424. doi: 10.1002/cncr.28435]. As the reviewer mentioned, the linear association between HER2 protein expression and gene amplification has not been fully established for NSCLC [ESMO Open. 2022, 100395. doi: 10.1016/j.esmoop.2022.100395].

      Therefore, we changed the sentence as described below.

      “While HER2 gene amplification is the primary mechanism responsible for HER2 overexpression in most HER2-positive cancers, except in lung cancer [16], high transcription rates of HER2 per gene copy have also been observed to contribute.”

      (7) The abstract part, lines 31 and 32, the detailed experimental data for SEAP needs to be expressed in another way.

      SEAP is a type of reporter gene assay. We revised the manuscript as follows and additionally described the assay in the Methods section.

      “Upon systematic analysis, candidate compound 10 was selected due to its potency in downregulating reporter gene activity of HER2 promoter confirmed by SEAP activity and its effect on HER2 protein and mRNA levels.”

      (8) The author should combine the box for Chalcone, pyrazoline, Licochalcone E, and YK-1, Figures 1 and 2 into a new single Figure.

      We revised the manuscript following the reviewer's comments.

      (9) Provide the list of antibodies and sources for the cell-based and western blot assays.

      Table S1 presents detailed information about the antibodies and dilution ratios used in the cell-based and western blot assays.

      Reviewer #2 (Public Reviews):

      Weaknesses:

      The rationale behind the proposed structural modifications for the three groups of compounds is not clear.

      Reviewer #2 (Recommendations For the Authors):

      (1) Based on previous work experience, it would be interesting to evaluate the in silico mode of interaction of compound 10.

      As suggested by the reviewers, we additionally performed an in silico docking study to identify the mode of interaction of compound 10 (Author response image 2). As shown below, the results indicate that compound 10 shares a similar binding orientation with YK1, forming an H-bond with the H449 residue. Although it does not interact with the D400 residue, it was predicted to create an additional H-bond with S450, which is right next to H449, thereby reinforcing the overall binding of compound 10 to MED23. Moreover, compound 10 was predicted to form a pi-pi interaction with F399, which has previously been identified as an important interaction for compounds that show a strong PPI inhibitory effect against ELF3 and MED23.

      Author response image 2.

      Docking analysis of compound 10.

      (2) The chalcones presented in this study are structurally similar to those previously presented by the group (ref 29). In said work, most of the compounds exhibited activities with IC50 values between 1.3 and 3 μM, with inhibition values at 10 μM ranging between 80 and 90% in the SEAP assay. These results are similar to those observed in this paper for the same assay. Can an explanation be found?

      Chalcones are inherently flexible molecules, giving them a high chance of occupying critical hotspot residues within the binding interface of ELF3-MED23, irrespective of the side chains introduced to this moiety. However, depending on the type of side chains introduced, the overall drug-like properties of compounds can be significantly altered, while still maintaining their PPI inhibitory effect. The significance of this study lies in our effort to enhance metabolic stability through extensive introduction of methoxy groups and other hydrophobic side chains to the chalcone skeleton, while preserving high PPI inhibitory activity.

      (3) Is the replacement of H and OH by OMe necessary? Does it improve any property (activity, selectivity, bioavailability, solubility, etc.)? Regarding the derivatives of group 2, why did they decide to replace the O-H, which in silico demonstrated favorable hydrogen bond interactions with Asp400? How do these molecules look in the binding site? Perhaps this is a point to discuss since the substitution of OH led to the obtaining of inactive molecules, or is the effect due to substitution with the terminal aromatic ring with 3 OMe?

      We modified the hydroxyl group moiety of YK-1 into a methoxy group to reduce the polarity of the compound, thereby enhancing its cell membrane permeability (Author response image 3) and reducing the likelihood of rapid elimination through phase II metabolic pathways in vivo. Additionally, we considered the potential conversion of the methoxy group back to a hydroxyl group via phase I metabolism in vivo.

      Author response image 3.

      Impact of methoxy group introduction on the TPSA (topological polar surface area) of each molecule. The TPSA of each molecule containing the chalcone structure was calculated using the Molinspiration webserver.

      (4) Lines 134 and 134: "Only compounds are in red."

      We revised the manuscript following the reviewer's comments.

      (5) Line 171: "Chalcone skeleton, shown in red."

      We revised the manuscript following the reviewer's comments.

      (6) Line 350: "N-1-acetyl-4,5-dihydropyrazoline."

      We revised the manuscript following the reviewer's comments.

      (7) Scheme 1. Replace "h" with "hr".

      We revised the manuscript following the reviewer's comments. Scheme 1 has been replaced by a new version.

      (8) Where is "Table S1" in SI?

      Tables S1 and S2 are supposed to be included in SI. We will ensure that Tables S1 and S2 are properly uploaded to the SI section.

      (9) In Figure 6, Graph D, to enhance comprehension, please incorporate red arrows indicating drug administration.

      We revised Figure 6 (D) following the reviewer's comments. Red arrows indicating drug administration have been incorporated, along with a descriptive comment "Drug administration" next to each arrow. Additionally, the figure legend now includes a clear description of these additions.

      Reviewer #3 (Public review):

      Weaknesses:

      The potency of compound 10 as a PPI inhibitor has been shown in only one cell line, NCI-N87.

      Reviewer #3 (Recommendations For the Authors):

      (1) The authors should show this compound 10 is effective in other gastric cancer cells like KATOIII, SNU1.

      We evaluated the HER2 downregulating activity of compound 10 in the gastric cancer cell line SNU216, which is confirmed to express a high level of HER2 protein (Author response image 4).

      Author response image 4.

      HER2 downregulatory effect of compound 10 in HER2-positive gastric cancer cell line, SNU216. (A) Expression levels of HER2 and ELF3 in various gastric cancer cell lines. (B) HER2 downregulation in the SNU216 cell line following treatment with compound 10.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Kim et al. describes a role for axonal transport of Wnd (a dual leucine zipper kinase) for its normal degradation by the Hiw ubiquitin ligase pathway. In Hiw mutants, the Wnd protein accumulates dramatically in nerve terminals compared to the cell body of neurons. In the absence of axonal transport, Wnd levels rise and lead to excessive JNK signaling that makes neurons unhappy.

      Strengths:

      Using GFP-tagged Wnd transgenes and structure-function approaches, the authors show that palmitoylation of the protein at C130 plays a role in this process by promoting golgi trafficking and axonal localization of the protein. In the absence of this transport, Wnd is not degraded by Hiw. The authors also identify a role for Rab11 in the transport of Wnd, and provide some evidence that Rab11 loss-of-function neuronal degenerative phenotypes are due to excessive Wnd signaling. Overall, the paper provides convincing evidence for a preferential site of action for Wnd degradation by the Hiw pathway within axonal and/or synaptic compartments of the neuron. In the absence of Wnd transport and degradation, the JNK pathway becomes hyperactivated. As such, the manuscript provides important new insights into compartmental roles for Hiw-mediated Wnd degradation and JNK signaling control.

      Weaknesses:

      It is unclear if the requirement for Wnd degradation at axonal terminals is due to restricted localization of HIW there, but it seems other data in the field argues against that model. The mechanistic link between Hiw degradation and compartmentalization is unknown. 

      We thank the Reviewer for valuable comments. In our revised manuscript, we have addressed the reviewer’s comments and clarified points of confusion. We did not intend to imply that Rab11 directly mediates anterograde Wnd protein transport towards axon terminals. We re-worded related text throughout our manuscript to avoid confusion. Additionally, to strengthen the link between Rab11 and Wnd, we have added data showing that a heterozygous mutation of wnd rescues the eye degeneration phenotypes caused by Rab11 loss-of-function (new Figure 7C).

      It is unclear if the requirement for Wnd degradation at axonal terminals is due to restricted localization of HIW there, but it seems other data in the field argues against that model. The mechanistic link between Hiw degradation and compartmentalization is unknown.

      We believe that a mechanistic understanding of how Wnd protein turnover is restricted to axons and axon terminals is beyond the scope of the current manuscript. We are actively investigating this interesting research question – please see our point-by-point response for details.

      Reviewer #2 (Public Review):

      Summary:

      Utilizing transgene expression of Wnd in sensory neurons in Drosophila, the authors found that Wnd is enriched in axonal terminals. This enrichment could be blocked by preventing palmitoylation or inhibiting Rab1 or Rab11 activity. Indeed, subsequent experiments showed that inhibiting Wnd can prevent toxicity by Rab11 loss of function.

      Strengths:

      This paper evaluates in detail Wnd location in sensory neurons, and identifies a novel genetic interaction between Rab11 and Wnd that affects Wnd cellular distribution.

      Weaknesses:

      The authors report low endogenous expression of wnd, and expressing mutant hiw or overexpressing wnd is necessary to see axonal terminal enrichment. It is unclear if this overexpression model (which is known to promote synaptic overgrowth) would be relevant to normal physiology.

      We agree that most of our subcellular localization studies were conducted using transgenes, which may not accurately reflect endogenous protein localization. Despite this technical limitation, our work addresses an important mechanistic link between DLK’s axonal localization and protein turnover in neuronal stress signaling and neurodegeneration.

      Additionally, most of our experiments were done using a kinase-dead form of Wnd or with DLKi treatment (DLK kinase inhibitor). Neurons do not display synaptic overgrowth phenotypes under these experimental conditions. Thus, the changes in Wnd axonal localization are likely independent of synaptic overgrowth phenotypes.

      Palmitoylation of the Wnd orthologue DLK in sensory neurons has previously been identified as important for DLK trafficking in a cell culture model.

      Palmitoylation of DLK has been studied in previous works including Holland et al. 2015. These are important works. However, there are significant differences from our findings. First, inhibiting DLK palmitoylation caused cytoplasmic localization of DLK. It has been reported that expression levels of wild-type and the palmitoylation-defective DLK (DLK-CS) in axons are not different in cultured sensory neurons (Holland 2015, Figure 2A and 2B). This could be simply because DLK-CS is entirely cytoplasmic and can readily diffuse into axons – which led to the conclusion that DLK palmitoylation is essential for DLK localization on motile axonal puncta. Second, because of this cytoplasmic localization, DLK-CS failed to induce downstream signaling (Holland 2015).

      However, the behavior of Wnd-CS in our study is entirely different. Wnd-CS does not show diffuse cytoplasmic localization, but rather shows discrete localization in neuronal cell bodies (Figure 2E, Figure 2-supplement 1). Furthermore, Wnd-CS is able to induce downstream signaling (Figure 4 – supplement 1 and 2). Thus, our manuscript is not an extension of previously published work. Rather, our manuscript took advantage of this unique behavior of Wnd-CS and elucidated the biological function of the axonal localization of Wnd.

      The authors find genetic interaction between Wnd and Rab11, but these studies are incomplete and they do not support the authors' mechanistic interpretation.

      Our model proposes that Wnd is constantly transported to axon terminals for protein degradation (protein turnover), and that this process is essential to keep Wnd activity at low levels and prevent unwanted neuronal stress signaling. Based on this model, a failure in Wnd transport to axon terminals – as seen in Wnd-C130S or with Rab11 loss-of-function – would compromise protein degradation of Wnd and hence result in an excessive abundance of Wnd protein. This was clearly demonstrated for Wnd-C130S (Figure 3) and for Rab11 mutants (Figure 6E), which supports our model.

      To strengthen the link between Rab11 and Wnd, we have added data in our revised manuscript showing that a heterozygous mutation of wnd significantly rescued the eye degeneration phenotypes caused by Rab11 loss-of-function (new Figure 7C).

      We did not intend to imply that Rab11 directly mediates anterograde Wnd protein transport towards axon terminals. We re-worded related text throughout our manuscript to avoid confusion.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) It would be interesting to overexpress Hiw in C4da neurons to see if this can degrade the C130S Wnd protein and reduce ERK signaling, or overexpress Hiw in the Rab11 mutant background to see if this can reduce the accumulation of Wnd or total Wnd levels. This could address the question of whether the reduction in Wnd turnover is due to Hiw's inaccessibility to Wnd.

      Thank you for your comment. We believe this question warrants an independent line of study. Although this is beyond the scope of the current work, we would like to share our findings here. We have found that overexpressing Hiw did not suppress the transgenic expression of Wnd-KD in C4da neurons regardless of cellular location. Interestingly, however, the same Hiw overexpression suppressed the increase in Wnd-KD expression caused by hiw mutations in C4da neuron axon terminals. Thus, it seems that endogenous levels of Hiw in wild-type were sufficient to suppress transgenic expression of Wnd-KD, and that excessive Hiw expression does not further enhance this effect. Currently, we do not know the mechanisms underlying these observations. One possibility is that Hiw functions exclusively in the context of an E3 ubiquitin ligase complex. Wu et al. (2007) found that DFsn is synaptically enriched and acts as an F-box protein of the Hiw E3 ligase complex. It is possible that DFsn or some other component of the Hiw E3 ligase complex determines the subcellular specificity of Hiw function. We are actively pursuing this research question.

      (2) The authors claim that Rab11 transports Wnd to the axon terminals. However, they do not see reliable colocalization of Rab11 and Wnd at axon terminals. Can the authors see Rab11-enriched vesicles with Wnd in nerve bundles, or is the role only to sort Wnd onto a post-recycling endosome compartment that moves to axonal terminals without Rab11?

      We apologize for the confusion. We did not intend to claim that Rab11 directly transports Wnd along axons. We suggested that Rab11 is necessary for axonal localization of Wnd by acting at the somatic recycling endosomes since Rab11 and Wnd extensively colocalize in the cell body but not in the axon terminals (Figure 6 and Figure 6 supplement 1). In our new “Figure 6 supplement 1”, we have now added Rab11 and Wnd colocalization in axons (segmental nerves). We also revised the text (line 294-298) “On the other hand, we did not detect any meaningful colocalization between YFP::Rab11 and Wnd-KD::mRFP in C4da axon terminals or in axons (Manders’ coefficient 0.34 ± 0.14 and 0.41 ± 0.10 respectively) (Figure 6 – supplement 1). These suggest that Rab11 is involved in Wnd protein sorting at the somatic REs rather than transporting Wnd directly.” And in Discussion (line 396-398) “These further suggest that Rab11 is not directly involved in the anterograde long-distance transport of Wnd proteins, rather is responsible for sorting Wnd into the axonal anterograde transporting vesicles.”.
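      For readers unfamiliar with the Manders’ coefficients quoted in the response above, the following is a minimal sketch of how such coefficients are commonly computed from two intensity channels. The function name, the zero thresholds, and the toy pixel values are illustrative assumptions; this is not the authors’ analysis pipeline, which is not described here.

```python
# Minimal sketch of Manders' colocalization coefficients for two channels.
# Inputs are flat sequences of background-subtracted pixel intensities.
# Names, thresholds, and values are hypothetical, not from the manuscript.

def manders_coefficients(channel_a, channel_b, threshold_a=0.0, threshold_b=0.0):
    """Return (M1, M2).

    M1: fraction of channel A intensity found in pixels where B > threshold_b.
    M2: fraction of channel B intensity found in pixels where A > threshold_a.
    """
    if len(channel_a) != len(channel_b):
        raise ValueError("channels must have the same number of pixels")
    total_a = sum(channel_a)
    total_b = sum(channel_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("a channel has no signal")
    m1 = sum(a for a, b in zip(channel_a, channel_b) if b > threshold_b) / total_a
    m2 = sum(b for a, b in zip(channel_a, channel_b) if a > threshold_a) / total_b
    return m1, m2


if __name__ == "__main__":
    # Toy four-pixel "images": channel A could stand for Wnd-KD::mRFP and
    # channel B for YFP::Rab11 (purely illustrative numbers).
    wnd = [10.0, 20.0, 30.0, 40.0]
    rab11 = [0.0, 5.0, 0.0, 5.0]
    print(manders_coefficients(wnd, rab11))
```

      Values near 1 indicate that most of one channel’s signal lies in pixels occupied by the other channel, while lower values indicate weaker overlap; the result depends on how the thresholds are chosen.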

      (3) The authors mis-cite the Tortosa et al 2022 study which shows the exact opposite of what the authors state. Tortosa et al show DLK recruitment to vesicles through phosphorylation and palmitoylation is essential for its signaling, not the opposite, so the authors should reword that or remove the citation.

      We believe the citation is correct. Tortosa et al (2022) “Stress‐induced vesicular assemblies of dual leucine zipper kinase are signaling hubs involved in kinase activation and neurodegeneration” describes that membrane association of DLK rather than palmitoylation itself is sufficient for DLK signaling activation. This is achieved by DLK palmitoylation for mammalian DLK. However, when artificially targeted to cellular membranes, palmitoylation defective DLK (mammalian DLK-CS in their study) was able to induce DLK signaling. Specifically, in their Figure 2 (K-N), when targeted to the intracellular membranes of ER and mitochondria, DLK-CS (palmitoylation defective DLK) elicited DLK signaling as shown by c-Jun phosphorylation.

      Reviewer #2 (Recommendations For The Authors):

      Major Concerns:

      (1) A concern is the overinterpretation of results. The authors find the accumulation of Wnd in axon terminals when they express hiw null or when they overexpress Wnd, but extrapolate that this occurs in "normal conditions" without evidence. Could the increase of Wnd in the axonal terminal be in the setting of known synaptic overgrowth associated with transgene expression?

      Most of our work was conducted using a kinase-dead version of Wnd (Wnd-KD) in a wild-type background (Figure 1C and Figure 1 supplement 1). Moreover, Wnd kinase activity does not affect Wnd axonal localization in our experimental settings (Figure 1 supplement 1).

      When using the hiw mutant background, the larvae were treated with a Wnd kinase inhibitor, which prevented excessive axonal growth (Figure 1E, bottom right image – note that there is no axonal overgrowth in this condition). Additionally, Wnd-C130S is expressed at lower levels in axon terminals than Wnd (Figure 3B) while exhibiting similar axon overgrowth (Figure 4 supplement 1B). Taken together, axonal overgrowth is unlikely to affect axonal protein localization of Wnd.

      (2) The interpretation of these results is based on a supposition that Rab11 anterogradely transports Wnd along axons without evidence for this. Indeed, it has been shown that Rab11 is excluded from axons in mature neurons, but can be mislocalized when overexpressed. This should be addressed in their discussion.

      We apologize for the confusion. We did not intend to suggest that Rab11 directly transports Wnd along axons. We suggested that Rab11 is necessary for axonal localization of Wnd by acting at the somatic recycling endosomes since Rab11 and Wnd extensively colocalize in the cell body but not in the axon terminals (Figure 6 and Figure 6 supplement 1). In our new “Figure 6 supplement 1”, we have now added Rab11 and Wnd colocalization in axons (segmental nerves). We also revised the text (line 296-298) “On the other hand, we did not detect any meaningful colocalization between YFP::Rab11 and Wnd-KD::mRFP in C4da axon terminals or in axons (Manders’ coefficient 0.34 ± 0.14 and 0.41 ± 0.10 respectively) (Figure 6 – supplement 1). These suggest that Rab11 is involved in Wnd protein sorting at the somatic REs rather than transporting Wnd directly.” And in Discussion (line 396-398) “These further suggest that Rab11 is not directly involved in the anterograde long-distance transport of Wnd proteins, rather is responsible for sorting Wnd into the axonal anterograde transporting vesicles.”.

      (3) In Figure 1, the authors should also show images of Wnd-GFSTF in wild-type (non-hiw mutations) to show endogenous Wnd levels in the axon terminal.

      We have now added the figures of Wnd-GFSTF in wild-type (new Figure 1A). To show the comparable fluorescent intensities, we also re-performed hiw mutant experiment and replaced the old images.

      (4) For Figure 1- Supplement, the authors state that the kinase-dead version of Wnd exhibited similar axonal enrichment in comparison to Wnd::GFP in the presence and absence of DLKi. This statement would be better supported with images specifically showing this (for example Wnd-KD::GFP compared to Wnd:GFP with DLKi and Wnd:GFP without DLKi).

      We did not show the images from Wnd::GFP (DLKi) in this supplement figure because it would be redundant with Figure 1C. Rather, we presented the axonal enrichment index for Wnd::GFP (DLKi), Wnd-KD::GFP, Wnd-KD::GFP (DLKi), and Wnd-KD::GFP (DMSO) in Figure 1 supplement 1B.

      Overexpressing catalytically active Wnd dramatically lowers ppk-GAL4 activity in C4da neurons, which prevents us from performing the experiment for Wnd::GFP without DLKi. In this condition, Wnd::GFP expression is barely detectable in C4da neurons.

      (5) In Figure 2 - Supplement 3 the authors state that their data suggests that Wnd protein palmitoylation is catalyzed by HIP14 due to colocalization in the somatic Golgi and mutating HIP14 leads to less Wnd in the axon terminal. This statement would be better supported by evaluating Wnd's palmitoylation via immunoprecipitation in response to dHIP14 enzyme activity.

      We appreciate the reviewer’s comment. Although the exact identity of the Wnd palmitoyltransferase might be of high interest, our study concerns the biological role of Wnd axonal localization. Moreover, the DLK palmitoyltransferase has been identified in mammalian cell culture and worm studies (Niu et al. 2020 “Coupled Control of Distal Axon Integrity and Somal Responses to Axonal Damage by the Palmitoyl Acyltransferase ZDHHC17”). ZDHHC17 is another name for HIP14. Our data, together with these published works, strongly suggest that Wnd, the Drosophila DLK, might also be targeted by Drosophila HIP14 (dHIP14).

      (6) The authors argue that palmitoylation of Wnd is essential for axonal localization of Wnd. If dHIP14 indeed palmitoylates Wnd as the authors claim, shouldn't there be a decrease in Wnd's palmitoylation within dHIP14 mutants, consequently resulting in its accumulation in the cell body rather than localization in the axonal terminal? However, Wnd is reduced at the axon terminal in dHip14 mutants, but it does not appear to increase in the cell body (Figure 2S3.C). This observation contradicts the results showing increased Wnd in the cell body presented in Figure 2. B and E. This discrepancy should be addressed.

      Thank you for your comment. Our study concerns the biological role of Wnd axonal localization. In an ideal model, dHIP14 mutations should prevent Wnd palmitoylation and cause subsequent cell body accumulation. However, it is highly likely that dHIP14 mutations affect the palmitoylation of a large number of proteins, not just Wnd, which likely changes many aspects of cell function. We envision that Wnd protein expression might be indirectly affected by these changes. In this context, mutating C130 in Wnd can be considered a more targeted approach, and our data clearly show that such mutations cause Wnd to accumulate in cell bodies.

      (7) Figure 3 - the authors show increased Wnd protein by Western blot in WndC130S:GFP compared to Wnd::GFP. qPCR experiments to show similar mRNA expression of these two transgenes would be an important control, if it's thought that the increase of protein is due to reduction of protein degradation.

      Thank you for your comment. WndC130S::GFP and Wnd::GFP were expressed using the GAL4-UAS system – not through the endogenous wnd promoter. Thus, we do not expect different mRNA abundance of WndC130S::GFP and Wnd::GFP. However, your concern is valid for Rab11 mutants. We measured wnd mRNA abundance by RT-qPCR and found that Rab11 mutations did not increase wnd mRNA levels (Figure 6 - Supplement 2). Rather, we observed a consistent reduction in wnd mRNA levels in Rab11 mutants. Please note that total Wnd protein levels were significantly increased by Rab11 mutations. We currently do not have a clear explanation. We envision that the dramatic increase in Wnd signaling (i.e., JNK signaling, Figure 7A) induces negative feedback that reduces wnd mRNA levels (line 313-317).

      (8) Figure 4 Supplement - the authors report that Wnd::GFP causes robust induction of Puc-LacZ. A control without Wnd::GFP expression would be necessary to support that there was an induction.

      We have added control data for UAS-Wnd-KD::GFP (new Figure 4 supplement 1A). Since this required a new side-by-side comparison of fluorescent intensities, we re-performed the full set of experiments and replaced our old data sets. The results confirmed that both Wnd::GFP and Wnd-C130S::GFP induce puc-lacZ expression.

      (9) Previously it was shown that inhibiting palmitoylation of DLK prevented activation of JNK signaling (Holland et al 2015), but the authors show in Figure 4A instead an increase of JNK signaling. This discrepancy should be addressed.

      The use of a palmitoylation-defective Wnd mutant in our study was only possible because Wnd-C130S behaves differently from palmitoylation-defective DLK. Unlike the diffuse cytoplasmic localization of palmitoylation-defective DLK in mammalian cells or in C. elegans neurons, Wnd-C130S exhibited clear puncta localization in neuronal cell bodies, where it extensively co-localizes with the somatic Golgi complex (Figure 2E and Figure 2 supplement 1). Tortosa et al (2022) showed that palmitoylation-defective DLK (DLK-CS) can trigger DLK signaling when artificially targeted to intracellular membranous organelles (Tortosa 2022, Figure 2 (K-N)). Thus, we reasoned that, unlike palmitoylation-defective DLK from mammals and worms, the Drosophila DLK, Wnd, might be catalytically active when mutated at Cysteine 130 because of its puncta localization.

      (10) Figure 6 Supplement - the Rab11 staining is not in a pattern that would be expected with endosomes. A control of just YFP would be useful to determine if this fluorescence signal is specific to Rab11. Can endogenous Rab11 be detected in axons or in the axonal terminal?

      In our model system, endogenously tagged Rab11 (TI-Rab11) does not show clear puncta patterns in segmental nerves (axons) or neuropils (axon terminals), nor does it colocalize with Wnd-KD. This is indeed related to the reviewer’s comment #2, which suggests that Rab11 does not form endosomes in distal axons or axon terminals in mature neurons. Expressing Rab11 transgenes produced some puncta structures in axons (segmental nerves) (new Figure 6 supplement 1). However, they did not show meaningful colocalization with Wnd-KD. These observations are consistent with our model that Rab11 acts in neuronal cell bodies for Wnd axonal transport – likely via a sorting process.

      (11) There is growing evidence that palmitoylation is important for cargo sorting in the Golgi, and Rab11 is also located at the Golgi and important for trafficking from the Golgi. A mechanism that could be considered from your data is that blocking palmitoylation impairs sorting at the Golgi and trafficking from the Golgi, as opposed to impairing fast axonal transport. Indeed, Rab11 has been shown to be blocked from axons in mature neurons, making Rab11 unlikely to be responsible for the fast axonal transport of Wnd. Direct evidence of Rab11 transporting Wnd in axons would be necessary for the claim that Rab11 constantly transports DLK to terminals.

      We apologize for the confusion. We did not intend to suggest that Rab11 directly transports Wnd along the axons. We suggested that Rab11 is necessary for axonal localization of Wnd by acting at the somatic recycling endosomes since Rab11 and Wnd extensively colocalize in the cell body but not in the axon terminals (Figure 6 and Figure 6 supplement 1). In our new “Figure 6 supplement 1”, we have now added Rab11 and Wnd colocalization in axons (segmental nerves). We also revised the text (line 296-298) “On the other hand, we did not detect any meaningful colocalization between YFP::Rab11 and Wnd-KD::mRFP in C4da axon terminals or in axons (Manders’ coefficient 0.34 ± 0.14 and 0.41 ± 0.10 respectively) (Figure 6 – supplement 1). These suggest that Rab11 is involved in Wnd protein sorting at the somatic REs rather than transporting Wnd directly.” And in Discussion (line 394-398) “These further suggest that Rab11 is not directly involved in the anterograde long-distance transport of Wnd proteins, rather is responsible for sorting Wnd into the axonal anterograde transporting vesicles.”.

    2. eLife assessment

      This important manuscript shows that axonal transport of Wnd is required for its normal degradation by the Hiw ubiquitin ligase pathway. In Hiw mutants, the Wnd protein accumulates in nerve terminals. In the absence of axonal transport, Wnd levels also rise and lead to excessive JNK signaling, disrupting neuronal function. These are interesting findings supported by convincing data. However, how Rab11 is involved in Golgi processing or axonal transport of Wnd is not resolved as it is clear that Rab11 is not travelling with Wnd to the axon.

    3. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Kim et al. describes a role for axonal transport of Wnd (a dual leucine zipper kinase) for its normal degradation by the Hiw ubiquitin ligase pathway. In Hiw mutants, the Wnd protein accumulates dramatically in nerve terminals compared to the cell body of neurons. In the absence of axonal transport, Wnd levels rise and lead to excessive JNK signaling that makes neurons unhappy.

      Strengths:

      Using GFP-tagged Wnd transgenes and structure-function approaches, the authors show that palmitoylation of the protein at C130 plays a role in this process by promoting Golgi trafficking and axonal localization of the protein. In the absence of this transport, Wnd is not degraded by Hiw. The authors also identify a role for Rab11 in the transport of Wnd, and provide some evidence that Rab11 loss-of-function neuronal degenerative phenotypes are due to excessive Wnd signaling. Overall, the paper provides convincing evidence for a preferential site of action for Wnd degradation by the Hiw pathway within axonal and/or synaptic compartments of the neuron. In the absence of Wnd transport and degradation, the JNK pathway becomes hyperactivated. As such, the manuscript provides important new insights into compartmental roles for Hiw-mediated Wnd degradation and JNK signaling control.

      Weaknesses:

      It is unclear if the requirement for Wnd degradation at axonal terminals is due to restricted localization of Hiw there, but it seems other data in the field argue against that model. The mechanistic link between Hiw degradation and compartmentalization is unknown.

    4. Reviewer #2 (Public Review):

      Summary:

      Utilizing transgene expression of Wnd in sensory neurons in Drosophila, the authors found that Wnd is enriched in axonal terminals. This enrichment could be blocked by preventing palmitoylation or inhibiting Rab1 or Rab11 activity. Indeed, subsequent experiments showed that inhibiting Wnd can prevent toxicity by Rab11 loss of function.

      Strengths:

      This paper evaluates in detail Wnd location in sensory neurons, and identifies a novel genetic interaction between Rab11 and Wnd that affects Wnd cellular distribution.

      Weaknesses:

      The authors report low endogenous expression of wnd, and expressing mutant hiw or overexpressing wnd is necessary to see axonal terminal enrichment. It is unclear if this overexpression model (which is known to promote synaptic overgrowth) would be relevant to normal physiology.

      Palmitoylation of the Wnd orthologue DLK in sensory neurons has previously been identified as important for DLK trafficking in a cell culture model.

    1. eLife assessment

      This study presents valuable findings from an observational dataset in a riverine ecosystem about the effects of genetic and species diversity, across multiple trophic levels, on ecosystem functions. However, the support for these findings is currently incomplete because raw data are not provided and there is insufficient information in the manuscript for readers to understand and assess the statistical analyses and conclusions. The work will be of broad interest to ecologists.

    2. Reviewer #1 (Public review):

      Summary:

      This work used a comprehensive dataset to compare the effects of species diversity and genetic diversity within each trophic level and across three trophic levels. The results showed that species diversity had negative effects on ecosystem functions, while genetic diversity had positive effects. These effects were observed only within each trophic level and not across the three trophic levels studied. Although the effects of biodiversity, especially genetic diversity across multi-trophic levels, have been shown to be important, there are still very few empirical studies on this topic due to the complex relationships and difficulty in obtaining data. This study collected an excellent dataset to address this question, enhancing our understanding of genetic diversity effects in aquatic ecosystems.

      Strengths:

      The study collected an extensive dataset that includes species diversity of primary producers (riparian trees), primary consumers (macroinvertebrate shredders), and secondary consumers (fish). It also includes the genetic diversity of the dominant species at each trophic level, biomass production, decomposition rates, and environmental data.

      The conclusions of this paper are mostly well supported by the data and the writing is logical and easy to follow.

      Weaknesses:

      While the dataset is impressive, the authors conducted analyses more akin to a "meta-analysis," leaving out important basic information about the raw data in the manuscript. Given the complexity of the relationships between different trophic levels and ecosystem functions, it would be beneficial for the authors to show the results of each SEM (structural equation model).

      The main results presented in the manuscript are derived from a "metadata" analysis of effect sizes. However, the methods used to obtain these effect sizes are not sufficiently clarified. By analyzing the effect sizes of species diversity and genetic diversity on these ecosystem functions, the results showed that species diversity had negative effects, while genetic diversity had positive effects on ecosystem functions. The negative effects of species diversity contradict many studies conducted in biodiversity experiments. The authors argue that their study is more relevant because it is based on a natural system, which is closer to reality, but they also acknowledge that natural systems make it harder to detect underlying mechanisms. Providing more results based on the raw data and offering more explanations of the possible mechanisms in the introduction and discussion might help readers understand why and in what context species diversity could have negative effects.

      Environmental variation was included in the analyses to test if the environment would modulate the effects of biodiversity on ecosystem functions. However, the main results and conclusions did not sufficiently address this aspect.

    3. Reviewer #2 (Public review):

      Summary:

      Fargeot et al. investigated the relative importance of genetic and species diversity on ecosystem function and examined whether this relationship varies within or between trophic-level responses. To do so, they conducted a well-designed field survey measuring species diversity at 3 trophic levels (primary producers [trees], primary consumers [macroinvertebrate shredders], and secondary consumers [fishes]), genetic diversity in a dominant species within each of these 3 trophic levels and 7 ecosystem functions across 52 riverine sites in southern France. They show that the effect of genetic and species diversity on ecosystem functions are similar in magnitude, but when examining within-trophic level responses, operate in different directions: genetic diversity having a positive effect and species diversity a negative one. This data adds to growing evidence from manipulated experiments that both species and genetic diversity can impact ecosystem function and builds upon this by showing these effects can be observed in nature.

      Strengths:

      The study design has resulted in a robust dataset to ask questions about the relative importance of genetic and species diversity of ecosystem function across and within trophic levels.

      Overall, their data supports their conclusions - at least within the system that they are studying - but as mentioned below, it is unclear from this study how general these conclusions would be.

      Weaknesses:

      (1) While a robust dataset, the authors only show the data output from the SEM (i.e., effect size for each individual diversity type per trophic level (6) on each ecosystem function (7)), instead of showing much of the individual data. Although the summary SEM results are interesting and informative, I find that a weakness of this approach is that it is unclear how environmental factors (which were included but not discussed in the results) nor levels of diversity were correlated across sites. As species and genetic diversity are often correlated but also can have reciprocal feedbacks on each other (e.g., Vellend 2005), there may be constraints that underpin why the authors observed positive effects of one type of diversity (genetic) when negative effects of the other (species). It may have also been informative to run SEM with links between levels of diversity. By focusing only on the summary of SEM data, the authors may be reducing the strength of their field dataset and ability to draw inferences from multiple questions and understand specific study-system responses.

      (2) My understanding of SEM is it gives outputs of the strength/significance of each pathway/relationship and if so, it isn't clear why this wasn't used and instead, confidence intervals of Z scores to determine which individual BEFs were significant. In addition, an inclusion of the 7 SEM pathway outputs would have been useful to include in an appendix.

      (3) I don't fully agree with the authors calling this a meta-analysis, as this is a single study of multiple sites within a single region and a specific time point, and not a collection of multiple studies or ecosystems conducted by multiple authors. Moreso, the authors are using meta-analysis summary metrics to evaluate their data. The authors tend to focus on these patterns as general trends, but as the data is all from this riverine system this study could have benefited from focusing on what was going on in this system to underpin these patterns. I'd argue more data is needed to know whether across sites and ecosystems, species diversity and genetic diversity have opposite effects on ecosystem function within trophic levels.

    4. Reviewer #3 (Public review):

      The manuscript by Fargeot and colleagues assesses the relative effects of species and genetic diversity on ecosystem functioning. This study is very well written and examines the interesting question of whether within-species or among-species diversity correlates with ecosystem functioning, and whether these effects are consistent across trophic levels. The main findings are that genetic diversity appears to have a stronger positive effect on function than species diversity (which appears negative). These results are interesting and have value.

      However, I do have some concerns that could influence the interpretation.

      (1) Scale: the different measures of diversity and function for the different trophic levels are measured over very different spatial scales, for example, trees along 200 m transects and 15 cm traps. It is not clear whether trees 200 m away are having an effect on small-scale function.

      (2) Size of diversity gradients: More information is needed on the actual diversity gradients. One of the issues with surveys of natural systems is that they are of species that have already gone through selection filters from a regional pool, and theoretically, if the environments are similar, you should get similar sets of species, without monocultures. So, if the species diversity gradients range from say, 6 to 8 species, but genetic diversity gradients span an order of magnitude more, you can explain much more variance with genetic diversity. Related to this, species diversity effects on function are often asymptotic at high diversity and so if you are only sampling at the high diversity range, we should expect a strong effect.

      (3) Ecosystem functions: The functions are largely biomass estimates (except decomposition), and I fail to see how the biomass of a single species can be construed as an ecosystem function. Aren't you just estimating a selection effect in this case?

      Note that the article claims to be one of the only studies to look at function across trophic levels, but there are several others out there, for example:

      Li, F., Altermatt, F., Yang, J., An, S., Li, A., & Zhang, X. (2020). Human activities' fingerprint on multitrophic biodiversity and ecosystem functions across a major river catchment in China. Global change biology, 26(12), 6867-6879.

      Luo, Y. H., Cadotte, M. W., Liu, J., Burgess, K. S., Tan, S. L., Ye, L. J., ... & Gao, L. M. (2022). Multitrophic diversity and biotic associations influence subalpine forest ecosystem multifunctionality. Ecology, 103(9), e3745.

      Moi, D. A., Romero, G. Q., Antiqueira, P. A., Mormul, R. P., Teixeira de Mello, F., & Bonecker, C. C. (2021). Multitrophic richness enhances ecosystem multifunctionality of tropical shallow lakes. Functional Ecology, 35(4), 942-954.

      Wan, B., Liu, T., Gong, X., Zhang, Y., Li, C., Chen, X., ... & Liu, M. (2022). Energy flux across multitrophic levels drives ecosystem multifunctionality: Evidence from nematode food webs. Soil Biology and Biochemistry, 169, 108656.

      And the case was made strongly by:

      Seibold, S., Cadotte, M. W., MacIvor, J. S., Thorn, S., & Müller, J. (2018). The necessity of multitrophic approaches in community ecology. Trends in ecology & evolution, 33(10), 754-764.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This work used a comprehensive dataset to compare the effects of species diversity and genetic diversity within each trophic level and across three trophic levels. The results showed that species diversity had negative effects on ecosystem functions, while genetic diversity had positive effects. These effects were observed only within each trophic level and not across the three trophic levels studied. Although the effects of biodiversity, especially genetic diversity across multi-trophic levels, have been shown to be important, there are still very few empirical studies on this topic due to the complex relationships and difficulty in obtaining data. This study collected an excellent dataset to address this question, enhancing our understanding of genetic diversity effects in aquatic ecosystems.

      Strengths:

      The study collected an extensive dataset that includes species diversity of primary producers (riparian trees), primary consumers (macroinvertebrate shredders), and secondary consumers (fish). It also includes the genetic diversity of the dominant species at each trophic level, biomass production, decomposition rates, and environmental data.

      The conclusions of this paper are mostly well supported by the data and the writing is logical and easy to follow.

      Weaknesses:

      While the dataset is impressive, the authors conducted analyses more akin to a "meta-analysis," leaving out important basic information about the raw data in the manuscript. Given the complexity of the relationships between different trophic levels and ecosystem functions, it would be beneficial for the authors to show the results of each SEM (structural equation model).

      We understand the point raised by the reviewer. Our objective was to focus the Results section on the main hypotheses, and for this reason we left out the raw statistics. We can definitely show the seven individual SEMs, highlighting the major links, which may help readers understand some of the processes. This will be done in the next version of the manuscript.

      The main results presented in the manuscript are derived from a "metadata" analysis of effect sizes. However, the methods used to obtain these effect sizes are not sufficiently clarified. By analyzing the effect sizes of species diversity and genetic diversity on these ecosystem functions, the results showed that species diversity had negative effects, while genetic diversity had positive effects on ecosystem functions. The negative effects of species diversity contradict many studies conducted in biodiversity experiments. The authors argue that their study is more relevant because it is based on a natural system, which is closer to reality, but they also acknowledge that natural systems make it harder to detect underlying mechanisms. Providing more results based on the raw data and offering more explanations of the possible mechanisms in the introduction and discussion might help readers understand why and in what context species diversity could have negative effects.

      We hope you will be right. As said above, we will explore this possibility.

      Environmental variation was included in the analyses to test if the environment would modulate the effects of biodiversity on ecosystem functions. However, the main results and conclusions did not sufficiently address this aspect.

      This will be addressed by the more in-depth analysis of the individual SEMs, and we will discuss this further.

      Reviewer #2 (Public review):

      Summary:

      Fargeot et al. investigated the relative importance of genetic and species diversity on ecosystem function and examined whether this relationship varies within or between trophic-level responses. To do so, they conducted a well-designed field survey measuring species diversity at 3 trophic levels (primary producers [trees], primary consumers [macroinvertebrate shredders], and secondary consumers [fishes]), genetic diversity in a dominant species within each of these 3 trophic levels and 7 ecosystem functions across 52 riverine sites in southern France. They show that the effect of genetic and species diversity on ecosystem functions are similar in magnitude, but when examining within-trophic level responses, operate in different directions: genetic diversity having a positive effect and species diversity a negative one. This data adds to growing evidence from manipulated experiments that both species and genetic diversity can impact ecosystem function and builds upon this by showing these effects can be observed in nature.

      Strengths:

      The study design has resulted in a robust dataset to ask questions about the relative importance of genetic and species diversity of ecosystem function across and within trophic levels.

      Overall, their data supports their conclusions - at least within the system that they are studying - but as mentioned below, it is unclear from this study how general these conclusions would be.

      Weaknesses:

      (1) While a robust dataset, the authors only show the data output from the SEM (i.e., effect size for each individual diversity type per trophic level (6) on each ecosystem function (7)), instead of showing much of the individual data. Although the summary SEM results are interesting and informative, I find that a weakness of this approach is that it is unclear how environmental factors (which were included but not discussed in the results) nor levels of diversity were correlated across sites. As species and genetic diversity are often correlated but also can have reciprocal feedbacks on each other (e.g., Vellend 2005), there may be constraints that underpin why the authors observed positive effects of one type of diversity (genetic) when negative effects of the other (species). It may have also been informative to run SEM with links between levels of diversity. By focusing only on the summary of SEM data, the authors may be reducing the strength of their field dataset and ability to draw inferences from multiple questions and understand specific study-system responses.

      We will address this issue by performing a more in-depth analysis of each individual SEM and by providing these raw data directly. Regarding the comment on species-genomic diversity correlations (SGDCs), we would like to point out that this has already been addressed in a previous paper (Fargeot et al. Oikos, 2023). There are actually no correlations between genomic and species diversity in this dataset, which is simply explained by the selection of the sampling sites. The relationships between species diversity, genomic diversity and environmental factors are also detailed in Fargeot et al. (2023). We published that paper first precisely so that we could focus here “only” on BEFs. But we realize we need to provide further information and discuss these issues further. This will be done in the next version of the manuscript.

      (2) My understanding of SEM is it gives outputs of the strength/significance of each pathway/relationship and if so, it isn't clear why this wasn't used and instead, confidence intervals of Z scores to determine which individual BEFs were significant. In addition, an inclusion of the 7 SEM pathway outputs would have been useful to include in an appendix.

      Yes, we can provide p-values. The p-values will provide the same information as the 95% CIs; both yield very similar (if not exactly the same) results and conclusions. We will provide the 7 SEMs in the Appendices.
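
      The equivalence between p-values and 95% confidence intervals can be illustrated with a minimal sketch. It assumes a Wald-type interval built on a normal approximation, which may differ from how the intervals in the manuscript were actually computed; the function names and the example estimates are hypothetical.

```python
import math

# Two-sided 95% critical value of the standard normal distribution.
Z_CRIT_95 = 1.959963984540054


def wald_summary(estimate, std_error, z_crit=Z_CRIT_95):
    """Return (z, two-sided p-value, CI lower, CI upper) for an estimate,
    assuming it is approximately normally distributed."""
    z = estimate / std_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    lower = estimate - z_crit * std_error
    upper = estimate + z_crit * std_error
    return z, p_value, lower, upper


def ci_excludes_zero(lower, upper):
    """True when the whole interval lies on one side of zero."""
    return lower > 0.0 or upper < 0.0


if __name__ == "__main__":
    # Made-up effect sizes and standard errors, for illustration only.
    for est, se in [(0.30, 0.10), (0.15, 0.10), (-0.25, 0.12)]:
        z, p, lower, upper = wald_summary(est, se)
        print(f"z={z:.2f} p={p:.4f} CI=({lower:.3f}, {upper:.3f}) "
              f"excludes zero: {ci_excludes_zero(lower, upper)}")
```

      Under these assumptions, an interval excludes zero exactly when the two-sided p-value falls below 0.05, so the two criteria identify the same set of significant effects.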

      (3) I don't fully agree with the authors calling this a meta-analysis, as this is a single study of multiple sites within a single region and a specific time point, and not a collection of multiple studies or ecosystems conducted by multiple authors. Moreso, the authors are using meta-analysis summary metrics to evaluate their data. The authors tend to focus on these patterns as general trends, but as the data is all from this riverine system this study could have benefited from focusing on what was going on in this system to underpin these patterns. I'd argue more data is needed to know whether across sites and ecosystems, species diversity and genetic diversity have opposite effects on ecosystem function within trophic levels.

      We agree. “Meta-regression” would perhaps be more adequate than “meta-analysis”. As said above, more details will be provided in the next version of the manuscript.

      Reviewer #3 (Public review):

      The manuscript by Fargeot and colleagues assesses the relative effects of species and genetic diversity on ecosystem functioning. This study is very well written and examines the interesting question of whether within-species or among-species diversity correlates with ecosystem functioning, and whether these effects are consistent across trophic levels. The main findings are that genetic diversity appears to have a stronger positive effect on function than species diversity (which appears negative). These results are interesting and have value.

      However, I do have some concerns that could influence the interpretation.

      (1) Scale: the different measures of diversity and function for the different trophic levels are measured over very different spatial scales, for example, trees along 200 m transects and 15 cm traps. It is not clear whether trees 200 m away are having an effect on small-scale function.

      Tree identification and invertebrate (and fish) sampling are done at the same scale. Trees are spread along the river so that their leaves fall directly into the river. Traps were installed all along the same transect in various micro-habitats. Diversity has been measured at the exact same scale for all organisms. We will try to be more precise.

      (2) Size of diversity gradients: More information is needed on the actual diversity gradients. One of the issues with surveys of natural systems is that they are of species that have already gone through selection filters from a regional pool, and theoretically, if the environments are similar, you should get similar sets of species, without monocultures. So, if the species diversity gradients range from say, 6 to 8 species, but genetic diversity gradients span an order of magnitude more, you can explain much more variance with genetic diversity. Related to this, species diversity effects on function are often asymptotic at high diversity and so if you are only sampling at the high diversity range, we should expect a strong effect.

      We will provide more information. The range of diversity also varies according to the trophic level; there are more invertebrate species than fish species. But overall the range of species numbers is large.

      (3) Ecosystem functions: The functions are largely biomass estimates (except decomposition), and I fail to see how the biomass of a single species can be construed as an ecosystem function. Aren't you just estimating a selection effect in this case?

      The biomass estimated for a certain area represents an estimate of productivity, whatever the number of species being considered. Obviously, the productivity of a species can be due to environmental constraints; the biomass is expected to be lower at the niche margin (selection effect). But if these environmental effects are taken into account (which is the case in the SEMs), then the residual variation can be explained by biodiversity effects. We will try to make this clearer.

      Note that the article claims to be one of the only studies to look at function across trophic levels, but there are several others out there, for example:

      Thanks, we will cite some of these studies (and make our claim less strong).

      Li, F., Altermatt, F., Yang, J., An, S., Li, A., & Zhang, X. (2020). Human activities' fingerprint on multitrophic biodiversity and ecosystem functions across a major river catchment in China. Global change biology, 26(12), 6867-6879.

      Luo, Y. H., Cadotte, M. W., Liu, J., Burgess, K. S., Tan, S. L., Ye, L. J., ... & Gao, L. M. (2022). Multitrophic diversity and biotic associations influence subalpine forest ecosystem multifunctionality. Ecology, 103(9), e3745.

      Moi, D. A., Romero, G. Q., Antiqueira, P. A., Mormul, R. P., Teixeira de Mello, F., & Bonecker, C. C. (2021). Multitrophic richness enhances ecosystem multifunctionality of tropical shallow lakes. Functional Ecology, 35(4), 942-954.

      Wan, B., Liu, T., Gong, X., Zhang, Y., Li, C., Chen, X., ... & Liu, M. (2022). Energy flux across multitrophic levels drives ecosystem multifunctionality: Evidence from nematode food webs. Soil Biology and Biochemistry, 169, 108656.

      And the case was made strongly by:

      Seibold, S., Cadotte, M. W., MacIvor, J. S., Thorn, S., & Müller, J. (2018). The necessity of multitrophic approaches in community ecology. Trends in ecology & evolution, 33(10), 754-764.

    1. eLife assessment

      This study provides direct evidence showing that Kv1.8 channels provide the basis for several potassium currents in the two types of sensory hair cells found in the mouse vestibular system. This is an important finding because the nature of the channels underpinning the unusual potassium conductance gK,L in type I hair cells has been under scrutiny for many years. The experimental evidence is compelling and the analysis is rigorous. The study will be of interest to cell and molecular biologists as well as vestibular and auditory neuroscientists.

    2. Reviewer #1 (Public Review):

      Summary:

      In this paper the authors provide a thorough demonstration of the role that one particular type of voltage-gated potassium channel, Kv1.8, plays in a low voltage activated conductance found in type I vestibular hair cells. Along the way, they find that this same channel protein appears to function in type II vestibular hair cells as well, contributing to other macroscopic conductances. Overall, Kv1.8 may provide especially low input resistance and short time constants to facilitate encoding of more rapid head movements in animals that have necks. Combination with other channel proteins, in different ratios, may contribute to the diversified excitability of vestibular hair cells.

      Strengths:

      The experiments are comprehensive and clearly described, both in text and in the figures. Statistical analyses are provided throughout.

      Weaknesses:

      None.

    3. Reviewer #2 (Public Review):

      The focus of this manuscript was to investigate whether Kv1.8 channels, which have previously been suggested to be expressed in type I hair cells of the mammalian vestibular system, are responsible for the potassium conductance gK,L. This is an important study because gK,L is known to be crucial for the function of type I hair cells, but the channel identity has been a matter of debate for the past 20 years. The authors have addressed this research topic by primarily investigating the electrophysiological properties of the vestibular hair cells from Kv1.8 knockout mice. Interestingly, gK,L was completely abolished in Kv1.8-deficient mice, in agreement with the hypothesis put forward by the authors based on the literature. The surprising observation was that in the absence of Kv1.8 potassium channels, the outward potassium current in type II hair cells was also largely reduced. Type II hair cells express the largely inactivating potassium conductance gK,A, but not gK,L. The authors concluded that heteromultimerization of non-inactivating Kv1.8 and the inactivating Kv1.4 subunits could be responsible for the inactivating gK,A. Overall, the manuscript is very well written and most of the conclusions are supported by the experimental work. The figures are well described, and the statistical analysis is robust.

    4. Reviewer #3 (Public Review):

      Summary:

      This paper by Martin et al. describes the contribution of a Kv channel subunit (Kv1.8, KCNA10) to voltage-dependent K+ conductances and membrane properties of type I and type II hair cells of the mouse utricle. Previous work has documented striking differences in K+ conductances between vestibular hair cell types. In particular amniote type I hair cells are known to express a non-typical low-voltage-activated K+ conductance (GK,L) whose molecular identity has been elusive. K+ conductances in hair cells from 3 different mouse genotypes (wildtype, Kv1.8 homozygous knockouts and heterozygotes) are examined here and whole cell patch-clamp recordings indicate a prominent role for Kv1.8 subunits in generating GK,L. Results also interestingly support a role for Kv1.8 subunits in type II hair cell K+ conductances; inactivating conductances in null mice are reduced in type II hair cells from striola and extrastriola regions of the utricle. Kv1.8 is therefore proposed to contribute as a pore-forming subunit for 3 different K+ conductances in vestibular hair cells. The impact of these conductances on membrane responses to current steps is studied in current clamp. Pharmacological experiments use XE991 to block some residual Kv7-mediated current in both hair cell types, but no other pharmacological blockers are used. In addition immunostaining data are presented and raise some questions about Kv7 and Kv1.8 channel localization. Overall, the data present compelling evidence that removal of Kv1.8 produces profound changes in hair cell membrane conductances and sensory capabilities. These changes at hair cell level suggest vestibular function would be compromised and further assessment in terms of balance behavior in the different mice would be interesting.

      Strengths:

      This study provides strong evidence that Kv1.8 subunits are major contributors to the unusual K+ conductance in type I hair cells of the utricle. It also indicates that Kv1.8 subunits are important for type II hair cell K+ conductances because Kv1.8-/- mice lacked an inactivating A conductance and had reduced delayed rectifier conductance compared to controls. A comprehensive and careful analysis of biophysical profiles is presented of expressed K+ conductances in 3 different mouse genotypes. Voltage-dependent K+ currents are rigorously characterized at a range of different ages and their impact on membrane voltage responses to current input is studied. Some pharmacological experiments are performed in addition to immunostaining to bolster the conclusions from the biophysical studies. The paper has a significant impact in showing the role of Kv1.8 in determining utricular hair cell electrophysiological phenotypes.

      Weaknesses:

      (1) From previous work it is known that GK,L in type I hair cells has unusual ion permeation and pharmacological properties that differ greatly from type II hair cell conductances. Notably GK,L is highly permeable to Cs+ as well as K+ ions and is slightly permeable to Na+. It is blocked by 4-aminopyridine and divalent cations (Ba2+, Ca2+, Ni2+), enhanced by external K+ and modulated by cyclic GMP. The question arises: if Kv1.8 is a major player and pore-forming subunit in type I and type II cells (and cochlear inner hair cells as shown by Dierich et al. 2020), how are subunits modified to produce channels with very different properties? A role for Kv1.4 channels (gA) is proposed in type II hair cells based on previous findings in bird hair cells. However, hair cell specific partner interactions with Kv1.8 that result in GK,L in type I hair cells and Cs+ impermeable, inactivating currents in type II hair cells remain for the most part unexplored.

      (2) Data from patch-clamp and immunocytochemistry experiments are not in close alignment. XE991 (Kv7 channel blocker) decreases remaining K+ conductance in type I and type II hair cells from null mice supporting the presence of Kv7 channels in hair cells (Fig. 7). Also, Holt et al. (2007) previously showed inhibition of GK,L in type I hair cells (but not delayed rectifier conductance in type II hair cells) using a dominant negative construct of Kv7.4 channels. However, immunolabelling indicates Kv7.4 channels on the inner face of calyx terminals adjacent to hair cells (Fig. 5). Some reconciliation of these findings is needed.

      (3) A previous paper reported that a vestibular evoked potential was abnormal in Kv1.8-/- mice (Lee et al. 2013) as briefly mentioned (lines 94-95). It would be really interesting to know if any vestibular-associated behaviors and/or hearing loss were observed in the mice populations. If responses are compromised at the sensory hair cell level across different zones, degradation of balance function would be anticipated and should be elucidated.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Line 127. Provide a few more words describing the voltage protocol. To the uninitiated, panels A and B will be difficult to understand. "The large negative step is used to first close all channels, then probe the activation function with a series of depolarizing steps to re-open them and obtain the max conductance from the peak tail current at -36 mV. "

      We have revised the text as suggested (revision lines 127-131): “From a holding potential within the gK,L activation range (here –74 mV), the cell is hyperpolarized to –124 mV, negative to EK and the activation range, producing a large inward current through open gK,L channels that rapidly decays as the channels deactivate. We use the large transient inward current as a hallmark of gK,L. The hyperpolarization closes all channels, and then the activation function is probed with a series of depolarizing steps, obtaining the max conductance from the peak tail current at –44 mV (Fig. 1A).”
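
      As a hedged illustration of the "activation function" probed by this protocol: conductance-voltage relations derived from tail currents are commonly described by a first-order Boltzmann function. The sketch below is not the authors' analysis code. The function name and all parameter values (maximum conductance, midpoint voltage, slope factor) are assumptions chosen only to show the shape of the curve.

```python
import math

def boltzmann_conductance(v_mv, g_max, v_half_mv, slope_mv):
    """First-order Boltzmann curve: G(V) = g_max / (1 + exp((V_half - V) / S))."""
    return g_max / (1.0 + math.exp((v_half_mv - v_mv) / slope_mv))

# Hypothetical parameters, for illustration only (not values from the paper).
G_MAX_NS = 100.0
V_HALF_MV = -80.0
SLOPE_MV = 5.0

# Iterated step voltages starting from the -124 mV hyperpolarization.
steps_mv = list(range(-124, -20, 10))
gv_curve = [(v, boltzmann_conductance(v, G_MAX_NS, V_HALF_MV, SLOPE_MV)) for v in steps_mv]
for v, g in gv_curve:
    print(f"{v:5d} mV  {g:7.2f} nS")
```

      At the midpoint voltage the function returns half of the maximum conductance, which is how a Vhalf value is read off such a curve.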

      Incidentally, why does the peak tail current decay? 

      We added this text to the figure legend to explain this: “For steps positive to the midpoint voltage, tail currents are very large. As a result, K+ accumulation in the calyceal cleft reduces driving force on K+, causing currents to decay rapidly, as seen in A (Lim et al., 2011).”

      The decay of the peak tail current is a feature of gK,L (large K+ conductance) and the large enclosed synaptic cleft (which concentrates K+ that effluxes from the HC). See Govindaraju et al. (2023) and Lim et al. (2011) for modeling and experiments around this phenomenon.

      Lines 217-218. For some reason, I stumbled over this wording. Perhaps rearrange as "In type II HCs, absence of Kv1.8 significantly increased Rin and tauRC. There was no effect on Vrest because the conductances to which Kv1.8 contributes, gA and gDR, activate positive to the resting potential." (So which K conductances establish Vrest???) 

      We kept our original wording because we wanted to discuss the baseline (Vrest) before describing responses to current injection.

      ->Vrest is presumably maintained by ATP-dependent Na/K exchangers (ATP1a1), HCN, Kir, and mechanotransduction currents. Repolarization is achieved by delayed rectifier and A-type K+ conductances in type II HCs.

      Figure 4, panel C - provide absolute membrane potential for voltage responses. Presumably, these were the most 'ringy' responses. Were they obtained at similar Vm in all cells (i.e., comparisons of Q values in lines 229-230)? 

      We added the absolute membrane potential scale. Type II HC protocols all started with 0 pA current injection at baseline, so they were at their natural Vrest, which did not differ by genotype or zone. Consistent with Q depending on expression of conductances that activate positive to Vrest, Q did not co-vary with Vrest (Pearson’s correlation coefficient = 0.08, p = 0.47, n = 85).
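
      For readers who want to see how a correlation coefficient and sample size map onto a p-value, the following is a minimal sketch using the standard t-transform of Pearson's r with n - 2 degrees of freedom. It assumes a two-sided test of zero correlation, which the response does not state explicitly. The function names are illustrative, and the tail area is obtained by simple numerical integration rather than a statistics library.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_norm = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def pearson_p_value(r, n, intervals=20000):
    """Two-sided p-value for Pearson's r under H0: rho = 0."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1.0 - r * r))
    # Simpson's rule for the central area P(0 < T < t); intervals must be even.
    h = t / intervals
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, intervals):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3.0
    return max(0.0, 1.0 - 2.0 * area)

p = pearson_p_value(0.08, 85)
print(f"r = 0.08, n = 85 -> p = {p:.3f}")
```

      With r = 0.08 and n = 85 this should land close to the reported p = 0.47, although the authors' exact test is not specified here.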

      Lines 254. Staining is non-specific? Rather than non-selective? 

      Yes, thanks - Corrected (Line 264).

      Figure 6. Do you have a negative control image for Kv1.4 immuno? Is it surprising that this label is all over the cell, but Kv1.8 is restricted to the synaptic pole? 

      We don’t have a null-animal control because this immunoreactivity was done in rat. While the cuticular plate staining was most likely nonspecific because we see that with many different antibodies, it’s harder to judge the background staining in the hair cell body layer. After feedback from the reviewers, we decided to pull the KV1.4 immunostaining from the paper because of the lack of null control, high background, and inability to reproduce these results in mouse tissue. In our hands, in mouse tissue, both mouse and rabbit anti-KV1.4 antibodies failed to localize to the hair cell membrane. Further optimization or another method could improve that, but for now the single-cell expression data (McInturff et al., 2018) remain the strongest evidence for KV1.4 expression in murine type II hair cells.

      Lines 400-404. Whew, this is pretty cryptic. Expand a bit? 

      We simplified this paragraph (revision lines 411-413): “We speculate that gA and gDR(KV1.8) have different subunit composition: gA may include heteromers of KV1.8 with other subunits that confer rapid inactivation, while gDR(KV1.8) may comprise homomeric KV1.8 channels, given that they do not have N-type inactivation.”

      Line 428. 'importantly different ion channels'. I think I understand what is meant but perhaps say a bit more. 

      Revised (Line 438): “biophysically distinct and functionally different ion channels”.

      Random thought. In addition to impacting Rin and TauRC, do you think the more negative Vrest might also provide a selective advantage by increasing the driving force on K entry from endolymph? 

      When the calyx is perfectly intact, gK,L is predicted to make Vrest less negative than the values we report in our paper, where we have disturbed the calyx to access the hair cell (–80, Govindaraju et al., 2023, vs. –87 mV, here). By enhancing K+ accumulation in the calyceal cleft, the intact calyx shifts EK—and Vrest—positively (Lim et al., 2011), so the effect on driving force may not be as drastic as what you are thinking.
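
      The direction of this shift follows from the Nernst equation, E = (RT/zF) ln([K+]out/[K+]in). The sketch below illustrates how raising cleft K+ moves EK positively. The concentrations and the temperature are hypothetical values chosen for illustration and are not taken from the paper or from Lim et al. (2011).

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol

def nernst_potential_mv(conc_out_mm, conc_in_mm, temp_c=37.0, valence=1):
    """Nernst equilibrium potential in mV: E = (RT/zF) * ln([ion]_out / [ion]_in)."""
    temp_k = temp_c + 273.15
    return 1000.0 * R * temp_k / (valence * F) * math.log(conc_out_mm / conc_in_mm)

# Hypothetical intracellular and cleft K+ concentrations (mM), for illustration only.
K_IN_MM = 140.0
for k_cleft_mm in (5.0, 10.0, 20.0):
    e_k = nernst_potential_mv(k_cleft_mm, K_IN_MM)
    print(f"[K+]cleft = {k_cleft_mm:4.1f} mM -> EK = {e_k:6.1f} mV")
```

      Each doubling of external K+ shifts EK positively by the same amount, (RT/F) ln 2, which reduces the driving force on K+ efflux at a given membrane potential.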

      Reviewer #2 (Recommendations For The Authors): 

      (1) Introduction: wouldn't the small initial paragraph stating the main conclusion of the study fit better at the end of the background section, instead of at the beginning? 

      Thank you for this idea. We tried that and settled on this direct approach to let people know in advance what the goals of the paper are.

      (2) Pg.4: The following sentence is rather confusing "Between P5 and P10, we detected no evidence of a non-gK,L KV1.8-dependent.....". Also, Suppl. Fig 1A seems to show that between P5 and P10 hair cells can display a potassium current having either a hyperpolarised or depolarised Vhalf. Thus, I am not sure I understand the above statement. 

      Thank you for pointing out unclear wording. We used the more common “delayed rectifier” term in our revision (Lines 144-147): “Between P5 and P10, some type I HCs have not yet acquired the physiologically defined conductance, gK,L. No effects of KV1.8 deletion were seen in the delayed rectifier currents of immature type I HCs (Suppl. Fig. 1B), showing that they are not immature forms of the Kv1.8-dependent gK,L channels.”

      (3) For the reduced Cm of hair cells from Kv1.8 knockout mice, could another reason be simply the immature state of the hair cells (i.e. lack of normal growth), rather than fewer channels in the membrane? 

      There were no other signs to suggest immaturity or abnormal growth in KV1.8–/– hair cells or mice. Importantly, type II HCs did not show the same Cm effect.

      We further discussed the capacitance effect in lines 160-167: “Cm scales with surface area, but soma sizes were unchanged by deletion of KV1.8 (Suppl. Table 2). Instead, Cm may be higher in KV1.8+/+ cells because of gK,L for two reasons. First, highly expressed trans-membrane proteins (see discussion of gK,L channel density in Chen and Eatock, 2000) can affect membrane thickness (Mitra et al., 2004), which is inversely proportional to specific Cm. Second, gK,L could contaminate estimations of capacitive current, which is calculated from the decay time constant of transient current evoked by small voltage steps outside the operating range of any ion channels. gK,L has such a negative operating range that, even for Vm negative to –90 mV, some gK,L channels are voltage-sensitive and could add to capacitive current.”

      (4) Methods: The electrophysiological part states that "For most recordings, we used .....". However, it is not clear what has been used for the other recordings.

      Thanks for catching this error, a holdover from an earlier ms. version.  We have deleted “For most recordings” (revision line 466).

      Also, please provide the sign for the calculated 4 mV liquid junction potential. 

      Done (revision line 476).

      Reviewer #3 (Recommendations For The Authors): 

      (1) Some of the data in panels in Fig. 1 are hard to match up. The voltage protocols shown in A and B show steps from hyperpolarized values to -71mV (A) and -32 mV (B). However, the value from A doesn't seem to correspond with the activation curve in C.

      Thank you for catching this.  We accidentally showed the control I-X curve from a different cell than that in A. We now show the G-V relation for the cell in A.

      Also the Vhalf in D for -/- animals is ~-38 mV, which is similar to the most positive step shown in the protocol.

      The most positive step in Figure 1B is actually –25 mV. The uneven tick labels might have been confusing, so we re-labeled them to be more conventional.

      Were type I cells stepped to more positive potentials to test for the presence of voltage-activated currents at greater depolarizations? This is needed to support the statement on lines 147-148. 

      We added “no additional K+ conductance activated up to +40 mV” (revision lines 149-150). Our standard voltage-clamp protocol iterates up to ~+40 mV in KV1.8–/– hair cells, but in Figure 1 we only showed steps up to –25 mV because K+ accumulation in the synaptic cleft with the calyx distorts the current waveform even for the small residual conductances of the knockouts. KV1.8–/– hair cells have a main KV conductance with a Vhalf of ~–38 mV, as shown in Figure 1, and we did not see an additional KV conductance that activated with a more positive Vhalf up to +40 mV.

      (2) Line 151 states "While the cells of Kv1.8-/- appeared healthy..." how were epithelia assessed for health? Hair cells arise from support cells and it would be interesting to know if Kv1.8 absence influences supporting cells or neurons. 

      We added our criteria for cell health to lines 477-479: “KV1.8–/– hair cells appeared healthy in that cells had resting potentials negative to –50 mV, cells lasted a long time (20-30 minutes) in ruptured patch recordings, membranes were not fragile, and extensive blebbing was not seen.”

      Supporting cells were not routinely investigated. We characterized calyx electrical activity (passive membrane properties, voltage-gated currents, firing pattern) and didn’t detect differences between +/+, +/–, and –/– recordings (data not shown). KV1.8 was not detected in neural tissue (Lee et al., 2013). 

      (3) Several different K+ channel subtypes were found to contribute to inner hair cell K+ conductances (Dierich et al. 2020) but few additional K+ channel subtypes are considered here in vestibular hair cells. Further comments on calcium-activated conductances (lines 310-317) would be helpful since apamin-sensitive SK conductances are reported in type II hair cells (Poppi et al. 2018) and large iberiotoxin-sensitive BK conductances in type I hair cells (Contini et al. 2020). Were iberiotoxin effects studied at a range of voltages and might calcium-dependent conductances contribute to the enhanced resonance responses shown in Fig. 4? 

      We refer you to lines 310-317 in the original ms (lines 322-329 in the revised ms), where we explain possible reasons for not observing IK(Ca) in this study.

      (4) Similar to GK,L, erg (Kv11) channels show significant Cs+-permeability. Were experiments using Cs+ and/or Kv11 antagonists performed to test for Kv11? 

      No. Hurley et al. (2006) used Kv11 antagonists to reveal Kv11 currents in rat utricular type I hair cells with perforated patch, which were also detected in rats with single-cell RT-PCR (Hurley et al. 2006) and in mice with single-cell RNAseq (McInturff et al., 2018).  They likely contribute to hair cell currents, alongside Kv7, Kv1.8, HCN1, and Kir. 

      (5) Mechanosensitive ("MET") channels in hair cells are mentioned on lines 234 and 472 (towards the end of the Discussion), but a sentence or two describing the sensory function of hair cells in terms of MET channels and K+ fluxes would help in the Introduction too. 

      Following this suggestion we have expanded the introduction with the following lines  (78-87): “Hair cells are known for their large outwardly rectifying K+ conductances, which repolarize membrane voltage following a mechanically evoked perturbation and in some cases contribute to sharp electrical tuning of the hair cell membrane.  Because gK,L is unusually large and unusually negatively activated, it strongly attenuates and speeds up the receptor potentials of type I HCs (Correia et al., 1996; Rüsch and Eatock, 1996b). In addition, gK,L augments a novel non-quantal transmission from type I hair cell to afferent calyx by providing open channels for K+ flow into the synaptic cleft (Contini et al., 2012, 2017, 2020; Govindaraju et al., 2023), increasing the speed and linearity of the transmitted signal (Songer and Eatock, 2013).”

      (6) Lines 258-260 state that GKL does not inactivate, but previous literature has documented a slow type of inactivation in mouse crista and utricle type I hair cells (Lim et al. 2011, Rusch and Eatock 1996) which should be considered. 

      Lim et al. (2011) concluded that K+ accumulation in the synaptic cleft can explain much of the apparent inactivation of gK,L. In our paper, we were referring to fast, N-type inactivation. We changed that line to be more specific; new revision lines 269-271: “KV1.8, like most KV1 subunits, does not show fast inactivation as a heterologously expressed homomer (Lang et al., 2000; Ranjan et al., 2019; Dierich et al., 2020), nor do the KV1.8-dependent channels in type I HCs, as we show, and in cochlear inner hair cells (Dierich et al., 2020).”

      (7) Lines 320-321 Zonal differences in inward rectifier conductances were reported previously in bird hair cells (Masetto and Correia 1997) and should be referenced here.

      Zonal differences were reported by Masetto and Correia for type II but not type I avian hair cells, which is why we emphasize that we found a zonal difference in I-H in type I hair cells. We added two citations to direct readers to type II hair cell results (lines 333-334): “The gK,L knockout allowed identification of zonal differences in IH and IKir in type I HCs, previously examined in type II HCs (Masetto and Correia, 1997; Levin and Holt, 2012).”

      Also, Horwitz et al. (2011) showed HCN channels in utricles are needed for normal balance function, so please include this reference (see line 171). 

      Done (line 184).

      (8) Fig 6A. Shows Kv1.4 staining in rat utricle but procedures for rat experiments are not described. These should be added. Also, indicate striola or extrastriola regions (if known). 

      We removed KV1.4 immunostaining from the paper, see above.

      (9) Table 6, ZD7288 is listed -was this reagent used in experiments to block Gh? If not please omit. 

      ZD7288 was used to block gH to produce a clean h-infinity curve in Figure 6, which is described in the legend.

      (10) In supplementary Fig. 5A make clear if the currents are from XE991 subtraction. Also, is the G-V data for single cell or multiple cells in B? It appears to be from 1 cell but ages P11-505 are given in legend. 

      The G-V curve in B is from XE991 subtraction, and average parameters in the figure caption are for all the KV1.8–/–  striolar type I hair cells where we observed this double Boltzmann tail G-V curve. I added detail to the figure caption to explain this better.

      (11) Supplementary Fig. 6A claims a fast activation of inward rectifier K+ channels in type II but not type I cells-not clear what exactly is measured here.

      We use “fast inward rectifier” to indicate the inward current that increases within the first 20 ms after hyperpolarization from rest (IKir, characterized in Levin & Holt, 2012) in contrast to HCN channels, which open over ~100 ms. We added panel C to show that the activation of IKir is visible in type II hair cells but not in the knockout type I hair cells that lack gK,L. IKir was a reliable cue to distinguish type I and type II hair cells in the knockout.

      For our actual measurements in Fig 6B, we quantified the current flowing after 250 ms at –124 mV because we did not pharmacologically separate IKir and IH.

      Could the XE991-sensitive current be activated and contributing?

      The XE991-sensitive current could decay (rapidly) at the onset of the hyperpolarizing step, but was not contributing to our measurement of IKir and IH, made after 250 ms at –124 mV, at which point any low-voltage-activated (LVA) outward rectifiers have deactivated. Additionally, the LVA XE991-sensitive currents were rare (only detected in some striolar type I hair cells) and when present did not compete with fast IKir, which is only found in type II hair cells.

      Also, did the inward rectifier conductances sustain any outward conductance at more depolarized voltage steps? 

      For the KV1.8-null mice specifically, we cannot answer the question because we did not use specific blocking agents for inward rectifiers.  However, we expect that there would only be sustained outward IR currents at voltages between EK and ~-60 mV: the foot of IKir’s I-V relation according to published data from mouse utricular hair cells – e.g., Holt and Eatock 1995, Rusch and Eatock 1996, Rusch et al. 1998, Horwitz et al., 2011, etc.  Thus, any such current would be unlikely to contaminate the residual outward rectifiers in Kv1.8-null animals, which activate positive to ~-60 mV. 

      (I-HCN is also not a problem, because it could only be outward positive to its reversal potential at ~-40 mV, which is significantly positive to its voltage activation range.)

    1. Author response:

      The following is the authors’ response to the original reviews.

      We edited the manuscript for clarity, added information described in new figure panels (below) and corrected typos.

      In figure 1 we corrected a typo.

      In figure 2, panel 2H, and Figure S2E, we included a new statistical analysis (mixed effect linear regression) to compare mutational burden in controls and AD patients.

      In figure 3 and Figure S4B, we revised the western blot panels (3E,F) to improve presentation of controls and quantification.

      we corrected typos.

      In figure 5 we removed a panel (former 5D) which did not add useful information.

      In Figure S1A we included information about sex and age from the control and patients analyzed. In Figure S2B, we added an analysis of the mutational burden in controls, distinguishing controls with and without cancer.

      We modified Table S1 for completeness of information for all samples analyzed.

      Reviewer #1:

      Weaknesses: 

      Even though the study is overall very convincing, several points could help to connect the seen somatic variants in microglia more with a potential role in disease progression. The connection of P-SNVs in the genes chosen from neurological disorders was not further highlighted by the authors. 

      All P-SNVs are reported in Table S3.

      We observed only two P-SNVs within genes associated with neurological disorders (brain panel in Table S2).

      - SQSTM1 (p.P392L) was identified in blood but not in brain from the patient AD48A.

      - OPTN (p.Q467P) was identified in PU.1 from control 25.

      To highlight this point, we modified the first paragraph of the discussion as follows:

      “We report here that microglia from a cohort of 45 AD patients with intermediate-onset sporadic AD (mean age 65 y.o) is enriched for clones carrying pathogenic/oncogenic variants in genes associated with clonal proliferative disorders (Supplementary Table 2) in comparison to 44 controls. Of note we did not observe microglia P-SNVs within genes reported to be associated with neurological disorders (Supplementary Table 2) in patients, and one such variant was identified in a control (Supplementary Table 3).”

      The authors show in snRNA-seq data that a disease-associated microglia state seems to be enriched in patients with somatic variants in the CBL ring domain, however, this analysis could be deepened. For example, how this knowledge may translate to patient benefits when the relevant cell populations appear concentrated in a single patient sample (Figure 5; AD52) is unclear; increasing the analyzed patient pool for Figure 5 and showcasing the presence of this microglia state of interest in a few more patients with driving mutations for CBL or other MAPK pathway associated mutations would lend their hypotheses further credibility. 

      We acknowledge this limitation, but we respectfully submit that the analysis was performed in 2 patients. AD53 also shows a MAPK-associated inflammatory signature in the microglia clusters associated with mutations.

      We performed the analysis on all FACS-purified PU.1+ nuclei samples that passed QC for single nuclei RNAseq. It should be noted that this analysis is extremely difficult with current technologies because microglia nuclei need to be fixed for PU.1 staining and FACS purification and the clones are small (~1% of microglia).

      A potential connection between P-SNVs in microglia and disease pathology and symptoms was not further explored by the authors. 

      At the population level, Braak/CERAD scores, the presence of Lewy bodies, amyloid angiopathy, tauopathy, or alpha synucleinopathy were not different between AD patients with or without pathogenic microglial clones (Figure S3 and Table S1). Of note, we studied here a homogenous population of AD patients.

      At the tissue level, the roles of mutant microglia in plaques for example is being investigated, but we do not have results to present at this time.

      A recent preprint (Huang et al., 2024) connected the occurrence of somatic variants in genes associated with clonal hematopoiesis in microglia in a large cohort of AD patients, this study is not further discussed or compared to the data in this manuscript. 

      This preprint supports the high frequency of detection of oncogenic variants associated with clonal proliferative disorders. Its authors hypothesize that the mutations may be associated with microglia, but they only check a few mutations in purified microglia; most of the study is performed in whole brain tissue. It does not really bring new information compared to the other studies we cite in the introduction (and to our manuscript).

      Reviewer #2 (Recommendations For The Authors): 

      Suggestions for improved or additional experiments, data, or analyses: 

      The authors can demonstrate that identified pathological SNVs from their AD cohort also lead to the activation of human microglia-like cells in vitro, but do not provide any data from histological examination of the patient cohort (e.g. accumulation at the plaque site, microglia distribution, and cell number). The study could be further supported by providing a histological examination of patients with and without P-SNVs to identify if microglia response to pathology, microglia accumulation, or phagocytic capacity are altered in these patients. 

      We performed IBA1 staining in brain samples from controls and from AD patients with or without microglial clones, and microglia density was not different between patients with and without mutations. In addition, histological reports from the brain bank (Braak/CERAD scores, Lewy bodies, amyloid angiopathy, tauopathy, or alpha synucleinopathy) did not suggest differences between patients with and without mutations (Figure S3). These results are preliminary and further investigations are ongoing.

      It would have been interesting to see if for example, transgenic AD mice with an introduced somatic mutation in microglia show an altered disease progression with alterations in amyloid pathology or cognition. 

      We agree with the reviewer. We performed an in vivo study with mice expressing a  5xFAD transgene, an inducible microglia Cx3cr1CreERt2 BrafLSL-V600E transgene, or both, and performed survival, behavioral (Y-Maze and Novel Object Recognition), and histological analyses for β-Amyloid, p-Tau and Iba1 staining.

      Microgliosis was increased in the group with the 2 transgenes, however the phenotype associated with the expression of a BrafV600E allele in microglia (Mass et al Nature 2017) was strongly dominant over the phenotype of 5xFAD mice, which did not allow us to conclude on survival and behavioral analyses.

      Other studies with different transgenes are in progress but we have no results yet to include in this revised manuscript.

      To connect the somatic mutations in microglia better to a potential contribution in neurodegeneration or neurotoxicity, the authors could provide further details on how to demonstrate if human microglia-like cells respond differentially to amyloid or induce neurotoxicity in a co-culture or slice culture model. 

      These studies are undertaken in the laboratory, but unfortunately, we have no results as yet to include in this revised manuscript.

      The number of samples analyzed for hippocampi, especially in the age-matched controls might be underpowered. 

      Unfortunately, despite our best efforts, we were not able to analyze more hippocampi from control individuals. To control for bias in sampling as well as for other potential biases in our analysis, we investigated the statistical analysis of the cohorts for inclusion of age as a criterion (age-matched controls), inclusion of a random effect structure, and possible confounding factors such as sex, brain bank site, and samples’ anatomical location (see revised Methods and revised Fig. 2C, F, and H, and S2B).

      We first tested whether the inclusion of age is appropriate in a fixed-effects linear regression using a generalized linear model (GLM) with Gaussian distribution. Compared to the baseline model, the model with age had a significantly lower AIC (from -66.6 to -71.9, P = 0.0067 by chi-square test). Therefore, the inclusion of age as a fixed effect is appropriate. We next tested multiple structures of mixed-effects linear modeling. We used donors as random effects, while utilizing age, disease status (neurotypical control vs. AD), or both as fixed effects. Fitting was performed using the lme function implemented in the nlme package with the maximum likelihood (ML) method. The incorporation of age and disease status significantly improved overall model fitting. Both age and AD are associated with a significant increase in SNV burden in this model (P<1x10^-4 and P=1x10^-4, respectively, by likelihood ratio test). The model's total explanatory power is substantial (conditional R^2=0.48). We also asked if the addition of potential confounding factors to the model is justified. Three factors were tested via the two above-mentioned methods: sex, brain bank site, and the anatomical location of the samples. In all cases, the AIC increased, and the P values by likelihood ratio tests were higher than 0.99. Therefore, from a statistical standpoint, the inclusion of these potential confounding factors does not seem to improve overall model fitting.
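
      As a rough consistency check on the AIC comparison quoted above, the sketch below converts an AIC difference between two nested models into a likelihood ratio statistic and a chi-square p-value. It is not the authors' R code. It assumes that adding age contributes exactly one extra parameter and that the reported chi-square test is a likelihood ratio test; neither detail is stated in the response, and the function name is illustrative.

```python
import math

def lrt_p_from_aic(aic_base, aic_full, extra_params=1):
    """Likelihood ratio test p-value for nested models, recovered from their AICs.

    AIC = 2k - 2 ln L, so
    2 * (ln L_full - ln L_base) = (aic_base - aic_full) + 2 * extra_params.
    Only one extra parameter is handled here, where the chi-square (df = 1)
    survival function reduces to erfc(sqrt(x / 2)).
    """
    if extra_params != 1:
        raise NotImplementedError("this sketch only covers df = 1")
    lr_stat = (aic_base - aic_full) + 2 * extra_params
    if lr_stat <= 0:
        return 1.0
    return math.erfc(math.sqrt(lr_stat / 2.0))

# AIC values quoted in the response: baseline -66.6, with age -71.9.
p_age = lrt_p_from_aic(-66.6, -71.9)
print(f"p = {p_age:.4f}")
```

      Under these assumptions the result falls in the same range as the reported P = 0.0067; a small discrepancy is expected because the AIC values are rounded to one decimal place.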

      Minor corrections to the text and figures: 

      The authors made a great effort to analyze various samples from one individual donor. One can get a bit confused by the sentence that "an average of 2.5 brains samples were analyzed for each donor". Maybe the authors could highlight more in the first paragraph of the results section and in Figure 1A, that there are multiple samples ("technical replicates") from one individual patient across different brain regions used. 

      We removed the ‘2.5’ sentence and rewrote the paragraph for clarity. Sample information is now displayed in Table S1.

      In the method section is a part included "Expression of target genes in microglia", it was very hard to allocate where these data from public data sets were actually used and for which analysis. Maybe the authors could clarify this again. 

      AU response: We apologize and have corrected the paragraph in the Methods (page 6) as follows: “Expression of target genes in microglia. To evaluate the expression levels of the genes identified in this study as target of somatic variants, we consulted a publicly available database (https://www.proteinatlas.org/), and also plotted their expression as determined by RNAseq in 2 studies (Galatro et al. GSE99074 33, and Gosselin et al. 34) (Table S3 and Figure S2). For data from Galatro et al. (GSE99074) 33, normalized gene expression data and associated clinical information of isolated human microglia (N = 39) and whole brain (N = 16) from healthy controls were downloaded from GEO. For data from Gosselin et al. 34, raw gene expression data and associated clinical information of isolated microglia (N = 3) and whole brain (N = 1) from healthy controls were extracted from the original dataset. Raw counts were normalized using the DESeq2 package in R 35.”

      Table S3 is very informative, but also very complex. The reader could maybe benefit a lot from this table if it can be structured a bit easier especially when it comes to identifying P-SNVs and in which tissue sample they were found and if this was the same patient. The sorting function on top of the columns helps, but the color coding is a bit unclear. 

      Despite our best efforts, we agree that the table, which contains all sequencing data for all samples, is complex. The color coding (red) only highlights the presence of pathogenic mutations.

      Reviewer #3 (Recommendations For The Authors): 

      This is a well-done study of an important problem. I present the following minor critiques: 

      At the bottom of Page 4 and into the top of Page 5, the authors state that 66 of the 826 variants identified in their panel sequencing experiment were found in multiple donors. Then the authors proceed to analyze the remaining 760 variants. It seems that the authors concluded that these multi-donor mosaics were artifacts, which is why they were excluded from further analysis. I think this is a reasonable assumption, but it should be stated explicitly so it is clear to the reader. Complicating this assumption, however, the authors later state that one of their CBL variants was found in two donors, and it is treated as a true mosaic. The authors should make it clear whether recurrent variants were filtered out of any given analysis. It remains possible that all recurrent variants are true mosaics that occurred in multiple donors. The authors should do a bit more to characterize these recurrent variants. Are they observed in the human population using a database like gnomAD, which, together with their recurrence, would strongly suggest they are germline variants? Are they in MAPK genes, or otherwise relevant to the study?

      We apologize for the confusion. Our original intent for the ddPCR validation of variants (Figure 1E) was to count only 1 ‘unique’ variant for variants found for example in 1 brain sample and in the blood from the same patient, or in 2 brain regions from one patient, in order to avoid the criticism of overinflating our validation rate. This was notably the case for TET2 and DNMT3 variants. For example, validation of a TET2 variant found in 2 different brain areas and blood of the same donor is counted as 1 and not 3. We did not eliminate these variants from the analysis as they passed the criteria for somatic variants as presented in Methods.

      In contrast, when a specific variant was found and validated in two different donors, we counted it as 2.

      The characterization of variants included multiple parameters and databases, including for example AF and gnomAD, as indicated in Methods and reported in Table S3.

      All ddPCR results can be found at the end of Table S3.

      Figure 2B labels age-matched controls as "C", but Figure 2C labels age-matched controls as AM-C. Labels should be consistent throughout the manuscript. 

      We corrected this in the revised version.

      It is not clear if the "p:0.02" label in Figure 2F is referring to AM-C Cx vs. AD-Cx or AM-C vs. AD. Please clarify. 

      We apologize for the confusion, and we corrected the legend. The calculated p value is for the comparison between Cortex from Controls (age-matched) and the Cortex from AD.

      On Page 7, the authors state, "The allelic frequencies at which MAPK activating variants are detected in brain samples from AD patients range from ~1-6% of microglia (Fig. 3G), which correspond to clones representing 2 to 12% of mutant microglia in these samples, assuming heterozygosity." I understand what the authors mean here but I think it's a bit confusingly stated. I suggest something like "The allelic frequencies at which MAPK activating variants are detected in brain samples from AD patients range from ~1-6% in microglia (Figure 3G), which correspond to mutant clones representing 2 to 12% of all microglia in these samples, assuming heterozygosity." 

      We thank the reviewer for this suggestion and re-wrote that sentence.
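
      The arithmetic behind the sentence discussed above can be checked with a short calculation. The following is a minimal sketch, not the authors' analysis: it assumes a diploid locus with no copy-number change, and the function name and parameters are hypothetical, introduced only for illustration. Under heterozygosity each mutant cell contributes one mutant allele out of two, so the fraction of mutant cells is twice the allelic frequency.

```python
def clone_fraction_from_vaf(vaf, mutant_alleles_per_cell=1, ploidy=2):
    """Estimate the fraction of cells carrying a variant from its allelic frequency.

    Assumptions (illustrative only): every cell has `ploidy` copies of the locus,
    and each mutant cell carries `mutant_alleles_per_cell` mutant copies
    (1 = heterozygous, 2 = homozygous in a diploid cell).
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    if not 1 <= mutant_alleles_per_cell <= ploidy:
        raise ValueError("mutant_alleles_per_cell must be between 1 and ploidy")
    fraction = vaf * ploidy / mutant_alleles_per_cell
    # Sampling noise can push the estimate above 1; cap it at all cells.
    return min(fraction, 1.0)


# Lower and upper ends of the ~1-6% range quoted above, assuming heterozygosity.
for vaf in (0.01, 0.06):
    print(f"VAF {vaf:.0%} -> about {clone_fraction_from_vaf(vaf):.0%} of cells")
```

      With the heterozygous default, allelic frequencies of 1% and 6% map to mutant clones of about 2% and 12% of cells, matching the range in the quoted sentence. If a variant were homozygous, the same allelic frequency would correspond to half as many mutant cells.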

      Is there any evidence that the transcriptional regulators mutated in AD microglia (MED12, SETD2, MLL3, DNMT3A, ASXL1, etc.) are involved in regulating MAPK genes? This would tie these mutations into the broader conclusions of the paper. 

      This is a very interesting question, and indeed published studies indicate that some of the transcriptional/epigenetic regulators regulate expression of MAPK genes. However, in the absence of experimental evidence in microglia and patients, the argument may be too speculative to be included.

      Do the authors have any thoughts as to whether germline variants in CBL are linked to AD? If not, why do they think germline mutations in CBL are not relevant to AD? 

      This is also a very interesting question. As indicated in our manuscript, germline mutations in CBL (and other members of the classical MAPK genes, see Figure 3C) cause early-onset (pediatric) and severe developmental diseases known as RASopathies, characterized by multiple developmental defects, and associated with frequent neurological and cognitive deficits.

      It is possible that some other (and more frequent?) germline variants may be associated with a late-onset, brain-restricted phenotype, but we did not find germline pSNVs in our patients. GWAS studies may be more appropriate to test this hypothesis.

      Do any donors show multiple variants? I don't think this is addressed in the text. 

      We do find donors with multiple variants (see Figure 3D and Figure S3); however, at this stage we have not performed single-nucleus genotyping to investigate whether they are part of the same clone.

      Figure S3 appears to be upside down. 

      This was corrected.

      Figure 5C should have some kind of label telling the reader what gene set is being depicted. 

      We added this information above the panel (it was in the corresponding legend).

      At the top of Page 12, Lewy bodies are written as Lewis bodies. 

      This was corrected.

      Many control donors died of cancer (Table S1). Is there any information on which, if any, chemotherapeutics or radiation these patients received? Might this impact the somatic mutation burden? The authors should compare controls with and without cancer or with and without cancer treatments to rule this out. 

      As suggested by the reviewer, we analyzed the mutational load of age-matched controls with and without cancer (revised Figure S2B). As expected, we saw an increase in the mutational load in controls with cancer, particularly in their blood. This information was added to the Results section.

      This is most likely associated with the treatments received as well as possible cancer clones.

      The formatting for Table S3 is odd. Multiple different fonts are used (this is also seen in Table S5). Column Q has no column ID. The word "panel" is spelled "pannel." The word "expressed" is spelled "expressd" in one of the worksheet labels. Columns BG-BN in the ALL-SNV worksheet are blank but seemingly part of the table. 

      We fixed these errors in Table S3.

    2. eLife assessment

      This fundamental study enhances our understanding of how somatic variants in microglia might influence the onset and progression of neurodegenerative diseases such as Alzheimer's. The evidence supporting the conclusions is compelling, with the authors employing a multi-faceted approach to identify an enrichment of potentially pathogenic somatic mutations in Alzheimer's disease microglia. This research will be of significant interest to those investigating somatic mutations, Alzheimer's disease, microglial biology and cell signalling pathways.

    3. Reviewer #1 (Public review):

      In the revised manuscript, Vicario et al. provide new insights into a potential contribution of somatic mutations within the microglia population of the CNS that accelerate microglia activation and disease-associated gene signatures in Alzheimer's disease. In particular, they identified an "enrichment" of pathological SNVs in microglia, but not in the peripheral blood, that are associated with clonal proliferative disorders and neurological diseases in a subset of patients with AD. They identified P-SNVs in microglia of AD patients located within the RING domain of CBL, a negative regulator of MAPK signaling. They further provide mechanistic insights into how these variants result in MAPK over-activation and subsequently in a pro-inflammatory phenotype in human microglia-like cells in vitro.

      Overall, this study provides novel evidence from an AD patient cohort pointing to a potential contribution of microglia-specific somatic mutations to disease onset and/or progression in at least a subset of patients with Alzheimer's disease.

      The work within this study is highly relevant and will open new study lines to explore somatic mutations within the microglia compartment and neurodegenerative diseases.

      Strengths:

      As outlined above, the study identified P-SNVs in microglia of AD patients associated with clonal proliferative disorders, and also gives an in-depth analysis of recurrent P-SNVs located within the RING domain of CBL, a negative regulator of MAPK signaling. The authors further provide mechanistic insights into how these variants result in MAPK over-activation and subsequently in a pro-inflammatory phenotype in HEK cells, BV2 cells, MAC cells and human microglia-like cells in vitro. The over-activation of the cells in vitro is convincing.

      Great care was taken to identify the limitations of the possible conclusions and to make careful conclusions. For example, the authors highlight that the pathway proposed to be affected may be an explanation for a subset of AD patients, and emphasize that it is as yet unclear whether this accumulation of pathological SNVs is a cause or a consequence of disease progression.

      The study supports an enrichment of P-SNVs in several genes associated with clonal proliferative disorders in microglia and nicely separates this from SNVs associated with clonal hematopoiesis in the peripheral blood found in AD patients and controls.

      The authors further acknowledged that several age-matched control patients were diagnosed with cancer or tumor-associated diseases, and carefully showed that the SNVs occurring in these patients are not associated with the P-SNVs identified in the microglial compartment of the AD cohort.

      Weaknesses:

      The revised study is convincing overall and has improved, but some points remain unanswered, especially regarding a clear connection between the somatic variants seen in microglia and a potential role in disease progression.

      A potential connection between P-SNVs in microglia and disease pathology and symptoms was not further explored by the authors but might be in future work.

      Taking this into account, the title may be a bit overstated and could be toned down.